Infocosm (Amit Sheth) 1
From Data Dirt Roads to Infocosm
Amit Sheth
Large Scale Distributed Information Systems Lab
University of Georgia
http://www.cs.uga.edu/LSDIS
[email protected]@cs.uga.edu
Special Thanks: Vipul Kashyap, Srilekha Mudumbai
Invited Talk, 7th Intl. Conf. on Management of Data, Pune, India, Dec. 29, 1995.
[Some parts of the talk emphasize issues and perspectives of particular interest to developing countries.]
Outline
• Infrastructure
  - computing and communication to support the information society in the next century
• State of the art
  - Internet, WWW, Electronic Commerce
  - unique opportunities due to Network Computing
• New Challenges in Information Management
Our Journey to the Information Society
• Information (Data) Superhighway (aka Infobahn)
  - material/physical object -- geography, distance
  - infrastructure, not services or applications*
  - high cost of broadband fiber-optic networks
  - promoted TV (not computer) as the user device; the computer has been found to be a better starting point
  - the promise of applications envisaged earlier has fizzled: 500-channel TV, VOD and interactive TV
• What went wrong?
* See also, for related discussion: The Road Ahead, Bill Gates.
Journey and Destination
• “The Information superhighway is an impoverished metaphor -- it describes only a means of transportation. We need a description of the destination, of the Infocosm.” [Ferguson]
Glover Ferguson, Computer World, Vol. 1, Iss. 6, July 17 1995.
Destination Infocosm
• a society whose members (“organisms”) can have more effective decision making capability using information that is available whenever needed, at any place, and in (m)any form(s) [Sheth 93]
• a world where people will work, learn and play, unconstrained by time, place and form [Ferguson 95]
Sheth and Kashyap, Information Brokering - A Key Challenge in the emerging Infocosm, December 1993.
Glover Ferguson, Computer World, Vol. 1, Iss. 6, July 17 1995.
Related Themes
• Telecosm (George Gilder)
  - focus on communication and data
  - need to add computing and information
• Telepresence
• “Information at Fingertips”*
Components
Information
Computing
Communication
Computing and Communication
• The last decade belonged to computing; the PC is a $120B business.
• The next decade will belong to communication.
• Future computing will be network oriented (analogy of neural nets: “the intelligence will be in the network”).
• We will sell and use information, not data.
• Opportunity for Developing Countries!
Wireless communication
• Stratospheric market estimates are the norm, but prior expectations have been exceeded
• Estimates of wireless communication devices: 224M by the year 2000, 300M by the year 2002
• Growth rate of 40-50% => economy of scale
• Tremendous number of alternatives
• India and developing countries will have a larger share of wireless than most expect
Wireless Mela?
EBB, February 1995.
Wireless communication and computing
[Chart: U.S. installed base (millions, scale 0-60) of Paging, Cellular and PCS, 1994-2006. Source: EBB, February 1995]
Telecommunications -- a few observations about India
• ISDN trials were announced for 8/95 in Madras (probably delayed?)
  - $500 deposit (handset) to $1550 (PBX)
• Internet access announced
  - e-mail available in several cities; WWW in the few largest ones
  - possible significant use of satellite
• 64 Kbps dedicated lines are routinely used by large companies for international communication
• Expect big take-off for paging and possibly for cellular
• Privatization, while delayed, will have the biggest impact
Possible Temporary Setback
The Economist, November 18, 1995
CYBERSPACE COMPONENTS
END-USER SERVICES & APPLICATIONS
INFRASTRUCTURE SERVICES: PUBLIC & PRIVATE/COMMERCIAL NETWORKS
SOFTWARE PROTOCOLS & “STANDARDS”
COMMUNICATION INFRASTRUCTURE & PROTOCOLS
NODES & REPOSITORIES [TEXT, AUDIO, IMAGE, VIDEO, STRUCTURED DATABASES]
CYBERSPACE COMPONENTS
End-User Services & Applications
• Electronic Commerce
• Information Commerce/Mall
• Digital Library
• Video on demand, 500 channels
• Edutainment
• Virtual Corporation
Infrastructure Services
• WWW
• White Board, Groupware (Notes)
• Payment / Billing / Collection
• Security
• Authentication
Software Protocols & Standards
• EDI, PDES, HTML, ...
• http, telnet, rlogin
• ftp, X.400
Communication Infrastructure
• Periodic Connection; On-line Connection
• X.25
• Wireless: paging, Cellular, Satellite
• Wired: Copper, Cable, Fiber
• one way; two way
Internet -- a few personal experiences
• organizing an international trip
• e-mail exchange with a friend in Ahmedabad
• ftp and WWW for book project management
• use of WWW for
  - paper distribution
  - complete workshop management
  - publicity for the lab (4000+ accesses per week)
  - directions to my home
Internet: CommerceNet-Nielsen Survey
• 11% (24M) use the Internet in the USA and Canada; 17% (37M) have access; 8% (18M) use the Web
• Use in the last 24 hours: access the Web (72%); send e-mail (65%); non-interactive discussion (36%); download software (31%); use another computer (31%); interactive discussion (21%); real-time audio or video (19%)
• Average use: 5.5 hr per week!
WWW
Also from the CommerceNet-Nielsen survey -- use of the Web for:
• browse or explore (90%)
• search for other information (73%)
• search for information on companies/organizations (60%)
• search for information on products/services (55%)
• purchase products or services (14%; 2.5M)
Web users are upscale with an annual income of more than $80K (or $50K depending on the survey).
winning the whole world?
weird, wacky, and wow!
Internet Multimedia Milestones
1995
  Transport: Rapid adoption of 56 kbit/s access by small businesses and branch offices via ISDN, frame relay and leased lines
  Multicasting: Mbone backbone supports limited multicasting
  World Wide Web: Object technology for Web browsers: Hot Java, OpenDoc, OLE, VRML
1996
  Transport: ATM backbone deployment by Internet service providers; Regional Bell operating companies bundle ISDN and Internet access for small business market
  Multicasting: Real-time audio and video servers based on pseudo-multicasting
  World Wide Web: Enhanced real-time Internet products ship: 24-bit Color CU-SeeMe, FM Real Audio Player, Internet Phone, Netphone, Multimedia Netscape Navigator
1997
  Transport: Projected ISDN installed base: 1.6 million lines (U.S.), 7.68 million lines (global); projected installed base of V.34 modems: 5.4 million (global); Telcos bundle ISDN and Internet access for home market
  Multicasting: Widespread adoption of IP next generation protocol with support for broadcasting
  World Wide Web: Widespread use of Web-centric real-time Internet tools for entertainment, distance learning, and conferencing
OEM Magazine, September 1995
Wired, Oct. 1995
Netguide, June 1996
Web Week, December 1995
WWW: Conquering the Business World
Example of use in a Workflow Application
• SDOH and CHREF maintain databases, support EDI transactions
• Hospitals and clinics update central databases after encounters
• Health providers can obtain up-to-date clinical and eligibility information
• State and HMOs can update patient’s eligibility data
• Health agencies can use reports generated to track population’s needs
TRACKING SUBSYSTEM
Generates:
• alerts to identify patient’s needs
• contraindications to caution providers
• Reminders to parents
• Reports to state
CT
• Hospitals and case workers can reach out to the population
• HMOs can keep track of performance
CLINICAL SUBSYSTEM
Healthcare Info Infra. Tech. project: UGA and CHREF
Web Browsers: Conquering the Business World
Example of use in a Workflow Application
Interface for a Physician
• List of Overdue Vaccinations
• Link to contraindication information obtained from the Internet
• Clinical Aspects
Electronic commerce -- statistics
Year   # of Companies on the WWW   Sales on the Internet
1994   29,000                      100K
1995   152,000                     75,000K
1996                               553,000K (projected)
Table: NBC News
[Chart: Financial services, Publishing, Law; 1990-95; vertical scale 100-500. Source: The Economist]
Saturn Corp. (automaker, http://www.saturncars.com)
  Traffic: 84,000 people a month reading 27,000 pages
  Vision: Use the Web to build image as an innovative company; build customer relationships
  What you can do: View 1996 models, find a retailer, read Saturn magazine, order a brochure, locate and write to other owners via bulletin board
  Payoff: 25% of brochures requested via the Web
Fidelity Investments ($14 billion mutual funds investor, http://www.fid-inv.com)
  Vision: Use the Web as a new distribution and sales channel
  What you can do: Review and select 160 mutual funds, plan college and retirement savings, download software demo, participate in survey and “Guess The Dow” contest
  Payoffs: Undisclosed savings in mailing, handling & printing from electronically delivered prospectuses
W. W. Grainger, Inc. ($3 billion wholesaler and distributor, http://www.grainger.com/index.html)
  Traffic: 3,000 pages downloaded weekly
  Vision: Create a low-cost way to expand sales reach; lower acquisition costs for customers
  What you can do: Search product databases, review new products, locate branches worldwide, send e-mail, order catalog
  Payoff: Detailed customer demographics and feedback help set direction
COMPUTERWORLD, November 20, 1995
World PC aka Information Appliance aka Browser Boy
• $500-$700 PC supporting network computing
  - no hard disk
• $50 LSI Logic “superchip” that incorporates a microprocessor, memory, high speed modem, and audio and video processor
• Java and servers complete the computing paradigm
• Can reach a much larger population
  - many more would have $500 disposable compared to $2000
  - more appealing to less technical users
Java
• New programming language (largely a simplified C++) especially suitable to run on a network (from Sun)
• Applets (small, efficient programs) delivered over the network
• Microsoft licensed it (a first for Microsoft), IBM too
• Already hundreds of applications, including spreadsheets, word processors and games
Java
[Diagram: messages over time between Browser, Network and Server]
• User asks for object: request, reply (Object)
• Browser doesn’t understand object type: request, reply (Java code to support object)
• Object displayed
• Java: C++ minus: Typedefs, Preprocessor, ... Functions, Multiple Inheritance, Operator Overloading, Pointers; plus: Multithreading, ...
• Server site: Java Source, Compiler, Byte Codes
• Client site: Class Loader, Byte Code Verifier, Interpreter, Run-time
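The request/reply sequence on this slide can be illustrated with a small sketch. It is written in Python rather than Java for brevity, and it only imitates the idea of fetching display code on demand; it does not model byte codes, the class loader or the verifier. The names `SERVER_OBJECTS`, `SERVER_HANDLERS` and `Browser` are hypothetical, and the "server" is just a pair of in-memory dictionaries.

```python
from typing import Callable, Dict, Tuple

# Hypothetical server content: object name -> (object type, payload).
SERVER_OBJECTS: Dict[str, Tuple[str, str]] = {
    "report": ("text", "quarterly numbers"),
    "chart": ("spreadsheet", "1,2,3"),
}

# Hypothetical downloadable handlers, standing in for Java code on the server.
SERVER_HANDLERS: Dict[str, Callable[[str], str]] = {
    "spreadsheet": lambda payload: "cells: " + " | ".join(payload.split(",")),
}


class Browser:
    def __init__(self) -> None:
        # The browser initially understands only plain text.
        self.handlers: Dict[str, Callable[[str], str]] = {"text": lambda p: p}
        self.requests = 0  # number of request/reply round trips

    def show(self, name: str) -> str:
        self.requests += 1  # first request/reply: the object itself
        obj_type, payload = SERVER_OBJECTS[name]
        if obj_type not in self.handlers:
            # Unknown type: second request/reply fetches code to support it.
            self.requests += 1
            self.handlers[obj_type] = SERVER_HANDLERS[obj_type]
        return self.handlers[obj_type](payload)
```

In this sketch the first display of a `spreadsheet` object costs two round trips and later ones cost one, because the fetched handler stays registered in the browser.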
Browser Boy vs Bill Gates*
* Richard Shaffer, Forbes, December 4, 1995.
The Economist, Oct. 14, 1995.
Java
Network computing -- the Equalizer
Impact on marketing
• Marketing a software product will no longer involve a huge investment; the WWW provides a level playing field -- it is (almost) as easy for a small company to have a presence on the Web; level playing field in delivery, sales, payment
An opportunity for Developing Countries
• New Communication Infrastructure Alternatives
  - potential for fast catch-up
• New Computing Paradigm leading to
  - diminishing importance of geographic separation and distance
  - new marketing, sales, support alternatives
  - new ways to interact with clients and customers
  - new ways to develop software, new software marketplace
• New commodity to sell -- information
Focus on information
• Exponential growth in the capability of computing (Moore’s law) and communication bandwidth is well documented.
• Our ability to represent information and knowledge: from numbers and letters to objects and relationships, from syntax to semantics, from transactions to workflows, from data to information, ... has received less attention, is harder to address, and has lagged.
Data vs Information
Data
• Set of facts
• Measurements about the real world obtained from human/machine sensors
• Interoperability => transformation across different forms, representations and query languages
Information
• Application of facts and knowledge of “when” to use facts
• Derivation from facts using cognitive and perceptual processes
• Interoperability => transformation of knowledge to make it suitable for application of different facts in a different environment
Data + Knowledge about meaning of data + Knowledge of when to apply it = Information
Information can be used for decision making based on data
Technical Challenges in Global Information Systems
• Difficulties in information access:
  - cosmic Easter egg hunt problem -- hard to locate and access pertinent information
  - write-only database problem -- easy to create, hard to maintain
• Scale: needle in the haystack problem
  - vast amount of information; large number of autonomous sites
• Heterogeneity: tower of Babel problem
  - information represented in different ways
• Query expressiveness: the Pidgin problem
  - query language not expressive enough to specify the user’s interest
• Information Overload:
  - too much junk (less relevant) information on the network
Some approaches ....
• User centered approach:
  - menu-based browsing
  - hypertext browsing
• Syntactic/structural approach:
  - information retrieval, indexing techniques
  - name and attribute-based search, pattern matching
• Descriptive (symbolic) semantics-based approach
  - making design assumptions explicit
  - capturing the semantics of the query
• Cognitive (sub-symbolic) semantics-based approach
  - pattern/speech recognition algorithms
  - neural networks
Challenges with current techniques for Information Resource Discovery
• Unattractiveness of Navigation and Browsing:
  - users tend to give up if the number of links is more than 3 or 4
  - need to annotate links with contextual information in order to help reduce the “link-chasing”
• Scalability problems in Indexing information:
  - cannot index all the information on the Internet!!
  - difficult to index heterogeneous but related information
  - combining results obtained by using independent/different indices
• Hard to maintain pre-determined relationships:
  - file update might make some hyperlinks meaningless!!
  - hierarchical organizations might prove expensive to search if the user-specified search criteria differ from the criteria of organization
Infocosm via InfoHarness and InfoQuilt
• InfoHarness
  - access, scale, heterogeneity
• InfoQuilt
  - query expressiveness, semantics
  - correlation of heterogeneous media
InfoHarness is a trademark of Bellcore. Adapt/X Harness is a commercial product based on the InfoHarness system (see http://www.bellcore.com/features/index.html).
InfoHarness: Business Need Example
A Software Business House*
• Req., Design, .... Documents (FrameMaker)
• Source code (C functions), man pages (Unix files)
• Figures (PostScript files)
• Third party tools
Where? How to access?
* Leon Shklar, Satish Thatte
InfoHarness: Business Need Example
A Software Business House*
• Req., Design, .... Documents (FrameMaker)
• Source code (C functions), man pages (Unix files)
• Figures (PostScript files)
• Third party tools
Now I know ...
InfoHarness
- Uniform access
- Integrated view of heterogeneous information
* Leon Shklar, Satish Thatte
InfoHarness
Dealing with Data Heterogeneity: Use of Domain Independent Metadata
1. Information Unit
   1.1 Type
   1.2 Location
   1.3 Other Attributes
2. List of Collections that include this IHO
Physical Data: text file (or its portion), bitmap, email message, man page, directory of man pages
• Logical structuring of information space without restructuring, reformatting or relocating
• Accessing information via logical units
• Utilizing third party indexing tools to search for information
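As a minimal sketch of the metadata outline above, the two numbered parts can be written as plain records that point at physical data without moving or reformatting it. The class and field names (`InformationUnit`, `MetadataObject`, `units_in`) and the file paths are assumptions made for illustration; they are not taken from the InfoHarness system.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InformationUnit:
    unit_type: str   # 1.1 Type, e.g. "manpage" or "postscript"
    location: str    # 1.2 Location of the physical data
    attributes: Dict[str, str] = field(default_factory=dict)  # 1.3 Other Attributes


@dataclass
class MetadataObject:
    unit: InformationUnit                                  # 1. Information Unit
    collections: List[str] = field(default_factory=list)  # 2. Collections that include it


def units_in(collection: str, objects: List[MetadataObject]) -> List[str]:
    """Return the locations of all units that belong to the named collection."""
    return [o.unit.location for o in objects if collection in o.collections]


# Hypothetical objects; the files themselves stay where they are.
objects = [
    MetadataObject(InformationUnit("manpage", "/usr/man/man1/ls.1"), ["unix-docs"]),
    MetadataObject(InformationUnit("postscript", "/proj/figs/arch.ps"), ["design", "figures"]),
    MetadataObject(InformationUnit("text", "/proj/docs/design.txt", {"lines": "1-40"}), ["design"]),
]
```

Grouping by collection name gives a logical view over files of different types: `units_in("design", objects)` lists both the figure and the portion of the text file.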
Keyword-based Access
Results of the WAIS query: let’s look at the 2nd article.
Kilpatrick is not the author! He is referenced.
We can use keywords to query the WAIS collection, but we cannot provide the semantics “author” with the keyword “kilpatrick”.
Attribute-based Access
An attribute-based access method allows the specification of semantics like “author” and “date”. The typed attribute “date” allows data access not supported by keyword-based methods.
The results will only contain those articles authored by Kilpatrick that were posted after July 1, 1995.
Keyword-based and attribute-based access are complementary
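The difference between the two access methods can be shown with a short sketch. The articles below are made up for illustration, and `keyword_search` and `attribute_search` are hypothetical helpers, not WAIS or InfoHarness calls.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Article:
    author: str
    posted: date
    body: str


# Made-up articles for illustration only.
ARTICLES = [
    Article("Kilpatrick", date(1995, 7, 10), "Notes on indexing."),
    Article("Smith", date(1995, 7, 12), "As Kilpatrick wrote earlier, indexing is hard."),
    Article("Kilpatrick", date(1995, 6, 1), "An older posting."),
]


def keyword_search(articles: List[Article], keyword: str) -> List[Article]:
    """Match the keyword anywhere in the author field or the body."""
    kw = keyword.lower()
    return [a for a in articles if kw in a.author.lower() or kw in a.body.lower()]


def attribute_search(articles: List[Article], author: str, after: date) -> List[Article]:
    """Match on the typed attributes: author and posting date."""
    return [a for a in articles if a.author == author and a.posted > after]
```

Here the keyword search for "kilpatrick" also returns the article that merely references him, while the attribute search with `author="Kilpatrick"` and `after=date(1995, 7, 1)` returns only his posting from after that date.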
InfoHarness Project: Scalability
[Diagram]
• Partition 1 (Database of Textual objects): Partition 1 Metadata Object with Index11, Index12
• Partition 2 (Database of Textual objects): Partition 2 M.O. with Index21
• . . . .
• Partition n (Database of objects with Textual and Image Components): Partition n M.O. with Indexn1, Indexn2
• Attribute Metadata
• Query Processor: Combining Partial Results
• Query Result
http://www.cs.uga.edu/LSDIS/infoharness
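A minimal sketch of the idea in the diagram, assuming each partition keeps its own small inverted index and a query processor merges the partial results. The partition names, terms and object ids are invented, and the union-based merge is one simple choice rather than the project's actual algorithm.

```python
from typing import Dict, List, Set

# Each partition holds its own inverted index: term -> object ids.
# Partition names, terms and ids are invented for this sketch.
PARTITIONS: Dict[str, Dict[str, Set[str]]] = {
    "partition1": {"workflow": {"p1:doc3", "p1:doc7"}, "metadata": {"p1:doc7"}},
    "partition2": {"workflow": {"p2:doc1"}},
    "partitionN": {"metadata": {"pN:img4"}, "workflow": set()},
}


def query_partition(index: Dict[str, Set[str]], terms: List[str]) -> Set[str]:
    """Return the ids in one partition that match every query term."""
    hits = [index.get(term, set()) for term in terms]
    return set.intersection(*hits) if hits else set()


def query_all(terms: List[str]) -> List[str]:
    """Query each partition independently, then combine the partial results."""
    combined: Set[str] = set()
    for index in PARTITIONS.values():
        combined |= query_partition(index, terms)
    return sorted(combined)
```

Because every partition is searched on its own, a new partition only adds one more entry to `PARTITIONS`; no global index has to be rebuilt.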
INFORMATION COMMERCE
A proposed Architecture
[Diagram]
• INFORMATION CONSUMERS: Information Request, Information Request, ..., Information Request
• INFORMATION BROKERING: Ontologies/User Models
• INFORMATION PROVIDERS: Information System IS1, Information System IS2, ..., Information System ISm
Anatomy of Information Brokering Tasks
• Information Resource Discovery
  - identification of the information sources relevant to a given query or information need
• Query Processing
  - Information Focusing: identification of the subset of information in a given information source relevant to a given query
  - Information Correlation: combining the relevant information from different
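The three brokering tasks listed above can be sketched as a tiny pipeline. Everything here is a toy stand-in: the sources, field names and values are invented, and matching on field names is only one simple way to decide relevance.

```python
from typing import Dict, List

# Invented sources: name -> list of records (each a dict of fields).
SOURCES: Dict[str, List[Dict[str, str]]] = {
    "census": [{"county": "Clarke", "population": "90000"},
               {"county": "Oconee", "population": "20000"}],
    "landuse": [{"county": "Clarke", "cover": "urban"},
                {"county": "Oconee", "cover": "forest"}],
    "recipes": [{"dish": "dal"}],
}


def discover(needed_fields: List[str]) -> List[str]:
    """Resource discovery: sources that carry at least one needed field."""
    return [name for name, rows in SOURCES.items()
            if any(f in rows[0] for f in needed_fields)]


def focus(source: str, county: str) -> List[Dict[str, str]]:
    """Information focusing: the subset of one source relevant to the query."""
    return [row for row in SOURCES[source] if row.get("county") == county]


def correlate(parts: List[List[Dict[str, str]]]) -> Dict[str, str]:
    """Information correlation: merge the relevant rows from different sources."""
    merged: Dict[str, str] = {}
    for rows in parts:
        for row in rows:
            merged.update(row)
    return merged


def broker(needed_fields: List[str], county: str) -> Dict[str, str]:
    """Run discovery, focusing and correlation in sequence."""
    return correlate([focus(s, county) for s in discover(needed_fields)])
```

A request for `population` and `cover` in one county never touches the unrelated `recipes` source, and the answer combines one row from each of the two relevant sources.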
Challenges in Information Brokering
New Challenges and Research Directions
• Semantics -- key to information
  - what do you want? what is available?
  - relationship between structure and semantics
  - context, context, context
  - uncertainty, partial information, inconsistency
Semantics? What? Where?
Vocabulary Ontology (domain specific)
Content Metadata
(domain specific metadata)
Content-descriptive
Data
(abstract structure)content-based (indices)
content-independentStructure
+ relationships
what is semantics ? Where is semantics ?
CONTEXT
Query to White House server asking for documents on “India”.
25629 Office-of-Navajo-and-Hopi-Indian-Relocation
25654 National-Commission-on-American-Indian,-Alaska-Native,-and-Native-Hawiian-Housing
25668 Institute-of-American-Indian-and-Alaska-Native-Culture-and-Arts-Development
148626 National-Indian-Gaming-Commission
148632 Bureau-of-Indian-Affairs
153622 Public-and-Indian-Housing-Programs
158930 Indian-Health-Services
133206 1994-04-29-Babbitt-and-Deer-Briefing-on-Indian-Affairs
133305 1994-04-29-President-in-Meeting-with-Indian-Tribal-Leaders
30837 1994-05-19-President-and-India-PM-RAO-in-Press-Availability
30925 1994-05-11-President-Names-Frank-Wisner-as-Ambassador-to-India
81960 1994-08-02-Eight-Named-National-Advisors-on-Indian-Education
81962 1994-08-03-Four-on-American-Indian-Culture-Development-Board
179395 Aid t I di 10 01 93
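The listing shows what happens when a query term is matched without context. The sketch below reproduces the effect on a few of the titles from the slide. The whole-word variant is only a crude stand-in for context and is an assumption of this example, not the approach the talk proposes.

```python
import re
from typing import List

# A few of the document titles shown on the slide.
TITLES = [
    "Bureau-of-Indian-Affairs",
    "Indian-Health-Services",
    "1994-05-19-President-and-India-PM-RAO-in-Press-Availability",
    "1994-05-11-President-Names-Frank-Wisner-as-Ambassador-to-India",
]


def substring_match(titles: List[str], query: str) -> List[str]:
    """Context-free matching: any title that contains the query string."""
    q = query.lower()
    return [t for t in titles if q in t.lower()]


def whole_word_match(titles: List[str], query: str) -> List[str]:
    """Match the query only as a complete hyphen-separated word."""
    q = query.lower()
    return [t for t in titles if q in re.split(r"[-,]+", t.lower())]
```

With the query "India", `substring_match` returns all four titles, including the two about American Indian affairs, while `whole_word_match` keeps only the two about the country.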
Enabling Infocosm
InfoQuilt Project: Using metadata PatchQuilt and user models/ontologies to support information requests over globally distributed heterogeneous media repositories
http://www.cs.uga.edu/LSDIS/infoquilt
InfoQuilt
Semantic Relationships between Metadata
Query: Get me regions (blocks, counties) having a population greater than 500 and area greater than 50 sq feet having an urban land cover and moderate relief
The query represents semantic relationships between the metadata:
InfoQuilt
Correlation
• Population, Area: Structured Data, US Census Bureau; SQL queries return blocks, counties
• Structured Data, TIGER/Line: SQL queries return boundaries of blocks, counties
• Land Cover, Relief: Image Data (Land Cover, Elevation); IP functions compute regions’ land cover, relief; SQL queries return blocks with land cover, relief
Correlation of blocks satisfying various constraints in different databases!!
Extraction of Domain Specific Metadata
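The correlation step can be sketched as an intersection of the block sets returned by each source. The block ids and values below are invented, the in-memory dictionaries stand in for the SQL queries and image-processing functions named on the slide, and the thresholds are the ones from the example query (population greater than 500, area greater than 50, urban land cover, moderate relief).

```python
from typing import Dict, Set

# Invented per-source records keyed by block id.
CENSUS: Dict[str, Dict[str, float]] = {
    "b1": {"population": 800, "area": 120.0},
    "b2": {"population": 300, "area": 200.0},
    "b3": {"population": 900, "area": 75.0},
}
# Stand-in for metadata that image-processing functions would extract.
IMAGE_METADATA: Dict[str, Dict[str, str]] = {
    "b1": {"land_cover": "urban", "relief": "moderate"},
    "b2": {"land_cover": "urban", "relief": "moderate"},
    "b3": {"land_cover": "forest", "relief": "steep"},
}


def census_blocks(min_population: float, min_area: float) -> Set[str]:
    """Blocks whose structured data satisfies the numeric constraints."""
    return {b for b, r in CENSUS.items()
            if r["population"] > min_population and r["area"] > min_area}


def image_blocks(land_cover: str, relief: str) -> Set[str]:
    """Blocks whose extracted image metadata satisfies the constraints."""
    return {b for b, r in IMAGE_METADATA.items()
            if r["land_cover"] == land_cover and r["relief"] == relief}


def correlate_blocks() -> Set[str]:
    """Keep only blocks that satisfy the constraints in both sources."""
    return census_blocks(500, 50) & image_blocks("urban", "moderate")
```

Each source answers only the part of the query it understands; the block id is the shared key that lets the partial answers be combined.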
InfoQuilt: Multimedia Correlation
InfoQuilt/OBSERVER: Vocabulary Sharing
Ontology-Based System Enhanced with Relationships for Vocabulary hEterogeneity Resolution
InfoQuilt
Using descriptive and content-based metadata approach
• Top-down processing:
  - Capture the context of user query
  - Construct and display ontologies for specific domains
• Bottom-up processing:
  - Extract metadata from information source
  - Generate mappings between metadata and information
  - Construct information resource context from the metadata
[Diagram: Domain Ontology and Query Context; COMPARE; Information Resource Context and Metadata]
A Message
• Emerging network computing, the increasing importance of communication, and the available alternatives are tearing up traditional geographical and market boundaries, giving developing countries a once-in-a-lifetime opportunity to catch up
• Use information to be happy; know how to supply information to be wealthy
• But make no mistake -- data is not information
Memorable and Interesting Quotes
“By means of electricity, the world of matter has become a great nerve, vibrating thousands of miles in a breathless point of time. The round globe is a vast ... brain, instinct with intelligence!” [Al Gore’s quotation of Nathaniel Hawthorne, 1851]
Zooming is when you overcome your fears and trust the universe to make things right. You fly and float and hum and weave and sing. Opportunity knocks. Hello! I like playing with people who zoom. Win-win deals all the time. It’s cooool ..... On really cool days I zoom. On reallllllly cooooooooool days I zooooooooom.
[Dave Winer <[email protected]>, DaveNet, 9/22/95]
About the Speaker
Dr. Amit Sheth directs the Large Scale Distributed Information Systems (LSDIS) Lab, is an Associate Professor of Computer Science at the University of Georgia, and an Adj. Assoc. Professor in the College of Computing at the Georgia Institute of Technology. Earlier he worked for nine years in the R&D labs at Bellcore, Unisys, and Honeywell. His primary current research interests include workflow automation (project METEOR), management of heterogeneous digital data and semantic issues in global information systems (projects InfoHarness and InfoQuilt), and electronic/information commerce.
Prof. Sheth has led projects on heterogeneous DBMS, factory information systems, integration of AI-database systems (project/system BrAID), transactional workflows (PROMPT and METEOR), federated database tools (BERDI and TAILOR), multidatabase consistency, and data quality (Q-Data). The LSDIS lab (http://www.cs.uga.edu/LSDIS) maintains active collaboration with industry, and has won significant funded projects in the areas of interoperable and global information systems. Prof. Sheth has published over 80 papers in the areas of federated databases, workflow management, multidatabase consistency, metadata and information modeling, and data consistency and semantics. He has participated in over 30 program/organization committees for conferences, given over 45 invited and colloquia talks and 14 tutorials, and led two international conferences and a workshop as a General/Program (Co-)Chair. Currently he is a General (Co-)Chair of the Intl. Conference on Cooperative Information Systems, the Program Chair of the NSF Workshop on Workflow and Process Automation, and is on the editorial board of five journals. He has also served twice as an ACM Lecturer.
Partial Bibliography
Besides the articles referred to in the presentation, there is considerable information in popular and technical literature on the topics of information superhighway, metadata, electronic commerce and related topics. I suggest using any of the existing Web tools for a search (for example, most Internet tools will return more than one hundred URLs for any of these topics). Because the list is too large, below is a partial list of research publications with which the speaker has been associated. These and other LSDIS publications can be obtained from http://www.cs.uga.edu/LSDIS.
A. Sheth and L. Kalinichenko, "Information Modeling in Multidatabase Systems: Beyond Data Modeling" (invited paper), Proc. of the 1st International Conference on Information and Knowledge Management (CIKM), Baltimore, November 1992.
A. Sheth and V. Kashyap, "So Far (Schematically) yet So Near (Semantically)" (invited paper), Proc. of the DS-5 Semantics of Interoperable Database Systems, Lorne, Australia, November 1992; in IFIP Transactions A-25, North-Holland, 1993.
V. Kashyap and A. Sheth, "Semantics-based Information Brokering: A step towards realizing Infocosm", Technical Report DCS-TR-307, Dept. of Computer Science, Rutgers University, March 1994 (Position Paper, December 1993).
V. Kashyap and A. Sheth, "Semantic based Information Brokering", Proceedings of the 3rd Intl. Conf. on Information and Knowledge Systems, November 1994.
W. Klas and A. Sheth, Eds., "Metadata for Digital Media", Special issue of SIGMOD Record, December 1994.
L. Shklar, A. Sheth, V. Kashyap, and K. Shah, "InfoHarness: Use of Automatically Generated Metadata for Search and Retrieval of Heterogeneous Information", Proceedings of CAiSE-95, June 1995.
V. Kashyap and A. Sheth, "Schematic and Semantic Similarities between Database Objects: A Context-based Approach", to appear in the VLDB Journal.
V. Kashyap, K. Shah, and A. Sheth, "Metadata for building the MultiMedia Patch Quilt", to appear in Multimedia Database Systems: Issues and Research Directions, S. Jajodia and V.S. Subrahmanian, Eds., Springer-Verlag, 1995.
A. Sheth, V. Kashyap and W. LeBlanc, "Attribute-based Access of Heterogeneous Digital Data", Proceedings of the Workshop on Providing Web Access to Legacy Data, the 4th International World Wide Web Conference, December 1995.
A. Sheth, "Data Semantics: What, Where and How?", to appear in Database Application Semantics, Proceedings of the 6th IFIP Working Conference on Data Semantics (DS-6), R. Meersman and L. Mark (Eds.), Chapman and Hall, London, UK, 1996.