![Page 1: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/1.jpg)
Bioinformatics 2.0/3.0
Kei Cheung
Yale Center for Medical Informatics
![Page 2: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/2.jpg)
Outline
• Introduction
• Web 2.0
• Web 3.0
  – Semantic Web
  – Topic Map
• Merging Web 2.0 and Web 3.0
![Page 3: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/3.jpg)
Introduction
• The Human Genome Project (HGP) has transformed genome sciences from being experimental to being increasingly computational
• HGP has intensified the growth of bioinformatics
• The Web has become a popular medium for accessing information over the Internet
• Numerous bioinformatics databases and tools are Web accessible
• These databases and tools as well as the Web have become indispensable for modern-day genomic research
• Web 1.0 -> Web 2.0 -> Web 3.0
![Page 4: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/4.jpg)
Web 1.0
• It is read-only
• It is about a single person, organization, …
• It is document centric
• It is based on HTML
• It is for humans to read
![Page 5: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/5.jpg)
Web 2.0
![Page 6: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/6.jpg)
Web 2.0
• Social networking (wiki, blog, tagging, bookmarking, rating, etc)
• Multimedia content (photo, audio, video, etc)
• Interactive, responsive, and dynamic web interface (Facebook, Flickr, YouTube, etc)
• Mashup (assembly tools and visualization tools)
![Page 7: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/7.jpg)
Folksonomy (Social Tagging)
• Folksonomy is the practice and method of collaboratively creating and managing tags to annotate and categorize content
• In contrast to traditional subject indexing, metadata is not only generated by experts but also by creators and consumers of the content
• Freely chosen keywords are used instead of a controlled vocabulary
![Page 8: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/8.jpg)
Tag Cloud
• A tag cloud (or weighted list in visual design) is a visual depiction of user-generated tags used typically to describe the content of web sites.
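As a rough illustration of the "weighted list" idea, the sketch below maps tag frequencies to font sizes. The function name `tag_cloud_sizes`, the point-size range, and the sample tags are assumptions made for this example; they do not come from any particular tagging site.

```python
from collections import Counter


def tag_cloud_sizes(tags, min_pt=10, max_pt=32):
    """Map each tag to a font size that grows linearly with its frequency."""
    counts = Counter(tags)
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = hi - lo
    sizes = {}
    for tag, n in counts.items():
        # If every tag is equally frequent, all tags get the smallest size.
        scale = (n - lo) / span if span else 0.0
        sizes[tag] = round(min_pt + scale * (max_pt - min_pt))
    return sizes


# Hypothetical tags attached by users to a set of pages.
example_tags = ["genome", "protein", "genome", "pathway", "genome", "protein"]
cloud = tag_cloud_sizes(example_tags)
```

Real tag clouds often use logarithmic rather than linear scaling so that a few very popular tags do not dwarf the rest; linear scaling is used here only to keep the sketch short.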
![Page 9: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/9.jpg)
Web 2.0 (cont’d)
• It is decentralized
• It is a community/collaborator model instead of an authority/consumer model
• It is fun
• It can also be put to serious use: sharing and integrating scientific datasets and algorithms
![Page 10: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/10.jpg)
Bioinformatics Applications of Web 2.0
![Page 11: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/11.jpg)
Wiki Proteins
![Page 12: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/12.jpg)
Nature Precedings (pre-publication research and preliminary findings)
![Page 13: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/13.jpg)
Scientific Podcasts
![Page 14: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/14.jpg)
Multimedia (cont’d)
![Page 15: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/15.jpg)
Journal of Visualized Experiments
![Page 16: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/16.jpg)
myExperiment
![Page 17: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/17.jpg)
Mashup (1): Assembly Tools
• Dapper (scrape web content and convert it into machine readable format)
• Yahoo! Pipes (fetch, filter, and integrate data)
![Page 18: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/18.jpg)
Yahoo! Pipes Demo
![Page 19: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/19.jpg)
Yahoo! Pipes Use Case
![Page 20: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/20.jpg)
GeoCommons: Mashup of Maps
![Page 21: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/21.jpg)
Mashup (2): Visualization Tools
• E.g., Google Earth
![Page 22: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/22.jpg)
Geo-Mashup: Google Earth (tracking H5N1 virus over time)
![Page 23: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/23.jpg)
Bioinformatics Mashups
• Mashup of biological entities of the same type
  – Protein network mashup
  – Sequence annotation mashup
• Mashup of biological entities of different types
![Page 24: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/24.jpg)
Mashup of pathway data and gene expression data
Calvin cycle pathway associated with gene expressions
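A mashup of two different entity types comes down to joining the datasets on a shared identifier. The sketch below assumes that pathway steps and expression measurements both refer to genes by the same symbol. The step names, gene symbols, and expression values are placeholders invented for illustration, not data from the slide.

```python
# Hypothetical pathway data: each step names the gene that encodes its enzyme.
pathway_steps = [
    {"step": "step 1", "gene": "geneA"},
    {"step": "step 2", "gene": "geneB"},
    {"step": "step 3", "gene": "geneC"},
]

# Hypothetical expression levels keyed by the same gene symbols.
expression = {"geneA": 2.5, "geneC": 0.4}


def mash_up(steps, levels):
    """Attach an expression level to each pathway step (None when unmeasured)."""
    return [dict(step, expression=levels.get(step["gene"])) for step in steps]


merged = mash_up(pathway_steps, expression)
```

The join only works because both sources use the same gene symbols; when they do not, the result is full of `None` values, which is the problem the following slides on mashup challenges describe.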
![Page 25: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/25.jpg)
Challenges to Data Mashup
• Lack of annotation
• Lack of links
• Lack of link semantics
• Lack of data semantics
• Lack of standards, or lack of use of existing standards
![Page 26: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/26.jpg)
Lack of Semantic Annotation
Kei Tsi Daniel Cheng (this is not me!!)
Kei Cheung (16 years ago)
Kei Cheung (6 months ago)
![Page 27: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/27.jpg)
Lack of Links
collaborators
![Page 28: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/28.jpg)
Lack of Link Semantics
(?)prototyped
![Page 29: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/29.jpg)
Lack of Data Semantics
<html><body> … <table><tr><td>Alcohol Dehydrogenase 1B (class I), beta polypeptide</td><td>ADH1B</td></tr> … </table> … </body></html>
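To make the problem concrete, the sketch below scrapes the table cells from a cleaned-up copy of the markup above using Python's standard `html.parser` module. The class name `CellCollector` is an assumption for this example. The scraper recovers two strings, but nothing in the HTML says which one is a gene name and which is a gene symbol.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collect the text of every <td> cell, in document order."""

    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.cells = []

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True
            self.cells.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self.in_cell = False

    def handle_data(self, data):
        if self.in_cell:
            self.cells[-1] += data


page = (
    "<html><body><table><tr>"
    "<td>Alcohol Dehydrogenase 1B (class I), beta polypeptide</td>"
    "<td>ADH1B</td>"
    "</tr></table></body></html>"
)

collector = CellCollector()
collector.feed(page)
cells = collector.cells  # plain strings with no indication of their meaning
```

Any meaning has to be supplied by the person writing the scraper, for example by assuming that the first column is always the name.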
![Page 30: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/30.jpg)
Lack of Standards (Use of Standards)
• Different naming rules (based on phenotype, sequence, function, organisms, etc.)
  – Armadillo (fruit flies) vs. β-catenin (mice)
  – PSM1 (human) = PSM2 (yeast); PSM1 (yeast) = PSM2 (human)
  – Sonic Hedgehog
• ID proliferation
  – Different ID schemes: 1OF1 (PDB ID) and P06478 (SwissProt ID) correspond to Herpes Thymidine Kinase
  – Lexical variation: GO1234, GO:1234, GO-1234
• Synonyms vs. homonyms
  – Dopamine receptor D2: DRD2, DRD-2, D2
  – PSA: prostate specific antigen, puromycin-sensitive aminopeptidase, psoriatic arthritis, pig serum albumin
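Lexical variation is the easiest of these problems to handle mechanically. The sketch below rewrites the three GO spellings listed above to a single form. The function name `normalize_go_id` and the set of accepted separators are assumptions for this example. It does not pad or validate the digits, and it cannot resolve homonyms such as PSA, which need context rather than string cleanup.

```python
import re

# Accept "GO" followed by an optional separator and a run of digits.
GO_PATTERN = re.compile(r"^GO[:\-_ ]?(\d+)$", re.IGNORECASE)


def normalize_go_id(raw):
    """Return the identifier as 'GO:<digits>', or None if it is not GO-like."""
    match = GO_PATTERN.match(raw.strip())
    if match is None:
        return None
    return "GO:" + match.group(1)


variants = ["GO1234", "GO:1234", "GO-1234"]
normalized = {normalize_go_id(v) for v in variants}
```

All three variants collapse to the same key, so records that use different spellings can be joined on it.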
![Page 31: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/31.jpg)
Web 3.0
![Page 32: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/32.jpg)
Web 3.0
• It refers to a third generation of Internet-based services that emphasize machine-facilitated understanding of information in order to provide a more productive and intuitive user experience.
  – Semantic Web
  – Topic Map
![Page 33: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/33.jpg)
Semantic Web
• "The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation." -- Tim Berners-Lee, James Hendler, Ora Lassila, The Semantic Web, Scientific American, May 2001
• It provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries
• It is based on the Resource Description Framework (RDF)
  – URIs for naming/identifying web objects
  – Graph structure (directed labeled graph) for connecting web objects
![Page 34: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/34.jpg)
Resource Description Framework (RDF)
• It is a standard data model (directed labeled graph) for representing information (metadata) about resources in the World Wide Web
• In general, it can be used to represent information about “things” or “resources” that can be identified (using URI’s) on the Web
• It is intended to provide a simple way to make statements (descriptions) about Web resources
![Page 35: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/35.jpg)
RDF Statement
An RDF statement consists of:
• Subject: resource identified by a URI
• Predicate: property (as defined in a namespace identified by a URI)
• Object: property value (literal) or a resource
A resource can be described by multiple statements.
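The subject/predicate/object structure can be sketched with plain Python tuples, with no RDF library involved. The URIs and values below are the ones used in this deck's ADH1B example. The helper name `describe` is an assumption for this illustration.

```python
from collections import defaultdict

# Subject: the NCBI Gene record for ADH1B, identified by its URI.
GENE = "http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene&cmd=Retrieve&list_uids=125"
# Namespace in which the two predicates are defined.
EN = "http://en.wikipedia.org/wiki/"

# Each statement is a (subject, predicate, object) triple.
statements = [
    (GENE, EN + "name", "Alcohol Dehydrogenase 1B (class I), beta polypeptide"),
    (GENE, EN + "synonym", "ADH1B"),
]


def describe(subject, triples):
    """Group every statement about one subject by predicate."""
    props = defaultdict(list)
    for s, p, o in triples:
        if s == subject:
            props[p].append(o)
    return dict(props)


description = describe(GENE, statements)
```

Because a resource can be described by multiple statements, `describe` returns a list of objects per predicate rather than a single value.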
![Page 36: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/36.jpg)
Graphical & XML Representation
XML representation:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:en="http://en.wikipedia.org/wiki/">
  <rdf:Description rdf:about="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene&amp;cmd=Retrieve&amp;list_uids=125">
    <en:name>Alcohol Dehydrogenase 1B (class I), beta polypeptide</en:name>
    <en:synonym>ADH1B</en:synonym>
  </rdf:Description>
</rdf:RDF>
Graphical representation:
http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene&cmd=Retrieve&list_uids=125
  – http://en.wikipedia.org/wiki/name -> “Alcohol Dehydrogenase 1B (class I), beta polypeptide”
  – http://en.wikipedia.org/wiki/synonym -> “ADH1B”
![Page 37: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/37.jpg)
RDF Schema (RDFS)
• RDF Schema terms:
  – Class
  – Property
  – type
  – subClassOf
  – range
  – domain
• Example:
  <DNASequence, type, Class>
  <Promoter, subClassOf, DNASequence>
  <Protein, type, Class>
  <TranscriptionFactor, subClassOf, Protein>
  <Bind, type, Property>
  <Bind, domain, TranscriptionFactor>
  <Bind, range, Promoter>
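What a schema like this lets a machine do can be sketched with a small forward-chaining loop over triples. The seven schema triples come from the example above. The instance statement about `TF_X` and `promoter_Y` is hypothetical, and the three rules shown are only a simplified subset of the RDFS entailment rules.

```python
schema_and_data = {
    ("DNASequence", "type", "Class"),
    ("Promoter", "subClassOf", "DNASequence"),
    ("Protein", "type", "Class"),
    ("TranscriptionFactor", "subClassOf", "Protein"),
    ("Bind", "type", "Property"),
    ("Bind", "domain", "TranscriptionFactor"),
    ("Bind", "range", "Promoter"),
    # Hypothetical instance statement, not taken from the slides.
    ("TF_X", "Bind", "promoter_Y"),
}


def infer_types(triples):
    """Apply domain, range, and subClassOf rules until nothing new appears."""
    facts = set(triples)
    changed = True
    while changed:
        new = set()
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                if p2 == "domain" and s2 == p:
                    new.add((s, "type", o2))  # subject belongs to the domain class
                if p2 == "range" and s2 == p:
                    new.add((o, "type", o2))  # object belongs to the range class
                if p == "type" and p2 == "subClassOf" and s2 == o:
                    new.add((s, "type", o2))  # membership passes to the superclass
        changed = not new <= facts
        facts |= new
    return facts


inferred = infer_types(schema_and_data)
```

From one statement that `TF_X` binds `promoter_Y`, the loop concludes that `TF_X` is a TranscriptionFactor and a Protein, and that `promoter_Y` is a Promoter and a DNASequence, none of which was stated explicitly.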
![Page 38: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/38.jpg)
Ontologies
• In both computer science and information science, an ontology is a representation of a set of concepts within a domain and the relationships between those concepts.
• It is a shared conceptualization of a domain
• Ontologies are commonly encoded using ontology languages.
![Page 39: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/39.jpg)
Web Ontology Language (OWL)
• Latest standard in ontology languages from the W3C
• Built on top of RDF
• OWL semantically extends RDF while it is syntactically the same as RDF
• Three species of OWL
  – OWL-Lite
  – OWL-DL
  – OWL-Full
![Page 40: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/40.jpg)
OWL > RDF/RDFS
• Cardinality restrictions (e.g., a gene may have more than one transcription factor binding site)
• Disjointness of classes (e.g., a region of an RNA transcript is classified as either an intron or an exon, not both)
• Other OWL constructs
  – uniqueness (e.g., a GO term can have only one GO identifier)
  – unionOf (e.g., a gene may be the unionOf introns and exons)
  – sameAs: specifying a synonymous relationship between classes (e.g., “Cerebellar Purkinje Cell” sameAs “Purkinje Neuron”)
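The uniqueness constraint can be sketched as a simple check over triples. The function name, the property name `hasGoId`, and the terms and identifiers are all hypothetical. The sketch treats the constraint as a closed-world validation check, which simplifies how an OWL reasoner would handle it.

```python
from collections import defaultdict


def functional_property_violations(triples, prop):
    """Return the subjects that have more than one distinct value for prop."""
    values = defaultdict(set)
    for s, p, o in triples:
        if p == prop:
            values[s].add(o)
    return {s: sorted(v) for s, v in values.items() if len(v) > 1}


# Hypothetical data: term_a has been given two different identifiers.
annotations = [
    ("term_a", "hasGoId", "GO:1234"),
    ("term_a", "hasGoId", "GO:5678"),
    ("term_b", "hasGoId", "GO:0001"),
]

violations = functional_property_violations(annotations, "hasGoId")
```

RDF and RDFS alone cannot express that `term_a` is in error here; the constraint has to be declared in a richer language such as OWL before a tool can detect it.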
![Page 41: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/41.jpg)
Topic Map
• A topic map (an ISO standard) is used to represent information using topics (concepts), associations, and occurrences
• It is used to organize information in a way that can be optimized for navigation.
association
occurrence
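The three building blocks can be sketched as a small in-memory data structure. The class and method names, the `located-in` association type, and the `example.org` URL are assumptions for this illustration. The sketch does not follow the XTM syntax or the full ISO data model.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    name: str
    # Occurrences: links to resources that are about this topic.
    occurrences: list = field(default_factory=list)


@dataclass(frozen=True)
class Association:
    kind: str       # type of relationship
    members: tuple  # names of the topics taking part


class TopicMap:
    def __init__(self):
        self.topics = {}
        self.associations = []

    def add_topic(self, name, *occurrences):
        """Create the topic if needed and attach any occurrences to it."""
        topic = self.topics.setdefault(name, Topic(name))
        topic.occurrences.extend(occurrences)
        return topic

    def associate(self, kind, *names):
        """Relate two or more topics, creating missing topics on the fly."""
        for name in names:
            self.add_topic(name)
        self.associations.append(Association(kind, tuple(names)))

    def neighbors(self, name):
        """Topics reachable from `name` through one association."""
        found = set()
        for assoc in self.associations:
            if name in assoc.members:
                found.update(m for m in assoc.members if m != name)
        return found


tm = TopicMap()
tm.add_topic("Purkinje Neuron", "http://example.org/purkinje-neuron")
tm.associate("located-in", "Purkinje Neuron", "Cerebellum")
nearby = tm.neighbors("Purkinje Neuron")
```

Navigation then amounts to following associations from topic to topic and opening the occurrences attached to each one.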
![Page 42: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/42.jpg)
Neuroscience Topic Map
![Page 43: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/43.jpg)
Topic Map Encoding/Querying
• XML Topic Maps (XTM)
• Topic Map Query Language (TMQL)
![Page 44: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/44.jpg)
Visual Topic Maps
• A Visual Topic Map can be defined as a topic map that includes visual topics. A visual topic is defined by a topic name that refers to visual content.
![Page 45: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/45.jpg)
NCBI Site Map
![Page 46: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/46.jpg)
Mosaic of Chinese Characters in Stories about the Meaning of Ideograms
![Page 47: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/47.jpg)
![Page 48: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/48.jpg)
Visualization of the del.icio.us Tags in an Interactive Graph
![Page 49: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/49.jpg)
Combining Semantic Web and Topic Map
Topic Map
Semantic Web
Visualization
Machine reasoning
Knowledge organization & representation (mapping between XTM and RDF/OWL)
![Page 50: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/50.jpg)
Web 2.0 Meets Web 3.0
• Folksonomy meets ontology
  – Tags can evolve into standard heavy-weight ontologies, while light-weight ontologies can be applied to tagging
• Human readability meets machine readability
  – Visual network vs. semantic network
• Social network meets semantic network
  – FOAF, semantic wiki
• Syntactic mashup meets semantic mashup
  – Dapper and Yahoo! Pipes may become ontologically aware
![Page 51: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/51.jpg)
Conclusions
• Web 2.0 and 3.0 provide a platform for data/tool sharing and integration (mashup) and scientific collaboration
• More use cases are needed
• Question?
  – While Web 1.0 has played an important role in organizing/disseminating information produced by HGP, can Web 2.0/3.0 offer more to present-day “big science” projects like ENCODE?
![Page 52: Bioinformatics 2.0/3.0 Kei Cheung Yale Center for Medical Informatics](https://reader030.vdocuments.us/reader030/viewer/2022032800/56649d2a5503460f949ffc21/html5/thumbnails/52.jpg)
The End