Face-to-Face Meeting
Semantic Web for Healthcare and Life Sciences Interest Group
http://www.w3.org/2001/sw/hcls/
W3C HCLS chairs:
Eric Neumann - Clinical Semantics Group
Tonya Hongsermeier - Partners Healthcare

F2F Agenda
Thursday November 8, 2007
• 1.30pm - 2.00pm - F2F Kickoff, Welcome and Introductions (EricN & Tonya)
• 2.00pm - 2.30pm - Clinical Observations Interoperability: How can Semantic Technologies Help? (Vipul)
• 2.30pm - 3.00pm - Clinical Observations Interoperability (COI) Use Case (Rachel)
• 3.00pm - 3.30pm - Tea/Coffee
• 3.30pm - 4.00pm - Detailed Clinical Models and the COI Use Case (Tom)
• 4.00pm - 4.15pm - Demo: Semantic DB System at Cleveland Clinic (Chimezie)
• 4.15pm - 4.30pm - Demo: SHER System by IBM and Columbia University (Chintan)
• 4.30pm - 5.30pm - Round Table and Feedback on Next Steps for COI (Moderator: Vipul)
• 5.30pm - 6.00pm - Wrap Up (EricN & Tonya)
• 6.30pm - Dinner with COI participants to work on functional requirements spreadsheet, etc.

F2F Agenda
Friday November 9, 2007
• 8.30am - 9.00am - Welcome and Introductions (EricN & Tonya)
• 9.00am - 10.30am - Group Reviews: BioRDF, ACPP, DSE, & BIONT/Clinical Trials + EMR
• 10.30am - 11.00am - Break
• 11.00am - 12.00pm - HCLS questionnaire results and charter discussion (EricP et al.)
• 12.00pm - 1.00pm - Lunch
• 1.00pm - 2.30pm - BioRDF: Understanding and enhancing the knowledgebase (Alan et al.)
• 2.30pm - 3.00pm - Break
• 3.00pm - 4.00pm - BioRDF: URI note discussion (Jonathan)
• 4.00pm - 5.00pm - Next Steps / Wrap up

What is HCLS?
“The Semantic Web Health Care and Life Sciences Interest Group is chartered to develop and support the use of Semantic Web technologies and practices to improve collaboration, research and development, and innovation adoption in the health care and life science domains. Success in these domains will ultimately depend on a foundation of semantically rich systems, processes and information interoperability.”
The scope of HCLSIG includes:
• Core vocabularies and ontologies to support cross-community data integration and collaborative efforts
• Guidelines and Best Practices for Resource Identification to support integrity and version control
• Better integration of Scientific Publication with people, data, software, publications, and clinical trials

HCLS Themes
Principal activities have centered around:
• Building a broad and strong community
• Exploring and documenting Use Cases
• Converting resources to RDF-OWL
• Learning to work with semantic web query/inference technology such as SPARQL, OWL, and rule engines

Organization
• Chairs: Eric Neumann, Tonya Hongsermeier
• Group divided into task forces (coordinator in parentheses)
– BioRDF. Established initially to convert biomedical data to RDF (Susie Stephens)
– BIONT. Established initially to be a resource for ontology needs for other groups (Vipul Kashyap)
– DSE (Drug Safety and Efficacy). Established initially to work on SW technology to support monitoring drug safety, pharmacovigilance (Eric Neumann)
– ACPP (Adaptable Clinical Protocols and Pathways). Established initially to work on a method of representing and computing applicability of protocols to dynamically changing patient status (Helen Chen)
– COI (Clinical Observations Interoperability). Established recently with two goals: 1) establish new collaboration with health care industry players; 2) work on issues at the intersection of electronic medical records and health care organization needs (Vipul Kashyap)

Membership
• 64 participants from 38 organizations
• 3 Invited Experts
• Many more non-member participants on-line

Meetings to date
• Formal F2F, January 2006, Cambridge
• Formal F2F, October 2006, Amsterdam
• Workshop, ISWC November 2006, Banff
• Informal F2F (Demo) 3 x March/April, 2007, Cambridge
• Workshop, WWW 2007, May 2007, Banff
• Informal F2F (URI), July 2007, Cambridge
• Formal F2F, November 2007, Cambridge

Demonstrations and Examples
• HCLS NeuroScience Demo - WWW2007 Banff
– http://esw.w3.org/topic/HCLS/Banff2007Demo
• Clinical Trials Data Management and Viewing
– http://eneumann.org/exhibit/clinicaldemo/
• Taverna, AIDA
– http://taverna.sourceforge.net, http://myexperiment.org/
• Mining Disease Relations from Semantically Integrated Genome - Phenome Maps
– http://www2007.org/workshops/paper_146.pdf

Presentations
• WWW2007 Demo
• ISMB 2007 Demo
• ISMB BioOntology SIG Poster 2007
• Society for Neuroscience Poster Nov 2007
• Selection of presentation venues of members showing HCLS work
– Bridging Pharma and IT
– Drug Discovery Technology of Innovative Therapeutics
– 1st European Semantic Web Conference
– Bio-IT World
– Norwegian Semantic Web Day
– InfoTech Pharma
– Modern Drug Discovery and Development Summit
– Massachusetts Biotechnology Panel
– eScience Institute; RDF, Ontologies and Meta-Data Workshop
– Virginia Biotechnology Summit
– Systems Biology
– Semantic Web Gathering
– Allen Institute for Brain Sciences
– Informatics and Interactomes in Huntington’s Disease
– Ontology for Biomedical Informatics Workshop
– Clinical Trial Ontology Workshop
– Jackson Laboratories
– Pubmed Plus
– NIH Blueprint NIF Workshop

HCLS Ecosystem
[Figure: diagram of the HCLS ecosystem. Labels: Drug R&D, HCP, Biomed Research, Insurers, Gov/Regulatory, Public, CROs, Gov/Funding, Grants, Publications and Public Databases, Disease Areas, Chem Manuf, Mol Path Res, Clin Res, Biomarker, Tox, Preclin, Clin Safety, Clin POC, Surveillance, Drug Programs, Large Studies, HMO, PPO, Marketing, VA System, R&D, BioKB, SafetyCommons, Risks & Benefits, JANUS, HCPChoices, EHR, CDC]

Distributed Nature of R&D Information
Silos of Data…
[Figure: separate data silos labelled Diseases, Assays, MolModels, Targets, ToxRegistry, Libraries, Biomarkers, Genotypes, HCS]

Data Integration
Forcing data to fit a specific application
[Figure: data sources feeding App 1 and App 2, with a question mark]

Data Aggregation
Think: Smart Mash-up
Data aggregated for any application in a semantically consistent way.
[Figure: aggregated data serving several applications, each marked with a question mark]

The Current Web
What the computer sees: “Dumb” links
No semantics - <a href> treated just like <bold>
Minimal machine-processable information
[Figure: web of nodes, each labelled only “Resources”, joined by untyped links]

The Semantic Web
RDF - Resource Description Framework
Machine-processable semantic information
Semantic context published, making the data more informative to both humans and machines
[Figure: graph of typed nodes and links. Nodes: Clinical Study, Subject, Finding, Adverse Event, Intervention, Design, Biomarker, Sample, Gene Expression. Link labels: hasSubject, using, hasFindings, hasAE, treatment, derived, hasExpression]

The Layer Cake

Facts as triples
subject: PARK1
predicate: has_associated_disease
object: Parkinson disease

From triples to a graph
Triples:
PARK1 has_associated_disease Parkinson disease
MAPT has_associated_disease Parkinson disease
MAPT has_associated_disease Pick disease
TBP has_associated_disease Parkinson disease
TBP has_associated_disease Spinocerebellar ataxia
[Figure: the triples drawn first as separate subgraphs, then merged into one graph in which PARK1, MAPT and TBP all link to a single Parkinson disease node, MAPT also links to Pick disease, and TBP also links to Spinocerebellar ataxia]
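
The step from a list of triples to a graph can be sketched in a few lines of Python. This is only an illustration using the five facts from the slide: plain strings stand in for the URIs a real RDF store would use, and `build_graph` is a hypothetical helper, not part of any RDF library.

```python
from collections import defaultdict

# The five (subject, predicate, object) facts listed on the slide.
# Plain strings stand in for URIs (an assumption made for brevity).
TRIPLES = [
    ("PARK1", "has_associated_disease", "Parkinson disease"),
    ("MAPT", "has_associated_disease", "Parkinson disease"),
    ("MAPT", "has_associated_disease", "Pick disease"),
    ("TBP", "has_associated_disease", "Parkinson disease"),
    ("TBP", "has_associated_disease", "Spinocerebellar ataxia"),
]


def build_graph(triples):
    """Return (nodes, edges) for a list of triples.

    nodes is the set of all subjects and objects; edges maps each
    subject to its list of (predicate, object) pairs.
    Hypothetical helper written for this example only.
    """
    nodes = set()
    edges = defaultdict(list)
    for subject, predicate, obj in triples:
        # A name that appears in several triples becomes a single node,
        # which is what turns separate facts into one connected graph.
        nodes.update((subject, obj))
        edges[subject].append((predicate, obj))
    return nodes, dict(edges)


nodes, edges = build_graph(TRIPLES)
print(len(TRIPLES), "triples,", len(nodes), "nodes")
for subject in sorted(edges):
    for predicate, obj in edges[subject]:
        print(subject, predicate, obj)
```

Because "Parkinson disease" is one shared node, the three genes end up connected through it without any explicit join step.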

Connecting graphs
• Integrate graphs from multiple resources
• Query across resources
[Figure: a gene-disease graph (APP has_associated_disease Alzheimer disease; PARK1 has_associated_disease Parkinson disease) joined to a disease hierarchy (Alzheimer disease isa Neurodegenerative diseases; Parkinson disease isa Neurodegenerative diseases) through the shared disease nodes]
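
A minimal sketch of what integrating and querying across resources can look like, assuming the two small graphs from the slide are held as Python sets of triples. The function name `genes_for_disease_class` is hypothetical, and a production system would more likely run a SPARQL query over a triple store.

```python
# Resource 1: gene-disease associations (from the slide).
gene_disease = {
    ("APP", "has_associated_disease", "Alzheimer disease"),
    ("PARK1", "has_associated_disease", "Parkinson disease"),
}

# Resource 2: a disease hierarchy (from the slide).
disease_hierarchy = {
    ("Alzheimer disease", "isa", "Neurodegenerative diseases"),
    ("Parkinson disease", "isa", "Neurodegenerative diseases"),
}

# Integration is a set union: identical node names link the two graphs.
merged = gene_disease | disease_hierarchy


def genes_for_disease_class(triples, disease_class):
    """Return sorted genes associated with any disease that isa disease_class.

    Hypothetical helper: a two-step pattern match over the triples.
    """
    diseases = {s for s, p, o in triples if p == "isa" and o == disease_class}
    return sorted(
        s
        for s, p, o in triples
        if p == "has_associated_disease" and o in diseases
    )


neuro_genes = genes_for_disease_class(merged, "Neurodegenerative diseases")
print(neuro_genes)
```

Run against either resource alone, the same question returns nothing, since it needs facts from both graphs.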

Recombinant Data
Graphs can be filtered and pivoted, without losing meaning
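
As a rough illustration of filtering and pivoting, the sketch below reuses the gene-disease triples from the earlier slides. The helpers `filter_triples` and `pivot_by_object` are hypothetical names chosen for this example; the point is that every row keeps its predicate, so the regrouped view still says what each link means.

```python
from collections import defaultdict

# Gene-disease facts from the earlier slides, with strings standing in for URIs.
TRIPLES = [
    ("PARK1", "has_associated_disease", "Parkinson disease"),
    ("MAPT", "has_associated_disease", "Parkinson disease"),
    ("MAPT", "has_associated_disease", "Pick disease"),
    ("TBP", "has_associated_disease", "Parkinson disease"),
    ("TBP", "has_associated_disease", "Spinocerebellar ataxia"),
]


def filter_triples(triples, subject=None, predicate=None, obj=None):
    """Keep triples matching every field that is not None (hypothetical helper)."""
    return [
        (s, p, o)
        for s, p, o in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def pivot_by_object(triples):
    """Regroup subject-centred facts around their objects (hypothetical helper)."""
    view = defaultdict(list)
    for s, _p, o in triples:
        view[o].append(s)
    return {obj: sorted(subjects) for obj, subjects in view.items()}


tbp_facts = filter_triples(TRIPLES, subject="TBP")
by_disease = pivot_by_object(TRIPLES)
print(tbp_facts)
print(by_disease)
```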

Where does Semantic Web Fit In?
Connecting Legacy Data to SW: D2R, MagLev

Semantic Web Applied to Disease Mechanism and Pharmacology

Drug Safety and Efficacy
• Group focuses on:
– Drafting an approach for clinical trial data that is in line with the Clinical Data Interchange Standards Consortium (CDISC) Study Data Tabulation Model (SDTM)
– Can SW standards help with EDC?
– Work on aggregating patient data with pharmacogenomic information
– Use Case around Pharmacovigilance
• Emphasis on display/visualization of clinical trial data, rather than query
• They have a demo that uses the Simile project’s Exhibit tools to display information merging three points of view
– Demographics
– Treatments
– Adverse Events
– SNP

Clinical Observations Interoperability
• Recently formed: http://esw.w3.org/topic/HCLS/OntologyTaskForce/BIONTDSEDCM
• Focusing on the problem of identifying clinical trial candidates based on constraints of participation and information in medical records.
• Clinical trial recruitment is an everybody-wins situation
– Patients want new cures and development of them is dependent on trials
– Pharma want new drugs to bring huge profits
– Running clinical trials is profitable for CROs
– Doctors and hospitals get money for identifying, treating, and recording information about patients.
• Technical problems (evaluation of constraints) are very similar to those identified and somewhat addressed in ACPP. Driver of rule and OWL technology.
• Has brought in new industry and academic participants.
• DSE, ACPP, and BIONT participants are joining this effort.
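
To make "evaluation of constraints" concrete, here is a deliberately simplified sketch of screening patient records against inclusion and exclusion criteria. Everything in it is an assumption made for illustration: the `Patient` fields, the sample criteria, and the `is_candidate` helper are hypothetical and do not come from the COI use case, which targets rule engines and OWL rather than hand-written predicates.

```python
from dataclasses import dataclass, field


@dataclass
class Patient:
    # Hypothetical, heavily simplified view of a medical record.
    patient_id: str
    age: int
    diagnoses: set = field(default_factory=set)
    medications: set = field(default_factory=set)


# Hypothetical trial constraints, each a predicate over a Patient.
INCLUSION = [
    lambda p: 18 <= p.age <= 65,
    lambda p: "type 2 diabetes" in p.diagnoses,
]
EXCLUSION = [
    lambda p: "warfarin" in p.medications,
]


def is_candidate(patient, inclusion, exclusion):
    """True if every inclusion rule holds and no exclusion rule holds."""
    return all(rule(patient) for rule in inclusion) and not any(
        rule(patient) for rule in exclusion
    )


patients = [
    Patient("p1", 54, {"type 2 diabetes"}, {"metformin"}),
    Patient("p2", 70, {"type 2 diabetes"}, set()),
    Patient("p3", 45, {"type 2 diabetes"}, {"warfarin"}),
    Patient("p4", 30, {"asthma"}, set()),
]

candidates = [p.patient_id for p in patients if is_candidate(p, INCLUSION, EXCLUSION)]
print(candidates)
```

Real criteria are harder than this because record terms and trial terms rarely share one vocabulary, which is where ontologies and rules come in.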

BioRDF Demo: Neurocommons Triple Store
• Challenge: Go beyond toy size examples of Semantic Web
• Strategy: 1) Work on translating a dozen existing databases using OBO methodology and OWL, focused on neuroscience research questions
– Pubmed/Mesh mappings
– OBO ontologies, including Gene functions, molecular processes, cellular components
– Neuroscience ontologies: Senselab, BAMS, Allen Brain Atlas
• Strategy: 2) Develop capability to run interesting SPARQL queries at scale
– 300 Million triples
– Open source, reproducible
– Queries that could only be done previously with a lot of effort
• Demonstrate “Practice”, a useful prelude to developing “Best Practice”
• Enthusiastic response from community. We often hear
– “This is the first time I’ve seen semantic web technology do anything useful”
– WWW2007: “Thank you from the semantic web”
• Some problems solved, many remain, ongoing work.

Adaptable Clinical Protocols and Pathways
• This group was active mostly in the first year
• Worked on Guideline Reference Ontology
• Prototyped three instances of guidelines based on the ontology
• Worked on reasoning with inclusion/exclusion criteria
• Developed clear use cases that need reasoning over temporal concepts
• Two rule engines used. Helen Chen: Euler, Chimezie Ogbuji: Fuxi (his own work!)
• This work is a great arena for exploring the difference between using OWL versus rules
• Although research is finished, need more manpower to write up results
• AGFA benefited significantly from the work and it has influenced their internal models.
• SW is a strong focus in AGFA health care for evidence based patient care, and they expect to invest in it for years to come.
• Helen and Chimezie are now bringing their experiences to the new Clinical Observations Interoperability group.

URI recommendations project
• Jonathan Rees leading work in BioRDF on upcoming note
• Problem is stable identifiers for knowledge resources of import, and for entities in the world
– Proteins, Organs, Symptoms etc.
• Motivated by complaints about inadequate instruction
– From providers not knowing good practices for minting URIs
– From users not knowing how to name or make statements about resources that can be effectively integrated
• Community finds current AWWW and TAG documentation and recommendations inadequate
• Problem is not easily separable from issues of providing good practices for making use of the URIs, such as having mechanisms for figuring out what they mean

New charter process
• Goal is to design one or two charters for a working group and/or interest group.
• Initial step was drafting a questionnaire circulated to a wide swath of the life sciences and health care community. We listed possible activities and asked which would be of interest for participation. Currently almost 50 responses.
• http://www.w3.org/2007/06/HCLSForm
• Next step: analyze and draft a charter that is responsive to the feedback gained from the questionnaire. Target is for new charter to circulate early November.

Looking Forward
• Work with TAG over architectural needs for Semantic Web for science. Important to get time with them at Tech Plenary meeting in November.
• Find and recruit more members like AGFA, who feel they got value from group participation. Current membership has no shortage of other organizations with some tie to health care/life sciences, either as vendors (e.g. IBM) or primary, such as pharma (Merck, Pfizer)
• Targeting pharma, health care, vendors, standards groups not currently SW-ready (e.g. CDISC), FDA, NIH sponsored projects, and academic research leaders.
• Further recruitment should be based on listening to needs, addressing them with focused projects within the next HCLS. Focus on increasing credibility.
• Acceptance of SW technology’s role in pharma/health care is growing, hopefully to soon match enthusiasm from life sciences community. They are looking for guidance, and we need to provide it.

Upcoming Event: C-SHALS
• Conference on Semantics in Health and Life Sciences
• March 5-7 2008, Cambridge, MA
• Industry Presentations on the State of the Art for intelligent semantic applications in Drug R&D
• Sponsors: Pfizer, Merck, …