
Page 1: HL7 Clinical-Genomics SIG: A Shared Genotype Model

HL7 Clinical-Genomics SIG: A Shared Genotype Model

HL7 V3 Compliant

HL7 Clinical-Genomics SIG Facilitator

Amnon Shabo (Shvo)

IBM Research Lab in Haifa

Atlanta, September 2004

Page 2: HL7 Clinical-Genomics SIG: A Shared Genotype Model

Haifa Research Lab

Current Work

[Diagram: four Clinical-Genomics storyboards (Tissue Typing, Cystic Fibrosis, Pharmacogenomics, BRCA) shown together with the Genotype shared model, the Family History model, and the Clinical Statement shared model]

Page 3: HL7 Clinical-Genomics SIG: A Shared Genotype Model

The Genotype CMET

o Represents genomic data in HL7 RIM classes
o Not meant to be a biological model
o Concise and targeted at healthcare use for personalized medicine
o Consists of:
  o A Genotype (entry point)
  o 1..3 alleles
  o Polymorphisms
    o Mutations
    o SNPs
  o Haplotypes
  o DNA sequencing
  o Gene expression
  o Proteomics
  o Phenotypes (clinical data such as diseases, allergies, etc.)
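The composition above can be sketched as a small data structure. This is only an illustration under stated assumptions: the Python class and field names (`snps`, `mutations`, `sequence`) and the sample allele identifiers are hypothetical and are not part of the HL7 model; only the 1..3 allele rule and the HETEROZYGOTE code come from the slides.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IndividualAllele:
    code: str                                  # allele identifier (hypothetical value format)
    snps: List[str] = field(default_factory=list)
    mutations: List[str] = field(default_factory=list)
    sequence: Optional[str] = None             # raw sequencing data, if any


@dataclass
class Genotype:
    alleles: List[IndividualAllele]
    code: Optional[str] = None                 # e.g. "HETEROZYGOTE"

    def __post_init__(self) -> None:
        # The model allows at least one and at most three alleles.
        if not 1 <= len(self.alleles) <= 3:
            raise ValueError("a Genotype needs 1..3 alleles")


# Typical case: an allele pair (identifiers here are made up).
pair = Genotype(
    alleles=[IndividualAllele("allele-A"), IndividualAllele("allele-B")],
    code="HETEROZYGOTE",
)
```

Everything except the allele list is optional here, which mirrors the idea that early adopters may send little more than a genotype with one allele.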

Page 4: HL7 Clinical-Genomics SIG: A Shared Genotype Model

The Genotype CMET (cont.)

Design principles:
o Shared model (a reusable component in different use cases)
o Basic encapsulation of genomic data that might be used in healthcare, regardless of the use case
o Stemmed from looking for commonalities in specific use cases
o Presented as the CG SIG DIM (Domain Information Model) in ballots #6 and #8
o Most of the clones are optional, thus allowing the representation of merely a genotype with a minimum of one allele (a typical use by early adopters)
o At the same time, allows the use of finer-grained or raw genomic data, thus accommodating the more complex use cases such as tissue typing or clinical trials

Its use is currently illustrated in four R-MIMs:
o Tissue Typing
o Cystic Fibrosis
o Viral genotyping
o Pharmacogenomics

Page 5: HL7 Clinical-Genomics SIG: A Shared Genotype Model

The Genotype Model

HL7 Clinical Genomics SIG
Document: Individual Genotype DIM (to be registered as a CMET)
Subject: Genomics Data
Rev: 0.5
Date: April 24, 2004
Facilitator: Amnon Shabo (Shvo), IBM Research in Haifa

Entry point: Genotype (POCG_RM000004), the entry point to the Clinical-Genomics Genotype Model.

[R-MIM diagram. The clones, associations, constraints, and notes recoverable from it are listed below. Every clone has classCode*: <= OBS and moodCode*: <= EVN, except Method, which has classCode*: <= PROC. Diagram callouts: Individual Allele (1..3), SNP, Allele Sequence, Mutation, Proteomics, Gene Expression, Clinical Phenotype, Haplotype, Sequencing, Method, Polymorphism.]

Clones:

Genotype
  id: II [0..1]
  code: CE CWE [0..1] (e.g., HETEROZYGOTE)
  text: ED [0..1]
  effectiveTime: IVL<TS> [0..1] (the time of genotyping)

IndividualAllele
  code*: CE CWE [1..1] (allele identifier & classification, e.g. GenBank)
  text: ED [0..1]
  methodCode: SET<CE> CWE [0..*] (the method by which the code was determined)

Haplotype
  id: II [0..1]
  code: CE CWE [0..1]

Polymorphism
  id: II [0..1]
  code: CD CWE [0..1] <= ActCode
  text: ED [0..1]
  value: ANY [0..1]

SNP
  id: II [0..1]
  code: CE CWE [0..1] (SNP identifier & classification, e.g. Entrez dbSNP)
  text: ED [0..1]
  value: BAG<ED> [0..*] (the SNP itself)
  methodCode: SET<CE> CWE [0..*]

Mutation
  id: II [0..1]
  code: CE CWE (mutation identifier and classification, e.g. LOINC MOLECULAR GENETICS NAMING)
  text:

AlleleSequence
  id: II [0..1]
  code: [1..1] (the sequence standard code, e.g. BSML, GMS)
  text: (the annotated sequence)
  effectiveTime: [1..1]
  value: ED [1..1] (the actual sequence)
  methodCode: (the sequencing method)

GeneExpression
  id: II [0..1]
  code: CE CWE <= ActCode (the standard's code, e.g., MAGE-ML identifier)
  text:
  effectiveTime:
  value: ED [1..1] (the actual gene expression levels)
  methodCode:

Polypeptide
  id: II [0..1]
  code*: CE CWE [1..1] (identifier & classification of the protein, e.g., SwissProt, PDB, PIR, HUPO)
  text:

DeterminantPeptide
  id: II [0..1]
  code: CE CWE (identifier and classification of the determinant, e.g., Entrez)
  text: ED

ClinicalPhenotype
  id: II [0..1]
  code: CE CWE [0..1] (disease, allergy, sensitivity, ADE, etc.)
  text: ED [0..1]
  uncertaintyCode: CE CNE [0..1]
  value: ANY [0..1]

ExternalClinicalPhenotype
  id*: II [1..1] (the id of an external observation, e.g., in a problem list)

Method
  classCode*: <= PROC
  moodCode*: <= EVN
  id: II [0..1]
  code: CD CWE [0..1] <= ActCode (type of method)
  text: ED [0..1] (free text description of the method used)
  methodCode: SET<CE> CWE [0..*]

Associations (multiplicity, name, typeCode, relationship name), as labeled in the diagram:

1..3   individualAllele                      COMP   component
0..*   haplotype                             COMP   componentOf
0..*   pertinentPolymorphism                 PERT   pertinentInformation6
0..*   pertinentSNP                          PERT   pertinentInformation1
0..1   pertinentMutation                     PERT   pertinentInformation
0..*   pertinentMutation                     PERT   pertinentInformation4
0..1   pertinentAlleleSequence               PERT   pertinentInformation2
0..1   pertinentGeneExpression               PERT   pertinentInformation3
0..*   outcomePolypeptide                    OUTC   outcome
0..*   pertinentDeterminantPeptide           PERT   pertinentInformation2
0..*   pertinentIndividualAllele             PERT   pertinentInformation5
0..*   pertinentMethod                       PERT   pertinentInformation / pertinentInformation1
0..*   priorClinicalPhenotype                SEQL   sequelTo
0..*   referredToExternalClinicalPhenotype   x_ActRelationshipExternalReference   reference

Constraints:
o GeneExpression.value: constrained to a restricted MAGE-ML content model, specified elsewhere.
o AlleleSequence.value: constrained to a restricted BSML or GMS content model, specified elsewhere.

Notes:
o There must be at least one IndividualAllele and three at the most. The typical case would be an allele pair, one on the paternal chromosome and one on the maternal chromosome. The third allele could be present if the patient has three copies of a chromosome, as in Down syndrome.
o pertinentIndividualAllele: a related allele that is on a different haplotype and still has significant interrelation with the source allele.
o ExternalClinicalPhenotype: an external observation is a valid Observation instance existing in any other HL7-compliant artifact, e.g., a document or a message.
o ClinicalPhenotype: an observation of a clinical condition represented internally in this model.
o Shadowed observations are copies of other observations and thus have all of the original act attributes.
o Use methodCode if you don't use the associated method procedure.
o Could refine the ActRelationship typeCode to elaborate on different types of genomic-to-phenotype effects.
o Usually this is a computed outcome, i.e., the lab does not produce the actual protein.
o Mutation: the classCode should be OBSGENPOLMUT, which stands for mutation-polymorphism genomic observation, a subtype of OBSGENPOL (polymorphism genomic observation), which is a subtype of OBSGEN (genomic observation).
o SNP: the classCode should be OBSGENPOLSNP, which stands for SNP-polymorphism genomic observation, a subtype of OBSGENPOL (polymorphism genomic observation), which is a subtype of OBSGEN (genomic observation).
o Polymorphism: the classCode should be OBSGENPOL, which stands for polymorphism genomic observation, a subtype of OBSGEN (genomic observation).

Page 6: HL7 Clinical-Genomics SIG: A Shared Genotype Model

Coexistence of HL7 Objects and Bioinformatics Markup

[Diagram nodes: Genomic Data Sources; Clinical Practice; EHR System; Decision Support Applications; Knowledge (KBs, Ontologies, registries, Evidence-Based, Papers, etc.)]

[Diagram flows:]
o HL7 CG messages with mainly encapsulating HL7 objects
o HL7 CG messages with both encapsulating and specialized HL7 objects

Bubbling up the clinically significant raw genomic data into specialized HL7 objects and linking them with clinical data from the patient EHR.

Page 7: HL7 Clinical-Genomics SIG: A Shared Genotype Model

Coexistence of HL7 Objects and Bioinformatics Markup (cont.)

Sequencing example:

[Diagram nodes: DNA Lab; Genetic Counseling; EHR System; Decision Support Applications]

[Diagram flows:]
o HL7 CG messages with an AlleleSequence HL7 object encapsulating the raw sequencing results
o HL7 CG messages with both encapsulating and specialized HL7 objects

Bubbling up the clinically significant SNP data into HL7 SNP and Mutation objects and linking them with clinical data from the patient EHR.

Page 8: HL7 Clinical-Genomics SIG: A Shared Genotype Model

Coexistence of HL7 Objects and Bioinformatics Markup (cont.)

[Diagram: data is bubbled up from the encapsulating AlleleSequence object into the HL7 genomic-specialized objects. All clones have classCode*: <= OBS and moodCode*: <= EVN.]

Sequencing data encapsulated as bioinformatics markup:

AlleleSequence
  id: II [0..1]
  code: CD CWE [1..1] (the sequence standard code, e.g. BSML, GMS)
  text: ED [0..1] (sequence's annotations)
  effectiveTime: GTS [1..1]
  value: ED [1..1] (the actual sequence)
  methodCode: SET<CE> CWE [0..*] (the sequencing method)

The patient's allele:

IndividualAllele
  id: II [0..1]
  code*: CE CWE [1..1] (allele classification)
  text: ED [0..1]
  value: ANY [0..1] (e.g. accession no. in GenBank)
  methodCode: SET<CE> CWE [0..*] (the method by which the code was determined)

HL7 genomic-specialized objects:

Mutation
  id: II [0..1]
  code: CE CWE [0..1] (mutation classification)
  text: ED [0..1]
  value: ANY [0..1] (mutation code, e.g. drawn from LOINC MOLECULAR GENETICS NAMING)

SNP (shown twice in the diagram)
  id: II [0..1]
  code: CE CWE [0..1] (SNP classification, e.g. from Entrez dbSNP)
  text: ED [0..1]
  value: BAG<ED> [0..*] (the SNP itself)
  methodCode: SET<CE> CWE [0..*]

ClinicalPhenotype
  id: II [0..1]
  code: CE CWE [0..1] (e.g., disease, allergy, sensitivity, ADE, etc.)
  text: ED [0..1]
  value: ANY [0..1]
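One way to picture "bubbling up" is a step that scans the encapsulated sequence and emits small SNP-like records. The slides do not say how this step is performed, so the sketch below is purely hypothetical: the function name, the record keys, and the position-by-position comparison against a reference sequence are all assumptions made for illustration.

```python
from typing import Dict, List


def bubble_up_snps(reference: str, observed: str) -> List[Dict[str, object]]:
    """Return one SNP-like record per single-base difference (hypothetical helper)."""
    if len(reference) != len(observed):
        raise ValueError("this sketch only compares equal-length sequences")
    return [
        {"position": i + 1, "reference": ref, "observed": obs}  # 1-based position
        for i, (ref, obs) in enumerate(zip(reference, observed))
        if ref != obs
    ]


# The patient's sequence differs from the reference at the sixth base.
snps = bubble_up_snps("ACGTCGGTTCA", "ACGTCAGTTCA")
```

In a message, each record would presumably populate an SNP object associated with the IndividualAllele, while the full sequence stays in AlleleSequence.value.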

Page 9: HL7 Clinical-Genomics SIG: A Shared Genotype Model

The Family History Model

FamilyHistory (COCT_RM999999): this model is intended to be a CMET and has the capability of representing any part of the patient pedigree.

[R-MIM diagram. The clones, associations, and notes recoverable from it are listed below.]

Clones:

Patient (entry point)
  classCode*: <= PAT
  id*: SET<II> [1..*]

Person
  classCode*: <= PSN
  determinerCode*: <= INSTANCE
  id: SET<II> [0..*] (e.g., SSN)
  name: BAG<EN> [0..*]
  telecom: BAG<TEL> [0..*]
  administrativeGenderCode: CE CWE [0..1] <= AdministrativeGender
  birthTime: TS [0..1]
  deceasedInd: BL [0..1] "false"
  deceasedTime: TS [0..1]
  raceCode: SET<CE> CWE [0..*] <= Race
  ethnicGroupCode: SET<CE> CWE [0..*] <= Ethnicity

PersonalRelationship
  classCode*: <= PRS
  id: SET<II> [0..*] (use this attribute to hold the pedigree ID)
  code: CE CWE [0..1] <= RoleCode "FAMMEMB"

ClinicalGenomicChoice (a choice between the two clones below)

  Genotype (Genotype CMET)
    classCode*: <= OBS
    moodCode*: <= EVN

  ClinicalStatement
    classCode*: <= ACT
    moodCode*: <= EVN
    code: CD CWE [1..1] <= ActCode
    negationInd: BL [0..1]
    effectiveTime: GTS [0..1]
    confidentialityCode: SET<CE> CWE [0..*] <= Confidentiality
    uncertaintyCode: CE CNE [0..1] <= ActUncertainty

Associations, as labeled in the diagram:

1..1   patientPerson
0..1   relationshipHolder
0..*   relationshipHolder (PersonalRelationship)
0..*   clinicalGenomicChoice   typeCode*: <= SBJ   subjectOf1
0..*   subjectOf

Notes:
o PersonalRelationship: first-degree relatives. FAMMEMB could be used for unidentified relatives, but so could any of the more specific codes like PRN (parent) or NMTH (natural mother).
o A shadow of Person allows for recursive representation of any higher degree of relation, e.g., grandfather, through the same clone (PersonalRelationship) nesting in Person.
o Person holds details that are not specific to the family role. Person is also the scoper of the relative roles (for more details see the V3 RoleCode vocabulary, domain = PersonalRelationshipRoleType).
o ClinicalStatement should be replaced with a generic clinical statement CMET so that it is capable of holding any pertinent clinical data of the patient or his/her relative.
o Genotype should be replaced with the Clinical-Genomics Genotype model (as a CMET) to deal with all types of genomic data.
o The Clinical Genomics choice attached to a relative is similar to the choice associated with the Patient role (the entry point of this model).
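The recursive nesting described in the notes (Person holds PersonalRelationship roles, each of which is played by another Person) can be sketched as follows. This is a minimal illustration, not the HL7 representation: the Python class names mirror the clones, but the `relatives` and `holder` fields, the `walk_pedigree` helper, and the sample names are hypothetical. The role codes NMTH and PRN are the ones mentioned on the slide.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Person:
    name: str
    relatives: List["PersonalRelationship"] = field(default_factory=list)


@dataclass
class PersonalRelationship:
    code: str        # RoleCode, e.g. "FAMMEMB", "PRN", "NMTH"
    holder: Person   # the relative; may carry relatives of its own


def walk_pedigree(
    person: Person, path: Tuple[str, ...] = ()
) -> Iterator[Tuple[Tuple[str, ...], str]]:
    """Yield (chain of role codes, relative name) for every nested relative."""
    for rel in person.relatives:
        chain = path + (rel.code,)
        yield chain, rel.holder.name
        yield from walk_pedigree(rel.holder, chain)


# A grandparent is reached by nesting: patient -> natural mother -> parent.
grandparent = Person("relative-2")
mother = Person("relative-1", [PersonalRelationship("PRN", grandparent)])
patient = Person("patient", [PersonalRelationship("NMTH", mother)])
pedigree = list(walk_pedigree(patient))
```

No dedicated "grandparent" code is needed in this sketch; the degree of relation follows from the length of the role-code chain.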

Page 10: HL7 Clinical-Genomics SIG: A Shared Genotype Model

Family History – Harmonization Proposals

o Age:
  o Age of subject when subject's diagnosis was made
  o Age at time of death
  o Proposed solution: a new data type to refer to from effectiveTime:

    <effectiveTime xsi:type="TSR"> <!-- TSR = Time Stamp Relative -->
      <epoch code="B"/>
      <offset value="20" unit="mo"/>
    </effectiveTime>

o Vocabulary proposals:
  o Observation Interpretation (Deleterious, Unknown significance, Polymorphism, No mutation)
  o Personal relation codes and qualifiers
o Personal Relationship association names:
  o A naming algorithm problem (HL7 tooling issue)
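The proposed relative time stamp can be produced programmatically. The sketch below builds an element shaped like the slide's example using Python's standard `xml.etree.ElementTree`; the function name `relative_time` is an assumption, the XML Schema instance namespace declaration is added so the fragment parses on its own, and the slide does not spell out the meaning of epoch code "B", so it is passed through as an opaque code.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"


def relative_time(epoch_code: str, value: int, unit: str) -> str:
    """Serialize an effectiveTime element shaped like the slide's TSR example."""
    elem = ET.Element("effectiveTime", {"xmlns:xsi": XSI, "xsi:type": "TSR"})
    ET.SubElement(elem, "epoch", {"code": epoch_code})
    ET.SubElement(elem, "offset", {"value": str(value), "unit": unit})
    return ET.tostring(elem, encoding="unicode")


# Same values as the example on the slide: epoch "B", offset of 20 months.
xml_text = relative_time("B", 20, "mo")
parsed = ET.fromstring(xml_text)
```

Because TSR is only a proposal in this deck, the element and attribute names should be read as provisional.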

Page 11: HL7 Clinical-Genomics SIG: A Shared Genotype Model

The Genotype Model in Tissue Typing

[Diagram: BMT Tissue Typing. Labels: BMT Ward; Tissue-Typing Lab; Donor Banks; Tissue Typing Observation; Genotype; Allele; SNP; Haplotype; Individual1 HLA; Individual2 HLA; Matching]

Page 12: HL7 Clinical-Genomics SIG: A Shared Genotype Model

How the Genotype Fits into Tissue Typing

Tissue typing in the context of bone-marrow transplantation:

[Diagram labels: BMT Center; Donor Bank; BMT unique Order/Entry; Tissue Typing Observation]

Page 13: HL7 Clinical-Genomics SIG: A Shared Genotype Model

How the Genotype Fits into Tissue Typing

[R-MIM diagrams: TTObservation (UUDD_RMnnnnnn), the Tissue-Typing Observation, and TT-Matching (UUDD_RMnnnnnn). Diagram callouts: Single Tissue Typing Observation; Class I Antigens; Class II Antigens; Tissue Typing Matching Observation. The Genotype model is used for each HLA antigen.]

Clones:

TissueTypingObservation
  classCode*: <= OBS
  moodCode*: <= EVN
  id:
  code: CS CWE <= TissueTypingTestingClass

Class I Antigens
  classCode*: <= OBS
  moodCode*: <= EVN
  code: CD CWE [0..1] <= ActCode

Class II Antigens
  classCode*: <= OBS
  moodCode*: <= EVN
  code: CD CWE [0..1] <= ActCode

Genotype CMET
  classCode*: <= OBS
  moodCode*: <= EVN

TissueTypingMatchingObservation
  classCode*: <= OBS
  moodCode*: <= EVN
  code: CS CWE <= TissueTypingMatchingClass
  text:

LocusMatching
  classCode*: <= OBS
  moodCode*: <= EVN
  code: CS CWE <= TissueTypingLocusMatchingClass
  text:

TissueTypingResultLetter
  classCode*: <= DOCCLIN
  moodCode*: <= EVN
  code: CE CWE <= TissueTypingDocumentType

TT-TestingLab
  classCode*: <= QUAL

TissueTypingFacility
  classCode*: <= ENT
  determinerCode*: <= INSTANCE

SubjectChoice
  Donor: classCode*: <= ROL
  CMET: (PAT) R_Patient [universal] (COCT_MT050000)

Person
  classCode*: <= PSN
  determinerCode*: <= INSTANCE

Associations (multiplicity, name, typeCode, relationship name), as labeled in the diagrams:

0..1   class I Antigens               COMP   component2
0..1   class II Antigens              COMP   component1
1..*   hLA_AntigenGenotype            COMP   component
0..*   locusMatching                  COMP   component
1..1   priorTissueTypingObservation   SEQL   sequelTo / sequelTo1
0..*   tissueTypingResultLetter       DOC    documentationOf
0..1   tT-TestingLab                  PPRF   primaryPerformer
0..1   participant                    SBJ    subject
0..1   playingPerson

Constraint:
o LocusMatching: the number of LocusMatching observations depends on the number of loci examined in the tissue-typing testing.

Notes:
o The number of genotypes depends on the number of loci examined in each HLA class (usually, class I includes the A, B and C antigens and class II includes the DR antigen family).
o TissueTypingLocusMatchingClass should be a new vocabulary in HL7 (may use the recent NMDP effort).
o TissueTypingMatchingClass should be a new vocabulary in HL7, e.g., 2-haplotype match.
o Genotype CMET: this module is developed by the Clinical-Genomics SIG. It will be registered as a CMET, but for now it appears here as an observation. For details, see the Genotype R-MIM. All genomic data are encapsulated in this CMET, including mutations, which are the essence of CF testing, for example.

Page 14: HL7 Clinical-Genomics SIG: A Shared Genotype Model

Tissue Typing Scenario Simulation

o Real case with a Hutch patient and sibling, and unrelated donor candidates in Hadassah
o Information exchange is simulated through a series of XML files that follow the TT storyboard activity diagram and use the HL7 R-MIMs plus the Genotype CMET
o Documented in HL7-Clinical-Genomics-TissueTypingInfoExchangeSimulation.doc; contact Amnon Shabo to get the document ([email protected])

Page 15: HL7 Clinical-Genomics SIG: A Shared Genotype Model

The Genotype Model in Cystic Fibrosis

[Diagram labels: Entry Point: Blood Sample; Patient; DNA; Molecular Genetic lab; Genotype CMET; MGS Report; MLG Counselor; ML Consultant; Provider EMR System]

Page 16: HL7 Clinical-Genomics SIG: A Shared Genotype Model

The Genotype Model in Viral Genotyping

[Diagram labels: Entry Point: Specimen; Patient; Pathogen; Viral DNA Sequencing; Viral DNA Regions; Genotype CMET; DNA Lab; Test Panel; Sponsor; Report; Resistance Profile]

Page 17: HL7 Clinical-Genomics SIG: A Shared Genotype Model

The Genotype Model in Pharmacogenomics-Based Clinical Trial & Submission

[Diagram labels: Pharmacogenomics testing; Patient; Gene Selection; Genotype CMET; Genomic data; Submission; Sponsor; CRO (twice); Report; Regulator; Data Validation; Analysis device; Data Analysis; Trial design; SNP/Hap Discovery]

Page 18: HL7 Clinical-Genomics SIG: A Shared Genotype Model

Constrained-BSML Schema

o BSML: Bioinformatics Sequence Markup Language
o Aimed at any biological sequence, for example:
  o DNA
  o RNA
  o Protein
o Constraining the BSML DTD to fit healthcare needs:
  o Leave out research and display markup
  o Ensure the patient identification
o Creating an XML Schema, set up as the content model of an HL7 attribute of type ED

Page 19: HL7 Clinical-Genomics SIG: A Shared Genotype Model

Constrained-MAGE-ML Schema

o Cope with data outside of the XML (referenced)

Shared issues:
o Eliminate research and display elements, and require the presence of certain elements, for example patient identifiers
o Require that one and only one patient be the subject of the data, to avoid bringing data of another patient into the HL7 message
o Require that the data refer to only the one allele with which the encapsulating HL7 object is associated
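The one-patient requirement could be checked before a markup payload is placed into an HL7 message. The sketch below is an assumption-laden illustration: the element name `patient-id` and the sample payloads are hypothetical and are not taken from BSML or MAGE-ML, whose constrained schemas are specified elsewhere.

```python
import xml.etree.ElementTree as ET


def check_single_patient(markup: str, patient_tag: str = "patient-id") -> str:
    """Return the one patient identifier, or raise if there is not exactly one.

    `patient_tag` is a hypothetical element name used only for this sketch.
    """
    root = ET.fromstring(markup)
    ids = {
        el.text.strip()
        for el in root.iter(patient_tag)
        if el.text and el.text.strip()
    }
    if len(ids) != 1:
        raise ValueError(f"expected exactly one patient, found {len(ids)}")
    return ids.pop()


# A payload that refers to a single patient passes the check.
ok = check_single_patient(
    "<data><patient-id>P1</patient-id><seq>ACGT</seq></data>"
)
```

A schema-level constraint would normally enforce this; the runtime check only shows what the rule means for a concrete payload.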

Page 20: HL7 Clinical-Genomics SIG: A Shared Genotype Model

OBS Specialization Examples

o PublicHealthCase
  o detectionMethodCode :: CE
  o transmissionModeCode :: CE
  o diseaseImportedCode :: CE
o Diagnostic Image
  o subjectOrientationCode :: CE

The above examples are relatively simple compared with the unique attributes of a genomic observation.

Proposal: add a genomic specialization to the RIM Observation class.

Rationale: it has additional attributes that are unique to genomics (LSID, bioinformatics markup, etc.).

Page 21: HL7 Clinical-Genomics SIG: A Shared Genotype Model

Genomic Specializations of Observation

[Class hierarchy diagram; attributes in parentheses]

GenomicObservation (LSID)
  Polymorphism (type, position, length, reference, region)
    SNP (tagSNP)
    Mutation (knownAssociatedDiseases, not the actual phenotype)
  Gene Expression (MAGE)
  Bio Sequence (BSML)

Page 22: HL7 Clinical-Genomics SIG: A Shared Genotype Model

New Class Codes Proposal

classCode       Class name
OBSGEN          GenomicObservation
OBSGENPOL       Polymorphism
OBSGENPOLMUT    Mutation
OBSGENPOLSNP    SNP
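The proposed codes form a small subtype hierarchy, which a receiver could use to decide whether an observation is, for instance, any kind of polymorphism. The sketch below encodes the parent links stated in the model notes (OBSGENPOLMUT and OBSGENPOLSNP under OBSGENPOL, OBSGENPOL under OBSGEN); placing OBSGEN directly under OBS is an assumption based on the proposal to specialize the RIM Observation class, and the names `PARENT` and `is_a` are hypothetical.

```python
from typing import Dict, Optional

# Child code -> parent code. OBSGEN -> OBS is assumed, the rest is from the slides.
PARENT: Dict[str, Optional[str]] = {
    "OBS": None,
    "OBSGEN": "OBS",
    "OBSGENPOL": "OBSGEN",
    "OBSGENPOLMUT": "OBSGENPOL",
    "OBSGENPOLSNP": "OBSGENPOL",
}


def is_a(code: str, ancestor: str) -> bool:
    """True if `code` equals `ancestor` or is one of its subtypes."""
    current: Optional[str] = code
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT[current]
    return False


snp_is_polymorphism = is_a("OBSGENPOLSNP", "OBSGENPOL")
mutation_is_snp = is_a("OBSGENPOLMUT", "OBSGENPOLSNP")
```

An unknown code raises `KeyError` here, which seems preferable to silently treating it as unrelated.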

Page 23: HL7 Clinical-Genomics SIG: A Shared Genotype Model

New Attributes Proposal

o GenomicObservation: LSID identifier
o AlleleSequence: moleculeSequence, a constrained XML markup based on the BSML markup
o Polymorphism:
  o type (SNP, Mutation, Other)
  o position (the position of the polymorphism)
  o length (the length of the polymorphism)
  o reference (the base reference for the above attributes)
  o region (when the polymorphism scope is a specific gene region)
o SNP: Tag SNP, a Boolean field indicating whether this SNP is part of a small SNP set that determines a SNP haplotype
o GeneExpression: expressionLevels, a constrained XML markup based on the MAGE markup
o Proteomic clones: TBD

Page 24: HL7 Clinical-Genomics SIG: A Shared Genotype Model

Proposed HL7 Vocabularies

Genomics vocabularies:
o Polymorphism:
  o General types (SNP, Mutation, Sequence Variation)
  o Nucleotide-based types (substitution, insertion, deletion, etc.)
o Alleles relation (recessive / dominant, homozygote / heterozygote)
o Genotype-to-phenotype types of effects
o Genomic observation interpretation (Deleterious, Unknown significance, Polymorphism, No mutation)
o SequencingMethodCode (example in next slide)

Page 25: HL7 Clinical-Genomics SIG: A Shared Genotype Model

HL7 Vocabulary Example

SequencingMethodCode:

SSOPH   Sequence specific oligonucleotide probe hybridization
SSP     Sequence specific primers
SBT     Sequence-based typing
RSCA    Reference strand conformation analysis
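A vocabulary like this maps naturally onto an enumeration, so that a methodCode value can be validated and displayed. The sketch below uses the four codes and display names from the slide; the Python names `SequencingMethodCode` (as an `Enum`) and `display_name` are illustrative choices, not part of any HL7 tooling.

```python
from enum import Enum


class SequencingMethodCode(Enum):
    SSOPH = "Sequence specific oligonucleotide probe hybridization"
    SSP = "Sequence specific primers"
    SBT = "Sequence-based typing"
    RSCA = "Reference strand conformation analysis"


def display_name(code: str) -> str:
    """Look up the display name for a code; unknown codes raise KeyError."""
    return SequencingMethodCode[code].value


sbt_name = display_name("SBT")
```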

Page 26: HL7 Clinical-Genomics SIG: A Shared Genotype Model

Proposed HL7 Vocabularies (cont.)

Tissue typing related vocabularies:
o TissueTypingLocusMatchingClass
o TissueTypingMatchingClass
o TissueTypingTestingClass
o TissueTypingTestingMethod
o TissueTypingDocumentType
o TissueTypingOrderClass
o DonorType (allogeneic, autologous, etc.)
o Class I & II antigens classification

Page 27: HL7 Clinical-Genomics SIG: A Shared Genotype Model

XML Examples

Genotype examples:
o GenotypeSample1.xml
  A genotype of two HLA alleles in the B locus
o GenotypeSample2.xml
  A genotype of two HLA alleles in the B locus, along with a SNP designation in the first allele

Tissue Typing Observation examples:
o TissueTypingObservationSample1.xml
  Consists of a single tissue typing observation of a patient or a donor
o TissueTypingObservationSample2.xml
  Consists of two tissue typing observations of a patient and a donor, leading to a tissue typing matching observation

Donor search examples:
o TissueTypingDonorBankSample1.xml
  Illustrates an unsolicited message from a BMT Center to a donor bank, sending a patient's tissue typing observation for the purpose of searching for an appropriate donor

Page 28: HL7 Clinical-Genomics SIG: A Shared Genotype Model

Next Steps

o HL7:
  o Formally submit our harmonization proposals
  o Continue with two alternatives until harmonization is resolved
  o Register the Genotype and Family History models as CMETs
  o Handcraft sample instances (for review and experimental use)
  o Derive a Genetic Testing model from the HL7 Lab SIG models
o Vocabularies:
  o HL7: develop
  o External: get HL7 to recognize them
o Constraining bioinformatics markup (continue the effort and include markup in the next ballot):
  o MAGE-ML or MIAME
  o BSML (done)
  o caBIO (?)

Page 29: HL7 Clinical-Genomics SIG: A Shared Genotype Model

Linking to the NCI Rembrandt Model

[UML class diagram. Classes and labels: Genotype; Allele; Haplotype; Chromosome; Polymorphism; SNP; LengthPolymorphism; SNPFrequency; Gene; Clone; Probe; MapLocation; signalValue (from GeneExpression); <<Interface>>; ClinicalPhenotype (from Clinical); Population (from Population). Named association ends: +PaternalMaternalAllele and +ExtraAllele. The multiplicity labels cannot be matched to their associations in this transcript.]

Use-case-driven modeling, designed with the HL7 Genotype model as a starting point; it will eventually extend the caBio model.

Page 30: HL7 Clinical-Genomics SIG: A Shared Genotype Model

Alternative Genotype Models

A model without genomic specializations of the HL7 RIM Observation class:

HL7 Clinical Genomics SIG
Document: Individual Genotype DIM (to be registered as a CMET) - Genomic Attributes as HL7 Clones
Subject: Genomics Data
Rev: 0.17
Date: September 14, 2004
Facilitator: Amnon Shabo (Shvo), IBM Research in Haifa, [email protected]

Entry point: Genotype (POCG_RM000004), the entry point to the Clinical-Genomics Genotype Model.

[R-MIM diagram. The clones, associations, constraints, and notes recoverable from it are listed below. Unless stated otherwise, each clone has classCode*: <= OBS and moodCode*: <= EVN. Diagram callouts: PolymorphismAttributes Container; PolymorphismAttributes shadow associated with Mutation.]

Clones:

Genotype
  id: II [0..1]
  code: CE CWE [0..1] (e.g., HETEROZYGOTE)
  text: ED [0..1]
  effectiveTime: IVL<TS> [0..1] (the time of genotyping)

IndividualAllele
  id: II [0..1]
  code*: CE CWE [1..1] (allele classification)
  text: ED [0..1]
  value: ANY [0..1] (e.g. accession no. in GenBank)
  methodCode: SET<CE> CWE [0..*] (the method by which the code was determined)

Haplotype
  id: II [0..1]
  code: CE CWE [0..1]
  value: ANY [0..1]

Polymorphism
  id: II [0..1]
  code: CD CWE [0..1] <= ActCode
  text: ED [0..1]
  value: ANY [0..1]

SNP
  id: II [0..1]
  code: CE CWE [0..1] (SNP classification, e.g. from Entrez dbSNP)
  text: ED [0..1]
  value: BAG<ED> [0..*] (the SNP itself)
  methodCode: SET<CE> CWE [0..*]

Mutation
  id: II [0..1]
  code: CE CWE [0..1] (mutation classification)
  text: ED [0..1]
  value: ANY [0..1] (mutation code, e.g. drawn from LOINC MOLECULAR GENETICS NAMING)

AlleleSequence
  id: II [0..1]
  code: CD CWE [1..1] (the sequence standard code, e.g. BSML, GMS)
  text: ED [0..1] (sequence's annotations)
  effectiveTime: GTS [1..1]
  value: ED [1..1] (the actual sequence)
  methodCode: SET<CE> CWE [0..*] (the sequencing method)

GeneExpression
  id: II [0..1]
  code: CE CWE <= ActCode (the standard's code, e.g., MAGE-ML identifier)
  text:
  effectiveTime:
  value: ED [1..1] (the actual gene expression levels)
  methodCode:

Polypeptide
  id: II [0..1]
  code*: CE CWE [1..1] (classification of the protein, e.g., SwissProt, PDB, PIR, HUPO)
  text: ED [0..1]
  value: ANY [0..1]

DeterminantPeptide
  id: II [0..1]
  code: CE CWE [0..1] (classification of the determinant)
  text: ED [0..1]
  value: ANY [0..1]

ClinicalPhenotype
  id: II [0..1]
  code: CE CWE [0..1] (e.g., disease, allergy, sensitivity, ADE, etc.)
  text: ED [0..1]
  value: ANY [0..1]

ExternalClinicalPhenotype
  id*: II [1..1] (the id of an external observation, e.g., in a problem list)

Method
  classCode*: <= PROC
  moodCode*: <= EVN
  id: II [0..1]
  code: CD CWE [0..1] <= ActCode (type of method)
  text: ED [0..1] (free text description of the method used)
  methodCode: SET<CE> CWE [0..*]

PolymorphismAttributes
  classCode*: <= ActContainer
  moodCode*: <= EVN

PolyType
  id: II [0..1]
  value: CE CWE [1..1]

PolyLength
  id: II [0..1]
  value: INT [1..1]

PolyPosition
  id: II [0..1]
  value: INT [1..1]

PolyReference
  id: SET<II> [0..*]
  value: ED [0..1]

PolyRegion
  id: SET<II> [0..*]
  value: ED [0..1]

knownAssociatedDiseases
  classCode*: <= OBS
  moodCode*: <= DEF
  code: CD CWE [0..1] <= ActCode
  text: ED [0..1]
  value: ANY [0..1]

tagSNP
  classCode*: <= OBS
  moodCode*: <= DEF

translationalData
  id: II [0..1]
  code: CD CWE [0..1] <= ActCode
  value: ANY [0..1]

Associations (multiplicity, name, typeCode, relationship name), as labeled in the diagram:

1..3   individualAllele                      COMP   component
0..*   haplotype                             COMP   componentOf
0..*   polymorphism                          SUBJ   subject4
0..*   sNP                                   SUBJ   subject6
0..*   mutation                              SUBJ   subject
0..1   priorMutation                         SEQL   sequelTo
0..1   alleleSequence                        SUBJ   subject7
0..1   geneExpression                        SUBJ   subject5
0..*   causePolypeptide                      MFST   manifestationOf
0..*   causeDeterminantPeptide               MFST   manifestationOf
0..*   derivedDeterminantPeptide             DRIV   derivation
0..*   method                                SUBJ   subject
0..*   pertinentMethod                       PERT   pertinentInformation
0..*   referredToIndividualAllele            REFR   reference
0..*   causedClinicalPhenotype               CAUS   causeOf
0..*   referredToExternalClinicalPhenotype   x_ActRelationshipExternalReference   reference
0..1   polymorphismAttributes                SUBJ   subject / subject9
0..1   polyType                              COMP   component4
0..1   polyLength                            COMP   component5
0..1   polyPosition                          COMP   component6
0..1   polyReference                         COMP   component7
0..1   polyRegion                            COMP   component8
0..*   riskKnownAssociatedDiseases           RISK   risk
0..1   tagSNP                                SUBJ   subject
0..*   pertinenttranslationalData            PERT   pertinentInformation

Constraints:
o GeneExpression.value: constrained to a restricted MAGE-ML content model, specified elsewhere.
o AlleleSequence.value: constrained to a restricted BSML content model, specified elsewhere.
o translationalData.value: constrained to a restricted caBio content model, specified elsewhere.

Notes:
o There must be at least one IndividualAllele and three at the most. The typical case would be an allele pair, one on the paternal chromosome and one on the maternal chromosome. The third allele could be present if the patient has three copies of a chromosome, as in Down syndrome.
o referredToIndividualAllele: a related allele that is on a different haplotype and still has significant interrelation with the source allele.
o ExternalClinicalPhenotype: an external observation is a valid Observation instance existing in any other HL7-compliant artifact, e.g., a document or a message.
o ClinicalPhenotype: an observation of a clinical condition represented internally in this model.
o Shadowed observations are copies of other observations and thus have all of the original act attributes as well as all 'outbound' associations.
o Use methodCode if you don't use the associated method procedure.
o Should refine the ActRelationship typeCode to elaborate on different types of genomic-to-phenotype interrelations.
o Polypeptide: this might be a computed outcome, i.e., the lab does not provide the actual protein, but secondary processes populate this clone with the translational protein.
o PolymorphismAttributes: a container of common polymorphism attributes. A code attribute was not added to any of the polymorphism attribute clones, as this seems to be implicit from the clone name.
o knownAssociatedDiseases: these diseases are not the actual phenotype for the patient; rather, they are the known risks of this mutation.
o tagSNP: the presence of this clone indicates that the source SNP clone is a tag SNP (note that it has a DEF mood).
o Mutation: the classCode should be OBSGENPOLMUT, which stands for mutation-polymorphism genomic observation.
o SNP: the classCode should be OBSGENPOLSNP, which stands for SNP-polymorphism genomic observation.
o Polymorphism: the classCode should be OBSGENPOL, which stands for polymorphism genomic observation, a subtype of OBSGEN (genomic observation).

Page 31: HL7 Clinical-Genomics SIG: A Shared Genotype Model

Comments received on the Genotype Model

o Revalidate/collapse the polymorphism hierarchy:
  o Add a RIM class "SequenceVariance" representing all types of polymorphisms
  o Type could be placed in the code attribute
  o 'position' and 'length' could be parts of a boundary in a RegionOfInterest type of Observation
  o Could represent any bio-sequence (DNA, RNA, Protein, etc.)
o Patient data vs. generic knowledge:
  o tagSNP, knownAssociatedDiseases and haplotype are a type of knowledge
  o Should they only be referenced (pointing to KBs)?
o Types of relationships between the various Genotype observations:
  o Pertinent, Component, Subject, …?
  o It's tricky, as it should apply to the observations and not to the observed entities

Page 32: HL7 Clinical-Genomics SIG: A Shared Genotype Model

Comments on the Genotype Model (cont.)

o Distinguishing the encapsulating objects from the bubbled-up ones:
  o Associate encapsulated objects with bubbled-up objects, with options: XFRM (transformation), XCRPT (excerpt), SUMM (summary), DRIV (derived from)… what's best?
o Should the Method object be in DEF mood?
  o Could it be that there is a need to describe a method per patient?
o Is the SNP-Mutation association useful?
  o Changed the association type to XFRM to demonstrate a possible "bubbled-up" association, i.e., a SNP was encountered as a mutation

Page 33: HL7 Clinical-Genomics SIG: A Shared Genotype Model

SLIST Data Type

Table 37: Components of Sampled Sequence

Name     Type        Description
origin   T           The origin of the list item value scale, i.e., the physical quantity that a zero-digit in the sequence would represent.
scale    T.diff      A ratio-scale quantity that is factored out of the digit sequence.
digits   list<int>   A sequence of raw digits for the sample values. This is typically the raw output of an A/D converter.

Use HL7 data types to represent bio-sequences:
o SLIST<CV> (applied to CV = Coded Value) could hold either of the following:
  o ACGTCGGTTCA…
  o Leu-Ala-Met-Gly-Ala-…
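The three components in the table imply a simple decoding rule for numeric samples: each value is the origin plus the scale times the raw digit. The sketch below shows that rule only; the class name `SampledSequence` and the sample numbers are illustrative, and how SLIST<CV> would map digits to coded values such as nucleotides is not specified on the slide, so it is not attempted here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SampledSequence:
    origin: float      # the quantity a zero digit represents
    scale: float       # the factor taken out of the digit sequence
    digits: List[int]  # raw sample digits

    def values(self) -> List[float]:
        """Decode the samples as origin + scale * digit."""
        return [self.origin + self.scale * d for d in self.digits]


# Digits 0, 2, 4 with origin 0.0 and scale 0.5.
samples = SampledSequence(origin=0.0, scale=0.5, digits=[0, 2, 4]).values()
```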

Page 34: HL7 Clinical-Genomics SIG: A Shared Genotype Model

Issues with just SequenceVariation…

o SNP:
  o The link to Haplotype is valid only for the SNP type of Polymorphism
  o tagSNP is valid only for SNP
o Mutation:
  o code and value are constrained to LOINC or another medical-oriented taxonomy rather than to an LS taxonomy as in Polymorphism
  o The attribute knownAssociatedDiseases moves to the phenotype choice, so it's resolved
o The SNP-Mutation association now needs a recursive association within SequenceVariation
o Technical issue: cannot shadow a choice box

Page 35: HL7 Clinical-Genomics SIG: A Shared Genotype Model

The End…

Thank you…