HL7 Clinical-Genomics SIG: A Shared Genotype Model
HL7 V3 Compliant
HL7 Clinical-Genomics SIG Facilitator
Amnon Shabo (Shvo)
IBM Research Lab in Haifa
Atlanta, September 2004
Haifa Research Lab
Current Work
[Diagram labels: four Clinical-Genomics Storyboards (Tissue Typing, Cystic Fibrosis, Pharmacogenomics, BRCA); Genotype Shared Model; Family History; ClinicalStatement Shared Model]
The Genotype CMET
Represents genomic data in HL7 RIM classes
Not meant to be a biological model
Concise and targeted at healthcare use for personalized medicine
Consists of:
o A Genotype (entry point)
o 1..3 alleles
o Polymorphisms
o Mutations
o SNPs
o Haplotypes
o DNA Sequencing
o Gene expression
o Proteomics
o Phenotypes (clinical data such as diseases, allergies, etc.)
The Genotype CMET (cont.)
Design Principles:
o Shared model (a reusable component in different use cases)
o Basic encapsulation of genomic data that might be used in healthcare regardless of the use case
o Stemmed from looking for commonalities in specific use cases
o Presented as the CG SIG DIM (Domain Information Model) in ballots #6 and #8
o Most of the clones are optional, thus allowing the representation of merely a genotype with a minimum of one allele (a typical use by early adopters)
o At the same time, allows the use of finer-grain / raw genomic data, thus accommodating the more complex use cases such as tissue typing or clinical trials
Its use is currently illustrated in four R-MIMs:
o Tissue Typing
o Cystic Fibrosis
o Viral genotyping
o Pharmacogenomics
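The "mostly optional clones" principle can be pictured with a small sketch. It is not an HL7 artifact: the Python class names mirror the clone names, while the field names, the allele names, and the SNP identifier are illustrative assumptions. Only the 1..3 allele rule comes from the model itself.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IndividualAllele:
    code: str                                            # allele identifier (illustrative value)
    snps: List[str] = field(default_factory=list)        # optional finer-grain data
    mutations: List[str] = field(default_factory=list)   # optional finer-grain data


@dataclass
class Genotype:
    alleles: List[IndividualAllele]
    code: Optional[str] = None                           # e.g. "HETEROZYGOTE"

    def __post_init__(self) -> None:
        # The model requires at least one and at most three IndividualAllele clones.
        if not 1 <= len(self.alleles) <= 3:
            raise ValueError("a Genotype needs between 1 and 3 alleles")


# Early-adopter use: a genotype with a single allele and nothing else.
minimal = Genotype(alleles=[IndividualAllele(code="HLA-B*0702")])

# Richer use: an allele pair, one allele carrying a (hypothetical) SNP identifier.
pair = Genotype(
    alleles=[
        IndividualAllele(code="HLA-B*0702", snps=["rs0000001"]),
        IndividualAllele(code="HLA-B*4402"),
    ],
    code="HETEROZYGOTE",
)
```

Both instances pass the same validation, which is the point of keeping every clone other than the allele optional.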
0..1 pertinentMutation
typeCode*: <= PERT
pertinentInformation
0..1 pertinentGeneExpression
typeCode*: <= PERT
pertinentInformation3
0..* pertinentPolymorphism
typeCode*: <= PERT
pertinentInformation6
IndividualAllele
classCode*: <= OBS
moodCode*: <= EVN
code*: CE CWE [1..1] (allele identifier & classification, e.g. GenBank)
text: ED [0..1]
methodCode: SET<CE> CWE [0..*] (the method by which the code was determined)

SNP
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE [0..1] (SNP identifier & classification, e.g. Entrez dbSNP)
text: ED [0..1]
value: BAG<ED> [0..*] (the SNP itself)
methodCode: SET<CE> CWE [0..*]

Haplotype
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE [0..1]

Genotype
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE [0..1] (e.g., HETEROZYGOTE)
text: ED [0..1]
effectiveTime: IVL<TS> [0..1] (the time of genotyping)
0..* haplotype
typeCode*: <= COMP
componentOf
1..3 individualAllele
typeCode*: <= COMP
component
0..* pertinentSNP
typeCode*: <= PERT
pertinentInformation1

AlleleSequence
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: [1..1] (the sequence standard code, e.g. BSML, GMS)
text: (the annotated sequence)
effectiveTime: [1..1]
value: ED [1..1] (the actual sequence)
methodCode: (the sequencing method)
0..1 pertinentAlleleSequence
typeCode*: <= PERT
pertinentInformation2

GeneExpression
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE <= ActCode (the standard's code, e.g., MAGE-ML identifier)
text:
effectiveTime:
value: ED [1..1] (the actual gene expression levels)
methodCode:

Polypeptide
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code*: CE CWE [1..1] (identifier & classification of the protein, e.g., SwissProt, PDB, PIR, HUPO)
text:
0..* outcomePolypeptide
typeCode*: <= OUTC
outcome
DeterminantPeptide
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE (identifier and classification of the determinant, e.g., Entrez)
text: ED
0..* pertinentDeterminantPeptide
typeCode*: <= PERT
pertinentInformation2
Mutation
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE (mutation identifier and classification, e.g. LOINC MOLECULAR GENETICS NAMING)
text:
0..* pertinentMutation
typeCode*: <= PERT
pertinentInformation4
ClinicalPhenotype
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE [0..1] (disease, allergy, sensitivity, ADE, etc.)
text: ED [0..1]
uncertaintyCode: CE CNE [0..1]
value: ANY [0..1]

HL7 Clinical Genomics SIG
Document: Individual Genotype DIM (to be registered as a CMET)
Subject: Genomics Data
Rev: 0.5
Date: April 24, 2004
Facilitator: Amnon Shabo (Shvo), IBM Research in Haifa

Note: There must be at least one IndividualAllele and three at the most. The typical case would be an allele pair, one on the paternal chromosome and one on the maternal chromosome. The third allele could be present if the patient has three copies of a chromosome, as in Down syndrome.
Mutation
0..* haplotype
typeCode*: <= COMP
componentOf
Constraint: GeneExpression.value
Constrained to a restricted MAGE-ML content model, specified elsewhere.
Constraint: AlleleSequence.value
Constrained to a restricted BSML or GMS content model, specified elsewhere.
0..* pertinentMethod
typeCode*: <= PERT
pertinentInformation1

Method
classCode*: <= PROC
moodCode*: <= EVN
id: II [0..1]
code: CD CWE [0..1] <= ActCode (type of method)
text: ED [0..1] (free text description of the method used)
methodCode: SET<CE> CWE [0..*]
0..* pertinentIndividualAllele
typeCode*: <= PERT
pertinentInformation5
Note: A related allele that is on a different haplotype and still has significant interrelation with the source allele.
IndividualAllele
0..* priorClinicalPhenotype
typeCode*: <= SEQL
sequelTo
ExternalClinicalPhenotype
classCode*: <= OBS
moodCode*: <= EVN
id*: II [1..1] (the id of an external observation, e.g., in a problem list)
Note: An external observation is a valid Observation instance existing in any other HL7-compliant artifact, e.g., a document or a message.
Note: An observation of a clinical condition represented internally in this model.
Note: Shadowed observations are copies of other observations and thus have all of the original act attributes.
Note: Use methodCode if you don’t use the associated method procedure.
Note: Could refine ActRelationship typeCode to elaborate on different types of genomic-to-phenotype effects.
Method
0..* pertinentMethod
typeCode*: <= PERT
pertinentInformation
Note: Usually this is a computed outcome, i.e., the lab does not produce the actual protein.
0..* referredToExternalClinicalPhenotype
typeCode*: <= x_ActRelationshipExternalReference
reference
ClinicalPhenotype
ClinicalPhenotype
ClinicalPhenotype
0..* priorClinicalPhenotype
typeCode*: <= SEQL
sequelTo
0..* priorClinicalPhenotype
typeCode*: <= SEQL
sequelTo
0..* priorClinicalPhenotype
typeCode*: <= SEQL
sequelTo
Haplotype
Note: The classCode should be OBSGENPOLMUT, which stands for mutation-polymorphism genomic observation, a subtype of OBSGENPOL (polymorphism genomic observation), which is a subtype of OBSGEN (genomic observation).
Note: The classCode should be OBSGENPOLSNP, which stands for SNP-polymorphism genomic observation, a subtype of OBSGENPOL (polymorphism genomic observation), which is a subtype of OBSGEN (genomic observation).

Polymorphism
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CD CWE [0..1] <= ActCode
text: ED [0..1]
value: ANY [0..1]
Note: The classCode should be OBSGENPOL, which stands for polymorphism genomic observation, a subtype of OBSGEN (genomic observation).
Genotype (POCG_RM000004)
Entry point to the Clinical-Genomics Genotype Model
The Genotype Model
Individual Allele (1..3)
SNP
Allele Sequence
Mutation
Proteomics
Gene Expression
Clinical Phenotype
Haplotype
Entry Point: Genotype
Sequencing
Method
Polymorphism
Coexistence of HL7 Objects and Bioinformatics Markup
Genomic Data Sources
HL7 CG Messages with mainly Encapsulating HL7 Objects
EHR System
Bubbling up the clinically-significant raw genomic data into specialized HL7 objects and linking them with clinical data from the patient EHR
HL7 CG Messages with both Encapsulating and Specialized HL7 Objects
Clinical Practice
Decision Support Applications
Knowledge (KBs, Ontologies, registries, Evidence-Based, Papers, etc.)
Coexistence of HL7 Objects and Bioinformatics Markup (cont.)
DNA Lab
HL7 CG Messages with an AlleleSequence HL7 Object encapsulating the raw sequencing results
EHR System
Bubbling up the clinically-significant SNP data into HL7 SNP and Mutation objects and linking them with clinical data from the patient EHR
HL7 CG Messages with both Encapsulating and Specialized HL7 Objects
Genetic Counseling
Decision Support Applications
Sequencing Example…
Coexistence of HL7 Objects and Bioinformatics Markup (cont.)
AlleleSequence
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CD CWE [1..1] (the sequence standard code, e.g. BSML, GMS)
text: ED [0..1] (sequence's annotations)
effectiveTime: GTS [1..1]
value: ED [1..1] (the actual sequence)
methodCode: SET<CE> CWE [0..*] (the sequencing method)

IndividualAllele
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code*: CE CWE [1..1] (allele classification)
text: ED [0..1]
value: ANY [0..1] (e.g. accession no. in GenBank)
methodCode: SET<CE> CWE [0..*] (the method by which the code was determined)

Mutation
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE [0..1] (mutation classification)
text: ED [0..1]
value: ANY [0..1] (mutation code, e.g. drawn from LOINC MOLECULAR GENETICS NAMING)

SNP (appears twice in the diagram)
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE [0..1] (SNP classification, e.g. from Entrez dbSNP)
text: ED [0..1]
value: BAG<ED> [0..*] (the SNP itself)
methodCode: SET<CE> CWE [0..*]

ClinicalPhenotype
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE [0..1] (e.g., disease, allergy, sensitivity, ADE, etc.)
text: ED [0..1]
value: ANY [0..1]
Sequencing data encapsulated as bioinformatics markup
The patient's allele
HL7 genomic-specialized objects
Bubbling-up… (label repeated four times in the diagram)
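As a rough illustration of the bubbling-up idea, the sketch below keeps the raw sequence inside an encapsulating object and derives small SNP objects from it. The slides do not say how clinically significant SNPs are identified; comparing against a reference string, the field names, and the 0-based positions are all assumptions made only for this example, and a plain string stands in for the bioinformatics markup.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AlleleSequence:
    markup_code: str   # which bioinformatics markup is used, e.g. "BSML"
    value: str         # the raw sequence (a plain string stands in for the markup)


@dataclass
class SNP:
    position: int      # 0-based offset in the sequence (assumption of this sketch)
    observed: str
    reference: str


def bubble_up_snps(seq: AlleleSequence, reference: str) -> List[SNP]:
    """Return one SNP object per single-base difference from the reference."""
    return [
        SNP(position=i, observed=obs, reference=ref)
        for i, (obs, ref) in enumerate(zip(seq.value, reference))
        if obs != ref
    ]


raw = AlleleSequence(markup_code="BSML", value="ACGTCGGTTCA")
snps = bubble_up_snps(raw, reference="ACGTCGATTCA")
```

The encapsulating object stays untouched, so a receiver can use either the raw data or the bubbled-up objects.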
The Family History Model
Person
classCode*: <= PSN
determinerCode*: <= INSTANCE
id: SET<II> [0..*] (e.g., SSN)
name: BAG<EN> [0..*]
telecom: BAG<TEL> [0..*]
administrativeGenderCode: CE CWE [0..1] <= AdministrativeGender
birthTime: TS [0..1]
deceasedInd: BL [0..1] "false"
deceasedTime: TS [0..1]
raceCode: SET<CE> CWE [0..*] <= Race
ethnicGroupCode: SET<CE> CWE [0..*] <= Ethnicity
1..1 patientPerson
Patient
classCode*: <= PAT
id*: SET<II> [1..*]
0..1 relationshipHolder
PersonalRelationship
0..* relationshipHolder
classCode*: <= PRS
id: SET<II> [0..*] (use this attribute to hold pedigree ID)
code: CE CWE [0..1] <= RoleCode "FAMMEMB"
Note: First-degree relatives. FAMMEMB could be used for unidentified relatives, but also any of the more specific codes like PRN (parent) or NMTH (natural mother).
Note: A shadow of Person allows for recursive representation of any higher-level degree of relations, e.g., grandfather, through the same clone (PersonalRelationship) nesting in Person.
Note: Should be replaced with a generic clinical statement CMET so it is capable of holding any pertinent clinical data of the patient or his/her relative.
Note: This should be replaced with the Clinical-Genomics Genotype model (as a CMET) to deal with all types of genomic data.
Note: Person holds details that are not specific to the family role. Person is also the scoper of the relative roles (for more details see the V3 RoleCode vocabulary, domain = PersonalRelationshipRoleType).

FamilyHistory (COCT_RM999999)
This model is intended to be a CMET and has the capability of representing any part of the patient pedigree.

Genotype
classCode*: <= OBS
moodCode*: <= EVN

ClinicalStatement
classCode*: <= ACT
moodCode*: <= EVN
code: CD CWE [1..1] <= ActCode
negationInd: BL [0..1]
effectiveTime: GTS [0..1]
confidentialityCode: SET<CE> CWE [0..*] <= Confidentiality
uncertaintyCode: CE CNE [0..1] <= ActUncertainty

ClinicalGenomicChoice
Note: Clinical Genomics choice similar to the choice associated with the Patient role (the entry point of this model).
0..* clinicalGenomicChoice
typeCode*: <= SBJ
subjectOf1
0..*
subjectOf
Person
Genotype CMET
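The recursive nesting of PersonalRelationship in Person can be sketched as follows. This is only an illustration: the Python class and field names are assumptions, and the role codes PRN and NMTH are the examples named in the model's note.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Person:
    name: str
    relatives: List["PersonalRelationship"] = field(default_factory=list)


@dataclass
class PersonalRelationship:
    code: str                        # a RoleCode value such as "FAMMEMB", "PRN" or "NMTH"
    holder: Person                   # the relative playing this role
    pedigree_id: Optional[str] = None


def relatives_at_degree(person: Person, degree: int) -> List[Person]:
    """Collect relatives reached by following `degree` nested relationships."""
    if degree == 0:
        return [person]
    found: List[Person] = []
    for rel in person.relatives:
        found.extend(relatives_at_degree(rel.holder, degree - 1))
    return found


grandfather = Person("grandfather")
mother = Person("mother", [PersonalRelationship("PRN", grandfather)])
patient = Person("patient", [PersonalRelationship("NMTH", mother)])
```

A grandfather needs no dedicated clone here: he is simply the parent of the patient's mother, reached through two levels of the same relationship.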
Family History – Harmonization Proposals
Age:
o Age of subject when subject’s diagnosis was made
o Age at time of death
o Proposed solution: a new data type to refer to from effectiveTime:
<effectiveTime xsi:type="TSR"> <!-- TSR = Time Stamp Relative -->
  <epoch code="B"/>
  <offset value="20" unit="mo"/>
</effectiveTime>
Vocabulary proposals:
o Observation Interpretation (Deleterious, Unknown significance, Polymorphism, No mutation)
o Personal relation codes and qualifiers
Personal Relationship association names:
o A naming algorithm problem (HL7 tooling issue)
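For experimenting with the proposed relative time stamp, the fragment above can be generated programmatically. The sketch uses Python's standard xml.etree.ElementTree; the helper name `relative_time` is hypothetical, and the element and attribute names are taken from the slide's example only (TSR is a proposal, not an established HL7 data type). The meaning of epoch code "B" is not spelled out on the slide.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"
ET.register_namespace("xsi", XSI)


def relative_time(epoch_code: str, value: int, unit: str) -> ET.Element:
    """Build an effectiveTime element typed as the proposed TSR."""
    elem = ET.Element("effectiveTime", {f"{{{XSI}}}type": "TSR"})
    ET.SubElement(elem, "epoch", {"code": epoch_code})
    ET.SubElement(elem, "offset", {"value": str(value), "unit": unit})
    return elem


# An offset of 20 months from epoch "B", as in the slide's example.
age_at_diagnosis = relative_time("B", 20, "mo")
xml_text = ET.tostring(age_at_diagnosis, encoding="unicode")
```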
The Genotype Model in Tissue Typing
[Diagram labels: BMT Tissue Typing; Tissue Typing Observation; Genotype; Allele; SNP; Haplotype; Individual 1 HLA; Individual 2 HLA; Matching; Donor Banks; BMT Ward; Tissue-Typing Lab]
How the Genotype fits to Tissue-Typing
Tissue Typing in the context of Bone-Marrow Transplantation:
[Diagram labels: Tissue Typing Observation; BMT Center; Donor Bank; BMT unique Order/Entry]
1..* hLA_AntigenGenotype
typeCode*: <= COMP
component
0..1
TT-TestingLab
classCode*: <= QUAL

TissueTypingFacility
classCode*: <= ENT
determinerCode*: <= INSTANCE

TissueTypingObservation
classCode*: <= OBS
moodCode*: <= EVN
id:
code: CS CWE <= TissueTypingTestingClass
0..1 tT-TestingLab
typeCode*: <= PPRF
primaryPerformer
0..1 playingPerson
Donor
classCode*: <= ROL

Person
classCode*: <= PSN
determinerCode*: <= INSTANCE
CMET: (PAT) R_Patient[universal]
(COCT_MT050000)
0..1 participant
typeCode*: <= SBJ
subject

TissueTypingMatchingObservation
classCode*: <= OBS
moodCode*: <= EVN
code: CS CWE <= TissueTypingMatchingClass
text:
1..1 priorTissueTypingObservation
typeCode*: <= SEQL
sequelTo1

TissueTypingResultLetter
classCode*: <= DOCCLIN
moodCode*: <= EVN
code: CE CWE <= TissueTypingDocumentType
0..* tissueTypingResultLetter
typeCode*: <= DOC
documentationOf
0..1 class I Antigens
typeCode*: <= COMP
component2
Class I Antigens
classCode*: <= OBS
moodCode*: <= EVN
code: CD CWE [0..1] <= ActCode
0..1 class II Antigens
typeCode*: <= COMP
component1
Class II Antigens
classCode*: <= OBS
moodCode*: <= EVN
code: CD CWE [0..1] <= ActCode

Genotype CMET
classCode*: <= OBS
moodCode*: <= EVN
1..* hLA_AntigenGenotype
typeCode*: <= COMP
component
LocusMatching
classCode*: <= OBS
moodCode*: <= EVN
code: CS CWE <= TissueTypingLocusMatchingClass
text:
0..* locusMatching
typeCode*: <= COMP
component
TissueTypingObservation
1..1 priorTissueTypingObservation
typeCode*: <= SEQL
sequelTo
Constraint: LocusMatching
The number of LocusMatching observations is dependent on the no. of loci examined in the tissue-typing testing.
Note: TissueTypingLocusMatchingClass should be a new vocabulary in HL7 (may use recent NMDP effort).
Note: The no. of genotypes is dependent on the no. of loci examined in each HLA class (usually, class I includes A, B and C antigens and class II includes the DR antigen family).
TTObservation (UUDD_RMnnnnnn)
Tissue-Typing Observation
Note: TissueTypingMatchingClass should be a new vocabulary in HL7, e.g., 2-haplotype match.
SubjectChoice
TT-Matching (UUDD_RMnnnnnn)
Description
Note: This module is developed by the Clinical-Genomics SIG. It will be registered as a CMET, but for now it appears here as an observation. For details, see the Genotype R-MIM. All genomic data are encapsulated in this CMET, including mutations, which are the essence of CF testing, for example.
How the Genotype fits to Tissue-Typing
Single Tissue Typing Observation
Class I Antigens
Class II Antigens
The Genotype model is used for each HLA Antigen
Tissue Typing Matching Observation
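One way to picture "one Genotype per HLA antigen, one LocusMatching per locus examined" is the sketch below. The matching rule (counting shared alleles per locus) is an assumption for illustration only: the matching vocabularies are still to be defined, and the allele names are made up for the example.

```python
from typing import Dict, Tuple

# One genotype (an allele pair) per HLA locus examined; allele names are illustrative.
HlaTyping = Dict[str, Tuple[str, str]]


def locus_matching(patient: HlaTyping, donor: HlaTyping) -> Dict[str, int]:
    """Count shared alleles (0, 1 or 2) for every locus typed in both individuals."""
    result: Dict[str, int] = {}
    for locus in sorted(patient.keys() & donor.keys()):
        remaining = list(donor[locus])
        shared = 0
        for allele in patient[locus]:
            if allele in remaining:
                remaining.remove(allele)   # each donor allele may match only once
                shared += 1
        result[locus] = shared
    return result


patient = {"A": ("A*0201", "A*0301"), "B": ("B*0702", "B*4402"), "DRB1": ("DRB1*1501", "DRB1*0401")}
donor = {"A": ("A*0201", "A*0301"), "B": ("B*0702", "B*0801"), "DRB1": ("DRB1*1501", "DRB1*0401")}
matches = locus_matching(patient, donor)
```

The result has one entry per locus typed in both individuals, mirroring the constraint that the number of LocusMatching observations depends on the number of loci examined.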
Tissue Typing Scenario Simulation
Real Case with…
o A Hutch Patient and sibling and unrelated donor candidates are in Hadassah
Information exchange…
o is simulated through a series of XML files following the TT storyboard activity diagram and using the HL7 R-MIMs + Genotype CMET
Documented in the following doc:
o HL7-Clinical-Genomics-TissueTypingInfoExchangeSimulation.doc
o Contact Amnon Shabo to get the document ([email protected])
The Genotype Model in Cystic Fibrosis
Entry Point: Blood Sample
Patient
Provider EMR System
MGS Report
DNA
Genotype CMET
MLG Counselor
ML Consultant
Molecular Genetic Lab
The Genotype Model in Viral Genotyping
Entry Point: Specimen
Pathogen
Patient
Viral DNA Sequencing
Viral DNA Regions
Genotype CMET
DNA Lab
Test Panel
Sponsor
Report
Resistance Profile
The Genotype Model in Pharmacogenomics-Based Clinical Trial & Submission
Pharmacogenomics testing
Patient
Gene Selection
Genotype CMET
Genomic data
Submission
Sponsor
CRO
Report
CRO
Regulator
Data Validation
Analysis
device
Data Analysis
Trial design
SNP/Hap Discovery
Constrained-BSML Schema
BSML – Bioinformatics Sequence Markup Language
Aimed at any biological sequence, for example:
o DNA
o RNA
o Protein
Constraining the BSML DTD to fit the healthcare needs:
o Leave out research and display markup
o Ensure the patient identification
Creating an XML Schema, set up as the content model of an HL7 attribute of type ED
Constrained-MAGE-ML Schema
Cope with data outside of the XML (referenced)
Shared issues:
o Eliminate research & display elements and require the presence of certain elements, for example patient identifiers
o Require that one and only one patient be the subject of the data, to avoid bringing data of another patient into the HL7 message
o Require that the data refer to only one allele, the one with which the encapsulating HL7 object is associated
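The two "one and only one" requirements could be checked before encapsulation along these lines. The element names `patient` and `allele` and their attributes are hypothetical placeholders, not actual BSML or MAGE-ML element names; in practice the constrained XML Schema itself would carry these rules.

```python
import xml.etree.ElementTree as ET
from typing import List


def check_constraints(markup: str) -> List[str]:
    """Return a list of violated constraints (empty when the markup passes)."""
    root = ET.fromstring(markup)
    problems: List[str] = []
    patients = {e.get("id") for e in root.iter("patient")}   # hypothetical element
    alleles = {e.get("ref") for e in root.iter("allele")}    # hypothetical element
    if len(patients) != 1:
        problems.append("exactly one patient must be the subject of the data")
    if len(alleles) != 1:
        problems.append("the data must refer to exactly one allele")
    return problems


ok = '<data><patient id="p1"/><allele ref="a1"/><sequence>ACGT</sequence></data>'
bad = '<data><patient id="p1"/><patient id="p2"/><allele ref="a1"/></data>'
```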
OBS Specialization Examples
PublicHealthCase:
o detectionMethodCode :: CE
o transmissionModeCode :: CE
o diseaseImportedCode :: CE
Diagnostic Image:
o subjectOrientationCode :: CE
The above examples are relatively ‘simple’ considering the uniqueness of the genomic observation attributes
Propose to add a genomic specialization to the RIM Observation class
Rationale: it has additional attributes that are unique to genomics (LSID, bioinformatics markup, etc.)
Genomic Specializations of Observation
GenomicObservation: LSID
o Polymorphism: type, position, length, reference, region
  o SNP: tagSNP
  o Mutation: knownAssociatedDiseases (not the actual phenotype)
o Gene Expression: MAGE
o Bio Sequence: BSML
New Class Codes Proposal
classCode       Class name
OBSGEN          GenomicObservation
OBSGENPOL       Polymorphism
OBSGENPOLMUT    Mutation
OBSGENPOLSNP    SNP
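The proposed codes form a small hierarchy, which a receiver might use to test whether an observation is "at least" a polymorphism. A minimal sketch, assuming OBSGEN specializes the existing OBS code and using the subtype relations stated in the model notes; the lookup table and function name are illustrative.

```python
from typing import Dict, Optional

# Parent of each proposed class code; OBS is the existing RIM Observation code.
PARENT: Dict[str, Optional[str]] = {
    "OBS": None,
    "OBSGEN": "OBS",
    "OBSGENPOL": "OBSGEN",
    "OBSGENPOLMUT": "OBSGENPOL",
    "OBSGENPOLSNP": "OBSGENPOL",
}


def is_a(code: str, ancestor: str) -> bool:
    """True when `code` equals `ancestor` or specializes it."""
    current: Optional[str] = code
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT[current]
    return False
```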
New Attributes Proposal
GenomicObservation:
o LSID Identifier
AlleleSequence:
o moleculeSequence: a constrained XML markup based on the BSML markup
Polymorphism:
o type (SNP, Mutation, Other)
o position (the position of the polymorphism)
o length (the length of the polymorphism)
o reference (the base reference for the above attributes)
o region (when the polymorphism scope is a specific gene region)
SNP:
o tagSNP: a Boolean field indicating whether this SNP is part of a small SNP set that determines a SNP haplotype
GeneExpression:
o expressionLevels: a constrained XML markup based on the MAGE markup
Proteomic clones: TBD.
Proposed HL7 Vocabularies
Genomics Vocabularies:
o Polymorphism:
  General types (SNP, Mutation, Sequence Variation)
  Nucleotide-based types (substitution, insertion, deletion, etc.)
o Alleles Relation (recessive / dominant, homozygote / heterozygote)
o Genotype-to-phenotype types of effects
o Genomic observation interpretation (Deleterious, Unknown significance, polymorphism, No mutation)
o SequencingMethodCode (example in next slide)
HL7 Vocabulary Example
SequencingMethodCode:
SSOPH - Sequence specific oligonucleotide probe hybridization
SSP - Sequence specific primers
SBT - Sequence-based typing
RSCA - Reference strand conformation analysis
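In software, a small vocabulary like this is often carried as an enumeration. The sketch below only restates the four codes listed above; the Python names are illustrative and no code system identifier is implied.

```python
from enum import Enum


class SequencingMethodCode(Enum):
    SSOPH = "Sequence specific oligonucleotide probe hybridization"
    SSP = "Sequence specific primers"
    SBT = "Sequence-based typing"
    RSCA = "Reference strand conformation analysis"


def display_name(code: str) -> str:
    """Look up the display name for a code; raises KeyError for unknown codes."""
    return SequencingMethodCode[code].value
```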
Proposed HL7 Vocabularies (cont.)
Tissue Typing related Vocabularies:
TissueTypingLocusMatchingClass
TissueTypingMatchingClass
TissueTypingTestingClass
TissueTypingTestingMethod
TissueTypingDocumentType
TissueTypingOrderClass
DonorType (allogeneic, autologous, etc.)
Class I & II antigens classification
XML Examples
Genotype Examples:
o GenotypeSample1.xml
  A genotype of two HLA alleles in the B locus
o GenotypeSample2.xml
  A genotype of two HLA alleles in the B locus, along with a SNP designation in the first allele
Tissue Typing Observation Examples:
o TissueTypingObservationSample1.xml
  Consists of a single tissue typing observation of a patient or a donor
o TissueTypingObservationSample2.xml
  Consists of two tissue typing observations of a patient & donor, leading to a tissue typing matching observation
Donor Search Examples:
o TissueTypingDonorBankSample1.xml
  This example illustrates an unsolicited message from a BMT Center to a donor bank, sending a patient's tissue typing observation for the purpose of searching for an appropriate donor
Next Steps
HL7:
o Formal submission of our harmonization proposals
o Continue with 2 alternatives until harmonization is resolved
o Register the Genotype and Family History models as CMETs
o Hand-craft sample instances (for review and experimental use)
o Derive a Genetic Testing model from the HL7 Lab SIG models
Vocabularies:
o HL7: develop
o External: get HL7 to recognize them
Constraining Bioinformatics Markup (continue the effort and include markup in the next ballot):
o MAGE-ML or MIAME
o BSML (done)
o caBIO (?)
Linking to the NCI Rembrandt Model
[UML class diagram. Classes: SNPFrequency, Polymorphism, ClinicalPhenotype (from Clinical), Population (from Population), Allele, Genotype, Haplotype, Chromosome, SNP, LengthPolymorphism, Probe, signalValue (from GeneExpression), <<Interface>>, Clone, Gene, MapLocation. Association roles: +PaternalMaternalAllele, +ExtraAllele. Multiplicity labels (0..1, 0..2, 0..n, 1, 1..n) could not be attributed to individual associations.]
Use-case driven modeling, designed with the HL7 Genotype model as a starting point; it will eventually extend the caBio model.
Alternative Genotype Models
0..1 priorMutation
typeCode*: <= SEQL
sequelTo
0..1 geneExpression
typeCode*: <= SUBJ
subject5
0..* polymorphism
typeCode*: <= SUBJ
subject4
IndividualAllele
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code*: CE CWE [1..1] (allele classification)
text: ED [0..1]
value: ANY [0..1] (e.g. accession no. in GenBank)
methodCode: SET<CE> CWE [0..*] (the method by which the code was determined)

SNP
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE [0..1] (SNP classification, e.g. from Entrez dbSNP)
text: ED [0..1]
value: BAG<ED> [0..*] (the SNP itself)
methodCode: SET<CE> CWE [0..*]

Haplotype
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE [0..1]
value: ANY [0..1]

Genotype
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE [0..1] (e.g., HETEROZYGOTE)
text: ED [0..1]
effectiveTime: IVL<TS> [0..1] (the time of genotyping)
0..* haplotype
typeCode*: <= COMP
componentOf
1..3 individualAllele
typeCode*: <= COMP
component
0..* sNP
typeCode*: <= SUBJ
subject6

AlleleSequence
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CD CWE [1..1] (the sequence standard code, e.g. BSML, GMS)
text: ED [0..1] (sequence's annotations)
effectiveTime: GTS [1..1]
value: ED [1..1] (the actual sequence)
methodCode: SET<CE> CWE [0..*] (the sequencing method)
0..1 alleleSequence
typeCode*: <= SUBJ
subject7

GeneExpression
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE <= ActCode (the standard's code, e.g., MAGE-ML identifier)
text:
effectiveTime:
value: ED [1..1] (the actual gene expression levels)
methodCode:

Polypeptide
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code*: CE CWE [1..1] (classification of the protein, e.g., SwissProt, PDB, PIR, HUPO)
text: ED [0..1]
value: ANY [0..1]
0..* causePolypeptide
typeCode*: <= MFST
manifestationOf
DeterminantPeptide
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE [0..1] (classification of the determinant)
text: ED [0..1]
value: ANY [0..1]
0..* derivedDeterminantPeptide
typeCode*: <= DRIV
derivation
Mutation
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE [0..1] (mutation classification)
text: ED [0..1]
value: ANY [0..1] (mutation code, e.g. drawn from LOINC MOLECULAR GENETICS NAMING)
0..* mutation
typeCode*: <= SUBJ
subject
ClinicalPhenotype
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CE CWE [0..1] (e.g., disease, allergy, sensitivity, ADE, etc.)
text: ED [0..1]
value: ANY [0..1]

HL7 Clinical Genomics SIG
Document: Individual Genotype DIM (to be registered as a CMET) - Genomic Attributes as HL7 Clones
Subject: Genomics Data
Rev: 0.17
Date: September 14, 2004
Facilitator: Amnon Shabo (Shvo), IBM Research in Haifa, [email protected]

Note: There must be at least one IndividualAllele and three at the most. The typical case would be an allele pair, one on the paternal chromosome and one on the maternal chromosome. The third allele could be present if the patient has three copies of a chromosome, as in Down syndrome.
Mutation
0..* haplotype
typeCode*: <= COMP
componentOf
Constraint: GeneExpression.value
Constrained to a restricted MAGE-ML content model, specified elsewhere.
Constraint: AlleleSequence.value
Constrained to a restricted BSML content model, specified elsewhere.
0..* method
typeCode*: <= SUBJ
subject

Method
classCode*: <= PROC
moodCode*: <= EVN
id: II [0..1]
code: CD CWE [0..1] <= ActCode (type of method)
text: ED [0..1] (free text description of the method used)
methodCode: SET<CE> CWE [0..*]
0..* referredToIndividualAllele
typeCode*: <= REFR
reference
Note: A related allele that is on a different haplotype and still has significant interrelation with the source allele.
IndividualAllele
0..* causedClinicalPhenotype
typeCode*: <= CAUS
causeOf
ExternalClinicalPhenotype
classCode*: <= OBS
moodCode*: <= EVN
id*: II [1..1] (the id of an external observation, e.g., in a problem list)
Note: An external observation is a valid Observation instance existing in any other HL7-compliant artifact, e.g., a document or a message.
Note: An observation of a clinical condition represented internally in this model.
Note: Shadowed observations are copies of other observations and thus have all of the original act attributes as well as all ‘outbound’ associations.
Note: Use methodCode if you don’t use the associated method procedure.
Note: Should refine ActRelationship typeCode to elaborate on different types of genomic-to-phenotype interrelations.

Method
0..* pertinentMethod
typeCode*: <= PERT
pertinentInformation
Note: This might be a computed outcome, i.e., the lab does not provide the actual protein, but secondary processes populate this clone with the translational protein.
0..* referredToExternalClinicalPhenotype
typeCode*: <= x_ActRelationshipExternalReference
reference
ClinicalPhenotype
ClinicalPhenotype
ClinicalPhenotype
0..* causedClinicalPhenotype
typeCode*: <= CAUS
causeOf
0..* causedClinicalPhenotype
typeCode*: <= CAUS
causeOf
0..* causedClinicalPhenotype
typeCode*: <= CAUS
causeOf
Haplotype
Note: The classCode should be OBSGENPOLMUT, which stands for mutation-polymorphism genomic observation.
Note: The classCode should be OBSGENPOLSNP, which stands for SNP-polymorphism genomic observation.

Polymorphism
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CD CWE [0..1] <= ActCode
text: ED [0..1]
value: ANY [0..1]
Note: The classCode should be OBSGENPOL, which stands for polymorphism genomic observation, a subtype of OBSGEN (genomic observation).

Genotype (POCG_RM000004)
Entry point to the Clinical-Genomics Genotype Model
DeterminantPeptide
0..* causeDeterminantPeptide
typeCode*: <= MFST
manifestationOf

PolymorphismAttributes
classCode*: <= ActContainer
moodCode*: <= EVN

PolyType
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
value: CE CWE [1..1]
0..1 polyType
typeCode*: <= COMP
component4
PolyLength
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
value: INT [1..1]
0..1 polyLength
typeCode*: <= COMP
component5
PolyPosition
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
value: INT [1..1]
0..1 polyPosition
typeCode*: <= COMP
component6
PolyReference
classCode*: <= OBS
moodCode*: <= EVN
id: SET<II> [0..*]
value: ED [0..1]

PolyRegion
classCode*: <= OBS
moodCode*: <= EVN
id: SET<II> [0..*]
value: ED [0..1]
0..1 polyReference
typeCode*: <= COMP
component7
0..1 polyRegion
typeCode*: <= COMP
component8
PolymorphismAttributes
PolymorphismAttributes
0..1 polymorphismAttributes
typeCode*: <= SUBJ
subject9
0..1 polymorphismAttributes
typeCode*: <= SUBJ
subject
0..1 polymorphismAttributes
typeCode*: <= SUBJ
subject
Note: A container of common polymorphism attributes.
Note: A code attribute was not added to any of the polymorphism attribute clones as this seems to be implicit from the clone name.

knownAssociatedDiseases
classCode*: <= OBS
moodCode*: <= DEF
code: CD CWE [0..1] <= ActCode
text: ED [0..1]
value: ANY [0..1]
0..* riskKnownAssociatedDiseases
typeCode*: <= RISK
risk
Note: These diseases are not the actual phenotype for the patient; rather, they are the known risks of this mutation.

tagSNP
classCode*: <= OBS
moodCode*: <= DEF
0..1 tagSNP
typeCode*: <= SUBJ
subject
Note: The presence of this clone indicates that the source SNP clone is a tag SNP (note that it has a DEF mood).

translationalData
classCode*: <= OBS
moodCode*: <= EVN
id: II [0..1]
code: CD CWE [0..1] <= ActCode
value: ANY [0..1]
0..* pertinenttranslationalData
typeCode*: <= PERT
pertinentInformation
Constraint: translationalData.value
Constrained to a restricted caBio content model, specified elsewhere.
Entry Point: Genotype
A model without genomic specializations of the HL7 RIM Observation class:
[Diagram labels: Polymorphism; PolymorphismAttributesContainer; PolymorphismAttributes; Polymorphism Attributes; Shadow association with Mutation]
Comments received on the Genotype Model
Revalidate/collapse the polymorphism hierarchy:
o Add a RIM class “SequenceVariance”
o Representing all types of polymorphisms
o Type could be placed in the code attribute
o ‘position’ and ‘length’ could be parts of a boundary in a RegionOfInterest type of Observation
o Could represent any bio-sequence (DNA, RNA, Protein, etc.)
Patient data vs. generic knowledge:
o tagSNP, knownAssociatedDiseases and haplotype are a type of knowledge
o Should they only be referenced (pointing to KBs)?
Types of relationships between the various Genotype observations:
o Pertinent, Component, Subject,…?
o It’s tricky as it should apply to the observations and not to the observed entities
Comments on the Genotype Model (cont.)
Distinguishing the encapsulating objects from the bubbled-up ones:
o Associate encapsulated objects with bubbled-up objects, with options: XFRM (transformation), XCRPT (excerpt), SUMM (summary), DRIV (derived from)… what’s best?
Should the Method object be in DEF mood?
o Could it be that there is a need to describe a method per patient?
Is the SNP-Mutation association useful?
o Changed the association type to XFRM to demonstrate a possible “bubbled-up” association, i.e., a SNP was encountered as a mutation
SLIST Data Type
Table 37: Components of Sampled Sequence
Name     Type        Description
origin   T           The origin of the list item value scale, i.e., the physical quantity that a zero-digit in the sequence would represent.
scale    T.diff      A ratio-scale quantity that is factored out of the digit sequence.
digits   list<int>   A sequence of raw digits for the sample values. This is typically the raw output of an A/D converter.
Use HL7 data types to represent bio-sequences:
o SLIST<CV> (applied to CV = Coded Value) could hold either of the following:
  ACGTCGGTTCA…
  Leu-Ala-Met-Gly-Ala-…
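The three components in Table 37 combine as value = origin + scale × digit. The sketch below shows this for plain numbers only; how origin and scale would apply to coded values in SLIST<CV> is not described on the slide, so that case is deliberately left out. The class and method names are illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SampledSequence:
    origin: float        # value represented by a zero digit
    scale: float         # quantity factored out of the digit sequence
    digits: List[int]    # raw digits, e.g. A/D converter output

    def values(self) -> List[float]:
        """Expand the digits back into sample values: origin + scale * digit."""
        return [self.origin + self.scale * d for d in self.digits]


samples = SampledSequence(origin=1.0, scale=0.5, digits=[0, 2, 4])
```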
Issues with just SequenceVariation…
SNP:
o Link to Haplotype is valid only for SNP type of Polymorphism
o tagSNP is valid only for SNP
Mutation:
o code & value are constrained to LOINC or other medical-oriented taxonomy rather than to an LS taxonomy as in polymorphism
o The attribute knownAssociatedDiseases moves to the phenotype choice so it’s resolved
SNP-Mutation association now needs a recursive association within Sequence Variation
Technical issue: cannot shadow a choice box
The End…
Thank you…