HL7 Clinical Genomics and Structured Documents Work Groups

CDA Implementation Guide: Genetic Testing Report (DRAFT PROPOSAL). Amnon Shabo (Shvo), PhD; HL7 Clinical Genomics WG Co-chair and Modeling Facilitator; HL7 Structured Documents WG CDA R2 Co-editor and CCD Implementation Guide Co-editor.


Page 1: HL7 Clinical Genomics and Structured Documents  Work Groups

HL7 Clinical Genomics and Structured Documents

Work Groups

CDA Implementation Guide: Genetic Testing Report

DRAFT PROPOSAL

Amnon Shabo (Shvo), PhD

HL7 Clinical Genomics WG: Co-chair and Modeling Facilitator

HL7 Structured Documents WG: CDA R2 Co-editor, CCD Implementation Guide Co-editor

Page 2: HL7 Clinical Genomics and Structured Documents  Work Groups


Haifa Research Lab

The HL7 Clinical Genomics SIG

Mission: to enable the standard use of patient-related genetic data such as DNA sequence variations and gene expression levels, for healthcare purposes (‘personalized medicine’) as well as for clinical trials & research

HL7 Clinical Genomics: a bridge standard between genomic data and clinical data

Genomic data: MAGE, BSML, PSI, GenBank, HUGO, SwissProt

Clinical data: HL7, DICOM, X12, SNOMED, ICD, LOINC

Page 3: HL7 Clinical Genomics and Structured Documents  Work Groups


How to Handle Raw and Mass Data

Could we learn from the imaging integration effort?

IMAGING
  Existing standards: DICOM
  Mass and noisy data: pixels
  Summary, interpretation, narrative, etc.: radiologist report

GENOMICS
  Existing standards: BSML; MAGE-ML; ...
  Mass and noisy data: bio-sequences; pathways; gene expression
  Summary, interpretation, narrative, etc.: geneticist report

Page 4: HL7 Clinical Genomics and Structured Documents  Work Groups


HL7 Clinical Genomics v3 Static Models

[Diagram: relationships among the HL7 Clinical Genomics v3 static models]

Models shown: Family History; Genetic Loci; Genetic Locus; Constrained Genetic Variation (Implementation Topic); Constrained Gene Expression (Implementation Topic); Phenotype (utilizing the HL7 Clinical Statement); CDA IG

Other domains shown: RCRIM, LAB

Relationship labels: Utilize; Reference

Status legend: Normative; DSTU; Comments

Page 5: HL7 Clinical Genomics and Structured Documents  Work Groups


Association labels:
0..* associatedObservation (typeCode*: <= COMP; component)
0..* associatedProperty (typeCode*: <= DRIV; derivedFrom2)
0..* polypeptide (typeCode*: <= DRIV; derivedFrom5)
0..* expression (typeCode*: <= COMP; component1)
0..* sequenceVariation (typeCode*: <= COMP; component3)
0..* individualAllele (typeCode*: <= COMP; component1)

Focal area: SEQUENCES & PROTEOMICS

IndividualAllele
  classCode*: <= OBS
  moodCode*: <= EVN
  id: II [0..1]
  negationInd: BL [0..1]
  text: ED [0..1]
  effectiveTime: GTS [0..1]
  value: CD [0..1] (allele code, drawn from HUGO-HGVS or OMIM)
  methodCode: SET<CE> CWE [0..*]

GeneticLocus
  classCode*: <= OBS
  moodCode*: <= EVN
  id: II [0..1]
  code: CE CWE [0..1] (e.g., ALLELIC, NON_ALLELIC)
  text: ED [0..1]
  effectiveTime: IVL<TS> [0..1]
  confidentialityCode: SET<CE> CWE [0..*] <= Confidentiality
  uncertaintyCode: CE CNE [0..1] <= Uncertainty
  value: CD [0..1] (identifying a gene through GenBank GeneID, with an optional translation to HUGO name)
  methodCode: SET<CE> CWE [0..*]

Sequence
  classCode*: <= OBS
  moodCode*: <= EVN
  id: II [0..1]
  code: CD CWE [1..1] (the sequence standard code, e.g., BSML)
  text: ED [0..1] (the sequence's annotations)
  effectiveTime: GTS [0..1]
  uncertaintyCode: CE CNE [0..1] <= Uncertainty
  value: ED [1..1] (the actual sequence)
  interpretationCode: SET<CE> CWE [0..*] <= ObservationInterpretation
  methodCode: SET<CE> CWE [0..*] (the sequencing method)

Expression
  classCode*: <= OBS
  moodCode*: <= EVN
  id: II [0..1]
  code: CE CWE [1..1] (the standard's code, e.g., MAGE-ML identifier)
  negationInd: BL [0..1]
  text: ED [0..1]
  effectiveTime: GTS [0..1]
  uncertaintyCode: CE CNE [0..1] <= Uncertainty
  value: ED [1..1] (the actual gene or protein expression levels)
  interpretationCode: SET<CE> CWE [0..*] <= ObservationInterpretation
  methodCode: SET<CE> CWE [0..*]

Polypeptide
  classCode*: <= OBS
  moodCode*: <= EVN
  id: II [0..1]
  text: ED [0..1]
  effectiveTime: GTS [0..1]
  value: CD [0..1] (protein code, drawn from SwissProt, PDB, PIR, HUPO, etc.)
  methodCode: SET<CE> CWE [0..*]

DeterminantPeptides
  classCode*: <= OBS
  moodCode*: <= EVN
  id: II [0..1]
  text: ED [0..1]
  effectiveTime: GTS [0..1]
  value: CD [0..1] (peptide code, drawn from reference databases like those used in the Polypeptide class)
  methodCode: SET<CE> CWE [0..*]

Constraint: GeneExpression.value
Constrained to a restricted MAGE-ML schema, specified separately.

Note: A related allele that is on a different locus and has an interrelation with the source allele, e.g., translocated duplicates of the gene.

0..* clinicalPhenotype (typeCode*: <= PERT; pertinentInformation)

ExternalObservedClinicalPhenotype
  classCode*: <= OBS
  moodCode*: <= EVN
  id*: II [1..1] (the unique id of an external observation residing outside of the instance)
  code: CD CWE [0..1]
  text: ED [0..1]
  effectiveTime: GTS [0..1]

Note: An external observation is preferably a valid observation instance existing in any other HL7-compliant instance, e.g., a document or a message. Use the id attribute of this class to point to the unique instance identifier of that observation.

Note: A phenotype that has actually been observed in the patient, represented internally in this model.

Note: This is a computed outcome, i.e., the lab does not test for the actual protein, but secondary processes populate this class with the translational protein.

SequenceVariation
  classCode*: <= OBS
  moodCode*: <= EVN
  id: II [0..1]
  code: CD CWE [0..1]
  negationInd: BL [0..1]
  text: ED [0..1]
  effectiveTime: GTS [0..1]
  value: ANY [0..1] (the variation itself, expressed with recognized notation like 269T>C or markup like BSML, or drawn from an external reference like OMIM or dbSNP)
  interpretationCode: SET<CE> CWE [0..*] <= ObservationInterpretation
  methodCode: SET<CE> CWE [0..*]

KnownClinicalPhenotype
  classCode*: <= OBS
  moodCode*: <= DEF
  code: CD CWE [0..1]
  text: ED [0..1]
  effectiveTime: GTS [0..1]
  uncertaintyCode: CE CNE [0..1] <= ActUncertainty
  value: ANY [0..1]

Note: These phenotypes are not the actual (observed) phenotypes for the patient; rather, they are the scientifically known phenotypes of the source genomic observation (e.g., known risks of a mutation or known responsiveness to a medication).

Note: Code: COPY_NUMBER, ZYGOSITY, DOMINANCY, GENE_FAMILY, etc. For example, if code = COPY_NUMBER, then the value is of type INT and holds the number of copies of this gene or allele.

Focal areas: EXPRESSION DATA; SEQUENCE VARIATIONS

Polypeptide

Note: The Expression class refers to both gene and protein expression levels. It is an encapsulating class that allows the encapsulation of raw expression data in its value attribute.

Note: The code attribute indicates in what molecule the variation occurs, i.e., DNA, RNA or Protein.

Note: Use the associations to the shadow classes when the data set type (e.g., expression) is not at deeper levels (e.g., allelic level) and needs to be associated directly with the locus (e.g., the expression level is the translational result of both alleles).

Note: This recursive association enables the association of an RNA sequence derived from a DNA sequence and a polypeptide sequence derived from the RNA sequence.

Association labels:
0..* clinicalPhenotype (typeCode*: <= PERT; pertinentInformation), repeated on several classes
0..* sequence (typeCode*: <= COMP; component2)
0..* expression (typeCode*: <= COMP; component5)
0..* associatedObservation (typeCode*: <= COMP; component2)
0..1 associatedObservation (typeCode*: <= COMP; component4)

Note: This class is a placeholder for a specific locus on the genome, that is, a position of a particular given sequence in the subject's genome or linkage map. Note that the semantics of the locus (e.g., gene, marker, variation, etc.) are defined by the data assigned to the code and value attributes of this class, and also by placing additional data relating to this locus into the classes associated with this class, like Sequence, Expression, etc.

Note: The term 'Individual Allele' doesn't necessarily refer to a known variant of the gene/locus; rather, it refers to the individual patient data regarding the gene/locus and might well contain personal variations of unknown significance.

AssociatedObservation
  classCode*: <= OBS
  moodCode*: <= EVN
  id: SET<II> [0..*]
  code: CD CWE [0..1]
  text: ED [0..1]
  effectiveTime: GTS [0..1]
  value: ANY [0..1]
  methodCode: SET<CE> CWE [0..*]

Note: The code attribute could hold codes like NORMALIZED_INTENSITY, P_VALUE, etc. The value attribute is populated based on the selected code, and its data type is then set accordingly during instance creation.

Note: The code attribute could hold codes like TYPE, POSITION.GENOME, LENGTH, REFERENCE, REGION, etc. The value attribute is populated based on the selected code, and its data type is then set accordingly during instance creation. Here are a few examples:

If code = TYPE, then the value is of type CV and holds one of the following: SNP (tagSNP), INSERTION, DELETION, TRANSLOCATION, etc.

If code = POSITION, then the value is of type INT and holds the actual numeric value representing the variation position along the gene.

If code = LENGTH, then the value is of type INT and holds the actual numeric value representing the variation length.

If code = POSITION.GENE, then the value is of type CV and is one of the following codes: INTRON, EXON, UTR, PROMOTER, etc.

If code = POSITION.GENOME, then the value is of type CV and is one of the following codes: NORMAL_LOCUS, ECTOPIC, TRANSLOCATION, etc.

If code = REFERENCE, then the value is of type CD and holds the reference gene identifier drawn from a reference database like GenBank.

The full description of the allowed vocabularies for the codes and their respective values can be found in the specification.
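The code-dependent typing rule in the note above can be pictured as a lookup table. The following is a minimal Python sketch, not part of the HL7 specification: the names `VALUE_TYPE_BY_CODE`, `expected_value_type`, and `check_property` are hypothetical, only the codes listed in the note are covered, and the coded types CV and CD are simplified to plain strings.

```python
# Hypothetical lookup: the HL7 data type the note associates with the value
# attribute for each code. Illustrative only, not a normative vocabulary.
VALUE_TYPE_BY_CODE = {
    "TYPE": "CV",             # e.g. SNP, INSERTION, DELETION, TRANSLOCATION
    "POSITION": "INT",        # variation position along the gene
    "LENGTH": "INT",          # variation length
    "POSITION.GENE": "CV",    # e.g. INTRON, EXON, UTR, PROMOTER
    "POSITION.GENOME": "CV",  # e.g. NORMAL_LOCUS, ECTOPIC, TRANSLOCATION
    "REFERENCE": "CD",        # reference gene identifier, e.g. from GenBank
}


def expected_value_type(code: str) -> str:
    """Return the data type name that the note pairs with the given code."""
    try:
        return VALUE_TYPE_BY_CODE[code]
    except KeyError:
        raise ValueError(f"code {code!r} is not covered by this sketch") from None


def check_property(code: str, value) -> str:
    """Check a Python value against the expected type and return the type name."""
    expected = expected_value_type(code)
    if expected == "INT":
        # bool is a subclass of int in Python, so reject it explicitly.
        ok = isinstance(value, int) and not isinstance(value, bool)
    else:
        # CV and CD are coded types; this sketch models them as plain strings.
        ok = isinstance(value, str)
    if not ok:
        raise TypeError(f"{code} expects {expected}, got {value!r}")
    return expected


print(check_property("TYPE", "SNP"))
print(check_property("LENGTH", 3))
```

A table keeps the rule declarative: supporting another code from the specification would mean adding one entry rather than another branch.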

AssociatedObservation

Note: Code: CLASSIFICATION, etc. For example, if code = CLASSIFICATION, then the value is of type CV and holds either KNOWN or NOVEL.

0..* geneticLocus (typeCode*: <= REFR; reference)

Note: A related gene that is on a different locus and still has a significant interrelation with the source gene (similar to the recursive association of an IndividualAllele).

ClinicalPhenotype
  classCode*: <= ORGANIZER
  moodCode*: <= EVN
  0..* observedClinicalPhenotype (typeCode*: <= COMP; component1)
  0..* knownClinicalPhenotype (typeCode*: <= COMP; component2)
  0..* externalObservedClinicalPhenotype (typeCode*: <= COMP; component3)

Constraint: ClinicalPhenotype
At least one of the target acts of the three component act relationships should be populated, since this is just a wrapper class.
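The ClinicalPhenotype constraint (at least one of the three component targets should be populated) can be expressed as a simple check. This is a minimal Python sketch under stated assumptions: the dataclass, its field names, and `is_valid` are hypothetical, and each component target is reduced to a plain string.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ClinicalPhenotype:
    """Wrapper (organizer) for the three phenotype component relationships.

    The field names are illustrative stand-ins for the component1..component3
    targets; they are not taken from an HL7 schema.
    """
    observed: List[str] = field(default_factory=list)           # observedClinicalPhenotype
    known: List[str] = field(default_factory=list)              # knownClinicalPhenotype
    external_observed: List[str] = field(default_factory=list)  # externalObservedClinicalPhenotype

    def is_valid(self) -> bool:
        # The wrapper carries no data itself, so an instance with no
        # populated component target violates the constraint.
        return bool(self.observed or self.known or self.external_observed)


empty = ClinicalPhenotype()
populated = ClinicalPhenotype(known=["placeholder phenotype"])
print(empty.is_valid(), populated.is_valid())
```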

Note:
- code should indicate the type of source, e.g., OMIM
- text could contain pieces from research papers
- value could contain a phenotype code if known (e.g., if it's a disease, then the disease code)

ClinicalPhenotype (label repeated six times in the diagram)

0..1 identifiedEntity (typeCode*: <= SBJ; contextControlCode: CS CNE [0..1] <= ContextControl "OP"; subject)
0..* individualAllele (typeCode*: <= REFR; reference)

ObservedClinicalPhenotype

Note: This CMET might be replaced with the Clinical Statement Shared Model for richer expressivity when that model is approved (currently in ballot).

Constraint: Sequence.value
Constrained to a restricted BSML content model, specified in a separate schema.

0..* sequence (typeCode*: <= COMP; component4)
0..* sequenceVariation (typeCode*: <= COMP; component3)

AssociatedProperty
  classCode*: <= OBS
  moodCode*: <= EVN
  code: CD CWE [0..1]
  text: ED [0..1]
  value: ANY [0..1]

GeneticLoci
  classCode*: <= OBS
  moodCode*: <= EVN
  id: SET<II> [0..*]
  code: CD CWE [0..1]
  effectiveTime: GTS [0..1]
  value: ANY [0..1]

Class labels repeated in the diagram: AssociatedProperty, AssociatedObservation, GeneticLoci, Polypeptide

Association labels (each repeated on several classes):
0..* associatedProperty (typeCode*: <= DRIV; derivedFrom, derivedFrom1)
0..* associatedObservation (typeCode*: <= COMP; component)
0..* sequenceVariation (typeCode*: <= DRIV; derivedFrom3)
0..* sequence (typeCode*: <= DRIV; derivedFrom2)
0..* determinantPeptides (typeCode*: <= DRIV; derivedFrom, derivedFrom4)
0..* clinicalPhenotype (typeCode*: <= PERT; pertinentInformation)
0..* geneticLoci (typeCode*: <= COMP; componentOf)
0..* polypeptide (typeCode*: <= DRIV; derivedFrom1, derivedFrom2)

Note: Use this class to indicate a set of genetic loci to which this locus belongs. The loci set could be a haplotype, a genetic profile and so forth. Use the id attribute to point to the GeneticLoci instance if available. The other attributes serve as a minimal data set about the loci group.

Focal area: PHENOTYPES

Note: Any observation that is related to the variation but is not an inherent part of the variation observation (the latter should be represented in the AssociatedProperty class). For example, the zygosity of the variation.

Note: Use this class to point to a variation group to which this variation belongs. For example, a SNP haplotype.

Note: Any observation that is related to the sequence but is not an inherent part of the sequence observation (the latter should be represented in the AssociatedProperty class). For example, splicing alternatives.

Note: Key peptides in the protein that determine its function.

Note: There could be zero to many IndividualAllele objects in a specific instance. A typical case would be an allele pair, one on the paternal chromosome and one on the maternal chromosome.
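The zero-to-many relationship between a locus and its alleles, with the typical allele pair, might be modeled as follows. This is a minimal Python sketch rather than an HL7 artifact: the class shapes are heavily reduced, the `origin` field is a hypothetical helper that is not a model attribute, and all identifier values are placeholders.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IndividualAllele:
    value: Optional[str] = None   # allele code, optional ([0..1]) in the model
    origin: Optional[str] = None  # hypothetical helper field, not in the model


@dataclass
class GeneticLocus:
    value: Optional[str] = None   # gene identifier, optional ([0..1]) in the model
    # 0..* individualAllele: an empty list is a legal instance.
    individual_alleles: List[IndividualAllele] = field(default_factory=list)


# Typical case from the note: one allele per parental chromosome.
locus = GeneticLocus(
    value="GENE-ID-PLACEHOLDER",
    individual_alleles=[
        IndividualAllele(value="ALLELE-CODE-PLACEHOLDER-1", origin="paternal"),
        IndividualAllele(value="ALLELE-CODE-PLACEHOLDER-2", origin="maternal"),
    ],
)
print(len(locus.individual_alleles))
```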

Note: Use this class to show an allele haplotype like in HLA.

Note: Any observation related to the expression assay that is not an inherent part of the expression observation.

Note: Use this class for inherent data about the locus, e.g., chromosome no.

IdentifiedEntity
  classCode*: <= IDENT
  id: SET<II> [0..*]
  code: CE CWE [0..1] <= RoleCode

Note: Use this role to identify a different subject (e.g., healthy tissue, virus, etc.) than the one propagated from the wrapping message or payload (e.g., GeneticLoci).

ScopingEntity
  classCode*: <= LIV
  determinerCode*: <= INSTANCE
  id: SET<II> [0..*]
  code: CE CWE [0..1] <= EntityCode

0..* assignedEntity (typeCode*: <= PRF; contextControlCode: CS CNE [0..1] <= ContextControl "OP"; performer)

Performer association labels: 0..* performer; 0..* performer1; 0..* performer2

Genetic Locus (POCG_RM000010)
The entry point to the GeneticLocus model is any locus on the genome.

Constraint: Expression.value
Constrained to a restricted MAGE-ML content model, specified in a separate schema.

Class labels repeated in the diagram: Expression, Sequence, SequenceVariation, ClinicalPhenotype

0..* clinicalPhenotype (typeCode*: <= PERT; pertinentInformation)

CMET: (ASSIGNED) R_AssignedEntity [universal] (COCT_MT090000)

0..1 scopedRoleName

CMET: (ACT) A_SupportingClinicalInformation [universal] (COCT_MT200000)

The GeneticLocus Model - Focal Areas:

The Locus and its Alleles

Sequence Variations

Expression Data

Sequence and Proteomics

Clinical Phenotypes

Page 6: HL7 Clinical Genomics and Structured Documents  Work Groups


The Underlying Paradigm: Encapsulate & Bubble-up

Parties shown: Genomic Data Sources; EHR System; Clinical Practices; Decision Support Applications; Knowledge (KBs, ontologies, registries, reference DBs, papers, etc.)

Message flows shown:
HL7 CG messages with mainly encapsulating HL7 objects
HL7 CG messages with encapsulated data associated with HL7 clinical objects (phenotypes)

Encapsulation is done by predefined and constrained bioinformatics schemas.

The challenge: bubble up the most clinically significant raw genomic data into specialized HL7 objects and link them with clinical data from the patient EHR.

Bubbling up is done continuously by specialized DS applications.
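The encapsulate and bubble-up idea can be illustrated with a small sketch: the raw payload stays untouched inside an encapsulating object, while extractors (standing in for decision support applications) promote selected findings into separate objects. Everything below is hypothetical Python for illustration, including the class names, the `bubble_up` function, and the toy extractor that does a plain substring match instead of parsing a real bioinformatics schema.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class BubbledUpObject:
    kind: str   # e.g. "SequenceVariation"
    value: str  # e.g. a variation in a recognized notation


@dataclass
class EncapsulatingObject:
    schema: str    # name of the bioinformatics format of the payload
    raw_data: str  # raw payload, kept as-is
    bubbled_up: List[BubbledUpObject] = field(default_factory=list)


Extractor = Callable[[str], Iterable[BubbledUpObject]]


def bubble_up(obj: EncapsulatingObject, extractors: List[Extractor]) -> EncapsulatingObject:
    """Run each extractor over the raw data and attach any new findings.

    Safe to call repeatedly (bubbling up is described as continuous):
    findings already attached are not duplicated.
    """
    for extract in extractors:
        for item in extract(obj.raw_data):
            if item not in obj.bubbled_up:
                obj.bubbled_up.append(item)
    return obj


def find_known_variations(raw: str) -> List[BubbledUpObject]:
    # Toy extractor: a substring match stands in for real parsing.
    known = ["269T>C"]
    return [BubbledUpObject("SequenceVariation", v) for v in known if v in raw]


message = EncapsulatingObject(schema="BSML", raw_data="placeholder payload ... 269T>C ...")
bubble_up(message, [find_known_variations])
bubble_up(message, [find_known_variations])  # second pass adds nothing new
print(message.bubbled_up)
```

Keeping the raw data alongside the bubbled-up objects means later passes, for example with updated knowledge sources, can re-examine the same payload.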

Page 7: HL7 Clinical Genomics and Structured Documents  Work Groups


The GeneticLocus Model

Entry Point: GeneticLocus

Diagram labels: Individual Allele; Related Allele; Bio Sequence; Sequence Variation (SNP, Mutation, Polymorphism, etc.); Variation Attributes; Polypeptide; Determinant; Expression Data; Expression Attributes; Clinical Phenotype

Legend: Encapsulating Obj.; Bubbled-up Obj.

genotype / phenotype

Page 8: HL7 Clinical Genomics and Structured Documents  Work Groups


The GeneticVariation Model

Association labels:
0..* associatedObservation (typeCode*: <= ActRelationshipType (default=COMP); contextConductionInd: BL [0..1] "TRUE"; sourceOf)
0..* associatedProperty (typeCode*: <= DRIV; contextConductionInd: BL [0..1] "TRUE"; derivedFrom)
0..* sequenceVariation (typeCode*: <= COMP; contextConductionInd: BL [0..1] "TRUE"; component1)
0..* individualAllele (typeCode*: <= COMP; contextConductionInd: BL [0..1] "TRUE"; component2)

IndividualAllele
  classCode*: <= SEQVAR
  moodCode*: <= x_ActMoodDefEvnRqoPrmsPrp (default=EVN)
  id: II [0..1]
  negationInd: BL [0..1]
  title: ED [0..1]
  text: ED [0..1]
  statusCode: CS CNE [0..1] <= ActStatus
  effectiveTime*: GTS [1..1]
  reasonCode: SET<CE> CWE [0..*] <= ActReason
  value: CD CWE [0..1] <= C:
  interpretationCode: SET<CE> CWE [0..*] <= GeneticObservationInterpretation

GeneticLocus
  classCode*: <= LOC
  moodCode*: <= x_ActMoodDefEvnRqoPrmsPrp (default=EVN)
  id: II [0..1]
  code: CE CWE [0..1] (default=Gene)
  negationInd: BL [0..1]
  title: ED [0..1]
  text: ED [0..1]
  statusCode: CS CNE [0..1] <= ActStatus
  effectiveTime*: IVL<TS> [1..1]
  confidentialityCode: SET<CE> CWE [0..*] <= Confidentiality
  reasonCode: SET<CE> CWE [0..*] <= GeneticActReason
  value*: ANY [1..1]
  interpretationCode: SET<CE> CWE [0..*] <= GeneticObservationInterpretation
  methodCode*: SET<CE> CWE [1..1]

Sequence
  classCode*: <= SEQ
  moodCode*: <= x_ActMoodDefEvnRqoPrmsPrp (default=EVN)
  id: II [0..1]
  code: CD CWE [1..1] (the type of sequence: observed, reference, etc.)
  text: ED [0..1] (the sequence's annotations)
  effectiveTime: GTS [0..1]
  reasonCode: SET<CE> CWE [0..*] <= ActReason
  value: ED [1..1] (the actual sequence in a recognized bioinformatics content model, such as BSML)
  interpretationCode: SET<CE> CWE [0..*] <= GeneticObservationInterpretation
  methodCode: SET<CE> CWE [0..*] (the sequencing method)

HL7 Clinical Genomics SIG
Document: Genotype Topic - The GeneticVariation Model
Rev: POCG_RM000011.v9
Date: November 18, 2007
Facilitator: Amnon Shabo (Shvo), IBM Research in Haifa

Note: A related allele that is at a different locus and has an interrelation with the source allele, e.g., translocated duplicates of a gene.

0..* phenotype
typeCode*: <= PERT
contextConductionInd: BL [0..1] "TRUE"
pertinentInformation

SequenceVariation
classCode*: <= SEQVAR
moodCode*: <= x_ActMoodDefEvnRqoPrmsPrp (default=EVN)
id: II [0..1]
code: CD CWE [0..1]
negationInd: BL [0..1]
title: ED [0..1]
text: ED [0..1]
effectiveTime: GTS [0..1]
value: ANY [0..1]
interpretationCode: SET<CE> CWE [0..*] <= GeneticObservationInterpretation
methodCode: SET<CE> CWE [0..*]

Note: Code: COPY_NUMBER, ZYGOSITY, DOMINANCY, GENE_FAMILY, etc. For example, if code = COPY_NUMBER, then the value is of type INT and holds the number of copies of this gene or allele.

0..* sequence
typeCode*: <= COMP
component2

0..* phenotype
typeCode*: <= PERT
contextConductionInd: BL [0..1] "TRUE"
pertinentInformation

0..* phenotype
typeCode*: <= PERT
pertinentInformation

Note: The code attribute indicates in what molecule the variation occurs, i.e., DNA, RNA or Protein.

Note: Use the associations to the shadow classes when the variation and/or the sequence data are not at the allelic level.

0..* associatedObservation
typeCode*: <= ActRelationshipType (default=COMP)
contextConductionInd: BL [0..1] "TRUE"
sourceOf

0..1 associatedObservation
typeCode*: <= ActRelationshipType (default=COMP)
contextConductionInd: BL [0..1] "TRUE"
sourceOf

Note: This recursive association enables the association of an RNA sequence derived from a DNA sequence and a polypeptide sequence derived from the RNA sequence.
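The recursive association described in this note can be pictured as a chain of sequence objects, each pointing back to the sequence it was derived from. The following is only a minimal sketch: the `Sequence` dataclass, its `molecule` and `derived_from` fields, and the toy sequence strings are hypothetical illustrations, not attributes of the HL7 model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sequence:
    """Hypothetical stand-in for a sequence observation."""
    molecule: str                               # "DNA", "RNA" or "Protein" (illustrative field)
    value: str                                  # toy sequence text
    derived_from: Optional["Sequence"] = None   # the recursive link


def derivation_chain(seq: Optional[Sequence]) -> List[str]:
    """Walk the recursive links from a sequence back to its original source."""
    chain = []
    while seq is not None:
        chain.append(seq.molecule)
        seq = seq.derived_from
    return chain


# Toy example: a polypeptide derived from an RNA that is derived from a DNA.
dna = Sequence("DNA", "ATGGCC")
rna = Sequence("RNA", "AUGGCC", derived_from=dna)
protein = Sequence("Protein", "MA", derived_from=rna)

print(derivation_chain(protein))
```

Because every link has the same shape, a single class is enough to express the DNA to RNA to polypeptide chain.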

0..* phenotype
typeCode*: <= PERT
contextConductionInd: BL [0..1] "TRUE"
pertinentInformation

Note: This class is a placeholder for specifying a locus on the genome, i.e., a position of a particular given sequence in the subject’s genome. Note that the semantics of the locus (e.g., gene) is defined by data assigned in the code & value attributes of this class, and also by placing additional data relating to this locus into the classes (and CMETs) associated with this class.

Note: The term 'Individual Allele' doesn't necessarily refer to a known variant of the gene/locus; rather, it refers to the individual patient data regarding the gene/locus, and might contain personal variations with unknown significance at the effective time of this observation.

AssociatedObservation
classCode*: <= GEN
moodCode*: <= x_ActMoodDefEvnRqoPrmsPrp (default=EVN)
id: SET<II> [0..*]
code*: CD CWE [1..1]
text: ED [0..1]
effectiveTime*: GTS [1..1]
value: ANY [0..1]
methodCode: SET<CE> CWE [0..*]

Note: The code attribute could hold codes like TYPE, POSITION.GENOME, LENGTH, REFERENCE, REGION, etc. The value attribute is populated based on the selected code, and its data type is then set accordingly during instance creation. Here are a few examples:
• If code = TYPE, then the value is of type CV and holds one of the following: SNP (tagSNP), INSERTION, DELETION, TRANSLOCATION, etc.
• If code = POSITION, then the value is of type INT and holds the actual numeric value representing the variation position along the gene.
• If code = LENGTH, then the value is of type INT and holds the actual numeric value representing the variation length.
• If code = POSITION.GENE, then the value is of type CV and is one of the following codes: INTRON, EXON, UTR, PROMOTER, etc.
• If code = POSITION.GENOME, then the value is of type CV and is one of the following codes: NORMAL_LOCUS, ECTOPIC, TRANSLOCATION, etc.
• If code = REFERENCE, then the value is of type CD and holds the reference gene identifier drawn from a reference database like GenBank.
More details about the vocabularies for these codes and their respective values can be found in the specification.
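The code-to-data-type pairing listed in this note can be sketched as a lookup table that is consulted when an instance is created. This is a minimal illustration only: the `CodedProperty` class, the `make_property` helper and the sample values are hypothetical, and the table contains only the pairs given in the note.

```python
from dataclasses import dataclass

# Pairs transcribed from the note: property code -> HL7 data type of `value`.
VALUE_TYPE_BY_CODE = {
    "TYPE": "CV",
    "POSITION": "INT",
    "LENGTH": "INT",
    "POSITION.GENE": "CV",
    "POSITION.GENOME": "CV",
    "REFERENCE": "CD",
}


@dataclass
class CodedProperty:
    """Hypothetical holder for a code/value pair and the chosen value type."""
    code: str
    value_type: str
    value: object


def make_property(code, value):
    """Select the value data type from the code, as the note describes."""
    if code not in VALUE_TYPE_BY_CODE:
        raise ValueError(f"no value type listed for code {code!r}")
    value_type = VALUE_TYPE_BY_CODE[code]
    # INT-typed values must be numeric; coded values are kept as plain strings here.
    if value_type == "INT" and not isinstance(value, int):
        raise TypeError(f"{code} expects an integer value")
    return CodedProperty(code, value_type, value)


length = make_property("LENGTH", 3)
kind = make_property("TYPE", "DELETION")
print(length.value_type, kind.value_type)
```

Codes that the note mentions without a data type (such as REGION) are deliberately left out of the table rather than guessed.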

Note: Code: CLASSIFICATION, etc. For example, if code = CLASSIFICATION, then the value is of type CV and holds either KNOWN or NOVEL.

reference
0..* geneticLocus
typeCode*: <= REFR
contextConductionInd: BL [0..1] "TRUE"

Note: A related locus that has a significant interrelation with the source locus and is not part of the loci set represented in this instance.

reference
0..* individualAllele
typeCode*: <= REFR
contextConductionInd: BL [0..1] "TRUE"

Constraint: Sequence.value
Constrained to a restricted BSML content model, specified in a separate schema.

0..* sequence
typeCode*: <= COMP
contextConductionInd: BL [0..1] "TRUE"
component4

0..* sequenceVariation
typeCode*: <= COMP
contextConductionInd: BL [0..1] "TRUE"
component3

AssociatedProperty
classCode*: <= GEN
moodCode*: <= x_ActMoodDefEvnRqoPrmsPrp (default=EVN)
code*: CD CWE [1..1]
text: ED [0..1]
value: ANY [0..1]

0..* associatedProperty
typeCode*: <= DRIV
contextConductionInd: BL [0..1] "TRUE"
derivedFrom

0..* associatedProperty
typeCode*: <= DRIV
contextConductionInd: BL [0..1] "TRUE"
derivedFrom

0..* associatedProperty
typeCode*: <= DRIV
contextConductionInd: BL [0..1] "TRUE"
derivedFrom1

0..* associatedObservation
typeCode*: <= ActRelationshipType (default=COMP)
contextConductionInd: BL [0..1] "TRUE"
sourceOf

0..* sequenceVariation
typeCode*: <= DRIV
contextConductionInd: BL [0..1] "TRUE"
derivedFrom3
derivedFrom2

0..* sequence
typeCode*: <= DRIV
contextConductionInd: BL [0..1] "TRUE"

Note: Any observation that is related to the variation but is not an inherent part of the variation observation (inherent characteristics should be represented in the AssociatedProperty class). For example, the zygosity of the variation.

Note: Any observation that is related to the sequence but is not an inherent part of the sequence observation, e.g., splicing alternatives. Note that inherent characteristics of the sequence should be represented in the AssociatedProperty class.

Note: There could be zero to many IndividualAllele objects in a specific instance. A typical case would be an allele pair, one on the paternal chromosome and one on the maternal chromosome.

Note: Use this class for inherent data about the locus, e.g., chromosome no.

0..* phenotype
typeCode*: <= PERT
pertinentInformation

Class labels repeated in the diagram: AssociatedProperty (3), SequenceVariation (2), Sequence (1)

Note: An internal CMET used to represent clinical phenotypes, both observed in the patient and known in the scientific literature.

CMET: (ORGANIZER) A_Phenotype [universal] (POCG_MT000030UV)

Constraint: value
Holds the variation expressed with a recognized notation like 269T>C or a markup like BSML, or drawn from an external reference like OMIM or dbSNP. The data type should be set accordingly.

Constraint: value
If code = "Gene", the value data type shall be set to CD and contain a code identifying a gene through GenBank GeneID, HUGO name, OMIM ID or any other internationally recognized identification of genes. If the locus is not a gene, then the data type should be set to the appropriate type, e.g., ST for locus notation like “10q24.32”.
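As a rough illustration of this constraint, the choice of data type for the locus value can be driven by the code attribute. The sketch below is hypothetical throughout: `locus_value`, its parameters and the dictionary layout are invented for illustration, and ST is used as the non-gene fallback only because the constraint gives it as an example.

```python
def locus_value(code, identifier, code_system=None):
    """Return a simplified description of the value to place on a locus.

    code        -- the locus code, e.g. "Gene"
    identifier  -- a gene identifier or a locus notation
    code_system -- name of the identifier's source; required for genes
    """
    if code == "Gene":
        if code_system is None:
            raise ValueError("a gene identifier needs a code system name")
        # Gene: coded value (CD) drawn from a recognized gene nomenclature.
        return {"type": "CD", "code": identifier, "codeSystemName": code_system}
    # Not a gene: plain string (ST), e.g. a cytogenetic locus notation.
    return {"type": "ST", "text": identifier}


gene = locus_value("Gene", "EGFR", code_system="HUGO")
region = locus_value("Region", "10q24.32")
print(gene["type"], region["type"])
```

A real implementation would choose the non-gene type case by case, since the constraint only asks for "the appropriate type".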

GeneticLoci
classCode*: <= LOC
moodCode*: <= x_ActMoodDefEvnRqoPrmsPrp (default=EVN)
id: SET<II> [0..*]
code*: CD CWE [1..1] <= GeneticVariation
negationInd: BL [0..1]
title: ED [0..1]
text: ED [0..1]
statusCode*: CS CNE [1..1] <= ActStatus
effectiveTime*: GTS [1..1]
confidentialityCode: SET<CE> CWE [1..1] <= Confidentiality
reasonCode: SET<CE> CWE [0..*]
interpretationCode: SET<CE> CWE [0..*] <= GeneticObservationInterpretation
methodCode: SET<CE> CWE [0..*] <= ObservationMethod

0..* geneticLocus
typeCode*: <= COMP
component1

0..* assignedEntity
typeCode*: <= AUT
contextControlCode: CS CNE [0..1] "OP"
author

0..* assignedEntity
typeCode*: <= VRF
contextControlCode: CS CNE [0..1] "OP"
verifier

0..* assignedEntity
typeCode*: <= PRF
contextControlCode: CS CNE [0..1] "OP"
performer

CMET: (ASSIGNED) R_AssignedEntity [universal] (COCT_MT090000UV)

0..1 roleName

GeneticDocument
classCode*: <= DOC
moodCode*: <= x_ActMoodDefEvnRqoPrmsPrp (default=EVN)
id: SET<II> [0..1]
code*: CD CWE [1..1] <= DocumentType
title: ED [0..1]
text: ED [0..1]
statusCode*: CS CNE [1..1] <= ActStatus
effectiveTime*: GTS [1..1]
confidentialityCode: SET<CE> CWE [0..*] <= Confidentiality
setId: II [0..1]

0..* geneticDocument
typeCode*: <= DOC
contextConductionInd: BL [0..1] "TRUE"
documentation

relatedDocument
0..* geneticDocument
typeCode*: <= x_ActRelationshipDocument
contextConductionInd: BL [0..1] "TRUE"
seperatableInd: BL [0..1]

Note: Use the separation indicator to indicate when a document should not be separated from its associated document (as in the EGFR-KRAS2 use case from HPCGG).

Note: There are two ways to refer to a clinical document:
1. Populate the id attribute with the document id
2. Place the entire CDA instance within the text attribute
The other attributes in this class are essential data about the document and are repeated in the document instance itself. They are meant to ease the parsing process.

0..* associatedObservation
typeCode*: <= ActRelationshipType (default=COMP)
contextConductionInd: BL [0..1] "TRUE"
sourceOf

0..* phenotype
typeCode*: <= PERT
contextConductionInd: BL [0..1] "TRUE"
pertinentInformation

GeneticVariation (POCG_RM000011UV)
The entry point to the combined Genetic Loci/Locus model that represents genetic variation data.

0..1 identifiedEntity
typeCode*: <= SBJ
contextControlCode: CS CNE [0..1] "OP"
subject

IdentifiedEntity
classCode*: <= IDENT
id: SET<II> [0..*]
code: CE CWE [0..1] <= RoleCode

ScopingEntity
classCode*: <= LIV
determinerCode*: <= INSTANCE
id: SET<II> [0..*]
code: CE CWE [0..1] <= EntityCode

Labels repeated in the diagram: AssociatedObservation (4), 0..* author (1), 0..1 performer (5)

Constraint: GeneticLoci.reasonCode & interpretationCode
If interpretationCode is assigned a value, reasonCode shall also be assigned a value to set the context for the interpretation semantics.
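This co-occurrence constraint is easy to check mechanically. A minimal sketch, assuming both attributes are available as plain Python collections of codes; the function name and the sample code strings are hypothetical:

```python
def interpretation_has_reason(interpretation_codes, reason_codes):
    """Return True when the constraint holds.

    The constraint only bites when at least one interpretationCode is
    present; in that case at least one reasonCode must be present too.
    """
    if not interpretation_codes:
        return True
    return bool(reason_codes)


# Placeholder code strings, not drawn from any HL7 vocabulary.
print(interpretation_has_reason([], []))                            # nothing to check
print(interpretation_has_reason(["SOME_INTERPRETATION"], []))       # violates the constraint
print(interpretation_has_reason(["SOME_INTERPRETATION"], ["SOME_REASON"]))
```

The check is one-directional: a reasonCode without an interpretationCode is still allowed.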

AssociatedProperty
0..* associatedProperty
typeCode*: <= DRIV
derivedFrom

Diagram callouts:
• Genetic Loci
• Genetic Locus
• Individual Allele
• Sequence Variation
• Sequence (observed or reference)
• Point to CDA Documents
• participants
• Associated data (vocab. controlled)

Page 9: HL7 Clinical Genomics and Structured Documents  Work Groups

CDA IG for Genetic Testing Report

• Design principles:
  - Follow existing report formats commonly used in healthcare & research
  - Emphasis on interpretations & recommendations
  - Provide inline & detailed (generic) information on tests performed
• Interpretation: Utilize patterns of ‘genotype-phenotype’ associations in the HL7 v3 Clinical Genomics and implement them as templates in this IG
• Reference HL7 Clinical Genomics instances (most likely constrained)
• Placeholders for raw data (evidence) and for structured family history
• Section outline:
  - Content sections: Genetic Variations, Gene Expression, others
  - Sub-sections in each content section: Specimen, Findings, Interpretations, Recommendations, Test Information, Family History
  - Open the draft outline
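The section outline on this slide can be pictured as a simple nested structure in which every content section carries the same sub-sections. This is a sketch under assumptions: the names `CONTENT_SECTIONS`, `SUBSECTIONS` and `build_outline` are hypothetical, the six sub-section titles are read from the slide text, and the slide's "others" content sections are left out because they are not named.

```python
CONTENT_SECTIONS = ["Genetic Variations", "Gene Expression"]

SUBSECTIONS = [
    "Specimen",
    "Findings",
    "Interpretations",
    "Recommendations",
    "Test Information",
    "Family History",
]


def build_outline(content_sections=CONTENT_SECTIONS, subsections=SUBSECTIONS):
    """Map each content section title to its own list of sub-section titles."""
    return {section: list(subsections) for section in content_sections}


outline = build_outline()
for section, subs in outline.items():
    print(section)
    for sub in subs:
        print("  " + sub)
```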

Page 10: HL7 Clinical Genomics and Structured Documents  Work Groups

Technical Issues

• Design & register genotype-phenotype templates
  - Somewhat similar to the CCD templates for “Allergies, Adverse Reactions, Alerts”, where the ‘agent’ is the genomic entity/observation and the reaction is the phenotypic information
  - Note that in CCD the relationship is fixed to “MFST”, while in genomics we’ll have a variety of codes representing various ‘genotype-phenotype’ relationships
  - Enable associating a genotype to phenotypes in several places across the document (reference an observation)
• Links to HL7 v3 Clinical Genomics instances
  - Similar to referencing images in CDA Diagnostic Report IG
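To make the template idea more concrete, the sketch below builds a heavily simplified genotype observation that points at a phenotype observation through an `entryRelationship` whose `typeCode` is a parameter rather than a fixed "MFST". It is an illustration only, not a schema-valid CDA fragment: namespaces, template ids and required child elements are omitted, the helper name and the text strings are hypothetical, and "CAUS" merely stands in for whichever relationship codes the IG would eventually define.

```python
import xml.etree.ElementTree as ET


def genotype_phenotype_entry(genotype_text, phenotype_text, relationship_code):
    """Nest a phenotype observation under a genotype observation."""
    genotype = ET.Element("observation", classCode="OBS", moodCode="EVN")
    ET.SubElement(genotype, "text").text = genotype_text
    # The relationship type is variable here, unlike the fixed MFST in CCD.
    relationship = ET.SubElement(
        genotype, "entryRelationship", typeCode=relationship_code
    )
    phenotype = ET.SubElement(
        relationship, "observation", classCode="OBS", moodCode="EVN"
    )
    ET.SubElement(phenotype, "text").text = phenotype_text
    return genotype


entry = genotype_phenotype_entry(
    "genotype observation (placeholder)",
    "phenotype observation (placeholder)",
    "CAUS",
)
print(ET.tostring(entry, encoding="unicode"))
```

Keeping the relationship code as a parameter mirrors the point made on the slide: one structural pattern, many possible genotype-phenotype relationship types.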

Page 11: HL7 Clinical Genomics and Structured Documents  Work Groups

[Figure callouts: “Cause of allergy”; “Allergen is manifested by…”; “Manifestation of the allergy”]

Page 12: HL7 Clinical Genomics and Structured Documents  Work Groups

Referencing a DICOM Object

Page 13: HL7 Clinical Genomics and Structured Documents  Work Groups

The End

• Thank you for your attention…

• Questions?