CentoMD® 3.0 Handbook_V1_May2016
Handbook
Precautions/warnings:
For professional use only.
To support clinical diagnosis.
Contents
Introduction
Intended use
Facts and Features
Technologies used
Data acquisition and curation policy
Database curators
Data acquisition
Curation workflow
Quality status
Variant-related information
Genetic variants
Variant location
Variant type on DNA level
Coding effect
Variant zygosity
Allele frequency at CentoMD®
Publication status
Clinical significance according to CentoMD®
Information on disease and inheritance
Individual-related information on phenotype and demographics
Clinical statement of CENTOGENE AG
Appendix
Abbreviations used in CentoMD® 3.0
Evidence-based annotation rules to determine the clinical statement
Glossary
Introduction
Diagnosing a patient with a rare disease is a complex task because not all existing genetic
variants have been described or precisely annotated. Medical professionals need access to
all available knowledge about the genetic variants detected in a patient in order to
establish the most accurate diagnosis possible.
CentoMD® is a holistic database that combines phenotype and genotype information
gathered from genetic tests conducted at CENTOGENE AG. This means that every variant
reported in CentoMD® is linked to at least one clinically described individual analyzed
through a standardized workflow with accredited quality. CentoMD® is also a
growing database; newly generated data are imported quarterly.
This handbook describes the content of CentoMD®, how this content is generated, how
clinical significance classes are defined, and how quality standards are fulfilled. The
accompanying CentoMD® user guide provides a detailed description of how to use this
web-based database.
Intended use
CentoMD® is browser-based software that provides a comprehensive and unique
repository of genetic and clinical information based on patients’ diagnoses. It aids
medically trained professionals in the evaluation of the genetic variants that have been
identified in their own patients. This enhances the validity of the genetic analytical
workflow and aids clinicians in evaluating treatment options for patients with
hereditary diseases. It correlates the clinical information of consented patients and
probands of different ethnic backgrounds with a large dataset of genetic variants and
biomarkers (where available). The genetic variants are detected using accredited
laboratory technologies: Sanger sequencing, NGS and WES, as well as insertion and
deletion analysis by MLPA/qPCR.
Facts and Features
CentoMD® provides detailed information on variants detected in consented individuals
who were referred for genetic testing by their physicians in order to evaluate whether
they are affected by, or are carriers of, mutations that cause rare hereditary diseases.
This patient cohort is a unique representation of the global population originating from
more than 100 countries. The allele frequencies stated in CentoMD® reflect the
frequency observed in this particular worldwide cohort. For every analyzed individual,
CentoMD® provides information about the genotype-phenotype correlation based on
tested clinical cases. Therefore, all genetic variants are associated with epidemiological
data and clinical information – such as signs and symptoms of the disease – if described by
the physician.
CentoMD® 3.0 contains more than 40,000 variants which are classified and curated (see
variant quality status under “Quality status”). In total, ~2,900 phenotypes and ~200
million alleles have been identified in > 74,000 screened individuals. The current release
contains more than 11,000 HPO (Human Phenotype Ontology) terms and approximately
23,000 individual-HPO term associations.
CentoMD® 3.0 provides the following key features:
o Classified variants and WES variants are now integrated in the Genotype to Phenotype
module: Based on approved gene symbols given by the users, CentoMD® provides
detailed data on corresponding genetic variants and the associated epidemiological
data and clinical information following HPO nomenclature.
o Advanced Phenotype to Genotype module: Based on HPO terms provided by the
users, CentoMD® provides hints on candidate genes and related variants underlying
the phenotype of interest.
o Interactive search interface: Users can search, sort, filter and access specific
data content with simple clicks.
o For clinically relevant (CRV) and uncertain (VUS) variants, data can be retrieved at
4 different levels: variant rationale, curated individuals, statistics, and individual
view. Users can see the reasons behind the variant classification, and view
statistics and detailed individual-related data.
Rationale: Summary supporting the clinical significance according to the ACMG
guidelines and internal evidence. In the current release, ~3,700 CRV/VUS are
linked with rationales.
Curated individuals: Detailed information on curated individuals tested
positive for the variant of interest.
Statistics: Statistical analyses of curated individuals tested positive for the
variant of interest.
Individual view: Information on individuals (curated and uncurated) tested
positive for the variant of interest as well as classified and curated and/or
classified variants associated with each individual.
o Co-occurrences are indicated: Users can view the association of the variant of
interest with other CRV/VUS variants in the same gene or other genes.
o Data export functions: Users can export data into a read-only Excel file.
o The annotation and classification of genetic variants is strictly curated by medical
professionals: Users have access to high quality data.
o 56% of CentoMD® classified and curated CRV/VUS variants are unpublished: Users
can access data of CRV/VUS which have not been previously published in
literature.
o Users are notified when variants are re-classified: Users get the latest information
on the clinical significance class of variants of interest.
Technologies used
The following validated technologies are used at CENTOGENE AG to detect changes at the
genetic level and to identify the cause of the disease:
o Sanger: Classical method of DNA sequencing, developed by Fred Sanger, using
chemically altered "dideoxy" bases to terminate newly synthesized DNA fragments
at specific bases (either A, C, T, or G). These fragments are then size-separated,
and the DNA sequence can be read.
o NGS: Next-Generation Sequencing: High-throughput sequencing technology,
allowing the parallel sequencing of multiple genes, producing thousands or millions
of sequences concurrently.
o qPCR: Quantitative Polymerase Chain Reaction: Method to amplify and
simultaneously quantify a targeted DNA molecule. Used especially for detecting
large/gross deletions/duplications and gene rearrangements.
o MLPA: Multiplex Ligation-dependent Probe Amplification: Variation of the
multiplex PCR that permits multiple targets to be amplified with only a single
primer pair. Used especially for detecting large/gross deletions/duplications and
gene rearrangements.
o Other method: Used when another methodology has been employed to detect the
variants (such as fragment length analysis).
o WES: Whole Exome Sequencing: Brute-force approach that uses modern-day
sequencing technology and DNA sequence assembly tools to piece together all
coding portions of the genome. The sequence is then compared to a reference
genome and any differences are noted.
Interpretations of the enzymatic activities and biomarker levels are provided, when
available, as supporting evidence for the relevance of the detected genetic change. For
example, for Fabry disease, which is an X-linked rare genetic lysosomal storage disease,
measurements of enzymatic activities are conducted in males, and measurements of the
biomarker levels are conducted in both males and females.
The terms used to describe the results of biochemical analyses are explained below:
o Biochemical analysis: Method to analyze enzymatic activity or levels of biomarkers
in samples obtained from patients usually suspected of being affected by a
metabolic disorder. This is a test performed via Tandem Mass Spectrometry to
detect, diagnose, and monitor diseases, disease processes, and susceptibility, and
to determine a course of treatment.
o Biomarker interpretation: Evaluation of the biomarker levels compared to the
reference interval:
Normal: Biomarker levels are within the normal range (no change).
Pathological: Biomarker levels are significantly increased compared to the
normal range.
Slightly decreased: Biomarker levels are only slightly decreased compared to
the normal range.
Slightly increased: Biomarker levels are only slightly increased compared to
the normal range.
o Enzyme interpretation: Evaluation of the enzyme activity compared to the
reference interval:
Normal: Levels of activity are within the normal range (no change).
Pathological: Levels of activity are significantly decreased compared to the
normal range.
Slightly decreased: Levels of activity are only slightly decreased compared
to the normal range.
Slightly increased: Levels of activity are only slightly increased compared to
the normal range.
Data acquisition and curation policy
Curation is the process of collection, association, update and review of genetic and
phenotypic data of patients genetically analyzed at CENTOGENE AG into a structured and
standardized format. It utilizes a combination of computer-based tools and manual
review in order to assure the accuracy, efficiency and quality of the curation process.
Database curators
CentoMD® curators are biologists with a strong background in human genetics. They
continuously undergo extensive training to ensure curation consistency and
standardization. They confirm that CentoMD® is error-free (items properly associated and
interpreted, with no inconsistencies or discrepancies relative to in-house observations
and external sources), and they close the curation process by manually approving that the
reviewed and curated data agree with the standard procedures established in house.
Data acquisition
Data gathering and variant curation are procedures developed and implemented in
web-based software that is compliant with the HGNC, HGVS and HPO nomenclatures, allowing
collection of variants detected in nuclear coding, nuclear non-coding and mitochondrial
genes. The software integrates in-house sample management systems and analysis
platforms, and additionally utilizes external databases, providing the curator with a
comprehensive and straightforward overview of the evidence regarding genotype-phenotype
correlation available in-house versus external information.
The data is gathered by a combination of manual submission and data import following an
individual-oriented model where characteristics belonging to a particular individual
(patient information, clinical data, methodology and detected genetic variants) are
stored and associated together.
Curation workflow
To provide high-quality data, the curation process at CENTOGENE AG is divided into 3
phases: variant-wise, individual-wise and warnings-wise procedures.
Curation by variant: To begin the curation process, the variant-linked information is
reviewed. This includes approval of variant nomenclature, terminology, accuracy,
consistency, and record completeness.
Curation by individual: Before curation by individual can start, all variants detected in
this individual must be approved. This phase aims at assuring that the entries belonging
to an individual follow the rules for the clinical statement closely, and that all
associated data are in agreement with the agreed guidelines. The following factors are
considered critical for the clinical statement: variant clinical significance, patient
genotype (number of clinically relevant changes, their zygosity and location, i.e. cis
vs. trans), inheritance pattern of the disorder, the sex of the patient (for X-linked
diseases), the phenotypic description, and, if available, levels of biomarkers.
Curation by warning: The database generates warnings at different levels (variant,
individual, gene, database levels) to detect errors, invalid terms and nomenclatures, and
inconsistencies, and can provide hints where updates and reviews are necessary. Most of
these warnings are due to additional evidence obtained internally (medical reports
issued at CENTOGENE AG) or detected externally (e.g. additional articles, publications
and external databases). Each warning is manually resolved. Whenever additional
evidence becomes available, the variants are revised and re-classified accordingly.
Quarterly, all approved individuals are anonymized and then released to CentoMD®,
offering the most complete and up-to-date information possible to its users.
CentoMD® is a constantly growing and enriched database. Whenever additional evidence
provided by the medical professionals in house or by peer-reviewed literature becomes
available, the variants are revised and re-classified accordingly. A detailed overview of
the clinical significance classes captured in CentoMD® is provided in the chapters
“Variant-related information” and “Clinical significance according to CentoMD®”.
Quality status
CentoMD® 3.0 offers a dataset of variants derived from the integration of classified
variants and WES variants and processed through a standardized workflow which follows
international standards and ensures high data quality. In CentoMD® 3.0, different types
of variant and individual quality status are indicated.
There are three types of variant quality status:
o Classified and curated (++): a variant has been assigned to a clinical significance
class and curated by strictly following the ACMG guidelines and internal expertise.
o Classified (+): a variant has been assigned to a clinical significance class according
to ACMG guidelines but has not yet been curated.
o Unclassified (0): a variant has not yet been assigned to any clinical significance
class due to the lack of information. Further evaluation is required.
There are two types of individual quality status:
o Curated (++): An individual associated with classified and curated CRV and/or VUS.
o Uncurated (+): An individual associated with classified (only) CRV and/or VUS or
with unclassified variants.
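The status symbols above can be modelled as a small lookup. The following Python sketch is illustrative only: the enum and function names are hypothetical, and the rule used to derive an individual's status (curated only if every associated variant is classified and curated) is an assumed reading of the definitions above, not a rule stated explicitly in this handbook.

```python
from enum import Enum


class VariantQuality(Enum):
    """Variant quality status symbols as listed in the handbook."""
    CLASSIFIED_AND_CURATED = "++"
    CLASSIFIED = "+"
    UNCLASSIFIED = "0"


class IndividualQuality(Enum):
    """Individual quality status symbols as listed in the handbook."""
    CURATED = "++"
    UNCURATED = "+"


def individual_quality(variant_statuses):
    """Derive an individual's status from the statuses of its variants.

    ASSUMPTION (not from the handbook): an individual counts as curated only
    if it has at least one variant and all of them are classified and curated.
    """
    statuses = list(variant_statuses)
    if statuses and all(s is VariantQuality.CLASSIFIED_AND_CURATED for s in statuses):
        return IndividualQuality.CURATED
    return IndividualQuality.UNCURATED
```

Under this assumption, an individual carrying one classified-and-curated variant and one variant that is only classified would be reported as uncurated.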
Variant-related information
Genetic variants
CentoMD® includes genetic variants detected in all types of genes. A gene is defined by a
sequence of DNA that represents a basic unit of heredity, being expressed in RNA and
proteins.
o Mitochondrial: A gene located in the mitochondria.
o Nuclear coding: A gene located in the cell nucleus of a eukaryote that encodes a
protein.
o Nuclear non-coding: A gene located in the cell nucleus that does not encode a
protein product.
In CentoMD®, each gene is linked with a transcript or reference sequence, i.e. a digital
nucleic acid sequence, assembled by scientists as a representative example of a species'
set of genes. Coding DNA reference sequence refers to a cDNA-derived sequence
containing the full length of all coding regions and non-coding untranslated regions.
According to the reference sequence used, the genetic variants are linked with the
corresponding location within the gene and with a particular mutation type on three
different levels: genomic/mitochondrial, cDNA, and protein, closely following the HGVS
guidelines and recommendations, for both small changes and gross/gene rearrangements.
o Genomic DNA change: Change at gDNA level following numbering based on genomic
DNA reference sequence.
o cDNA change: Change at cDNA level following numbering based on coding DNA
reference sequences.
o Protein change: Change at protein level following numbering based on the amino
acid sequence, using the one-letter amino acid code and X to designate a translation
termination codon.
Variant location
Variant location refers to the location of the DNA change relative to the transcriptional
initiation site, initiation codon, polyadenylation site, or termination codon of the
corresponding gene.
o Upstream: The region located 5' (upstream) from the 5'UTR region of the gene.
o 5'UTR (5' Untranslated Region): Sequences at the 5' end of messenger RNA (mRNA)
that are not translated into protein. It extends from the transcription start site to just
before the ATG translation initiation codon. The 5'UTR may contain sequences that
regulate translation efficiency or mRNA stability.
o Exon: The protein-coding DNA sequences of the gene.
o Intron: The non-coding regions of a gene that interrupt the protein coding regions
(exons).
o 3'UTR (3' Untranslated Region): Particular section of mRNA that starts with the
nucleotide immediately following the stop codon of the coding region. This region
contains transcription and translation regulating sequences.
o Downstream: The region located 3' (downstream) from the polyadenylation signal of
the gene.
For large deletions/duplications and gene rearrangements, the location is indicated by
the first and the last exon affected by the change (for example, e1_e9 stands for a large
deletion/duplication affecting exon 1 to exon 9). If, for example, only one exon is linked
with a large deletion, this indicates that this particular exon is completely removed (see
mutation types below).
Please note that for mitochondrial genes, only the following locations are valid:
upstream, exon 1, and downstream.
For nuclear non-coding genes, 5’UTR and 3’UTR are invalid entries.
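The location restrictions and the exon-range notation described above can be expressed as simple checks. The sketch below is a minimal illustration, not the validation logic used in CentoMD®: the function names and the gene-type string labels are hypothetical, and the single-exon form ("e3") is an assumption, since the text only shows the two-exon form e1_e9.

```python
import re

# Location labels taken from the list in this section.
LOCATIONS = {"upstream", "5'UTR", "exon", "intron", "3'UTR", "downstream"}


def is_valid_location(gene_type, location, exon_number=None):
    """Check a location against the restrictions stated for each gene type.

    gene_type is one of "mitochondrial", "nuclear coding" or
    "nuclear non-coding" (hypothetical labels chosen for this sketch).
    """
    if location not in LOCATIONS:
        return False
    if gene_type == "mitochondrial":
        # Only upstream, exon 1 and downstream are valid.
        if location == "exon":
            return exon_number == 1
        return location in {"upstream", "downstream"}
    if gene_type == "nuclear non-coding":
        # 5'UTR and 3'UTR are invalid entries.
        return location not in {"5'UTR", "3'UTR"}
    return gene_type == "nuclear coding"


def parse_exon_range(label):
    """Parse a label such as "e1_e9" into (first_exon, last_exon).

    ASSUMPTION: a single affected exon is written as e.g. "e3"; the handbook
    only shows the two-exon form.
    """
    match = re.fullmatch(r"e(\d+)(?:_e(\d+))?", label)
    if match is None:
        raise ValueError(f"unrecognized exon range: {label!r}")
    first = int(match.group(1))
    last = int(match.group(2)) if match.group(2) else first
    if last < first:
        raise ValueError(f"last exon precedes first exon: {label!r}")
    return first, last
```

For example, `parse_exon_range("e1_e9")` yields `(1, 9)`, and an intron location is rejected for a mitochondrial gene.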
Variant type on DNA level
The variant type describes the different types of changes that can occur in the DNA
sequence. The following types are included in CentoMD®:
o Chromosomal deletion: Loss of parts of chromosomes.
o Complex rearrangement: A change involving the structure or number of chromosomes;
also referred to as a chromosome mutation or chromosomal rearrangement.
o Conversion: Non-reciprocal transfer of information between homologous
sequences; one DNA sequence replaces a homologous sequence such that the
sequences become identical after the conversion event.
o Deletion: An abnormality in which part of a chromosome (carrying genetic
material) is lost.
o Duplication: Duplication of a sequence of DNA or section of chromosome.
o Gain of methylation: Gain relative to the normal DNA methylation level.
o Gene & regulatory region(s) deletion: Refers to loss of the entire gene and flanking
regions.
o Gene & regulatory region(s) duplication: Refers to the gain of the entire gene and
flanking regions.
o Gene deletion: Refers to loss of the entire gene.
o Gene duplication: Refers to gain/duplication of the entire gene.
o Gross deletion: Refers to loss of parts of a gene.
o Gross duplication: Refers to gain/duplication of part(s) of a gene.
o Gross inversion: Refers to 180-degree inversion of part(s) of a gene.
o Insertion/Deletion (Indel): Refers to the mutation class that includes a combination
of both insertions and deletions.
o Insertion: Genetic mutation where one or more nucleotides are added (inserted)
into a DNA sequence, or it may involve portions of a chromosome.
o Inversion: Chromosomal abnormality where a segment of a chromosome is rotated
180 degrees and reinserted.
o Loss of methylation: Loss of the normal DNA methylation level.
o Other/complex: Refers to all other types not included in any category.
o Pathological allele (D4Z4 motif): Deletion of 3.3-kb repeats from a chromosomal
tandem repeat called D4Z4 located near the end of chromosome 4 at the 4q35-ter
location. D4Z4 is a large polymorphic repeat structure consisting of 1–100 KpnI
units; it contains an ORF encoding a putative homeobox protein called DUX4.
o Repeat expansion: Refers to an increased number of repeats of a genomic tandemly
repeated DNA sequence.
o Retrotransposon insertion: Retrotransposons (also called transposons via RNA
intermediates) are genetic elements that can amplify themselves in a genome, and
can induce mutations by inserting near or within genes. Retrotransposon-induced
mutations are relatively stable, because the sequence at the insertion site is
retained as they transpose via the replication mechanism.
o Substitution: A sequence change where one nucleotide is replaced by one other
nucleotide. Substitutions are described using a ">"-character (indicating "changes
to").
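As an illustration of the ">" notation for substitutions, the following Python sketch parses a simple description at cDNA level such as c.76A>T. It is deliberately simplified and is not a full HGVS parser: it only accepts plain numeric positions in the coding sequence, so descriptions with intronic offsets (e.g. c.88+2T>G) or UTR positions (e.g. c.-14G>C) are not matched. The function name is hypothetical.

```python
import re

# Matches simple coding-DNA substitutions such as "c.76A>T".
_SUBSTITUTION = re.compile(r"c\.(\d+)([ACGT])>([ACGT])")


def parse_substitution(description):
    """Return (position, reference_base, new_base), or None if not matched."""
    match = _SUBSTITUTION.fullmatch(description)
    if match is None:
        return None
    position, ref, alt = match.groups()
    return int(position), ref, alt
```

Here `parse_substitution("c.76A>T")` gives the position 76 with A changing to T, while a deletion such as "c.76del" is not matched.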
Coding effect
The coding effect describes the sequence changes at protein level. The following types
are distinguished:
o Effect unknown: The coding effect on protein level has not been analyzed. An
effect is expected but difficult to predict.
o Frameshift: Special type of amino acid deletion/insertion affecting an amino acid
between the first (initiation, ATG) and last codon (termination, stop), replacing
the normal C-terminal sequence with one encoded by another reading frame.
o Increased polyglutamine tract/expanded polyQ: Expansion of a polyglutamine tract, i.e. a
portion of a protein consisting of a sequence of several glutamine (Gln; Q) units.
o In-frame: A mutation that does not cause a shift in the triplet reading frame.
o Missense: Point mutation in which a single nucleotide change results in a codon
that codes for a different amino acid. Not all missense mutations are deleterious;
some changes can have no effect. Because of the ambiguity of missense mutations,
it is often difficult to interpret the consequences of these mutations in causing
disease.
o New translation initiation site: A change affecting the translation initiation codon
(Met-1) introducing a new upstream initiation codon extending the N-terminus of
the encoded protein.
o Non-coding: The change on DNA level produces no effect on protein, or the effect
of regulatory mutations is unknown.
o Nonsense: Point mutation in a sequence of DNA that results in a premature stop
codon, and in a truncated, incomplete protein product.
o Silent: A form of point mutation at DNA level resulting in a codon that codes for
the same amino acid without any functional change in the protein product.
o Splicing mutation: DNA changes affecting the splicing process (i.e. intron removal
and exons joining). Splice-site mutations occur within genes in the noncoding
regions (introns) just next to the coding regions (exons). Splice-site mutations can
eliminate an existing donor or acceptor site, which will cause an exon to be
skipped and possibly result in a frameshift.
o Start loss: A start-loss mutation is a point mutation in the ATG start codon that
prevents the original translation start site from being used. This kind of mutation
is generally expected to eliminate gene function.
o New translation termination codon: A change affecting the translation termination
codon (Ter/*) introducing a new downstream termination codon extending the C-
terminus of the encoded protein.
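The distinction between the frameshift and in-frame effects listed above follows from the triplet nature of the genetic code: an insertion/deletion keeps the reading frame only if its net length change is a multiple of three. A minimal Python sketch, with a hypothetical function name and assuming the change lies entirely within the coding sequence and does not affect a splice site or the start/stop codon:

```python
def indel_coding_effect(deleted_bases, inserted_bases):
    """Classify an insertion/deletion by its net effect on the reading frame.

    ASSUMPTION: the change lies entirely within the coding sequence and does
    not touch a splice site or the start/stop codon.
    """
    if deleted_bases < 0 or inserted_bases < 0:
        raise ValueError("base counts must be non-negative")
    if deleted_bases == 0 and inserted_bases == 0:
        raise ValueError("no bases changed")
    net_change = inserted_bases - deleted_bases
    # Codons are triplets: the frame is preserved only when the net length
    # change is a multiple of three.
    return "In-frame" if net_change % 3 == 0 else "Frameshift"
```

A single-base deletion is therefore a frameshift, whereas a deletion of three bases, or an indel that removes two bases and inserts five, is in-frame.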
Variant zygosity
Zygosity indicates if a variant is detected on one chromosome or on both chromosomes
and therefore describes the degree of similarity of the alleles for a trait in an organism.
The following zygosities are included in CentoMD®:
o Heterozygous (Het): The cells contain two different alleles at a gene locus.
o Homozygous (Hom): Identical alleles of the gene are present on both
homologous chromosomes.
o Hemizygous (Hem): Used for alleles detected in genes located on the X chromosome
in male cases.
For the mitochondrial variants, the zygosity must be read as the degree of heteroplasmy,
i.e. as a mixture of more than one type of mitochondrial DNA (mDNA) within a
cell/individual. In those cases where a mutant in mDNA is responsible for a disease, the
larger the proportion of mutant mitochondria, the more likely the person will show
symptoms of the disease.
Two degrees of heteroplasmy are included:
o Heteroplasmic: The cell has some mitochondria that have a mutation in the mDNA
and some that do not.
o Homoplasmic: The cell has a uniform collection of mDNA: either completely normal
mDNA or completely mutant mDNA.
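The two degrees of heteroplasmy can be illustrated with a small Python sketch. The function name and the copy-count inputs are hypothetical, and the sketch ignores the detection thresholds that a real assay would apply; it only encodes the definitions given above.

```python
def heteroplasmy_status(mutant_copies, total_copies):
    """Label a sample as heteroplasmic or homoplasmic from mDNA copy counts.

    ASSUMPTION: exact counts are known; assay detection limits are ignored.
    """
    if total_copies <= 0 or not 0 <= mutant_copies <= total_copies:
        raise ValueError("counts must satisfy 0 <= mutant_copies <= total_copies, total > 0")
    if 0 < mutant_copies < total_copies:
        # A mixture of mutant and normal mDNA.
        return "Heteroplasmic"
    # Uniform mDNA: either completely normal or completely mutant.
    return "Homoplasmic"
```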
Allele frequency at CentoMD®
This number indicates the allele frequency of a particular variant which was observed at
CENTOGENE AG in comparison to the total number of analyzed individuals.
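The handbook does not spell out the formula behind this number. As a hedged illustration only, the Python sketch below uses the conventional allele-frequency calculation for an autosomal variant in diploid individuals (each individual contributes two alleles; heterozygous carriers contribute one variant allele, homozygous carriers two). Both the formula and the function name are assumptions, not the documented CentoMD® calculation.

```python
def allele_frequency(heterozygous_carriers, homozygous_carriers, analyzed_individuals):
    """Conventional allele frequency for an autosomal variant.

    ASSUMPTION: diploid autosomal locus, so each analyzed individual
    contributes two alleles to the denominator.
    """
    if analyzed_individuals <= 0:
        raise ValueError("analyzed_individuals must be positive")
    if heterozygous_carriers < 0 or homozygous_carriers < 0:
        raise ValueError("carrier counts must be non-negative")
    if heterozygous_carriers + homozygous_carriers > analyzed_individuals:
        raise ValueError("more carriers than analyzed individuals")
    variant_alleles = heterozygous_carriers + 2 * homozygous_carriers
    total_alleles = 2 * analyzed_individuals
    return variant_alleles / total_alleles
```

With 10 heterozygous and 5 homozygous carriers among 1,000 analyzed individuals, this gives 20 variant alleles out of 2,000, i.e. a frequency of 1%.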
Publication status
The publication status indicates whether the identified variant has previously been
published in the literature as a disease-causing variant.
Additionally, the Single Nucleotide Polymorphism Database (dbSNP) ID is provided, if
available. The dbSNP is an archive of genetic variations within and across different
species developed and hosted by the National Center for Biotechnology Information
(NCBI) in collaboration with the National Human Genome Research Institute (NHGRI) and
available to the public.
Clinical significance according to CentoMD®
The classification of genetic germline variants is done according to the ACMG guidelines
(Richards et al. (2015), Genet. Med., doi:10.1038/gim.2015.30; except that neutral is
used instead of benign) with some modifications. These modifications arise from our
continuously growing internal expertise in the field of molecular diagnostics and are
represented mainly by new evidence regarding internally observed frequencies,
segregation data, genotype-phenotype correlation, co-occurrence, and enzymatic and
biomarker levels.
The detected genetic variants are first classified into one of three classes according to
their likelihood to predispose to or to cause the observed phenotype/disease (see Figure
1): clinically relevant variants (CRV), clinically irrelevant variants (CIV) and uncertain
variants (VUS).
The CRV class includes the following subclasses: pathogenic, likely pathogenic, risk
factors and modifiers. Classification is based on their impact on disease presence,
severity or increased susceptibility. The main adjustment to the ACMG guidelines concerns
the classification as pathogenic, which is only assigned to published variants for which
there is enough evidence for pathogenicity (e.g. loss of function variants (LoF), found in
at least two unrelated patients, with well-established functional studies or biochemistry
data). Variants that were classified as pathogenic or likely pathogenic according to HGMD
are re-assessed by evaluation of the original papers, and the variant is reclassified
accordingly. The following variants are classified as likely pathogenic:
o Novel variants (publication not available) that are LoF variants, or missense
variants leading to a novel amino acid change at a position where a pathogenic variant
was previously described (if highly conserved and predicted as damaging by in silico
tools).
o De novo variants that lead to insertions or deletions within a non-repetitive region,
or to an amino acid change (if highly conserved and predicted as damaging by in silico
tools).
o Variants found in at least 3 unrelated, similarly affected patients, or in 2
unrelated, similarly affected patients for whom biochemical confirmation is available or
familial segregation is present.
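The patient-count criterion for likely pathogenic variants described above can be written as a simple predicate. This Python sketch is illustrative only; the function and parameter names are hypothetical, and it covers just this one criterion, not the full classification workflow.

```python
def meets_recurrence_criterion(unrelated_affected_patients,
                               biochemical_confirmation=False,
                               familial_segregation=False):
    """Check the patient-count criterion for a likely pathogenic variant.

    True if the variant was found in at least 3 unrelated, similarly affected
    patients, or in 2 such patients with biochemical confirmation or familial
    segregation.
    """
    if unrelated_affected_patients >= 3:
        return True
    supporting_evidence = biochemical_confirmation or familial_segregation
    return bool(unrelated_affected_patients >= 2 and supporting_evidence)
```

Two unrelated patients without any supporting evidence do not satisfy the criterion; adding familial segregation for those two patients does.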
Figure 1: Classification of genetic variants in CentoMD® 3.0. The classification rules determining the clinical significance of a genetic variant are provided in the text. CG: CENTOGENE.
The CIV class includes the following subclasses: neutral, likely neutral, disease-associated
polymorphisms, and CENTOGENE (likely) neutral - published as (likely) pathogenic. Variants
are classified into this category based on their high frequency in population(s), no
observed impact on disease presence/severity/susceptibility, or detected non-segregation
and/or co-occurrence, etc. In addition, disease-associated polymorphisms are included for
disorders with known multigene, complex inheritance. Reported variants must have a
maximum MAF (minor allele frequency) of 5% in public databases, and the association
should be replicated by at least 2 independent studies or in 1 study with functional
evidence. When the internal evidence regarding the clinical significance of a variant is
inconsistent with external resources, the subclass “CENTOGENE (likely) neutral -
published as (likely) pathogenic” is used in order to emphasize the importance of this
observation. Variants of this category were detected in at least 2 unrelated,
healthy/unaffected individuals (taking into account, for example, age at onset for the
disease) or in at least 1 patient (affected with another genetic disease) in whom a CRV
has been previously identified.
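The numeric thresholds in the paragraph above can be summarized in two small predicates. The Python sketch below is a hedged illustration: the function and parameter names are hypothetical, MAF is expressed as a fraction (0.05 for 5%), and the checks cover only the stated thresholds, not the remaining expert judgement involved in classification.

```python
MAX_MAF = 0.05  # "maximum MAF of 5% in public databases"


def qualifies_as_disease_associated_polymorphism(maf, independent_studies,
                                                 functional_evidence=False):
    """Check the stated criteria for a disease-associated polymorphism."""
    if not 0.0 <= maf <= 1.0:
        raise ValueError("maf must be a fraction between 0 and 1")
    # Replicated by at least 2 independent studies, or reported in 1 study
    # that includes functional evidence.
    replicated = independent_studies >= 2 or (
        independent_studies >= 1 and functional_evidence
    )
    return bool(maf <= MAX_MAF and replicated)


def supports_centogene_neutral(unrelated_unaffected_carriers,
                               carriers_explained_by_other_crv):
    """Check the internal-evidence thresholds for the subclass
    "CENTOGENE (likely) neutral - published as (likely) pathogenic"."""
    return bool(unrelated_unaffected_carriers >= 2
                or carriers_explained_by_other_crv >= 1)
```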
The VUS class includes rare variants, whether or not reported in the literature, for which
the risk of developing/causing the disease is unknown, prediction software shows
inconsistent effects, or family studies did not support a clear statement on their impact
on the phenotype.
Variant re-evaluation and re-classification is a key feature of CentoMD® and is performed
regularly in the light of the literature, publicly available clinical databases and, most
importantly, CENTOGENE AG’s own continuously growing and improving
proprietary information.
Information on disease and inheritance
Every genetic disorder which has been suggested or suspected by the physician is
described according to the OMIM catalog. OMIM (Online Mendelian Inheritance in Man)
was developed for the World Wide Web by NCBI and contains a list of human genes and
genetic diseases with links to other relevant resources
(http://www.ncbi.nlm.nih.gov/omim). Every entry in OMIM is linked to a unique
identifier, which is also captured in CentoMD®.
Each genetic disorder is linked with the observed mode of inheritance (MOI). MOI is
defined by the manner in which a particular genetic trait or disorder is passed from one
generation to the next. The following MOIs are included in CentoMD®:
o Autosomal dominant (AD): The pattern of inheritance in which an affected
individual has one copy of a mutant gene and one normal gene on a pair of
autosomal chromosomes.
o Autosomal recessive (AR): The pattern of inheritance in which both copies of an
autosomal gene must be abnormal for a genetic condition or disease to occur.
o Digenic (Di): The pattern of inheritance that is similar to recessive inheritance,
except that the trait only develops when mutations are found in one copy of each
of the two independent genes simultaneously.
o Imprinting/Epigenetic (Imp/Epi): The pattern of inheritance by mechanisms not
directly involving nucleotide sequences, but paramutations and parental
imprinting.
o Mitochondrial (Mito): The pattern of inheritance of a trait encoded in the
mitochondrial genome.
o Multifactorial (MF): The pattern of inheritance caused by the interplay between
genetic factors and environmental factors.
o Pseudoautosomal dominant (P-AD): The inheritance pattern seen with genes in the
pseudoautosomal region of the X and Y chromosome that can exchange regularly
between the two sex chromosomes. Alleles for genes in the pseudoautosomal
region can show male-to-male transmission, and therefore mimic autosomal
inheritance, because they can cross over from the X to the Y during male
gametogenesis and be passed on from a father to his male offspring.
o X-linked (X): The mode of inheritance in which a mutation in a gene on the X
chromosome causes the phenotype to be expressed in males who are hemizygous
for the mutated gene (i.e., they have only one X chromosome) and in females who
are homozygous for the mutated gene (i.e., they have a copy of the gene mutation
on each of their two X chromosomes). Carrier females who have only one copy of
the mutation do not usually express the phenotype, although differences in X-
chromosome inactivation can lead to varying degrees of clinical expression in
carrier females.
o Y-linked (Y): The pattern of inheritance that may result from a mutant gene
located on a Y chromosome. By definition, only males are affected.
o Unknown (?): This mode of inheritance is selected for genes not yet associated with
any pathological condition or disease, therefore no pattern of inheritance has been
observed.
Individual-related information on phenotype and demographics
All patient data in CentoMD® is fully anonymized. The following epidemiological and
clinical data are reported for individuals associated with classified and curated CRV
and/or VUS in CentoMD®:
o Random patient ID: Unique identifier assigned to each consented individual in
CentoMD®.
o Finding: Indicates if a variant is related to the indication for testing. Primary findings
are variants related to the indication for testing. Incidental findings are derived from
whole exome sequencing (WES) and are pathogenic or likely pathogenic variants
identified in genes for which incidental findings are reported, based on the ACMG
recommendations for reporting of incidental findings in clinical exome and genome
sequencing (Genetics in Medicine, 2013). Incidental findings are unrelated to the
indication for testing.
o OMIM disease: OMIM number of the disease suspected by the corresponding physician
according to the clinical symptoms.
o MOI: Mode of inheritance. It is defined by the manner in which a particular genetic
trait or disorder is passed from one generation to the next.
o Anonymized random family number (ARFN): Unique family number used to keep all
members together when relationship links are provided.
o Pedigree: Indicates the connection/relation among individuals by blood, marriage, or
adoption. Based on the ARFN and the relationships within one family, it is possible to
reconstruct the family trees accordingly. In each family, the index patient is
indicated. The index patient represents the affected individual through whom the
family with a genetic disorder is first diagnosed.
o Sex: Indicates the biological state of the individual of being male, female or unknown
sex (when no information was provided or a prenatal case was analyzed).
o Age: Age at diagnosis. It is calculated as date of sample entry at CENTOGENE AG
minus date of birth, and is expressed in years. For patients referred to CENTOGENE AG
several times, the date of the first order entry is used by default to calculate the age
at diagnosis.
o Country: Country of sample origin. It indicates the area of the world the patient is
coming from. The basis for this information is the country from which the sample has
been sent to CENTOGENE AG. If the physician provides information about the ethnicity of
the patient (e.g. Canadian citizen of German origin), then the country of origin (in this
case Germany) is selected instead.
o Region: Continental region the patient is coming from.
o Clinical information (HPO terms): Description of features and characteristics that the
corresponding physician has provided as supporting evidence of the presence of a
particular disease translated into the vocabulary defined by the HPO
(http://www.human-phenotype-ontology.org/) by medical experts.
Sometimes it is not possible to describe the clinical picture accurately, because the
details are not given by the physician or only general assumptions have been made.
Such cases are documented in CentoMD® in the following manner:
No information/unknown: selected when no clinical information has been
provided;
Not affected/asymptomatic: selected when the physician has explicitly
indicated that the person is healthy, asymptomatic, or not affected;
Suspected/affected: selected when only very general statements are provided
by the physician (e.g. “patient suffering from Breast Cancer” or “clinical
features of Parkinson”).
o Variant zygosity: Indication if the variant is detected on one chromosome or on both
chromosomes.
o Total number of variants: Total number of detected variants for this case (clinically
relevant; clinically irrelevant) on this particular gene. For example, “10 (1 ; 9)” is to
be interpreted as follows: the total number of variants that were identified in this
proband/patient for this particular gene is 10, one of which is clinically relevant,
while 9 are clinically irrelevant variants.
o Genotype: Genetic constitution of this case with respect to the number of alleles and
their clinical significance for this particular gene.
o Enzyme and Biomarker interpretation: Interpretation of the enzyme activity and
biomarker levels compared to the reference interval.
o Clinical statement: The finding or the conclusion of the molecular genetic test
conducted at CENTOGENE AG.
o Sample type: Includes DNA, Cells, Tissue, Blood, DBS (dry blood spot), AF/CV
(amniotic fluid/ chorionic villi).
o Age at onset: Refers to the age at which an individual acquires, develops, or first
experiences a condition or symptoms of a disease or disorder.
o Carrier testing: Indicates whether the individual requested carrier screening
because a specific genetic variant had already been detected in other
family members.
o Consanguineous parents: Refers to the marriage between two genetically related
persons.
o Family history: Indicates the presence or the absence of a particular disorder or
symptomatology in blood relatives of a patient.
o Detailed family history: Detailed description of disorders from which direct blood
relatives of the patient have suffered.
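Two of the fields above are defined computationally: the age at diagnosis and the "Total number of variants" string. The following sketch shows one way to compute and parse them. The function names are hypothetical, and counting completed whole years is an assumption (the handbook only says the age "is expressed in years").

```python
import re
from datetime import date
from typing import Iterable, Tuple


def age_at_diagnosis(date_of_birth: date, sample_entry_dates: Iterable[date]) -> int:
    """Age in completed years at the first sample entry (assumed rounding)."""
    # For patients referred several times, the first order entry is used.
    first_entry = min(sample_entry_dates)
    years = first_entry.year - date_of_birth.year
    # Subtract one year if the birthday had not yet occurred at sample entry.
    if (first_entry.month, first_entry.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def parse_total_variants(text: str) -> Tuple[int, int, int]:
    """Parse a string such as "10 (1 ; 9)" into (total, relevant, irrelevant)."""
    match = re.fullmatch(r"\s*(\d+)\s*\(\s*(\d+)\s*;\s*(\d+)\s*\)\s*", text)
    if match is None:
        raise ValueError(f"unexpected format: {text!r}")
    total, relevant, irrelevant = (int(group) for group in match.groups())
    return total, relevant, irrelevant
```

For the handbook's own example, `parse_total_variants("10 (1 ; 9)")` yields a total of 10 variants, 1 clinically relevant and 9 clinically irrelevant.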
Clinical statement of CENTOGENE AG
The clinical statement is the finding or the conclusion of the molecular genetic test
conducted at CENTOGENE AG. The clinical statement may confirm or disprove the
suspected diagnosis, or serve to elucidate the genetic cause of an uncertain or
questionable condition or disease. When deriving the clinical statement, the following
factors are considered:
o The mode of inheritance of the disorder
o The patient’s genotype
o The clinical significance of all identified genetic variants
o The clinical data provided, if available
o Additionally, sex and/or biochemical evidence, if applicable
The evidence-based rules determining the clinical statement are summarized in
Table 1 and Figure 2. The following clinical statements are used in CentoMD®:
o Affected: Indicates an individual for whom the rules applied to determine the clinical
statement confirmed the suspected diagnosis.
o Probably affected: Refers only to male Fabry patients carrying a VUS associated
with pathological enzymatic levels only, but not with pathological biomarker
levels. Identification of males carrying that particular VUS with pathological
biomarker levels triggers re-classification of the VUS as a likely pathogenic variant.
o At least carrier: Describes a patient suspected of having a disease with autosomal
recessive mode of inheritance, who carries one CRV or VUS.
o Probably carrier: Indicates a carrier of a VUS screened for a recessive
disorder, or a female carrier of a VUS screened for an X-linked disorder.
o Carrier: An individual who is heterozygote or other/complex (like 2 heterozygous
mutations located in cis) for an autosomal recessive disorder. This statement is not
accepted for autosomal dominant disorders.
o Increased risk of developing the disease: Describes an individual carrying the
disease-causing mutation(s) where either the clinical details were not provided or
the patient is too young to develop the disorder. Usually used for late-onset
disorders.
o Uncertain: Indicates an individual carrying genetic variant(s) with unknown clinical
significance.
o Unaffected: Indicates an individual in whom susceptibility to the disease was not
confirmed with respect to the screened gene.
For example, for an autosomal dominant disorder where the patient’s genotype is heterozygote,
meaning the patient carries one clinically relevant variant (except VUS), the expected clinical
statement is either “Affected” or “Increased risk of developing the disease” (according to the
provided clinical information).
Appendix
Abbreviations used in CentoMD® 3.0
Evidence-based annotation rules to determine the clinical statement
MOI Mode of Inheritance
Abbreviation Definition
AD Autosomal dominant
AR Autosomal recessive
Di Digenic
Imp/Epi Imprinting/Epigenetic
Mito Mitochondrial
MF Multifactorial
P-AD Pseudoautosomal dominant
X X-linked
Y Y-linked
? unknown
Genotype
Abbreviation Definition
Comp Het Compound heterozygote
Hem Hemizygote
Het Heterozygote
Hom Homozygote
Other Other/complex
WT Wild type
Zygosity
Abbreviation Definition
Hem Hemizygous
Het Heterozygous
Hom Homozygous
Genotype1)  MOI2)     Significance3)  CI4)  Clinical statement
Hom/Hem     AD        Path            -     increased risk
Hom/Hem     AD        Path            +     affected
Hom/Hem     AD        Path            ?     affected / increased risk
Hom/Hem     AD        VUS             -     uncertain
Hom/Hem     AD        VUS             +     uncertain
Hom/Hem     AD        VUS             ?     uncertain
Hom/Hem     AR        Path            -     increased risk
Hom/Hem     AR        Path            +     affected
Hom/Hem     AR        Path            ?     affected
Hom/Hem     AR        VUS             -     uncertain
Hom/Hem     AR        VUS             +     uncertain
Hom/Hem     AR        VUS             ?     uncertain
Hom/Hem     X-linked  Path            -     increased risk
Hom/Hem     X-linked  Path            +     affected
Hom/Hem     X-linked  Path            ?     affected / increased risk
Hom/Hem     X-linked  VUS             -     uncertain
Hom/Hem     X-linked  VUS             +     uncertain
Hom/Hem     X-linked  VUS             ?     uncertain
Het         AD        Path            -     increased risk
Het         AD        Path            +     affected
Het         AD        Path            ?     affected / increased risk
Het         AD        VUS             -     uncertain
Het         AD        VUS             +     uncertain
Het         AD        VUS             ?     uncertain
Het         AR        Path            -     carrier
Het         AR        Path            +     carrier
Het         AR        Path            ?     carrier
Het         AR        VUS             -     probably carrier
Het         AR        VUS             +     probably carrier
Het         AR        VUS             ?     probably carrier
Het         X-linked  Path            -     carrier
Het         X-linked  Path            +     carrier
Het         X-linked  Path            ?     carrier
Het         X-linked  VUS             -     uncertain
Het         X-linked  VUS             +     uncertain
Het         X-linked  VUS             ?     uncertain

Genotype1)  MOI2)     Significance3)  Significance 2 3)  CI4)  Clinical statement
Comp Het    AD        Path            Path               -     increased risk
Comp Het    AD        Path            Path               +     affected
Comp Het    AD        Path            Path               ?     affected / increased risk
Comp Het    AD        Path            VUS                -     increased risk
Comp Het    AD        Path            VUS                +     affected
Comp Het    AD        Path            VUS                ?     affected / increased risk
Comp Het    AD        VUS             VUS                -     uncertain
Comp Het    AD        VUS             VUS                +     uncertain
Comp Het    AD        VUS             VUS                ?     uncertain
Comp Het    AR        Path            Path               -     increased risk
Comp Het    AR        Path            Path               +     affected
Comp Het    AR        Path            Path               ?     affected
Comp Het    AR        Path            VUS                -     at least carrier
Comp Het    AR        Path            VUS                +     at least carrier
Comp Het    AR        Path            VUS                ?     at least carrier
Comp Het    AR        VUS             VUS                -     uncertain
Comp Het    AR        VUS             VUS                +     uncertain
Comp Het    AR        VUS             VUS                ?     uncertain
Comp Het    X-linked  Path            Path               -     increased risk
Comp Het    X-linked  Path            Path               +     affected
Comp Het    X-linked  Path            Path               ?     affected
Comp Het    X-linked  Path            VUS                -     at least carrier
Comp Het    X-linked  Path            VUS                +     affected
Comp Het    X-linked  Path            VUS                ?     at least carrier
Comp Het    X-linked  VUS             VUS                -     uncertain
Comp Het    X-linked  VUS             VUS                +     uncertain
Comp Het    X-linked  VUS             VUS                ?     uncertain

Table 1: Evidence-based annotation rules to determine the clinical statement at CentoMD®. See Figure 2 for further illustration of the decision process.
1): the most detected annotation classes are included. The wild type genotype is excluded. For wild type the clinical statement is “Unaffected”.
2): Mode of Inheritance
3): indicates the clinical significance of the identified variant
4): clinical information
-: indicates the absence of signs and symptoms of the disease (i.e. healthy/unaffected)
+: indicates the presence of signs and symptoms of the disease
?: indicates that no clinical information was provided
5) Path: refers to a variant annotated as pathogenic, likely pathogenic or risk factor
6) VUS: uncertain variant
7) X-linked: Two X-linked diseases (i.e. Fabry disease and Hunter disease) do not follow these definitions closely, as additional information is available and used as a decision factor when selecting the finding. For these two diseases, please see the decision trees presented in Figure 3.
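The single-variant rows of Table 1 (Hom/Hem and Het) can be read as a lookup from genotype, MOI and variant significance to one clinical statement per clinical-information value. The sketch below is illustrative only. It assumes that the table rows follow the header order (AD, AR, X-linked; Path before VUS; CI in the order -, +, ?), which the extracted text does not state explicitly. The names `RULES` and `clinical_statement` are hypothetical. Compound heterozygotes and the Fabry/Hunter exceptions of footnote 7 are not covered.

```python
# Clinical-information (CI) symbols in the order assumed for Table 1.
CI_ORDER = ("-", "+", "?")

UNCERTAIN = ("uncertain",) * 3

# (genotype, MOI, significance) -> statements for CI "-", "+", "?".
# Assumption: rows are grouped by MOI, then significance, in header order.
RULES = {
    ("Hom/Hem", "AD", "Path"): ("increased risk", "affected", "affected / increased risk"),
    ("Hom/Hem", "AD", "VUS"): UNCERTAIN,
    ("Hom/Hem", "AR", "Path"): ("increased risk", "affected", "affected"),
    ("Hom/Hem", "AR", "VUS"): UNCERTAIN,
    ("Hom/Hem", "X-linked", "Path"): ("increased risk", "affected", "affected / increased risk"),
    ("Hom/Hem", "X-linked", "VUS"): UNCERTAIN,
    ("Het", "AD", "Path"): ("increased risk", "affected", "affected / increased risk"),
    ("Het", "AD", "VUS"): UNCERTAIN,
    ("Het", "AR", "Path"): ("carrier",) * 3,
    ("Het", "AR", "VUS"): ("probably carrier",) * 3,
    ("Het", "X-linked", "Path"): ("carrier",) * 3,
    ("Het", "X-linked", "VUS"): UNCERTAIN,
}


def clinical_statement(genotype, moi, significance=None, ci="?"):
    """Look up the clinical statement for a single-variant genotype."""
    # Footnote 1: wild type is excluded from the table and is "Unaffected".
    if genotype == "WT":
        return "unaffected"
    return RULES[(genotype, moi, significance)][CI_ORDER.index(ci)]
```

This matches the worked example in the clinical statement section: a heterozygous pathogenic variant in an autosomal dominant disorder gives "affected" when symptoms are reported and "increased risk" when they are absent.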
Figure 2: Decision trees that illustrate the evidence-based annotation rules which determine the clinical statement at CentoMD®. The decision levels illustrated are: MOI – Genotype – Clinical significance (variant effect) – Clinical information – Clinical statement (the caption of Table 1 also applies to this figure).
Figure 3: Decision trees that illustrate the evidence-based annotation rules which determine the clinical statement for Fabry and Hunter disease. The decision levels illustrated are: MOI – Genotype – Clinical significance (variant effect) – Clinical information – Clinical statement (the caption of Table 1 also applies to this figure).
Glossary
Biochemical analysis
Method to analyze enzymatic activity or levels of biomarkers in samples
obtained from patients usually suspected of being affected by a metabolic
disorder.
Enzyme interpretation
Evaluation of the enzyme activity compared to the reference interval.
Pathological Levels of activity are significantly decreased compared to the normal range.
Normal Levels of activity are comparable with the normal range (no change).
Slightly increased Levels of activity are only slightly increased compared to the normal range.
Slightly decreased Levels of activity are only slightly decreased compared to the normal range.
Biomarker interpretation
Evaluation of the biomarker levels compared to the reference interval.
Pathological Biomarker levels are significantly increased compared to the normal range.
Normal Biomarker levels are comparable with the normal range (no change).
Slightly increased Biomarker levels are only slightly increased compared to the normal range.
Slightly decreased Biomarker levels are only slightly decreased compared to the normal range.
Disease
Particular abnormal, pathological condition that affects part or all of an
organism. It is often construed as a medical condition associated with
specific symptoms and signs.
Mode of Inheritance
(MOI)
The manner in which a particular genetic trait or disorder is passed from one
generation to the next.
Autosomal dominant
(AD)
The pattern of inheritance in which an affected individual has one copy of a
mutant gene and one normal gene on a pair of autosomal chromosomes.
Autosomal recessive
(AR)
The pattern of inheritance in which both copies of an autosomal gene must
be abnormal for a genetic condition or disease to occur.
Digenic (Di)
The pattern of inheritance that is similar to recessive inheritance, except
that the trait only develops when mutations are found in one copy of each of
the two independent genes simultaneously.
Imprinting/Epigenetic
(Imp/Epi)
The pattern of inheritance by mechanisms not directly involving nucleotide
sequences, but paramutations and parental imprinting.
Mitochondrial (Mito) The pattern of inheritance of a trait encoded in the mitochondrial genome.
Multifactorial (MF) The pattern of inheritance caused by the interplay between genetic factors
and environmental factors.
Pseudoautosomal
dominant (P-AD)
The inheritance pattern seen with genes in the pseudoautosomal region of
the X and Y chromosome that can exchange regularly between the two sex
chromosomes. Alleles for genes in the pseudoautosomal region can show
male-to-male transmission, and therefore mimic autosomal inheritance,
because they can cross over from the X to the Y during male gametogenesis
and be passed on from a father to his male offspring.
Unknown (?)
This mode of inheritance is selected for genes not yet associated with
any pathological condition or disease, for which no pattern of
inheritance has therefore been observed.
X-linked (X)
The mode of inheritance in which a mutation in a gene on the X chromosome
causes the phenotype to be expressed in males who are hemizygous for the
mutated gene and in females who are homozygous for the mutated gene.
Y-linked (Y) The pattern of inheritance that may result from a mutant gene located on a
Y chromosome. By definition, only males are affected.
Gene Sequence of DNA that represents a basic unit of heredity, being
expressed in RNA and proteins.
Gene symbol The HUGO Gene Nomenclature Committee (HGNC) has assigned unique gene
symbols and names to almost 38,000 human loci, of which around 19,000 are
protein coding.
Nuclear coding A gene located in the cell nucleus of a eukaryote that encodes for protein.
Nuclear non-coding A gene located in the cell nucleus that does not encode for a protein
product.
Mitochondrial A gene located in the mitochondria.
Transcript/Reference
Sequence
Digital nucleic acid sequence, assembled by scientists as a representative
example of a species' set of genes. Coding DNA reference sequence refers to
a cDNA-derived sequence containing the full length of all coding regions and
non-coding untranslated regions.
cDNA DNA that is synthesized from a messenger RNA template; the single-stranded
form is often used as a probe in physical mapping.
mDNA An extranuclear double-stranded DNA found exclusively in mitochondria that
in most eukaryotes is a circular molecule and is maternally inherited.
Transcript used in
CentoMD®
The transcript that is used at CENTOGENE AG/CentoMD® as a reference
sequence.
Genotype Represents the genetic constitution of an individual with respect to the
number of alleles and their clinical significance identified for a particular
gene.
Compound heterozygote (Comp Het)
An individual carrying two different, heterozygous, in trans, clinically
relevant (includes uncertain, likely pathogenic, pathogenic, risk factor)
alleles at a given locus.
Hemizygote (Hem) A male individual carrying one clinically significant (includes pathogenic,
likely pathogenic, uncertain, risk factor) allele located on X-chromosome.
Heterozygote (Het) An individual carrying one clinically significant (includes pathogenic, likely
pathogenic, uncertain, risk factor) allele.
Homozygote (Hom) An individual carrying two identical, clinically relevant (includes pathogenic,
likely pathogenic, uncertain, risk factor) alleles at one locus.
Other/complex
(other)
Individuals carrying clinically relevant (includes pathogenic, likely
pathogenic, uncertain, risk factor) alleles in other combinations than
described above (e.g. two alleles located in cis, three heterozygous
mutations, one homozygous and one heterozygous, etc.).
Wild type (WT) Individuals carrying alleles with no clinical significance (includes neutral,
likely neutral, disease-associated polymorphism, CENTOGENE (likely) neutral
- published as (likely) pathogenic).
Clinical statement of
CENTOGENE AG
The clinical statement is the finding or the conclusion of the molecular
genetic test conducted at CENTOGENE AG.
Affected Indicates an individual where rules applied to determine final statement
confirmed the suspected diagnosis.
At least carrier
Describes a patient suspected for a disease with autosomal recessive mode
of inheritance, who carries one (likely) pathogenic variant or one variant
with uncertain clinical significance.
Carrier
An individual who is heterozygote or other/complex (like 2 heterozygous
mutations located in cis) for an autosomal recessive disorder. This statement
is not accepted for autosomal dominant disorders.
Increased risk of
developing the
disease (Risk)
Describes an individual carrying the disease-causing mutation(s) where
either the clinical details were not provided or the patient is too young to
develop the disorder. Usually used for late-onset disorders.
Probably affected
Refers only to Fabry male patients carrying a VUS associated with only
pathological enzymatic levels, but not with pathological biomarker levels.
Identification of males carrying that particular VUS with pathological
biomarker levels induces the VUS re-classification into a likely pathogenic
variant.
Probably carrier Indicates a carrier of a VUS screened for either recessive disorders or
females screened for X-linked disorders.
Uncertain Indicates an individual carrying genetic variant(s) with unknown clinical
significance.
Unaffected Indicates an individual where the susceptibility of the disease was not
confirmed in respect to the screened gene.
Screening method The test used to identify the cause of the disease.
MLPA Multiplex Ligation-dependent Probe Amplification: Variation of the
multiplex PCR that permits multiple targets to be amplified with only a
single primer pair. Used especially for detecting large/gross and gene
rearrangements.
NGS Next-Generation Sequencing: High-throughput sequencing technology,
allowing the parallel sequencing of multiple genes, producing thousands or
millions of sequences concurrently.
Other method Other methodology used to detect the variants (like fragment length).
qPCR Quantitative Polymerase Chain Reaction: Method to amplify and
simultaneously quantify a targeted DNA molecule. Used especially to detect
large/gross and gene rearrangements.
Sanger Classical method of DNA sequencing, developed by Fred Sanger, using
chemically altered "dideoxy" bases to terminate newly synthesized DNA
fragments at specific bases (either A, C, T, or G). These fragments are then
size-separated, and the DNA sequence can be read.
WES Whole Exome Sequencing: Brute-force approach that involves modern day
sequencing technology and DNA sequence assembly tools to piece together
all coding portions of the genome. The sequence is then compared to a
reference genome and any differences are noted.
Phenotype
Case ID Random patient ID referring to a consented individual where the diagnosis
was confirmed by genetic testing at CENTOGENE AG.
HPO ID Unique HPO identifier for the attributed HPO term.
HPO term Phenotypic description of individuals provided by medical experts and
translated into the vocabulary defined by the HPO.
Shared HPO terms Indicates how many HPO terms of a case analyzed at CENTOGENE AG
match the HPO terms provided by the users.
P-value Defines the likelihood of obtaining the corresponding similarity score or
higher by chance. The p-value is calculated by comparing individuals with
random symptoms and their similarity scores, i.e. it is derived from the
similarity score distribution. The higher the p-value, the more likely it is to
obtain the corresponding similarity score by chance. The p-value ranges
from 0 to 1, where 0 is best.
Similarity score Phenotypic semantic similarity measure based on the HPO. The similarity
score of two patients is a formal measure of their resemblance with
respect to their standardized symptoms. The score is calculated by a
pairwise comparison between each symptom of the two patients. The
higher the score, the more similar the patients.
Similar cases The cases analyzed at CENTOGENE AG which match the HPO terms
provided by the user. In Phenotype to Genotype module, by default only
similar cases sharing a minimum similarity score of 1 are indicated.
Clinical information
(HPO terms)
Description of features and characteristics that the corresponding physician
has provided as supporting evidence of the presence of a particular disease
translated into the vocabulary defined by the HPO by medical experts.
No information/
unknown
Selected when no clinical information has been provided.
Not affected/
asymptomatic
Selected when the physician has explicitly indicated that the person is
healthy, asymptomatic, or not affected.
Suspected/
Affected
Selected when only very general statements are provided by the physician
(e.g. "patient is suffering from Breast Cancer" or "clinical features of
Parkinson").
Individual Represents a unique individual who was tested for a certain disease,
condition or carrier status at CENTOGENE AG.
Sex Indicates the biological state of the individual of being male (m), female (f)
or unknown (?) sex (when no information was provided or a prenatal case
was analyzed).
Age at diagnosis Is calculated as date of sample entry at CENTOGENE AG minus date of
birth, and is expressed in years. For patients referred to CENTOGENE AG
several times, the date of the first order entry is used by default to
calculate the age at diagnosis.
Age at onset Refers to the age at which an individual acquires, develops, or first
experiences a condition or symptoms of a disorder.
Country Indicates the area of the world the patient is coming from. The basis for
this information is the country where the patient lives. If the physician
provides information about the ethnicity of the patient (e.g. Canadian
citizen of German origin), then the country of origin (in this case Germany)
is the item selected instead.
Pedigree Indicates the connection/relation among individuals by blood, marriage, or
adoption.
Index patient Represents the affected individual through whom the family with a genetic
disorder is brought to the attention of others.
Anonymized random
family number (ARFN)
Family unique number used to keep all members together when
relationship links are provided.
Variant A sequence variation in a gene.
Allele frequency at
CentoMD®
Indicates the allele frequency of a particular variant which was observed at
CENTOGENE AG in comparison to the total number of analyzed individuals.
cDNA change Change at cDNA level following numbering based on coding DNA reference
sequences.
Genomic DNA change Change at gDNA level following numbering based on genomic DNA
reference sequence.
Protein change Change at protein level following numbering based on the amino acid
sequence, using one letter amino acid code and X for designating a
translation termination codon.
Total number of
variants
The total number of detected variants for a case (clinically
relevant/uncertain; clinically irrelevant) on a particular gene.
Positive individuals Indicates how many times a particular variant was observed at
CENTOGENE AG in comparison to the total number of analyzed individuals
for a particular gene.
Positive individuals (%) Indicates how many times a particular variant was observed at
CENTOGENE AG relative to the number of analyzed individuals for a
particular gene (provided as %).
Location The location of the DNA change relative to the transcriptional initiation
site, initiation codon, polyadenylation site or termination codon of the
corresponding gene.
Downstream The region placed 3' (downstream) from the polyadenylation signal of the
gene.
Exon The protein-coding DNA sequences of the gene.
Intron The non-coding regions of a gene that interrupt the protein coding regions
(exons).
Upstream The region located 5' (upstream) from the 5'UTR region of the gene.
3'UTR 3' Untranslated Region: Particular section of messenger RNA (mRNA) that
starts with the nucleotide immediately following the stop codon of the
coding region. This region contains transcription and translation regulating
sequences.
5'UTR 5'-Untranslated Region: Sequences on the 5' end of mRNA but not
translated into protein. It extends from the transcription start site to just
before the ATG translation initiation codon. 5' UTR may contain sequences
that regulate translation efficiency or mRNA stability.
Clinical significance
according to CentoMD
Indicates the likelihood of this variant to predispose to or to cause the
disorder.
CENTOGENE (likely) neutral - published as (likely) pathogenic
Variants published in the literature as (likely) pathogenic, but at
CENTOGENE re-classified as (likely) neutral based on the observed
frequency or family segregation studies.
Clinically irrelevant
variant (CIV)
Includes variants of the following significance: neutral, likely neutral,
disease-associated polymorphism, CENTOGENE (likely) neutral - published
as (likely) pathogenic.
Clinically relevant
variant (CRV)
Includes variants of the following significance: likely pathogenic,
pathogenic, risk factor, modifier.
Disease associated
polymorphism (DP)
Variant reported to be significantly associated with a phenotype/disease.
Likely neutral Variants reported to be likely neutral, for which prediction software
indicates a probably non-pathological effect, and/or a high frequency in the population
is observed. This classification class is equivalent to “likely benign”.
Likely pathogenic Variants with probable pathogenicity, or the effect on the protein function
is predicted to be likely deleterious (>90% probability to cause the
disease).
Neutral Variants reported not to influence the disease risk of the individual, or
predicted to be neutral based on the high frequency in population, no
effect on protein or regulatory regions. This classification class is
equivalent to “benign”.
Modifier A genetic variant that can alter the expression of another gene in the
phenotype of an individual.
Uncertain variant
(VUS)
Variants reported in the literature with unknown risk of developing/
causing the disease, for which prediction software shows inconsistent effects,
or for which family studies did not support a clear statement on the impact on the
phenotype.
Pathogenic Variants that are known to cause the phenotype/disease.
Pathological D4Z4
allele
Large, polymorphic repeat structure associated with a rough and inverse
relationship between clinical severity and the residual repeat size, with
the smallest repeats causing the most severe phenotype.
Risk factor Variants reported to be associated with the phenotype/disease and
influencing the function(s) of the protein.
Secondary
mitochondrial
mutation
The primary molecular defect resides in a nuclear gene, which leads to
secondary mDNA abnormalities, such as loss of mDNA copy number or
multiple mDNA deletions.
Type of variant on DNA
level
Different types of change that can occur in the DNA sequence.
Chromosomal deletion Loss of parts of chromosomes.
Complex
rearrangement
Involves the structure or number of the chromosomes; also referred to as
chromosome mutation or chromosomal rearrangement.
Conversion Non-reciprocal transfer of information between homologous sequences;
one DNA sequence replaces a homologous sequence such that the
sequences become identical after the conversion event.
Deletion An abnormality in which part of a chromosome (carrying genetic material)
is lost.
Duplication Duplication of a sequence of DNA or section of chromosome.
Gain of methylation Increase in DNA methylation relative to the normal level.
Gene deletion Refers to loss of the entire gene.
Gene duplication Refers to gain/duplication of the entire gene.
Gene & regulatory region(s) deletion Refers to loss of the entire gene and flanking regions.
Gene & regulatory region(s) duplication Refers to the gain of the entire gene and flanking regions.
Gross deletion Refers to loss of parts of a gene.
Gross duplication Refers to gain/duplication of part(s) of a gene.
Gross inversion Refers to 180° inversion of part(s) of a gene.
Insertion/Deletion (Indel) Refers to the mutation class that includes a combination of both insertions
and deletions.
Insertion Genetic mutation where one or more nucleotides are added (inserted) into
a DNA sequence; it may also involve portions of a chromosome.
Inversion Chromosomal abnormality where a segment of a chromosome is rotated
180° and reinserted.
Loss of methylation Loss of the normal DNA methylation level.
Other/complex Refers to all other types not included in any category under Variant-
Mutation type.
Pathological allele (D4Z4 motif) Deletion of 3.3-kb repeats from a chromosomal tandem repeat called D4Z4
located near the end of chromosome 4 at the 4q35-ter location. D4Z4, a
large polymorphic repeat structure consisting of 1–100 KpnI units,
contains an ORF encoding a putative homeobox protein called DUX4.
Repeat expansion Refers to an increased number of repeats of a tandemly repeated genomic
DNA sequence.
Retrotransposon insertion Retrotransposons (also called transposons via RNA intermediates) are
genetic elements that can amplify themselves in a genome, and can induce
mutations by inserting near or within genes. Retrotransposon-induced
mutations are relatively stable, because the sequence at the insertion site
is retained as they transpose via the replication mechanism.
Substitution A sequence change where one nucleotide is replaced by one other
nucleotide. Substitutions are described using a ">"-character (indicating
"changes to").
Coding effect Refers to the impact the observed DNA change has at the protein level.
Effect unknown The coding effect on protein level has not been analyzed. An effect is
expected but difficult to predict.
Extension A variant affecting either the first codon (start, translation initiation,
N-terminus, ATG) or the last codon (translation termination, stop) and, as a
consequence, extending the protein sequence N- or C-terminally by one or
more amino acids.
Frameshift Special type of amino acid deletion/insertion affecting an amino acid
between the first (initiation, ATG) and last codon (termination, stop),
replacing the normal C-terminal sequence with one encoded by another
reading frame.
Increased polyglutamine tract/expanded polyQ Expansion of a polyglutamine tract, a portion of a protein consisting of a
sequence of several glutamine (Gln; Q) units.
In-frame A mutation that does not cause a shift in the triplet reading frame.
Missense Point mutation in which a single nucleotide change results in a codon that
codes for a different amino acid. Not all missense mutations are
deleterious; some changes have no effect. Because of this ambiguity, it is
often difficult to interpret the consequences of missense mutations in
causing disease.
New translation initiation site A change affecting the translation initiation codon (Met-1) introducing a
new upstream initiation codon extending the N-terminus of the encoded
protein.
Non-coding The change on DNA level produces no effect on protein, or the effect of
regulatory mutations is unknown.
Nonsense Point mutation in a sequence of DNA that results in a premature stop
codon, and in a truncated, incomplete protein product.
Silent A form of point mutation at DNA level resulting in a codon that codes for
the same amino acid but without any functional change in the protein
product.
Splicing mutation DNA changes affecting the splicing process (i.e. intron removal and exon
joining). Splice-site mutations occur within genes in the noncoding regions
(introns) just next to the coding regions (exons). Splice-site mutations can
eliminate an existing donor or acceptor site, which will cause an exon to be
skipped and possibly result in a frameshift.
Start loss A start-loss mutation is a point mutation in the ATG start codon that
prevents the original translation start site from being used. This kind of
mutation is generally expected to eliminate gene function.
Translation initiation codon A point mutation creating a new ATG start codon upstream of the original
translation start site. If the new ATG is
close enough to the original one (so that it is within the processed
transcript and downstream of a ribosome-binding site) and in frame, it will
be used to initiate translation, adding amino acids to the amino terminus
of the original protein.
Translation termination codon A change affecting the translation termination codon (Ter/*) introducing a
new downstream termination codon extending the C-terminus of the
encoded protein.
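As a minimal sketch of how several of the coding effects above relate to a DNA change, the Python snippet below applies a substitution written in the ">" notation to a single codon and labels the outcome as silent, missense, nonsense, or extension. A second helper separates in-frame from frameshift changes by the change in sequence length. The function names and the single-codon simplification are assumptions made for illustration and are not part of CentoMD®; the position of the codon within the gene (for example Met-1 for a start loss) is deliberately ignored.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def apply_substitution(codon, offset, change):
    """Apply a change such as "A>T" at a 0-based offset within a codon."""
    ref, alt = change.split(">")
    if codon[offset] != ref:
        raise ValueError("reference nucleotide does not match the codon")
    return codon[:offset] + alt + codon[offset + 1:]


def substitution_effect(codon, offset, change):
    """Label the coding effect of a single-nucleotide substitution (hypothetical helper)."""
    ref_aa = CODON_TABLE[codon]
    alt_aa = CODON_TABLE[apply_substitution(codon, offset, change)]
    if ref_aa == alt_aa:
        return "silent"
    if alt_aa == "*":
        return "nonsense"
    if ref_aa == "*":
        # The stop codon is lost, so translation continues past it.
        return "extension"
    return "missense"


def length_change_effect(ref_length, alt_length):
    """Label an insertion/deletion by its effect on the triplet reading frame."""
    difference = alt_length - ref_length
    if difference == 0:
        return "no length change"
    return "in-frame" if difference % 3 == 0 else "frameshift"
```

For example, `substitution_effect("CAG", 0, "C>T")` turns a glutamine codon into the stop codon TAG and is therefore labelled as nonsense, while a one-nucleotide insertion shifts the reading frame and a three-nucleotide insertion does not.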
Zygosity Indicates if a variant is detected on one chromosome or on both
chromosomes. Describes the degree of similarity of the alleles for a trait in
an organism.
Hemizygous (Hem) Used for alleles detected in genes located on the X chromosome in male
individuals.
Heterozygous (Het) State of a gene locus at which cells contain two different alleles of the gene.
Het/Hom/Hem Ratio indicating the number of individuals with each variant zygosity.
Homozygous (Hom) State of a gene locus at which identical alleles of the gene are present on both
homologous chromosomes.
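The Het/Hom/Hem figure can be illustrated with a short Python sketch that counts individuals per zygosity and joins the counts in that order. The function name and the input format (one zygosity label per individual) are assumptions for illustration only; how CentoMD® computes or displays the value internally is not described here.

```python
from collections import Counter


def het_hom_hem(zygosities):
    """Return the counts of Het, Hom and Hem individuals as "Het/Hom/Hem"."""
    counts = Counter(zygosities)
    # A Counter returns 0 for labels that never occur.
    return "{}/{}/{}".format(counts["Het"], counts["Hom"], counts["Hem"])
```

With this sketch, five individuals labelled Het, Het, Hom, Hem and Het give the string `3/1/1`.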
Degree of heteroplasmy Mixture of more than one type of mitochondrial DNA (mDNA) within a
cell/individual. In those cases where a mutation in mDNA is responsible for a
disease, the larger the proportion of mutant mitochondria, the more likely
the person will show symptoms of the disease.
Heteroplasmic Cell has some mitochondria that have a mutation in the mDNA and some
that do not.
Homoplasmic Cell has a uniform collection of mDNA: either completely normal mDNA or
completely mutant mDNA.
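The relationship between these three terms can be sketched in Python as a proportion: the degree of heteroplasmy is taken here as the fraction of mutant mDNA copies, and a cell is called homoplasmic when that fraction is exactly 0 or 1. The function name and the use of simple copy counts as input are assumptions for illustration; real measurements carry noise, so a practical tool would likely apply detection thresholds rather than exact comparisons.

```python
def heteroplasmy(mutant_copies, total_copies):
    """Return (degree, status) for a count of mutant and total mDNA copies."""
    if total_copies <= 0:
        raise ValueError("total_copies must be positive")
    if not 0 <= mutant_copies <= total_copies:
        raise ValueError("mutant_copies must lie between 0 and total_copies")
    degree = mutant_copies / total_copies
    # Completely normal (0) or completely mutant (1) mDNA is homoplasmic.
    status = "homoplasmic" if mutant_copies in (0, total_copies) else "heteroplasmic"
    return degree, status
```

A cell with 25 mutant copies out of 100 would be reported as heteroplasmic with a degree of 0.25.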
Publication status Indicates if the identified variant has previously been published in the
literature as a disease-causing variant or not.
dbSNP The Single Nucleotide Polymorphism Database (dbSNP) is a free public
archive for genetic variation within and across different species developed
and hosted by the National Center for Biotechnology Information (NCBI) in
collaboration with the National Human Genome Research Institute (NHGRI).
PMID The PubMed identifier, or PubMed unique identifier, is a unique number
assigned to each PubMed record.
Published Indicates that the identified genetic variant has already been published
and/or characterized and associated with clinical data.
Unpublished Indicates that the detected genetic variant has either not been previously
published in the literature, or is not yet associated with any disease.