Pitfalls with mtDNA analysis in medical genetics

Hans-Jürgen Bandelt (Hamburg)

EMBO World Programme Workshop on Human Evolution and Disease, Hyderabad, India. 6th – 9th December 2006

From the journal Oncogene, this year, one could pick up the following exciting news:

“Very strikingly, the mitochondrial sequences in this study (tumor samples as well as controls) with the Indian population revealed a unique profile of eight sequence variants, viz. A73G, A263G, A1438G, A2706G, A4769G, C7028T, A8860G and A15326G appeared at high frequencies in all samples and could be of evolutionary significance.”

Well, not quite ...

R0 =“... the profile could be specific to the Indian population.”

Leaving aside mutations that were systematically missed (such as A750G and C14766T, probably due to the use of a wrong reference sequence), the claim would translate into:

... Near-absence of haplogroup R0 could be specific to the Indian population.

Note that haplogroup R0 is specific to the West Eurasian mtDNA pool.

Absence of phylogenetic knowledge about human mitochondrial DNA is characteristic of clinical genetics.

In a paper from Muscle and Nerve, 2003, the authors (from Taipei) identified among 17 completely sequenced cases “a double mutation (A3243G and A14693G) in a patient with MELAS syndrome, who had a diabetic mother and normal siblings. The A14693G substitution is significant from structural and evolutionary points of view. This result indicates that mtDNA should be sequenced in its entirety for the complete evaluation of mitochondriopathy.”

However, the complete mtDNA sequence of that MELAS patient was not reported ...

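[Figure: phylogeny of haplogroup Y. Mutations shown on the branches: 16231, 16223, 16126, 14693, 14178, 10398, 8392; 15244, 14914, 7859, 6941, 5147, 482; 15460, 15221, 10097, 146; label Y with 16266, 3834.]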

However, it has been known since 2003 that A14693G is a characteristic mutation of the East Asian haplogroup Y.

“A14693G is not identified in 205 human controls and 76 randomly examined species.”

Bandelt, Yao & Kivisild (2005)

According to Trejaut et al. (2005), haplogroup Y2 does occur in parts of Taiwan.

Suppose that you now found A14693G in some disease context and consulted MITOMAP prior to publishing your case:

Green light from MITOMAP, thus – and here you go:

Dong-Ling Tang, Xin Zhou, Xia Li, Lei Zhao, Fang Liu

Diabetes Res Clin Pract. 2006 Jan 13

“… in a Chinese population, a total of 184 T2DM cases and 279 matched healthy controls were recruited. … Our results suggest that the mutations of T3394C and A14693G may contribute to genetic predisposition to T2DM, with the T16189C variant being associated with insulin resistance.”

Dong-ling Tang, Xin Zhou, Ke-yuan Zhou, Xia Li, Lei Zhao, Fang Liu, Fang Zheng, Song-mei Liu

Zhonghua Yi Xue Yi Chuan Xue Za Zhi. 2005 Dec 22

“A total of 184 cases of type 2 diabetes mellitus and 210 matched healthy controls with normal glucose tolerance were recruited for the study. …. The mutations of 3394 (T-->C) and 14693 (A-->G) may contribute to the genetic predisposition to type 2 diabetes; 16189 (T-->C) variant is associated with insulin resistance and risk factor of diabetes.”

Double-publication does not seem to be a rare phenomenon in China.

Another way of recycling is to publish one and the same mutation as novel multiple times – either found in one and the same patient who was examined again and again, or in several different patients – as long as the mutation does not find its way into MITOMAP.

The control group investigated by Tang et al. (2006) has the peculiar feature that in a sample of size 279 there was absolutely no haplogroup M9a'b lineage (T3394C). Other studies would, however, rather suggest a figure of 2.7% (in Han Chinese). If this percentage were the true frequency, the event of observing 0/279 M9a'b lineages would have a probability of about 0.05%!
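The 0.05% figure can be reproduced with a short calculation. The sketch below assumes that the 279 controls are independent draws from a population in which M9a'b has frequency 2.7%, so that the chance of seeing no carrier at all is (1 − 0.027)^279. The function name is only illustrative.

```python
def prob_zero_carriers(frequency: float, sample_size: int) -> float:
    """Probability of drawing no carrier at all in `sample_size` independent draws."""
    return (1.0 - frequency) ** sample_size


# Assumed inputs, taken from the text: 2.7% population frequency, 279 controls.
p_zero = prob_zero_carriers(0.027, 279)
print(f"P(0 of 279) = {p_zero:.5f} ({p_zero:.2%})")
```

Under the same independence assumption, this is the binomial probability of zero successes. A more careful treatment would also account for the uncertainty in the 2.7% estimate itself.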

There are further cases, in cancer research, where control group data are almost void of any variation and thus are either cooked or fabricated, e.g. taken over (via copy-and-paste) from other studies (which have a most peculiar error spectrum themselves).

[Figure: mtDNA phylogeny relating the reported sequences UC-Case 1, UC-Case 2, UC-Case 3, Control 1 and Control 2 to the CRS, with haplogroup labels M7a1a5, F1a'c and R0 = pre-HV. Legend entries: Controls, Patients, Missed mutations, False mutations, Cooked data.]

In particular, mtDNA analysis in cancer studies is riddled with all kinds of error – for example, sample mix-up or contamination:

Bundles of perceived somatic mutations then actually trace parts of phylogenetic pathways in the mtDNA phylogeny (Salas et al. 2005).

How would the authors of the affected papers react?

Well, they could say that the revisited pathway would just express the disease...

In one remarkable case it was proclaimed:

“This patient had a germline mitochondrial haplotype J, which “shifted” in positions 185, 295, and 16126 back to the phylogenetically older haplotype H, but shifted in position 195 to haplotype W and in position 204 to nowhere.”

Pythonesque...*

*Said of a style of humour: bizarre and surreal

Monty Python is the collective name of the creators of Monty Python's Flying Circus, a British television comedy sketch show broadcast by the BBC from 1969 to 1974.

The Dead Parrot sketch is one of the most famous in the history of television comedy.

MP's foot

Information and links: Wikipedia

The Dead Parrot sketch portrays a conflict between a disgruntled customer and a shopkeeper, who hold contradictory positions on the vital state of a Norwegian Blue parrot (an apparent absurdity in itself).

"I know a dead parrot when I see one, and I'm lookin' at one right now."

The customer complains that the parrot he has recently purchased at the location is, in fact, dead. The shopkeeper denies this and points out the beauty of its plumage, further suggesting that the bird is merely asleep.

Monty Python's Dead Parrot sketch has come to life in molecular anthropology:

The Caucasian King Size parrot

starring

Ivane Nasidze & Mark Stoneking

from the MP Institute EVA

Natal history of the Caucasian King Size parrot:

2001 Nasidze and Stoneking published a Caucasian HVS-I data set (but did not show the data)

2003 At the International Symposium on Forensic DNA Technologies in Münster this Caucasian data set was accused of having an enormous (king size) error spectrum

2004 Nasidze and colleagues gave a false statement (in the Annals of Human Genetics) about the vital state of their old data set, by employing a sort of filter analysis (however, not adjusted to the actual sequencing range):

Labels 16000+

“... (analysis not shown). These reticulations were just due to one sequence*, as removal of this sequence removed the excess reticulations.”

False statement!

Here's the torso of the reticulate network after removal of that sequence (Bandelt and Kivisild 2006).

* Corrected Sequence not shown

This Caucasian HVS-I data set has now been displayed publicly and can be inspected in the book:

Human Mitochondrial DNA and the Evolution of Homo sapiens

Bandelt, Hans-Jürgen; Richards, Martin; Macaulay, Vincent (Eds.)

Springer-Verlag, 2006, 117.65€

Those who wish to evaluate the idiosyncratic variation of this Caucasian data set but have no prior knowledge of natural mtDNA variation may proceed as follows:

Take out all mutations ever observed in any one of the complete sequences referred to in Max Ingman's database (http://www.genpat.uu.se/mtDB/), represent the surviving variation in the form of a quasi-median network that highlights the character conflicts, and compare with other data sets:

Beautiful plumage. From: Bandelt & Dür (2007)*

*Hans-Jürgen Bandelt & Arne Dür (2007) Translating DNA data tables into quasi-median networks for parsimony analysis and error detection. Molecular Phylogenetics and Evolution 42: 256–271.

In this article it is demonstrated that the mtDB2005-filtered variation of the Caucasian data is even messier than corresponding randomised data tables.
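The filtering step can be sketched in a few lines. This is not the quasi-median network construction of Bandelt & Dür. As a much simpler stand-in, the sketch flags pairs of residual sites that show all four presence/absence combinations (the classical four-gamete test), which is one elementary way to expose character conflicts. All positions and haplotypes below are hypothetical toy data, and `known` stands for whatever set of mutations has been extracted from the reference database.

```python
from itertools import combinations


def filter_known(haplotypes, known):
    """Remove from every haplotype the variants that occur in the reference set `known`."""
    return [frozenset(h) - known for h in haplotypes]


def conflicting_pairs(haplotypes):
    """Return site pairs for which all four presence/absence combinations occur."""
    sites = sorted(set().union(*haplotypes))
    conflicts = []
    for a, b in combinations(sites, 2):
        states = {(a in h, b in h) for h in haplotypes}
        if len(states) == 4:  # four-gamete test: incompatible with a single tree
            conflicts.append((a, b))
    return conflicts


# Hypothetical toy data: variant positions per sequence (not taken from any real data set).
known = frozenset({16223, 16311})
sample = [
    {16223, 16100, 16200},
    {16223, 16100},
    {16311, 16200},
    {16311},
]

residual = filter_known(sample, known)
conflicts = conflicting_pairs(residual)
print(residual)
print(conflicts)
```

In a clean data set the residual variation should be sparse and largely tree-like, whereas many conflicting pairs among the surviving sites point to the kind of idiosyncratic variation discussed above.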

The stone dead Caucasian King Size parrot:

Bandelt & Kivisild (2006) Quality assessment of DNA sequence data: autopsy of a mis-sequenced mtDNA population sample. Annals of Human Genetics 70: 314–326.

Stoneking & Nasidze (2006) The patient is not dead yet: premature autopsy of a mtDNA data set. Annals of Human Genetics 70: 327–331.

Parson (2006) The art of reading sequence electropherograms. Annals of Human Genetics

Stoneking & Nasidze (2006) Reply to Parson. Annals of Human Genetics

Remarkable electropherogram!

What's wrong with it?

This electropherogram don't enter into it. That's what's wrong with it.

No one has ever claimed that the transition C16168T is a phantom mutation – instead, it's the transversion C16168A that may be a phantom mutation.

In Nasidze et al. (2001) the 16168 transition always occurs together with the 16343 transition: this constitutes a confirmed motif within haplogroup U3 (Macaulay et al. 1999).

Eight of these paired electropherograms are irrelevant here because they just show other sequences with the rCRS nucleotide in question. Instead, the heavy strand electropherograms could have been shown in these places!

Walther Parson has aptly demonstrated that those electropherograms reflect well-known and reproducible sequencer artifacts and were not interpreted properly by Nasidze and Stoneking.

They have always asserted that they have read both strands – but without ever giving any evidence for that:

“As stated both in the original papers (Nasidze & Stoneking 2001: p.1198; Nasidze et al. 2004: p.207) and in the reply to Bandelt & Kivisild (Stoneking & Nasidze, 2006: p.329), both strands were indeed sequenced in all samples.”

Thus, they are reiterating a false statement.

You know a dead parrot when you see one:

... “(data not shown)” ** and not submitted to GenBank either

... “(analysis not shown)”

In case you wish to register a complaint, please, address yourself to the ombudsperson:

“Scientific honesty and the observance of the principle of good scientific practice are essential in all scientific work which seeks to expand our knowledge and which is intended to earn respect from the public.”

From Mark Stoneking’s website

(http://email.eva.mpg.de/~stonekg/files/ombud.htm):

“As elected Ombudsperson of the Max Planck Institute for Evolutionary Anthropology, I stand at your disposal in case you are experiencing or observing any kind of scientific misconduct, or if you need advice on the subject of good scientific practice.”

“Scientific misconduct includes: false statements, infringement of intellectual property, impairment of the research work of others, joint accountability.”

Acknowledgement:

Sincere thanks are due to

Martin Richards (reminding me of MP's immortal Dead Parrot sketch)

and all collaborators (Anita Brandstätter, Claudio Bravi, Mike Coble, Arne Dür, Toomas Kivisild, Jüri Parik, Walther Parson, Antonio Salas, Richard Villems, Yong-Gang Yao)

Thank you for listening
