Pitfalls with mtDNA analysis in medical genetics
Hans-Jürgen Bandelt (Hamburg)
EMBO World Programme Workshop on Human Evolution and Disease, Hyderabad, India. 6th – 9th December 2006
From the journal Oncogene, this year, one could pick up the following exciting news:
“Very strikingly, the mitochondrial sequences in this study (tumor samples as well as controls) with the Indian population revealed a unique profile of eight sequence variants, viz. A73G, A263G, A1438G, A2706G, A4769G, C7028T, A8860G and A15326G appeared at high frequencies in all samples and could be of evolutionary significance.”
Well, not quite ...
R0 =“... the profile could be specific to the Indian population.”
Leaving aside mutations that were systematically missed (such as A750G and C14766T, probably due to the use of a wrong reference sequence), the claim would translate into:
... Near-absence of haplogroup R0 could be specific to the Indian population.
Note that haplogroup R0 is specific to the West Eurasian mtDNA pool.
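The point can be checked mechanically. The following is a minimal sketch, not a haplogroup classifier: it compares the eight variants quoted from the Oncogene paper with a hand-entered set of positions at which sequences outside the rCRS's own lineage are expected to differ from the rCRS. The set `RCRS_PATH` is an assumption assembled from the positions named on these slides (the eight reported variants plus the two said to have been missed), and the helper name `position` is introduced here for illustration only.

```python
import re

# Assumption for illustration: positions where a sequence outside the rCRS's
# own lineage (i.e. outside haplogroup R0) differs from the rCRS. Assembled by
# hand from the positions named on the slides, not from a curated phylogeny.
RCRS_PATH = {73, 263, 750, 1438, 2706, 4769, 7028, 8860, 14766, 15326}

# The "unique profile" quoted from the Oncogene paper.
reported = ["A73G", "A263G", "A1438G", "A2706G",
            "A4769G", "C7028T", "A8860G", "A15326G"]


def position(variant):
    """Extract the nucleotide position from a name such as 'A73G'."""
    match = re.fullmatch(r"[ACGT](\d+)[ACGT]", variant)
    if match is None:
        raise ValueError(f"unrecognised variant name: {variant}")
    return int(match.group(1))


reported_positions = {position(v) for v in reported}

# Variants that the reference-sequence effect does not explain.
unexplained = reported_positions - RCRS_PATH
# Expected differences from the rCRS that the paper did not report.
missed = RCRS_PATH - reported_positions

print("unexplained:", sorted(unexplained))
print("missed:", sorted(missed))
```

Under these assumptions nothing in the reported profile is left unexplained, and the two positions absent from the report are exactly the ones the slide names as systematically missed.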
Absence of phylogenetic knowledge about human mitochondrial DNA is characteristic of clinical genetics.
In a paper from Muscle and Nerve, 2003, the authors (from Taipei) identified among 17 completely sequenced cases “a double mutation (A3243G and A14693G) in a patient with MELAS syndrome, who had a diabetic mother and normal siblings. The A14693G substitution is significant from structural and evolutionary points of view. This result indicates that mtDNA should be sequenced in its entirety for the complete evaluation of mitochondriopathy.”
However, the complete mtDNA sequence of that MELAS patient was not reported ...
[Figure: phylogenetic tree of haplogroup Y and its relatives (N, N9, Y, Y1, Y1a, Y1b, Y2), with mutation positions along the branches, including 14693.]
However, it has been known since 2003 that A14693G is a characteristic mutation of the East Asian haplogroup Y.
“A14693G is not identified in 205 human controls and 76 randomly examined species.”
Bandelt, Yao & Kivisild (2005)
According to Trejaut et al. (2005), haplogroup Y2 does occur in parts of Taiwan
Suppose you now found A14693G in some disease context and consulted MITOMAP before publishing your case:
Green light from MITOMAP, thus – and here you go:
Dong-Ling Tang, Xin Zhou, Xia Li, Lei Zhao, Fang Liu
Diabetes Res Clin Pract. 2006 Jan 13
“… in a Chinese population, a total of 184 T2DM cases and 279 matched healthy controls were recruited. … Our results suggest that the mutations of T3394C and A14693G may contribute to genetic predisposition to T2DM, with the T16189C variant being associated with insulin resistance.”
Dong-ling Tang, Xin Zhou, Ke-yuan Zhou, Xia Li, Lei Zhao, Fang Liu, Fang Zheng, Song-mei Liu
Zhonghua Yi Xue Yi Chuan Xue Za Zhi. 2005 Dec 22
“A total of 184 cases of type 2 diabetes mellitus and 210 matched healthy controls with normal glucose tolerance were recruited for the study. … The mutations of 3394 (T-->C) and 14693 (A-->G) may contribute to the genetic predisposition to type 2 diabetes; 16189 (T-->C) variant is associated with insulin resistance and risk factor of diabetes.”
Double publication does not seem to be a rare phenomenon in China.
Another way of recycling is to publish one and the same mutation as novel multiple times, whether found in a single patient who was re-examined repeatedly or in several different patients, as long as the mutation does not find its way into MITOMAP.
The control group investigated by Tang et al. (2006) has a peculiar feature: in a sample of 279 there was not a single haplogroup M9a'b lineage (T3394C). Other studies, however, suggest a frequency of about 2.7% in Han Chinese. If that percentage were the true frequency, the probability of observing 0/279 M9a'b lineages would be about 0.05%!
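The arithmetic behind that figure is a one-line binomial calculation. The sketch below assumes the 279 controls are independent draws from a population in which the lineage has frequency 2.7%; the variable names are introduced here for illustration.

```python
# Probability of seeing zero carriers among n independent controls when the
# true lineage frequency is `freq` (binomial model, k = 0 term only).
n_controls = 279   # control sample size reported by Tang et al. (2006)
freq = 0.027       # M9a'b frequency suggested by other studies (Han Chinese)

p_zero = (1 - freq) ** n_controls

# Printed as a percentage; the value is of the order of 0.05%.
print(f"P(0 of {n_controls}) = {p_zero:.3%}")
```

The independence assumption is the usual one for a random control sample; related or otherwise structured sampling would change the number, but not its order of magnitude.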
There are further cases, in cancer research, where control group data are almost void of any variation and thus are either cooked or fabricated, e.g. taken over (via copy-and-paste) from other studies (which have a most peculiar error spectrum themselves).
[Figure: mtDNA phylogeny rooted in L3 (haplogroups M, M7, M7a, M7a1, M7a1a, M7a1a5, M8, M8a, M8a1, D, D4, D4c, D4c1, D4c1a, N, R, R0 = pre-HV, HV, H, H2, H2a, R9, F, F1, F1a'c, B, B4) locating the rCRS and the published sequences UC-Case 1, UC-Case 2, UC-Case 3, Control 1 and Control 2, with mutation positions along the branches; branches carry numbered markers (1 to 5) under the column labels "Controls" and "Patients".]
Missed mutations
False mutations
Cooked data
In particular, mtDNA analysis in cancer studies is riddled with all kinds of error – for example, sample mix-up or contamination:
Bundles of perceived somatic mutations then actually trace parts of phylogenetic pathways in the mtDNA phylogeny (Salas et al. 2005).
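A first screen for this kind of artefact is to ask whether the reported "somatic" changes coincide with a known haplogroup motif. The sketch below is a toy illustration only: the motif table `HAPLOGROUP_MOTIFS` is a hypothetical placeholder holding the three positions that the case quoted later in this talk attributes to haplogroup J, and the function name is introduced here. A real check would use a curated phylogeny.

```python
# Hypothetical, hand-entered motif table (illustration only). The positions
# are those named for haplogroup J in the case quoted later in the talk.
HAPLOGROUP_MOTIFS = {
    "J": {185, 295, 16126},
}


def motif_overlap(somatic_positions, motifs):
    """For each haplogroup, count how many motif positions are reported as
    'somatic' changes. Returns {name: (shared, motif_size)} for hits."""
    hits = {}
    for name, motif in motifs.items():
        shared = set(somatic_positions) & motif
        if shared:
            hits[name] = (len(shared), len(motif))
    return hits


# Positions at which the quoted case reports a "shift" in the tumour.
reported_somatic = {185, 195, 204, 295, 16126}

hits = motif_overlap(reported_somatic, HAPLOGROUP_MOTIFS)
for name, (shared, size) in hits.items():
    print(f"haplogroup {name}: {shared} of {size} motif positions reported as somatic")
```

When a complete motif turns up among the somatic calls, sample mix-up or contamination is a more economical explanation than a tumour retracing the phylogeny.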
How would authors of inflicted papers react?
Well, they could say that the revisited pathway would just express the disease...
In one remarkable case it was proclaimed:
“This patient had a germline mitochondrial haplotype J, which “shifted” in positions 185, 295, and 16126 back to the phylogenetically older haplotype H, but shifted in position 195 to haplotype W and in position 204 to nowhere.”
Pythonesque...*
*Said of a style of humour: bizarre and surreal
Monty Python is the collective name of the creators of Monty Python's Flying Circus, a British television comedy sketch show broadcast by the BBC from 1969 to 1974.
The Dead Parrot sketch is one of the most famous in the history of television comedy.
MP's foot
Information and links: Wikipedia
The Dead Parrot sketch portrays a conflict between a disgruntled customer and a shopkeeper, who hold contradictory positions on the vital state of a Norwegian Blue parrot (an apparent absurdity in itself).
"I know a dead parrot when I see one, and I'm lookin' at one right now."
The customer complains that the parrot he has recently purchased at the location is, in fact, dead. The shopkeeper denies this and points out the beauty of its plumage, further suggesting that the bird is merely asleep.
Monty Python's Dead Parrot sketch has come to life in molecular anthropology:
The Caucasian King Size parrot
starring
Ivane Nasidze & Mark Stoneking
from the MP Institute EVA
Natal history of the Caucasian King Size parrot:
2001 Nasidze and Stoneking published a Caucasian HVS-I data set (but did not show the data)
2003 At the International Symposium on Forensic DNA Technologies in Münster this Caucasian data set was accused of having an enormous (king size) error spectrum
2004 Nasidze and colleagues gave a false statement (in the Annals of Human Genetics) about the vital state of their old data set, by employing a sort of filter analysis (which, however, was not adjusted to the actual sequencing range):
Labels 16000+
“... (analysis not shown). These reticulations were just due to one sequence*, as removal of this sequence removed the excess reticulations.”
False statement!
Here's the torso of the reticulate network after removal of that sequence (Bandelt and Kivisild 2006).
* Corrected Sequence not shown
This Caucasian HVS-I data set has now been displayed publicly and can be inspected in the book:
Advertisement
Human Mitochondrial DNA and the Evolution of Homo sapiens
Bandelt, Hans-Jürgen; Richards, Martin; Macaulay, Vincent (Eds.)
Springer-Verlag, 2006, 117.65€
Those who wish to evaluate the idiosyncratic variation of this Caucasian data set but have no prior knowledge of natural mtDNA variation may proceed as follows:
Take out all mutations ever observed in any one of the complete sequences referred to in Max Ingman's database (http://www.genpat.uu.se/mtDB/)
and represent the surviving variation in the form of a quasi-median network that highlights the character conflicts
and compare with other data sets:
Beautiful plumage From: Bandelt & Dür (2006)*
*Hans-Jürgen Bandelt & Arne Dür (2007) Translating DNA data tables into quasi-median networks for parsimony analysis and error detection. Molecular Phylogenetics and Evolution 42: 256–271.
In this article it is demonstrated that the mtDB2005-filtered variation of the Caucasian data is even messier than corresponding randomised data tables.
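The filtering step of that recipe is easy to sketch; building the quasi-median network itself is beyond a few lines and is not attempted here. In the sketch below every sample ID and position is invented toy data, and `known_positions` merely stands in for the set of mutations seen in the complete sequences of the database.

```python
def filter_known(data_table, known_positions):
    """Remove every mutation already observed in the reference database and
    return the surviving (idiosyncratic) variation per sample."""
    return {
        sample_id: sorted(set(positions) - known_positions)
        for sample_id, positions in data_table.items()
    }


# Invented toy data: stand-in for mutations observed in complete sequences.
known_positions = {16126, 16223, 16311}

# Invented toy data table: sample ID -> mutated positions.
data_table = {
    "s1": [16126, 16223],
    "s2": [16168, 16311],
    "s3": [16200, 16223],
}

residual = filter_known(data_table, known_positions)
surviving_total = sum(len(v) for v in residual.values())

print(residual)
print("surviving mutations:", surviving_total)
```

A clean data set should leave little behind after filtering; a large and conflict-ridden residue is the signal that is then displayed in the network and compared across data sets.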
The stone dead Caucasian King Size parrot:
Bandelt & Kivisild (2006) Quality assessment of DNA sequence data: autopsy of a mis-sequenced mtDNA population sample. Annals of Human Genetics 70: 314–326.
Stoneking & Nasidze (2006) The patient is not dead yet: premature autopsy of a mtDNA data set. Annals of Human Genetics 70: 327–331.
Parson (2006) The art of reading sequence electropherograms. Annals of Human Genetics
Stoneking & Nasidze (2006) Reply to Parson. Annals of Human Genetics
Remarkable electropherogram!
What's wrong with it?
This electropherogram don't enter into it. That's what's wrong with it.
No one has ever claimed that the transition C16168T is a phantom mutation – instead, it's the transversion C16168A that may be a phantom mutation.
In Nasidze et al. (2001) the 16168 transition always occurs together with the 16343 transition: this constitutes a confirmed motif within haplogroup U3 (Macaulay et al. 1999).
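The two distinctions drawn here, transition versus transversion and a variant that always travels with its motif partner, can be expressed in a few lines. This is a minimal sketch: the function names are introduced here, and the haplotypes in the example are invented toy data, not sequences from the data set under discussion.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_type(ref, alt):
    """Classify a point substitution as 'transition' (purine<->purine or
    pyrimidine<->pyrimidine) or 'transversion' (purine<->pyrimidine)."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt or not {ref, alt} <= PURINES | PYRIMIDINES:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def always_paired(haplotypes, position_a, position_b):
    """True if every haplotype carrying position_a also carries position_b."""
    return all(position_b in h for h in haplotypes if position_a in h)


print(substitution_type("C", "T"))  # the 16168 change that is not in dispute
print(substitution_type("C", "A"))  # the 16168 change suspected as a phantom

# Invented toy haplotypes, each given as a set of mutated positions.
toy_haplotypes = [{16168, 16343}, {16343}, {16223}]
print(always_paired(toy_haplotypes, 16168, 16343))
```

A variant that appears only within an established motif is behaving like a real phylogenetic marker; a phantom mutation would be expected to scatter across unrelated haplotypes.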
Eight of these paired electropherograms are irrelevant here because they just show other sequences with the rCRS nucleotide in question. Instead, the heavy strand electropherograms could have been shown in these places!
Walther Parson has aptly demonstrated that those electropherograms reflect well-known and reproducible sequencer artifacts and were not interpreted properly by Nasidze and Stoneking.
They have always asserted that they have read both strands – but without ever giving any evidence for that:
“As stated both in the original papers (Nasidze & Stoneking 2001: p.1198; Nasidze et al. 2004: p.207) and in the reply to Bandelt & Kivisild (Stoneking & Nasidze, 2006: p.329), both strands were indeed sequenced in all samples.”
Thus, they are iterating a false statement.
You know a dead parrot when you see one:
... “(data not shown)” ** and not submitted to GenBank either
... “(analysis not shown)”
In case you wish to register a complaint, please address yourself to the ombudsperson:
“Scientific honesty and the observance of the principle of good scientific practice are essential in all scientific work which seeks to expand our knowledge and which is intended to earn respect from the public.”
From Mark Stoneking’s website
(http://email.eva.mpg.de/~stonekg/files/ombud.htm):
“As elected Ombudsperson of the Max Planck Institute for Evolutionary Anthropology, I stand at your disposal in case you are experiencing or observing any kind of scientific misconduct, or if you need advice on the subject of good scientific practice.”
“Scientific misconduct includes: false statements, infringement of intellectual property, impairment of the research work of others, joint accountability.”
Acknowledgement:
Sincere thanks are due to
Martin Richards (for reminding me of MP's immortal Dead Parrot sketch)
and all collaborators (Anita Brandstätter, Claudio Bravi, Mike Coble, Arne Dür, Toomas Kivisild, Jüri Parik, Walther Parson, Antonio Salas, Richard Villems, Yong-Gang Yao)
Thank you for listening