complete mitochondrial genome sequence from an endangered indian snake, python molurus molurus...
TRANSCRIPT
Complete mitochondrial genome sequence from an endangeredIndian snake, Python molurus molurus (Serpentes, Pythonidae)
Bhawna Dubey • P. R. Meganathan •
Ikramul Haque
Received: 24 May 2011 / Accepted: 25 January 2012 / Published online: 14 February 2012
� Springer Science+Business Media B.V. 2012
Abstract This paper reports the complete mitochondrial
genome sequence of an endangered Indian snake, Python
molurus molurus (Indian Rock Python). A typical snake
mitochondrial (mt) genome of 17258 bp length comprising
of 37 genes including the 13 protein coding genes, 22 tRNA
genes, and 2 ribosomal RNA genes along with duplicate
control regions is described herein. The P. molurus molurus
mt. genome is relatively similar to other snake mt. genomes
with respect to gene arrangement, composition, tRNA
structures and skews of AT/GC bases. The nucleotide com-
position of the genome shows that there are more A–C % than
T–G% on the positive strand as revealed by positive AT and
CG skews. Comparison of individual protein coding genes,
with other snake genomes suggests that ATP8 and NADH3
genes have high divergence rates. Codon usage analysis
reveals a preference of NNC codons over NNG codons in the
mt. genome of P. molurus. Also, the synonymous and non-
synonymous substitution rates (ka/ks) suggest that most of
the protein coding genes are under purifying selection pres-
sure. The phylogenetic analyses involving the concatenated
13 protein coding genes of P. molurus molurus conformed to
the previously established snake phylogeny.
Keywords Codon usage � Mitochondrial genome �Python molurus � Pythonidae � Phylogeny
Introduction
Complete mitochondrial genomes studies lead to a better
insight into the biology and evolution of vertebrates. Mito-
chondrial DNA (mtDNA) has been frequently used for
phylogenetic studies because of conservative gene order,
absence of introns, lack of recombination, maternal inheri-
tance, and presence of various protein coding genes, which
provide evolutionary milieu of the genome. Also, the
applications of mtDNA for wildlife conservation approa-
ches, population structure studies and conservation genetics
involve the use of rapidly evolving mtDNA sequences which
may be used along with other markers like microsatellites
and SNPs [1, 2]. Mitochondrial DNA sequences can also be
used to identify cryptic species/hybrid individuals in a
population [3]. Furthermore, many studies, including ours
have also reported the successful use of mtDNA in forensic
identification of endangered species, thereby aiding in their
conservation programs [4–11]. Phylogenetic tree construc-
tion is another approach commonly used in species identi-
fication, where whole mtDNA sequences/genes are used to
establish the evolutionary relationships of species in ques-
tion to the closest reference sequence [12, 13].
The Indian Rock Python (Python molurus molurus) and
its products are of significant commercial importance in the
Electronic supplementary material The online version of thisarticle (doi:10.1007/s11033-012-1572-5) contains supplementarymaterial, which is available to authorized users.
B. Dubey � P. R. Meganathan � I. Haque (&)
National DNA Analysis Centre, Central Forensic Science
Laboratory, 30-Gorachand Road, Kolkata 700 014,
West Bengal, India
e-mail: [email protected]
Present Address:B. Dubey
Centre for Cellular and Molecular Biology, Uppal Road,
Hyderabad 500 007, India
Present Address:P. R. Meganathan
Department of Biochemistry and Molecular Biology, Institute
for Genomics, Biocomputing and Biotechnology,
Mississippi State University, Mississippi State, MS 39762, USA
123
Mol Biol Rep (2012) 39:7403–7412
DOI 10.1007/s11033-012-1572-5
national and international markets and as a result the
population density of the species has been alarmingly
depleted [14]. In view of this Indian Pythons are the species
of great conservation concern (listed by IUCN Red list
2009, as lower risk, near threatened [15] and the subspe-
cies, P. molurus molurus is listed in the Appendix I of
CITES whereas all other subspecies of P. molurus and
species of Pythonidae are listed in the Appendix II).
Therefore, additional information garnered from the field
of genetics/molecular biology could be of immense value
for conservation efforts of this species. In this regard, there
is always a need to use large quantity of data for better
understanding of phylogenetic/evolutionary relationships
[16]. Further, the insights into the genome makeup and
variations in the endangered species can find conservation
applications in wildlife health management of populations
in wild or captivity [17]. However, the complete mtDNA
sequences of approximately 35 snake species are presently
available in public databases, which have contributed to the
study of evolutionary dynamics and phylogenetics of
snakes, yet, complete mtDNA of only one species of family
Pythonidae has been described [18, 19]. Hitherto, the par-
tial sequences of mtDNA for comparison at species level
have been used to estimate Python phylogenetics [20];
nevertheless, use of complete mtDNAs is essential for
robust comparison, as each of the mt. genes may provide
answers to different phylogenetic questions [21, 22].
In view of the above it is considered imperative to
generate complete mitochondrial DNA sequences for the
endangered Indian snake, P. molurus molurus, which can
be used in phylogenetic, gene variability and identification
studies to facilitate the understanding of genetic relation-
ships among snakes. We have also used this data to analyze
relationships within major snake lineages and phylogenetic
position of P. molurus molurus in family Pythonidae.
Materials and methods
DNA extraction
Blood and tissue samples of P. molurus molurus were
obtained from Snake Transit House, Jabalpur, Madhya
Pradesh and Chennai Snake Park Trust, Chennai. DNA
extraction from tissues was carried out using DNeasy
Tissue kit (Qiagen), and standard Phenol–Chloroform
method [23] was used for blood samples.
Primer Design
In order to make use of Long and Accurate PCR strategy
[24], we designed specific primer sets LAF1–LAR1 and
LAF2–LAR2 (Table 1) which will result in the amplification
of whole mitochondrial genome in two overlapping pieces of
approximately 9 and 10 kb each. These primer sets were
targeted in the 12S rRNA/COX II gene region and 16S
rRNA/ND4 gene sequences (our ‘‘unpublished data’’) of mt.
genome of P. molurus.
Also, a series of internal primer sets (Table 1) were
designed by making use of conserved sites in the alignment,
produced by aligning the complete mitochondrial gene
sequences of available snake species from public databases.
Polymerase chain reaction and DNA sequencing
In order to amplify relatively longer regions, the conditions
of long and accurate PCR (LA-PCR) were utilized. LA
PCRs were set in a total volume of 50 ll which contained
5 ll of 109 LA PCRTM
buffer (with Mg2?), 8 ll of dNTP
mixture (Takara Bio INC. Japan), 1.0 ll each of 10 lM
primers, 0.5 ll of Takara LA TaqTM
polymerase (Takara
Bio INC. Japan) and 3.0 ll genomic DNA. PCR was per-
formed on GeneAmp� PCR system 9700 (Applied Bio-
systems) with following conditions: initial denaturation at
94�C for 5 min followed by 30 cycles of denaturation
(94�C for 30 s), annealing (57�C for 30 s) and extension
(68�C for 8 min) followed by 10 min elongation at 68�C.
Amplified products were purified using low melting aga-
rose. Each of the long amplified fragments was then used as
template for the amplifications using internal primers.
PCRs for targeting smaller regions were set in a volume
of 25 ll, containing 1–2 ll of DNA template, 5 mM
MgCl2, 2.5 mM dNTPs, 0.2 lM of each primer, 2.5 ll 109
buffer and 1.5 U of Taq polymerase (Invitrogen Life
Technologies, Brazil) on GeneAmp� PCR system 9700
(Applied Biosystems). The PCR conditions were: initial
denaturation at 94�C for 5 min followed by 35 cycles of
denaturation (94�C for 30 s), respective annealing (Table 1)
for 30 s, and extension (72�C for 30 s) with a final exten-
sion of 72�C for 5 min followed by 4�C hold. The amplified
fragments were visualized on 2% agarose gel using ethi-
dium bromide stain (0.5 lg/ml). The generated amplicons
were purified using the exosap treatment and then cycle
sequenced using BigDye� Terminator v3.1 cycle sequenc-
ing kit (Applied Biosystems, Foster City, CA). DNA
sequencing was performed on 3100 Avant Genetic Ana-
lyzer (Applied Biosystems).
Annotation/Data analyses
The genome was assembled using BioEdit [25]. Transfer
RNAs and their structures were identified using ARWEN
program [26]. The tRNAs were then used to estimate the
boundaries of protein coding genes, rRNAs and control
regions. Finally, the positions of start and stop codons were
used to assess the sizes of protein coding genes (PCG).
7404 Mol Biol Rep (2012) 39:7403–7412
123
Table 1 List of primers used
and their respective annealing
temperatures
Primer Primer sequence Annealing temperature (�C)
RTF1 AAA GCA CAG CAC TGA AAA TGC 52
RTR1 TTC TTG CTA AAC CAT GAT GC
L85512S GCG YAC ACA CCG CCC GTC 54
H38316S AAR GKD GAA CTW AKA TTC HKT TT
L28716S GTR GCA AAA GAG TGG RAA GAC 55
H106216S GCT TCA CAG GGT CTT CTG GTC TTA
L94316S TTR TAG ACC HGT ATG AAA GGC 52
H294ND1 TGG TAT KGG TAW KGG BGC TCA
L204ND1 ACC CAC ACT TTC CTC CCC AAT CCT ATT 60
H258CONT CAT TGA ACG ACC AAG AAA TGA GGA GCT AA
L61CONT TAC CCC CCC CCA CTT ACA TAG GAG GAA T 53
H713CONT GCA TTA AGA GAT GTA AGC CTC AAG GGA
L687CONT TCC CTT GAG GCT TAC ATC TCT TAA TGC 53
H413ND2 GGB GCD AKT TTY TGT CAK GT
L325ND2 GCH CCA YTY CAY TYY TGA 50
H60ASN GTY TGK RYD RCW ARY WGT WRA A
L18TRP AA ACT AGD RRC CTT CAA AG 51
H57COI GTA KAG KGT KCC RAT RTC TTT
L1324COI TAC TCD GAY TWY CCW GAY GC 53
H200CO2 GTT CAD GCD GCY TCT ART TGT TC
L64CO2 TTY YTA CAY GAC CAY GTV YT 49
H26ATP6 AAT TGW TCR AAT ATR TTT AT
L19ATP6 GAA CAA TTC GCA AGC CCA GAA AT 56
H182CO3 AYR TCK CGT CAY CAT TG
L28CO3 TWG THG AYC CVA GCC CWT GRC C 54
H125ND3 GGG TCR AAK CCR CAT TCR TA
L28GLY TGC CYT CCA AGC AYT WRG HCC C 60
H239ND4L GCD ACD ACW AGG CTD AGK CC
L92ND4L TAT GYV TDG AAR CAA TRA TAC 51
H655ND4 CBA CRT GRG CTT TKG GKA GYC A
L595ND4 GCH TTY HTR GCW AAAA ATA CC 50
H75LEU CTA YYA CTT GGA KTY GCR CC
L28LEU TGG TCT TAG GCA CCA AWA YHC TT 52
H752ND5 ACW ACT ATK GTR CTDGAR TG
L500ND5 TWC ARG CYA TYR TYT AYA AYC G 49
H1175ND5 ATD GTR TCT TTD GAR TAR AAV CC
L1000 ND5 GCA CTA CTA TTC CTA TGT TCA GGA TC 55
H1636ND5 TTA ATT CTG TTG AGG TTT GTT GGC TGA
L92ND4L TAT GYV TDG AAR CAA TRA TAC 57
H655ND4 CBA CRT GRG CTT TKG GKA GYC A
L1550ND5 AAC CAA TTA GCT TTT TTC AAT CTC C 52
H138CYT CTG RAY DGC TAG RAA RAA DCC
RC2F GAA AAA CCA CCG TTG TTA ATC AAC TA 50
RC2R TTA CAA GAA CAATGC TTT
L982CYT ACA TGA RCH GCH WCH AAA CC 51
H505CONT TGC GAC CAA AGG TCT TGG AAA AAG C
LAF1 ACA GAA GAA GTA GAA CAA CTA GAA GC 68
LAR1 AGT TAC ACC TCG ACC TGT CGT GTT A
LAF2 ATA AGA CCA GAA GAC CCT GTG AAG CT 68
LAR2 AGR TCW GTT TGT TGB GRG CAD GTD AG
Mol Biol Rep (2012) 39:7403–7412 7405
123
Further, the proteins were translated to amino acid
sequences and compared with corresponding genes from
available snake mt. genomes to check for the annotations.
The rRNA genes were identified by aligning with other
available snake mtDNA sequences. The nucleotide base
composition for different PCGs was calculated using
MEGA 3.1 [27]. Codon usage for all PCGs of P. molurus
molurus was estimated using CodonW (J. Peden;
http://molbiol.ox.ac.uk/Win95.codonW.zip) and also the
AT and GC content and base skews were calculated to
analyze the genome. The codon usage and other features of
the PCGs of P. molurus molurus were also compared with
two other snakes, P. regius and Boa constrictor. Synony-
mous and non-synonymous substitution rates were calcu-
lated using the ka/ks calculator [28].
Phylogenetic analyses
Phylogenetic analysis was performed in two datasets using
complete nucleotide sequences of all mitochondrial protein
coding genes (dataset I) and partial cyt b gene sequences of
species belonging to Pythonidae (dataset II) (Supplemen-
tary Material 1). Complete gene sequences from snake
species representing major snake families and available in
public databases were downloaded and aligned against the
gene sequences of P. molurus molurus generated in this
study using Clustal X [29]. Ambiguously aligned positions
were eliminated to result in alignment of 11180 bp for 19
taxa, forming the dataset I. Scolecophedians (blind snakes)
are sister group to Alethinophidians (true snakes) hence,
Ramphotyphlops braminus, a Scolecophedian snake was
used to root dataset I. Partial cyt b gene sequences were by
far the most prevalent sequence data in the Pythonidae, and
therefore these were utilized to result in the alignment of
307 bp for 16 taxa including P. molurus molurus to form
dataset II. Many molecular studies suggest that Pythons
appear to be more closely related to archaic macrostomatan
snakes (Loxocemus, [30, 31]) than the Boines, therefore,
phylogenetic analyses for dataset II were performed using
Loxocemus as an outgroup taxa.
Three different analyses were performed on both the
datasets, (1) A maximum likelihood (ML) tree
(1,000 bootstrap replicates) was computed using PHYML
[32] under GTR ? I ? gamma model as selected by
Treefinder software [33], (2) Bayesian analysis was per-
formed using MrBayes [34] with the maximum likelihood
model employed six substitution types (‘‘nst = 6’’) and
rate variation across sites modeled using a gamma distri-
bution (rates = ‘‘gamma’’). The Markov chain Monte
Carlo search was run with 4 chains for 500,000 genera-
tions, with trees begin sampled every 100 generations (the
first 10,000 trees were discarded as ‘‘burnin’’), (3) Maxi-
mum parsimony analyses were carried out using Phylip
software [35] with nonparametric bootstrapping used to
assess support for the nodes in the MP analyses with 1,000
replicates.
Results and discussion
Snake mt. genomes are of great interest in understanding
mitogenomic evolution because of gene duplications and
rearrangements and the fast evolutionary rate of their genes
compared to other vertebrates [36]. Also, the complete
mtDNA information of endangered species can be a useful
tool in the area of conservation genetics of snakes.
Therefore, in the present study we sequenced the complete
mt. genome of P. molurus molurus (an endangered snake)
and present the comparative analyses of some of the
important genomic features.
Genome organization
The gene arrangement of P. molurus molurus mitochon-
drial genome is shown in Fig. 1 and is similar to that
described for other snake species [18, 19, 37]. The
P. molurus mtDNA is a typical circular molecule, of length
17258 bp, with all the 37 genes including the 13 PCGs, 22
tRNAs and 2 ribosomal RNAs that are usually present in
bilaterian mt. genomes.
Fig. 1 Gene map of P. molurus molurus mitochondrial DNA (22
tRNAs are abbreviated using one letter code for the corresponding
amino acid)
7406 Mol Biol Rep (2012) 39:7403–7412
123
Protein coding genes and nucleotide composition
The 12 PCGs, ND1, ND2, COI, COII, ATPase 8, ATPase
6, ND3, COIII, ND4L, ND4, ND5 and Cyt b, were located
on the H-strand whereas the gene, ND6 was located on the
L-strand of mtDNA. Mitochondrial genes may use several
substitutes to ATG as start codon [38] and similarly, all the
PCGs of P. molurus begin with one of the common start
codons ATG/ATA/ATT, except for COX I and COX II
which begin with GTG, which has been known to initiate
these and other genes in many metazoans [39] including
reptiles [40, 41]. Out of the 13 PCGs, 7 show incomplete
stop codon TA/T. This phenomenon has been described in
other species also, where polyadenylation after transcrip-
tion leads to completion of partial T/TA codon into a
functional (TAA) termination codon [42]. The metazoan
mtDNA is compact and consists of some overlappings
between genes. In P. molurus molurus overlappings were
found between the genes ATP 6/ATP 8 (9 bp) and ND
5/ND 6 (4 bp). Overlapping genes found in mtDNAs leads
to an assumption that the ‘‘polycistron model’’ may not
hold true universally, since it would not be possible to
generate full-length RNAs with overlapping message from
a single transcript [43]. In addition, ATP6 and ATP8 genes
commonly overlap in chordate mtDNA and are known to
be translated from the same bicistronic mRNA [44].
Python molurus sequence showed *91% average sim-
ilarity to the P. reguis mtDNA sequences. The nucleotide
variability of each mitochondrial PCG was estimated by
calculating gene-by-gene overall genetic distances in
P. molurus along with P. regius; another species of Pyth-
onidae and Boa constrictor; member of the nearest related
family, Boidae for which complete mtDNA sequences are
available. ATP 8 was the least conserved gene both in
terms of pairwise amino acid identity among the three
snakes (57% on average and range 44–80%), and the
genetic distance values. The distance values were highest
for ATP 8 (0.6136) followed by NADH 3 (0.2852), how-
ever, more conserved gene was COX I with least genetic
distance value (0.0289) (Table 2). Also, the number of
codons for each gene was more or less similar in the three
species compared except for COX I gene in Pythons (534
codons) which was found to be longer than in Boa con-
strictor (482 codons). Some basic features of PCGs
(number of codons, the start/stop codons) compared among
the three snakes are given in Table 2.
The proportion of nucleotides in various genes of
P. molurus and two of the related species is given in
Table 3. The general trend of nucleotide composition for
majority of the PCGs was observed to be A (29–39%) [ C
(24–32%) [ T (19–27%) [ G (10–17%) except for genes
COX III and ND6 where the nucleotide composition fol-
lowed the order C[A[T[G and G[A[T[C, respec-
tively. The A ? T content of 13 PCGs was found to be
57.4% and overall A ? T content was 58%. The CG skew
[calculated as (C–G %)/(C ? G %)] and AT skew [cal-
culated as (A–T %)/(A ? T %)] are a good indicator of
strand specific nucleotide frequency bias [45, 46]. CG
skews were found to be positive in all the positive strand
encoded genes and negative in the negative strand encoded
genes (NADH 6), and a similar trend was observed for the
AT skew values (Table 3). The asymmetry in the nucleo-
tide composition between two strands is well known [46],
where there are more A and C% than T% and G% on the
Table 2 Characteristics of protein-coding genes of P. molurus, and comparison with P. regius and B. constrictor
Protein No. of codons Percent amino acid similarity Predicted start/stop codon Overall genetic
distancesP. molurus P.regius B.constrictor PM/PR PR/BC PM/BC P. molurus P.regius B.constrictor
ATP 8 56 56 56 80.3 46.4 44.6 ATG/TAA ATG/TAA ATG/TAA 0.6136
ATP 6 227 227 226 78.4 77.0 92.5 ATG/TAA ATG/TA ATG/TAG 0.2537
COX I 534 534 482 99.0 86.7 86.1 GTG/AGA GTG/AGA GTG/TA 0.0289
COX II 229 229 229 97.8 87.7 87.3 GTG/TA GTG/TA GTG/TAA 0.2124
COX III 261 261 261 90.0 88.8 98.0 ATG/T ATG/T ATG/T 0.2017
CYT B 370 370 371 89.9 84.0 78.8 ATG/T ATG/T ATG/T 0.2261
NADH1 321 321 322 96.5 86.6 85.7 ATA/T ATA/T ATA/T 0.1795
NADH2 344 344 343 96.5 72.0 72.0 ATT/TAA ATT/TAA ATA/TAA 0.2191
NADH3 114 114 114 89.4 75.4 74.5 ATA/T ATA/T ATA/T 0.2852
NADH4 452 452 452 72.1 61.7 59.7 ATG/A ATG/A ATG/G 0.2729
NADH4L 96 96 96 90.6 73.9 72.9 ATG/TA ATG/TA ATG/TA 0.2398
NADH5 598 598 597 89.2 75.5 74.7 ATG/TAA ATG/TAA ATG/TAA 0.2488
NADH6 171 171 169 99.0 59.6 59.0 ATG/TAG ATG/TAG ATA/TAG 0.2373
PM, Python molurus; PR, Python regius; BC, Boa constrictor
Mol Biol Rep (2012) 39:7403–7412 7407
123
positive strand. The base compositions in P. molurus
mtDNAs are skewed similarly to other vertebrate mtDNAs
[47], with greater A ? C content in the gene-rich strand
than in the gene-poor strand.
Codon usage
The vast majority of prokaryotic and eukaryotic species
have non-random codon usage and the patterns of nucle-
otide usage are of great importance in the definition and
functional investigation of coding regions (http://www.nem
atode.net). The codon usage is influenced by a complex
association of mutational pressures, selection constraints
and genetic drift [48]. The codon usage bias varies within
and among genomes which can facilitate the understanding
of evolution and environmental adaptation of organisms. In
order to facilitate the examination of the codon preference
in three snake species, we analyzed the frequencies of
synonymous codons in the mt. genomes. A comparative
analysis of the codon usage, measured in terms of relative
synonymous codon usage (RSCU) which is the relative
frequency that each codon suits to encode a particular
amino acid (Supplementary Material 2), in all three
organisms show that P. molurus follows a similar pattern as
followed by other snakes except in cases of glycine where
usage of GGC (RSCU = 2.11) was preferred at the
expense of GGA and in tyrosine, where UAU (RSCU =
1.268) was preferred over UAC in P. molurus. In the
absence of any codon usage bias, the RSCU value should
be 1.00 whereas a codon that is used less frequently than
expected will have a value of less than 1.00 and vice versa
for a codon that is used more frequently than expected [49].
All the RSCU values deviating from 1.0 in Python mtDNA
suggest a bias in general for NNC codons over their NNG
counterparts. This appears reasonable as the nucleotide
composition of P. molurus shows the coding strand to be
rich in Cs over Gs. The total number of codons used by
P. molurus was 3775, followed by B. constrictor (3,721)
and P. regius (3,442 codons).
Transfer RNAs
The mt. genome of P. molurus bears all the 22 tRNAs
commonly found in metazoan mtDNA. All tRNAs possess
typical cloverleaf secondary structure (Supplementary
Material 3) except for tRNA Ser, where DHU loop was
absent. This is a common feature of vertebrate tRNA Ser
[50]. The TWC stem is usually 4–5 nucleotides (nlt) in
most of the tRNAs but shortened to 3 nlt in case of tRNA
Gly and tRNA Met and composed of just 2 pair of bases in
tRNA Phe. This shortening of TWC stem has been previ-
ously reported in snake mt. genome [18]. Several mis-
matched nucleotide pairs were found in the stems and most
of them were accompanied by a neighboring G–C pair,
probably to impart compensatory stability to the arms. In
vertebrates, at position 8 adjacent to the amino-acyl stem, a
conserved ‘T’ is usually present [51], which was found to
be replaced by ‘A’ in five of the tRNAs: tRNA-Ser (UCN),
tRNA Lys, tRNA Asn, tRNA Leu (CUN) and tRNA Arg.
Similar replacement has been reported in avian mtDNA
[50] in case of first four tRNAs mentioned above.
Non-coding region
Snake mitochondria are reported to contain duplicate
control regions [18], and the same was found in P. molurus.
Table 3 Nucleotide composition and skews of P. molurus molurus mitochondrial protein-coding and ribosomal RNA genes
GENE (?/-) strand A C G T AT SKEW CG SKEW
ATP 6 (?) 0.333 0.301 0.1 0.266 0.112 0.501
ATP 8 (?) 0.363 0.262 0.113 0.262 0.161 0.397
COX I (?) 0.298 0.28 0.157 0.265 0.058 0.281
COX II (?) 0.319 0.299 0.161 0.221 0.181 0.3
COX III (?) 0.29 0.311 0.158 0.241 0.092 0.326
CYT B (?) 0.305 0.32 0.122 0.253 0.093 0.447
NADH 1 (?) 0.342 0.32 0.101 0.238 0.179 0.52
NADH 2 (?) 0.364 0.329 0.087 0.219 0.248 0.581
NADH 3 (?) 0.315 0.292 0.12 0.274 0.069 0.417
NADH 4 (?) 0.319 0.326 0.114 0.242 0.137 0.392
NADH 4L (?) 0.348 0.29 0.1 0.262 0.14 0.487
NADH 5 (?) 0.356 0.301 0.105 0.238 0.198 0.482
NADH 6 (-) 0.146 0.08 0.314 0.46 -0.518 -0.593
12S rRNA 0.354 0.272 0.179 0.194 0.291 0.206
16S rRNA 0.396 0.247 0.154 0.202 0.324 0.231
7408 Mol Biol Rep (2012) 39:7403–7412
123
One of the control regions was present typically between
tRNA Pro and tRNA Phe whereas other was located
between tRNA Ile and tRNA Leu-Gln-Met cluster. Con
served sequence blocks, CSB 1 and CSB 3 found in ver-
tebrate mtDNA were identified in P. molurus. Both the
control regions were nearly identical in sequence similarity
(over 90%) as they have been proposed to evolve in a
highly concerted fashion [18]. Other than the control
regions, there were 34 nlt distributed, that were unas-
signed to the genes and the composition of these appear
unremarkable.
Synonymous/Non-synonymous substitutions
The comparison between the number of non-synonymous
mutations (dn or ka), and the number of synonymous
mutations (ds or ks), can suggest whether, at the molecular
level, natural selection is acting to promote advantageous
mutations (positive selection) or to remove deleterious
mutations (purifying selection). In general, when positive
selection dominates, the ka/ks ratio is greater than 1, i.e.
the diversity at the amino acid level is favored, likely due
to the fitness advantage provided by the mutations. Con-
versely, when negative selection dominates, the ka/ks ratio
is less than 1, i.e. most of the amino acid changes are
deleterious and, therefore, are selected against [52]. When
the positive and negative selection forces balance each
other, the ka/ks ratio is close to 1.
Analysis of amino acid substitution mutations (non-
synonymous, ka) versus neutral mutations (synonymous,
ks) for all 13 mtDNA protein coding genes of the three
snakes (P. molurus, P. regius, B. constrictor) revealed that,
the ka/ks ratio was less than 1 (Fig. 2). The ka/ks ratio for
the 13 PCGs ranged from 0.0054 to 0.225; in accordance
with the fact that most protein coding genes are considered
to be under the effect of purifying selection. The highest
ka/ks ratio was obtained for ATP 8 gene and COX I show
the smallest ka/ks rates of any mt. gene compared herein.
Consequently, ATP 8 is indicated to be under positive
selection whereas COX I seems to be under stronger
purifying selection. The ka/ks values for ATP 8 and COX I
also corroborate with the evolutionary rates of these genes
as shown by the genetic distance values (Table 2), also,
ATP 8 and COX I have been reported to show similar trend
in previous studies [53].
Phylogenetic analyses
In order to ensure the usefulness of newly sequenced mt.
genome we carried out phylogenetic analyses using the
available mt. genomes of major snake lineages to establish
an overall snake phylogeny. Also, a separate phylogenetic
reconstruction was carried out using partial cyt b gene
sequences of P. molurus in conjunction with other 15
members of family Pythonidae. The tree topologies
obtained from both the analyses are discussed below.
(i) Relationships among the major lineages of snakes
The phylogenetic analyses of dataset I (sequences of all
protein coding mitochondrial genes) show the position of
P. molurus molurus relative to 18 other snake species
belonging to major snake families (Fig. 3). Ramphotyph-
lops (a Scolecophidean snake) was used to root the tree.
Our results were in complete agreement with the previ-
ously established snake phylogeny [54, 55], known to be
comprising of three major lineages: Scolecophidea (blind
snakes), Henophidea (primitive snakes), and Caenophidia
(advanced snakes). Scolecophideans are considered basal
group [54] followed by the henophidians (Python, Boa,
Xenopeltis) and was well supported in our study. Among
henophidian snakes, Pythons differ from the generally
similar boas in the mode of reproduction (viviparous boas;
oviparous pythons), this was also evident in our phyloge-
netic analyses where, Python does not cluster with Boa, but
instead shows a strong relationship with Xenopeltis. Hence,
the complete mitochondrial genome analyses also support
the fact that Pythons are not the immediate relatives of
Boid snakes [56].
The remaining taxa, belong to caenophidia (advanced
snakes) with Achalinus meiguensis (family Xenodermati-
dae) occupying the basal position among caenophidians.
This observation was also supported by Vidal et al. [57].
Colubroidea is the most diverse and vast lineage of cae-
nophidian snakes comprising of 3 major groups; Colubri-
dae, Elapidae and Viperidae. We found viperid snakes to
be basal to Colubroidea which was also in concordance
with Kelly et al. [56]. Colubridae and Elapidae cluster
together before they combine with the Viperidae, thus,
supporting the assumption that Elapidae share an ancestor
with the Colubridae, rather than the Viperidae.
Fig. 2 The synonymous and non-synonymous substitution rates (ka/
ks ratio) calculated for three snake species. PM P. molurus, PRP. regius, BO B. constrictor
Mol Biol Rep (2012) 39:7403–7412 7409
123
(ii) Relationships among Pythonidae
The phylogenetic analyses using dataset II (partial cyt
b gene sequences) discern the position of P. molurus molurus
among the other members of family Pythonidae. Pythonidae
is an old world group of ancestral constricting snakes and the
major genera of the family are divided into two groups on the
basis of their occurrence, namely (i) The Afro-Asian and (ii)
the Australo-Papuan genera. Most of the genera (Liasis,
Apodora, Morelia, Bothrochilus, Leiopython, Antaresia) are
found in the Australo-Papuan region barring the Pythons
[20]. A close relative of pythons, ‘Loxocemus’ was used as an
outgroup member. Our phylogenetic analyses (Fig. 4) show
that P. molurus molurus is sister taxon to P. molurus bivitt-
atus (the Burmese Python) and is well placed within the clade
formed by the Afro-Asian species (P. sebae, P. regius,
P. brongersmai and P. molurus bivitattus). However, we find
that genus Python was not monophyletic as, P. reticulatus
and P. timoriensis formed a separate clade, sister to the
Australo-Papuan species suggesting, evolution once in
P. reticulatus, and once in the lineage leading to the Asian
and African species (P. sebae, P. molurus etc.). Our analyses
were in correlation with the paraphyly of Pythons as already
reported in earlier studies [20].
Our phylogenetic analyses reflect the most widely
accepted interpretations with respect to the overall snake
phylogeny as well as the Python phylogenetics thus signi-
fying the utility of newly generated P. molurus molurus mt.
genome in answering phylogenetic issues. Moreover, the
applications of mt. DNA to wildlife management [58, 59],
hybridization of closely related species and evolutionary
genomics are well established. This study, presents the
complete mt. genome of endangered Indian snake, P. molu-
rus molurus, as the data on codon usage, start/termination
codons and increase in the amount of mt. sequence data
availability could be helpful in various population genetics
or evolutionary studies of these animals.
Fig. 3 Phylogenetic
relationships among snake
lineages as inferred from 13
PCGs (dataset I). Numbers on
the branches indicate Bayesian
posterior probabilities,
maximum likelihood and
maximum parsimony analysis
bootstrap values, respectively
7410 Mol Biol Rep (2012) 39:7403–7412
123
Acknowledgments We gratefully acknowledge Mr. Manish
Kulshreshtha, Director Snake Transit House, Jabalpur and Chennai
Snake Park Trust, Chennai, for providing the valuable research
samples. The study was funded by Directorate of Forensic Science,
Ministry of Home Affairs, Government of India, New Delhi.
References
1. Morin PA, Luikart G, Wayne RK, the SNP workshop group
(2004) SNPs in ecology, evolution, and conservation. Trends
Ecol Evol 19:208–216
2. Pearse DE, Arndt AD, Valenzuela N, Miller BA, Cantarelli VH,
Sites JWJR (2006) Estimating population structure under non-
equilibrium conditions in a conservation context: continent-wide
population genetics of the gaint Amazon river turtle Podocnemisexpansa (Chelonia: Podocnemididae). Mol Ecol 15:985–1006
3. Parham JF, Feldman CR, Boore JL (2006) The complete mito-
chondrial genome of enigmatic bigheaded turtle (Pltysteron):
description of unusual genomic features and reconciliation of
phylogenetic hypotheses based on mitochondrial and nuclear
DNA. BMC Evol Biol 6:11
4. Teletchea F, Maudet C, Hanni C (2005) Food and forensic
molecular identification: update and challenges. Trends Bio-
technol 23:359–366
5. Dubey B, Meganathan PR, Haque I (2009) Multiplex PCR assay
for rapid identification of three endangered snake species of
India. Conserv Genet 10:1861–1864
6. Dubey B, Meganathan PR, Haque I (2009) Molecular identifi-
cation of three Indian snake species using simple PCR-RFLP
method. J Forensic Sci 55(4):1065–1067
7. Dubey B, Meganathan PR, Haque I (2011) DNA mini-barcoding:
an approach for forensic identification of some endangered Indian
snake species. Forensic Sci Int Gen 5(3):181–184
8. Meganathan PR, Dubey B, Haque I (2009) Molecular identifi-
cation of crocodile species using novel primers for forensic
analysis. Conserv Genet 10:767–770
9. Meganathan PR, Dubey B, Haque I (2009) Molecular identifi-
cation of Indian crocodile species: PCR-RFLP method for
forensic authentication. J Forensic Sci 54(5):1042–1045
10. Meganathan PR, Dubey B, Jogayya KN, Whitaker N, Haque I
(2010) A novel multiplex PCR assay for the identification of
Indain crocodiles. Mol Ecol Resour 10(4):744–747
11. Meganathan PR, Dubey B, Batzer MA, Ray DA, Haque I (2011)
Complete mitochondrial genome sequences of three Crocodylusspecies and their comparison within the Order Crocodylia. Gene
478:35–41
12. Avise JC (1994) Molecular markers, natural history and evolu-
tion. Chapman & Hall, New York
13. Roman J, Bowen BW (2000) The mock turtle syndrome: genetic
identification of turtle meat purchased in the south-eastern United
States of America. Anim Conserv 3:61–65
14. Whitaker R (2006) Common Indian Snakes: a field guide. Mac-
millan India, New Delhi
15. World Conservation Monitoring Centre (1996) Python molurus.
In: IUCN (2009) IUCN Red List of threatened species. Version
2009.2. www.iucnredlist.org. Downloaded on 10 March 2010
16. Boore JL (1999) Animal mitochondrial genomes. Nuc Acids Res
27:1767–1780
17. Ryder OA (2005) Conservation genomics: applying whole gen-
ome studies to species conservation efforts. Cytogenet Genome
Res 108:6–15
18. Kumazawa Y, Ota H, Nishida M, Ozawa T (1996) Gene rear-
rangements in snake mitochondrial genomes: highly concerted
evolution of control-region like sequences duplicated and inserted
into a tRNA gene cluster. Mol Biol Evol 13:1242–1254
19. Kumazawa Y, Dong S (2005) Complete mitochondrial DNA
sequences of six snakes: phylogenetic relationships and molec-
ular evolution of genomic features. J Mol Evol 61:12–22
20. Rawlings LH, Rabosky DL, Donnellan SC, Hutchinson MN
(2008) Python phylogenetics: inference from morphology and
mitchondrial DNA. Biol J Linn Soc 93:603–619
21. Arnason U, Johnsson E (1992) The complete mitochondrial DNA
sequence of the harbor seal, Phoca vitulina. J Mol Evol 34:
493–505
22. Cao Y, Adachi J, Janke A, Paabo S, Hasegawa M (1994) Phy-
logenetic relationships among eutherian orders estimated from
inferred sequences of mitochondrial proteins: Instability of tree
based on a single gene. J Mol Evol 39:519–527
23. Sambrook JE, Fritsch F, Maniatis T (1989) Molecular cloning. A
laboratory manual, 2nd edn. Cold Spring Harbor Laboratory
Press, New York
24. Barnes WM (1994) PCR amplification of up to 35-kb DNA with
high fidelity and high yield from k bacteriophage templates. Proc
Natl Acad Sci USA 91:2216–2220
25. Hall TA (1999) BioEdit: a user-friendly biological sequence
alignment editor and analysis. http://www.mbio.ncsu.edu/Bio
Edit/bioedit.html
Fig. 4 Phylogenetic relationships among members of Pythonidae as
inferred from partial cytochrome b gene sequences (dataset II).
Numbers on the branches indicate Bayesian posterior probabilities,
maximum likelihood and maximum parsimony analysis bootstrapvalues respectively
Mol Biol Rep (2012) 39:7403–7412 7411
123
26. Laslett D, Canback B (2008) ARWEN, a program to detect tRNA
genes in metazoan mitochondrial nucleotide sequences. Bioin-
formatics 24:172–175
27. Kumar S, Tamura K, Nei M (2004) MEGA 3: integrated software
for molecular evolutionary genetics analysis and sequence
alignment. Brief Bioinform 5:150–163
28. Zhang ZLJ, Zhao XQ, Wang J, Wong GK, Yu J (2006)
KaKs_Calculator: calculating Ka and Ks through model selection
and model averaging. Genomics Proteomics Bioinform 4:259–
263
29. Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins
DG (1997) The ClustalX windows interface: flexible strategies
for multiple sequence alignment aided by quality analysis tools.
Nucleic Acids Res 24:4876–4882
30. Wilcox TP, Zwickl DJ, Heath TA, Hillis DM (2002) Phylogenetic
relationships of the dwarf boas and a comparison of Bayesian and
bootstrap measures of phylogenetic support. Mol Phylogenet
Evol 25:361–371
31. Noonan BP, Chippindale PT (2006) Dispersal and vicariance: the
complex evolutionary history of boid snakes. Mol Phylogenet
Evol 40:347–358
32. Guindon S, Gascuel O (2003) A simple, fast, and accurate
algorithm to estimate large phylogenies by maximum likelihood.
Syst Biol 52:696–704
33. Jobb G, von Haeseler A, Strimmer K (2004) TREEFINDER: a
powerful graphical analysis environment for molecular phylog-
entics. BMC Evol Biol 4:18
34. Huelsenbeck JP, Ronquist FR (2001) MRBAYES: Bayesian
inference of phylogenetic tree. Bioinformatics 17:754–755
35. Felsenstein J (1993) Phylogenetic INFERENCE PROGRAMs
(PHYLIP). University of Washington, Seattle, and University
Herbarium. University of California, Berkeley
36. Douglas AD, Gower JD (2010) Snake mitochondrial genomes:
phylogenetic relationships and implications of extended taxon
sampling for interpretations of mitogenomic evolution. BMC
Genomics 11:14
37. Yan Jie, Li Hongdan, Zhou Kaiya (2008) Evolution of the
mitochondrial genome in snakes: gene rearrangements and phy-
logenetic relationships. BMC Genomics 9:569
38. Boore JL (2004) Complete mitochondrial genome sequence of
Urechis caupo, a representative of the phylum Echiura. BMC
Genomics 5:67
39. Wolstenholme DR, Macfarlane JL, Okimoto R, Clary DO, Wa-
hleithner JA (1987) Bizarre tRNAs inferred from DNA sequences
of mitochondrial genomes of nematode worms. Proc Natl Acad
Sci USA 84:1324–1328
40. Janke A, Erpenbeck D, Nilsson M, Arnason U (2001) The
mitochondrial genomes of the iguana (Iguana iguana) and the
caiman (Caiman crocodylus): implications for amniote phylog-
eny. Proc Biol Sci 268(1467):623–631
41. Zhang M, Wang Y, Yan P, Wu Xiaobing (2011) Crocodilian
phylogeny inferred from twelve mitochondrial protein-coding
genes, with new complete mitochondrial genomic sequences for
Crocodylus acutus and Crocodylus novaeguineae. Mol Phylo-
genet Evol 60:62–67
42. Ojala D, Montoya J, Attardi G (1981) tRNA punctuation model
of RNA processing in human mitochondria. Nature 290:470–474
43. Li Hu, Gao Jianyu, Liu Haiyu, Cai Wanzhi (2009) Progress in the
researches on insect mitochondrial genome and analysis of gene
order. Sci Found China 17(2):39–45
44. Fearney IM, Walker JE (1986) Two overlapping genes in bovine
mitochondrial DNA encode membrane components of ATP
synthase. EMBO J 5:2003–2008
45. Hassanin A, Leger N, Deutsch J (2005) Evidence for multiple
reversals of asymmetric mutational constraints during the evo-
lution of the mitochondrial genome of metazoa, and conse-
quences for phylogenetic inferences. Syst Biol 54:277–298
46. Perna NT, Kocher TD (1995) Patterns of nucleotide composition
at fourfold degenerate sites of animal mitochondrial genomes.
J Mol Evol 41:353–358
47. Asakawa S, Kumazawa Y, Araki T, Himeno H, Miura K, Wa-
tanabe K (1991) Strand-specific nucleotide composition bias in
echinoderm and vertebrate mitochondrial genomes. J Mol Evol
32(6):511–520
48. Jia W, Higgs PG (2008) Codon usage in mitochondrial genomes:
distinguishing context-dependent mutation from translational
selection. Mol Biol Evol 25(2):339–351
49. Shardiwal RK, Sartaj SS (2009) A more elaborative way to check
codon quality: an open source program. EMBnet.news 15(3):
18–21
50. Harlid A, Janke A, Arnason U (1998) The complete mitochon-
drial genome of Rhea Americana and early Avian divergences.
Mol Biol Evol 46:669–679
51. Harlid A, Janke A, Arnason U (1997) The mt DNA sequence of
Ostrich and the divergence between paleognathous and neog-
nathous birds. Mol Biol Evol 14:754–761
52. Hurst LD (2002) The Ka/Ks ratio: diagnosing the form of
sequence evolution. Trends Genet 18(9):486–487
53. Deodoro CSG, Oliveira, Raychoudhury R, Dennis VL, John HW
(2008) Rapidly evolving mitochondrial genome and directional
selection in mitochondrial genes in the parasitic wasp nasonia
(Hymenoptera: Pteromalidae). Mol Biol Evol 25(10):2167–2180
54. Heise JP, Maxson LR, Dowling GH, Hedges SB (1995) Higher-
level snake phylogeny inferred from mitochondrial DNA
sequences of 12s rRNA and 16s rRNA Genes. Mol Biol Evol
12:259–265
55. Dessauer HC, Cadle JE, Lawson R (1987) Patterns of snake
evolution suggested by their proteins. Fieldiana Zool 34:1–34
56. Kelly CMR, Barker NP, Villet MH (2003) Phylogenetics of
advanced snakes (Caenophidia) based on four mitochondrial
genes. Syst Biol 52:439–459
57. Vidal N, Delmas Anne-Sophie, David P, Cruaud C, Couloux A,
Hedges SB (2007) The phylogeny and classification of caeno-
phidian snakes inferred from seven nuclear protein-coding genes.
CR Biologies 330:182–187
58. Ferris SD, Berg WJ (1987) The utility of mitochondrial DNA in
fish genetics and fishery management. In: Ryman N, Utter F (eds)
Population genetics and fishery management. University Wash-
ington Press, Seattle, pp 277–300
59. Quinn TW, White BN (1987) Analysis of DNA sequence varia-
tion. In: Cooke F, Buckley PA (eds) Avian genetics. Academic
Press, London, pp 163–198
7412 Mol Biol Rep (2012) 39:7403–7412
123