SUPPLEMENTARY MATERIAL

Genes and Processed Paralogs Coexist in Plant Mitochondria

Journal of Molecular Evolution, Springer, 10.1007/s00239-012-9496...



[Figure S1: alignment of 30 partial nad1 sequences, with the insertion site of the intron marked "intron".]

Sequences: 1. Butomus; 2. Scheuchzeria; 3. Cymodocea; 4. Lilaea; 5. Heterozostera; 6. Zostera; 7. Phyllospadix; 8. Potamogeton; 9. Aponogeton; 10. Baldellia; 11. Luronium; 12. Echinodorus; 13. Sagittaria; 14. Stratiotes; 15. Ottelia; 16. Blyxa; 17. Ruppia; 18. Syringodium; 19. Hydrocharis; 20. Limnobium; 21. Caldesia; 22. Limnocharis; 23. Hydrocleys; 24. Hydrilla; 25. Nechamandra; 26. Vallisneria; 27. Najas; 28. other paralog E; 29. Egeria; 30. Elodea2

Figure S1. Alignment of the partial sequences of exons B and C of nad1. Reduced alignment including the ca. 130 bp used in our analyses. The location where the cis-spliced intron is inserted in sequences 1 to 20 is indicated. Experimentally observed edited sites are indicated with rectangles. Sequences 1 to 20 are paralog I sequences, and sequences 21 to 30 are paralog E sequences.


[Figure S2: unrooted ML tree; scale bar 0.01.]

Labeled clades and taxa: Dioscoreales Dioscorea rockii; Araceae; Laurales Calycanthus floridus; Pandanales Pandanus tectorius; Fagales; Magnoliales Liriodendron tulipifera; Malpighiales; Apiales Chaerophyllum procumbens; Cucurbitales; Alismatids; Austrobaileyales Trimenia moorei; Liliales Androcymbium dregei; Poales; Caryophyllales; Rosales Prunus persica; Rosales Ziziphus obtusifolia; Tofieldiaceae; Oxalidales Hua gabonica; Dioscoreales Narthecium ossifragum; Asparagales; Ericales Actinidia deliciosa; Apiales Panax japonicus; Dioscoreales Tacca plantaginea; Gentianales Apocynaceae sp.

Figure S2. Phylogenetic reconstruction of 71 angiosperm taxa based on intron nad1i477. Unrooted ML tree based on the partial sequences of nad1i477 using the GTR+G model. Clades and taxa were colored as follows: red lines for basal angiosperms, purple for Magnoliids, blue for Eudicots, and green for Monocots. Generic names are indicated. The order Alismatales was split into the families Tofieldiaceae and Araceae and the remaining 12 families, the latter all included in an informal clade named Alismatids. GenBank accession numbers are available upon request.


[Figure S3: unrooted ML tree; scale bar 0.06.]

Tip labels: CUC 7 sp + SOL 1 sp; ALI Limnocharis; ROS Ziziphus; OXA Hua; ASP Orchidaceae 4 sp; DIO Narthecium; PARALOG E; CEL Brexia; DIO Hexapterella; FAG Juglans; ALI Halophila; MAG Liriodendron; ALI Heterozostera; ROS Prunus; LAU Calycanthus; MAL Licania; ERI Primula; MAL Rafflesia; ALI Egeria; LIL Androcymbium; ALI Cymodocea; ALI Sagittaria; POA Zizaniopsis; PAN Pandanus; ALI Ruppia; ALI Syringodium; ALI Hydrocharis; ALI Stratiotes; MAL Petalostigma; MAL Ochna; MAL Endodesmia; GEN Apocynacea; API Panax; POA (10 sp) + ALI Araceae (18 sp) + DIO (2 sp); ALI Zostera; ALI Tofieldia; ALI Potamogeton; ALI Echinodorus; DIO Campylosiphon; ALI Vallisneria; ASP Phragmipedium; DIO Burmannia; ALI Najas; ACO Acorus; ALI Symplocarpus; ALI Gymnostachys; OXA Oxalis; MAL Passiflora; ZIG Larrea; MAL Eliea; ALI Blyxa; ALI Elodea; ALI Luronium; MAL Manihot; ALI Hydrilla; BRA Brassica; ALI Potamogeton; ALI Caldesia; MAL Sauvagesia; DIO Thismia; GEN Pagamea; ALI Lilaea; MAL Viola; CEL Lepidobotrys; ASP Agave; CUC Cucumis; POA Triticum; ALI Phyllospadix; MAL Rhizanthes; MAL Elvasia; DIO Stenomeris; DIO Apteria; FAG Fagus; COM Eichhornia; ALI Alisma; ALI Butomus; ALI Orontium; ALI Limnobium; MAL Lachnostylis; ALI Ottelia; AUS Trimenia; MYR Oenothera; AMB Amborella; ALI Aponogeton; API Chaerophyllu; GER Geraniaceae 41 sp; AUS Austrobaileya; DIO Dioscorea; ALI Baldellia; ALI Hydrocleys; DIO Geomitra; ASP Allium; DIO Tacca; ALI Najas; ERI Actinidia


Figure S3. Phylogenetic reconstruction of 181 angiosperm taxa based on nad1 exons B and C. Unrooted ML tree based on analysis of partial sequences of nad1 using a JC69+G model. Paralog E and I sequences are indicated in yellow and green, respectively. Only one of the twelve identical paralog E sequences was included in the analysis (indicated as PARALOG E). Classification by order in accordance with APG III is indicated as follows: AMB = Amborellales, AUS = Austrobaileyales, GER = Geraniales, CUC = Cucurbitales, ROS = Rosales, ZIG = Zygophyllales, MAL = Malpighiales, CEL = Celastrales, BRA = Brassicales, FAG = Fagales, OXA = Oxalidales, GEN = Gentianales, API = Apiales, MAG = Magnoliales, ERI = Ericales, COM = Commelinales, DIO = Dioscoreales, PAN = Pandanales, ACO = Acorales, LIL = Liliales, ASP = Asparagales, POA = Poales, ALI = Alismatales. To save space, some clades were collapsed. From the bottom up: 1) 41 species of Geraniales; 2) seven species of Cucurbitales plus Solanum (Solanales); 3) four orchid species (Asparagales); 4) ten species of Poales, plus 18 species of Araceae (Alismatales) and two species of Dioscoreales.
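The order codes defined in the Figure S3 caption can be read as a simple lookup table. The sketch below is only an illustration of how a tip label might be expanded with it. The helper name `expand_label` and the output format are assumptions, not part of the original material. The Asparagales code is entered as ASP, the spelling used on the tree labels, and codes that the caption does not define (for example LAU and MYR) are deliberately left unexpanded.

```python
# Order codes as listed in the Figure S3 caption.
# "ASP" follows the spelling used on the tree labels (assumption).
ORDER_CODES = {
    "AMB": "Amborellales", "AUS": "Austrobaileyales", "GER": "Geraniales",
    "CUC": "Cucurbitales", "ROS": "Rosales", "ZIG": "Zygophyllales",
    "MAL": "Malpighiales", "CEL": "Celastrales", "BRA": "Brassicales",
    "FAG": "Fagales", "OXA": "Oxalidales", "GEN": "Gentianales",
    "API": "Apiales", "MAG": "Magnoliales", "ERI": "Ericales",
    "COM": "Commelinales", "DIO": "Dioscoreales", "PAN": "Pandanales",
    "ACO": "Acorales", "LIL": "Liliales", "ASP": "Asparagales",
    "POA": "Poales", "ALI": "Alismatales",
}


def expand_label(label: str) -> str:
    """Replace a leading order code with the full order name, if known.

    Hypothetical helper: labels whose first token is not a code from the
    caption are returned unchanged.
    """
    code, _, rest = label.partition(" ")
    order = ORDER_CODES.get(code.upper())
    if order is None:
        return label
    return f"{order}: {rest}" if rest else order


if __name__ == "__main__":
    for tip in ["ALI Zostera", "DIO Tacca", "LAU Calycanthus", "PARALOG E"]:
        print(expand_label(tip))
```

Collapsed clades such as "CUC 7 sp + SOL 1 sp" combine several codes in one label, so only the leading code would be expanded by this sketch.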


Table S1. Primers used for amplification of nad1.

Name            Position  Sequence
Nad1EB          Exon B    GCATTACGATCTGCAGCTCA
Nad1ECr         Exon C    GGAGCTCGATTAGTTTCTGC
Nad1iB          Intron 2  GCTCCACCGGCCTAAACGAGGAGC
Nad1iB2         Intron 2  CGCTCGATCGACGGGACGGAC
Nad1iB2         Intron 2  GTCGAGCATACGACGATGCCGC
Nad1iB2R        Intron 2  GGCGGCATCGTCGTATGCTCG
Nad1p_E1_30F    Exon A    GCTGTTCCAGCGGAAATACT
Nad1p_E1_288R   Exon A    CCCAAGCGACCAGACTTAAC
Nad1p_E2_441F   Exon B    GCAGCTCAAATGGTCCCTTA
Nad1p_E5_971R   Exon E    AACACCTGAAACGGGGACTA

Primers Nad1EB and Nad1ECr were developed by ref. [37].
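As a quick sanity check on the primer list, length, GC fraction, and reverse complement can be computed directly from the sequences in Table S1. This is a minimal sketch and not part of the original methods. It assumes the sequences are written 5' to 3', and the function names `reverse_complement` and `gc_fraction` are illustrative.

```python
# Primer sequences copied from Table S1.
# Assumption: sequences are written 5'->3'. The name Nad1iB2 appears twice
# in the table, so a list of pairs is used instead of a dict.
PRIMERS = [
    ("Nad1EB", "GCATTACGATCTGCAGCTCA"),
    ("Nad1ECr", "GGAGCTCGATTAGTTTCTGC"),
    ("Nad1iB", "GCTCCACCGGCCTAAACGAGGAGC"),
    ("Nad1iB2", "CGCTCGATCGACGGGACGGAC"),
    ("Nad1iB2", "GTCGAGCATACGACGATGCCGC"),
    ("Nad1iB2R", "GGCGGCATCGTCGTATGCTCG"),
    ("Nad1p_E1_30F", "GCTGTTCCAGCGGAAATACT"),
    ("Nad1p_E1_288R", "CCCAAGCGACCAGACTTAAC"),
    ("Nad1p_E2_441F", "GCAGCTCAAATGGTCCCTTA"),
    ("Nad1p_E5_971R", "AACACCTGAAACGGGGACTA"),
]

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, seq in PRIMERS:
        print(f"{name:<14} {len(seq):>2} nt  GC={gc_fraction(seq):.2f}  "
              f"rc={reverse_complement(seq)}")
```

Run on the table as printed, the reverse complement of the second Nad1iB2 entry shares 20 consecutive bases with Nad1iB2R, offset by one base, which suggests the two primers target the same site on opposite strands.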


Table S2. Plant material and GenBank accession numbers used in this study.

Family            Taxon                     Voucher* Paralog I  Paralog E
Acoraceae         Acorus calamus            C998     HM576887   absent
                  Acorus gramineus          C2444    HM576888   absent
Araceae           Arisaema amurense         C1295    HM576827   absent
                  Gymnostachys anceps       C205     HM576828   absent
                  Orontium aquaticum        C2449    HM576829   absent
                  Symplocarpus foetidus     C2450    HM576830   absent
Alismataceae      Alisma plantago-aquatica  C30      HM576848   absent
                  Baldellia ranunculoides   C1082    HM576846   absent
                  Caldesia oligococca       C246     absent     HM576860
                  Echinodorus cordifolius   C1526    HM576849   HM576881
                  Echinodorus uruguayensis  C20      HM576850   HM576884
                  Luronium natans           C2377    HM576847   absent
                  Ranalisma humile          C2373    HM576851   absent
                  Sagittaria sagittifolia   C1284    HM576852   HM576879
Aponogetonaceae   Aponogeton crispus        C114     HM576845   HM576883
Butomaceae        Butomus umbellatus        C29      HM576844   absent
Cymodoceaceae     Amphibolis griffithii     C2375    HM576834   absent
                  Cymodocea nodosa          C1260    HM576835   absent
                  Syringodium isoetifolium  C2376    HM576857   absent
Hydrocharitaceae  Blyxa aubertii            C107     HM576855   HM576880
                  Egeria najas              C1525    absent     HM576871
                  Elodea canadensis         C798     absent     HM576863
                  Elodea nuttallii          C2458    absent     HM576873
                  Halophila sp.             C864     absent     HM576864
                  Hydrilla verticillata     C112     absent     HM576865
                  Hydrocharis morsus-ranae  C524     HM576858   absent
                  Limnobium laevigatum      C1449    HM576859   absent
                  Najas guadalupensis       C1485    absent     HM576868
                  Najas sp.                 C113     absent     HM576869
                  Nechamandra alternifolia  C111     absent     HM576866
                  Ottelia ovalifolia        C245     HM576854   HM576875
                  Stratiotes aloides        C23      HM576853   absent
                  Vallisneria sp.           C1139    absent     HM576867
Limnocharitaceae  Hydrocleys nymphoides     C1434    absent     HM576862
                  Limnocharis flava         C1435    absent     HM576861
Juncaginaceae     Lilaea scilloides         C25      HM576837   absent
                  Triglochin maritima       C1440    -          HM576872
                  Triglochin palustre       C476     HM576838   HM576878
Posidoniaceae     Posidonia oceanica        C262     HM576836   HM576882
Potamogetonaceae  Potamogeton lucens        C531     HM576842   -
                  Potamogeton natans        C534     HM576843   HM576877
                  Zannichellia palustris    C840     -          HM576885
Ruppiaceae        Ruppia cirrhosa           C530     HM576856   HM576874
Scheuchzeriaceae  Scheuchzeria palustris    C522     HM576833   HM576870
Tofieldiaceae     Tofieldia pusilla         C462     HM576831   absent
                  Pleea tenuifolia          C2448    -          absent
Zosteraceae       Heterozostera tasmanica   C2374    HM576839   -
                  Phyllospadix scouleri     C1093    HM576841   HM576886

*All voucher numbers refer to the Copenhagen University's Herbarium and DNA bank.
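The presence/absence pattern in Table S2 can be tallied programmatically. The sketch below uses only a handful of rows copied from the table and is meant as an illustration rather than a reanalysis. The category names and the helpers `cell_status` and `classify` are assumptions. In particular, the table does not define the "-" entries, so they are treated here as undetermined.

```python
from collections import Counter

# A few rows copied from Table S2: (taxon, paralog I, paralog E).
ROWS = [
    ("Acorus calamus", "HM576887", "absent"),
    ("Caldesia oligococca", "absent", "HM576860"),
    ("Echinodorus cordifolius", "HM576849", "HM576881"),
    ("Triglochin maritima", "-", "HM576872"),
    ("Pleea tenuifolia", "-", "absent"),
]


def cell_status(cell: str) -> str:
    """Map a table cell to 'present', 'absent', or 'undetermined'."""
    if cell == "absent":
        return "absent"
    if cell == "-":
        # Assumption: "-" is not defined in the table; treat as undetermined.
        return "undetermined"
    return "present"  # any other value is taken to be an accession number


def classify(paralog_i: str, paralog_e: str) -> str:
    """Summarize which paralogs are recorded for one taxon."""
    i, e = cell_status(paralog_i), cell_status(paralog_e)
    if "undetermined" in (i, e):
        return "undetermined"
    if i == "present" and e == "present":
        return "both"
    if i == "present":
        return "I only"
    if e == "present":
        return "E only"
    return "neither"


tally = Counter(classify(i, e) for _, i, e in ROWS)

if __name__ == "__main__":
    for category, count in sorted(tally.items()):
        print(f"{category}: {count}")
```

Extending `ROWS` to the full table would give the per-category counts for all sampled taxa.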