supplementary methods patient material selection criteria
TRANSCRIPT
SUPPLEMENTARY METHODS
Patient material selection criteria
The search term was “metastatic adenocarcinoma” and the years included were 2000 through
2011 (N =3823). To be included for further analysis, the patients had to be deceased, have
detailed clinical data on primary cancer, axillary metastasis as well as distant metastasis
available, and enough paraffin embedded material to enable exome sequencing, gene
expression and immunohistochemical stains from each site. Core and fine needle biopsies
were not eligible for inclusion. In total twenty patients met the criteria. Formalin fixed
paraffin embedded (FFPE) tissue sections were retrieved from all lesions. From the majority
of primary cancers and metastases, multiple tumor areas of different topography were isolated
(>5 mm distance from each other) resulting in 104 samples. Five metastatic samples (two
samples of bone relapse in patient 6, one region of local recurrence sample in patient 13 and
two samples of bone relapse in patient 12) failed during exome sequencing due to insufficient
DNA, resulting in a total of 99 samples.
Tissue microarray (TMA) and IHC staining
FFPE sections were conditioned in CC1 solution (Ventana Medical Systems, Tucson, AZ,
USA) for 36 min (Ki67) to 64 min (PR) and incubated with mouse monoclonal antibodies for
Ki67 (clone Mib-1) (Dako A/S, Glostrup, Denmark) and rabbit monoclonal primary
antibodies (Ventana) for ER (clone SP1), PR (clone 1E2), and HER2 (clone 4B5) at 35 °C
(HER2) or 37 °C (others) for 16 min (Ki67) to 44 min (ER) according to the manufacturer’s
instructions, and finally counterstained with hematoxylin. Two independent pathologists
(NFM and GS) at Karolinska Institutet performed scoring of ER, PR, HER2 and Ki67 and the
consensus values were used to determine IHC-based surrogate subtype for each cancer
sample. The assessments of ER, PR, HER2 and Ki67 IHC were combined into surrogate
subtypes using definitions recommended by expert recommendations (1-3). For a laboratory
specific threshold for Ki67 in Tissue Microarray (TMA) specimens, we incorporated digital
image analysis of a previously published cohort (n=130) of consecutive cancer specimens
collected at the Department of Pathology, Uppsala University Hospital, Uppsala, Sweden
from January 1 1987 through December 31 1989 (4-7). Surrogate subtype classification
based on IHC is illustrated below:
Luminal A-like: ER ≥ 1 % and PR ≥ 20 %. HER2 “negative” and Ki67 < 4.1 %.
Luminal B-like: ER ≥ 1 % or PR ≥ 1 % and HER2 “negative” and Ki67 ≥ 4.1 %, or
ER ≥ 1 % or PR ≥ 1 % and HER2 “positive” Any Ki67or ER ≥ 1 % and PR < 20 % and
HER2 “negative”. Any Ki67.
HER2-enriched-like: ER < 1 % and PR < 1 %. HER2 “positive”. Any Ki67.
Basal-like: ER < 1 % and PR < 1 %. HER2 “negative”. Any Ki67.
PAM50 molecular subtyping after subgroup-specific gene-centering
PAM50 molecular subtyping (8) of each tumour sample was performed after subgroup-
specific gene-centering (9). The population- based Stockholm cohort with primary breast
cancer patients (10) (GEO:GSE1456) was used as training cohort. The subgroup of patients
with breast cancer relapse within the first five years was used to mimic the tumour
progression cohort. All molecular subtype analysis was done in R/Bioconductor.
The PAM50 centroids and Entrez Gene IDs in the pam50 data object in the package genefu
was used. The hgu113a.db and hgu133b.db annotation packages were used for the Stockholm
data and 49/50 PAM50 genes had mapped probesets on the Affymetrix HG-U133A and HG-
U133B arrays. For probesets that were present on both arrays, the average value was used.
For probesets that were mapped to the same Entrez Gene ID, the one with highest
interquartile range was selected. For each PAM50 gene, the subgroup-specific percentile of
the global median in the training cohort was identified. The value 50 (i.e. the median) was
imputed for the one gene (KRT17) where gene-expression data was missing.
In the tumour progression cohort, all PAM50 genes have mapped probesets on the Affymetrix
Human Transcriptome Array (HTA) 2.0 platform (GEO:GPL17586) as given by the
manufacturer’s annotations. Again, for probesets that were mapped to the same Entrez Gene
ID, the one with highest interquartile range was selected. The baseline expression of each
gene was assigned at the subgroup-specific percentile of the breast samples in the tumour
progression cohort (median aggregated by patient). Thereafter expression data for each
sample was gene-centered by subtracting the baseline expression.
For each sample in the tumour progression cohort data, the Spearman’s rank correlation
between the sample after subgroup-specific gene-centering and each of the five PAM50
subtype centroids was calculated and the class of the most highly correlated centroid was
assigned to the sample. Finally, a stringent criterion of nearest centroid correlation coefficient,
larger than 0.25, was applied to assign a final subtype classification.
Main assumptions in Dollo parsimony
We used a variant of parsimony-based phylogenetic reconstruction method named Dollo
parsimony to reconstruct phylogenetic tree for each patient. We used Rdollop() from R
package Rphylip, which uses the implementation “dollop” given in PHYLIP version 3.696.
Following are the main assumptions in Dollo parsimony:
1. We know the state of each ancestral site (in germline) to be 0.
2. The sites (mutations) evolve independently.
3. Each lineage in the phylogenetic tree evolve independently of each other.
4. Probability of acquiring a mutation, i.e., changing from state 0 to 1 is small.
5. Probability of a losing a mutation (a deletion), i.e., changing from state 1 to 0 is also small,
but still far greater than the probability of acquiring a mutation.
Validation of phylogenetic trees
We validated the phylogenetic trees produced by Dollo parsimony using two approaches.
First, we performed phylogenetic reconstruction by an orthogonal method “LICHeE” v1.0
(60), We used the following parameters: -minVAFPresent 0.05 -minClusterSize 10 -
maxClusterDist 0.25 -maxVAFAbsent 0.
Second, we validated that the phylogenetic trees are not affected by variable coverage and/or
different tumor purity between samples. We adapted a modified approach from Yates et al
(16) to identify and remove mutations whose presence or absence in any sample from a
patient is indeterminate due to either read coverage or lower tumor purity, i.e., they can be
missed by chance. Then, phylogenetic trees were reconstructed using Dollo parsimony after
removing all indeterminate mutations. Supplementary Figure 11 contains a side-by-side
comparison of the trees in each patient.
To identify indeterminate mutations, we computed the upper 95% confidence interval (CI) of
VAF for each absent mutation in each sample according to the binomial distribution. If the
upper 95% CI exceeded a threshold VAFthr, the mutation was marked as indeterminate.
VAFthr is defined for each mutation as the maximum observed VAF for that mutation in other
samples from the same patient multiplied by the ratio between tumor purity in the sample
having the maximum VAF and tumor purity of the considered sample. Although this
approach does not take into account copy number information and assumes similar underlying
cancer cell fraction, we believe that it removes majority of mutations that have ambiguous
placement in phylogetentic trees. Binomial confidence intervals were computed according to
the “bayes” method using binom.confint() function in binom package in R.
Subset analysis to validate the robustness of phylogenetic inference
Intratumor heterogeneity in the primary cancer (11) can complicate the inference of seeding
origin of metastases. In order to ameliorate this effect, we sequenced multiple primary blocks
in some patients which demonstrated, for instance in patient 4, how different primary regions
seeded different metastases (Fig. 3b). However, on the other hand, this also raises the question
whether the number of primary samples sequenced affects the inference of progression model.
This is termed as incomplete taxon sampling problem in phylogenetic inference. In order to
show that Dollo parsimony is robust to this problem, we performed subset analysis for the
following two cases.
i. In case of patient 4, a parallel progression case, where we have 6 primary samples,
taking all 62 possible subsets of primary samples with three metastases and
estimating the probability of linear progression. A case where we observe lower
probability of linear progression in each subset will ultimately support a higher
probability for the existing inference of parallel progression.
ii. In case of patient 5, a linear progression case where we have 2 primary samples,
taking the 2 possible subsets of primary samples with two metastases and
estimating the probability of linear progression. A case where we observe higher
probability of linear progression in each subset will ultimately support the existing
hypothesis reported in the manuscript.
We used the following method to infer the probability of linear progression. We reconstructed
1000 bootstrap trees from available subset of samples as described in the Methods. Then, for
each of the bootstrap tree, we used the separating property to test whether any of the primary
samples is blocking the path among the metastases. If blocking, we have a NO result for
linear progression; if not blocking, we have a YES result for linear progression. Finally we
combined the results across all the 1000 trees to estimate the probability of linear progression.
Subset analysis for patient 4
In patient 4, we have 62 possible subsets. This includes 6 possible subsets containing 1
primary sample, 15 possible subsets containing 2 primary samples, 20 possible subsets
containing 3 primary samples, 15 possible subsets containing 4 primary samples, and 6
possible subsets containing 5 primary samples. The results are given in Supplementary Table
8. We observe from the results that, across all possible 62 subsets, we obtain either zero or
almost zero probability that all three metastases are seeded in a linear fashion. This confirms
that the primary tumor has seeded at least two or all three metastases in parallel, which is in
line with the results for all samples taken together (Fig. 3).
Next, we take into account the paired metastases cases (Uterus to Brain, Uterus to Colon, and
Brain to Colon) where, for a metastases pair, the earlier metastasis has seeded the latter
metastases in a linear fashion. For Uterus to Brain pair, we see only 1/62 case with more than
50% probability meaning that 98% of the subsets support Uterus did not seed the Brain
metastases. For the rest of two possible cases (Uterus-Colon, and Brain-Colon), there is not a
single case with a probability of 50% or higher of linear progression in any pair. Overall, the
subset analysis supports the parallel progression model for patient 4.
Subset analysis for patient 5
In patient 5, we have 2 possible subsets where we take one primary sample each with the two
bone metastases. The results are given in Supplementary Table 9. We observe from the results
that the probability of linear progression in both subsets is almost 100% which supports the
results when full data is used for inferring the progression model.
In summary, we see that Dollo parsimony is robust to the number and combination of primary
samples taken for inferring the phylogenetic tree. This, in turns, means that progression model
inference performed using separating property also does not change when different subsets
are considered.
Parameter values used in PyClone Out of the available three models in PyClone, we used the authors’ recommended genotype-
aware PyClone-beta-binomial model with all model’s parameter values set to recommended
values (the rest of the two models are genotype-naive infinite binomial mixture model and
infinite beta-binomial mixture model). We tested the robustness of cellular prevalence (CP)
cut-off of 0.05 as follows. We set the cellular prevalence cut-off to 0.04 and 0.02 and
compared it to CP cut-off of 0.05 to check if the seeding patterns are altered. We observed
that, overall, the progression and lymph node seeding results were not changed for CP
threshold of 0.04 and they were quite similar for the CP threshold of 0.02 (Supplementary
Table 10). Regarding the number of iterations in MCMC, the following criterion was used. If
the number of samples in patient were less than 5, 10000 iterations were used; if the number
of samples in a patient were between 5 and 7, 15000 iterations were used; if the number of
samples in a patient were between 7 and 10, 20000 iterations were used; and if the number of
samples were more than 10, 50000 iterations were used. The first 25% percent iterations were
thrown as burnin, thereafter every 10th sample was considered, i.e., a thinning value of 10
was used. To test convergence, we ran two independent PyClone analyses for each patient and
compared the results. For patient 11, we found that using 15000 iterations for MCMC
sampling were not enough for convergence. Subsequently, we used 30000 iterations and
observed convergence.
Mutational Signatures
We extracted a number of signatures ranging between 2-10 with five repetitions, and
computed the residuals sum of squares (RSS) and the explained variance between the
observed profile and fitted spectrum for different number of signatures. The final number of
signatures (four) was decided based on the first inflection point when plotting RSS and
explained variance change with number of signatures (Supplementary Fig. 7a). The accuracy
of the fitted signatures is dependent on the number of samples used for extraction. To allow
higher accuracy of fitting, we merged our cohort with an external in-house cohort of primary
breast cancers from 129 patients with exome sequencing. The external cohort analysis was
performed in a similar pipeline, which excludes potential batch effects.
To identify the biological processes underlying each signature, the Euclidean distance was
computed between the frequencies of different mutation classes in our four signatures and
those in the validated signatures published by Alexandrov et al (12). Based on the shortest
Euclidean distance, we were able to reliably map signatures S1 and S2 to the age-associated
signature 1 and APOBEC-associated signature 2 from Alexandrov et al respectively
(Supplementary Fig. 7b). Signature S3 had a similar distance to several published signatures.
We believe that the best candidate for S3 is signature 8 which has an unknown etiology since
they share the characteristic of weak strand bias in C>A substitutions (Supplementary Fig. 7c)
and since signature 8 was also found in breast cancer. We found that elevated contribution of
signature S4 is significantly associated with BRCA1/2 deleterious germline mutations in the
external cohort (p-value = 0.0009, Mann-Whitney). Consequently, S4 was mapped to
signature 3 in Alexandrov et al (12) which is associated with homologous recombination
deficiency.
References1. GuiuS,MichielsS,AndreF,CortesJ,DenkertC,DiLeoA,HennessyBT,SorlieT,
SotiriouC,TurnerN,etal.Molecularsubclassesofbreastcancer:howdowedefinethem?TheIMPAKT2012WorkingGroupStatement.Annalsofoncology:officialjournaloftheEuropeanSocietyforMedicalOncology/ESMO.2012;23(12):2997-3006.
2. CoatesAS,WinerEP,GoldhirschA,GelberRD,GnantM,Piccart-GebhartM,ThurlimannB,SennHJ,andPanelM.Tailoringtherapies--improvingthemanagementofearlybreastcancer:StGallenInternationalExpertConsensusonthePrimaryTherapyofEarlyBreastCancer2015.Annalsofoncology:officialjournaloftheEuropeanSocietyforMedicalOncology/ESMO.2015;26(8):1533-46.
3. PratA,CheangMC,MartinM,ParkerJS,CarrascoE,CaballeroR,TyldesleyS,GelmonK,BernardPS,NielsenTO,etal.Prognosticsignificanceofprogesteronereceptor-positivetumorcellswithinimmunohistochemicallydefinedluminalAbreastcancer.Journalofclinicaloncology:officialjournaloftheAmericanSocietyofClinicalOncology.2013;31(2):203-9.
4. LindahlT,LandbergG,AhlgrenJ,NordgrenH,NorbergT,KlaarS,HolmbergL,andBerghJ.OverexpressionofcyclinEproteinisassociatedwithspecificmutationtypesinthep53geneandpoorsurvivalinhumanbreastcancer.Carcinogenesis.2004;25(3):375-80.
5. SjogrenS,InganasM,NorbergT,LindgrenA,NordgrenH,HolmbergL,andBerghJ.Thep53geneinbreastcancer:prognosticvalueofcomplementaryDNAsequencingversusimmunohistochemistry.JNatlCancerInst.1996;88(3-4):173-82.
6. LinderholmB,KarlssonE,KlaarS,LindahlT,BorgAL,ElmbergerG,andBerghJ.Thrombospondin-1expressioninrelationtop53statusandVEGFexpressioninhumanbreastcancers.Europeanjournalofcancer.2004;40(16):2417-23.
7. StalhammarG,FuentesMartinezN,LippertM,TobinNP,MolholmI,KisL,RosinG,RantalainenM,PedersenL,BerghJ,etal.Digitalimageanalysisoutperformsmanualbiomarkerassessmentinbreastcancer.ModPathol.2016;29(4):318-29.
8. ParkerJS,MullinsM,CheangMC,LeungS,VoducD,VickeryT,DaviesS,FauronC,HeX,HuZ,etal.Supervisedriskpredictorofbreastcancerbasedonintrinsicsubtypes.JClinOncol.2009;27(8):1160-7.
9. ZhaoX,RodlandEA,TibshiraniR,andPlevritisS.Molecularsubtypingforclinicallydefinedbreastcancersubgroups.BreastCancerRes.2015;17(29.
10. PawitanY,BjohleJ,AmlerL,BorgAL,EgyhaziS,HallP,HanX,HolmbergL,HuangF,KlaarS,etal.Geneexpressionprofilingsparesearlybreastcancerpatientsfromadjuvanttherapy:derivedandvalidatedintwopopulation-basedcohorts.BreastCancerRes.2005;7(6):R953-64.
11. YatesLR,GerstungM,KnappskogS,DesmedtC,GundemG,VanLooP,AasT,AlexandrovLB,LarsimontD,DaviesH,etal.Subclonaldiversificationofprimarybreastcancerrevealedbymultiregionsequencing.NatMed.2015;21(7):751-9.
12. AlexandrovLB,Nik-ZainalS,WedgeDC,AparicioSA,BehjatiS,BiankinAV,BignellGR,BolliN,BorgA,Borresen-DaleAL,etal.Signaturesofmutationalprocessesinhumancancer.Nature.2013;500(7463):415-21.
1 2 3 4 5 6 7 8 9 10
1_Pr
imar
y1_
Lung
1.R
11_
Lung
2.R
11_
Live
r1.R
21_
Live
r2.R
21_
Nor
mal
2_Pr
imar
y2_
ALN
2_C
olon
1.R
12_
Col
on2.
R1
2_N
orm
al
3_Pr
imar
y3_
ALN
3_Bo
ne.R
13_
Nor
mal
4_Pr
imar
y64_
Prim
ary4
4_Pr
imar
y24_
Prim
ary3
4_Pr
imar
y14_
Prim
ary5
4_U
teru
s.R
24_
Brai
n.R
34_
Col
on.R
44_
Nor
mal
5_Pr
imar
y15_
Prim
ary2
5_Bo
ne.R
15_
Bone
.R2
5_N
orm
al
6_Pr
imar
y6_
Nor
mal
7_Pr
imar
y.lo
cal.r
ecur
renc
e7_
Skin
1.R
17_
Skin
2.R
17_
Nor
mal
.17_
Nor
mal
.27_
Nor
mal
.3
8_Pr
imar
y8_
ALN
8_Sk
in8_
Bone
8_N
orm
al.1
8_N
orm
al.2
9_Pr
imar
y9_
Brai
n3.R
19_
Brai
n2.R
19_
Brai
n1.R
19_
Nor
mal
.19_
Nor
mal
.2
10_P
rimar
y110
_Prim
ary2
10_A
LN1
10_A
LN2
10_S
kin3
.R2
10_S
kin2
.R2
10_S
kin1
.R2
10_N
orm
al
0.00
0.25
0.50
0.75
1.00
Targ
ets
perc
enta
ge c
over
ed
11 12 13 14 15 16 17 18 19 20
11_P
rimar
y111
_Prim
ary5
11_P
rimar
y311
_Prim
ary4
11_P
rimar
y211
_Bra
in2.
R1
11_B
rain
1.R
111
_Nor
mal
12_P
rimar
y12
_ALN
12_N
orm
al
13_P
rimar
y.lo
cal.r
ecur
renc
e.1
13_P
rimar
y.lo
cal.r
ecur
renc
e.2
13_A
LN.re
laps
e13
_Bon
e1.R
113
_Bon
e2.R
113
_Nor
mal
14_P
rimar
y14
_ALN
14_B
rain
1.R
114
_Bra
in2.
R1
14_N
orm
al
15_P
rimar
y15
_ALN
15_L
iver
.R1
15_N
orm
al.1
15_N
orm
al.2
16_P
rimar
y.lo
cal.r
ecur
renc
e.1
16_P
rimar
y.lo
cal.r
ecur
renc
e.2
16_P
rimar
y.lo
cal.r
ecur
renc
e.3
16_S
kin3
.R1
16_S
kin2
.R1
16_S
kin1
.R1
16_N
orm
al
17_P
rimar
y217
_Prim
ary1
17_A
LN1
17_A
LN2
17_B
rain
.R1
17_N
orm
al
18_P
rimar
y218
_Prim
ary1
18_A
LN1
18_A
LN2
18_S
kin1
.R2
18_S
kin2
.R2
18_N
orm
al
19_P
rimar
y19
_Bra
in1.
R3
19_B
rain
1.R
519
_Bra
in2.
R5
19_N
orm
al
20_P
rimar
y420
_Prim
ary2
20_P
rimar
y120
_Prim
ary3
20_O
vary
8.R
220
_Ova
ry6.
R2
20_O
vary
2.R
220
_Ova
ry5.
R2
20_O
vary
4.R
220
_Ova
ry7.
R2
20_O
vary
9.R
220
_Ova
ry3.
R2
20_O
vary
1.R
220
_Nor
mal
.120
_Nor
mal
.2
sample
Coverage Level> x0
> x5
> x10
> x30
> x50
> x80
sample
0.00
0.25
0.50
0.75
1.00
Targ
ets
perc
enta
ge c
over
ed
Supplementary Figure 1
Prim
ary
Dia
gnos
is
1
2
Neo
Adju
vant
3
4
5
6
7
8
9
10
55 months
1 year
CT
RT
HT
M1 M2
57 monthsM1
L+
11 monthsL+ M1
116 months
M1 M2 M3
M4
58 months
M1
11
12
13
14
15
16
17
18
19
20
M1
92 months
84 months
M2adr.gl
M3boneM1
82 monthsL+ M1
48 months
M2M1
L+M1
76 months
33 months
M1
36 monthsL+M1
163 months
M1
108 monthsL+
142 monthsL+
40 months
M1
80 monthsL+
M1
115 monthsL+
M3
90 months
M1 M2
M1
lung liver
colon
Loc
bone
Contrlung uterus brain
colon
H2
Her2 therapyL+ positive axillary lymph nodeLoc: Local relapseContr: contralateral Grey text: Not available/not sequenced
TAM
PAM
50 p
rimar
y tu
mor
boneM2bone
boneM2bone
M3lung
Locbone
M2liverskin
LA
LB
BL
BL
BL
LocContrskin
M4brain
brain lung
BL
Locskin Contrskin
M2skin
M3bone
BL
brainM2lung
M3skin
Locaxilla bone
Locbreast bone
Locaxillary
BL
M2bone
Contr M2brain
M3esophagusliver
Loc skinM2lung
M3brain
Contr brain
Contr Loc skin skin boneM4lung
M1bone M2
boneM3brain
M4bone
M5brain
Loc
boneM2ovary
295 months
L+
182 months 262 months
LB
H2
LA/LB
LB
5 y 10 y 15 y
LA Luminal ABL Basal like
LB Luminal B
H2 Her 2 enriched
LA/NBL
N/A
N/D
LB
N/A
N/A
N/A
NBL Normal breast like
M1Loc
Supplementary Figure 2
Exome Sequencing
Data
Somatic Mutations
Copy Number Calls
Tumor cell fraction
Phylogenetic Reconstruction
Subclonal Analysis
100
80
90
PYCLONE
DOLLO PARSIMONY MUTECT
ASCATNGS
Germline
Clu
ster
Clu
ster
Clu
ster
Clu
ster
Clu
ster
Clu
ster
Clu
ster
Clu
ster
Clu
ster
Primary
Primary
one R
one R
Prim
ary
Prim
ary
one
R
one
R
Supplementary Figure 3
A
B
Primary
Germline
Metastasis 2a Metastasis 2b
Primary
Germline
Lymph node
Metastasis 2a Metastasis 2b
Metastasis 1
Metastasis 1
Supplementary Figure
Liver1.R2
Liver2.R2
Lung1.R1
Lung2.R1
Primary
Primary
Lung2.R1
Lung1.R1
Liver2.R2
Liver1.R2
0.4 0.43 0.46 0.49 1
0.66 0.69 0.79 1 0.49
0.68 0.69 1 0.79 0.46
0.73 1 0.69 0.69 0.43
1 0.73 0.68 0.66 0.4
Percentage of shared mutations in patient 1
Liver1.R2
Liver2.R2
Lung1.R1
Lung2.R1
Primary
Primary
Lung2.R1
Lung1.R1
Liver2.R2
Liver1.R2
0.33 0.29 0.33 0.3 0
0.12 0.1 0.12 0 0.38
0.1 0.1 0 0.12 0.4
0.15 0 0.25 0.26 0.48
0 0.16 0.26 0.28 0.51
Percentage of exclusive (specific) mutation in patient 1
ALN
Colon1.R1
Colon2.R1
Primary
Primary
Colon2.R1
Colon1.R1
ALN
0.44 0.35 0.46 1
0.34 0.4 1 0.46
0.24 1 0.4 0.35
1 0.24 0.34 0.44
Percentage of shared mutations in patient 2
ALN
Colon1.R1
Colon2.R1
Primary
Primary
Colon2.R1
Colon1.R1
ALN
0.14 0.57 0.4 0
0.25 0.51 0 0.34
0.26 0 0.31 0.33
0 0.74 0.62 0.52
Percentage of exclusive (specific) mutation in patient 2
Percentage of of mutations present in sample on y-axis but absent in sample on x-axis
Percentage of of mutations present in sample on y-axis but absent in sample on x-axis
Supplementary igure 5
ALN
Bone.R1
Primary
Primary
Bone.R1
ALN
0.28 0.4 1
0.23 1 0.4
1 0.23 0.28
Percentage of shared mutations in patient 3
ALN
Bone.R1
Primary
Primary
Bone.R1
ALN
0.63 0.35 0
0.72 0 0.48
0 0.47 0.45
Percentage of exclusive (specific) mutation in patient 3Brain.R3
Colon.R4
Primary1
Primary2
Primary3
Primary4
Primary5
Primary6
Uterus.R2
Uterus.R2
Primary6
Primary5
Primary4
Primary3
Primary2
Primary1
Colon.R4
Brain.R3
0.34 0.29 0.32 0.34 0.34 0.36 0.38 0.34 1
0.44 0.4 0.48 0.56 0.61 0.64 0.57 1 0.34
0.42 0.39 0.46 0.56 0.59 0.58 1 0.57 0.38
0.46 0.42 0.5 0.62 0.61 1 0.58 0.64 0.36
0.44 0.4 0.53 0.57 1 0.61 0.59 0.61 0.34
0.42 0.4 0.48 1 0.57 0.62 0.56 0.56 0.34
0.43 0.41 1 0.48 0.53 0.5 0.46 0.48 0.32
0.4 1 0.41 0.4 0.4 0.42 0.39 0.4 0.29
1 0.4 0.43 0.42 0.44 0.46 0.42 0.44 0.34
Percentage of shared mutations in patient 4
Brain.R3
Colon.R4
Primary1
Primary2
Primary3
Primary4
Primary5
Primary6
Uterus.R2
Uterus.R2
Primary6
Primary5
Primary4
Primary3
Primary2
Primary1
Colon.R4
Brain.R3
0.44 0.52 0.46 0.45 0.46 0.43 0.41 0.46 0
0.36 0.43 0.32 0.26 0.24 0.2 0.26 0 0.52
0.39 0.45 0.34 0.27 0.26 0.26 0 0.28 0.49
0.36 0.43 0.31 0.23 0.25 0 0.27 0.24 0.51
0.37 0.43 0.27 0.25 0 0.23 0.25 0.24 0.52
0.4 0.44 0.33 0 0.29 0.24 0.29 0.3 0.53
0.4 0.45 0 0.36 0.34 0.35 0.39 0.38 0.56
0.4 0 0.38 0.41 0.42 0.4 0.43 0.43 0.57
0 0.45 0.39 0.42 0.42 0.38 0.43 0.41 0.54
Percentage of exclusive (specific) mutation in patient 4
Percentage of of mutations present in sample on y-axis but absent in sample on x-axis
Percentage of of mutations present in sample on y-axis but absent in sample on x-axis
Supplementary igure 5
Bone.R1
Bone.R2
Primary1
Primary2
Primary2
Primary1
Bone.R2
Bone.R1
0.59 0.58 0.71 1
0.57 0.6 1 0.71
0.71 1 0.6 0.58
1 0.71 0.57 0.59
Percentage of shared mutations in patient 5
Bone.R1
Bone.R2
Primary1
Primary2
Primary2
Primary1
Bone.R2
Bone.R1
0.18 0.2 0.13 0
0.23 0.21 0 0.2
0.16 0 0.28 0.32
0 0.18 0.31 0.32
Percentage of exclusive (specific) mutation in patient 5
Primary.local.recurrence
Skin1.R1
Skin2.R1
Skin2.R1
Skin1.R1
Primary.local.recurrence
0.17 0.19 1
0.44 1 0.19
1 0.44 0.17
Percentage of shared mutations in patient 7
Primary.local.recurrence
Skin1.R1
Skin2.R1
Skin2.R1
Skin1.R1
Primary.local.recurrence
0.54 0.31 0
0.5 0 0.79
0 0.22 0.78
Percentage of exclusive (specific) mutation in patient 7
Percentage of of mutations present in sample on y-axis but absent in sample on x-axis
Percentage of of mutations present in sample on y-axis but absent in sample on x-axis
Supplementary igure 5
ALN
Bone
Primary
Skin
Skin
Primary
Bone
ALN
0.41 0.67 0.41 1
0.58 0.38 1 0.41
0.41 1 0.38 0.67
1 0.41 0.58 0.41
Percentage of shared mutations in patient 8
ALN
Bone
Primary
Skin
Skin
Primary
Bone
ALN
0.54 0.22 0.51 0
0.32 0.35 0 0.29
0.53 0 0.52 0.16
0 0.26 0.21 0.22
Percentage of exclusive (specific) mutation in patient 8
Brain1.R1
Brain2.R1
Brain3.R1
Primary
Primary
Brain3.R1
Brain2.R1
Brain1.R1
0.66 0.65 0.63 1
0.76 0.78 1 0.63
0.8 1 0.78 0.65
1 0.8 0.76 0.66
Percentage of shared mutations in patient 9
Brain1.R1
Brain2.R1
Brain3.R1
Primary
Primary
Brain3.R1
Brain2.R1
Brain1.R1
0.1 0.1 0.12 0
0.14 0.12 0 0.31
0.11 0 0.12 0.3
0 0.11 0.13 0.29
Percentage of exclusive (specific) mutation in patient 9
Percentage of of mutations present in sample on y-axis but absent in sample on x-axis
Percentage of of mutations present in sample on y-axis but absent in sample on x-axis
Supplementary igure 5
ALN1
ALN2
Primary1
Primary2
Skin1.R2
Skin2.R2
Skin3.R2
Skin3.R2
Skin2.R2
Skin1.R2
Primary2
Primary1
ALN2
ALN1
0.15 0.21 0.22 0.2 0.4 0.37 1
0.18 0.2 0.26 0.23 0.37 1 0.37
0.2 0.19 0.26 0.23 1 0.37 0.4
0.27 0.28 0.3 1 0.23 0.23 0.2
0.2 0.27 1 0.3 0.26 0.26 0.22
0.17 1 0.27 0.28 0.19 0.2 0.21
1 0.17 0.2 0.27 0.2 0.18 0.15
Percentage of shared mutations in patient 10
ALN1
ALN2
Primary1
Primary2
Skin1.R2
Skin2.R2
Skin3.R2
Skin3.R2
Skin2.R2
Skin1.R2
Primary2
Primary1
ALN2
ALN1
0.79 0.74 0.74 0.7 0.4 0.45 0
0.75 0.75 0.7 0.66 0.45 0 0.48
0.74 0.76 0.7 0.67 0 0.48 0.45
0.62 0.64 0.64 0 0.56 0.58 0.64
0.61 0.56 0 0.36 0.31 0.33 0.44
0.68 0 0.6 0.42 0.5 0.5 0.5
0 0.73 0.71 0.5 0.54 0.58 0.67
Percentage of exclusive (specific) mutation in patient 10
Brain1.R1
Brain2.R1
Primary1
Primary2
Primary3
Primary4
Primary5
Primary5
Primary4
Primary3
Primary2
Primary1
Brain2.R1
Brain1.R1
0.14 0.23 0.22 0.23 0.26 0.26 1
0.16 0.28 0.26 0.29 0.3 1 0.26
0.11 0.4 0.26 0.26 1 0.3 0.26
0.21 0.22 0.3 1 0.26 0.29 0.23
0.24 0.21 1 0.3 0.26 0.26 0.22
0.18 1 0.21 0.22 0.4 0.28 0.23
1 0.18 0.24 0.21 0.11 0.16 0.14
Percentage of shared mutations in patient 11
Brain1.R1
Brain2.R1
Primary1
Primary2
Primary3
Primary4
Primary5
Primary5
Primary4
Primary3
Primary2
Primary1
Brain2.R1
Brain1.R1
0.78 0.32 0.65 0.61 0.32 0.54 0
0.78 0.32 0.63 0.57 0.34 0 0.63
0.87 0.4 0.71 0.69 0 0.65 0.71
0.71 0.4 0.58 0 0.36 0.53 0.65
0.66 0.36 0 0.51 0.3 0.52 0.64
0.81 0 0.77 0.74 0.46 0.68 0.74
0 0.32 0.56 0.56 0.59 0.62 0.71
Percentage of exclusive (specific) mutation in patient 11
Percentage of of mutations present in sample on y-axis but absent in sample on x-axis
Percentage of of mutations present in sample on y-axis but absent in sample on x-axis
J
Supplementary igure 5
ALN.relapse
Bone1.R1
Bone2.R1
Primary.local.recurrence.1
Primary.local.recurrence.2
Primary.local.recurrence.2
Primary.local.recurrence.1
Bone2.R1
Bone1.R1
ALN.relapse
0.37 0.31 0.29 0.37 1
0.31 0.27 0.22 1 0.37
0.48 0.62 1 0.22 0.29
0.5 1 0.62 0.27 0.31
1 0.5 0.48 0.31 0.37
Percentage of shared mutations in patient 13
ALN.relapse
Bone1.R1
Bone2.R1
Primary.local.recurrence.1
Primary.local.recurrence.2
Primary.local.recurrence.2
Primary.local.recurrence.1
Bone2.R1
Bone1.R1
ALN.relapse
0.38 0.47 0.56 0.43 0
0.49 0.56 0.66 0 0.49
0.24 0.14 0 0.61 0.54
0.31 0 0.31 0.59 0.56
0 0.35 0.43 0.56 0.52
Percentage of exclusive (specific) mutation in patient 13
ALN
Brain1.R1
Brain2.R1
Primary
Primary
Brain2.R1
Brain1.R1
ALN
0.24 0.59 0.55 1
0.2 0.64 1 0.55
0.22 1 0.64 0.59
1 0.22 0.2 0.24
Percentage of shared mutations in patient 14
ALN
Brain1.R1
Brain2.R1
Primary
Primary
Brain2.R1
Brain1.R1
ALN
0.63 0.2 0.23 0
0.71 0.22 0 0.35
0.67 0 0.21 0.31
0 0.59 0.62 0.6
Percentage of exclusive (specific) mutation in patient 14
Percentage of of mutations present in sample on y-axis but absent in sample on x-axis
Percentage of of mutations present in sample on y-axis but absent in sample on x-axis
K
Supplementary igure 5
ALN
Liver.R
1
Primary
Primary
Liver.R1
ALN
0.29 0.12 1
0.1 1 0.12
1 0.1 0.29
Percentage of shared mutations in patient 15
ALN
Liver.R
1
Primary
Primary
Liver.R1
ALN
0.6 0.65 0
0.88 0 0.85
0 0.65 0.49
Percentage of exclusive (specific) mutation in patient 15
Primary.local.recurrence.1
Primary.local.recurrence.2
Primary.local.recurrence.3
Skin1.R1
Skin2.R1
Skin3.R1
Skin3.R1
Skin2.R1
Skin1.R1
Primary.local.recurrence.3
Primary.local.recurrence.2
Primary.local.recurrence.1
0.08 0.09 0.08 0.45 0.44 1
0.12 0.12 0.13 0.43 1 0.44
0.09 0.1 0.08 1 0.43 0.45
0.4 0.45 1 0.08 0.13 0.08
0.41 1 0.45 0.1 0.12 0.09
1 0.41 0.4 0.09 0.12 0.08
Percentage of shared mutations in patient 16
Primary.local.recurrence.1
Primary.local.recurrence.2
Primary.local.recurrence.3
Skin1.R1
Skin2.R1
Skin3.R1
Skin3.R1
Skin2.R1
Skin1.R1
Primary.local.recurrence.3
Primary.local.recurrence.2
Primary.local.recurrence.1
0.68 0.55 0.58 0.43 0.32 0
0.57 0.49 0.45 0.49 0 0.44
0.57 0.42 0.5 0 0.27 0.32
0.49 0.39 0 0.91 0.86 0.91
0.48 0 0.37 0.89 0.86 0.9
0 0.35 0.34 0.9 0.86 0.91
Percentage of exclusive (specific) mutation in patient 16
Percentage of of mutations present in sample on y-axis but absent in sample on x-axis
Percentage of of mutations present in sample on y-axis but absent in sample on x-axis
Supplementary igure 5
ALN1
ALN2
Brain.R1
Primary1
Primary2
Primary2
Primary1
Brain.R1
ALN2
ALN1
0.25 0.27 0.04 0.26 1
0.22 0.22 0.04 1 0.26
0.04 0.04 1 0.04 0.04
0.52 1 0.04 0.22 0.27
1 0.52 0.04 0.22 0.25
Percentage of shared mutations in patient 17
ALN1
ALN2
Brain.R1
Primary1
Primary2
Primary2
Primary1
Brain.R1
ALN2
ALN1
0.61 0.62 0.86 0.63 0
0.61 0.65 0.87 0 0.54
0.95 0.95 0 0.96 0.94
0.24 0 0.84 0.64 0.52
0 0.37 0.88 0.68 0.6
Percentage of exclusive (specific) mutation in patient 17
ALN1
ALN2
Primary1
Primary2
Skin1.R2
Skin2.R2
Skin2.R2
Skin1.R2
Primary2
Primary1
ALN2
ALN1
0.26 0.14 0.2 0.26 0.73 1
0.28 0.15 0.21 0.26 1 0.73
0.25 0.35 0.3 1 0.26 0.26
0.26 0.32 1 0.3 0.21 0.2
0.23 1 0.32 0.35 0.15 0.14
1 0.23 0.26 0.25 0.28 0.26
Percentage of shared mutations in patient 18
ALN1
ALN2
Primary1
Primary2
Skin1.R2
Skin2.R2
Skin2.R2
Skin1.R2
Primary2
Primary1
ALN2
ALN1
0.61 0.64 0.65 0.39 0.14 0
0.6 0.62 0.64 0.4 0 0.17
0.7 0.47 0.64 0 0.68 0.68
0.62 0.31 0 0.35 0.66 0.67
0.73 0 0.63 0.5 0.81 0.82
0 0.4 0.55 0.37 0.54 0.57
Percentage of exclusive (specific) mutation in patient 18
Percentage of of mutations present in sample on y-axis but absent in sample on x-axis
Percentage of of mutations present in sample on y-axis but absent in sample on x-axis
Supplementary igure 5
Brain1.R3
Brain1.R5
Brain2.R5
Primary
Primary
Brain2.R5
Brain1.R5
Brain1.R3
0.12 0.08 0.11 1
0.67 0.48 1 0.11
0.4 1 0.48 0.08
1 0.4 0.67 0.12
Percentage of shared mutations in patient 19
Brain1.R3
Brain1.R5
Brain2.R5
Primary
Primary
Brain2.R5
Brain1.R5
Brain1.R3
0.71 0.67 0.7 0
0.27 0.11 0 0.85
0.58 0 0.49 0.91
0 0.09 0.1 0.83
Percentage of exclusive (specific) mutation in patient 19
Ovary1.R2
Ovary2.R2
Ovary3.R2
Ovary4.R2
Ovary5.R2
Ovary6.R2
Ovary7.R2
Ovary8.R2
Ovary9.R2
Primary1
Primary2
Primary3
Primary4
Primary4
Primary3
Primary2
Primary1
Ovary9.R2
Ovary8.R2
Ovary7.R2
Ovary6.R2
Ovary5.R2
Ovary4.R2
Ovary3.R2
Ovary2.R2
Ovary1.R2
0.04 0.15 0.17 0.16 0.16 0.16 0.16 0.18 0.17 0.27 0.31 0.29 1
0.05 0.18 0.18 0.2 0.19 0.19 0.19 0.21 0.19 0.33 0.35 1 0.29
0.06 0.2 0.22 0.22 0.21 0.2 0.22 0.22 0.24 0.36 1 0.35 0.31
0.04 0.17 0.19 0.19 0.2 0.18 0.19 0.2 0.2 1 0.36 0.33 0.27
0.22 0.52 0.52 0.51 0.59 0.5 0.55 0.58 1 0.2 0.24 0.19 0.17
0.24 0.53 0.57 0.55 0.59 0.54 0.6 1 0.58 0.2 0.22 0.21 0.18
0.22 0.54 0.6 0.55 0.61 0.56 1 0.6 0.55 0.19 0.22 0.19 0.16
0.23 0.51 0.56 0.56 0.55 1 0.56 0.54 0.5 0.18 0.2 0.19 0.16
0.25 0.62 0.57 0.55 1 0.55 0.61 0.59 0.59 0.2 0.21 0.19 0.16
0.21 0.52 0.55 1 0.55 0.56 0.55 0.55 0.51 0.19 0.22 0.2 0.16
0.21 0.52 1 0.55 0.57 0.56 0.6 0.57 0.52 0.19 0.22 0.18 0.17
0.25 1 0.52 0.52 0.62 0.51 0.54 0.53 0.52 0.17 0.2 0.18 0.15
1 0.25 0.21 0.21 0.25 0.23 0.22 0.24 0.22 0.04 0.06 0.05 0.04
Percentage of shared mutations in patient 20
Ovary1.R2
Ovary2.R2
Ovary3.R2
Ovary4.R2
Ovary5.R2
Ovary6.R2
Ovary7.R2
Ovary8.R2
Ovary9.R2
Primary1
Primary2
Primary3
Primary4
Primary4
Primary3
Primary2
Primary1
Ovary9.R2
Ovary8.R2
Ovary7.R2
Ovary6.R2
Ovary5.R2
Ovary4.R2
Ovary3.R2
Ovary2.R2
Ovary1.R2
0.95 0.8 0.75 0.77 0.78 0.76 0.78 0.75 0.77 0.64 0.59 0.61 0
0.94 0.74 0.7 0.68 0.71 0.68 0.7 0.66 0.71 0.51 0.48 0 0.45
0.92 0.72 0.66 0.65 0.7 0.66 0.66 0.66 0.66 0.48 0 0.49 0.44
0.94 0.74 0.68 0.69 0.71 0.69 0.69 0.67 0.7 0 0.45 0.49 0.48
0.75 0.34 0.25 0.26 0.28 0.25 0.24 0.22 0 0.64 0.57 0.64 0.6
0.74 0.37 0.26 0.28 0.32 0.26 0.25 0 0.31 0.65 0.62 0.64 0.62
0.76 0.36 0.24 0.28 0.3 0.25 0 0.25 0.33 0.67 0.62 0.67 0.66
0.76 0.41 0.3 0.3 0.37 0 0.31 0.32 0.4 0.7 0.66 0.69 0.67
0.72 0.25 0.19 0.21 0 0.19 0.17 0.19 0.24 0.63 0.6 0.63 0.61
0.77 0.39 0.29 0 0.36 0.26 0.3 0.3 0.37 0.68 0.63 0.66 0.66
0.77 0.39 0 0.29 0.34 0.27 0.27 0.29 0.37 0.68 0.63 0.69 0.64
0.72 0 0.22 0.23 0.23 0.21 0.22 0.23 0.29 0.67 0.61 0.65 0.63
0 0.31 0.29 0.29 0.29 0.22 0.29 0.24 0.33 0.82 0.74 0.8 0.78
Percentage of exclusive (specific) mutation in patient 20
Percentage of of mutations present in sample on y-axis but absent in sample on x-axis
Percentage of of mutations present in sample on y-axis but absent in sample on x-axis
Supplementary igure 5
Patient 1: ER-/PR-/HER2-
Cluster color
Cluster ID Mutation count
Driver genes
Truncal 1 99 TP53;NCOA3
17 39 0 Primary 6 47 MAP3K6;KDM5C Metastasis 7 81 XPO1;MED12
12 29 0
8 33 0
CDK6
100
100
76
100Primary Lung2.R1
Lung1.R1
Liver2.R2 Liver1.R2
TP53NCOA3AKT3MYCHRASRB1STK11(102)
MAP3K6KDM5C
(77)
(38)
(29)
MED12(59)
(10)
XPO1
(36)
(37)
(36)
(19 mo)
(19 mo)
(49 mo) (49 mo)
ERBB2MCL1MDM4RELCCNE1MDM2AKT1IGF1RPIK3CAPDGFRATERTEGFRBRCA1CDKN1BPBRM1PTEN
CCNE1FGFR3ARID1ABAP1PBRM1
ERBB2
MCL1MDM4MYCNRELARID1ACDKN1BBAP1PBRM1
Germline
A
Primary
Lung2.R1
Liver2.R2
Liver1.R2
Phylogenetic Tree
Cluster Table
Density Plot
u le entary igure
Patient 2: ER-/PR-/HER2-
Cluster color
Cluster ID Mutation count
Driver genes
Truncal
1 21 TP53
3 60 BAP1
7 190 ERBB3;NF1;EPHA5;RRM2B
10 (except lymph) 31 0 Primary & Lymph
6 10 0
Lymph 2 193 KAT6B;MED12Metastasis 9 53 0
Colon2.R1
Germline
Primary
ALN
Colon1.R1
(78)
BAP1RRM2B
(84)
ERBB3(89)
KAT6BERBB3MED12MCL1PDGFRAMYC
(233)
(61)
RRM2B
(75)
(36)
100
100
100
TP53
ERBB3
ZNF217AURKA
BAP1EPHAS
NF1
(57 mo)
(57 mo)
NOTCH3CCNE1AKT2PIK3CA
B
Primary
ALN
Colon2.R1
Colon1.R1
Phylogenetic Tree
Cluster Table
Density Plot
u le entary igure
Patient 3: ER-/PR-/HER2+
Cluster color
Cluster ID Mutation count
Driver genes
Truncal 1 13 0 Primary & Lymph
2 59 FANCA
Primary & Metastasis
3 25 PIK3CA;FGFR4
Lymph & Metastasis
4 55
100
100
Primary
ALN
Bone.R1
(20)
FANCA(33)
PIK3CAFGFR4TERTMYC
(32)
(47)(29)
Germline
(11 mo)
ERBB2
AKT3CCND1MDM2NSD1ARID1ARB1BRCA1BAP1PBRM1CDKN2ACDKN2B
0
C
Primary
ALN
Bone.R1
Phylogenetic Tree
Cluster Table
Density Plot
u le entary igure
D
Colon.R4
Primary
1
Uterus.R2
Primary5Prim
ary2
Primary
4Primary
3
Primary6
Brain.R3
Germline
JAK1
ERBB2
(58)
DNMT3A
BRCA2
IGF1R
ZNF217
AURKA
NOTCH3
CCNE1
BRCA1
BAP1
PBRM1
CDKN2A
CDKN2B
(103)
DDX11
(3)(2)
(0)
(115)PBRM1
MDM4
AKT3
IGF1R
MYCN
ZNF217
AURKA
TERT
MYC
ARID1A
RB1
BRCA1
STK11
BAP1
PBRM1
CDKN2A
CDK2B
HRAS
(118)
(55) (38)
(42)
(11)
DDR2
BRCA2
(60)
(79)
(17)
(94)
(7)
(54)
100
100
26 79
39
42
6473
(74 mo)
(93 mo)
EPHA3
ARID1A
RB1
BRCA1
STK11
BAP1
PBRM1
CDKN2A
CDKN2B
HRAS
PTEN
Cluster color
Cluster ID Mutation count
Driver genes
Truncal
20 15
18 47 DNMT3A
3 70 JAK1
12 (except Brain.R3) 29 0
Primary
1 / P6,4,2,3,5 15 0
6 / P1 44 PBRM1
8 / P2 27 EPHA3
15 / P4 47 PLCG1
9 / P6 24 0
Primary 16 & Uterus.R2
14 / P5, Uterus.R2 79 BRCA2;DDR2;ROS1;KDM6A
Metastasis 4 / Colon 45 BRCA2
5 / Brain 57
0
0
AKT3
TERT
Patient 4: ER-/PR-/HER2+
(97 mo)
AKT1
IGF1R
PAX8
MYCN
NSD1
NOTCH3
CCNE1
AKT2
EGFRDNMT3A
FGFR2
FGFR3
KDM6A
MDM4
ZNF217
AURKA
ARID1A
RB1
BRCA1
STK11
BAP1
PBRM1
CDKN2A
CDKN2B
HRAS
DNMT3A
ROS1
MDM4
AKT3
AKT1
IGF1R
PAX8
MYCN
TERT
NSD1
NOTCH3
CCNE1
EGFR
CDK6
MYC
ARID1A
RB1
BRCA1
STK11
BAP1
PBRM1
CDKN2A
CDKN2B
DNMT3A
MDM4
AKT3
AKT1
PAX8
TERT
NSD1
NOTCH3
CCNE1
AKT2
EGFR
CDK6
ARID1A
RB1
BRCA1
STK11
BAP1
PBRM1
CDKN2A
CDKN2B
HRAS
MDM4
AKT3AKT1
IGF1R
PAX8
MYCN
ZNF217
AURKA
TERT
NSD1
ARID1A
RB1
BRCA1
STK11
BAP1
PBRM1
CDKN2A
CDKN2B
Uterus.R2
Primary5
Primary1
Primary3
Primary2
Primary4
Primary6
Colon.R4
Brain.R3
Phylogenetic Tree
Cluster Table
Density Plot
u le entary igure
Patient 5: ER+/PR+/HER2-
Cluster color
Cluster ID Mutation count
Driver genes
Truncal 2 12 0
3 121 GATA1
9 30 PPM1D
Primary 8 10 0 11 10 0 12 18
Metastasis 16 34 0
1 18 0
15 24 0
0
10066
100
Primary1
Primary2Bone.R2 Bone.R1
(124)PPM1DCCND1IGF1RZNF217AURKAERLIN2TERTARID1APTENBRCA1
(51)
(53)
(10)
(38)
PPM1DMDM4CCND1IGF1RZNF217AURKAESR1(38)
(32)
GATA1CDK4MDM2PIK3CA
MST1SLIT2CCND1ZNF217AURKAERLIN2CCNE1ARID1APTENBRCA1BAP1PBRM1CDKN2ACDKN2BCDKN1B
Germline
(49 mo) (31 mo)
MCL1
ERLIN2ARID1APTENBRCA1BAP1PBRM1CDKN2ACDKN2B
E
Primary1
Primary2
Bone.R2
Bone.R1
Phylogenetic Tree
Cluster Table
Density Plot
u le entary igure
Patient 7: ER+/PR-/HER2- (Skin Metastasis IHC, Primary Tumor data NA)
Cluster color
Cluster ID Mutation count
Driver genes
Truncal 9 38 TP53
28 134 Primary 2 16 0
0
100
100
Primary local recurrence
Skin1.R1
Skin2.R1
(21)
(80)
(29)
(103)
(41)
Germline
(64 mo)
(64 mo)
MYC
MCL1NKX2-1FGFR3PDGFRABAP1PBRM1CDKN2ACDKN2B
CCND1ZNF217
(54 mo)
F
Primary local recurrence
Skin2.R1
Skin1.R1
Phylogenetic Tree
Cluster Table
Density Plot
u le entary igure
Patient 8: ER+/PR+/HER2- (Based on IHC on axillary Lymph and Bone metastasis)
Cluster color
Cluster ID Mutation count
Driver genes
Truncal 6 86 0Primary 14 41 0Lymph 15 18 0Metastasis 4 57 EP400;STAT3
2 40 PLCG1
1 27 NOTCH3
ALN
100
100
56
(82)EP400STAT3AKT3MDM2NKX2-1MYCNCDKN2ACDKN2B
(71)
NOTCH3NOTCH3MDM4CCND1IGF1RPIK3CATERTPTENRB1(29)
Primary
Local regionalSkin.R1
Bone.R2(41) (23)
PLCG1CDKN1B
(44)
(13)
Germline
(49 mo)
(75 mo)
ERBB2ZNF217AURKAMYCHRASSTK11
MDM4CCND1IGF1R
AKT3MDM2TERTCDK4NOTCH3AKT2NFE2L2EGFRFGFR1PTENRB1CDKN1B
MYCN
G
Primary
ALN
Bone.R2
Local regional
Skin.R1
Phylogenetic Tree
Cluster Table
Density Plot
u le entary igure
Patient 9: ER-/PR-/HER2-
Germline
(26 mo)
(26 mo)(26 mo)
100
100
84Primary
Brain1.R1
Brain2.R1Brain3.R1
HSP90AB1KRASNKX2-1NOTCH3ZNF217AURKACCND3MYCCDKN1BBAP1PBRM1
(262)
NTRK3PTEN
(63)
(45)
(19)
(48)
MST1AKT3(46)
(62)
PPM1DNF2
MCL1IGF1RHRAS
MCL1IGF1RNSD1
Cluster color
Cluster ID Mutation count
Driver genes
Truncal 2 166 0
8 74 HSP90AB1
22 37 0
Primary 1 31 0
Metastasis
30 53 PPM1D;NF2
29 34 NTRK3
16 23 MST1
17 13 0
19 51 0
H
Primary
Brain1.R1
Brain2.R1
Brain3.R1
Phylogenetic Tree
Cluster Table
Density Plot
u le entary igure
Patient 10: ER-/PR-/HER2-
100
100
71
59
4438
Primary1
Primary2
ALN2
Skin1.R2Skin2.R2
Skin3.R2
ALN1
INSRFGFR2CCND1CDK4MDM2
(59)
(0)
(22)
(9)(27)
DDX3XCDK4MCL1PDGFRAEGFRCDK6ERLIN2WHSC1L1FGFR1
(48)
FANCD2
PHLPP2
(39)
(2) (5)
(30)
(6)
(34)
(43)
CCND1
FGFR2ESR1MYC1BRCA1
FGFR2
BRCA1
Germ
line
(54 mo)
(54 mo)
(54 mo)
Cluster color
Cluster ID Mutation count
Driver genes
Truncal 8 20 0
9 (except primary1)
37 FANCD2
Primary 13 55 INSR
Metastasis
2 11 0
12 36 DDX3X
7 27 PHLPP2
6 19 0
I
Primary1
Primary2
ALN2
Skin3.R2
Skin2.R2
Skin1.R2
Phylogenetic Tree Cluster Table
Density Plot
u le entary igure
Patient 11: ER-/PR-/HER2-
(24 mo)
(24 mo)
Germline
Primary1
Primary2
Primary3
Primary4
Primary5
Brain2.R1
Brain1.R1
CARD11TP53MYC
(40)
(255)
(719)
NOTCH2DDR2MST1ARID1BMCL1AKT3CCNE1AKT2CCND3MDM4AKT1IGF1RNOTCH3NFE2L2TERTEGFRPTENHRASBRCA1BAP1PBRM1
(187)
(616)
(241)
BCL2L1PTENHRASBRCA1STK11BAP1PBRM1
(164)
(19)
(294)
(4)
(12)
(32)
(36)
100
100
55
56
42
59
MST1CDK6MCL1AKT3MDM4TERTPAX8BRCA1STK11
NOTCH2DDR2MST1ARID1BMCL1AKT3CCNE1AKT2CCND3PTENHRASRB1BRCA1STK11BAP1PBRM1
MCL1AKT3AKT1IGF1RTERTPAX8PTENHRASRB1BRCA1STK11BAP1PBRM1
MCL1
CCNE1TERT
MCL1AKT3AKT2MDM4AKT1IGF1RPAX8KRAS
Cluster color
Cluster ID Mutation count
Driver genes
Truncal
3 32 CARD11
9 36 0
24 (expect Brain1.R1)
484 0
Primary 26 26 CDK6 Primary & Metastasis
22 471 BCL2L1
Metastasis 19 67 NOTCH2;DDR2
16 27 0
J
Brain2.R1
Brain1.R1
Primary1
Primary5
Primary3
Primary4
Primary2
Phylogenetic Tree
Cluster Table
Density Plot
u le entary igure
Patient 13: ER+/PR+/HER2- (Based on Bone Metastasis IHC, Primary Tumor data NA)
Germline
Primary local recurrence 1
Local regional axillary Lymph relapse
Bone2.R1
Bone1.R1
(51) (20)
(151)
(112)
(110)
TSC1(24)
(69)ESR1FLT1(54)
(25)
100
7873
78
Primary local recurrence 2
EGFRBAP1
(84 mo)
(84 mo)
(≈ 36 mo)
(156 mo)
(156 mo)
Cluster color
Cluster ID Mutation count
Driver genes
Truncal 2 52 0
1 14 0
13 128 0
Primary local reccurence
11 59 NF110 36 0
Lymph relapse & Metastasis
7 56 0
8 65 FLT1;ESR1;TSC1
MDM4ERBB2MYCERLIN2WHSC1L1FGFR1
EGFRKRASIGF1RNOTCH3CCNE1CDK6BAP1
MCL1
MCL1
STK11BAP1
K
Primary local recurrence 1
Primary local
recurrence 2
Local regional axillary Lymph relapse
Bone2.R1
Bone1.R1
Phylogenetic Tree Cluster Table
Density Plot
u le entary igure
Patient 14: ER+/PR+/HER2+
Cluster color
Cluster ID Mutation count
Driver genes
Truncal
2 42 0
13 45 0
7 (except lymph) 94 0Primary 10 24 0Lymph 6 85 CCNE1 Metastasis 11 63 0
L
Primary
ALN
Brain2.R1
Brain1.R1
Phylogenetic Tree
Cluster Table
Density Plot
Germline
Primary
Brain1.R1 Brain2.R1
ALN
(49)
(41)
(76)
(50)
(109)
(33)
(47)
100
100
100
(26 mo)(26 mo)
AKT3ERBB2TERT
NOTCH3BRCA1RB1STK11CDKN2ACDKN2B
MDM4CCND1KRASZNF217AURKA
EGFRPAX8
MYCBRCA1
EGFRBRCA1
HRASBAP1
u le entary igure
Patient 15: ER+/PR+/HER2-
Cluster color
Cluster ID Mutation count
Driver genes
Truncal 5 27 0 Primary 9 62 IGF2R
Metastasis 18 137 TP53;IKBKE
20 49 0
M
Primary
Liver.R1
Primary
Germline
ALN
Liver.R1
(20)
(20)
IKBKETP53MDM4AKT3TERTEGFRARID1APTENCDKN1BRB1BRCA1STK11
(202)
IGF2RERBB2
(57)
MAP3K4
(36)
77
100
(116 mo)
ERBB2
C
D
F
Germline
Skin1.R1
Skin2.R1
Skin3.R1
Primary local recurrence 3
Primary local recurrence 2
Primary local recurrence 1
TP53ERBB2ERLIN2
(23)WHSC1L1AKT1BRCA1
(86)
(146)
(218)
WHSC1L1
(126)
WHSC1L1FGFR3AKT3AKT1ERBB2TERTEGFRMYCHRASCDKN1BBRCA1STK11CDKN2ACDKN2B
(93)
(2)
(6)
RPS6KB1PAK3
(172)
(239)
WHSC1L1FGFR3
(200)
88
100100
100
100
RPS6KB1
FGFR3
(16 mo)
(16 mo)
(16 mo)
Patient 16: ER-/PR-/HER2+
Cluster color
Cluster ID Mutation count
Driver genes
Truncal 11 51 TP53
6 (except Skin3) 45 RPS6KB1Primary local recurrence 1 727 PAK3
Metastasis 14 50 RPS6KB1
(12 mo)
(12 mo)
(12 mo)
N
Skin3.R1
Skin2.R1
Skin1.R1
Primary local recurrence 3
Primary local recurrence 2
Primary local
recurrence 1
C
D
F
Patient 17: ER+/PR+/HER2-
Germline
Primary1
Primary2
ALN2
ALN1
Brain.R1
(244)
ARID1AMCL1DDR2PIK3CAKDM6AAKT3AKT1NKX2-1CCNE1NFE2L2ZNF217AURKACCND3CDK6JAK2CDKN1BBRCA1
(22)
(59)
(1)
DDX3XATRXNSD1ERLIN2WHSC1L1FGR1
(71)(9)
ARID1A(31)
PLCG1NSD1HRASRB1CDKN2ACDKN2B
(36)
(19)
100100
46
100
(75 mo)
Cluster color
Cluster ID Mutation count
Driver genes
Truncal 12 27 0
2 (except Brain) 27 0 Primary 9 48 0
5 53 DDR2;DDX3X;ATRXLymph 8 52 ARID1A;PLCG1 Metastasis 4 217 MCL1;DDR2;ARID1A;PIK3CA;
KDM6A
MCL1MDM4
CCND1MYCERLIN2
WHSC1L1FGFR1
AKT3
O
Primary1
Primary2
ALN2
ALN1
Brain.R1
Phylogenetic Tree
Cluster TableDensity Plot
u le entary igure
Patient 18: ER+/PR+/HER2-
ABL2TET2FGFR3MYCERLIN2WHSC1L1FGFR1AKT3MCL1MDM4TERT
(62)
PIK3R1ERLIN2AKT3RB1
(94)
NOTCH2PTENRB1ARID1ACDKN2ACDKN2B
(19)
(15)(61)
(55)
PTENATRX
(28) (0)
JAK1NOTCH2ERBB2FGFR3FGFR2CCNE1AKT2MYCERLIN2WHSC1L1FGFR1PTENSTK11RB1
(67)
MAP3K9ERBB2FGFR3(107)
(25)
Germline
ALN1
ALN2Primary2
Primary1
Skin1.R2 Skin2.R2
86
100100
99
78MAP3K9INSR
(87 mo)(87 mo)
Cluster color
Cluster ID Mutation count
Driver genes
Truncal 3 33 PTEN;ATRXPrimary 15 56 INSR;PIK3R1Primary & Lymph 2 14 197 MAP3K9Lymph1 13 23 JAK1 Skin Metastasis & Primary 2
10 63 ABL2;TET2
STK11
PTEN
IGF1RZNF217AURKAMITFNSD1
P
Primary1
Primary2
ALN1
ALN2
Skin1.R2
Skin2.R2
Phylogenetic Tree
Cluster TableDensity Plot
u le entary igure
Patient 19: ER+/PR+/HER2+
100
100
100
Primary
Brain.R3
Brain1.R5
Brain2.R5
(23)
GNASSTK11
(71)
(36)
(22)
PALB2MCL1MDM4AKT2TERTMYCARID1APTENHRASCDKN1BBRCA1BAP1PBRM1CDKN2ACDKN2B
(113)
(20)
(170)
TP53ERBB2
CCND3AKT3ZNF217AURKAJUNNFE2L2
AKT3ZNF217AURKASTK11
Germline
(88 mo)
Cluster color
Cluster ID Mutation count
Driver genes
Truncal 36 20 0 Primary 40 75 GNASMetastasis 9 100 PALB2;TP53
6 12 0
28 18 0
20 27 0
18 203 0
(73 mo)
(88 mo)
Q
Primary
Brain1.R5
Brain2.R5
Brain.R3
Phylogenetic Tree Cluster Table
Density Plot
u le entary igure
(50)SF3B1MDM4CCND1TERTAKT3
(103)
SF3B1MDM4CCND1IGF1RERLIN2MCL1TERTRB1CDKN2ACDKN2B
(83)
(47)
MAP3K6(48)SF3B1
MDM4CCND1IGF1RERLIN2MCL1AKT1TERTRB1CDKN2ACDKN2B
(66)
(34) (19)
(37)
(51)
(1) (22) (6) (2) (5) SF3B1MCL1(13)
(11)
(34)(72)
CICKDM5CMDM4CCND1IGF1RNKX2-1RB1
(142)
(110)
CICCCND1
(114)
(71)
(29)
(45)
Germ
line
Ovary1.R2
Ovary2.R2
Ovary4.R2
Ovary3.R2Ova
ry5.R2
Ovary6
.R2
Ovary7
.R2 Ovary8.R
2
Ovary9.R
2 Primary1
Primary2 Primary4
Primary3
100
63
2267
49
100 16 6
22 10053
37
SF3B1IGF1RERLIN2MYCN
EGFRMYC
CDKN1B
Patient 20: ER+/PR+/HER2-
Cluster color
Cluster ID Mutation count
Driver genes
Truncal
8 19 0
1 (except ovary1.R2) 43 SF3B1
17 (except primary1) 17 0 Primary 27 316 CIC; KDM5C
Primary & Metastasis
14 (In primary3 & all ovary metastasis)
31 0
Metastasis 5 (except ovary1.R2) 30 0
20 28 0
21 26 0
24 24 MAP3K6
23 24 0
29 36 0
KRAS
CCND1NKX2-1RB1CDKN2ACDKN2B
CDK4CDKN2ACDKN2B
MDM4CCND1RB1CDKN2ACDKN2B
R
Primary3
Primary1
Primary2
Primary4
Ovary1.R2
Ovary3.R2
Ovary9.R2
Ovary7.R2
Ovary4.R2
Ovary2.R2
Ovary6.R2
Ovary8.R2
Phylogenetic Tree
Cluster Table
Density Plot
u le entary igure
0
2
4
2 4 6 8 10Number of Signatures
RSS
0.0
0.2
0.4
0.6
2 4 6 8 10Number of Signatures
Expl
aine
dVar
ianc
e
A
S1 S2 S3 S4
S1AS1BS2S3S4S5S6S7S8S9S10S11S12S13S14S15S16S17S18S19S20S21SR1SR2SR3SU1SU2
0.1
0.2
0.3
0.4
0.5 S1
S2
S3
S4
T>GT>CT>AC>TC>GC>A
T>GT>CT>AC>TC>GC>A
T>GT>CT>AC>TC>GC>A
T>GT>CT>AC>TC>GC>A
0.0
0.1
0.2
0.3
value
subs
titut
ion
strandUntranscribed
Transcribed
B
Supplementary Figure
Supplementary Figure
Supplementary Figure
Supplementary Figure
Supplementary Figure
Supplementary Figure
GL
74
52
47
8 2
16
Lung2.R1
0.09
Lung1.R1
0.17 20 29
Liver2.R2
0.11
Liver1.R2
0.13
Primary
0.17
Supplementary Figure
GL
133
36
Colon1.R1
0.0848
208
Primary
0.11
ALN
0.11
Colon2.R1
0.13
Supplementary Figure
GL
14
38
14
21 33
Primary
0.1
Bone.R1
0.12
ALN
0.16
Supplementary Figure
GL
67
18
19 26
36
22
50
23
46 20
46
Brain.R3
0.07
Primary3
0.08
Primary5
0.08
Uterus.R2
0.11
Primary4
0.07
Primary1
0.16
Primary2
0.15
Colon.R4
0.17
Primary6
0.12
Supplementary Figure
GL
137
14 36
9 19
Primary2
0.13
Primary1
0.12
21 17
Bone.R2
0.09
Bone.R1
0.14
Supplementary Figure
GL
19 73
90
Skin2.R1
0.09 15
Skin1.R1
0.15
Primary.local.recurrence
0.140.14
Supplementary Figure
GL
65
40 17
53
Primary
0.08
ALN
0.13 27 36
Bone
0.26
Skin
0.11
Supplementary Figure
GL
247
30
50
Primary
0.09
14
30
41 22
Brain3.R1
0.1
Brain2.R1
0.11
Brain1.R1
0.11
Supplementary Figure
GL
1118
3820 28133 20 6
Skin3.R2
0.1
Skin2.R2
0.12
Skin1.R2
0.1
ALN1
0.14
ALN2
0.17
Primary2
0.1
Primary1
0.09 0.110.19 0.1
Supplementary Figure
GL
53
31
32
Primary1
0.5
8844
34
64
65
Primary4
0.11
Brain2.R1
0.11
Primary5
0.17
Primary3
0.09
Primary2
0.130
199
0.14
Brain1.R1
0.21
0.141080.12
0.14
Supplementary Figure
GL
121
94 137
ALN
0.14
Primary
0.14
Supplementary Figure
GL
53
27
12
68
34
17
Primary.local.recurrence.2
0.1 16 49
12
23
Bone2.R1
0.11
Bone1.R1
0.11 0.09
Primary.local.recurrence.1
0.11
ALN.relapse
0.13
0.13 0.12
Supplementary Figure
GL
25
101
84
25
24
Primary
0.11 18 29
ALN
0.15
Brain1.R1
0.1
Brain2.R1
0.13
Supplementary Figure
GL
176
24
Liver.R1
0.3 26440.1
ALN
0.18
Primary
0.11
Supplementary Figure
GL
268
Skin3.R1
0.5
Skin2.R1
0.5
Skin1.R1
0.5149
Primary.local.recurrence.1
0.11
75
Primary.local.recurrence.2
0.13
Primary.local.recurrence.3
0.16
Supplementary Figure
GL
203
23
Brain.R1
0.3
35
56 38
15 5
ALN1
0.14
ALN2
0.12
Primary2
0.11
Primary1
0.1
Supplementary Figure
GL
16
20
25 20
ALN1
0.15
15 47 34
Primary2
0.13
12
ALN2
0.09
Primary1
0.18
53
Skin2.R2
0.13
Skin1.R2
0.18
73 34
0.13 0.16
Supplementary Figure
GL
20
65
111
31
8 209
Brain2.R5
0.08
Brain1.R5
0.08
Primary
0.11 10
Brain1.R3
0.07
Supplementary Figure
GL
27
Ovary1.R2
0.5
21
25
41 27
17
Primary1
0.11
Primary3
0.14 23 88 22 22 19 25 24
Ovary2.R2
0.09
Ovary5.R2
0.12
Ovary9.R2
0.11
Primary2
0.11
Primary4
0.12
Ovary3.R2
0.23
Ovary6.R2
0.19
Ovary8.R2
0.13
Ovary4.R2
0.12
Ovary7.R2
0.16
Supplementary Figure
Mutation map for Patient 1PrimaryLung1.R1Lung2.R1Liver1.R2Liver2.R2
Supplementary Figure 1
Mutation map for Patient 2PrimaryALNColon1.R1Colon2.R1
Supplementary Figure 1
Mutation map for Patient 3PrimaryALNBone.R1
Supplementary Figure 1
Mutation map for Patient 4Primary1Primary2Primary3Primary4Primary5Primary6Uterus.R2Brain.R3Colon.R4
Supplementary Figure 1
Mutation map for Patient 5Primary1Primary2Bone.R1Bone.R2
Supplementary Figure 1
Mutation map for Patient 7Primary.local.recurrenceSkin1.R1Skin2.R1
Supplementary Figure 1
Mutation map for Patient 8PrimaryALNBoneSkin
Supplementary Figure 1
Mutation map for Patient 9PrimaryBrain1.R1Brain2.R1Brain3.R1
Supplementary Figure 1
Mutation map for Patient 10Primary1Primary2ALN1ALN2Skin1.R2Skin2.R2Skin3.R2
Supplementary Figure 1
Mutation map for Patient 11Primary1Primary2Primary3Primary4Primary5Brain1.R1Brain2.R1
Supplementary Figure 1
Mutation map for Patient 13Primary.local.recurrence.1Primary.local.recurrence.2ALN.relapseBone1.R1Bone2.R1
Supplementary Figure 1
Mutation map for Patient 14PrimaryALNBrain1.R1Brain2.R1
Supplementary Figure 1
Mutation map for Patient 15PrimaryALNLiver.R1
Supplementary Figure 1
Mutation map for Patient 16Primary.local.recurrence.1Primary.local.recurrence.2Primary.local.recurrence.3Skin1.R1Skin2.R1Skin3.R1
Supplementary Figure 1
Mutation map for Patient 17Primary1Primary2ALN1ALN2Brain.R1
Supplementary Figure 1
Mutation map for Patient 18Primary1Primary2ALN1ALN2Skin1.R2Skin2.R2
Supplementary Figure 1
Mutation map for Patient 19PrimaryBrain1.R3Brain1.R5Brain2.R5
Supplementary Figure 1
Mutation map for Patient 20Primary1Primary2Primary3Primary4Ovary1.R2Ovary2.R2Ovary3.R2Ovary4.R2Ovary5.R2Ovary6.R2Ovary7.R2Ovary8.R2Ovary9.R2
Supplementary Figure 1
Validationofphylogenetictreesusingmoreconservativemutation-filteringcriteria
Forthephylogenetictreesreportedinthemanuscript(Figure2,3,4andSupplementaryFigure6),weusedmutationsobtainedusing
mutation-callingcriteriaasdescribedintheMethodssection.Next,wetestedtherobustnessoftheseresultsbyremovingmutations
affectedbyvariablecoverageand/ordifferenttumorpurityamongsamples.Forthis,weusedmutation-filteringcriteriaasdescribedin
“Validationofphylogenetictrees”subsectionoftheMethodsection.
Inthefollowingpages,wepresent,foreachpatient,aside-by-sidecomparisonofthetreereportedinthemanuscript(termedhereOldtree)vs.theonereconstructedusingthemoreconservativemutationselectioncriteria(termedhereNewtree).
Supplementary Figure 11
Primary
Lung2.R1
Lung1.R1
Liver2.R2
Liver1.R2
Germline
Patient 1: Old tree
Primary
Lung2.R1
Lung1.R1
Liver2.R2
Liver1.R2
Germline
Patient 1: New tree
Supplementary Figure 11
Colon2.R1
Primary
ALN
Colon1.R1
Germline
Patient 2: Old tree
Colon1.R1
Primary
ALN
Colon2.R1
Germline
Patient 2: New tree
Supplementary Figure 11
ALN
Primary
Bone.R1
Germline
Patient 3: Old tree
ALN
Primary
Bone.R1
Germline
Patient 3: New tree
Supplementary Figure 11
Colon.R4Primary1Primary2Primary3Primary4Uterus.R2Primary5Primary6Brain.R3Germline
Patient 4: Old tree
Uterus.R2Primary1Primary5Primary2Primary4Primary3Primary6Colon.R4Brain.R3Germline
Patient 4: New tree
Supplementary Figure 11
Primary1
Primary2
Bone.R2
Bone.R1
Germline
Patient 5: Old tree
Primary2
Primary1
Bone.R2
Bone.R1
Germline
Patient 5: New tree
Supplementary Figure 11
Skin1.R1
Skin2.R1
Primary.local.recurrence
Germline
Patient 7: Old tree
Skin2.R1
Skin1.R1
Primary.local.recurrence
Germline
Patient 7: New tree
Supplementary Figure 11
Primary
ALN
Skin
Bone
Germline
Patient 8: Old tree
ALN
Primary
Skin
Bone
Germline
Patient 8: New tree
Supplementary Figure 11
Primary
Brain1.R1
Brain2.R1
Brain3.R1
Germline
Patient 9: Old tree
Primary
Brain1.R1
Brain2.R1
Brain3.R1
Germline
Patient 9: New tree
Supplementary Figure 11
Primary1Skin1.R2Skin2.R2Skin3.R2Primary2ALN2ALN1Germline
Patient 10: Old tree
ALN2Primary1Skin2.R2Skin1.R2Skin3.R2Primary2ALN1Germline
Patient 10: New tree
Supplementary Figure 11
Brain1.R1Primary1Primary2Primary3Primary4Primary5Brain2.R1Germline
Patient 11: Old tree
Brain1.R1Primary1Primary2Primary4Primary3Primary5Brain2.R1Germline
Patient 11: New tree
Supplementary Figure 11
Primary.local.recurrence.1
Primary.local.recurrence.2
ALN.relapseBone2.R1
Bone1.R1
Germline
Patient 13: Old tree
Primary.local.recurrence.1
Primary.local.recurrence.2
ALN.relapseBone2.R1
Bone1.R1
Germline
Patient 13: New tree
Supplementary Figure 11
ALN
Primary
Brain2.R1
Brain1.R1
Germline
Patient 14: Old tree
ALN
Primary
Brain2.R1
Brain1.R1
Germline
Patient 14: New tree
Supplementary Figure 11
Primary
ALN
Liver.R1
Germline
Patient 15: Old tree
Primary
ALN
Liver.R1
Germline
Patient 15: New tree
Supplementary Figure 11
Skin1.R1Skin2.R1Skin3.R1Primary.local.recurrence.3Primary.local.recurrence.2Primary.local.recurrence.1Germline
Patient 16: Old tree
Skin3.R1Skin2.R1Skin1.R1Primary.local.recurrence.2Primary.local.recurrence.3Primary.local.recurrence.1Germline
Patient 16: New tree
Supplementary Figure 11
Primary1
Primary2
ALN2
ALN1
Brain.R1
Germline
Patient 17: Old tree
Primary1
Primary2
ALN2
ALN1
Brain.R1
Germline
Patient 17: New tree
Supplementary Figure 11
ALN1Skin2.R2Skin1.R2Primary1Primary2ALN2Germline
Patient 18: Old tree
Skin2.R2Skin1.R2ALN1Primary2Primary1ALN2Germline
Patient 18: New tree
Supplementary Figure 11
Primary
Brain2.R5
Brain1.R5
Brain1.R3
Germline
Patient 19: Old tree
Primary
Brain2.R5
Brain1.R5
Brain1.R3
Germline
Patient 19: New tree
Supplementary Figure 11
Ovary1.R2Ovary2.R2Ovary3.R2Ovary4.R2Ovary5.R2Ovary7.R2Ovary6.R2Primary1Primary2Primary3Primary4Ovary9.R2Ovary8.R2Germline
Patient 20: Old tree
Ovary1.R2Primary2Primary1Primary3Primary4Ovary3.R2Ovary7.R2Ovary2.R2Ovary4.R2Ovary6.R2Ovary5.R2Ovary9.R2Ovary8.R2Germline
Patient 20: New tree
Supplementary Figure 11
Efficient filtering of FFPE-related C>T/G>A artifacts
To account for potential artifacts induced by formalin-fixed paraffin embedded (FFPE) samples, we employed mutation-
filtering criteria described in the “Variant calling, filtering, and copy number alteration detection” subsection in Methods.
Apartfromotheranalysis,thisalsoinsuredthattheageofFFPEsamplescouldnotnegativelyinfluencethesignatureanalysis.
Toshowhoweffectiveourfilteringwas,wedividedoursamplesintotwogroups,i.e.<=2004and>2004.Thenwecompared
the number of C>T/G>A substitutions between the two groups before and after the filtering. According to the results, the
significantdifferenceinnumberofC>T/G>Asubstitutionsseenbeforefilteringwasefficientlycanceledafterfiltering.
ThefigureshowsNumberofC>Tmutationsbeforeandafterfiltering.Thep-valuesreportedaretwo-sided.
Supplementary Figure 12
50
100
150
200
<= 2004 (n = 50) > 2004 (n = 49)Year of fixation
Num
ber o
f C>T
mut
atio
ns b
efor
e fil
terin
g p−value = 0.03
40
80
120
160
<= 2004 (n = 50) > 2004 (n = 49)Year of fixation
Num
ber o
f C>T
mut
atio
ns a
fter f
ilter
ing
p−value = 0.86Supplementary Figure 12
0.4
0.6
0.8
Chemotherapy (n =10) No chemotherapy (n = 5)
Frac
tion
of m
etas
tasi
s−sp
ecifi
c m
utat
ions
p−value = 0.51A
0.4
0.6
0.8
Hormonal Therapy (n =9) No Hormonal Therapy (n = 6)
Frac
tion
of m
etas
tasi
s−sp
ecifi
c m
utat
ions
p−value = 0.33B
0.4
0.6
0.8
Radiotherapy (n =12) No Radiotherapy (n = 3)
Frac
tion
of m
etas
tasi
s−sp
ecifi
c m
utat
ions
p−value = 0.45CSupplementary Figure 13
0.2
0.4
0.6
0.8
< 2002 (n = 9) >= 2002 (n = 10)Year of fixation
Frac
tion
of s
ampl
e−sp
ecifi
c m
utat
ions
(lea
ves)
p−value = 0.55A
0.2
0.4
0.6
0.8
Coverage > 79 (n = 9) Coverage < 79 (n = 10)Sequencing Coverage
Frac
tion
of s
ampl
e−sp
ecifi
c m
utat
ions
(lea
ves)
p−value = 0.36B
SUPPLEMENTARY FIGURE LEGENDS
Supplementary Figure 1: Exome sequencing coverage statistics. Stacked bar showing the percentage of target
regions covered at certain coverage. Each bar represents one sample and the bars are grouped by patient.
Supplementary Figure 2: Schematic representation of treatment history, number of relapses, relapse
locations, sequenced relapsed sites, PAM50 intrinsic molecular subtypes for primary tumors and survival
timeline of patients in our cohort. Color bands, whose length is proportional to the timescale, represent different
treatment types. Each molecular subtype is represented by its own specific color. Failed and un-sequenced samples
are colored grey. L+, positive axillary lymph node; Loc, Local relapse; Contr: contralateral event; BL, Basal like; LA,
Luminal A; LB, Luminal B; H2, Her 2 enriched; NBL, Normal breast like CT, chemotherapy; RT, radiotherapy; HT,
hormonal therapy; M1, metastasis 1; M2, metastasis 2; M3, metastasis 3; M4, metastasis 4;
Supplementary Figure 3: Analysis pipeline for investigating tumor progression models in breast cancer.
Given the exome-sequencing data, Mutect was used for calling somatic mutations while AscatNGS was used for
estimating tumor purity and copy number aberrations. The input to phylogenetic reconstruction, using Dollo
Parsimony, consisted of a binary matrix obtained by first weighing the mutant allele frequency by tumor purity and
then thresholding the resulting values by 0.05. To infer the statistical support of internal vertices, non-parametric
bootstrapping was used. The phylogenetic analysis resulted in a tree with bootstrap support. The input to subclonal
reconstruction (using PyClone) consisted of mutant allele frequency, copy number aberrations and tumor purity data.
The subclonal analysis resulted in inferred clusters, represented here as density plot, which shows the cellular
prevalence of each cluster (or subclone) in each sample. Finally the output from phylogenetic and subclonal analysis
is integrated as a tree containing the subclonal information as colored (single clone) or dotted (multiple clones) lines
along its edges. Edge lengths in the tree are scaled by number of substitutions while internal vertices are marked with
bootstrap support values.
Supplementary Figure 4: Separating property in the tumor tree. (A) Inferring the role of axillary lymph node in
seeding distant metastasis, based on the separating property in tumor tree. We observe that Germline-to-Primary
path (color red) is separating the path from “axillary lymph node” to “Metastasis 1”, “Metastasis 2a”, and “Metastasis
2b”. Thus, we infer that Primary, rather than Lymph node, has seeded distant metastases. (b) Inferring linear
progression based on the separating property in tumor tree. We observe that Germline-to-Primary path (colored red)
is not separating the path from “Metastasis 1” to the two blocks of “Metastasis 2”, namely “Metastasis 2a” and
“Metastasis 2b”. Thus we infer that “Metastasis 1”, rather than Primary, has seeded “Metastasis 2”.
Supplementary Figure 5: Pairwise mutation heatmaps for each patient in the cohort. For each patient, the
fraction of shared and specific mutations is presented in the left and right column respectively. The heatmap in the
left column illustrates, for row i and column j, the fraction of shared mutations between i and j divided by the total
mutations in both samples. The heatmap on the right illustrates, for row i and column j, the fraction of specific
mutations present in sample i but absent in present j. Pairwise mutation heatmaps are not given for patient 6 and 12
due to low number of samples. (A) Patient 1 (B) Patient 2 (C) Patient 3 (D) Patient 4 (E) Patient 5 (F) Patient 7 (G)
Patient 8 (H) Patient 9 (I) Patient 10 (J) Patient 11 (K) Patient 13 (L) Patient 14 (M) Patient 15 (N) Patient 16 (O)
Patient 17 (P) Patient 18 (Q) Patient 19 (R) Patient 20.
Supplementary Figure 6: Phylogenetic trees and subclonal information for each patient in the cohort. For
each patient, subclonal information is embedded in the phylogenetic, which is presented as subfigure I. In the tree,
edge lengths are proportional to the number of mutations, with the actual number given in parenthesis for each edge.
The list of alterations in putative driver genes are given for each edge, with mutations, amplifications and deletions
shown in black, red and blue color respectively. The information about individual subclones is given in tabular format
which includes cluster ID of the subclone, the color used for subclone in phylogenetic tree (and the density plot),
number of mutations in the subclone and list of putative driver genes included in the subclone. The density plot is
given as subfigure II, which shows the cellular prevalence of each subclone in each sample. In the density plot, the
cluster IDs along with the number of mutations are given on x-axis while their cellular prevalence in samples are
given on y-axis. Figures for the patients are given in numeric order and exclude patients 6 and 12 since the number
of samples, in both cases, is less than 3 (minimum number for reconstructing a phylogenetic tree). (A) Patient 1 (B)
Patient 2 (C) Patient 3 (D) Patient 4 (E) Patient 5 (F) Patient 7 (G) Patient 8 (H) Patient 9 (I) Patient 10 (J) Patient 11
(K) Patient 13 (L) Patient 14 (M) Patient 15 (N) Patient 16 (O) Patient 17 (P) Patient 18 (Q) Patient 19 (R) Patient 20.
Supplementary Figure 7: (A) i. Residuals sum of squares (RSS) as a function of number of signatures attempted.
Dots represent mean values and bars represent standard errors. ii. Explained variance as a function of number of
signatures attempted. Dots represent mean values and bars represent standard errors. (B) A heatmap of Euclidean
distances between the extracted four signatures (x-axis) and the published signatures (y-axis). (C) Barplots showing
the frequencies of six classes of substitutions in both the transcribed strand (red) and the untranscribed strand (blue)
across the four extracted signatures.
Supplementary Figure 8: Heatmap showing the copy number landscape across samples for each patient. To
visualize the varying landscape of copy numbers between different samples, copy number heatmap for each patient
is given.
Supplementary Figure 9: Lineage analysis for all patients performed using LICHeE. In order to validate our
tumor progression results, we used LICHeE to reconstruct lineage trees for all patients (except patient 6, which has
only a single sample) as described in “Validation of phylogenetic trees” subsection in Methods. LICHeE uses variant
allele frequencies of somatic mutations to reconstruct multi-sample cell lineage trees and infer the subclonal
composition of the samples. (A) Patient 1 (B) Patient 2 (C) Patient 3 (D) Patient 4 (E) Patient 5 (F) Patient 7 (G)
Patient 8 (H) Patient 9 (I) Patient 10 (J) Patient 11 (K) Patient 12 (L) Patient 13 (M) Patient 14 (N) Patient 15 (O)
Patient 16 (P) Patient 17 (Q) Patient 18 (R) Patient 19 (S) Patient 20.
Supplementary Figure 10: Mutations heatmap for all patients to visualize the shared and specific mutations
among samples
Supplementary Figure 11: Comparison of phylogenetic trees reconstructed using two different mutation-
filtering criteria. Validation of phylogenetic trees by removing mutations affected by variable coverage and/or
different tumor purity among samples as described in “Validation of phylogenetic trees” subsection of the Method
section. For each patient, a side-by-side comparison of the tree reported in the paper (termed here Old tree) vs. the
one reconstructed using the more conservative mutation selection criteria (termed here New tree) is presented.
Supplementary Figure 12: Efficient filtering of FFPE-related C>T/G>A artifacts. Boxplots showing the number
of C>T/G>A mutations (Y-axis) in two groups of samples (x-axis) defined based on sample age. The comparison was
performed both before applying filtering (A) and after applying filtering (B). P-values are computed based on two-
sided Mann-Whitney test.
Supplementary Figure 13: Effect of treatment on fraction of metastasis-specific mutations. Boxplots showing
the proportions of metastasis-specific mutations (Y-axis) between two groups of patients divided based on treatment
history. Each type of treatment was tested separately: (A) chemotherapy, (B) hormonal therapy and (C)
radiotherapy. Only patients where at least one primary and one distant metastasis sample have been sequenced (15
patients) were considered for this comparison. For each patient, the fraction of metastasis mutations that are not
detected in primary tumor was computed for each distant metastasis site. In patients where more than one distant
metastasis site were sequenced, we chose the highest fraction. P-values were computed using two sided Mann-
Whitney test.
Supplementary Figure 14: Effect of sample age and coverage on fraction of sample-specific mutations
(length of leaves in phylogenetic trees). Boxplots showing the proportion of sample-specific mutations (Y-axis) for
two groups of patients divided based on time of first sample acquisition (A) and based on average sample coverage
(B). P-values are computed based on two-sided Mann-Whitney test.