43. Harismendy, O. et al. Detection of low prevalence somatic mutations
in solid tumors with ultra-deep targeted sequencing. Genome Biol. 12,
R124 (2011).
44. Forshew, T. et al. Noninvasive identification and monitoring of cancer
mutations by targeted deep sequencing of plasma DNA. Sci. Transl. Med. 4,
136ra168 (2012).
45. Yoshida, K. et al. Frequent pathway mutations of splicing machinery in
myelodysplasia. Nature 478, 64–69 (2011).
46. Haferlach, T. et al. Landscape of genetic lesions in 944 patients with
myelodysplastic syndromes. Leukemia 28, 241–247 (2014).
47. Suzuki, H. et al. Mutational landscape and clonal architecture in grade II and
III gliomas. Nat. Genet. 47, 458–468 (2015).
48. Shiraishi, Y. et al. An empirical Bayesian framework for somatic mutation
detection from cancer genome sequencing data. Nucleic Acids Res. 41,
e89 (2013).
49. Niida, A., Imoto, S., Shimamura, T. & Miyano, S. Statistical model-based
testing to evaluate the recurrence of genomic aberrations. Bioinformatics 28,
i115–i120 (2012).
50. Arber, D. A. et al. The 2016 revision to the World Health Organization
classification of myeloid neoplasms and acute leukemia. Blood 127,
2391–2405 (2016).
Nature Medicine
Acknowledgements
This work was supported by the Japan Agency for Medical Research and Development
(nos. JP15cm0106056h0005, JP19cm0106501h0004, JP16ck0106073h0003 and
JP19ck0106250h0003 to S.O.; nos. JP17km0405110h0005 and JP19ck0106470h0001 to
H.M.; and no. JP19ck0106353h0003 to Y.N.); the Core Research for Evolutional Science
and Technology (no. JP19gm1110011 to S.O.); the Ministry of Education, Culture, Sports,
Science and Technology of Japan; the High Performance Computing Infrastructure
System Research Project (nos. hp160219, hp170227, hp180198 and hp190158 to S.O.
and S. Miyano) (this research used computational resources of the K computer provided
by the RIKEN Advanced Institute for Computational Science through the HPCI System
Research project); the Japan Society for the Promotion of Science; Scientific Research on
Innovative Areas (nos. JP15H05909 to S.O. and S. Miyano and JP15H05912 to S. Miyano)
and KAKENHI (nos. JP26221308 and JP19H05656 to S.O., JP16H05338 and JP19H01053
to H.M. and JP15H05707 to S. Miyano); and the Takeda Science Foundation (to S.O.,
H.M. and T.Y.). S.O. is a recipient of the JSPS Core-to-Core Program A: Advanced
Research Networks. DNA samples and subjects’ clinical data were provided by BBJ, the
Institute of Medical Science, the University of Tokyo. The supercomputing resource was
provided by the Human Genome Center, the Institute of Medical Science, the University
of Tokyo. We thank K. Matsuo at the Aichi Cancer Center Research Institute (Nagoya,
Japan), who suggested the design of the case-cohort study for estimation of cumulative
mortality from, and incidence of, hematological malignancies. We thank the TCGA
Consortium and all its members for making publicly available their invaluable data.
Author contributions
R.S., H.M. and S.O. designed the study. K.M., Y. Kamatani, T.M. and Y. Murakami provided
DNA samples and clinical data. Y. Kuroda and S. Matsuda provided bone marrow samples.
C.T. and Y. Kamatani performed copy number analysis. Y. Momozawa and M.K. performed
sequencing. M.M.N. performed cell sorting and single-cell analysis. R.S., M.M.N., Y.O.,
T.Y., Y.S., K.C., H.T., A.N., S.I. and S. Miyano performed bioinformatics analysis. R.S., Y.N.,
M.M.N., Y.O., T.Y., H.M. and S.O. prepared the manuscript. All authors participated in
discussions and interpretation of the data and results.
Competing interests
The authors declare no competing interests.
Additional information
Extended data is available for this paper at https://doi.org/10.1038/s41591-021-01411-9.
Supplementary information The online version contains supplementary material
available at https://doi.org/10.1038/s41591-021-01411-9.
Correspondence and requests for materials should be addressed to S.O.
Peer review information Nature Medicine thanks Daniel Link, Duane Hassane, Todd
Druley and the other, anonymous, reviewer(s) for their contribution to the peer review
of this work. Michael Basson was the primary editor on this article and managed its
editorial process and peer review in collaboration with the rest of the editorial team.
Reprints and permissions information is available at www.nature.com/reprints.
Nature Medicine | www.nature.com/naturemedicine
Articles
[Extended Data Fig. 1 graphic. Recoverable content:
a, Case-control study for all HM: cases (n=672) and controls (n=10,562). Subjects by CH status (SNV alone / CNA alone / both / all CH(+) / CH(−) / total): cases, 154 / 115 / 107 / 376 / 296 / 672 (myeloid, 53 / 41 / 66 / 160 / 55 / 215; lymphoid, 90 / 69 / 32 / 191 / 229 / 420); controls, 2,177 / 1,399 / 633 / 4,209 / 6,353 / 10,562; total, 2,331 / 1,514 / 740 / 4,585 / 6,649 / 11,234.
b, Case-cohort study for HM death: target cohort (n=43,662*; ≥60 y.o. and no cancer history), subcohort (n=7,937**), cases (death from HM; n=401) and overlap (n=55). Subjects by CH status, in the same column order: cases, 109 / 63 / 67 / 239 / 162 / 401 (myeloid, 41 / 24 / 42 / 107 / 39 / 146; lymphoid, 62 / 38 / 22 / 122 / 122 / 244); subcohort without HM, 1,614 / 1,036 / 447 / 3,097 / 4,785 / 7,882; subcohort total, 1,628 / 1,047 / 454 / 3,129 / 4,808 / 7,937.
*Among 60,787 cases aged ≥60 years and confirmed not to have solid cancers as of March 2013, 43,662 had the follow up data for survival.
**Among 10,623 cases randomly selected from the 60,787 cases, 7,937 had the follow up data for survival.]
Extended Data Fig. 1 | Design of case-control and case-cohort study. a, Design of case-control study (Left). Diagnosis of hematological malignancies
(HM) in subjects with or without CH enrolled in the case-control study (Right). b, Design of case-cohort study for death from HM (Left). Diagnosis of
HM in subjects with or without CH enrolled in the case-cohort study (Right). AML, acute myeloid leukemia; MDS, myelodysplastic syndromes; MPN,
myeloproliferative neoplasms; CML, chronic myeloid leukemia; B-NHL, B-cell non-Hodgkin lymphoma; T-NHL, T-cell non-Hodgkin lymphoma; CLL, chronic
lymphoid leukemia; ALL, acute lymphoblastic leukemia; MM, multiple myeloma; PCT, plasma cell tumor.
[Extended Data Fig. 2 graphic. Recoverable labels: 'Number of subjects' (a, b); 'Odds ratio' and 'FDR' legends (d); axes 'VAF of TET2', 'VAF of CBL', 'VAF of SRSF2', 'VAF of ASXL1' and 'VAF of DNMT3A' (e-i); Venn diagram regions SNV/indel 21%, both 7%, CNA 13% (j); table of SNVs/indels with VAF (%) and CNAs with cell fractions (%) for Subjects 1-8 (k); 'Position on chr14 (Mb)' with TCRA (l); 'Proportion of subjects (%)' by 'Number of alterations' (m).]
Extended Data Fig. 2 | Landscape of genetic alterations in CH. a-b, The number of subjects with individual SNVs/indels (a) and CNAs (b). The vertical
axis represents the number of subjects with indicated alterations. Unclassifiable CNAs are not included in (b). c, Landscape of SNVs/indels and CNAs
in 11,234 subjects. Those without CH-related alterations are omitted. d, The correlations between individual genetic alterations. Combinations seen in 5
or more cases are indicated by asterisks. e-i, VAF of cooccurring SNVs/indels in diagonal plot. Dots above the dashed line fulfill ‘pigeonhole principle’. j,
Venn diagram illustrating the overlap between subjects with SNVs/indels and those with CNAs. Frequencies within all subjects in whom SNVs/indels and
CNAs were examined (n = 11,234) are indicated. k, Subjects in whom cooccurring SNVs/indels and CNAs were suspected to coexist in the same cells on
the basis of ‘pigeonhole principle.’ l, A magnified illustration of microdeletions around TCRA locus (14q11.2). A gray bar represents gene body of TCRA.
Blue horizontal bars represent microdeletions. Cooccurring TET2 SNVs are indicated by red dots. Genomic coordinates in hg19 are indicated above. m,
Proportions of subjects with different number of cooccurring alterations within those who harbor SNVs/indels in the indicated genes. The proportions of
subjects with 1, 2, 3, and ≥4 CNAs are depicted by different colors.
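The 'pigeonhole principle' invoked for panels e-i and k can be illustrated with a minimal sketch. It assumes, for illustration only, that each SNV/indel is heterozygous in a copy-neutral region, so that the mutated cell fraction is twice the VAF and is capped at 100%. The function names are hypothetical and are not taken from the study's pipeline. If two cell fractions sum to more than 100%, at least the excess must reside in the same cells.

```python
def cell_fraction_from_vaf(vaf: float) -> float:
    """Convert a VAF (0-1) to a mutated cell fraction (0-1).

    Assumption: heterozygous variant in a copy-neutral diploid region,
    so cell fraction = 2 * VAF, capped at 1.0.
    """
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must lie between 0 and 1")
    return min(2.0 * vaf, 1.0)


def min_shared_fraction(cf_a: float, cf_b: float) -> float:
    """Smallest fraction of cells that must carry both alterations."""
    return max(0.0, cf_a + cf_b - 1.0)


def must_share_cells(cf_a: float, cf_b: float) -> bool:
    """True when the two cell fractions cannot fit in disjoint cell populations."""
    return min_shared_fraction(cf_a, cf_b) > 0.0
```

For example, an SNV at a VAF of 39.9% corresponds to a cell fraction of about 79.8% under this assumption, so any second alteration present in more than about 20.2% of cells would have to overlap with it.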
[Extended Data Fig. 3 graphic. Recoverable labels: chromosomes 1-22 with annotated loci (GNB1, DNMT3A, TET2, TNFAIP3, HLA, TCRB, EZH2, JAK2, ATM, CBL, miR-15a, miR-16-1, TCRA, NF1, TP53, RUNX1, CHEK2); legends 'Type of CNAs' (duplication, deletion, UPD, unclassifiable), 'Cell fraction (%)' and 'Cooccurring SNV/indels'; scale bar, 50 subjects.]
Extended Data Fig. 3 | Distribution of CNAs in all chromosomes. Distributions of CNAs on all chromosomes are illustrated. Loci of known driver genes are
indicated by arrows. Each horizontal bar represents one CNA. Cooccurring SNVs/indels are indicated by red dots. Types of CNAs are depicted by different
colors as indicated in the annotations.
[Extended Data Fig. 4 graphic. Recoverable labels: panels for duplication, UPD and deletion, each comparing 'Current Study' with 'Loh et al. 2020' across chromosomes; 'Frequency (%)' axes for 'Cell fraction <5%' and 'Cell fraction ≥5%'; series 'This study', 'Loh et al, 2020', 'Laurie et al, 2012' and 'Jacobs et al, 2012'; markers 'Enriched in the current study' and 'Enriched in Loh et al. 2020'.]
Extended Data Fig. 4 | Chromosomal regions significantly affected by CNAs. a-c, Chromosomal regions significantly affected by duplications (a), UPDs
(b), and deletions (c) in a Japanese cohort (current study) and in a British cohort (ref. 11). Statistical significance for recurrence of CNAs was evaluated by PART (ref. 49).
Dashed lines indicate thresholds for statistical significance (FDR = 0.25). d-e, Comparison of frequencies of individual CNAs between the current and
previous studies (refs. 8, 9, 11). Comparisons were performed in subjects aged 60-75 years. CNAs in <5% (d) or ≥5% (e) cell fractions were taken into account.
CNAs significantly enriched in either cohort (FDR < 0.1) are indicated by asterisks in (e).
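To make the FDR thresholds concrete, the following minimal sketch converts a list of P values into Benjamini-Hochberg q values and applies a cutoff. This is a generic, widely used procedure shown for illustration only. It does not reproduce PART, and the caption does not state which FDR procedure was applied. The function name and the example P values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg q values in the same order as the input."""
    m = len(pvalues)
    # Indices of the P values, from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


# Hypothetical P values for four regions, not taken from the study.
example_p = [0.01, 0.04, 0.03, 0.5]
example_q = benjamini_hochberg(example_p)
significant = [i for i, q in enumerate(example_q) if q <= 0.25]
```

A region would be called significant at FDR = 0.25 when its q value is at most 0.25.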
[Extended Data Fig. 5 graphic. Recoverable labels: 'Number of alterations' versus 'Number of subjects' (SNV/indel alone, CNA alone, SNV/indels + CNA); 'Age (Years)' versus 'Frequency' for SNV/indels and CNAs; landscape rows for genes (DNMT3A, TET2, ASXL1, PPM1D, TP53, JAK2, SF3B1, SRSF2, GNAS, GNB1, CBL, U2AF1, IDH2, MYD88, EZH2, KRAS, NRAS) and chromosome arms (20q, 13q, 14q, 12q, 9p, 11q, 15q, 4q, 22q, 7q, 1p, 11p, 1q, 9q, 17q, others); SNV/indel types (missense, inframe indel, splice-site, frameshift indel, stop-gain, multiple); CNA types (duplication, deletion, UPD, unclassifiable); 'Cell fraction (%)'.]
Extended Data Fig. 5 | Analysis of SNVs/indels and CNAs in peripheral blood samples in TCGA cohort. a, Distribution of the number of genetic
alterations in each subject. Subjects with SNVs/indels alone, with CNAs alone, or with both of them are illustrated by different colors. b, Solid lines indicate
the prevalence of CH-related SNVs/indels and CNAs, according to age. Colored bands represent the 95% confidence intervals. c, The landscape of
CH-related SNVs/indels and CNAs. Each row represents genetic alterations or affected chromosomal arms, and each column represents subjects. Subjects
without any alterations are omitted. Types of SNVs/indels and CNAs are depicted by different colors. d, Distributions of CNAs on all chromosomes are
illustrated. Loci of cooccurring SNVs/indels are indicated by arrows. Each horizontal bar represents one CNA. Cooccurring SNVs/indels are indicated by
red asterisks. Types of CNAs are depicted by different colors.
[Extended Data Fig. 6 graphic. Recoverable labels: genes TP53, TET2, JAK2, DNMT3A, GNB1, CBL, RUNX1 and EZH2 (a, b); curves over 'Months' for 'Cumulative mortality from hematological malignancies', 'Cumulative mortality from cardiovascular diseases' and 'Overall survival', with one set of groups labeled No alterations (n=4,947), SNV/indel alone (n=1,723), SNV+CNA, different genes/loci (n=450) and SNV+CNA, same genes/loci (n=64; TP53 + 17pLOH, n=18; other than TP53 + 17p, n=46), and another labeled No alteration (n=4,097), SNV alone (n=1,357), SNV+CNA, different loci (n=332) and SNV+CNA, same loci (n=42); TP53 + 17p alt. (n=29*) versus TP53 without 17p alt. (n=81*) (g); 'Odds ratio for mortality from MDS in subjects with TP53-involving SNVs/indels (n=165)' (h).
*Number of subjects within the case-cohort design. SNVs/indels of TP53 include those detected by ddPCR.]
Extended Data Fig. 6 | Interplay between SNVs/indels and CNAs. a, Number of subjects with SNVs/indels and CNAs involving the same genes/loci. b,
Proportion of SNVs/indels associated with CNAs in the same genes/loci. c, Cumulative mortality from hematological malignancies. d, Cumulative mortality
from cardiovascular diseases. e, Survival curves for overall survival. f, Profiles of CNAs in subjects with SNVs/indels in TP53. Abnormally high or low blood
counts (WBC, platelet, hemoglobin, and hematocrit) are indicated in red or blue, respectively. Numbers of cooccurring CNAs are indicated on the right
side (#CNA), where subjects with ≥3 CNAs are highlighted in purple. Subjects without any CNA are omitted. g, Mortality from hematological
malignancies in TP53-mutated cases with or without CNAs in 17p. h, Odds ratio for mortality from MDS calculated by multivariate logistic regression in
subjects with TP53-involving SNVs/indels. Error bars indicate 95% confidence intervals. We included unclassifiable CNAs involving 17p in 17p alterations
(17p alt.) in panel (g-h) because they are most likely to be LOH (UPDs or deletions). TP53-involving SNVs/indels in panel (f-h) included those detected by
ddPCR (Supplementary Fig. 3).
[Extended Data Fig. 7 graphic. Recoverable labels: rows WBC (low), WBC (high), Hb (low), Hb (high), Plt (low), Plt (high), Cytopenia (All), Cytopenia (Multi) and Any abnormality, plotted against individual SNVs/indels and CNAs; legends 'Odds ratio' and 'FDR'.]
Extended Data Fig. 7 | Genetic alterations in CH and abnormalities in blood counts. a, Landscape of SNVs/indels and CNAs in subjects without
abnormalities in blood counts (left), in those with any abnormalities in blood counts (middle), and in those with no available blood counts (right). Each
row represents a genetic alteration while each column represents a subject. Subjects without any alteration are omitted. Different types of mutations and
CNAs are depicted by different colors. b, Enrichment of genetic alterations in subjects with abnormalities in blood counts. Sizes of rectangles indicate
significance of enrichment. Colors of rectangles indicate odds ratios. The enrichment of alterations was examined by Fisher exact test. Cytopenia (All),
subjects with cytopenia in at least one lineage; Cytopenia (Multi), subjects with cytopenia in ≥2 lineages. WBC, white blood cell; Hb, hemoglobin; Plt,
platelet. c, Distribution of blood cell counts in subjects with different CH-related alterations. In all box plots, the median, first and third quartiles (Q1 and
Q3) are indicated, and whiskers extend to the furthest value between Q1 – 1.5×the interquartile range (IQR) and Q3 + 1.5×IQR. Numbers of subjects (n)
are indicated below the names of alterations. d, Relationships between blood cell counts and VAF of SNVs/indels or cell fractions of CNAs. P values are
calculated by two-sided t test in multivariate linear regression models, taking the effect of age and gender into account. Correction for multiple testing is
not performed.
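The enrichment test in panel b can be sketched for a single 2×2 table of alteration status against blood-count abnormality. The code below is a minimal standard-library illustration, not the authors' implementation. The function names are hypothetical, the odds ratio is the simple sample estimate ad/bc, and the two-sided P value sums the probabilities of all tables that are no more likely than the observed one, which is one common convention.

```python
from math import comb


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Sample odds ratio (a*d)/(b*c) for the table [[a, b], [c, d]]."""
    if b * c == 0:
        return float("inf")
    return (a * d) / (b * c)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(k: int) -> float:
        # Hypergeometric probability of k in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, p_value)
```

Here a and b would count subjects with the alteration who do or do not have the abnormality, and c and d would count the same for subjects without the alteration.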
[Extended Data Fig. 8 graphic. Recoverable content: a, pie chart of 'Attributable proportions of CH-associated increase in HM mortality': CH (84%), age (14.5%), gender (1.5%). b, number of SNVs (n): 0 (6,046), 1 (1,683), 2 (425), ≥3 (129). c, number of CNAs (n): 0 (6,670), 1 (1,334), 2 (229), ≥3 (50). d-i, 'Cumulative mortality from hematological malignancies' versus 'Time (month)', comparing SNV alone with SNV+CNA (panels titled 1 SNV, 2 SNV and 3 SNV) and both of SNV and CNA with either of SNV or CNA (panels titled 2, 3 and 4 alterations), alongside No alteration (n=4,947).]
Extended Data Fig. 8 | Impact of CH on mortality from HM stratified by number of alterations. a, Pie chart showing the proportions of difference in
mortality from hematological malignancies (HM) between subjects with or without CH (Fig. 4a) which are attributable to each prognostic factor (Online
methods). b-c, Cumulative mortality from HM in subjects with different number of SNVs/indels (b), or CNAs (c). d-f, Cumulative mortality from HM in
subjects with both SNVs/indels and CNAs or in those with SNVs/indels alone. Subjects with 1 (d), 2 (e), or ≥3 alterations (f) are separately shown. g-i,
Cumulative mortality from HM in subjects with both SNVs/indels and CNAs or in those with either of them. Subjects with 2 (g), 3 (h), or 4 alterations (i)
are separately shown. Throughout the figure, P values were calculated by a two-sided Wald test and were not adjusted for multiple comparisons.
[Extended Data Fig. 9 graphic. Recoverable content:
a, Odds ratios for all HM, myeloid and lymphoid malignancies, by alteration (n): SNV, 3,071 (VAF <5%, 1,960; VAF 5-10%, 485; VAF >10%, 626; #SNV 1, 2,312; #SNV 2, 586; #SNV ≥3, 173); CNA, 2,254 (CF <5%, 1,870; CF 5-10%, 162; CF >10%, 222; #CNA 1, 1,841; #CNA 2, 322; #CNA ≥3, 91); all CH, 4,585 (SNV alone, 2,331; CNA alone, 1,514; SNV+CNA, 740, comprising same loci, 92, and different loci, 648).
b, Case-cohort study for HM development: target cohort (n=52,472*; ≥60 y.o. and no cancer history), subcohort (n=9,147**), cases (n=67) and overlap (n=11).
c, Hazard ratios for all HM, myeloid and lymphoid malignancies, by alteration (n): any SNV, 2,408 (VAF <5%, 1,591; VAF 5-10%, 374; VAF >10%, 443; #SNV 1, 1,840; #SNV 2, 454; #SNV ≥3, 114); any CNA, 1,765 (CF <5%, 1,482; CF 5-10%, 123; CF >10%, 160; #CNA 1, 1,477; #CNA 2, 241; #CNA ≥3, 47); all CH, 3,630 (SNV alone, 1,865; CNA alone, 1,222; SNV+CNA, 543); rows for individual genes and CNAs are also plotted.
d-f, 'Cumulative incidence of hematological malignancies' versus 'Time (month)' for SNV (VAF <5%, VAF ≥5%), CNA (CF <5%, CF ≥5%) and combined status (both, either, SNV alone, CNA alone, no alteration).
*Among 60,787 cases aged ≥60 years and confirmed not to have solid cancers as of March 2013, 52,472 had the follow up data for development of HM.
**Among 10,623 cases randomly selected from the 60,787 cases, 9,147 had the follow up data for development of HM.]
Extended Data Fig. 9 | Association of CH-related SNVs/indels and CNAs with hematological malignancies. a, Odds ratios for the events (death and/
or development) of hematological malignancies in case-control study (Extended Data Fig. 1a). Error bars indicate 95% confidence intervals. b, Design of
case-cohort study for development of hematological malignancies. c, Hazard ratios for development of hematological malignancies. Error bars indicate
95% confidence intervals. d-f, Effect of SNVs/indels (d), CNAs (e), and combined SNVs/indels and CNAs (f) on the cumulative incidence of development
of hematological malignancies. P values are calculated by two-sided Wald test. n, number of cases with the indicated alterations; SNV + CNA,
cooccurrence of both SNVs/indels and CNAs; #SNV, number of SNVs/indels; CF, cell fraction of CNAs; #CNA, number of CNAs.
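The odds ratios and hazard ratios with 95% confidence intervals, and the two-sided Wald P values, can be related to a fitted regression coefficient with a short sketch. It assumes that a log-odds or log-hazard coefficient `beta` and its standard error `se` have already been estimated by a model that is not shown here. The function name and the returned keys are hypothetical.

```python
import math

# Two-sided 95% critical value of the standard normal distribution.
Z_95 = 1.959963984540054


def wald_summary(beta: float, se: float) -> dict:
    """Ratio estimate, 95% Wald interval and two-sided Wald P value.

    beta is a log odds ratio or log hazard ratio; se is its standard error.
    """
    if se <= 0:
        raise ValueError("se must be positive")
    z = beta / se
    # Two-sided tail probability of a standard normal: erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return {
        "ratio": math.exp(beta),
        "ci_low": math.exp(beta - Z_95 * se),
        "ci_high": math.exp(beta + Z_95 * se),
        "p_value": p_value,
    }
```

An interval that excludes 1 on the ratio scale corresponds to a Wald P value below 0.05 under this normal approximation.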
[Figure graphic. Recoverable labels: 'Overall Survival' versus 'Time (month)', stratified by SNV/indel status (VAF), CNA status (cell fraction) and 'Number of alterations' (0, n=4,097; 1, n=1,810; 2, n=554; ≥3, n=236); panels titled '≥2 alterations' and '3 alterations' compare groups such as SNV alone (n=332) and SNV+CNA (n=86) against No CH (n=4,097).]
1 ...