Aoshima, M. (2018). A survey of high dimension low sample size asymptotics, Aust. N.
Z. J. Stat., 60, 4-19.
Cerny, V. (1985). Thermodynamical approach to the traveling salesman problem: An
efficient simulation algorithm, J. Optim. Theory Appl., 45, 41-51.
Eaton, D.L. and Gilbert, S.G. (2015). Principles of toxicology, In: Klaassen, C.D. and
Watkins III, J.B. (eds.) Casarett & Doull's Essentials of Toxicology, Third Edition, 5-20,
McGraw-Hill Education, NY, USA.
Kirkpatrick, S., Gelatt, C.D., and Vecchi, M.P. (1983). Optimization by simulated annealing, Science, 220, 671-680.
Kramer, A. et al. (2014). Causal analysis approaches in Ingenuity Pathway Analysis,
Bioinformatics, 30, 523-530.
Liu, J. et al. (2015). Predicting hepatotoxicity using ToxCast in vitro bioactivity and
chemical structure, Chem. Res. Toxicol., 28, 738-751.
Low, Y. et al. (2011). Predicting drug-induced hepatotoxicity using QSAR and toxicogenomics approaches, Chem. Res. Toxicol., 24, 1251-1262.
Madden, J.C. (2013). Tools for grouping chemicals and forming categories, In: Cronin,
M. et al. (eds.) Chemical Toxicity Prediction: Category Formation and Read-Across,
72-97, Royal Society of Chemistry Publishing, London.
Mellor, C.L. et al. (2017). Read-across for rat oral gavage repeated-dose toxicity for
short-chain mono-alkylphenols: A case study, Comput. Toxicol., 2, 1-11.
NITE: Hazard Evaluation Support System Integrated Platform (HESS) (2020).
http://www.nite.go.jp/en/chem/qsar/hess-e.html, Accessed 5 January 2021.
OECD (2014). Guidance on grouping of chemicals, Second edition, Number 194,
ENV/JM/MONO(2014)4, Paris, France.
OECD (2017). The OECD QSAR Toolbox.
http://www.oecd.org/env/ehs/risk-assessment/theoecdqsartoolbox.htm,
Accessed 5 January 2021.
Raymond, J.W., Blankley, C.J., and Willett, P. (2003). Comparison of chemical clustering methods using graph- and fingerprint-based similarity measures, J. Mol.
Graph. Model., 21, 421-433.
J. Takeshita, A. Toyoda, H. Tani, Y. Endo, and S. Miyamoto
Sakuratani, Y. et al. (2013). Hazard Evaluation Support System (HESS) for predicting
repeated dose toxicity using toxicological categories, SAR QSAR Environ. Res., 24,
351-363.
Sawaragi, Y., Nakayama, H., and Tanino, T. (1985). Theory of Multiobjective Optimization, Academic Press.
Tani, H. et al. (2019). Identification of RNA biomarkers for chemical safety screening
in mouse embryonic stem cells using RNA-seq, Biochem. Biophys. Res. Commun.,
512, 641-646.
Received: March 3, 2021
Revised: May 13, 2021
Accepted: May 21, 2021
Classification of compounds based on in vitro gene expression profiles
Figures
Figure 1: A dendrogram obtained by applying agglomerative hierarchical clustering
(average linkage between the merged groups) to the gene expression ratios for the
nine compounds and 32,586 RNAs. The y-axis (dissimilarity measures) marks the
dissimilarity at which the clusters merge, and the x-axis shows the nine compounds
in duplicate.
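To make the clustering procedure named in the caption concrete, the following is a minimal Python sketch of agglomerative hierarchical clustering with average linkage. The Euclidean dissimilarity, the names `euclidean` and `average_linkage`, and the toy profiles are assumptions made for illustration only; they are not the dissimilarity measure, the implementation, or the data used in this study.

```python
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two profiles (an assumed dissimilarity)."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


def average_linkage(profiles, dissimilarity=euclidean):
    """Agglomerative clustering; returns merges as (cluster_a, cluster_b, height)."""
    n = len(profiles)
    # Pairwise dissimilarities between the individual samples.
    d = {frozenset((i, j)): dissimilarity(profiles[i], profiles[j])
         for i, j in combinations(range(n), 2)}

    def between(a, b):
        # Average linkage: mean dissimilarity over all cross-cluster pairs.
        return sum(d[frozenset((i, j))] for i in a for j in b) / (len(a) * len(b))

    clusters = [(i,) for i in range(n)]  # start with one cluster per sample
    merges = []
    while len(clusters) > 1:
        # Merge the two clusters with the smallest average dissimilarity.
        a, b = min(combinations(clusters, 2), key=lambda pair: between(*pair))
        merges.append((a, b, between(a, b)))
        clusters = [c for c in clusters if c not in (a, b)]
        clusters.append(tuple(sorted(a + b)))
    return merges


# Hypothetical toy data: two compounds, each measured in duplicate.
profiles = [
    [0.0, 0.0],  # compound A, sample 1
    [0.0, 1.0],  # compound A, sample 2
    [5.0, 0.0],  # compound B, sample 1
    [5.0, 2.0],  # compound B, sample 2
]
merges = average_linkage(profiles)
for a, b, height in merges:
    print(a, b, round(height, 3))
```

Each recorded height corresponds to a y-axis position at which two branches of a dendrogram join. In this toy example the duplicates of each compound merge first, and the two compounds merge last at a larger height.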
Figure 2: A dendrogram obtained by applying agglomerative hierarchical clustering
(average linkage between the merged groups) to the gene expression levels for the
nine compounds and 32,586 RNAs. The y-axis (dissimilarity measures) marks the
dissimilarity at which the clusters merge, and the x-axis shows the nine compounds
in duplicate.
Figure 3: Four dendrograms obtained by applying agglomerative hierarchical clustering
(average linkage between the merged groups) to the gene expression ratios for the
nine compounds and 3,000 extracted RNAs. The upper-left (a), upper-right (b),
lower-left (c), and lower-right (d) panels are the cases of α = 0.0, 0.1, 0.2, and
0.3, respectively. In each panel, the y-axis (dissimilarity measures) marks the
dissimilarity at which the clusters merge, and the x-axis shows the nine compounds
in duplicate.
Figure 4: Four dendrograms obtained by applying agglomerative hierarchical clustering
(average linkage between the merged groups) to the gene expression ratios for the
nine compounds and 1,000 extracted RNAs. The upper-left (a), upper-right (b),
lower-left (c), and lower-right (d) panels are the cases of α = 0.0, 0.1, 0.2, and
0.3, respectively. In each panel, the y-axis (dissimilarity measures) marks the
dissimilarity at which the clusters merge, and the x-axis shows the nine compounds
in duplicate.
Figure 5: Four dendrograms obtained by applying agglomerative hierarchical clustering
(average linkage between the merged groups) to the gene expression ratios for the
nine compounds and 100 extracted RNAs. The upper-left (a), upper-right (b),
lower-left (c), and lower-right (d) panels are the cases of α = 0.0, 0.1, 0.2, and
0.3, respectively. In each panel, the y-axis (dissimilarity measures) marks the
dissimilarity at which the clusters merge, and the x-axis shows the nine compounds
in duplicate.
Figure 6: Scatter plot of all 32,586 RNAs, sample 1 versus sample 2, for
bis-phthalate. The red points indicate the 1,000 RNAs extracted in the case of
α = 0.0, and the blue points are the remaining RNAs. The sizes of the extracted
RNAs are almost all zero.
Figure 7: Scatter plot of all 32,586 RNAs, sample 1 versus sample 2, for
bis-phthalate. The red points indicate the 1,000 RNAs extracted in the case of
α = 0.2, and the blue points are the remaining RNAs. The number of extracted RNAs
whose sizes are nonzero increases compared with the case of α = 0.0.
...