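リケラボ論文検索 (Rikelab Paper Search) is a paper search service that searches degree theses and faculty papers held in university repositories across Japan in a single query.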


Improving Compound–Protein Interaction Prediction by Self-Training with Augmenting Negative Samples

Koyama, Takuto; Matsumoto, Shigeyuki; Iwata, Hiroaki; Kojima, Ryosuke; Okuno, Yasushi. Kyoto University. DOI: 10.1021/acs.jcim.3c00269

2023.08.14

Abstract

Identifying compound-protein interactions (CPIs) is crucial for drug discovery. Since experimentally validating CPIs is often time-consuming and costly, computational approaches are expected to facilitate the process. The rapid growth of available CPI databases has accelerated the development of many machine-learning methods for CPI prediction. However, their performance, particularly their generalizability to external data, often suffers from a data imbalance attributed to the lack of experimentally validated inactive (negative) samples. In this study, we developed a self-training method that augments negative samples that are both credible and informative, in order to improve the performance of models impaired by data imbalance. The constructed model demonstrated higher performance than models constructed with other conventional methods for handling data imbalance, and the improvement was prominent for external datasets. Moreover, examination of the prediction score thresholds used for pseudo-labeling during self-training revealed that augmenting the samples with ambiguous prediction scores is beneficial for constructing a model with high generalizability. The present study provides guidelines for improving CPI predictions on real-world data, thus facilitating drug discovery.
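The self-training idea described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the classifier is a plain logistic regression on synthetic 2-D features rather than a CPI model, and the function names, the score window `lower`/`upper`, and the number of rounds are all assumptions made for the example. Unlabeled samples whose predicted score falls inside the window are pseudo-labeled as negatives and added to the training set before retraining.

```python
import math
import random


def predict_proba(w, b, x):
    """Logistic score in (0, 1); higher means 'more likely interacting'."""
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, epochs=200, lr=0.5):
    """Fit a logistic regression by batch gradient descent."""
    n_features = len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def self_train_negatives(X_lab, y_lab, X_unlab, lower=0.0, upper=0.3, rounds=3):
    """Augment the training set with pseudo-negative samples.

    Hypothetical rule: an unlabeled sample is pseudo-labeled 0 when its
    score s satisfies lower <= s < upper. Positives are never added.
    """
    X_train, y_train = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    for _ in range(rounds):
        w, b = train_logistic(X_train, y_train)
        selected, remaining = [], []
        for x in pool:
            score = predict_proba(w, b, x)
            (selected if lower <= score < upper else remaining).append(x)
        if not selected:
            break  # nothing left inside the score window
        X_train += selected
        y_train += [0] * len(selected)
        pool = remaining
    # Final fit on the augmented set.
    w, b = train_logistic(X_train, y_train)
    return (w, b), X_train, y_train


# Synthetic, imbalanced toy data: many positives, very few negatives.
random.seed(0)


def sample(center, n):
    return [[random.gauss(center, 1.0), random.gauss(center, 1.0)] for _ in range(n)]


X_lab = sample(1.0, 40) + sample(-1.0, 4)
y_lab = [1] * 40 + [0] * 4
X_unlab = sample(1.0, 50) + sample(-1.0, 50)

(w, b), X_aug, y_aug = self_train_negatives(X_lab, y_lab, X_unlab)
print(f"negatives: {y_lab.count(0)} -> {y_aug.count(0)}")
```

Raising `upper` toward 0.5 admits samples with more ambiguous scores as pseudo-negatives, while lowering it keeps only confident ones. The abstract reports that including ambiguous samples helped generalizability; the window values used here are placeholders and not values from the paper.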


https://doi.org/10.1021/acs.jcim.3c00269

J. Chem. Inf. Model. 2023, 63, 4552−4559

Journal of Chemical Information and Modeling

pubs.acs.org/jcim

