Machine learning identifies proteomic risk factors across 23 diseases


Meng L., Li M., Kong X., Zhang T., Álvez M. B., Liao X., ...

iScience, vol. 29, no. 2, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Volume: 29, Issue: 2
  • Publication Date: 2026
  • DOI: 10.1016/j.isci.2026.114687
  • Journal Name: iScience
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Directory of Open Access Journals
  • Keywords: machine learning, medicine, proteomics
  • Atatürk University Affiliated: Yes

Abstract

Achieving minimally invasive and rapid detection is a crucial goal in modern medicine. The comprehensive characterization of the blood proteome holds great promise in advancing our understanding of disease etiology, facilitating early diagnosis, risk stratification, and improved monitoring across various diseases and their subtypes. In this study, we collected plasma proteomes from over 3000 patients, representing 23 distinct diseases, encompassing a total of 1462 proteins. Based on histological knowledge, we developed a two-stage hierarchical multi-disease classifier and applied it to perform multi-disease classification on the collected proteomic data. Our results demonstrate that this empirically guided two-stage hierarchical multi-disease classifier outperforms traditional machine learning algorithms in terms of prediction performance, showing better balance and more meaningful feature selections. This finding highlights the positive role that domain expertise can play in machine learning-based disease detection, and underscores the potential of plasma proteomics for multi-disease screening.
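
The abstract does not specify the models, the disease groupings, or the features used, so the following is only a minimal sketch of the general idea of a two-stage hierarchical classifier: a first stage assigns a sample to a broad disease group, and a second stage picks the specific disease within that group. Everything named here is an assumption for illustration: the `NearestCentroid` and `TwoStageClassifier` classes, the `disease_to_group` mapping, the placeholder disease labels, and the toy two-feature data. A nearest-centroid rule stands in for whatever learners the study actually used.

```python
from collections import defaultdict
from math import dist


class NearestCentroid:
    """Tiny stand-in classifier: predicts the label of the closest class mean."""

    def fit(self, X, y):
        grouped = defaultdict(list)
        for row, label in zip(X, y):
            grouped[label].append(row)
        # Mean of each feature column, computed separately per label.
        self.centroids = {
            label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in grouped.items()
        }
        return self

    def predict_one(self, row):
        return min(self.centroids, key=lambda label: dist(row, self.centroids[label]))


class TwoStageClassifier:
    """Stage 1 predicts a disease group; stage 2 predicts the disease within it."""

    def __init__(self, disease_to_group):
        # Hypothetical expert-defined mapping from disease label to group label.
        self.disease_to_group = disease_to_group

    def fit(self, X, diseases):
        groups = [self.disease_to_group[d] for d in diseases]
        self.stage1 = NearestCentroid().fit(X, groups)
        self.stage2 = {}
        for group in set(groups):
            idx = [i for i, g in enumerate(groups) if g == group]
            # One second-stage model per group, trained only on that group's samples.
            self.stage2[group] = NearestCentroid().fit(
                [X[i] for i in idx], [diseases[i] for i in idx]
            )
        return self

    def predict_one(self, row):
        group = self.stage1.predict_one(row)
        return group, self.stage2[group].predict_one(row)


# Made-up toy data: two "protein" features, four placeholder diseases in two groups.
disease_to_group = {"A1": "group_A", "A2": "group_A", "B1": "group_B", "B2": "group_B"}
X = [[0.0, 0.0], [0.2, 0.1], [0.0, 2.0], [0.1, 2.2],
     [5.0, 0.0], [5.2, 0.1], [5.0, 2.0], [5.1, 2.2]]
y = ["A1", "A1", "A2", "A2", "B1", "B1", "B2", "B2"]

model = TwoStageClassifier(disease_to_group).fit(X, y)
prediction = model.predict_one([5.1, 1.9])
print(prediction)  # ('group_B', 'B2')
```

In this sketch the domain knowledge enters only through `disease_to_group`: each second-stage model has to separate diseases within a single group rather than all classes at once, which is one plausible reading of how an expert-guided hierarchy could help balance performance across classes.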