IARC 60th Anniversary - 19-21 May 2026
Session: 19/05/26 - Posters
Integrating polygenic and machine learning–derived nutrient risk scores for colorectal cancer risk stratification: Evidence from Korean adults
DAM T. 1, GUNATHILAKE M. 2, LEE J. 2, OH J. 3, CHANG H. 3, SOHN D. 3, SHIN A. 4, KIM J. 2
1 Department of Public Health & AI, Graduate School of Cancer Science and Policy, National Cancer Center, Goyang-si, Korea (Republic of); 2 Department of Cancer Biomedical Science, Graduate School of Cancer Science and Policy, National Cancer Center, Goyang-si, Korea (Republic of); 3 Center for Colorectal Cancer, National Cancer Center Hospital, National Cancer Center, Goyang-si, Korea (Republic of); 4 Department of Preventive Medicine, Seoul National University College of Medicine, Jongno, Korea (Republic of)
Background: Colorectal cancer (CRC) incidence is rising in East Asia, necessitating improved strategies for risk assessment. While polygenic risk scores (PRS) effectively quantify genetic susceptibility, capturing the complexity of dietary exposures remains challenging. Machine learning (ML) offers a powerful data-driven approach to integrate these complex nutrient patterns. However, evidence combining genetic and ML-derived dietary risks in East Asian populations remains limited.
Objectives: This study aimed to construct a machine learning–based Nutrient Risk Score (NRS) for risk stratification, calculate a PRS, and evaluate the independent associations and potential interactions of genetic susceptibility and nutrient-based risk in relation to CRC in Korean adults.
Methods: This hospital-based case–control study included 477 CRC cases and 1,532 controls. PRS was calculated based on established susceptibility variants. An ML pipeline compared five algorithms using nested cross-validation, with recursive feature elimination to identify key nutrient features. Model performance was assessed using six metrics: area under the curve (AUC), F1-score, sensitivity, specificity, precision, and Matthews correlation coefficient (MCC). Logistic regression was used to construct the NRS from the selected features and to evaluate the associations of PRS, NRS, and their combined effects with CRC risk.
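As a rough illustration of the six performance metrics named in the Methods, the sketch below computes them for a binary classifier in plain Python. It is not the authors' pipeline: the function names (`threshold_metrics`, `auc`), the toy labels and scores, and the 0.5 classification cut-off are all assumptions made for this example.

```python
import math

def threshold_metrics(y_true, y_pred):
    """Sensitivity, specificity, precision, F1 and MCC from binary labels (1 = case)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    def ratio(num, den):
        # Return 0.0 instead of failing when a denominator is empty.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    precision = ratio(tp, tp + fp)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    mcc = ratio(tp * tn - fp * fn,
                math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "f1": f1, "mcc": mcc}

def auc(y_true, scores):
    """AUC as the share of case/control pairs in which the case scores higher (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one control")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 2 cases, 2 controls, predicted probabilities.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off

metrics = threshold_metrics(y_true, y_pred)
metrics["auc"] = auc(y_true, scores)
print(metrics)
```

The AUC depends only on how the scores rank cases against controls, while the other five metrics change with the chosen cut-off. With roughly three controls per case, as in this study, MCC and F1 are less flattered by the majority class than raw accuracy would be.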
Results: A total of 21 nutrients (e.g., fructose, cholesterol, and vitamin C) were retained to construct the NRS. In association analyses, both the standard weighted PRS and the inverse-variance weighted PRS were positively associated with CRC risk (OR = 1.23 per 1-SD increase; ORs = 1.53 and 1.52 for the highest versus lowest tertile, respectively). The NRS showed stronger associations with CRC risk, with ORs of 6.35 for the highest versus the lowest tertile and 2.65 per 1-SD increase. No statistically significant interaction was observed between PRS and NRS.
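To make the two effect-size scales in the Results concrete, here is a minimal sketch of how a per-1-SD odds ratio relates to a logistic regression coefficient, and how a crude highest-versus-lowest tertile odds ratio can be computed from a 2x2 table. The function names and the counts are hypothetical, and the calculation is unadjusted with a Woolf (log-scale) confidence interval, so it will not reproduce the model-based ORs reported in the abstract.

```python
import math

def per_sd_odds_ratio(beta_per_unit, sd):
    """OR for a 1-SD increase, given a log-odds coefficient per raw unit of the score."""
    return math.exp(beta_per_unit * sd)

def crude_odds_ratio(top_cases, top_controls, ref_cases, ref_controls):
    """Unadjusted OR (top vs reference tertile) with a Woolf 95% confidence interval."""
    odds_ratio = (top_cases * ref_controls) / (top_controls * ref_cases)
    se_log_or = math.sqrt(1 / top_cases + 1 / top_controls
                          + 1 / ref_cases + 1 / ref_controls)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper

# A per-SD OR of 1.23 corresponds to a coefficient of ln(1.23) on the standardized score.
beta_standardized = math.log(1.23)
or_per_sd = per_sd_odds_ratio(beta_standardized, sd=1.0)

# Hypothetical counts, NOT taken from the study.
or_top_vs_bottom, ci_low, ci_high = crude_odds_ratio(
    top_cases=20, top_controls=10, ref_cases=10, ref_controls=20
)
print(round(or_per_sd, 2), round(or_top_vs_bottom, 2), round(ci_low, 2), round(ci_high, 2))
```

Because the per-SD OR is multiplicative on the odds scale, a 2-SD difference under the same model corresponds to the per-SD OR squared.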
Conclusions/Implications for practice or policy: Both PRS and NRS were independently associated with CRC risk. These findings suggest that the NRS may serve as a valuable marker for risk discrimination, likely capturing a composite metabolic profile associated with CRC. Future studies incorporating longitudinal designs, larger sample sizes, and multi-omic data are warranted to further elucidate CRC risk determinants and to inform personalized risk assessment strategies.