IARC 60th Anniversary - 19-21 May 2026
Session : 19/05/26 - Posters
Interpretable Machine Learning for Predicting Cancer-Related Sarcopenia: A Validated Tool for Clinical Implementation and Public Health Action
ZHANG G. 2, ZENG Y. 1
1 National University of Singapore, Singapore, Singapore; 2 The First Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
ABSTRACT
Background
Cancer-related sarcopenia—characterized by progressive loss of skeletal muscle mass and function—is frequently underdiagnosed despite its strong association with poor clinical outcomes, including reduced treatment tolerance, increased complications, and decreased survival. Current risk prediction models are limited by small sample sizes and tumor-specific populations, restricting their generalizability and clinical utility. Addressing this gap is essential for translating research into actionable public health interventions and early preventive strategies.
Objectives
To develop and externally validate an interpretable machine learning model for predicting cancer-related sarcopenia that demonstrates robust generalizability across diverse cancer populations, and to create a practical, clinically implementable tool for risk stratification and early intervention.
Methods
We analyzed data from 1,182 adults with cancer from a population-based survey (1999–2006), randomly divided into training (n=827, 70%) and internal test (n=355, 30%) sets. External validation was performed using 269 patients with lung cancer treated at a tertiary center (March 2022–April 2023). From 27 candidate demographic, clinical, and laboratory variables, key predictors were selected using the Boruta algorithm. Four machine learning algorithms (random forest, gradient boosting, logistic regression, and support vector machines) were trained and systematically compared. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC), calibration plots, and decision curve analysis. SHapley Additive exPlanations (SHAP) were used to provide transparent, patient-level explanations of predicted sarcopenia risk.
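As an illustration of the discrimination metric named above, the following is a minimal sketch of a rank-based AUC with a percentile bootstrap 95% confidence interval. The abstract does not state how its intervals were computed, so the bootstrap, the function names (`auc`, `bootstrap_ci`), and the synthetic labels and scores are all assumptions made for illustration; none of the numbers below are study data.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic example only: 200 hypothetical patients, roughly 13% positive,
# with predicted risks drawn higher on average for positive cases.
rng = random.Random(42)
demo_labels = [1 if rng.random() < 0.13 else 0 for _ in range(200)]
demo_scores = [rng.gauss(0.6 if y else 0.3, 0.15) for y in demo_labels]

point = auc(demo_labels, demo_scores)
ci_low, ci_high = bootstrap_ci(demo_labels, demo_scores)
print(f"AUC {point:.3f} (95% CI {ci_low:.3f}-{ci_high:.3f})")
```

The pairwise comparison is quadratic in cohort size, which is acceptable for cohorts of a few hundred to a few thousand patients; a sort-based rank formula would be preferable for much larger data.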
Results
Sarcopenia prevalence was 12.9% (153/1,182) in the population-based cohort and 25.7% (69/269) in the lung cancer cohort, highlighting the clinical burden across cancer populations. The Boruta algorithm identified six key predictors, all clinically relevant and biologically plausible. The random forest model performed best among the four algorithms, with an AUC of 0.982 (95% CI 0.974–0.989) in the training set and 0.889 (95% CI 0.840–0.934) in the internal test set. In external validation, the model achieved an AUC of 0.834 (95% CI 0.783–0.880) in the lung cancer cohort, supporting its generalizability. SHAP waterfall plots provided transparent, individualized risk explanations, enabling clinicians to see which factors contribute to each patient's sarcopenia risk and to tailor interventions accordingly.
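The additive property behind a SHAP waterfall plot can be sketched with exact Shapley values on a toy model. This is not the study's model or its SHAP implementation: the three-feature function `toy_risk`, the all-zero baseline, and the brute-force coalition enumeration are assumptions chosen so the example stays small and checkable. The point is only that the per-feature contributions sum to the difference between the patient's prediction and the baseline prediction.

```python
from itertools import combinations
from math import factorial


def shapley_values(f, x, baseline):
    """Exact Shapley values by enumerating all coalitions.
    Features outside a coalition are set to their baseline value."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return f(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = set(coalition)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(z):
    """Hypothetical risk score with one main effect and one interaction."""
    return 0.1 + 0.3 * z[0] + 0.2 * z[1] * z[2]


patient = [1.0, 1.0, 1.0]    # hypothetical feature values
reference = [0.0, 0.0, 0.0]  # hypothetical baseline

contributions = shapley_values(toy_risk, patient, reference)
base_value = toy_risk(reference)
prediction = toy_risk(patient)

# Waterfall-style listing: start at the baseline and add each contribution.
running = base_value
for name, c in zip(["feature_1", "feature_2", "feature_3"], contributions):
    running += c
    print(f"{name}: {c:+.3f} -> {running:.3f}")
```

Enumeration costs grow exponentially with the number of features, so it is only practical for very small models; tree-specific SHAP algorithms exist for random forests.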
Conclusions/Implications for Practice or Policy
We developed an interpretable, high-performing machine learning model that accurately predicts cancer-related sarcopenia and demonstrates strong generalizability from population-based cancer cohorts to real-world clinical settings with higher sarcopenia prevalence. The parsimonious six-feature model with SHAP-based interpretability provides a practical, implementable tool for risk stratification in diverse healthcare settings. This model enables early identification of high-risk patients, facilitating timely preventive interventions such as nutritional support and exercise programs. By bridging the gap between predictive analytics and clinical action, this tool supports translation of cancer research into public health practice and provides evidence-based frameworks for policy development in cancer survivorship care and sarcopenia prevention programs.
Resume:
Employment: Assistant Professor, National University of Singapore since 2025
Education: Doctor of Philosophy in Rehabilitation Sciences, The Hong Kong Polytechnic University, Hong Kong, China

Conference Abstract