The role of artificial intelligence in predicting febrile neutropenia in cancer patients

From the Lyman research group, Public Health Sciences Division

Chemotherapy, a cornerstone of cancer treatment, often results in adverse events that carry significant physical, emotional, and economic repercussions. One particularly critical adverse event is febrile neutropenia (FN), characterized by a fever and a substantial drop in neutrophils, a crucial type of white blood cell. FN is a common and serious side effect of chemotherapy that necessitates immediate medical intervention due to its high morbidity, mortality, and associated healthcare costs. Identifying patients at high risk for FN is essential for tailoring preventive measures and improving clinical outcomes. In a recent article published in Cancer Investigation, Drs. Gary H. Lyman and Nicole M. Kuderer delve into the application of artificial intelligence (AI) in creating risk prediction models for FN among cancer patients undergoing chemotherapy.

FN is specifically defined as a single oral temperature of 38.3°C or higher, or a temperature of 38.0°C or higher sustained for more than one hour, in patients with neutropenia, a condition marked by a neutrophil count below 1,000 cells per microliter. The initial cycle of chemotherapy poses the greatest risk for FN, especially when patients are administered the full treatment dose.
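
For readers who think in code, the short sketch below simply encodes the criteria described above as a yes/no check. It is illustrative only, not a clinical tool; the function name and inputs are hypothetical.

```python
# Minimal sketch encoding the FN criteria described above (illustrative only).
def meets_fn_criteria(oral_temp_c: float,
                      sustained_temp_c: float,
                      sustained_minutes: float,
                      anc_cells_per_ul: float) -> bool:
    """Return True if both the fever and neutropenia criteria are met."""
    neutropenic = anc_cells_per_ul < 1000          # neutrophil count below 1,000 cells/uL
    fever = (oral_temp_c >= 38.3 or                # single oral reading of 38.3 C or higher
             (sustained_temp_c >= 38.0 and sustained_minutes > 60))  # 38.0 C for over an hour
    return neutropenic and fever

# Example: a single oral reading of 38.5 C with a neutrophil count of 400 cells/uL meets the criteria.
print(meets_fn_criteria(38.5, 0.0, 0.0, 400))  # True
```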

Accurate clinical assessment of FN risk remains challenging for clinicians due to various factors. FN risk is influenced by the severity and duration of neutropenia post-chemotherapy, but individual patient responses to chemotherapy can vary widely. Clinicians must consider a multitude of variables, including patient history, type of cancer, specific chemotherapy regimens, and individual patient health conditions, making precise risk prediction complex and often unreliable.

To address these challenges, formal risk models have been developed to better identify high-risk patients and aid in personalized treatment decisions. These models are evaluated on two properties: discrimination, their ability to differentiate between low-risk and high-risk patients, and calibration, the concordance between predicted and observed outcomes.
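
As a concrete picture of these two properties, the following is a minimal sketch using scikit-learn on made-up predictions; the arrays are invented for illustration and do not come from any FN study.

```python
# Discrimination and calibration on toy predictions (illustrative values only).
import numpy as np
from sklearn.metrics import roc_auc_score, brier_score_loss
from sklearn.calibration import calibration_curve

y_true = np.array([0, 0, 1, 0, 1, 1, 0, 0, 1, 0])                      # observed FN events (1 = FN)
y_prob = np.array([.05, .10, .80, .20, .65, .90, .15, .30, .70, .25])  # predicted risks

# Discrimination: how well predicted risks separate FN from non-FN patients.
print("AUROC:", roc_auc_score(y_true, y_prob))

# Calibration: agreement between predicted and observed event rates.
print("Brier score:", brier_score_loss(y_true, y_prob))
frac_observed, mean_predicted = calibration_curve(y_true, y_prob, n_bins=2)
print("Observed vs. predicted risk per bin:", list(zip(frac_observed, mean_predicted)))
```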

Traditional risk models for FN typically use multivariable logistic regression analysis. In recent years, efforts have focused on incorporating larger and more diverse patient cohorts to enhance model accuracy. For instance, a large US prospective study involving over 4,000 patients developed a validated FN risk model based on logistic regression, achieving an area under the receiver operating characteristic curve (AUROC) of 0.833, indicating good discriminative ability. The AUROC is a key performance metric for classification models, with values approaching 1.0 indicating better discrimination between risk levels.
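
To show what a regression-based risk model of this kind looks like in practice, here is a minimal sketch that fits a logistic regression to synthetic data and reports its AUROC. The features, coefficients, and sample are invented for illustration; this is not a reproduction of the published model.

```python
# Logistic regression risk model on synthetic data (illustrative, not the published model).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 4000
X = np.column_stack([
    rng.integers(30, 85, n),       # age (years)
    rng.integers(0, 2, n),         # prior chemotherapy (0/1)
    rng.normal(1.5, 0.5, n),       # baseline neutrophil count (x10^3 cells/uL)
])
logit = -3 + 0.03 * X[:, 0] + 0.8 * X[:, 1] - 0.9 * X[:, 2]
y = rng.random(n) < 1 / (1 + np.exp(-logit))   # simulated FN outcome

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("AUROC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```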

To further improve the predictive power of these models, researchers have explored adding new risk factors and employing advanced statistical techniques. Variable shrinkage methods, such as the Least Absolute Shrinkage and Selection Operator (LASSO) and Ridge Regression, are used to prevent overfitting, a common issue where a model performs well on training data but poorly on new, unseen data. LASSO aids in variable selection by shrinking less important coefficients to zero, whereas Ridge Regression reduces the variance of coefficients by shrinking them toward zero without eliminating them completely.
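
The contrast between the two penalties can be seen directly in their fitted coefficients. The sketch below applies L1 (LASSO) and L2 (Ridge) penalties to a logistic model on synthetic data with ten candidate risk factors, of which only two matter; everything here is illustrative.

```python
# L1 (LASSO) versus L2 (Ridge) penalties in a logistic risk model (synthetic data).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 10))                               # 10 candidate risk factors
y = (X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=500)) > 0     # outcome driven by only 2 of them

lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.1).fit(X, y)
ridge = LogisticRegression(penalty="l2", C=0.1).fit(X, y)

# L1 drives uninformative coefficients exactly to zero (variable selection);
# L2 shrinks them toward zero but keeps every variable in the model.
print("LASSO coefficients:", np.round(lasso.coef_[0], 2))
print("Ridge coefficients:", np.round(ridge.coef_[0], 2))
```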

The authors also discuss the potential of deep learning (DL) methods in risk prediction modeling. Deep learning, a subset of machine learning, uses algorithms loosely inspired by the human brain to process data and detect patterns for decision-making. DL algorithms consist of multiple layers of processing that ‘learn’ data representations, adjusting parameters at each layer to enhance model performance. Artificial Neural Networks (ANNs) are a type of DL algorithm inspired by the neural structure of the human brain. ANNs consist of interconnected nodes (neurons) organized into input, output, and intermediate (hidden) layers where computations occur. Connections between these nodes carry weights that are adjusted during training to minimize prediction errors through a process called backpropagation. Gradient boosting, another advanced machine learning technique, builds models sequentially, with each new model correcting the errors of the previous ones; this iterative process enhances overall accuracy. XGBoost, a popular implementation of gradient boosting, incorporates LASSO (L1) and Ridge (L2) regularization penalties to mitigate overfitting and efficiently handle large datasets.
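
To make the comparison tangible, the sketch below trains a small feed-forward neural network and a gradient boosting model on the same synthetic data and scores both by AUROC. Layer sizes, tree counts, and the data are arbitrary illustrative choices, not recommendations from the article.

```python
# A small neural network versus gradient boosting on synthetic risk data (illustrative).
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)
X = rng.normal(size=(2000, 8))
y = (X[:, 0] + X[:, 1] ** 2 - X[:, 2] + rng.normal(size=2000)) > 0.5
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Feed-forward ANN: two hidden layers whose weights are tuned by backpropagation.
ann = MLPClassifier(hidden_layer_sizes=(16, 8), max_iter=2000, random_state=0).fit(X_tr, y_tr)

# Gradient boosting: each new shallow tree corrects the errors of the ensemble so far.
# (XGBoost implements the same idea and adds L1/L2 penalties via its reg_alpha and reg_lambda settings.)
gbm = GradientBoostingClassifier(n_estimators=200, max_depth=3, random_state=0).fit(X_tr, y_tr)

for name, m in [("ANN", ann), ("Gradient boosting", gbm)]:
    print(name, "AUROC:", round(roc_auc_score(y_te, m.predict_proba(X_te)[:, 1]), 3))
```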

Despite the promise of deep learning methods, current studies indicate minimal improvement over traditional regression-based models. For example, a 2020 study comparing various machine learning algorithms, including support vector machines, decision trees, and ANNs, found AUROC values ranging from 0.855 to 0.905, with no single algorithm significantly outperforming the others. However, the study had limitations, including a small sample size, potential selection bias reflected in FN rates higher than those reported in other breast cancer studies, and the lack of formal statistical comparisons of model performance between algorithms. These factors limit the generalizability and utility of the results.
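
For a sense of how such side-by-side comparisons are typically run, the sketch below scores several classifiers with cross-validated AUROC on the same synthetic data. It only illustrates the workflow and does not reproduce the 2020 study or its results.

```python
# Cross-validated AUROC comparison of several classifiers on synthetic data (illustrative).
import numpy as np
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
X = rng.normal(size=(600, 6))
y = (X[:, 0] - X[:, 1] + rng.normal(size=600)) > 0

models = {
    "Logistic regression": LogisticRegression(max_iter=1000),
    "Support vector machine": SVC(probability=True),
    "Decision tree": DecisionTreeClassifier(max_depth=4),
    "Neural network": MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000),
}
for name, model in models.items():
    auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc")
    print(f"{name}: mean AUROC = {auc.mean():.3f}")
```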

The authors conclude by emphasizing the need for further research to validate these models across diverse patient populations and in real-world clinical settings. The Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD) Statement provides guidelines for the rigorous development and validation of prediction models. These guidelines emphasize the importance of sufficient sample sizes, appropriate internal and external validation, and transparency in reporting to reduce the risk of bias and enhance the reliability of the models.

The key takeaway from this article is that while AI, particularly machine learning techniques, offers promising advancements in risk prediction for FN in cancer patients, these methods must be thoroughly validated. Current evidence suggests only modest improvements over traditional models, highlighting the need for continued research and methodological rigor to realize the full potential of AI in clinical oncology. The ultimate goal is to enhance clinical decision-making, optimize patient outcomes, and ensure the effective and safe delivery of chemotherapy.


No funding was associated with this work.

Lyman, G. H., & Kuderer, N. M. (2024). Artificial Intelligence and Cancer Clinical Research: III. Risk Prediction Models for Febrile Neutropenia in Patients Receiving Cancer Chemotherapy. Cancer Investigation, 1–5. Advance online publication.
