Machine learning model identifies glioblastoma patients with poor survival prognosis

Glioblastoma (GBM) is the most common type of primary brain cancer, with a median overall survival of less than 15 months despite aggressive treatment. At the genomic level, GBM is characterized by several changes in chromosome structure, referred to as somatic copy number alterations (SCNAs). Multiple SCNAs are associated with GBM, and identifying them provides a valuable marker for prognosis assessment. However, molecular characterization of GBMs, via whole-exome sequencing (WES) and SCNA determination, is not routine practice in clinical settings. Magnetic resonance imaging (MRI) data, on the other hand, are widely available. Efforts to computationally identify MRI radiographic signatures associated with tumor genomic profiles have encountered several challenges, including the complexity of MRI data, the limited number of patient samples, and high variability introduced by other factors such as differences among MRI scanner manufacturers.

Researchers in the Cimino lab at the Department of Laboratory Medicine and Pathology at the University of Washington (UW) and the Human Biology Division at Fred Hutch successfully trained a machine learning model on targeted features from standard MRI scans to accurately classify glioblastoma patients into two survival-associated groups. By identifying a set of features that appeared more frequently in MRI data from the poor-survival group and using those features to train the machine learning models, the researchers reduced the dimensionality of the MRI data and improved the performance of machine learning classification algorithms. The new study was recently published in the journal Neuro-Oncology Advances. Nicholas Nuechterlein, a Ph.D. student in the Paul G. Allen School of Computer Science and Engineering at UW and first author of the study, highlighted the significance of this work: “This study outlines a method capable of identifying glioblastoma patients predisposed to poor survival from non-invasive MRI scans alone. Clinically, these are the patients we would like to put on upfront clinical trials because they are the patients whose first-line therapy is most likely to fail.”

MRI feature selection schematic: feature selection pipeline starting with MRI-extracted features and ending with 288 radiogenomic features. Image provided by Dr. Patrick J. Cimino.

In the study, the researchers focused on gliomas classified by the mutation status of the gene isocitrate dehydrogenase (IDH). IDH-wildtype gliomas are associated with lower survival compared to IDH-mutant gliomas. Furthermore, IDH-wildtype gliomas fall into two genetically distinct groups, Group 1 and Group 2, which are associated with different survival outcomes. This grouping cannot be predicted from clinical factors or epigenetic signatures; thus, whole-exome sequencing or SCNA data are necessary to classify patients. To investigate the potential of radiology for distinguishing Group 1 from Group 2, the researchers sought to build a prediction model based on radiographic signatures, that is, features that appear more commonly in the poor-survival group. To extract MRI features, the tumor is first segmented from the whole-brain MRI image. Then, the tumor region is isolated, and the image is transformed so that interpretable histogram and texture features can be extracted across MRI sequences, tumor regions, and image transformations.
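The extraction step described above can be pictured with a small sketch. The Python example below assumes NumPy, a synthetic 3D image, and a hypothetical binary tumor mask, and it computes a handful of first-order histogram features inside the masked region. The function name, the synthetic data, and the choice of features are illustrative assumptions; this is not the pipeline used in the study.

```python
import numpy as np


def histogram_features(volume, mask, bins=8):
    """Summarize voxel intensities inside a binary mask (illustrative only)."""
    voxels = volume[mask.astype(bool)]  # keep only voxels inside the region
    if voxels.size == 0:
        raise ValueError("mask selects no voxels")

    mean = voxels.mean()
    std = voxels.std()
    # Skewness is the third standardized moment; use 0 for a constant region.
    skew = 0.0 if std == 0 else float(((voxels - mean) ** 3).mean() / std ** 3)

    # Shannon entropy (in bits) of the binned intensity distribution.
    counts, _ = np.histogram(voxels, bins=bins)
    probs = counts / counts.sum()
    nonzero = probs[probs > 0]
    entropy = float(-(nonzero * np.log2(nonzero)).sum())

    return {
        "mean": float(mean),
        "std": float(std),
        "skewness": skew,
        "entropy": entropy,
        "p10": float(np.percentile(voxels, 10)),
        "p90": float(np.percentile(voxels, 90)),
    }


# Synthetic stand-in for an MRI volume and a segmented tumor region.
rng = np.random.default_rng(0)
volume = rng.normal(100.0, 10.0, size=(16, 16, 16))
mask = np.zeros(volume.shape, dtype=bool)
mask[4:12, 4:12, 4:12] = True   # hypothetical tumor segmentation
volume[mask] += 50.0            # make the "tumor" region brighter

features = histogram_features(volume, mask)
```

In a real setting, a function like this would be applied to each combination of MRI sequence, tumor region, and image transformation, which is how a modest set of feature types can grow into a very large feature table.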

The most important features for the prediction task were then selected by focusing on characteristics shared across several features (feature components) rather than on individual features. Additional principal component analysis reduced the dimensionality of the data from hundreds of feature components to only 15, which were then used to train machine learning models to predict patients’ placement into IDH-wildtype molecular Group 1 versus Group 2. The algorithm revealed that features describing contours in the peritumoral edema and the infiltrating portions of glioblastoma visible on the T2-weighted FLAIR MRI sequence (a type of MRI acquisition) may differentiate these profiles. Importantly, these features might not be discernible by the human eye. Comparison with similar algorithms revealed that the tailored radiogenomic feature selection outperforms all-purpose feature selection methods.
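The idea of reducing dimensionality before classification can be illustrated with a minimal sketch. The Python example below assumes NumPy and entirely synthetic data: it projects a wide feature matrix onto its top 15 principal components (the number mentioned above) using a singular value decomposition, then fits a simple nearest-centroid classifier as a stand-in. The study's actual models, features, and data are not reproduced here, and all function and variable names are hypothetical.

```python
import numpy as np


def fit_pca(X, n_components):
    """Return the training mean and top principal axes of X (rows = patients)."""
    mean = X.mean(axis=0)
    # SVD of the centered data; the rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def transform(X, mean, axes):
    """Project X onto the principal axes learned from the training data."""
    return (X - mean) @ axes.T


def fit_centroids(Z, y):
    """Compute one centroid per class in the reduced space."""
    return {label: Z[y == label].mean(axis=0) for label in np.unique(y)}


def predict(Z, centroids):
    """Assign each row of Z to the class with the nearest centroid."""
    labels = np.array(list(centroids))
    dists = np.stack(
        [np.linalg.norm(Z - c, axis=1) for c in centroids.values()], axis=1
    )
    return labels[dists.argmin(axis=1)]


# Synthetic data: few patients, many features (an assumption for illustration).
rng = np.random.default_rng(0)
n_patients, n_features = 60, 300
y = np.repeat([1, 2], n_patients // 2)   # hypothetical Group 1 / Group 2 labels
X = rng.normal(size=(n_patients, n_features))
X[y == 2, :20] += 3.0                    # synthetic group difference

# Hold out every third patient; fit PCA on the training patients only.
train = np.arange(n_patients) % 3 != 0
test = ~train
mean, axes = fit_pca(X[train], 15)
Z_train = transform(X[train], mean, axes)
Z_test = transform(X[test], mean, axes)

centroids = fit_centroids(Z_train, y[train])
accuracy = float((predict(Z_test, centroids) == y[test]).mean())
```

Fitting the projection on the training patients alone, and only then applying it to held-out patients, keeps information from the test set out of the model, which matters most when the cohort is small.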

“We hope that our methods, which extract imaging features important to glioblastoma survival, will be effective in other medical imaging studies where, like ours, the number of patients is small and their images very large. A critical step forward will be to visualize the features that set these especially aggressive tumors apart and, in doing so, help neuroradiologists decide where to look and what to look for,” added Nuechterlein.

Nuechterlein, N, Li, B, Feroze, A, Holland, EC, Shapiro, L, Haynor, D, Fink, J, & Cimino, PJ (2021). Radiogenomic modeling predicts survival-associated prognostic groups in glioblastoma. Neuro-Oncology Advances, 3(1), vdab004. https://doi.org/10.1093/noajnl/vdab004

UW/FHCRC cancer consortium members Patrick J. Cimino, James Fink, David Haynor, Laura Shapiro, and Eric Holland contributed to this work.

This work was funded by the National Science Foundation Graduate Research Fellowship Program and a grant from the National Institutes of Health.