By Warren Froelich

Researchers from the University of Florida are using machine learning algorithms to help identify new risk factors that might explain the early onset of colorectal cancer among patients under the age of 50 years. In preliminary findings, presented during the AACR Virtual Special Conference: Artificial Intelligence, Diagnosis, and Imaging, the researchers said their computational models revealed that chronic immunosuppression from infections such as HIV and human papillomavirus (HPV), as well as inflammation from obesity, sinusitis, and dermatitis, are likely contributors to the onset of colorectal cancer among younger individuals.

In the United States in 2020, about 147,950 people were diagnosed with colorectal cancer and 53,200 died from the disease, according to the American Cancer Society. These totals include 17,930 cases among individuals younger than 50, or roughly 12 percent of all cases, and 3,640 deaths.


"Incidence of early-onset colorectal cancer has been rising 2 percent annually since 1994," said Michael Quillen, a medical student at the University of Florida who presented the study with Taylor (Max) Parker, a fellow medical student.


Perhaps as many as half of these cases can be accounted for by genetic syndromes or predisposing conditions, such as irritable bowel disorder, Quillen added. But what is causing the remaining cancers among young individuals, as well as the increasing incidence, remains something of a mystery.


Study Details

To gain insight into this disturbing trend, the researchers retrieved electronic health record data from the OneFlorida Clinical Data Research Network for 1,227 individuals with colorectal cancer, matched against 34,157 controls. The median age of both cases and controls was about 35 years.


The team trained four machine learning algorithms to analyze the electronic health data, creating separate computer models for colon cancer and rectal cancer. The cohorts were further divided into "prediction windows" of 0, 1, 3, and 5 years prior to diagnosis, with each patient matched against a control based on age at an encounter date close to the case index date.
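
One plausible reading of a "prediction window" is that, for an N-year window, a model only sees records dated at least N years before the index date. The Python sketch below illustrates that reading with invented data. The helper name, the record layout, and the 365-day year are assumptions made for illustration, not details reported by the researchers.

```python
from datetime import date, timedelta

def records_in_window(records, index_date, window_years):
    """Keep records dated on or before index_date minus window_years.

    Hypothetical helper: `records` is a list of (date, code) tuples,
    and a year is approximated as 365 days (an assumption).
    """
    cutoff = index_date - timedelta(days=365 * window_years)
    return [(d, code) for d, code in records if d <= cutoff]

# Toy patient history, invented for illustration only.
history = [
    (date(2012, 3, 1), "asthma"),
    (date(2016, 6, 15), "hypertension"),
    (date(2018, 11, 2), "anemia"),
]
index_date = date(2019, 1, 10)

# Show which records remain visible at each prediction window.
for years in (0, 1, 3, 5):
    kept = records_in_window(history, index_date, years)
    print(years, [code for _, code in kept])
```

In this toy history, the number of usable records shrinks from three at the 0-year window to one at the 3-year and 5-year windows, which shows how wider windows leave less data per patient.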


Data were split into a training set (80%) for training the models and a testing set (20%) used to measure model performance. A notable trend in the prediction results was that sensitivity decreased across prediction windows, as the data available per patient decreased, for both the rectal cancer and colon cancer cohorts.
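
For readers unfamiliar with these terms, an 80/20 split and a sensitivity calculation can be sketched in a few lines of plain Python. This is a generic illustration with made-up labels, not the researchers' pipeline. The function names, the fixed random seed, and the toy data are all assumptions.

```python
import random

def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle items and hold out test_fraction of them for testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

def sensitivity(y_true, y_pred):
    """True positive rate: share of actual cases the model flags."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn) if (tp + fn) else float("nan")

# 100 hypothetical patient IDs split 80/20.
patients = list(range(100))
train, test = train_test_split(patients)
print(len(train), len(test))

# Toy labels: 1 = case, 0 = control. Three of four cases are caught.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
print(sensitivity(y_true, y_pred))
```

Sensitivity counts only how many true cases are caught, so it falls whenever a model with less information per patient misses more of them.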


The researchers found that the area under the curve (AUC) for 0-year and 1-year predictions was significant, ranging from 0.64 to 0.75 across all algorithms for both rectal cancer and colon cancer. As the prediction window widened, performance dropped to as low as 0.35 at the 5-year prediction window.
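
To put these numbers in context, AUC can be read as the probability that a randomly chosen case receives a higher risk score than a randomly chosen control, so 0.5 corresponds to random ranking and values below 0.5 are worse than chance. The sketch below computes AUC that way in plain Python. The function name and the toy scores are illustrative assumptions, and the study's own evaluation code is not described in the source.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    Ties between a case and a control count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example: 2 cases, 3 controls, invented risk scores.
y_true = [1, 1, 0, 0, 0]
scores = [0.9, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(y_true, scores)
print(round(auc, 3))

# Reversing the ranking gives the mirror image, 1 - AUC.
inverted = roc_auc(y_true, [1 - s for s in scores])
print(round(inverted, 3))
```

Here five of the six case-control pairs are ranked correctly, and reversing the scores ranks only one of six correctly, which is the below-chance situation an AUC under 0.5 describes.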


As the diagnostic date drew closer, the model identified classic predictors of colon cancer, such as abdominal pain, anemia, and hemorrhages, symptoms for which machine learning is not required. Other top predictors were inflammation disorders, including chronic sinusitis, anxiety, and atopic dermatitis.


But the model identified two other risk factors not previously considered classic symptoms in the diagnosis of colon cancer: primary hypertension and cough/asthma. Both "may make sense in a 0-year cohort of the lung," said Parker. "However, we saw these become features in our 1-year and 3-year and then drop out at 5 years. So, we're looking to identify what's underlying those in correlation to colon cancer."


For rectal cancer, top predictors included obesity, female gender, anxiety, and asthma, in addition to immunosuppressive risk factors brought about by infectious disease, such as genital warts associated with HPV and infection with HIV.


Interestingly, amoxicillin was shown to be an "increasingly important" risk factor the farther from the diagnostic date it was used.


"Rectal cancer literature has identified HIV and HPV as potential prognosticators and this finding corroborates those," Parker said.


Clinical data so far have shown mixed results, the researchers noted: some studies link chronic immunosuppression to an inverse association with colorectal cancer, some to a positive association, and others find no association.


"Our hypothesis is maybe the 50-year and older population is diluting the risk ratio and we're looking at a study going forward that may just include the under 50-year old population," Parker said.


As for other next steps, Parker and Quillen said they hope to validate their findings in a second, independent dataset and, if validated, then consider a prospective cohort study. The researchers also wish to explore the link found between early-onset colorectal carcinogenesis and hypertension, cough, and the use of amoxicillin.


Computationally, the researchers said they'll continue to search for improvements in their model's performance, with an ultimate goal of clinical use.


"This preliminary study provides early insight into the capacity of artificial intelligence to uncover new risk factors in the population of patients with onset of young-onset colorectal cancer, with more algorithm refinement and risk factor exploration underway," the researchers wrote in an abstract of their research.


Warren Froelich is a contributing writer.