AI Tool Accurately Sorts Cancer Patients by Their Likely Outcomes

A new artificial intelligence-based method accurately sorts cancer patients into groups that have similar characteristics before treatment and similar outcomes after treatment, according to a study led by investigators at Weill Cornell Medicine. The new approach has the potential to enable better patient selection in clinical trials and better treatment selection for individual patients.

The study, published May 12 in Nature Communications, was a collaboration with Regeneron Pharmaceuticals, and addressed a problem that many pharma companies and physicians face: how to predict which patients will have the best responses to a drug. The results showed that the new method’s ability to predict treatment outcomes from health record data was better than that of the standard statistical and machine learning methods it was compared with.

Dr. Fei Wang

“We’re hopeful that this approach ultimately will be useful for testing and targeting treatments across a wide range of diseases,” said senior author Dr. Fei Wang, founding director of the Institute of AI for Digital Health in the Department of Population Health Sciences and a professor of population health sciences at Weill Cornell Medicine.

Machine learning has long been a promising tool for finding subtle but meaningful patterns in large datasets, including medical datasets. However, although these systems can stratify patients into well-defined groupings based on broad similarities in their health data, those groupings don’t always correspond closely to the patients’ future treatment responses.

Study co-author Dr. Ying Li, a scientist at Regeneron who works on treatment response prediction, recently approached Dr. Wang to see if his group could help develop a better solution for this problem.

“Our goal was to develop a platform that sorts patients with the target disease who are receiving the same treatment into groups sharing similar baseline characteristics and treatment outcomes,” Dr. Li said. “We validated this method using a real-world database of advanced non-small cell lung cancer patients treated with immune checkpoint inhibitors.”

Dr. Ying Li

Study first author Dr. Weishen Pan, a postdoctoral research associate in the Wang Laboratory, led the development of the new machine learning platform, “training” it on the deidentified health records of 3,225 patients with lung cancer in a commercial database. Each patient record contained 104 different variables covering items such as blood test results, prescriptions, medical history and tumor stage.

In this initial effort, the platform sorted the patients into three groups. In the group that had the longest mean overall survival time from the start of treatment, most patients (55.5%) were women, and the rates of other disorders such as diabetes and heart failure were relatively low. In contrast, the shortest-surviving group had less than half the mean survival time of the first group, consisted mostly of men (66.2%), and had relatively high rates of tumor metastases as well as abnormal blood test results reflecting inflammatory, liver and kidney problems.

Dr. Weishen Pan

“Using a metric called the concordance index, we showed that the average performance of this new approach at predicting patient survival times was superior to that of standard statistical and machine learning methods,” Dr. Pan said.
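For readers unfamiliar with the metric, the concordance index can be illustrated with a short sketch. It measures how often a model ranks a pair of patients in the right order: among pairs where it is known who survived longer, what fraction did the model assign the higher risk to the patient with the shorter survival? A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking. The code below is a simplified version of Harrell's concordance index, written for illustration only. The function name, the handling of ties, and the toy data are assumptions made here; the article does not specify which variant the study used, and none of these numbers come from the study.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C-index (illustrative sketch, not the study's code).

    times:       observed follow-up time for each patient
    events:      True if death was observed, False if the patient was censored
    risk_scores: model output, where a higher score means shorter expected survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # tied times are skipped in this simplified version
        # Let `a` be the patient with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            continue  # shorter time is censored, so the true ordering is unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # model ranked the pair correctly
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: survival in months, event flags, and model risk scores.
toy_times = [5, 10, 15, 20]
toy_events = [True, True, False, True]
toy_risk = [0.9, 0.6, 0.7, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
print(toy_c)  # 4 of the 5 comparable pairs are ranked correctly
```

In this toy example, the pair formed by the censored third patient and the fourth patient is excluded, because it is unknown which of the two survived longer. Of the remaining five pairs, the model misorders only one, giving an index of 0.8.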

The team applied their trained machine learning system to a new dataset covering 1,441 patients with non-small-cell lung cancer and found that it yielded almost identical groupings in terms of baseline characteristics and survival times.

Dr. Wang, Dr. Li and their colleagues now plan to further develop and test the new approach for patient stratification in clinical trials of new pharmaceuticals as well as for individual treatment selection. Their platform’s reproducible groupings of patients and outcomes suggest, moreover, that such tools could also be used to gain basic insights into disease biology.

“We’ll probably need more than electronic health record data for this, but we do want to understand the biological mechanisms that explain these distinct patient subgroups,” Dr. Wang said.

Many Weill Cornell Medicine physicians and scientists maintain relationships and collaborate with external organizations to foster scientific innovation and provide expert guidance. The institution makes these disclosures public to ensure transparency. For this information, see the profile for Dr. Fei Wang.

The research reported in this story was supported by Regeneron.

Weill Cornell Medicine
Office of External Affairs
Phone: (646) 962-9476