A new artificial intelligence (AI) model predicts breast cancer in mammography scans more accurately than radiologists, reducing both false positives and false negatives, reports a large international study from Google, Northwestern Medicine and two screening centers in the United Kingdom (U.K.).
“This is a huge advance in the potential for early cancer detection,” said Northwestern study co-author Dr. Mozziyar Etemadi. “Breast cancer is one of the highest causes of cancer mortality in women. Finding cancer earlier means it can be smaller and easier to treat. We hope this will ultimately save a lot of lives.”
Etemadi is a research assistant professor of anesthesiology at Northwestern University Feinberg School of Medicine and of biomedical engineering at the McCormick School of Engineering. He also is a Northwestern Medicine physician and a member of the Robert H. Lurie Comprehensive Cancer Center of Northwestern University.
Breast cancer is the most common type of cancer in women globally, occurring in about one in eight women.
Mammography is the most widely used breast cancer screening tool, but diagnosing cancer from these images is a challenge. Radiologists miss one in five cases of breast cancer and, according to the American Cancer Society, 50% of all women who undergo screening over a 10-year period will experience a false positive, in which cancer is wrongly suspected.
A false positive can lead to overtreatment with invasive biopsies and unnecessary stress for patients. A false negative can result in delayed detection and treatment.
The international research team worked together to build an AI model to address these shortcomings.
“Computers are really good at these tasks,” said co-lead author Scott McKinney, a Google software engineer. “We hope someday this tool for radiologists becomes as ubiquitous as spell-check for writing e-mail.”
The team used fully de-identified mammograms accompanied by biopsy-proven outcomes and longitudinal follow-up to train a deep-learning AI model to identify breast cancer in screening images. The model was tested against a new set of mammograms from the U.K., where screening occurs every three years, and from the U.S., where screening occurs every one to two years. The model's predictions were then compared with the decisions made in clinical practice and with those gathered from six radiologists in an independent study.
The study was published Jan. 1 in Nature.
Key findings
- Absolute reduction of 9.4 percentage points in the U.S. and 2.7 percentage points in the U.K. in false negatives (when a mammogram is incorrectly deemed normal even though breast cancer is present).
- Absolute reduction of 5.7 percentage points in the U.S. and 1.2 percentage points in the U.K. in false positives (when a mammogram is incorrectly deemed abnormal even though no cancer is actually present).
- Evidence that the system can generalize from what it learned at the U.K. sites to the U.S. site, which suggests it may be applicable to different populations.
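For readers unfamiliar with the term, an absolute reduction is the plain difference between two error rates, not a percentage of the original rate. The short Python sketch below illustrates how such figures could be computed from screening outcomes. It is not the study's code: the class, the function and every count in it are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningCounts:
    """Outcome counts for one reader (human or model) on a set of mammograms."""
    true_positives: int    # cancer present, flagged as abnormal
    false_positives: int   # no cancer, flagged as abnormal
    true_negatives: int    # no cancer, deemed normal
    false_negatives: int   # cancer present, deemed normal

    def false_negative_rate(self) -> float:
        # Share of actual cancer cases that were missed.
        cancers = self.true_positives + self.false_negatives
        return self.false_negatives / cancers

    def false_positive_rate(self) -> float:
        # Share of cancer-free cases that were wrongly flagged.
        healthy = self.true_negatives + self.false_positives
        return self.false_positives / healthy


def absolute_reduction(baseline_rate: float, new_rate: float) -> float:
    """Difference between two rates, expressed in percentage points."""
    return (baseline_rate - new_rate) * 100.0


# Hypothetical counts for illustration only; these are NOT figures from the study.
readers = ScreeningCounts(true_positives=80, false_positives=100,
                          true_negatives=900, false_negatives=20)
model = ScreeningCounts(true_positives=85, false_positives=90,
                        true_negatives=910, false_negatives=15)

fn_drop = absolute_reduction(readers.false_negative_rate(),
                             model.false_negative_rate())
fp_drop = absolute_reduction(readers.false_positive_rate(),
                             model.false_positive_rate())

print(f"False negatives: {fn_drop:.1f} percentage points lower")
print(f"False positives: {fp_drop:.1f} percentage points lower")
```

In this made-up example the false negative rate falls from 20% to 15%, an absolute reduction of 5 percentage points. The same change described in relative terms would be a 25% reduction, which is why the two measures should not be confused.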
“While this is exciting, early-stage research, validation in future trials is needed to better understand how models like these can be effectively integrated into clinical practice,” Etemadi said.
“In some examples, the human outperforms the AI and in others, it’s the opposite. But the ultimate goal will be to find the best way to combine the two – the magic of the human brain isn’t going anywhere any time soon.”