Early detection of cancer in patients can greatly improve prognoses. That’s why medical experts advise getting screening tests regularly, before symptoms appear. Cancer screening tests, though, are not perfect, and indications of cancer can be subtle, even eluding the best trained eyes. Given the success of artificial intelligence on many natural image tasks, researchers are turning to AI as a possible solution for early cancer detection. DeepHealth, a leading radiology AI company, is developing technology using AI to improve screening mammography interpretation and accuracy.

About the Company

DeepHealth uses state-of-the-art technology to enable the best patient care, providing scientifically driven solutions that clinicians and patients can trust. Focusing initially on breast cancer detection, the startup is building medical imaging AI products that help clinicians help patients.

The Challenge

Breast cancer is the second leading cause of cancer death in women in the United States, according to the American Cancer Society. The good news is that death rates have slowly been decreasing, in part as the result of early detection through screening. Screening mammography (an X-ray of the breast for the detection of breast cancer) saves lives, but it is still imperfect: DeepHealth says that at least one in eight cancers is missed, and that of every 100 women asked to come back for additional stressful and costly procedures, 95 don’t actually have breast cancer.
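To put those two figures in the terms usually used for screening tests, the short calculation below converts them into a sensitivity bound and a recall positive predictive value. It is only arithmetic on the numbers quoted above; the variable names are illustrative.

```python
# Figures quoted in the paragraph above (attributed to DeepHealth).
missed_fraction = 1 / 8        # "at least one in eight cancers" missed
recalled = 100                 # women asked to come back
recalled_without_cancer = 95   # of those, women who do not have breast cancer

# Sensitivity: share of cancers that screening catches. This is an upper
# bound, because the source says "at least" one in eight is missed.
max_sensitivity = 1 - missed_fraction

# Positive predictive value of a recall: share of recalled women with cancer.
recall_ppv = (recalled - recalled_without_cancer) / recalled

print(f"sensitivity is at most {max_sensitivity:.1%}")  # 87.5%
print(f"PPV of a recall is about {recall_ppv:.0%}")     # 5%
```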

In addition, widespread screening is not always available, both inside and outside the U.S., and many regions around the world do not have a screening program at all due to a lack of qualified interpreters and the cost of interpretation. “A key promise of AI is to make healthcare more equitable,” says Dr. William (Bill) Lotter, CTO and Co-founder of DeepHealth. However, a core challenge in applying AI to a broad range of people is generalization. “Generalization of performance to different populations can’t be assumed,” Dr. Lotter continues. “In fact, there are many known examples where AI models don’t perform well across populations.”

Applying AI has the potential to improve screening worldwide, but also presents unique challenges in both training and validation. Dr. Lotter says that for training, “AI methods typically rely on large amounts of highly annotated data, where for mammography, this amounts to having radiologists draw boxes on cancer exams to localize the cancer.” Not only that, but obtaining such large amounts of data is challenging for mammography. “In particular, these types of detailed annotations are much more costly and inefficient to obtain than the ‘weak’ annotations, such as just knowing only the laterality of the cancer (i.e. left or right breast).”

For validation, “even if a powerful AI algorithm is developed, it must be validated before it can be trusted and used clinically,” Dr. Lotter says. So far, validation of mammography AI has been limited by training and testing in similar populations and/or data sources, testing exclusively on exams for which the cancers were found in the clinic, and/or testing on only 2D mammography. The 3D mammography technology known as “digital breast tomosynthesis” (DBT), which can be thought of as a video of different angles of the breast rather than a single static image, is increasingly used, but he adds that its “large data size presents challenges to humans and AI alike.”

The Solution

In a new paper published in Nature Medicine, DeepHealth researchers demonstrate that their AI model has the potential both to improve breast cancer detection in a way that clinicians can trust and to make patient access to screening more equitable. Both goals are central to DeepHealth’s mission.

First, they show that their AI model is capable of detecting cancer in prior mammograms of cancer patients that were interpreted as normal in the clinic. They directly compare the model to five breast imaging expert radiologists in this earlier detection task using a reader study and demonstrate higher performance. While it is known that cancers are often visible on prior exams in retrospect, the direct comparison to expert radiologists provides stronger evidence of meaningful earlier cancer detection.

Next, they include DBT and a novel AI approach. Since DBT is quickly becoming the standard of care over traditional 2D digital mammography, DeepHealth has developed a method that works for DBT and does so in a way that addresses a core research problem in AI: relying less on highly annotated data.

As Dr. Lotter explains, “Mammography is really a ‘needle-in-a-haystack’ problem where small, subtle lesions are often surrounded by similarly appearing but benign tissue. This challenge is exacerbated in DBT where each view consists of ~50 images instead of just one image for 2D mammography. To enable effective AI training on this data while relying less on annotations, we have developed an approach that turns each DBT image stack of ~50 images into a single 2D image that aims to preserve the most important information across all of the original images. The resulting image allows for further AI training with only ‘weak’ annotations.”
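The article does not spell out how the stack is collapsed, so the following is only a minimal sketch of the general idea, not DeepHealth's method. It assumes a hypothetical per-pixel suspicion score for every slice and keeps, at each pixel position, the value from the slice that scores highest there. The function name, the score maps, and the toy 3-slice stack are all invented for illustration.

```python
from typing import List

Image = List[List[float]]  # one 2D slice, stored as rows of pixel values


def collapse_stack(slices: List[Image], scores: List[Image]) -> Image:
    """Collapse a stack of slices into a single 2D image.

    For each pixel position, keep the value from the slice whose
    (hypothetical) suspicion score at that position is highest.
    """
    if not slices or len(slices) != len(scores):
        raise ValueError("slices and scores must be non-empty and equal in length")
    height, width = len(slices[0]), len(slices[0][0])
    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            # Index of the slice with the highest score at (r, c).
            best = max(range(len(slices)), key=lambda k: scores[k][r][c])
            out[r][c] = slices[best][r][c]
    return out


# Toy stack: 3 slices of 2x2 pixels (a real DBT view has ~50 slices).
slices = [
    [[0.1, 0.2], [0.3, 0.4]],
    [[0.5, 0.6], [0.7, 0.8]],
    [[0.9, 1.0], [1.1, 1.2]],
]
# Assumed suspicion scores, one map per slice.
scores = [
    [[0.9, 0.1], [0.2, 0.1]],
    [[0.1, 0.8], [0.1, 0.3]],
    [[0.2, 0.3], [0.7, 0.6]],
]
collapsed = collapse_stack(slices, scores)
print(collapsed)  # [[0.1, 0.6], [1.1, 1.2]]
```

The appeal of a step like this is that whatever is trained afterwards sees one image per view, so an exam-level or breast-level label can be enough to supervise it.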

Finally, DeepHealth scientists focus on generalization and a step toward increasing access to screening. They tested their AI models on sources that were never used for training, including both the reader study dataset and a Chinese hospital dataset. The results show stronger evidence of performance that can be trusted in new settings, and the generalization performance to the Chinese clinic is a promising step toward a tool that could increase the accessibility of screening mammography to populations in which it is currently infeasible.
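A generalization check of this kind usually comes down to computing the same metrics separately for each data source and comparing them. The sketch below shows that bookkeeping for sensitivity and specificity. The source names and the eight records are made up purely for illustration and are not results from the paper.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (data source, exam truly has cancer, model flagged the exam)
Record = Tuple[str, bool, bool]


def rates_by_source(records: Iterable[Record]) -> Dict[str, Dict[str, float]]:
    """Return sensitivity and specificity for each data source."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "tn": 0, "fp": 0})
    for source, has_cancer, flagged in records:
        if has_cancer:
            key = "tp" if flagged else "fn"
        else:
            key = "fp" if flagged else "tn"
        counts[source][key] += 1

    result = {}
    for source, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["tn"] + c["fp"]
        result[source] = {
            "sensitivity": c["tp"] / positives if positives else float("nan"),
            "specificity": c["tn"] / negatives if negatives else float("nan"),
        }
    return result


# Invented toy records for two hypothetical held-out sources.
records = [
    ("held_out_a", True, True), ("held_out_a", True, False),
    ("held_out_a", False, False), ("held_out_a", False, False),
    ("held_out_b", True, True), ("held_out_b", True, True),
    ("held_out_b", False, True), ("held_out_b", False, False),
]
rates = rates_by_source(records)
for source, r in sorted(rates.items()):
    print(source, r)
```

Large gaps between sources in a table like this are the signal that performance has not carried over to a new population.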

Background and Startup Connect with CSAIL Alliances

Launched in 2015, DeepHealth was founded by Dr. Lotter, Dr. David Cox, and Dr. Greg Sorensen, who have all conducted related research at MIT.

While earning his PhD, Dr. Lotter was a member of the Center for Brains, Minds, and Machines, a CSAIL-affiliated research group that aims to understand human intelligence and then develop machines with similar capabilities.

“Much of deep learning, the form of AI we use in our software, is informed by the human visual system, and my research at CBMM aimed to further develop brain-inspired deep learning approaches,” says Dr. Lotter. “This experience has been valuable for our efforts at DeepHealth, where we are similarly building deep learning models that are informed by how expert radiologists interpret medical images.”

For the past several years, DeepHealth has benefited from CSAIL’s Startup Connect program in terms of staying connected to the lab, recruiting, and finding potential investors. Dr. Lotter says that as an example, “we participated in a poster session where we presented our research and, as a result, met a number of talented job candidates and interested investors. The Startup Connect program also graciously introduced us to potential investors on several other occasions, which was quite helpful when we were raising early funding rounds.”

How the Technology Works

To help provide insights to radiologists, DeepHealth’s software has two modes of operation:

In the first mode, it outputs an exam-level suspicion score, which is binarized to provide an indication of whether the exam has findings that are suggestive of cancer. In this way, the technology can be used for triage, so that radiologists who have many cases in their worklist can prioritize reading the cases that the software deems the most suspicious. “This can help ensure that women at risk receive follow-up procedures as soon as possible, sometimes even before they have left the facility,” says Dr. Lotter.
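As a rough sketch of how this first mode could drive a worklist, the example below binarizes an exam-level score at a threshold and sorts exams so the most suspicious are read first. The `Exam` class, the threshold of 0.5, and the scores are hypothetical; the article does not describe DeepHealth's actual operating point or interface.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Exam:
    exam_id: str
    suspicion: float  # exam-level score in [0, 1]; higher is more suspicious


THRESHOLD = 0.5  # hypothetical operating point, chosen only for illustration


def is_flagged(exam: Exam, threshold: float = THRESHOLD) -> bool:
    """Binarize the exam-level score into flagged / not flagged."""
    return exam.suspicion >= threshold


def triage(worklist: List[Exam]) -> List[Exam]:
    """Order the worklist from most to least suspicious."""
    return sorted(worklist, key=lambda e: e.suspicion, reverse=True)


worklist = [Exam("A", 0.12), Exam("B", 0.91), Exam("C", 0.47), Exam("D", 0.66)]
ordered = triage(worklist)
flagged = [e.exam_id for e in ordered if is_flagged(e)]

print([e.exam_id for e in ordered])  # ['B', 'D', 'C', 'A']
print(flagged)                       # ['B', 'D']
```

Sorting on the continuous score rather than on the binary flag keeps an ordering within the flagged group, which matters when a radiologist can read only a few cases right away.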

In the second mode, the software outputs bounding boxes that localize the suspicious findings. Radiologists can then review the bounding boxes while interpreting exams, which can help them detect subtle cancers they might otherwise miss and also more confidently dismiss benign findings.

These capabilities improve the accuracy and efficiency of screening mammography, and could also facilitate quality screening mammography in regions and populations that are more limited by the availability of expert clinicians.

The Future

After the promising start of demonstrating the robustness and generalization of their software, the “next major frontier in AI for mammography is effective clinical deployment and achieving real-world impact on women’s lives,” Dr. Lotter says. “We are fortunate to have a unique opportunity to do so, as we have recently been acquired by a company called RadNet, the largest outpatient imaging services provider in the U.S. RadNet performs over one million mammograms per year across hundreds of diverse sites. Together, we truly do have an opportunity to realize the potential of AI for screening mammography. As a step in this process, we are currently in the midst of obtaining FDA clearance for further validation of our products.”

Beyond mammography, DeepHealth envisions that their deep learning model architectures and algorithms can be used for many medical imaging domains where subtle lesions and high-dimensional data are common challenges.

As for increasing access to medical screening, the company also seeks to demonstrate consistent performance across populations. “This aspect is a top priority for us, and our results so far are promising,” Dr. Lotter says. In addition to showing that their software performs well on women who received mammograms at a collaborating Chinese hospital, the researchers have also found that it exhibits similar performance among Black and white women in New York City. These findings were recently presented at the Radiological Society of North America annual meeting.

“Verifying equitable performance is an ongoing goal, but our results so far are encouraging and point to the potential of the software in making screening more accessible and accurate for all women.”