Big Data Beats Cancer

One woman’s fight against cancer in the new era of precision medicine

Wife, Patient, Survivor: To find the best treatment for Kathy Halamka’s stage III breast cancer, her husband, John, deployed the big-data query tools he’d developed with a network of Harvard-affiliated hospitals. Photo: David Yellen

John and Kathy Halamka met on their first day in their freshman dorm at Stanford. They decided almost instantly that they made the perfect team: With his science background and her artistic sensibility, they’d be able to handle anything that college, or life, could throw at them.

They proved their point three decades later, when Kathy was diagnosed with stage III breast cancer at the age of 49. John, as chief information officer of a Boston hospital renowned for its technological innovations, made sure that his other half got the most cutting-edge treatment. That didn’t mean exotic new drugs or experimental surgery. Rather, her doctors planned her treatment with the help of big-data tools that John himself, along with his colleagues, had only recently brought into existence.

On a bright midwinter day in Sherborn, Mass., the couple are telling their story in the kitchen of their snug farmhouse. Outside the window, their alpacas walk in a neat line down a freshly shoveled path between towering snow banks. Inside, for lunch, Kathy is serving a frittata made from their own chicken eggs, and John is cracking open two varieties of hard apple cider that he’d lovingly brewed with fruit from their orchard. Jars of jam and honey, products of their blueberry bushes and beehives, are glowing, jewel-like, on the kitchen counter.

It was in the middle of Kathy’s grueling chemotherapy treatment that the Halamkas moved to this 15-acre farm. Planning the new homestead “gave us something positive to look forward to,” she says.

Kathy received her diagnosis in December 2011. The fast-growing tumor had already launched a few malignant cells into her lymph nodes, but the cancer hadn’t spread any further. The conservative treatment option would have been a mastectomy, removing either one breast or possibly both, followed by chemotherapy. “That would have been the basic standard of care,” Kathy says. But human beings are far from standard. Doctors have long talked of the promise of precision medicine, in which tailored treatment matches the patient’s exact case. That’s what Kathy got.

John, as CIO of Beth Israel Deaconess Medical Center and a professor at Harvard Medical School, had been instrumental in creating an open-source platform called Informatics for Integrating Biology and the Bedside, or i2b2. Several Boston hospitals began building that platform in 2004 to enable researchers to query vast databases of patients’ electronic health records, letting them study treatment outcomes and find subjects for clinical trials. A few years later, five Harvard-affiliated hospitals used i2b2 to create a powerful search tool they named the Shared Health Research Information Network (SHRINE), which linked the five institutions’ databases.

In 2008 the tool was just a proof of concept. But by the time of Kathy’s diagnosis, her doctors could use SHRINE to sift the records of 6.1 million patients for valuable information. What they found made a world of difference for Kathy.

John explains that SHRINE allows physicians to search for patients who have certain characteristics, returning “de-identified” results that don’t violate patient privacy rules. When Kathy’s care team at Beth Israel Deaconess was planning her treatment, they searched for precedents. “We could say, ‘I’m looking for age-50 Asian females who were treated with stage III breast cancer,’ ” John remembers. “What were their medications, what were their outcomes?”
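The kind of query John describes can be sketched in a few lines of Python. This is only an illustration of the idea, not SHRINE's actual interface: the `PatientRecord` fields, the `cohort_summary` function, and the toy rows are all invented for the example. The point it shows is that the search filters on patient characteristics but hands back only aggregate counts, so no identifier leaves the database.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientRecord:
    # Hypothetical, heavily simplified record; real health records are far richer.
    patient_id: str
    age: int
    sex: str
    ethnicity: str
    diagnosis: str
    medication: str
    outcome: str


def cohort_summary(records, *, min_age, max_age, sex, ethnicity, diagnosis):
    """Count medications and outcomes for matching patients.

    Only aggregate counts are returned; the matching rows and their
    patient_id values never leave this function.
    """
    matches = [
        r for r in records
        if min_age <= r.age <= max_age
        and r.sex == sex
        and r.ethnicity == ethnicity
        and r.diagnosis == diagnosis
    ]
    return {
        "patients": len(matches),
        "medications": dict(Counter(r.medication for r in matches)),
        "outcomes": dict(Counter(r.outcome for r in matches)),
    }


# Invented toy rows, for illustration only.
RECORDS = [
    PatientRecord("p1", 49, "F", "Asian", "stage III breast cancer", "drug A", "remission"),
    PatientRecord("p2", 51, "F", "Asian", "stage III breast cancer", "drug B", "remission"),
    PatientRecord("p3", 50, "F", "Asian", "stage I breast cancer", "drug A", "remission"),
    PatientRecord("p4", 63, "M", "White", "stage III breast cancer", "drug A", "recurrence"),
]

summary = cohort_summary(
    RECORDS,
    min_age=45,
    max_age=55,
    sex="F",
    ethnicity="Asian",
    diagnosis="stage III breast cancer",
)
```

A production system would need much more than this to satisfy privacy rules (very small cohorts, for instance, can still identify a person), so the sketch should be read as the shape of the query rather than a de-identification method.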

This big-data tactic led to a somewhat unusual treatment for Kathy. She and her doctors decided to hold off on surgery and instead start by targeting her estrogen-sensitive tumor cells with chemotherapy drugs. “By the third chemotherapy session, the doctors almost couldn’t feel the tumor anymore with palpation,” Kathy says. “By the time I was completely done with the chemotherapy regimen, the people in radiology thought they were being punked.” The radiologists simply didn’t see the tumor on her scans, she explains with a gentle little smile. Kathy got a lumpectomy in May 2012, and she’s still on drugs that block the production of estrogen. But it’s hard to imagine how her outcome could have been better.

At last count, the i2b2 platform had been adopted by more than 100 medical institutions worldwide. A 2014 case study called it “arguably the most widely used clinical research data infrastructure based on [electronic health records] in the world.” The open-source SHRINE search tool has also proved popular, with researchers in the United States and Europe deploying it to monitor public health in real time, detect the harmful side effects of certain medications, and investigate other medical topics. The U.S. government is now funding research on using SHRINE to find participants for clinical trials.

Homesteaders: Kathy and John Halamka hang out with their alpaca herd at their farm in Sherborn, Mass. Photo: David Yellen

And yet, just a couple of years ago, hospitals were deeply reluctant to get involved with the development of SHRINE and similar tools. Not only did John and his colleagues have to deal with hospitals’ concerns over patient privacy, they also had to convince competing medical institutions to share their databases—considered valuable intellectual property—with one another. The pioneers also grappled with the technical challenge of searching across databases with different structures and health records in different formats. All these problems prompted the SHRINE developers to use a peer-to-peer architecture rather than collecting information in a central repository. With a P2P system, each institution retained control over its records and translated queries into a format that matched its database.
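A rough Python sketch can make the peer-to-peer idea concrete. It is an assumption-laden toy, not the SHRINE codebase: `HospitalNode`, the two `translate_*` functions, the field names, and the sample rows are all hypothetical. Each node keeps its own records in its own format, translates the shared query into a local filter, and returns only a count.

```python
from dataclasses import dataclass
from typing import Callable

# A shared query format that every node agrees on (hypothetical).
QUERY = {"sex": "female", "condition": "breast cancer", "stage": 3}


def translate_a(query: dict) -> Callable[[dict], bool]:
    # Hospital A (invented schema) stores sex as one letter and diagnosis as a code.
    sex = {"female": "F", "male": "M"}[query["sex"]]
    code = {("breast cancer", 3): "BC3"}[(query["condition"], query["stage"])]
    return lambda row: row["sex"] == sex and row["dx"] == code


def translate_b(query: dict) -> Callable[[dict], bool]:
    # Hospital B (invented schema) stores the same facts in separate plain fields.
    return lambda row: (
        row["gender"] == query["sex"]
        and row["diagnosis"] == query["condition"]
        and row["stage"] == query["stage"]
    )


@dataclass
class HospitalNode:
    name: str
    rows: list[dict]  # records stay inside the node
    translate: Callable[[dict], Callable[[dict], bool]]

    def count(self, query: dict) -> int:
        # Translate the shared query into this node's local format, then run it locally.
        matches_locally = self.translate(query)
        return sum(1 for row in self.rows if matches_locally(row))


def federated_count(nodes: list[HospitalNode], query: dict) -> dict[str, int]:
    """Send the same query to every node and collect only per-site counts."""
    return {node.name: node.count(query) for node in nodes}


# Invented toy rows, for illustration only.
NODES = [
    HospitalNode(
        "hospital_a",
        [{"sex": "F", "dx": "BC3"}, {"sex": "F", "dx": "BC3"}, {"sex": "M", "dx": "BC3"}],
        translate_a,
    ),
    HospitalNode(
        "hospital_b",
        [
            {"gender": "female", "diagnosis": "breast cancer", "stage": 3},
            {"gender": "female", "diagnosis": "breast cancer", "stage": 1},
        ],
        translate_b,
    ),
]

per_site = federated_count(NODES, QUERY)
total = sum(per_site.values())
```

The design choice this illustrates is the one the developers made: because translation happens at each node, no institution has to restructure its database or copy its records into a central repository to take part.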

Now that the search tools have proved their value, John expects many more institutions to recognize them as the natural next step in digitized medicine. Over the last decade, he says, hospitals have simply been doing the preliminary work of putting patient information into electronic formats. These electronic health records “have been dumb databases for many years, but now they can become decision-support tools and care-management tools,” he says. “That’s really where we’re headed.”

Kathy’s case dramatically illustrates this transitional moment in medicine. Her diagnosis and preliminary imaging were done at a hospital that wasn’t part of the SHRINE system; when she switched to Beth Israel Deaconess for her cancer care, she hand-carried a CD containing her mammograms and other records. Her two hospitals couldn’t exchange information via e-mail, much less search each other’s databases. About a year later, in October 2012, John and Kathy stood with the governor of Massachusetts to celebrate the launch of Mass HIway, a health information exchange that lets hospitals, doctors, labs, pharmacies, and other health care organizations securely transmit data. As a demonstration, the Halamkas sent Kathy’s medical records over the wire.

Creating such networks could enable smarter and more responsive medicine. In today’s atomized medical system, it can take 10 to 20 years for a major treatment advance to become ubiquitous, says John—but linking and cross-correlating institutional records can speed up the dissemination of knowledge. In such a network, hospitals that conduct clinical trials could continually publish results and update their guidelines, and everyone on the network could access that information. A doctor running a query could then get the response: “This is the protocol you should use as of last Friday,” John says.

There’s plenty more to be done in order to turn today’s data into tomorrow’s wisdom. If tools like SHRINE can be augmented with natural-language-processing capacities, they’d be able to mine the unstructured text in doctors’ notes. And the query tools have to be ready for big data to get a whole lot bigger: The cost of a full genome scan has plummeted to about US $1,000, which means that health records may soon be stuffed with patients’ digitized gene sequences. How useful that data will be for research or customized treatment plans remains to be seen—but as soon as it’s in the databases, innovators will certainly start trying to use it. John, who sits on advisory boards of several health care startups, is helping the big electronic-records vendors create a secure interface for nimble software companies, letting them build apps to work with the vendors’ medical records. “All these big companies are fine, but do we really think the next cool innovation is going to come out of an 8,000-person company?” John asks. “No. It’s probably a two-person garage operation.”

As John and Kathy finish their frittata lunch and their story, the February sun is fading. John pulls on his winter gear and heads out to his chores in the barn and poultry coop. Kathy is looking forward to the spring thaw, when she’ll restock the beehives and plant vegetables. As the seasons go by and Kathy’s checkups continue to return a clean bill of health, all the details of her happy outcome will be recorded in her electronic health records. That means her case will now be part of the data that will guide the treatment of the next 49-year-old Asian woman with stage III breast cancer. Kathy’s just another statistic—and that suits her fine.

This article originally appeared in print as “Their Prescription: Big Data.”
