There’s quite a gulf between large-scale ocean circulation patterns and the human genome. But Matthew Trunnell, Fred Hutchinson Cancer Research Center’s new chief information officer, sees a common thread: how computational capabilities can speed science’s progress.
After tackling — and solving — big data problems in oceanography and genomics over his career, Trunnell has now turned to the more turbulent issue of clinical data and what it can bring to bear on precision medicine, the research field that aims to deliver more targeted therapies to patients with cancer and other diseases.
The need to keep patient data private and the historically insular nature of academic research have combined to leave biomedical research behind the times when it comes to big data, Trunnell said.
But he and the hiring committee that lured him away from his CIO post at the Broad Institute of MIT and Harvard believe that bringing medical research up to speed, data-wise, will drive even greater advances in human health.
Trunnell, who joined Fred Hutch on Aug. 31, wasn’t actively looking for a new job when the Hutch came calling.
But conversations with President and Director Dr. Gary Gilliland and other clinical researchers made it clear there was “much potential for impact here,” he said. “It’s tremendous what you can do just by having a sufficient quantity of data.”
Trunnell is “exceptionally positioned” to take leadership in this role at Fred Hutch, Gilliland said in a statement. “Matthew brings a balance and energy around the strategic use of IT combined with operational experience.”
The new CIO got a chance to share his views and vision Wednesday afternoon at the 2015 Policy Summit of the Association of Washington Business in Cle Elum, Washington — his first public appearance representing Fred Hutch.
“We have entered the age of data,” Trunnell said during a summit panel session on new technologies and the new economy. “We have gone from an era of being limited by how quickly we can hire qualified laboratory technicians to push research forward to an era where we are really limited by how we can bring in data-analytical capabilities.”
In his more than 20 years working at the intersection between data and research, Trunnell has seen momentous changes in both arenas.
He got his feet wet with literal sea changes as an oceanography graduate student at the University of Washington in the early 1990s. Trunnell and his group were using computer modeling to understand ocean currents.
When the first desktop computers capable of processing his group’s oceanographic models came on the scene, Trunnell was responsible for setting up the computers’ networks. It was his first foray into the field of information technology.
“I had one foot in IT and the other foot in science, and that’s kind of the way it’s been ever since,” he said.
But Trunnell eventually grew bored with ocean sciences: “The really big, first-order problems in physical oceanography had been solved in the 1950s,” he said. So he took an IT job at a Boston genome-sequencing company and soon realized the field’s enormous potential for growth in both research and computational technology. He also became intrigued by bioinformatics and genomics, where “there [were] all these tremendous problems just waiting to be solved.”
He held several positions at life sciences IT and bioinformatics companies in the Boston area and California, eventually returning to academia when he joined the Broad Institute in 2006. His start date at the genomic research center coincided with another momentous arrival: the institute’s first machine for next-generation sequencing. That equipment and its ilk, which vastly decreased the time and money required to capture the 3 billion DNA letters that make up one person’s unique genome, ushered in a wave of modern genomics for the Broad — and the rest of the world.
They also generated massive piles of data for people like Trunnell to wrangle.
To put the scale of that shift in context, the first sequencing of the human genome, known as the Human Genome Project, generated about 10 terabytes of raw data in the 10 years it took to complete. (That’s roughly enough data to fill some 25 million printed books.)
Over his nine-year tenure at the Broad, Trunnell and his team increased the institute’s data storage capacity 50-fold, building computing infrastructure capable of holding 3,000 times the amount of data generated by the entire Human Genome Project.
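For a rough sense of those numbers, here is a back-of-the-envelope sketch. The figure of roughly 400 kilobytes of text per printed book is an assumption used only for illustration; the 10-terabyte and 3,000-fold figures come from the paragraphs above.

```python
# Back-of-the-envelope arithmetic for the data volumes described above.
# BYTES_PER_BOOK is an assumption for illustration only.
HGP_RAW_DATA_TB = 10          # Human Genome Project raw data, ~10 terabytes
BYTES_PER_BOOK = 400_000      # assumed text content of one printed book

books_equivalent = HGP_RAW_DATA_TB * 1e12 / BYTES_PER_BOOK
print(f"~{books_equivalent:,.0f} printed books")                  # ~25,000,000 books

broad_capacity_tb = 3_000 * HGP_RAW_DATA_TB                       # 3,000x the HGP's raw data
print(f"~{broad_capacity_tb / 1_000:,.0f} petabytes of storage")  # ~30 petabytes
```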
Trunnell thinks Fred Hutch needs something similar, although he’s quick to point out that the infrastructure and analytical needs won’t be the exact equivalent of the system he helped build at Broad. His first weeks and months at the Hutch, in fact, will be spent figuring out exactly what that new capacity will look like.
And that’s part of why Trunnell is excited to land in Seattle, a city whose technology sector he plans to draw on. “Seattle is really becoming the Silicon Valley for data science,” he said.
Companies like Amazon have pioneered data science, the field that seeks to extract meaningful information from large quantities of data. Their innovations include marketing tools such as product recommendations and targeted advertising, which personalize the ads and product choices a computer user sees while browsing online.
“Those same capabilities applied to biomedical research have the potential to be transformative, but it’s not a trivial thing to do,” Trunnell said.
The ultimate goal for Fred Hutch and other biomedical research centers? Make our somewhat imprecise medical practices much more precise.
“At a care level, the idea is we want to give the right therapy to the right patient,” he said. “At a data level, that means we need to bring together everything we know about the patient.”
All that information — from the patient’s medical and family histories to lab results to genetic tests — has to be gathered and sifted through quickly enough to help clinicians and their patients make better-informed decisions. And we’re not there yet.
Getting there means grappling with privacy, a major stumbling block to pooling patient data for researchers to analyze, Trunnell said. It’s an issue he and his colleagues in genomics have given a great deal of thought.
“Human genomic data by itself is not considered to be identifiable under current regulations,” he said. “[But] those of us who are working in the field know that there is sufficient information in a genome, or even a small fraction of a genome, to identify an individual.”
There are computational techniques, however, to keep aspects of large datasets private even while mining them for information, he said.
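Trunnell doesn’t name specific methods, but one widely used family of techniques fits that description: differential privacy, which answers aggregate questions about a dataset while adding calibrated statistical noise so that no individual record can be inferred. The sketch below is a minimal, hypothetical illustration of that idea; the patient records, the query, and the epsilon privacy parameter are all invented for the example.

```python
import random

def laplace_noise(scale: float) -> float:
    """Sample Laplace(0, scale) noise as the difference of two exponentials."""
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def private_count(records, predicate, epsilon: float = 0.5) -> float:
    """Return a noisy count of records matching `predicate`.

    A counting query changes by at most 1 when any single patient is added
    or removed, so Laplace noise with scale 1/epsilon yields an
    epsilon-differentially-private answer for this one query.
    """
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon)

# Hypothetical usage: estimate how many patients carry a given tumor mutation
# without exposing any individual's record.
patients = [{"id": i, "has_mutation": (i % 7 == 0)} for i in range(1000)]
print(round(private_count(patients, lambda p: p["has_mutation"])))
```

Other approaches in the same spirit include de-identification and analysis methods in which the data never leave the institution that holds them.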
Patient consent is another wrinkle for Trunnell and his fellow data scientists to iron out. When clinicians or researchers collect patients’ data, they generally ask patients to sign forms allowing certain research uses. People like Trunnell will need a way to quickly and automatically determine whether future research projects fall under the consent patients have already given, a process that today requires review by teams of people and can take months even at its current small scale.
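In computational terms, one can imagine consent being captured in a machine-readable form that a proposed study is checked against. The sketch below is purely hypothetical; the article describes the goal, not any particular system, and real consent language is far messier than a set of category labels.

```python
from dataclasses import dataclass, field

@dataclass
class ConsentRecord:
    """Hypothetical machine-readable summary of what a patient agreed to."""
    patient_id: str
    permitted_uses: set = field(default_factory=set)  # e.g. {"cancer", "genomics"}
    allows_external_sharing: bool = False

def study_is_covered(consent: ConsentRecord,
                     required_uses: set,
                     needs_external_sharing: bool) -> bool:
    """Return True if an existing consent already covers a proposed study."""
    if needs_external_sharing and not consent.allows_external_sharing:
        return False
    return required_uses <= consent.permitted_uses  # subset check

# Example: a proposed genomic study of tumor samples, with data kept in-house.
record = ConsentRecord("patient-001", {"cancer", "genomics"})
print(study_is_covered(record, {"cancer", "genomics"}, needs_external_sharing=False))
```

A real system would also have to handle consent withdrawal, expiration, and free-text conditions that resist this kind of simple encoding.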
Ultimately, realizing precision medicine’s potential to improve human health — and prolong patient lives — will rely on researchers’ responsibility and ethical conduct, Trunnell said.
“We can do things to secure the data, but it will be beholden on the researchers to do the right thing with the data,” he said.
Fred Hutch staff writer Sabin Russell contributed to this story.
Rachel Tompa is a former staff writer at Fred Hutchinson Cancer Center. She has a Ph.D. in molecular biology from the University of California, San Francisco and a certificate in science writing from the University of California, Santa Cruz. Follow her on Twitter @Rachel_Tompa.