Photo by Robert Hood / Fred Hutch News Service
Fred Hutch computational biologist Dr. Raphael Gottardo has received one of 85 new funding awards from the Chan Zuckerberg Initiative, or CZI, an advised fund of Silicon Valley Community Foundation. The awards are aimed at developing computational tools for the Human Cell Atlas, a global research effort to map every type of cell in the human body. The 85 one-year grants total $15 million in funding from the initiative and were chosen from nearly 300 applications.
This set of projects, announced April 19, is the third round of grants CZI has awarded in support of the massive cell-mapping feat. The initiative’s first round of funding for the atlas went to 38 pilot projects in October 2017; Fred Hutch biologist Dr. Steven Henikoff is leading one of those projects to scale up a new DNA-mapping technique developed in his lab.
Gottardo will lead a project that will adapt the community-driven software platform Bioconductor to seamlessly fit with the type of analyses and experimental data that researchers in the Human Cell Atlas are generating. Most of these projects involve single-cell analyses, Gottardo said, an emerging field in which new techniques allow scientists to capture information from individual cells rather than the average of thousands or millions of cells.
The Bioconductor platform is an open-source software project that computational biologists use to analyze genomic data; both of its lead developers are former Hutch researchers. Gottardo’s own field of expertise at the Hutch is computational and statistical analysis of flow cytometry, one of the earlier techniques developed for single-cell analysis.
By its nature, the data generated by these studies will require sophisticated computational tools for analysis and new approaches to cloud-based storage, Gottardo said.
His project is “really about how can we efficiently store, manipulate and analyze these data sets,” Gottardo said. “If you have just one data set that has, for example, 1 million cells, and you have 30,000 genes that you measure in those cells, the actual data set that you need to read on your computer will be too large to fit in the memory of a standard personal computer. So we have to be clever not only in the way we process the data, but also clever in the way we store the data.”
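The arithmetic behind Gottardo’s point can be sketched in a few lines. Assuming each measurement is stored as an 8-byte double (a common default in analysis software, not a detail from the project itself), a dense 1-million-cell by 30,000-gene matrix comes out to hundreds of gigabytes:

```python
# Back-of-the-envelope memory estimate for a dense expression matrix.
# Assumes 8 bytes per value (a 64-bit float), a common default.
cells = 1_000_000
genes = 30_000
bytes_per_value = 8

total_bytes = cells * genes * bytes_per_value
total_gb = total_bytes / 1e9
print(f"{total_gb:.0f} GB")  # 240 GB, far beyond a typical laptop's 8-16 GB of RAM
```

Even with a more compact 4-byte representation, the matrix would still occupy 120 GB, which is why clever on-disk storage matters as much as clever processing.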
For his previous work analyzing data sets generated from flow cytometry experiments, Gottardo has used a format known as HDF5, which was originally developed at the National Center for Supercomputing Applications and widely adopted by NASA. The format lets individual users extract only the pieces of a large data set that they care about (such as a gene or a cell), an approach often referred to as “chunking,” in which data are sliced and stored in a way that optimizes retrieval, Gottardo said. He’s thinking about tailoring and applying some of those same tools to the new project for the Human Cell Atlas.
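The chunking idea can be illustrated independently of HDF5 itself. In this toy sketch (the tile shape of 1,000 cells by 100 genes is an illustrative assumption, not a choice from Gottardo’s project), a large matrix is split into fixed-size tiles, and a query needs to read only the tiles that overlap the requested slice:

```python
# Toy sketch of "chunking": a large matrix is stored as fixed-size tiles,
# and a query touches only the tiles that overlap the requested slice.
# The chunk shape below is an illustrative assumption.

CHUNK_CELLS, CHUNK_GENES = 1000, 100

def chunks_for_query(cell_range, gene_range):
    """Return the set of (tile_row, tile_col) tiles a query must read.

    Ranges are half-open: (start, stop), like Python slices.
    """
    c0, c1 = cell_range
    g0, g1 = gene_range
    rows = range(c0 // CHUNK_CELLS, (c1 - 1) // CHUNK_CELLS + 1)
    cols = range(g0 // CHUNK_GENES, (g1 - 1) // CHUNK_GENES + 1)
    return {(r, c) for r in rows for c in cols}

# Reading one gene across all 1,000,000 cells touches 1,000 tiles,
# not the 300,000 tiles that make up the full matrix.
one_gene = chunks_for_query((0, 1_000_000), (42, 43))
print(len(one_gene))  # 1000
```

Real chunked formats such as HDF5 apply the same principle at the storage layer, so a request for a single gene or cell pulls only the relevant blocks from disk rather than the whole matrix.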
The data that will come out of the Human Cell Atlas “is probably some of the largest data that we’re going to get from the fields of biology or human health,” Gottardo said. “So big data and cloud computing are going to be big parts of what we’re doing.”