Single-cell RNA transcriptomics allows researchers to broadly profile the gene expression of individual cells in a particular tissue. This technique has allowed researchers to identify new subsets of cells or new gene programs that have advanced the understanding of development, cancer, and normal biology. Standard single-cell RNA transcriptomics requires researchers to dissociate cells from their tissues for profiling. This process can lead to loss of certain cell types, and dissociation completely destroys the spatial organization of a tissue.
Spatial organization matters because it tells researchers which cells are positioned to interact or communicate with one another in a tissue containing many diverse cell types. To preserve that information, researchers have combined tissue imaging with highly multiplexed quantification of hundreds to thousands of RNA transcript-specific probes, creating a technique called spatial transcriptomics. “Tissues are complex, and we want to understand what programs the cells are running in their native environment. Spatial transcriptomics is really exciting because you look at tissues that are fixed in place and identify all the different types of cells…and learn about what transcriptional programs they’re running,” says Dr. Evan Newell of the Vaccine and Infectious Disease Division at Fred Hutch.
Since the technique's inception, gleaning this information from spatial transcriptomic data has been challenging because methods for segmentation, the computational separation of single cells, frequently misidentify cell boundaries. Historically, segmentation has relied on antibody staining and microscope imaging of the tissue section, an approach that is error-prone. “With antibody staining, it’s more of an art…there’s a lot of variation and background,” explains Newell.
Assigning expression patterns to single cells in their native environment is the entire point of spatial transcriptomics, so cells misidentified by classic methods undermine the technique at its foundation. To tackle the issue of segmentation, Dr. Daniel Jones, a staff scientist in the Newell lab, built a new computational tool called Proseg that segments cells based on their RNA expression. First, he defines the number of cells in a particular tissue by staining and counting their nuclei. This information is then fed into a probabilistic model that defines cell boundaries based on which transcripts are present. His model takes advantage of the fact that cells behave like a “bag of RNA,” meaning that RNA transcripts are typically randomly distributed throughout the cell. From there, Proseg takes inspiration from the Cellular Potts Model to simulate the cells that best explain the observed distribution of transcripts.
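To make the idea concrete, here is a minimal toy sketch in Python of a Cellular Potts-style update driven by transcript identity. It is not Proseg's actual implementation. The grid size, the two invented genes "A" and "B", the nucleus positions, the boundary cost, the temperature, and every function name are assumptions chosen for illustration. Each pixel holds one transcript, each nucleus seeds one cell, and a pixel may copy a neighboring pixel's cell label when that makes the transcripts better explained by per-cell "bag of RNA" gene frequencies.

```python
import math
import random

WIDTH, HEIGHT = 12, 6          # toy tissue size (assumption)
GENES = ("A", "B")             # two made-up genes (assumption)

def make_toy_tissue():
    """One transcript per pixel: gene A on the left half, gene B on the right."""
    return {(x, y): ("A" if x < WIDTH // 2 else "B")
            for x in range(WIDTH) for y in range(HEIGHT)}

def estimate_profiles(labels, transcripts, n_cells, pseudocount=1.0):
    """Per-cell gene frequencies (the "bag of RNA"), smoothed with a pseudocount."""
    counts = [{g: pseudocount for g in GENES} for _ in range(n_cells)]
    for pixel, gene in transcripts.items():
        counts[labels[pixel]][gene] += 1
    return [{g: c[g] / sum(c.values()) for g in GENES} for c in counts]

def neighbors(pixel):
    """4-connected neighbors that fall inside the grid."""
    x, y = pixel
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return [(i, j) for i, j in candidates if 0 <= i < WIDTH and 0 <= j < HEIGHT]

def segment(transcripts, nuclei, labels, sweeps=200, boundary_cost=0.2,
            temperature=0.1, seed=0):
    """Potts-style label copying with Metropolis acceptance (toy version)."""
    rng = random.Random(seed)
    labels = dict(labels)
    pixels = list(transcripts)
    pinned = set(nuclei)                      # nucleus pixels never change cell
    for _ in range(sweeps):
        # Re-estimate each cell's expression profile from its current pixels.
        profiles = estimate_profiles(labels, transcripts, len(nuclei))
        rng.shuffle(pixels)
        for pixel in pixels:
            if pixel in pinned:
                continue
            nbrs = neighbors(pixel)
            old = labels[pixel]
            new = labels[rng.choice(nbrs)]    # propose copying a neighbor's label
            if new == old:
                continue
            gene = transcripts[pixel]
            # Energy change: worse transcript fit plus extra cell-cell boundary.
            d_fit = math.log(profiles[old][gene]) - math.log(profiles[new][gene])
            d_boundary = sum((labels[n] != new) - (labels[n] != old) for n in nbrs)
            delta = d_fit + boundary_cost * d_boundary
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                labels[pixel] = new
    return labels

def accuracy(labels):
    """Fraction of pixels assigned to the cell that truly expresses their gene."""
    correct = sum(labels[(x, y)] == (0 if x < WIDTH // 2 else 1)
                  for x in range(WIDTH) for y in range(HEIGHT))
    return correct / (WIDTH * HEIGHT)

transcripts = make_toy_tissue()
nuclei = [(2, 3), (9, 3)]                     # list index doubles as the cell id
# Deliberately misplaced starting boundary at x = 3 instead of x = 6.
initial = {p: (0 if p[0] < 3 else 1) for p in transcripts}
final = segment(transcripts, nuclei, initial)
print(accuracy(initial), accuracy(final))
```

The starting segmentation deliberately misassigns a quarter of the pixels. The updates then shift the boundary toward the point where the transcript composition changes, because that is where each cell's gene frequencies best explain its own transcripts. A real tool must also handle three-dimensional tissue, many genes, and uneven transcript density, all of which this sketch ignores.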