Researchers at Fred Hutch have been developing open-source R software packages for analyzing and graphing flow cytometry data. Flow cytometry is a standard method that scientists use to monitor immune cells. Many packages already exist for flow cytometry data; however, most focus on analysis rather than visualization. A new package from the Gottardo lab, described in the journal Bioinformatics, fills that gap. Called ggCyto, it provides easy visualization of cytometry data and follows openCyto (a data analysis pipeline for flow cytometry) and many other tools from the group. This newest installment offers a simple user interface built on ggplot2, a widely used graphics package. “The advantage of using R packages to flow analysis is the ability to analyze complex datasets with a reproducible computational workflow,” stated last author Dr. Finak.
The earlier package, openCyto, allows users to import FCS files and analyze the data. Users can open either a workspace (data already analyzed in FlowJo) or unaltered, ungated files. For ungated files, openCyto's automated gating template functionality can detect sub-populations, and the resulting gates can then be applied to whole datasets. With this process automated, the researcher no longer has to perform labor-intensive manual gating for matched experiments. This not only saves time but also improves reproducibility and objectivity. The gating pipeline can be applied to any dataset produced on the same instrument with the same markers. After analysis, data visualization is essential for publishing and sharing data. This can be accomplished in R using the newly released ggCyto, which is compatible with most Bioconductor flow packages. In general, ggCyto allows users to specify a cell population, map flow parameters to the axes, specify the axis transformations and even display specific gates with their statistics.
Figure provided by Dr. Finak
ggCyto is an open-source package available through GitHub or Bioconductor. Data can be imported either ungated or gated. Once the data are loaded, channels or markers are used to specify the axes: one dimension produces a 1D density plot, and two dimensions produce a 2D density plot. Plotting can be done with autoplot, a function that makes most of the decisions and sets sensible defaults, or with ggcyto(), which gives the user more control. With ggcyto(), a researcher can specify the type of plot, choose which gates and statistics to display, and show backgating of specific populations (see figure). Once the code for a plot has been written, it can be applied in batch to the whole dataset, or other datasets can easily be substituted. With ggCyto, the Gottardo lab has created a user-friendly visualization package for flow cytometry data. Combined with openCyto, it allows full computational analysis of flow data in R that is fast, efficient and highly reproducible. The lab plans to continue producing user-friendly packages for flow analysis. According to Dr. Finak, future work will focus on well-defined analysis workflows for novice R users as well as updated documentation.
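The two plotting routes described above can be sketched in a few lines of R. This is a minimal illustration rather than the authors' own workflow: it assumes the package (loaded under its lowercase name, ggcyto) and flowCore are installed from Bioconductor, and it uses GvHD, an example dataset bundled with flowCore, as a stand-in for a researcher's own ungated files. The object names (fs, p_density, p_hex, p_custom) are illustrative only.

```r
# Sketch only: assumes ggcyto and flowCore are installed from Bioconductor.
library(ggcyto)

# GvHD is an example flowSet shipped with flowCore; keep the first four
# samples as a stand-in for a small ungated dataset.
data(GvHD, package = "flowCore")
fs <- GvHD[1:4]

# One channel specified: autoplot() builds a 1D density plot with defaults.
p_density <- autoplot(fs, x = "FSC-H")

# Two channels specified: autoplot() builds a 2D density (hexbin) plot.
p_hex <- autoplot(fs, x = "FSC-H", y = "SSC-H", bins = 64)

# The same 2D plot built with ggcyto(), where each layer is added
# explicitly in ggplot2 style and can therefore be customized.
p_custom <- ggcyto(fs, aes(x = `FSC-H`, y = `SSC-H`)) +
  geom_hex(bins = 64)
```

Printing any of these objects (for example, print(p_custom)) renders the plot with one panel per sample. Swapping fs for another flowSet reuses the same plotting code, which is the batch-style substitution the article describes.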
Van P, Jiang W, Gottardo R, Finak G. 2018. ggCyto: Next Generation Open-Source Visualization Software for Cytometry. Bioinformatics. Ahead of print.
This work was supported by the Bill and Melinda Gates Foundation and the National Institutes of Health.
Fred Hutch/UW Cancer Consortium faculty member Raphael Gottardo contributed to this research.