With the approach of colder weather and people spending more time indoors, medical experts are concerned that COVID-19 might hide among other respiratory viruses that spread during cold and flu season, making it harder to separate discomforting illnesses from potentially lethal ones.
Researchers at the University of Washington and Fred Hutchinson Cancer Research Center are working together to explore the use of advanced genetic sequencing technologies to analyze nasal swabs of COVID-19 patients to find out what kinds of germs might co-exist with the coronavirus and conceivably affect their course of disease.
This sort of probe, called metagenomic sequencing, would not be used to diagnose patients, but it can be especially valuable to researchers working to track and understand COVID-19 and the SARS-CoV-2 virus that causes it. Knowing the complete genetic code of organisms found in nasal swabs strengthens our understanding of what our immune systems are facing in this pandemic and helps us guard against future threats.
“It’s a whole-genome test. It lets you sequence every single piece of DNA and RNA in the sample. That generates a ton of information, and you should be able to find every pathogen in that,” said Dr. Sam Minot, a Fred Hutch computational biologist who is contributing his expertise in wrangling massive amounts of data to the project.
The team is getting a boost from Amazon Web Services, which credited free time on its AWS Batch cloud computing platform to handle the massive amount of data generated by the gene sequencing machines. By one estimate, whole genome sequencing of specimens from 350 patients might yield 6 billion strings of genetic data, each string containing 75 characters of DNA or RNA code, hence the need to outsource processing to the cloud.
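To give a sense of that scale, the estimate can be worked through as simple arithmetic. The sketch below uses only the figures quoted above; the storage figure assumes one byte per character and ignores quality scores and file headers, so it is a rough lower bound rather than a measurement from the project.

```python
# Back-of-envelope figures taken from the estimate quoted in the article.
reads_total = 6_000_000_000   # "6 billion strings of genetic data"
read_length = 75              # characters of DNA or RNA code per string
patients = 350

total_characters = reads_total * read_length
reads_per_patient = reads_total / patients

# Assumption (not from the article): one byte per character,
# ignoring quality scores, read names and compression.
raw_gigabytes = total_characters / 1e9

print(f"Total characters of genetic code: {total_characters:,}")
print(f"Average strings per patient: {reads_per_patient:,.0f}")
print(f"Approximate raw sequence size: {raw_gigabytes:,.0f} GB")
```

Under those assumptions the estimate works out to 450 billion characters of genetic code, or roughly 17 million strings per patient.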
In a small study reported last June in Clinical Chemistry, the researchers demonstrated that whole genome sequencing of nasal swab samples from a handful of early COVID-19 patients as well as controls could not only flag the presence of the new coronavirus, but could also find other germs such as human parainfluenzavirus, the bacterium Moraxella catarrhalis, and rhinoviruses, which cause the common cold.
Since that study was conducted, researchers have sequenced more than 2,000 additional samples and are continuing to evaluate the results.
Metagenomic sequencing is very different from the COVID-19 tests now taken by millions of Americans. That now-common “PCR” test has, in a sense, prior knowledge of what it is looking for. It uses sensitive genetic probes that can spot a few telltale sections of the coronavirus’ genes, which can be read like barcodes to identify the virus as SARS-CoV-2, the pandemic strain. This allows caregivers and public health officials to diagnose and treat the disease, and to take steps to stop its spread.
By contrast, the metagenomic screen used in this new research was designed as if nothing were known about the pandemic strain. It sequences all the genetic material found in the sample. By sifting through mountains of raw genetic data and comparing them to the genetic profiles of pathogens identified before the pandemic, the program was able to uncover a “new” virus.
Within 36 hours it found and fully sequenced SARS-CoV-2 in six of six samples from COVID-19 patients who had previously tested positive. When the analysis compared an unusual gene sequence it had unearthed to those in an electronic library of viral genes known in 2019, it correctly described it as “a novel human Betacoronavirus” related to known strains found in bats.
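The idea of comparing an unfamiliar sequence against a library of known ones can be illustrated with a toy example. The sketch below is not the team's pipeline: it uses a simple shared-substring ("k-mer") comparison, made-up sequences, hypothetical reference names and arbitrary thresholds, purely to show how a partial match can flag something as new but related to a known virus.

```python
def kmers(seq, k=8):
    """Return the set of all length-k substrings (k-mers) of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_match(read, references, k=8):
    """Return (name, fraction of the read's k-mers found) for the closest reference."""
    read_kmers = kmers(read, k)
    if not read_kmers or not references:
        return None, 0.0
    scores = {
        name: len(read_kmers & kmers(ref, k)) / len(read_kmers)
        for name, ref in references.items()
    }
    name = max(scores, key=scores.get)
    return name, scores[name]


def classify(read, references, k=8, known=0.9, related=0.2):
    """Label a read using arbitrary, illustrative similarity thresholds."""
    name, score = best_match(read, references, k)
    if name is None or score < related:
        return "unclassified"
    if score >= known:
        return f"known: {name}"
    return f"possibly novel, related to {name}"


# Made-up toy sequences with hypothetical names, not real viral genomes.
REFERENCES = {
    "bat_like_toy": "ATGGCTAGCTAGGATCCGATCGATTACGCTAGC",
    "rhinovirus_like_toy": "TTGACCGGTAACCGTTAGGCCAATTGGCCTTAA",
}

exact_read = "TAGCTAGGATCCGATCGATT"    # copied straight from bat_like_toy
mutated_read = "TAGCTAGGATACGATCGATT"  # same read with one base changed
unrelated_read = "GGGGGGGGGGGGGGGGGGGG"

for read in (exact_read, mutated_read, unrelated_read):
    print(read, "->", classify(read, REFERENCES))
```

In this toy, the exact copy is labeled as known, the read with a single changed base is labeled as possibly novel but related to its closest reference, and the unrelated read is left unclassified. Real analyses work with far longer sequences and far larger reference libraries, which is where the computing cost comes from.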
Chinese researchers used whole genome sequencing in late December 2019 as part of their effort to identify the yet-to-be-named novel coronavirus, but first used PCR tests to search through lists of known respiratory viruses. The small UW/Fred Hutch study is a proof of principle that metagenomic screening and rapid analysis could be used in future disease surveillance.
While these early studies are essentially dry runs for the use of whole genome sequencing tools to study co-infections of COVID-19 with other bugs, they also demonstrate a way to quickly spot the presence of a new respiratory disease so the world will not be caught as flat-footed as it was with SARS-CoV-2.
“These results demonstrate the value of metagenomic analysis in the monitoring and response to this and future viral pandemics,” wrote the researchers, led by Fred Hutch virologist Dr. Keith Jerome, who also heads the Virology Division at UW, and Dr. Alex Greninger, assistant director of the UW Medicine Clinical Virology Laboratory.
Greninger noted that one reason these screens are so computationally intensive is that the genetic information pulled from the nasal swabs, which are capable of generating a billion snippets of code, must be matched against a database of more than 200 million known gene sequences from the vast range of microbial life archived by global researchers to date.
Services such as AWS can help researchers queue up their data and run analyses in a few hours that might otherwise take weeks.
“We’re starting to get to the point where we can almost sample all the molecules in a given sample,” he said. “The sequences are getting bigger and bigger, and the databases are getting bigger and bigger. That’s why cloud computing is so important.”
Elsewhere at Fred Hutch, researchers are using metagenomic sequencing to study how protein structures on the surfaces of infection-fighting blood cells might determine why some 20% of COVID-19 patients experience severe disease, while the rest do not.
“We have already completed the collection of sequences from 1,032 patients, half of them with severe COVID-19, and half without severe disease,” said Dr. Lue Ping Zhao, who is leading the project.
The focus of Zhao’s study is on a set of immune system genes called HLA Class II on the surfaces of T cells, the same genes that are profiled and matched so that donated immune cells are compatible with the tissue type of transplant patients. Researchers want to know if HLA genes may recognize and present viral antigens for the immune system to respond to, and if an overactive immune response might lead to the powerful inflammatory response, known as a “cytokine storm,” that is a common cause of death in severely ill patients.
“Through this interdisciplinary collaboration, we are now putting clinical, viral and genetic data together through a statistical analysis to see which HLA genetic elements might be predictive of severe COVID-19,” Zhao said.
These early experiments are demonstrating how advances in metagenomic sequencing empower researchers by giving them a clearer, comparative view of the many mechanisms that control our biology and by providing new ways to detect and treat disease.
Sabin Russell is a staff writer at Fred Hutchinson Cancer Research Center. For two decades he covered medical science, global health and health care economics for the San Francisco Chronicle, and wrote extensively about infectious diseases, including HIV/AIDS. He was a Knight Science Journalism Fellow at MIT, and a freelance writer for the New York Times and Health Affairs. Reach him at firstname.lastname@example.org.