Science Spotlight

With immunosequencing, what is past is prologue

From the Robins Lab, Public Health Sciences Division.

The human body is constantly exposed to external agents and pathogens. The adaptive immune system acts as a logbook of these events, with an individual’s T-cell repertoire recording their pathogen exposure history. Reading these logbooks in a meaningful way, however, has been difficult. In a recent issue of Nature Genetics, Dr. Harlan Robins and colleagues in the Public Health Sciences and Human Biology Divisions describe a statistical classification framework that can “read” these immunological histories to diagnose an individual’s cytomegalovirus status.

The adaptive immune system responds to a new pathogen through T cells that carry antigen-specific T cell receptors (TCRs). These receptors are generated by V(D)J recombination, in which the TCRβ gene is randomly rearranged into a highly diverse set of sequences. The antigen specificity of these TCRs is further shaped by the human leukocyte antigen (HLA) context, encoded by the highly polymorphic HLA-A, HLA-B, and HLA-C loci. Once a TCR recognizes an antigen, the activated T cell proliferates by clonal expansion and its descendants are added to the memory compartment.

Across all their circulating T cells, healthy adults express roughly 10 million unique TCRβ chains. While this is a lot of pages for any one individual’s book of antigen exposure, the library of possible rearrangements is far larger. Despite this dauntingly large library, the same TCRβ chain is observed in multiple individuals much more often than would be expected if all possibilities were equally likely. These public T cell responses, in which a particular antigen is targeted by the same TCR sequence in multiple individuals, suggest that it may be possible to characterize immunological memory across different individuals.

To evaluate this potential, the authors performed immunosequencing of the TCRβ region in 640 healthy individuals, roughly half of whom were positive for cytomegalovirus. Comparing public TCRβ chains between positive and negative individuals revealed 164 chains associated with cytomegalovirus. Because the affinity of a given TCR for a given antigen is modulated by the HLA type of the individual, the authors also examined whether these TCRβ chains were associated with HLA polymorphisms. Of the 164 chains associated with cytomegalovirus status, 45 were also associated with at least one HLA-A or HLA-B allele.
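One way to picture the comparison above is as a per-chain association test: for each public TCRβ chain, ask whether it shows up in cytomegalovirus-positive individuals more often than chance would predict. The sketch below uses a one-sided Fisher exact test for this. The choice of test is an assumption made for illustration, since this summary does not say which statistic the authors used, and the function name and all counts are hypothetical.

```python
from math import comb


def fisher_exact_greater(pos_with, pos_total, neg_with, neg_total):
    """One-sided Fisher exact p-value for a single TCR-beta chain.

    Returns the probability of finding at least `pos_with` carriers of the
    chain among the `pos_total` seropositive subjects if carrying the chain
    were unrelated to serostatus (hypergeometric null).
    """
    carriers = pos_with + neg_with      # subjects carrying the chain
    total = pos_total + neg_total       # all subjects in the cohort
    denom = comb(total, pos_total)      # ways to pick the positive group
    upper = min(carriers, pos_total)    # most carriers the group can hold
    tail = 0
    for k in range(pos_with, upper + 1):
        # k carriers and (pos_total - k) non-carriers in the positive group
        tail += comb(carriers, k) * comb(total - carriers, pos_total - k)
    return tail / denom


# Hypothetical counts, not taken from the study: a chain seen in 30 of 320
# seropositive subjects and in 2 of 320 seronegative subjects.
p_value = fisher_exact_greater(30, 320, 2, 320)
print(f"one-sided p-value: {p_value:.3g}")
```

In a screen like this the test would be repeated for every public chain, so any real analysis would also need to correct for the very large number of comparisons.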

Performance of prediction algorithm
Receiver operating characteristic (ROC) curves showing how well the T cell receptor classifier performed in the original population (dotted line), a cross-validation of the original population (dashed line), or in an independent validation cohort (solid line). Performance was measured by the area under the ROC curve (AUC). Image modified from the publication.

The authors then developed a generative binary classifier that infers cytomegalovirus serostatus from the number of associated TCRβ chains. After training and cross-validating this classifier in the original 640 individuals, they used it to predict cytomegalovirus seropositivity in a second, independent set of 120 individuals. The classifier performed well, with a sensitivity of 0.90, a specificity of 0.88, and an area under the receiver operating characteristic curve of 0.94 (see figure). The authors also constructed a similar classifier to infer the HLA type of each individual, and were able to successfully type most HLA alleles in the second group.
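For readers less familiar with these performance measures, the sketch below shows how sensitivity, specificity, and the area under the ROC curve can be computed once each individual has a score, here the number of cytomegalovirus-associated TCRβ chains they carry. It is not the authors' generative classifier: a fixed cutoff stands in for it, and the cutoff, function names, and toy data are all hypothetical.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity) for binary labels and predictions."""
    tp = sum(1 for y, p in zip(labels, predictions) if y and p)
    fn = sum(1 for y, p in zip(labels, predictions) if y and not p)
    tn = sum(1 for y, p in zip(labels, predictions) if not y and not p)
    fp = sum(1 for y, p in zip(labels, predictions) if not y and p)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(labels, scores):
    """Area under the ROC curve as the fraction of positive/negative pairs
    in which the positive subject has the higher score (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = seropositive, 0 = seronegative, with the number
# of associated chains observed in each subject.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
chain_counts = [12, 7, 3, 9, 1, 0, 4, 2]

cutoff = 3  # hypothetical threshold standing in for the real classifier
predictions = [int(count >= cutoff) for count in chain_counts]

sens, spec = sensitivity_specificity(labels, predictions)
auc = roc_auc(labels, chain_counts)
print(sens, spec, auc)
```

Sensitivity and specificity depend on the chosen cutoff, while the AUC summarizes performance across all possible cutoffs, which is why the three values are reported together.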

Overall, the authors demonstrated the possibility of diagnosing cytomegalovirus status and HLA typing via immunosequencing. Importantly, this methodology is potentially generalizable to a wide variety of immune-related phenotypes. For example, the authors hope to extend this approach to other pathogens, vaccination, and other immunological phenotypes. Furthermore, the parallelizability of the approach may allow for the diagnosis of multiple infections or infection history simultaneously. Said senior author Dr. Robins, “this study demonstrated the exciting, potentially far-reaching potential for reading T-cell memory to infer pathogen exposure history. Due to powerful computers and analytic tools, we can now access reams of data to profile the entire human ‘immunome.’ We believe that high-throughput immunosequencing will be every bit as exciting and impactful as profiling the human genome.”

Also contributing to this project from the Fred Hutch were Drs. William DeWitt, Jenna Gravley, Christopher Carlson, and John Hansen.


Emerson RO, DeWitt WS, Vignali M, Gravley J, Hu JK, Osborne EJ, Desmarais C, Klinger M, Carlson CS, Hansen JA, Rieder M, Robins HS. Immunosequencing identifies signatures of cytomegalovirus exposure history and HLA-mediated effects on the T cell repertoire. Nat Genet 2017; 49(5):659-665. doi: 10.1038/ng.3822.


Funding for this study was provided by Adaptive Biotechnologies and the W. M. Keck Foundation.