Diverse study populations drive new genetic insight

From the Carlson, Kooperberg, and Peters Groups, Division of Public Health Sciences

In the United States, disparities in disease burden and healthcare disproportionately affect minority populations, and the factors that contribute to these disparities are complex and multifactorial. Conducting research studies that include diverse populations is one way to combat growing disparities, as understanding how genetic and lifestyle factors differentially affect populations is crucial for reducing disease burden. Large-scale genome-wide association studies (GWAS) have proven to be powerful tools for discovering genetic variants that influence predisposition to diseases and behaviors. However, most GWAS are composed predominantly of individuals of European descent, which limits the generalizability of the results to other racial and ethnic groups. Researchers from the Division of Public Health Sciences, along with a team of national collaborators, recently published a paper in the journal Nature that identified several new genetic risk loci from a GWAS of 26 complex traits in a multi-ethnic study population.

The work was conducted within the Population Architecture using Genomics and Epidemiology (PAGE) study, which included nearly 50,000 non-European individuals who self-identified as Hispanic/Latino, African American, Asian, Native Hawaiian, Native American, or Other. Drs. Charles Kooperberg, Ulrike (Riki) Peters, and Chris Carlson are members of the PAGE study and senior authors on the Nature paper. The complex traits assessed in the study were grouped into eight categories: inflammatory, lipid, lifestyle, glycemic, anthropometric, electrocardiogram, blood pressure, and kidney. All individuals were genotyped using the Multi-Ethnic Genotyping Array, a tool recently developed by PAGE investigators for use in genomic studies of diverse populations. Association analyses were conducted in two ways: first with all individuals included, because genetic diversity occurs across a continuum, and then separately within race/ethnicity categories based on participants’ self-reports. The latter is the method commonly employed in GWAS, but the former had greater power to detect significant associations. As Dr. Peters emphasized, “Genetic ancestry is a continuum, so it’s not surprising that analyzing all subjects jointly is more powerful than trying to divide individuals into predefined subsets.”

Image of DNA strand (image from Pixabay)

“The main contribution of the paper is the development of a framework for multi-ethnic analyses. While we identified a substantial number of genetic associations, much larger multi-ethnic studies are needed to level the playing field,” said Dr. Kooperberg. The authors identified 16 new associations between traits and genetic variants and 11 new low-frequency variants with suggestive associations. These 27 novel genetic loci were spread across the eight trait categories. In adjusted analyses, the authors also discovered 38 secondary signals at loci previously identified in other GWAS. Correlation analyses revealed significant differences in the frequencies of the novel risk variants among the different populations. Dr. Carlson pointed out, “The findings demonstrate that GWAS of populations of European descent are likely to miss important risk alleles present in minority populations.” For the 26 complex traits included in the current study, previous GWAS had identified 8,979 variant-trait associations. Because these associations had been discovered in cohorts composed predominantly of individuals of European descent, the authors wanted to determine what proportion could be replicated in the multi-ethnic PAGE population. When they examined this set of known variants, 16% (1,444) were replicated in the multi-ethnic population at the standard significance threshold. Furthermore, the authors demonstrated that the effect sizes reported for Europeans were significantly weaker in the minority populations, particularly in those of African descent. “Polygenic risk scores to define individual risk for common complex diseases derived from European populations will underperform in minority populations,” noted Dr. Kooperberg.
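
For readers unfamiliar with the term, a polygenic risk score is typically computed as a weighted sum: the number of risk alleles a person carries at each variant (0, 1, or 2) is multiplied by that variant's effect size from a GWAS, and the products are added together. The minimal sketch below uses hypothetical variant names and made-up effect sizes. It is meant only to illustrate why a score depends on the population in which the effect sizes were estimated, and it does not reproduce any analysis from the paper.

```python
def polygenic_score(dosages, effect_sizes):
    """Weighted sum of risk-allele counts.

    dosages: variant ID -> number of risk alleles carried (0, 1, or 2)
    effect_sizes: variant ID -> per-allele effect size from a GWAS
    Variants missing from `dosages` are treated as 0 copies.
    """
    return sum(beta * dosages.get(variant, 0)
               for variant, beta in effect_sizes.items())


# Made-up effect sizes for three hypothetical variants (illustration only).
european_weights = {"variant_a": 0.5, "variant_b": 0.25, "variant_c": 0.125}
# Hypothetical weaker effects for the same variants in another population.
other_weights = {"variant_a": 0.25, "variant_b": 0.125, "variant_c": 0.125}

# One hypothetical person's risk-allele counts.
person = {"variant_a": 2, "variant_b": 1, "variant_c": 0}

print(polygenic_score(person, european_weights))  # 1.25
print(polygenic_score(person, other_weights))     # 0.625
```

In this toy example the same genotype receives a score twice as large under the European-derived weights, so a score built from those weights would overstate this person's predicted risk if the true effects in their population were the weaker ones.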

In a subsequent analysis, the authors aimed to further determine the impact of conducting GWAS with multi-ethnic populations. They used data from a GWAS of 250,000 individuals of European descent (the ‘GIANT’ study) and added either the ~50,000 individuals of the PAGE cohort or 50,000 individuals of European descent randomly selected from the UK Biobank. When they assessed genetic loci significant for height, adding 50,000 individuals from either cohort increased the number of significant loci: GIANT plus PAGE identified 82 loci, while GIANT plus the UK Biobank identified 107 loci. Beyond the number of significant loci, combining these data sources showed that adding diverse populations to European-descent GWAS can improve fine-mapping of known loci, reducing the average 95% credible set from 12 single nucleotide polymorphisms (SNPs) to 9.7 SNPs. “While these results suggest that although an increase in the study population size can increase total number of identified loci, lack of diversity of the population can exacerbate disparities in genetic understanding of complex traits among populations,” said Dr. Carlson.
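
To make the fine-mapping result concrete: a 95% credible set is the smallest group of variants at a locus whose posterior probabilities of being the causal variant sum to at least 0.95, so a smaller set leaves fewer candidates to follow up. The sketch below is a generic illustration and not the method used in the paper, which this summary does not describe. The function name `credible_set` and all probabilities are hypothetical, and the sketch assumes a single causal variant per locus with posterior probabilities already computed.

```python
def credible_set(posteriors, coverage=0.95):
    """Return the smallest list of variant IDs whose posterior
    probabilities sum to at least `coverage`.

    posteriors: variant ID -> posterior probability of being the causal
    variant (assumed to sum to roughly 1 across the locus).
    """
    # Rank variants from most to least likely to be causal.
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, total = [], 0.0
    for variant, prob in ranked:
        chosen.append(variant)
        total += prob
        if total >= coverage:
            break
    return chosen


# Hypothetical locus with made-up posterior probabilities (illustration only).
european_only = {"snp1": 0.40, "snp2": 0.30, "snp3": 0.15,
                 "snp4": 0.08, "snp5": 0.07}
with_diverse = {"snp1": 0.80, "snp2": 0.16, "snp3": 0.02,
                "snp4": 0.01, "snp5": 0.01}

print(len(credible_set(european_only)))  # 5
print(len(credible_set(with_diverse)))   # 2
```

In this toy locus, sharper posterior probabilities shrink the credible set from five variants to two. Patterns of correlation between neighboring variants differ across populations, which is one reason adding diverse cohorts can help separate a causal variant from its neighbors.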

The results from this study clearly demonstrate the benefits of including multi-ethnic populations in study cohorts. Efforts to conduct studies with diverse populations may translate into a reduction in health disparities, greater health benefits for minority populations, and a greater understanding of the biology underlying many complex diseases.


This work was supported by the National Institutes of Health.

Fred Hutch/UW Cancer Consortium members Drs. Christopher Carlson, Ulrike Peters, Charles Kooperberg, Timothy Thornton, and Alexander Reiner contributed to this research.

Wojcik GL, Graff M, Nishimura KK, Tao R, Haessler J, Gignoux CR, Highland HM, Patel YM, et al. 2019. Genetic analyses of diverse populations improves discovery for complex traits. Nature. doi: 10.1038/s41586-019-1310-4.