Biostatistics Seminar Series

Biostatistics Program

Seminar Series: 2019

When: 12:00 noon

Where: M1-A305/307 unless otherwise noted.

The Winter, Spring, and Summer 2019 organizer is Jing Ma

Upcoming Seminars

May 22, 2019

*Note special time & location*

12:30 - 1:30 pm, Location: B1-072/074, Weintraub Building

Glen Satten, Centers for Disease Control

Analyzing Microbiome Data Using the Linear Decomposition Model

Abstract. Distance-based methods for analyzing microbiome data are typically restricted to testing the global hypothesis of any effect of the microbiome on a trait of interest, but do not test the contribution of individual bacteria (operational taxonomic units, OTUs, or amplicon sequence variants, ASVs). Conversely, tests for individual OTUs do not typically provide a global test of microbiome effect. Without a unified approach, the findings of a global test may be hard to reconcile with the findings at the individual OTU level. In addition, many existing methods cannot be applied to complex studies such as those with confounders and correlated data. To bridge this gap, we have proposed the linear decomposition model (LDM), which provides a single analysis path that includes global tests of any effect of the microbiome and tests of the effects of individual OTUs, while accounting for multiple testing by controlling the false discovery rate (FDR). The LDM accommodates both continuous and discrete variables and allows for adjustment of confounding covariates. We also show how to analyze matched sets of microbiome data using the LDM, and consider applying the LDM to presence-absence data. If time permits, I will also describe a general approach to testing association between groups of OTUs (e.g., species, genera, families, etc.) that have a tree-structured dependence structure using a novel bottom-up approach.


June 26, 2019

*Joint Biostatistics & Data Science Affinity Group Seminar*

12:00 - 1:00 pm

Shwetak Patel, University of Washington

New Ways of Thinking of the Mobile Phone for Healthcare

Abstract. Much of the fundamental research in computer science has been driven by the needs of those attempting to utilize computing for various applications, such as health. Dr. Patel will describe a collection of research projects conducted with his clinical collaborators that leverage the sensors on mobile devices (e.g., microphones, cameras, accelerometers, etc.) in new ways to enable the screening, self-management, and longitudinal study of diseases. These projects follow the theme of finding unique signals and biomarkers in order to enable access and scale by leveraging existing hardware. His remarks will underscore the potential advances in health and clinical science through the convergence of sensing, machine learning, and human-computer interaction.

Past Seminars

April 24, 2019

Zihuai He, Stanford University

“Statistical and computational methods for integrative analysis on non-coding variation”

Understanding the functional consequences of genetic variants is a challenging problem, especially for variants in non-coding regions. The noncoding genome covers ~98% of the human genome and includes elements that regulate when, where, and to what degree protein-coding genes are transcribed. We will talk about a combination of new methodologies for the analysis of noncoding variants, integrating whole genome sequencing, epigenetic technologies and experimental approaches. First, we propose a semi-supervised approach, GenoNet, to jointly utilize experimentally confirmed regulatory variants (labeled variants), millions of unlabeled variants genome-wide, and more than a thousand cell type/tissue specific epigenetic annotations to predict functional consequences of non-coding genetic variants. Second, we propose a scan statistic framework, GenoScan, to simultaneously detect the existence, and estimate the locations of the association signal at genome-wide scale. Last, we will discuss their application to integrative analysis of complex trait genetics. 

April 3, 2019

Xing Hua, National Cancer Institute

“Intra-tumor Heterogeneity in Lung Adenocarcinoma and Statistical Challenges”

Genomic studies have revealed remarkable intra-tumor heterogeneity (ITH) and its clinical impact in various types of cancers. To investigate the ITH in lung adenocarcinoma (LUAD), we performed a comprehensive analysis of ITH in copy number variation, DNA methylation and somatic mutations of 292 tumor samples from 84 LUAD patients. In this talk, I will introduce some of our main findings in this project and the statistical work behind them: 1) Lung adenocarcinomas show substantial ITH. 2) Great consistency was found between the genetic and epigenetic profiles of LUAD tumors. 3) High ITH is associated with poor prognosis in LUAD.


March 27, 2019

12:00 - 1:00 pm

Abdus S. Wahed, University of Pittsburgh

“Parametric Regression Models for Optimal Treatment Regimes for Leukemia”

Patients with cancer or other recurrent diseases may undergo a long process of initial treatment, treatment resistance or disease recurrences followed by salvage treatments. Optimizing leukemia treatment should account for this complex process to maximally prolong patients' survival. Comparing disease-free survival for each treatment stage over-penalizes disease recurrences but under-penalizes treatment-related mortalities. Moreover, treatment regimes used in practice are dynamic; that is, the choice of next treatment depends on a patient's responses to previous therapies. In this talk, using accelerated failure time models, we will develop a method to optimize such dynamic treatment regimes. This method utilizes all the longitudinal data collected during the multi-stage process of disease recurrences and treatments, and identifies the optimal dynamic treatment regime for each individual patient by maximizing his or her expected overall survival. We illustrate the application of this method using data from a study of acute myeloid leukemia.

March 13, 2019

12:00 - 1:00 pm

Katerina Kechris, University of Colorado Denver

“High-throughput methods for studying the role of micro RNA regulation in alcohol related behaviors”

Alcohol use disorders affect more than 16 million people in the United States. There is increasing evidence that micro RNA (miRNA) plays an important role in alcohol-related behaviors. The expression of miRNA and corresponding targets has been found to change in the brain following acute or chronic ethanol exposure. However, the role of miRNA as a mediator of the genetic effect on alcohol phenotypes is not well understood. Our group has explored the role of miRNA regulation in the brain as a predisposing factor to alcohol responses and behavior. Within a well-characterized renewable panel of mice, we have integrated behavioral data with genetic variants and high-throughput miRNA and mRNA brain expression data to identify miRNA mediating pathways. Our results suggest a mechanism by which genetic variants may affect alcohol behaviors through the modification of miRNA expression and its downstream targets.

In this talk, I will also discuss the development of biostatistics and bioinformatics methods relevant to this project, including miRNA target site databases, miRNA sequencing quantitation, models for repeated measures in sequencing data, and -omics data integration.

February 27, 2019

12:00 - 1:00 pm

Amy Willis, University of Washington

“Estimating diversity and relative abundance in microbial communities”

High-throughput sequencing has advanced our understanding of the role that bacteria and archaea play in marine, terrestrial and host-associated health. Microbial community ecology differs in many ways from macroecology, and therefore new statistical methods are required to analyze microbiome data. In this talk I will present two new statistical methods for the analysis of microbiome data. The first, DivNet, estimates the diversity of microbial communities, and the second, corncob, estimates the relative abundance of microbial strains, metabolites, or genes. Both methods explicitly model microbe-microbe interactions, resulting in larger (but more accurate) estimates of variance compared to classical models. The methods will be illustrated with an analysis of the effects of wildfire on soil microbial communities in the northwestern Canadian boreal forest. 

February 20, 2019

12:00 - 1:00 pm

Yen-Chi Chen, University of Washington

“Analyzing GPS data using density ranking”

A common approach for analyzing a point cloud is based on estimating the underlying probability density function. However, in complex datasets such as GPS data, the underlying distribution function is singular so the usual density function no longer exists. To analyze this type of data, we introduce a statistical model for GPS data in the form of a mixture model with different dimensions. To derive a meaningful surrogate of the probability density, we propose a quantity called density ranking. Density ranking is a quantity representing the intensity of observations around a given point that can be defined in a singular measure. We then show that one can consistently estimate the density ranking using a kernel density estimator even in a singular distribution such as the GPS data. We apply density ranking to GPS datasets to analyze activity spaces of individuals. 


February 6, 2019

Wenxuan Zhong, University of Georgia

MetaGen: Reference-Free Learning with Multiple Metagenomic Samples

A major goal of metagenomics is to identify and study the entire collection of microbial species in a set of targeted samples. This talk will present a novel statistical metagenomic algorithm that simultaneously identifies microbial species and estimates their abundances without using reference genomes. Compared to reference-free methods based primarily on k-mer distributions or coverage information, the proposed approach achieves a higher species binning accuracy and is particularly powerful when sequencing coverage is low. Performance of this new method will be demonstrated through both simulation and real metagenomic studies. The MetaGen software is available at

January 30, 2019

Hua Tang, Stanford University

“Learning Genetic Architecture of Complex Traits Across Populations”

Genome-wide association studies (GWAS) have become a standard approach for identifying loci influencing complex traits. However, the relatively small sample size of non-European cohorts poses special challenges for both trait mapping and risk prediction. In the first part of the talk I will discuss two approaches, which aim to overcome these challenges by leveraging trans-ethnic information.  More recently, large-scale multi-ethnic cohorts have offered unprecedented potential to elucidate the genetic factors influencing complex traits in minority populations. The second part of the talk will focus on the utility of race/ethnicity categories in these multi-ethnic cohorts. We demonstrate that race/ethnicity-stratified analysis enhances the ability to understand population-specific genetic architecture. To address the practical issue that self-reported racial/ethnic information may be incomplete, we use a machine learning algorithm to produce a surrogate variable, termed HARE. We use height as a model trait to demonstrate the utility of HARE and ethnicity-specific GWAS.

January 23, 2019

*Joint Biostatistics & ATME Affinity Group Seminar*

12:00 - 1:00 pm

Saonli Basu, University of Minnesota

“Estimating Variance Components in Longitudinal Family Studies with Applications to Genetic Heritability”

A longitudinal familial study with repeated measurements on relatives can serve as a powerful resource to detect genetic association in the development of complex traits. Modeling the covariance structure in a longitudinal family study is often quite challenging, as one needs to capture three distinct levels of dependency: the dependency among family members at each time point, the dependency of outcomes in an individual across all time points, and the dependency of outcomes between two family members at different time points. Heritability measures the contribution of the additive genetic component to a trait's variance. In this talk, we investigate the challenges with joint genetic association testing and heritability estimation with longitudinal family data. We will focus on twin studies, as this unique study design with MZs and DZs can separate additive genetic effects from the shared environmental component and thus provides a means for accurate heritability estimation. We develop a model that enables both heritability estimation and association testing with more flexible assumptions and is an extension of the traditional Falconer's approach. For our model, we propose a rapid two-stage estimation procedure and a method of moments approach which can even be used to estimate variance components in a multivariate ACE model. We compare our approach with existing approaches through extensive simulation studies and illustrate our approach on Minnesota twin studies.
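
For orientation, the traditional Falconer approach named in the abstract can be sketched in a few lines. This is a minimal illustration of the textbook formulas only, not the speaker's extended model; the function name `falconer_ace` and the example correlations are hypothetical.

```python
def falconer_ace(r_mz: float, r_dz: float) -> dict:
    """Textbook Falconer estimates from twin correlations.

    r_mz: trait correlation within monozygotic (MZ) twin pairs.
    r_dz: trait correlation within dizygotic (DZ) twin pairs.
    Returns the estimated variance shares of the ACE decomposition.
    """
    a2 = 2.0 * (r_mz - r_dz)  # additive genetic share (heritability)
    c2 = 2.0 * r_dz - r_mz    # shared environmental share
    e2 = 1.0 - r_mz           # unique environmental share
    return {"a2": a2, "c2": c2, "e2": e2}


# Hypothetical correlations, chosen only to illustrate the arithmetic.
estimates = falconer_ace(r_mz=0.8, r_dz=0.5)
```

The three shares sum to one by construction. The formulas use only cross-sectional twin correlations, which is the limitation a longitudinal model such as the one in the talk is meant to relax.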

December 5, 2018


Jun Xie, Purdue University

"Powerful statistical tests for genetic variant sets with scalable algorithms"

Identification of disease-related genetic variants is dependent upon a fundamental statistical method, hypothesis testing. In the context of whole genome sequence analysis, standard hypothesis testing methods, as well as multiple testing methods, face great challenges, due to the situations of sparse and weak genetic effects, rare variants, high correlations between genetic variants, and confounding factors such as population stratification. We conduct studies of the theory and application of powerful statistical tests for sets of multiple variants. In statistical theory, the tests that we study are global tests, based on either regression approaches or combinations of individual p-values, and have been proven to be optimal under sparse and weak alternative hypotheses. We demonstrate the advanced tests through theories, simulations, and real genome-wide data analysis. By applying the powerful tests for large amounts of sequence data, we aim to aggregate individual genetic variant effects and reduce the burden of multiple testing, improve power to detect weak genetic effects, develop an efficient and accurate p-value calculation for scalable algorithms, incorporate other covariates such as clinical and demographical data and confounding factors, and accommodate high correlations between variants.

November 28, 2018


Jinbo Chen, University of Pennsylvania

“An Estimating Equation Approach to Adjusting for Case Contamination in Electronic Health Records-based Case-Control Studies”

Abstract: Clinically relevant information from electronic health records (EHRs) permits derivation of a rich collection of phenotypes. Unfortunately, the true status of any given individual with respect to a phenotype of interest is not necessarily known. A common study design is to use structured clinical data elements to identify case and control groups. While controls can usually be identified with high accuracy through rigorous selection criteria, the stringency of rules for identifying cases needs to be balanced against the achievable sample size. The inaccurate identification results in a pool of candidate cases consisting of genuine cases and non-case subjects that do not satisfy control definition. This case contamination issue represents a unique challenge in EHR-based case-control studies. We propose a novel estimating equation (EE) approach to estimating odds ratio association parameters and study its large sample properties. We evaluate the large and finite sample performance of our method through extensive simulation studies and application to a real EHR-based study of aortic stenosis. A practical issue for designing EHR-based case-control studies is the balance between accuracy and size of the case pool. Our simulation results showed that enlarging the case pool by incorporating more genuine cases can lead to improved statistical efficiency of EE estimates.

November 7, 2018


Bin Zhu, NCI

“Utilization of patient information to investigate subtype heterogeneity of driver genes in cancer genome sequencing studies”

Identifying cancer driver genes is essential for understanding mechanisms of carcinogenesis and designing therapeutic strategies. Consequently, a set of driver genes has been identified for each cancer type and is assumed to be identical across subtypes. This assumption may not hold, and the sets of driver genes are possibly distinct across cancer subtypes. We propose a statistical framework, MutScot, that identifies driver genes and utilizes patient information to investigate subtype heterogeneity of driver genes. Through simulation studies, we show that, compared with other methods, MutScot is more powerful and properly controls the type I error for finding driver genes; we demonstrate that MutScot is capable of identifying subtype heterogeneity of a driver gene, which is infeasible with other methods. Applications of MutScot to three studies from The Cancer Genome Atlas (TCGA) show that MutScot finds driver genes with higher accuracy and identifies subtype heterogeneity of driver genes in breast cancer with regard to hormone receptor status.

October 31, 2018



“Efficiency of incorporating retesting outcomes for estimation of disease prevalence”

Group testing has been widely used as a cost-effective strategy to screen for and estimate the prevalence of a rare disease. While it is well recognized that retesting is necessary for identifying infected subjects, it is not required for estimating the prevalence. However, one can expect gains in statistical efficiency from incorporating retesting results in the estimation. Research in this context is scarce, particularly for tests with classification errors. For an imperfect test, we show that retesting subjects in either positive or negative groups can substantially improve the efficiency of the estimates, and that retesting positive groups yields higher efficiency than retesting the same number or proportion of negative groups. Moreover, when the test is subject to no misclassification, performing retesting on positive groups still results in more efficient estimates.
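
As background for this abstract, the standard prevalence estimate from initial pooled results alone can be sketched as follows. It assumes a perfect test, equal pool sizes, and no retesting, so it is the baseline that retesting designs aim to improve on rather than the speaker's estimator; the function name `pooled_prevalence` is hypothetical.

```python
def pooled_prevalence(positive_pools: int, total_pools: int, pool_size: int) -> float:
    """Prevalence estimate from pooled tests without retesting.

    Assumes a perfect test and pools of equal size. A pool is negative
    only if all of its members are negative, so
    P(pool negative) = (1 - p) ** pool_size, which is solved for p.
    """
    if total_pools <= 0 or pool_size <= 0:
        raise ValueError("total_pools and pool_size must be positive")
    if not 0 <= positive_pools <= total_pools:
        raise ValueError("positive_pools must lie between 0 and total_pools")
    frac_negative = 1.0 - positive_pools / total_pools
    return 1.0 - frac_negative ** (1.0 / pool_size)


# Hypothetical data: 19 of 100 pools of size 2 test positive.
p_hat = pooled_prevalence(positive_pools=19, total_pools=100, pool_size=2)
```

With an imperfect test, the pool-level positive rate also depends on sensitivity and specificity, which is the setting the abstract emphasizes.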

October 10, 2018



Anna Bellach, University of Washington

The Regression Analysis of Competing Risks Data with Pseudo Risk Sets

Abstract. A common approach for analyzing competing risks data is to model the cause-specific hazards. Challenges arise from the fact that the relation between the cause-specific hazard and the corresponding cumulative incidence function is complex. The product limit estimator based on the cause-specific hazard systematically overestimates the cumulative incidence function, and estimated regression parameters are not interpretable with regard to the cumulative incidence function. Direct regression modeling of the subdistribution has thus become popular for analyzing data with multiple competing event types. All general approaches so far are based on nonlikelihood-based procedures and target covariate effects on the subdistribution. We introduce a novel weighted likelihood function that allows for a direct extension of the Fine-Gray model to a broad class of semiparametric transformation models. Targeting the subdistribution hazard, the model accommodates time-dependent covariates. We establish extensions to the practically relevant settings of recurrent event data with competing terminal events and to independently left-truncated and right-censored competing risks data. To motivate the proposed likelihood method, we derive standard nonparametric estimators and discuss a new interpretation based on pseudo risk sets. We establish consistency and asymptotic normality of the estimators and propose a sandwich estimator of the variance. In comprehensive simulation studies, we demonstrate the strong performance of the weighted nonparametric maximum likelihood estimators. Illustrating its practical utility, we provide applications of the proposed method to a large bone marrow transplant dataset, to recent data from HIV-1 vaccine efficacy trials, and to a bladder cancer dataset.

October 3, 2018


Yue Wang, Fred Hutch

Partial Least Squares Methods for Functional Regression Models

Abstract. With the growth of modern technology, many biomedical studies have collected massive datasets with large volumes of imaging, genetic, and clinical data from increasingly large cohorts. An important class of prediction problems in modern biomedical studies is to use medical images (e.g., computed tomography and magnetic resonance imaging) as well as genetic and clinical biomarkers at an earlier time point to predict important clinical outcomes. Functional regression is one of the statistical models to handle this prediction problem by treating medical images as smooth functions.  We developed the functional partial least squares (FPLS) methods for a wide class of functional models. Numeric studies including simulation and ADNI data analysis demonstrated the advantage of FPLS methods in terms of both estimation and prediction accuracy.

September 26, 2018


Location: D1-080/084 Thomas Building

Brenda Price, University of Washington

Comparing IPCW-TMLE Methodology to Breslow-Holubkov Estimation for Two Phase Sampling Association Studies

Abstract. New developments in analysis for two-phase (or two-stage) studies include an inverse probability of censoring weighted targeted maximum likelihood estimator (IPCW-TMLE) approach for association parameter estimation. This approach exhibits doubly-robust properties potentially making it more desirable than some existing methodologies. We explore how well the doubly robust IPCW-TMLE approach compares to the logistic regression approach of Breslow and Holubkov (1997, JRSS-B), a well-established method for two-phase data analysis. Simulation studies explore a variety of scenarios, including instances of rare or common failure rate, few or many covariates, and few or many cases. Preliminary results indicate that the IPCW-TMLE approach exhibits less bias in estimation of the average exposure effect and results in smaller root mean square error than the Breslow-Holubkov approach when the outcome regression model is misspecified.

August 8, 2018

Pei Wang, Icahn School of Medicine at Mount Sinai

Constructing Tumor-specific Gene Regulatory Networks Based on Samples with Tumor Purity Heterogeneity

Abstract. Tumor tissue samples often contain an unknown fraction of normal cells. This problem, known as tumor purity heterogeneity (TPH), was recently recognized as a severe issue in omics studies. Specifically, if TPH is ignored when inferring co-expression networks, edges are likely to be estimated among genes with a mean shift between normal and tumor cells rather than among gene pairs interacting with each other in tumor cells. To address this issue, we propose TSNet, a new method that constructs tumor-cell-specific gene/protein co-expression networks based on gene/protein expression profiles of tumor tissues. TSNet treats the observed expression profile as a mixture of expressions from different cell types and explicitly models the tumor purity percentage in each tumor sample. The advantage of TSNet over existing methods that ignore TPH is illustrated through extensive simulation examples. We then apply TSNet to estimate tumor-specific co-expression networks based on breast cancer expression profiles. We identify novel co-expression modules and hub structure specific to tumor cells.


Public Health Science Special Seminar

August 8, 2018

Arlene Ash, University of Massachusetts

Predicting Risk in Health Care

Abstract. In this talk I will provide an overview of what we have learned, since the mid-1980s, about building predictive models for US health care delivery programs, including Medicare (covering most people over age 65 and some younger people with disabilities) and Medicaid (joint Federal-state partnerships, mainly financing care for low-income people). Our payment models are designed to provide health plans with the resources needed to care for the specific panel of individuals they enroll: more for those with complex medical and/or social needs and less for those with low levels of expected need. I will also discuss current work on risk adjustment for quality measures, with the specific example of a measure that seeks to identify and reward plans that reduce (unnecessarily) high rates of Emergency Department visits.


June 6, 2018

Shuangge Steven Ma, Yale School of Public Health

Assisted Analysis of Gene Expression Data

Abstract. Gene expression studies have been playing a pivotal role in research on many complex diseases. Given the high dimensionality and noisy nature of the data, the analysis of gene expression studies, despite many promising findings, is still often unsatisfactory. In recent omics studies, a prominent trend is to conduct multidimensional studies, where gene expressions are profiled along with their regulators (methylation, copy number variation, microRNA, and others). In a series of studies, we have developed assisted analysis techniques, which use regulator information to assist the regression, clustering, and other analysis of gene expression data. The assisted analysis differs in multiple aspects both from the analysis of gene expression data alone and from integrated analysis. Numerical and statistical investigations show promising performance of the assisted analysis.


May 30, 2018

Xiao-Li Meng, Harvard University

Statistical Paradises and Paradoxes in Big Data (I): Law of Large Populations, Big Data Paradox, and the 2016 US Presidential Election


Abstract. The term “Big Data” emphasizes data quantity, not quality. Once we take into account data quality, the effective sample size of a “Big Data” set can be vanishingly small, because of the Law of Large Populations, bringing back the long-forgotten monster, the population size. Without understanding this phenomenon, we may subject ourselves to the Big Data Paradox: the bigger the data, the surer we fool ourselves. This is because we would be misled by the drastically inflated precision assessment and hence a gross overconfidence, setting us up to be caught by surprise when the reality unfolds, as we all experienced during the 2016 US presidential election. Data from the Cooperative Congressional Election Study (CCES, conducted by Stephen Ansolabehere, Douglas Rivers and others, and analyzed by Shiro Kuriwaki) are used to assess the data quality in 2016 US election polls, with the aim of gaining a clearer vision for the 2020 election and beyond. (Preprint available at


May 16, 2018

Eric Laber, North Carolina State University

Sample size considerations for precision medicine

Abstract. Sequential Multiple Assignment Randomized Trials (SMARTs) are considered the gold standard for estimation and evaluation of treatment regimes. SMARTs are typically sized to ensure sufficient power for a simple comparison, e.g., the comparison of two fixed and non-overlapping treatment sequences. Estimation of an optimal treatment regime is conducted as part of a secondary and hypothesis-generating analysis, with formal evaluation of the estimated optimal regime deferred to a follow-up trial. However, running a follow-up trial to evaluate an estimated optimal treatment regime is costly and time-consuming; furthermore, the estimated optimal regime that is to be evaluated in such a follow-up trial may be far from optimal if the original trial was underpowered for estimation of an optimal regime. We derive sample size procedures for a SMART that ensure: (i) sufficient power for comparing the optimal treatment regime with standard of care; and (ii) the estimated optimal regime is within a given tolerance of the true optimal regime with high probability. We establish asymptotic validity of the proposed procedures and demonstrate their finite sample performance in a series of simulation experiments.


May 9, 2018

Jinko Graham, Simon Fraser University

Combining Phenotypes, Genotypes and Genealogies to Find Trait-influencing Variants

A basic tenet of statistical genetics is that shared ancestry leads to trait similarities in individuals. Related individuals share segments of their genome, derived from a common ancestor. The coalescent is a popular mathematical model of the shared ancestry that represents the relationships amongst segments as a set of binary trees, or genealogies, along the genome. While these genealogies cannot be observed directly, the genetic-marker data enable us to sample from their posterior distribution. For each genealogical tree that is sampled, we may compare the clustering of trait values to the clustering obtained under the prior distribution. This comparison provides a latent p-value that reflects the degree of surprise about the prior distribution in the sampled tree. The distribution of these latent p-values is the fuzzy p-value as defined by Geyer and Thompson. The fuzzy p-value contrasts the prior and posterior distributions and is informative for mapping trait-influencing variants. In this talk, I will discuss these ideas with application to data from an immune-marker study, present results from preliminary analyses, and highlight potential avenues for further research.

April 25, 2018

Limin Peng, Emory University

Trajectory Quantile Regression for Longitudinal Data

Quantile regression has demonstrated promising utility in longitudinal data analysis. Existing work is primarily focused on modeling cross-sectional outcomes, while outcome trajectories often carry more substantive information in practice. In this work, we develop a trajectory quantile regression framework that is designed to robustly and flexibly investigate how latent individual trajectory features are related to observed subject characteristics. The proposed models are built under multilevel modeling with usual parametric assumptions lifted or relaxed. We derive our estimation procedure by novelly transforming the problem at hand to quantile regression with perturbed responses and adapting the bias correction technique for handling covariate measurement errors. We establish desirable asymptotic properties of the proposed estimator, including uniform consistency and weak convergence. Extensive simulation studies confirm the validity of the proposed method as well as its robustness. An application to the DURABLE trial uncovers sensible scientific findings and illustrates the practical value of our proposals.

April 11, 2018

Jinguo Cao, Simon Fraser University

Estimating Time-Varying Directed Gene Regulation Networks

The problem of modeling the dynamical regulation process within a gene network has been of great interest for a long time. We propose to model this dynamical system with a large number of nonlinear ordinary differential equations (ODEs), in which the regulation function is estimated directly from data without any parametric assumption. Most current research assumes the gene regulation network is static, but in reality, the connection and regulation function of the network may change with time or environment. This change is reflected in our dynamical model by allowing the regulation function to vary with the gene expression and forcing this regulation function to be zero if no regulation happens. We introduce a statistical method called functional SCAD to estimate a time-varying sparse and directed gene regulation network and, simultaneously, to provide a smooth estimate of the regulation function and identify the interval in which no regulation effect exists. The finite sample performance of the proposed method is investigated in a Monte Carlo simulation study. Our method is demonstrated by estimating a time-varying directed gene regulation network of 20 genes involved in muscle development during the embryonic stage of Drosophila melanogaster.

April 4, 2018

Tracey Marsh

Quantifying Uncertainty in Estimates of Net Benefit

Evidence for the clinical usefulness of biomarkers, as incorporated into clinical practice, is the primary goal of phase 4 prospective studies and phase 5 trials. At earlier phases of biomarker development, how to appraise the compromise between sensitivity and specificity is central to the challenge of assessing the potential for clinical utility from a retrospective study. When a biomarker test will ultimately guide whether or not a patient will undergo a particular medical intervention (for example to diagnose or prevent disease), net benefit metrics provide a single population-level quantity that reflects the consequences of test-guided decisions.  The importance of quantifying the uncertainty inherent in estimates of net benefit does not appear widely appreciated, as evidenced by routine omission in published results. I will present recently completed work that establishes the asymptotic theory for empirical estimators of net benefit. I leverage the versatility of influence function techniques to provide inference for estimators from common study designs, contrasting test performance, and to estimate confidence bands for net benefit curves. Resulting insights and implications for study design will be discussed. Examples are drawn from cancer biomarker research and validation of clinical decision rules in cardiovascular and emergency medicine.
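
For readers unfamiliar with net benefit, one common empirical definition from the decision-curve literature can be sketched as follows. This is a point estimate only, under the assumption that the talk uses this standard form; the function name `net_benefit` and the example counts are hypothetical, and the inference the abstract describes (standard errors, confidence bands) is not reproduced here.

```python
def net_benefit(true_pos: int, false_pos: int, n: int, threshold: float) -> float:
    """Empirical net benefit of a test-guided decision rule.

    true_pos, false_pos: counts among the n subjects who test positive.
    threshold: risk threshold in (0, 1) at which intervention is judged
    worthwhile; its odds weight each false positive against a true positive.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    odds = threshold / (1.0 - threshold)
    return true_pos / n - odds * false_pos / n


# Hypothetical counts: 30 true positives and 20 false positives among 200 subjects.
nb = net_benefit(true_pos=30, false_pos=20, n=200, threshold=0.2)
```

Because the estimate is built from sample proportions, it carries sampling variability, which is the uncertainty the abstract argues should be reported alongside it.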