Vaccine and Infectious Disease Division

Estimating false positive rates more precisely

For any disease test or screening, a certain false positive rate exists. In statistics, the “positive predictive value” (PPV) refers to the proportion of people with a positive test result who actually have the disease in question; a high PPV means that few of the positive results are false positives. The PPV depends on the prevalence of the disease, so populations with different disease rates will have different PPVs for the same diagnostic test. Traditionally, the PPV is estimated separately for each population. Now, VIDD assistant member Dr. Ying Huang and colleagues, including VIDD assistant member Dr. Youyi Fong, have devised a way to estimate this value more accurately using data from multiple populations.
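
To make the dependence on prevalence concrete, the short sketch below computes the PPV from Bayes' rule. The function name and the example numbers (a test with 90% sensitivity and 90% specificity, applied at 1% and 30% prevalence) are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV by Bayes' rule: P(disease | positive test).

    sensitivity: true positive rate, P(positive | disease)
    specificity: true negative rate, P(negative | no disease)
    prevalence:  P(disease) in the population being tested
    """
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical test characteristics, identical in both populations.
SENSITIVITY = 0.90
SPECIFICITY = 0.90

# Same test, two hypothetical populations with different disease rates.
ppv_rare = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence=0.01)
ppv_common = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence=0.30)

print(f"PPV at  1% prevalence: {ppv_rare:.3f}")
print(f"PPV at 30% prevalence: {ppv_common:.3f}")
```

With identical test characteristics, the PPV is much lower where the disease is rare, because false positives from the large healthy group outnumber true positives from the small diseased group.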

As an example, Huang looked at the power of PCA3, a prostate-specific molecule that is over-expressed in prostate tumors, to predict prostate cancer. She looked at two populations: men who had previously had a negative biopsy for prostate cancer, and men who had never been biopsied. The rates of prostate cancer in these two populations are very different. For both populations, the researchers first estimated the “receiver operating characteristic” (ROC) curve, a plot of the true positive rate of the test against its false positive rate that does not depend on disease prevalence. Relying on this measure, they could then estimate the PPV for each population. Compared with calculating the PPV for each population in isolation, this technique makes more efficient use of the data and should give more accurate PPV estimates.
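
The general idea of borrowing information can be sketched in a deliberately simplified form. The code below is not the estimator from the paper. It assumes a single fixed test threshold and that the true positive and false positive rates at that threshold are the same in both populations, so those two rates can be estimated from the pooled data while prevalence is estimated separately for each population. All names and counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Hypothetical 2x2 table for one population at a fixed test threshold."""
    tp: int  # diseased, test positive
    fn: int  # diseased, test negative
    fp: int  # not diseased, test positive
    tn: int  # not diseased, test negative

    @property
    def diseased(self):
        return self.tp + self.fn

    @property
    def healthy(self):
        return self.fp + self.tn

    @property
    def prevalence(self):
        return self.diseased / (self.diseased + self.healthy)


def ppv_from_rates(tpr, fpr, prevalence):
    """PPV from true/false positive rates and prevalence (Bayes' rule)."""
    true_pos = tpr * prevalence
    false_pos = fpr * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def pooled_ppv(populations):
    """Estimate TPR and FPR from all populations combined (assumed shared),
    then combine them with each population's own prevalence."""
    tpr = (sum(c.tp for c in populations.values())
           / sum(c.diseased for c in populations.values()))
    fpr = (sum(c.fp for c in populations.values())
           / sum(c.healthy for c in populations.values()))
    return {name: ppv_from_rates(tpr, fpr, c.prevalence)
            for name, c in populations.items()}


# Hypothetical counts, invented for illustration only.
populations = {
    "population_a": Counts(tp=30, fn=20, fp=40, tn=110),
    "population_b": Counts(tp=56, fn=24, fp=30, tn=90),
}

estimates = pooled_ppv(populations)
for name, value in estimates.items():
    print(f"{name}: pooled-rate PPV estimate = {value:.3f}")
```

Because the shared rates are estimated from every subject rather than from one population at a time, each PPV estimate draws on more data than a population-by-population calculation would, which is the intuition behind the efficiency gain described above.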

Huang Y, Fong Y, Wei J, Feng Z. Borrowing Information across Populations in Estimating Positive and Negative Predictive Values. J R Stat Soc Ser C Appl Stat. 2011 Nov 1;60(5):633-653.