False Discovery Rate Correction#
This document describes the multiple testing correction methods used in WASP2 to control false positive rates when testing many genomic regions.
The Multiple Testing Problem#
When testing thousands of genomic regions for allelic imbalance, even a small per-test false positive rate leads to many false discoveries:
For \(m = 10{,}000\) tests at \(\alpha = 0.05\), we expect 500 false positives by chance alone.
Example: Testing 20,000 genes for ASE
At α = 0.05: ~1,000 expected false positives
At α = 0.01: ~200 expected false positives
Even stringent thresholds yield many false discoveries
Error Rate Definitions#
Family-Wise Error Rate (FWER)#
The probability of making any false discovery:
where \(V\) is the number of false positives.
FWER control (e.g., Bonferroni) is very conservative for genomic studies.
False Discovery Rate (FDR)#
The expected proportion of false discoveries among rejections:
where \(V\) is false positives and \(R\) is total rejections.
FDR is more appropriate for discovery-oriented genomic studies where:
Some false positives are acceptable
The goal is to prioritize candidates for follow-up
The number of tests is very large
Benjamini-Hochberg Procedure#
WASP2 uses the Benjamini-Hochberg (BH) procedure [Benjamini1995] for FDR control.
Algorithm#
Given \(m\) p-values \(p_1, p_2, \ldots, p_m\):
Sort p-values: \(p_{(1)} \leq p_{(2)} \leq \cdots \leq p_{(m)}\)
Find the largest \(k\) such that \(p_{(k)} \leq \frac{k}{m} \cdot q\)
Reject all hypotheses with \(p_{(i)} \leq p_{(k)}\)
where \(q\) is the target FDR level (typically 0.05 or 0.1).
Adjusted P-Values (q-values):
WASP2 reports BH-adjusted p-values:
These can be interpreted as the minimum FDR at which the hypothesis would be rejected.
Implementation#
from scipy.stats import false_discovery_control
# WASP2 uses the BH method
fdr_pvals = false_discovery_control(pvals, method='bh')
# Equivalent manual implementation
def benjamini_hochberg(pvals):
n = len(pvals)
ranked = np.argsort(pvals)
adjusted = np.empty(n)
cummin = 1.0
for i in range(n - 1, -1, -1):
idx = ranked[i]
adjusted[idx] = min(cummin, pvals[idx] * n / (i + 1))
cummin = adjusted[idx]
return adjusted
Properties#
Advantages:
Controls FDR at level \(q\) under independence
More powerful than FWER methods (fewer false negatives)
Simple to compute and interpret
Works well with continuous p-values
Assumptions:
P-values under the null are uniformly distributed
Independence or positive regression dependence (PRDS)
The chi-squared p-values from WASP2’s likelihood ratio test satisfy these assumptions when regions are independent.
Alternative Methods#
While WASP2 uses BH by default, researchers may consider alternatives for specific scenarios.
Benjamini-Yekutieli (BY)#
For arbitrary dependence between tests:
where \(c(m) = \sum_{i=1}^{m} 1/i \approx \ln(m) + 0.577\).
BY is more conservative but valid under any dependence structure.
# Available in scipy
fdr_pvals = false_discovery_control(pvals, method='by')
Storey’s q-value#
Estimates the proportion of true nulls (\(\pi_0\)) for improved power:
The q-value procedure is more powerful when many tests are true discoveries.
# Requires qvalue package
# pip install qvalue
from qvalue import qvalue
q = qvalue(pvals)
Alternative Correction Methods#
Discrete FDR and Mid-P Adjustments
For exact tests with discrete p-values (Fisher’s exact, exact binomial), specialized methods like Gilbert’s procedure [Gilbert2005] or mid-p adjustments can reduce conservativeness.
However, WASP2’s likelihood ratio test produces continuous p-values from the chi-squared distribution, so standard BH is appropriate and these discrete methods are not needed.
Practical Guidelines#
Choosing an FDR Threshold#
FDR |
Use Case |
Interpretation |
|---|---|---|
0.01 |
High-confidence discoveries |
~1% false among significant |
0.05 |
Standard exploratory analysis |
~5% false among significant |
0.10 |
Liberal discovery |
~10% false, maximize sensitivity |
0.20 |
Hypothesis generation |
For follow-up validation |
When to Use Stricter Control#
Consider stricter FDR or FWER control when:
Results will directly inform clinical decisions
Follow-up validation is expensive
The number of true positives is expected to be small
Independence assumptions may be violated
Reporting Results#
When reporting FDR-corrected results:
Report the method: “FDR correction was performed using the Benjamini-Hochberg procedure”
Report the threshold: “Significance was declared at FDR < 0.05”
Report both p-values: Include raw and adjusted p-values in supplements
Report the number of tests: “Among 15,234 tested regions…”
Output Format#
WASP2 reports both raw and adjusted p-values:
Column |
Type |
Description |
|---|---|---|
pval |
float |
Raw p-value from likelihood ratio test |
fdr_pval |
float |
BH-adjusted p-value (q-value) |
Interpretation:
pval < 0.05: Nominally significant (not corrected)fdr_pval < 0.05: Significant after multiple testing correction
References#
Benjamini Y, Hochberg Y (1995). Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society B 57:289-300.
Storey JD, Tibshirani R (2003). Statistical significance for genomewide studies. Proceedings of the National Academy of Sciences 100:9440-9445.
Gilbert PB (2005). A modified false discovery rate multiple-comparisons procedure for discrete data, applied to human immunodeficiency virus genetics. Journal of the Royal Statistical Society C 54:143-158.