Controlling the False Discovery Rate (FDR)

A summary of how you can control the False Discovery Rate: FDR=E(V/R|R>0)P(R>0) after Benjamini and Hochberg (1995) when multiple hypothesis testing is performed, e.g. a statistical test is applied on every grid box of a model grid. The procedure can be applied to every collection of tests that are performed to local grid points, e.g. correlation, t-test, and so on, as long as p-values are provided.

An overview of the problem and practical solutions on how to implement the FDR procedure into your software (R, Matlab, python) can be found in this presentation (Oliver Gutjahr - Ocean group meeting on 13 Nov 2019).

xarray implementation (Aaron Spring):

def xr_multipletest(p, alpha=0.05, method='fdr_bh', **multipletests_kwargs):
    """Apply statsmodels.stats.multitest.multipletests for multi-dimensional xr.objects."""
    from statsmodels.stats.multitest import multipletests
    # stack all to 1d array
    p_stacked = p.stack(s=p.dims)
    # mask only where not nan: https://github.com/statsmodels/statsmodels/issues/2899
    mask = np.isfinite(p_stacked)
    pvals_corrected = np.full(p_stacked.shape, np.nan)
    reject = np.full(p_stacked.shape, np.nan)
    # apply test where mask
    reject[mask] = multipletests(
        p_stacked[mask], alpha=alpha, method=method, **multipletests_kwargs)[0]
    pvals_corrected[mask] = multipletests(
        p_stacked[mask], alpha=alpha, method=method, **multipletests_kwargs)[1]
 
    def unstack(reject, p_stacked):
        """Exchange values from p_stacked with reject (1d array) and unstack."""
        xreject = p_stacked.copy()
        xreject.values = reject
        xreject = xreject.unstack()
        return xreject
 
    reject = unstack(reject, p_stacked)
    pvals_corrected = unstack(pvals_corrected, p_stacked)
    return reject, pvals_corrected
 
reject, xpvals_corrected = xr_multipletest(p)