Heterogeneity#

With experiment aggregation, prob_conf_mat tries to estimate the distribution of the average experiment in an ExperimentGroup. Implicitly, this assumes that all experiments come from the same distribution, and that all between-experiment variance can be explained by random noise.

This is not always the case, for example when the experiments represent the same model tested on different benchmarks (or different models tested on the same benchmark). In these cases, inter-experiment heterogeneity can exist.

Heterogeneity can lead to large between-experiment variance, which in turn can make estimating an aggregate difficult. The methods in this module try to estimate the degree of heterogeneity present, so that users are better informed about the quality of the experiment aggregation.

See the guide on experiment aggregation for more details.

`HeterogeneityResult` `dataclass` #

Container for the output of a heterogeneity computation.

Functions#

`template_sentence` #

Fills a template string with some standard summary statistics.

`heterogeneity_dl` #

Compute the DerSimonian-Laird estimate of between-experiment heterogeneity.

Parameters:

means (Float[ndarray, 'num_experiments']) – the experiment means

variances (Float[ndarray, 'num_experiments']) – the experiment variances

Returns:

float – estimate of the between-experiment heterogeneity
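For orientation, the textbook DerSimonian-Laird method-of-moments estimator can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's source: the function name `dersimonian_laird_tau2` is hypothetical, and the actual `heterogeneity_dl` may differ in input validation and edge-case handling.

```python
import numpy as np


def dersimonian_laird_tau2(means, variances):
    """Textbook DerSimonian-Laird estimate of between-experiment variance.

    Hypothetical helper for illustration only; not part of prob_conf_mat.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    k = means.size

    # Fixed-effect (inverse-variance) weights and pooled mean
    w = 1.0 / variances
    pooled_mean = np.sum(w * means) / np.sum(w)

    # Cochran's Q: weighted squared deviations from the pooled mean
    q = np.sum(w * (means - pooled_mean) ** 2)

    # Scaling constant from the method-of-moments derivation
    c = np.sum(w) - np.sum(w**2) / np.sum(w)

    # Q has expectation (k - 1) under homogeneity; truncate at zero
    return float(max(0.0, (q - (k - 1)) / c))


# Three experiments with equal within-experiment variance
tau2 = dersimonian_laird_tau2([0.70, 0.80, 0.90], [0.001, 0.001, 0.001])
print(tau2)  # approximately 0.009
```

The estimate is truncated at zero because the moment equation can yield a negative value when the observed spread is smaller than the within-experiment variances alone would predict.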

`heterogeneity_pm` #

Compute the Paule-Mandel estimate of between-experiment heterogeneity.

Based on the `_fit_tau_iterative` function from statsmodels.

The original implementation is based on Appendix A of [1].

We make two modifications:

1. Instead of stopping the iteration if F(tau_2) < 0, we back off to the midpoint between the current and previous estimates.
2. Optionally, we apply the Viechtbauer correction to the root: instead of converging to the mean, the iteration converges to the median.
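The iteration with the midpoint back-off can be sketched roughly as below. This is an illustrative reconstruction, not the library's code: the name `paule_mandel_tau2`, its `atol` and `max_iter` parameters, and the Newton-style update are assumptions, and the optional Viechtbauer correction is left out. Here F(tau2) is taken to be the generalised Q statistic minus its expected value, k - 1.

```python
import numpy as np


def paule_mandel_tau2(means, variances, atol=1e-8, max_iter=100):
    """Sketch of a Paule-Mandel iteration with a midpoint back-off.

    Hypothetical helper for illustration only; not part of prob_conf_mat.
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    target = means.size - 1  # expected value of Q under the model

    def f_and_slope(tau2):
        # Random-effects weights and weighted mean at the current tau2
        w = 1.0 / (variances + tau2)
        mu = np.sum(w * means) / np.sum(w)
        resid2 = (means - mu) ** 2
        f = np.sum(w * resid2) - target
        slope = np.sum(w**2 * resid2)  # equals -dF/dtau2
        return f, slope

    tau2 = 0.0
    f, slope = f_and_slope(tau2)
    if f <= 0:
        # No excess variance even at tau2 = 0
        return 0.0

    for _ in range(max_iter):
        if abs(f) < atol:
            break
        # Newton-style step towards the root of F
        candidate = tau2 + f / slope
        f_new, slope_new = f_and_slope(candidate)
        # Overshoot (F < 0): back off to the midpoint between the
        # previous estimate and the candidate instead of stopping
        while f_new < -atol:
            candidate = 0.5 * (tau2 + candidate)
            f_new, slope_new = f_and_slope(candidate)
        tau2, f, slope = candidate, f_new, slope_new

    return float(tau2)


tau2 = paule_mandel_tau2([0.70, 0.80, 0.90], [0.001, 0.001, 0.001])
print(tau2)  # approximately 0.009
```

With equal within-experiment variances, as in this example, the root can be found by hand (0.02 / (0.001 + tau2) = 2), which makes the example convenient for checking an implementation.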

`estimate_i2` #

Estimates a generalised I^2 metric, as suggested by Bowden et al. [1].

It measures the proportion of the total variance that is attributable to between-experiment variance rather than within-experiment variance. The between-experiment variance is estimated using a Paule-Mandel tau2 estimator.
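As a rough illustration of such a ratio, the sketch below combines a given tau2 with the "typical" within-study variance of Higgins and Thompson. That this matches the library's exact formula is an assumption; the function names are hypothetical, and tau2 is passed in directly rather than estimated.

```python
import numpy as np


def typical_within_variance(variances):
    """Higgins-Thompson 'typical' within-experiment variance.

    Hypothetical helper for illustration only; not part of prob_conf_mat.
    """
    w = 1.0 / np.asarray(variances, dtype=float)
    k = w.size
    return float((k - 1) * np.sum(w) / (np.sum(w) ** 2 - np.sum(w**2)))


def i2_from_tau2(tau2, variances):
    """Share of total variance attributed to between-experiment variance."""
    s2 = typical_within_variance(variances)
    return tau2 / (tau2 + s2)


# With equal within-experiment variances, the typical variance equals
# that common value, so I^2 reduces to tau2 / (tau2 + variance)
variances = [0.001, 0.001, 0.001]
print(i2_from_tau2(0.009, variances))  # approximately 0.9
```

A value near 0 suggests that the spread between experiments is mostly explained by within-experiment noise, while a value near 1 suggests that most of it reflects genuine differences between experiments.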

`interpret_i2` #

Interprets I^2 values using prescribed guidelines [1].
