Heterogeneity
With experiment aggregation, prob_conf_mat tries to estimate the distribution of the average experiment in an ExperimentGroup. Implicitly, this assumes that all experiments come from the same distribution, and that all between-experiment variance can be explained by random noise.
This is not always the case. For example, the experiments might represent the same model tested on different benchmarks (or different models tested on the same benchmark). In these cases, inter-experiment heterogeneity can exist.
Heterogeneity can lead to large inter-experiment (between-experiment) variance, which in turn can make estimating an aggregate difficult. The methods in this module try to estimate the degree of heterogeneity present, so that users are better informed about the quality of the experiment aggregation.
See the guide on experiment aggregation for more details.
HeterogeneityResult
dataclass
heterogeneity_dl
Compute the DerSimonian-Laird estimate of between-experiment heterogeneity.
Parameters:
-
means
(Float[ndarray, 'num_experiments']
) –the experiment means
-
variances
(Float[ndarray, 'num_experiments']
) –the experiment variances
Returns:
-
float
(float
) –estimate of the between-experiment heterogeneity
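As an illustration of what a DerSimonian-Laird estimate computes, here is a minimal sketch of the standard textbook formula in plain Python. It is not the prob_conf_mat implementation; the function name `dersimonian_laird_tau2` is hypothetical, and the inputs are assumed to be plain sequences of floats with strictly positive variances.

```python
def dersimonian_laird_tau2(means, variances):
    """Textbook DerSimonian-Laird estimate of between-experiment variance.

    Hypothetical helper for illustration, not the library's function.
    """
    k = len(means)
    if k < 2:
        # With a single experiment there is no between-experiment variance to estimate
        return 0.0

    # Inverse-variance (fixed-effect) weights
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)

    # Weighted mean of the experiment means
    pooled = sum(w * m for w, m in zip(weights, means)) / sum_w

    # Cochran's Q: weighted squared deviations from the pooled mean
    q = sum(w * (m - pooled) ** 2 for w, m in zip(weights, means))

    # Method-of-moments estimate, truncated at zero
    scale = sum_w - sum(w * w for w in weights) / sum_w
    return max(0.0, (q - (k - 1)) / scale)


tau2 = dersimonian_laird_tau2([0.70, 0.80, 0.90], [0.001, 0.001, 0.001])
```

The estimate is truncated at zero because Q can fall below its expected value of k - 1 when the experiments agree more closely than their within-experiment variances would suggest.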
heterogeneity_pm
Compute the Paule-Mandel estimate of between-experiment heterogeneity.
Based on the _fit_tau_iterative function from statsmodels.
The original implementation is based on Appendix A of [1].
We make two modifications:
- instead of stopping iteration if F(tau_2) < 0, we back off to the midpoint between the current and previous estimates
- optionally, we apply the Viechtbauer correction to the root: instead of converging to the mean, the iteration converges to the median
Read More:
- DerSimonian, R., & Kacker, R. (2007). Random-effects model for meta-analysis of clinical trials: an update. Contemporary clinical trials, 28(2), 105-114.
Parameters:
-
means
(Float[ndarray, 'num_experiments']
) –the experiment means
-
variances
(Float[ndarray, 'num_experiments']
) –the experiment variances
-
init_tau2
(float
, default:0.0
) –the initial tau2 estimate. Defaults to 0.0.
-
atol
(float
, default:1e-05
) –when to assume convergence. Defaults to 1e-5.
-
maxiter
(int
, default:100
) –the maximum number of iterations allowed. Defaults to 100.
-
use_viechtbauer_correction
(bool
, default:False
) –whether to use the Viechtbauer correction. Very new. Defaults to False.
Returns:
-
float
(float
) –estimate of the between-experiment heterogeneity
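The iteration can be sketched as follows. This is a simplified, hypothetical version, not the library's code: it takes plain Newton-style steps as in DerSimonian & Kacker's Appendix A, clamps tau2 at zero, and does not implement the midpoint back-off described above. The `target` argument is an assumption made here for illustration: by default the generalised Q statistic is driven towards k - 1 (the mean of a chi-square distribution with k - 1 degrees of freedom), and passing that distribution's median instead is one reading of the Viechtbauer correction. The stopping rule on |F| is also an assumption.

```python
def paule_mandel_tau2(means, variances, init_tau2=0.0, atol=1e-5, maxiter=100, target=None):
    """Simplified Paule-Mandel iteration (hypothetical sketch, not the library's code)."""
    k = len(means)
    if k < 2:
        return 0.0
    if target is None:
        # Mean of a chi-square with k - 1 degrees of freedom.
        # A median-based target could be obtained with e.g. scipy.stats.chi2.median(k - 1).
        target = k - 1

    tau2 = max(0.0, init_tau2)
    for _ in range(maxiter):
        # Random-effects weights for the current tau2
        weights = [1.0 / (v + tau2) for v in variances]
        sum_w = sum(weights)
        pooled = sum(w * m for w, m in zip(weights, means)) / sum_w

        # Generalised Q statistic and its distance from the target
        q = sum(w * (m - pooled) ** 2 for w, m in zip(weights, means))
        f = q - target

        if abs(f) < atol:
            break  # Q is close enough to the target
        if f < 0 and tau2 == 0.0:
            break  # Q is below the target even at tau2 = 0, so there is no positive root

        # Newton-style update: Q decreases as tau2 grows
        slope = sum(w * w * (m - pooled) ** 2 for w, m in zip(weights, means))
        if slope <= 0.0:
            return 0.0  # all means coincide, no heterogeneity to estimate
        tau2 = max(0.0, tau2 + f / slope)

    return tau2


tau2 = paule_mandel_tau2([0.70, 0.80, 0.90], [0.001, 0.001, 0.001])
```

When all experiments have the same variance, the Paule-Mandel and DerSimonian-Laird estimates coincide, which gives a convenient sanity check for a sketch like this.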
estimate_i2
Estimates a generalised I^2 metric, as suggested by Bowden et al. [1].
It measures the amount of variance attributable to within-experiment variance vs. between-experiment variance. The between-experiment variance is estimated using a Paule-Mandel tau2 estimator.
Read more:
- Bowden, J., Tierney, J. F., Copas, A. J., & Burdett, S. (2011). Quantifying, displaying and accounting for heterogeneity in the meta-analysis of RCTs using standard and generalised Q statistics. BMC Medical Research Methodology, 11(1), 1-12.
Parameters:
-
individual_samples
(Float[ndarray, 'num_samples num_experiments']
) –the samples from individual experiments
Returns:
-
float
(HeterogeneityResult
) –the I^2 estimate
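One common form of the I^2 ratio can be sketched as below. This is an assumption about the general shape of the computation, not the library's implementation: the helper `generalised_i2` is hypothetical, it takes per-experiment variances and an already-computed tau2 rather than raw samples, and it uses the Higgins-Thompson "typical" within-experiment variance as the denominator term.

```python
def generalised_i2(variances, tau2):
    """I^2 as the share of total variance due to between-experiment variance.

    Hypothetical helper: `variances` are within-experiment variances and
    `tau2` is a between-experiment variance estimate (e.g. Paule-Mandel).
    """
    k = len(variances)
    if k < 2:
        raise ValueError("need at least two experiments")

    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    sum_w2 = sum(w * w for w in weights)

    # "Typical" within-experiment variance (Higgins & Thompson)
    typical_var = (k - 1) * sum_w / (sum_w**2 - sum_w2)

    return tau2 / (tau2 + typical_var)


i2 = generalised_i2([0.001, 0.001, 0.001], tau2=0.009)
```

With equal within-experiment variances the typical variance reduces to that shared value, so the example gives a ratio of about 0.9: roughly 90% of the total variance would be attributed to differences between experiments.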
interpret_i2
Interprets I^2 values using prescribed guidelines [1].
Read More:
- Higgins, J. P., & Green, S. (Eds.). (2008). Cochrane handbook for systematic reviews of interventions.
Parameters:
-
i2_score
(float
) –the I2 estimate
Returns:
-
str
(Literal['insignificant heterogeneity', 'borderline moderate heterogeneity', 'moderate heterogeneity', 'borderline substantial heterogeneity', 'borderline considerable heterogeneity', 'considerable heterogeneity', 'unknown']
) –a rough interpretation of the magnitude of I2
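For context, the Cochrane handbook gives overlapping rough ranges for I^2, which may be why some of the labels above are "borderline". The sketch below only lists which of those ranges contain a given value. It does not reproduce the thresholds or labels used by interpret_i2, which are not documented here; the helper `cochrane_bands`, its short labels, and the assumption that I^2 is given as a fraction in [0, 1] are all illustrative.

```python
# Rough, overlapping ranges from the Cochrane handbook, as fractions.
# The short labels are illustrative and are not the strings returned by interpret_i2.
COCHRANE_BANDS = [
    (0.00, 0.40, "might not be important"),
    (0.30, 0.60, "moderate"),
    (0.50, 0.90, "substantial"),
    (0.75, 1.00, "considerable"),
]


def cochrane_bands(i2):
    """Return the label of every Cochrane range that contains i2 (hypothetical helper)."""
    return [label for low, high, label in COCHRANE_BANDS if low <= i2 <= high]


bands = cochrane_bands(0.35)  # falls in two overlapping ranges
```

A value that lands in two ranges at once is ambiguous under these guidelines, so any single-label interpretation of it should be read as approximate.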