Experiment Aggregators#

`SingletonAggregator` #

Bases: ExperimentAggregator

An aggregation to apply to an ExperimentGroup that needs no aggregation.

This is the case when, for example, the ExperimentGroup contains only one Experiment.

Essentially just the identity function:

\[f(x)=x\]

`BetaAggregator` #

Bases: ExperimentAggregator

Samples from the beta-conflated distribution.

Specifically, the aggregate distribution \(\text{Beta}(\tilde{\alpha}, \tilde{\beta})\) is estimated as:

\[\begin{aligned} \tilde{\alpha}&=\left[\sum_{i=1}^{M}\alpha_{i}\right]-\left(M-1\right) \\ \tilde{\beta}&=\left[\sum_{i=1}^{M}\beta_{i}\right]-\left(M-1\right) \end{aligned}\]

where \(M\) is the total number of experiments.
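The parameter update above can be sketched in a few lines of plain Python. This is only an illustration of the formula, not the library's implementation; the function name `conflate_beta` and the example parameters are hypothetical.

```python
def conflate_beta(params):
    """Combine M Beta(alpha_i, beta_i) distributions using the formula above.

    `params` is a list of (alpha_i, beta_i) tuples, one per experiment.
    Hypothetical helper, for illustration only.
    """
    m = len(params)
    if m == 0:
        raise ValueError("at least one experiment is required")
    alpha_tilde = sum(a for a, _ in params) - (m - 1)
    beta_tilde = sum(b for _, b in params) - (m - 1)
    return alpha_tilde, beta_tilde


# Three made-up experiments
example_params = [(3.0, 5.0), (4.0, 2.0), (2.5, 3.5)]
alpha_tilde, beta_tilde = conflate_beta(example_params)
print(alpha_tilde, beta_tilde)
```

With a single experiment (\(M=1\)) the correction term vanishes and the parameters are returned unchanged.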

Uses the scipy.stats.beta class to fit the individual beta distributions.

This aggregator assumes that:

the individual experiment distributions are beta distributed
the metrics are bounded, although the range need not be (0, 1)

Parameters:

estimation_method (str, default: 'mle' ) –

method for estimating the parameters of the individual experiment distributions. Options are 'mle' for maximum-likelihood estimation, or 'mome' for the method-of-moments estimator. MLE tends to be more efficient, but is harder to compute
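For intuition about the 'mome' option, the textbook method-of-moments estimator for a beta distribution on (0, 1) matches the sample mean and variance to the distribution's moments. The sketch below uses that textbook form with the unbiased sample variance. It is an assumption that the library's estimator behaves similarly, and `beta_method_of_moments` is a hypothetical name.

```python
from statistics import fmean, variance


def beta_method_of_moments(samples):
    """Textbook moment-matching estimate of Beta(alpha, beta) on (0, 1).

    Hypothetical helper; not the library's own estimator.
    """
    mean = fmean(samples)
    var = variance(samples)  # unbiased sample variance (n - 1 denominator)
    if not 0.0 < mean < 1.0:
        raise ValueError("the sample mean must lie strictly inside (0, 1)")
    common = mean * (1.0 - mean) / var - 1.0
    if common <= 0.0:
        raise ValueError("the sample variance is too large for a beta fit")
    return mean * common, (1.0 - mean) * common


# Made-up metric values from one experiment
alpha_hat, beta_hat = beta_method_of_moments([0.2, 0.4, 0.6, 0.8])
print(alpha_hat, beta_hat)
```

Unlike maximum likelihood, this estimate is available in closed form and needs no iterative optimisation.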

`GammaAggregator` #

Bases: ExperimentAggregator

Samples from the Gamma-conflated distribution.

Specifically, the aggregate distribution \(\text{Gamma}(\tilde{\alpha}, \tilde{\beta})\) (where \(\alpha\) is the shape and \(\beta\) the rate parameter) is estimated as:

\[\begin{aligned} \tilde{\alpha}&=\left[\sum_{i=1}^{M}\alpha_{i}\right]-\left(M-1\right) \\ \tilde{\beta}&=\dfrac{1}{\sum_{i=1}^{M}\beta_{i}^{-1}} \end{aligned}\]

where \(M\) is the total number of experiments.
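As a rough illustration, the two formulas above translate directly into plain Python. The helper name `conflate_gamma` and the example values are hypothetical, and the sketch simply follows the equations as stated here rather than the library's code.

```python
def conflate_gamma(params):
    """Combine M Gamma(alpha_i, beta_i) distributions using the formulas above.

    `params` is a list of (alpha_i, beta_i) tuples, one per experiment.
    Hypothetical helper, for illustration only.
    """
    m = len(params)
    if m == 0:
        raise ValueError("at least one experiment is required")
    alpha_tilde = sum(a for a, _ in params) - (m - 1)
    beta_tilde = 1.0 / sum(1.0 / b for _, b in params)
    return alpha_tilde, beta_tilde


# Three made-up experiments
gamma_alpha, gamma_beta = conflate_gamma([(2.0, 1.0), (3.0, 2.0), (4.0, 2.0)])
print(gamma_alpha, gamma_beta)
```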

The optional shifted: bool argument dynamically estimates the support of the distribution. This can improve the fit to individual experiments, but is likely to have minimal impact on the aggregate distribution.

This aggregator assumes that:

the individual experiment distributions are gamma distributed

`FEGaussianAggregator` #

Bases: ExperimentAggregator

Samples from the Gaussian-conflated distribution.

This is equivalent to the fixed-effects meta-analytical estimator.

Uses the inverse-variance weighted mean and standard errors. Specifically, the aggregate distribution \(\mathcal{N}(\tilde{\mu}, \tilde{\sigma})\) is estimated as:

\[\begin{aligned} w_{i}&=\dfrac{\sigma_{i}^{-2}}{\sum_{j=1}^{M}\sigma_{j}^{-2}} \\ \tilde{\mu}&=\sum_{i=1}^{M}w_{i}\mu_{i} \\ \tilde{\sigma}^2&=\dfrac{1}{\sum_{i=1}^{M}\sigma_{i}^{-2}} \end{aligned}\]

where \(M\) is the total number of experiments.
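A minimal sketch of the inverse-variance weighting above, assuming each experiment is summarised by a mean and a standard error. The function name `fixed_effects_aggregate` and the numbers are hypothetical.

```python
def fixed_effects_aggregate(means, std_errors):
    """Inverse-variance weighted mean and variance, following the formulas above.

    Hypothetical helper, for illustration only.
    """
    precisions = [1.0 / se**2 for se in std_errors]
    total_precision = sum(precisions)
    weights = [p / total_precision for p in precisions]
    mu_tilde = sum(w * mu for w, mu in zip(weights, means))
    var_tilde = 1.0 / total_precision
    return mu_tilde, var_tilde, weights


# Two made-up experiments: the more precise one dominates the aggregate
mu_tilde, var_tilde, weights = fixed_effects_aggregate([0.8, 0.6], [0.1, 0.2])
print(mu_tilde, var_tilde, weights)
```

The aggregate variance is never larger than the smallest individual variance, which is why this estimator can be overconfident when the experiments disagree.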

This aggregator assumes that:

the individual experiment distributions are normally (Gaussian) distributed
there is no inter-experiment heterogeneity present

`REGaussianAggregator` #

Bases: ExperimentAggregator

Samples from the Random Effects Meta-Analytical Estimator.

First uses the standard inverse-variance weighted mean and standard errors as model parameters, before debiasing the weights to incorporate inter-experiment heterogeneity. As a result, studies with larger standard errors will be upweighted relative to the fixed-effects model.

Specifically, starting with a fixed-effects model \(\mathcal{N}(\tilde{\mu}_{\text{FE}}, \tilde{\sigma}_{\text{FE}})\), the aggregate distribution is estimated as:

\[\begin{aligned} w_{i}&=\dfrac{\left(\sigma_{i}^2+\tau^2\right)^{-1}}{\sum_{j=1}^{M}\left(\sigma_{j}^2+\tau^2\right)^{-1}} \\ \tilde{\mu}&=\sum_{i=1}^{M}w_{i}\mu_{i} \\ \tilde{\sigma}^2&=\dfrac{1}{\sum_{i=1}^{M}\sigma_{i}^{-2}} \end{aligned}\]

where \(\tau\) is the estimated inter-experiment heterogeneity, and \(M\) is the total number of experiments.

Uses the Paule-Mandel iterative heterogeneity estimator, which does not make a parametric assumption. The more common (but biased) DerSimonian-Laird estimator can also be used by setting paule_mandel_heterogeneity: bool = False.
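To make the reweighting concrete, the sketch below computes \(\tau^2\) with the closed-form DerSimonian-Laird estimator in its textbook form, then derives the random-effects weights from it. It does not implement the iterative Paule-Mandel estimator that is the default here, and the helper names and data are hypothetical.

```python
def dersimonian_laird_tau2(means, variances):
    """Textbook DerSimonian-Laird estimate of the heterogeneity tau^2."""
    m = len(means)
    if m < 2:
        return 0.0  # heterogeneity cannot be estimated from one experiment
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    mu_fe = sum(wi * mu for wi, mu in zip(w, means)) / sum_w
    # Cochran's Q: weighted squared deviations from the fixed-effects mean
    q = sum(wi * (mu - mu_fe) ** 2 for wi, mu in zip(w, means))
    c = sum_w - sum(wi**2 for wi in w) / sum_w
    return max(0.0, (q - (m - 1)) / c)


def random_effects_weights(variances, tau2):
    """Normalised weights proportional to 1 / (sigma_i^2 + tau^2)."""
    inverse = [1.0 / (v + tau2) for v in variances]
    total = sum(inverse)
    return [x / total for x in inverse]


# Three made-up experiments; the second is the least precise
means = [0.5, 0.9, 0.7]
variances = [0.01, 0.04, 0.01]

tau2 = dersimonian_laird_tau2(means, variances)
fe_weights = random_effects_weights(variances, 0.0)  # tau^2 = 0 gives the fixed-effects weights
re_weights = random_effects_weights(variances, tau2)
re_mean = sum(w * mu for w, mu in zip(re_weights, means))
print(tau2, fe_weights, re_weights, re_mean)
```

In this example the least precise experiment receives a larger weight under the random-effects model than under the fixed-effects model, matching the upweighting described above.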

If hksj_sampling_distribution: bool = True, the aggregated distribution is a more conservative \(t\)-distribution, with degrees of freedom equal to \(M-1\). The effect is strongest when only a few experiments are available, where it can substantially increase the aggregated distribution's variance.

This aggregator assumes that:

the individual experiment distributions are normally (Gaussian) distributed
there is inter-experiment heterogeneity present

Parameters:

paule_mandel_heterogeneity (bool, default: True ) –

whether to use the Paule-Mandel method for estimating inter-experiment heterogeneity, or fall back to the DerSimonian-Laird estimator. Defaults to True.
hksj_sampling_distribution (bool, default: False ) –

whether to use the Hartung-Knapp-Sidik-Jonkman corrected \(t\)-distribution as the aggregate sampling distribution. Defaults to False.

`HistogramAggregator` #

Bases: ExperimentAggregator

Samples from a histogram approximation of the conflated distribution.

First bins all individual experiment groups, and then computes the product of the probability masses across individual experiments.

Unlike other methods, this does not make a parametric assumption. However, the resulting distribution can 'look' unnatural, and requires overlapping supports within the sample. If any experiment assigns 0 probability mass to any bin, the conflated bin will also contain 0 probability mass.

As such, inter-experiment heterogeneity can be a significant problem.

Uses numpy.histogram_bin_edges to estimate the number of bin edges needed per experiment, and takes the smallest across all experiments for the aggregate distribution.
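The bin-wise product described above can be sketched with the standard library alone. For simplicity this sketch assumes the bin edges are already shared by all experiments, whereas the aggregator estimates them with numpy.histogram_bin_edges. The helper names and sample values are hypothetical.

```python
from bisect import bisect_right


def bin_masses(samples, edges):
    """Probability mass per bin; the last bin includes its right edge."""
    counts = [0] * (len(edges) - 1)
    for x in samples:
        if x < edges[0] or x > edges[-1]:
            continue  # ignore samples outside the shared support
        idx = min(bisect_right(edges, x) - 1, len(counts) - 1)
        counts[idx] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples fall inside the bin edges")
    return [c / total for c in counts]


def conflate_histograms(masses_per_experiment):
    """Multiply the masses bin by bin, then renormalise to sum to 1."""
    n_bins = len(masses_per_experiment[0])
    product = [1.0] * n_bins
    for masses in masses_per_experiment:
        product = [p * m for p, m in zip(product, masses)]
    total = sum(product)
    if total == 0.0:
        raise ValueError("the experiments share no bin with non-zero mass")
    return [p / total for p in product]


# Two made-up experiments on shared edges; only the third bin overlaps
edges = [0.0, 0.25, 0.5, 0.75, 1.0]
masses_a = bin_masses([0.3, 0.4, 0.6, 0.6], edges)
masses_b = bin_masses([0.55, 0.6, 0.7, 0.8], edges)
conflated = conflate_histograms([masses_a, masses_b])
print(conflated)
```

Every bin that is empty in either experiment ends up with zero mass in the product, which illustrates why non-overlapping supports are a problem for this method.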

This aggregator assumes that:

the individual experiment distributions' supports overlap
