API

Summary statistics

PosteriorStats.SummaryStats — Type

struct SummaryStats

A container for a column table of values computed by summarize.

This object implements the Tables and TableTraits column table interfaces. It has a custom show method.

Note

SummaryStats behaves like an OrderedDict of columns, where the columns can be accessed using either Symbols or a 1-based integer index. However, this interface is not part of the public API and may change in the future. We recommend using it only interactively.

Constructor

SummaryStats(data; name="SummaryStats"[, labels])

Construct a SummaryStats from tabular data.

data must implement the Tables interface. If it contains a column named label, that column is used for the row labels, unless labels is provided, in which case labels replaces it.

Keywords

name::AbstractString: The name of the collection of summary statistics, used as the table title in display.
labels::AbstractVector: The names of the parameters in data, used as row labels in display. If not provided, then the column label in data will be used if it exists. Otherwise, the parameter names will be numeric indices.


PosteriorStats.summarize — Function

summarize(data; kind=:all, kwargs...) -> SummaryStats
summarize(data, stats_funs...; kwargs...) -> SummaryStats

Compute summary statistics on each parameter in data.

Arguments

data: a 3D array of real samples with shape (draws, chains, params) or another object for which a summarize method is defined.
stats_funs: a collection of functions, each of which reduces a matrix with shape (draws, chains) to a scalar or a collection of scalars. Alternatively, an item in stats_funs may be a Pair of the form name => fun, specifying the name to be used for the statistic, or of the form (name1, ...) => fun when the function returns a collection. When the function returns a collection, the names must be provided in this latter format.

Keywords

var_names: a collection specifying the names of the parameters in data. If not provided, the names are the indices of the parameter dimension in data.
name::String: the name of the summary statistics, used as the table title in display.
kind::Symbol: The named collection of summary statistics to be computed:
- :all: Everything in :stats and :diagnostics
- :stats: mean, std, <ci>
- :diagnostics: ess_tail, ess_bulk, rhat, mcse_mean, mcse_std
- :all_median: Everything in :stats_median and :diagnostics_median
- :stats_median: median, mad, <ci>
- :diagnostics_median: ess_median, ess_tail, rhat, mcse_median
kwargs: additional keyword arguments passed to default_summary_stats, including:
- ci_fun=eti: The function to compute the credible interval <ci>, if any. Supported options are eti and hdi. CI column name is <ci_fun><100*ci_prob>.
- ci_prob=0.89: The probability mass to be contained in the credible interval <ci>.
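Conceptually, each entry of stats_funs is applied to the (draws, chains) slice of every parameter. The following is a minimal sketch of that reduction using only the Statistics standard library. summarize_sketch is a hypothetical helper for illustration, not the package's implementation, and it returns a plain NamedTuple of columns rather than a SummaryStats.

```julia
using Statistics

# Hypothetical helper: apply each `name => fun` pair to every (draws, chains) slice.
function summarize_sketch(data::AbstractArray{<:Real,3}, stats_funs::Pair{Symbol}...)
    cols = map(stats_funs) do (name, fun)
        # One scalar per parameter (third dimension of `data`).
        name => [fun(view(data, :, :, k)) for k in axes(data, 3)]
    end
    return (; cols...)  # NamedTuple of columns, one entry per statistic
end

data = reshape(collect(1.0:24.0), 4, 2, 3)  # shape (draws, chains, params)
result = summarize_sketch(data, :mean => mean, :std => std)
result.mean  # one mean per parameter
```

Each column has one entry per parameter, which is the layout that the tables in the examples display.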

Extended Help

Examples

Compute all summary statistics (the default):

Display precision

When an estimator and its MCSE are both computed, the MCSE is used to determine the number of significant digits that will be displayed.

julia> using Statistics, StatsBase

julia> x = randn(1000, 4, 3) .+ reshape(0:10:20, 1, 1, :);

julia> summarize(x)
SummaryStats
       mean   std  eti89          ess_tail  ess_bulk  rhat  mcse_mean  mcse_std
 1   0.0003  0.99  -1.57 .. 1.59      3567      3663  1.00      0.016     0.012
 2  10.02    0.99   8.47 .. 11.6      3841      3906  1.00      0.016     0.011
 3  19.98    0.99   18.4 .. 21.6      3892      3749  1.00      0.016     0.012

Compute just the default statistics with a 94% HDI, and provide the parameter names:

julia> var_names=[:x, :y, :z];

julia> summarize(x; var_names, kind=:stats, ci_fun=hdi, ci_prob=0.94)
SummaryStats
         mean    std  hdi94
 x   0.000275  0.989  -1.92 .. 1.78
 y  10.0       0.988   8.17 .. 11.9
 z  20.0       0.988   18.1 .. 21.9

Compute Statistics.mean, Statistics.std and the Monte Carlo standard error (MCSE) of the mean estimate:

julia> summarize(x, mean, std, :mcse_mean => sem; name="Mean/Std")
Mean/Std
       mean    std  mcse_mean
 1   0.0003  0.989      0.016
 2  10.02    0.988      0.016
 3  19.98    0.988      0.016

Compute multiple quantiles simultaneously:

julia> percs = (5, 25, 50, 75, 95);

julia> summarize(x, Symbol.(:q, percs) => Base.Fix2(quantile, percs ./ 100))
SummaryStats
       q5     q25       q50     q75    q95
 1  -1.61  -0.668   0.00447   0.653   1.64
 2   8.41   9.34   10.0      10.7    11.6
 3  18.4   19.3    20.0      20.6    21.6

Extending summarize to custom types

To support computing summary statistics from a custom object MyType, overload the method summarize(::MyType, stats_funs...; kwargs...), which should ultimately call summarize(::AbstractArray{<:Union{Real,Missing},3}, stats_funs...; other_kwargs...), where other_kwargs are the keyword arguments passed to summarize.
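As a hedged illustration of the pattern described above, the sketch below wraps a 3D array in a hypothetical MyType that also stores parameter names, and forwards to the array method. The struct, its field names, and the decision to forward the names through var_names are assumptions made for this example only.

```julia
using PosteriorStats, Statistics

# Hypothetical container: draws with shape (draws, chains, params) plus names.
struct MyType{A<:AbstractArray{<:Real,3}}
    draws::A
    names::Vector{Symbol}
end

# Forward to the array method, supplying the stored names as `var_names`.
# A `var_names` passed by the caller in `kwargs` takes precedence.
function PosteriorStats.summarize(obj::MyType, stats_funs...; kwargs...)
    return PosteriorStats.summarize(
        obj.draws, stats_funs...; var_names=obj.names, kwargs...
    )
end

obj = MyType(randn(100, 4, 2), [:a, :b])
stats = summarize(obj, mean, std)
```

Because the method only unwraps the object and forwards its arguments, both the kind-based and the stats_funs-based call forms keep working for the custom type.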


PosteriorStats.default_summary_stats — Function

default_summary_stats(kind::Symbol=:all; kwargs...)

Return a collection of stats functions based on the named preset kind.

These functions are then passed to summarize.

Arguments

kind::Symbol: The named collection of summary statistics to be computed:
- :all: Everything in :stats and :diagnostics
- :stats: mean, std, <ci>
- :diagnostics: ess_tail, ess_bulk, rhat, mcse_mean, mcse_std
- :all_median: Everything in :stats_median and :diagnostics_median
- :stats_median: median, mad, <ci>
- :diagnostics_median: ess_median, ess_tail, rhat, mcse_median

Keywords

ci_fun=eti: The function to compute the credible interval <ci>, if any. Supported options are eti and hdi. CI column name is <ci_fun><100*ci_prob>.
ci_prob=0.89: The probability mass to be contained in the credible interval <ci>.
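To make the ci_prob keyword and the column naming concrete, here is a minimal sketch of an equal-tailed interval computed from quantiles with the Statistics standard library. eti_sketch is a hypothetical stand-in and is not the package's eti function; the name construction only illustrates the <ci_fun><100*ci_prob> convention for the default probability.

```julia
using Statistics

# Equal-tailed interval: cut (1 - prob) / 2 of the mass from each tail.
function eti_sketch(samples::AbstractVector{<:Real}; prob::Real=0.89)
    lower, upper = quantile(samples, [(1 - prob) / 2, (1 + prob) / 2])
    return (lower, upper)
end

# Column name following the <ci_fun><100*ci_prob> convention.
ci_name = string("eti", round(Int, 100 * 0.89))  # "eti89"

lower, upper = eti_sketch(collect(0.0:100.0); prob=0.9)
```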


Credible intervals

PosteriorStats.hdi — Function

hdi(samples::AbstractVecOrMat{<:Real}; [prob, sorted, method]) -> IntervalSets.ClosedInterval
hdi(samples::AbstractArray{<:Real}; [prob, sorted, method]) -> Array{<:IntervalSets.ClosedInterval}

Estimate the highest density interval (HDI) of samples for the probability prob.

The HDI is the minimum width Bayesian credible interval (BCI). That is, it is the smallest possible interval containing (100*prob)% of the probability mass.[1] This implementation uses the algorithm of Chen and Shao [2].
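The "smallest interval" definition can be sketched for a vector of draws from a unimodal distribution by sorting the draws and scanning every window that contains the required fraction of them. hdi_sketch is a hypothetical helper written for this illustration; it is an assumption that it captures the essence of the cited algorithm, and the package's hdi may differ in details such as how the number of included draws is rounded.

```julia
# Hypothetical helper: narrowest interval covering at least `prob` of the draws.
function hdi_sketch(samples::AbstractVector{<:Real}; prob::Real=0.89)
    0 < prob < 1 || throw(ArgumentError("prob must be in (0, 1)"))
    x = sort(samples)
    n = length(x)
    k = ceil(Int, prob * n)  # number of draws each candidate interval contains
    # Width of every window of k consecutive order statistics.
    widths = [x[i + k - 1] - x[i] for i in 1:(n - k + 1)]
    i = argmin(widths)
    return (x[i], x[i + k - 1])
end

interval = hdi_sketch([0.0, 10.0, 10.5, 11.0, 30.0, 40.0]; prob=0.5)  # (10.0, 11.0)
```

Unlike an equal-tailed interval, the result follows the densest region of the draws, so it need not cut equal mass from both tails.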

LOO

PosteriorStats.AbstractELPDResult — Type

abstract type AbstractELPDResult

An abstract type representing the result of an ELPD computation.

Every subtype stores estimates of both the expected log predictive density (elpd) and the effective number of parameters p, as well as standard errors and pointwise estimates of each, from which other relevant estimates can be computed.

Subtypes implement the following functions:

elpd_estimates
information_criterion
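As an illustration of how aggregate estimates follow from pointwise ones, the sketch below sums pointwise ELPD contributions and computes the usual standard error of that sum. elpd_summary_sketch is a hypothetical helper, not one of the functions listed above, and the deviance scale shown is only one common convention for an information criterion.

```julia
using Statistics

# Hypothetical helper: aggregate pointwise ELPD contributions.
function elpd_summary_sketch(pointwise::AbstractVector{<:Real})
    n = length(pointwise)
    elpd = sum(pointwise)
    se_elpd = sqrt(n * var(pointwise))  # standard error of the sum
    return (; elpd, se_elpd)
end

est = elpd_summary_sketch([-1.0, -2.0, -3.0])
deviance = -2 * est.elpd  # a common information-criterion scale
```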


PosteriorStats.PSISLOOResult — Type

Results of Pareto-smoothed importance sampling leave-one-out cross-validation (PSIS-LOO).

Model comparison

PosteriorStats.ModelComparisonResult — Type

ModelComparisonResult

Result of model comparison using ELPD.

This struct implements the Tables and TableTraits interfaces.

Each field returns a collection of the corresponding entry for each model:

name: Names of the models, if provided.
rank: Ranks of the models (ordered by decreasing ELPD)
elpd_diff: ELPD of a model subtracted from the largest ELPD of any model
se_elpd_diff: Standard error of the ELPD difference
weight: Model weights computed with weights_method
elpd_result: AbstractELPDResults for each model, which can be used to access useful stats like ELPD estimates, pointwise estimates, and Pareto shape values for PSIS-LOO
weights_method: Method used to compute model weights with model_weights
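The rank and elpd_diff fields follow directly from the ELPD estimates. A minimal sketch with made-up ELPD values (the numbers and variable names are assumptions for illustration, not output of compare):

```julia
elpds = (centered=-30.9, non_centered=-30.6)  # made-up ELPD values
vals = collect(values(elpds))

rank = invperm(sortperm(vals; rev=true))  # 1 = largest ELPD
elpd_diff = maximum(vals) .- vals         # 0 for the best model, positive otherwise
```

With these values the non-centered model has rank 1 and an elpd_diff of 0, while the centered model has rank 2 and a positive elpd_diff.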


PosteriorStats.compare — Function

compare(models; kwargs...) -> ModelComparisonResult

Compare models based on their expected log pointwise predictive density (ELPD).

The ELPD is estimated by Pareto smoothed importance sampling leave-one-out cross-validation (PSIS-LOO), the same method used by loo. For more theory, see Spiegelhalter et al. [5].

Arguments

models: a Tuple, NamedTuple, or AbstractVector whose values are either AbstractELPDResult entries or any argument to loo.

Keywords

weights_method::AbstractModelWeightsMethod=Stacking(): the method to be used to weight the models. See model_weights for details
sort::Bool=true: Whether to sort models by decreasing ELPD.

Returns

ModelComparisonResult: A container for the model comparison results. Each field contains a collection of the same kind as models.

Examples

Compare the centered and non-centered models of the eight schools problem:

julia> using ArviZExampleData

julia> models = (
           centered=load_example_data("centered_eight"),
           non_centered=load_example_data("non_centered_eight"),
       );

julia> mc = compare(models)
┌ Warning: 1 parameters had Pareto shape values 0.7 < k ≤ 1. Resulting importance sampling estimates are likely to be unstable.
└ @ PSIS ~/.julia/packages/PSIS/...
ModelComparisonResult with Stacking weights
               rank  elpd  se_elpd  elpd_diff  se_elpd_diff  weight    p  se_p
 non_centered     1   -31      1.5       0            0.0       1.0  0.9  0.32
 centered         2   -31      1.4       0.03         0.061     0.0  0.9  0.33

julia> mc.weight
(non_centered = 1.0, centered = 0.0)

Compare the same models from pre-computed PSIS-LOO results and compute BootstrappedPseudoBMA weights:

julia> elpd_results = mc.elpd_result;

julia> compare(elpd_results; weights_method=BootstrappedPseudoBMA())
ModelComparisonResult with BootstrappedPseudoBMA weights
               rank  elpd  se_elpd  elpd_diff  se_elpd_diff  weight    p  se_p
 non_centered     1   -31      1.5       0            0.0      0.51  0.9  0.32
 centered         2   -31      1.4       0.03         0.061    0.49  0.9  0.33

References

[5] Spiegelhalter et al. J. R. Stat. Soc. B 64 (2002)


PosteriorStats.model_weights — Function

model_weights(elpd_results; method=Stacking())
model_weights(method::AbstractModelWeightsMethod, elpd_results)

Compute weights for each model in elpd_results using method.

elpd_results is a Tuple, NamedTuple, or AbstractVector with AbstractELPDResult entries. The weights are returned in the same type of collection.

Stacking is the recommended approach, as it performs well even when the true data generating process is not included among the candidate models. See Yao et al. [6] for details.

Examples

Compute Stacking weights for two models:

julia> using ArviZExampleData

julia> models = (
           centered=load_example_data("centered_eight"),
           non_centered=load_example_data("non_centered_eight"),
       );

julia> elpd_results = map(models) do idata
           log_like = PermutedDimsArray(idata.log_likelihood.obs, (2, 3, 1))
           return loo(log_like)
       end;
┌ Warning: 1 parameters had Pareto shape values 0.7 < k ≤ 1. Resulting importance sampling estimates are likely to be unstable.
└ @ PSIS ~/.julia/packages/PSIS/...

julia> model_weights(elpd_results; method=Stacking())
(centered = 0.0, non_centered = 1.0)

Now we compute BootstrappedPseudoBMA weights for the same models:

julia> model_weights(elpd_results; method=BootstrappedPseudoBMA())
(centered = 0.492513, non_centered = 0.507487)

References

[6] Yao et al. Bayesian Analysis 13, 3 (2018)


The following model weighting methods are available:

PosteriorStats.AbstractModelWeightsMethod — Type

abstract type AbstractModelWeightsMethod

An abstract type representing methods for computing model weights.

Subtypes implement model_weights(method, elpd_results).


PosteriorStats.BootstrappedPseudoBMA — Type

struct BootstrappedPseudoBMA{R<:Random.AbstractRNG, T<:Real} <: AbstractModelWeightsMethod

Model weighting method based on pseudo Bayesian model averaging with Akaike-type weighting and the Bayesian bootstrap (pseudo-BMA+) [6].

The Bayesian bootstrap stabilizes the model weights.

BootstrappedPseudoBMA(; rng=Random.default_rng(), samples=1_000, alpha=1)
BootstrappedPseudoBMA(rng, samples, alpha)

Construct the method.

rng::Random.AbstractRNG: The random number generator to use for the Bayesian bootstrap
samples::Int64: The number of samples to draw for bootstrapping
alpha::Real: The shape parameter in the Dirichlet distribution used for the Bayesian bootstrap. The default (1) corresponds to a uniform distribution on the simplex.
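The idea behind pseudo-BMA+ can be sketched with the standard library alone: draw Dirichlet weights over the observations, recompute each model's ELPD under those weights, convert the replicated ELPDs to Akaike-type weights, and average over replicates. The helper names below are hypothetical, the sketch covers only the default alpha = 1 (where a Dirichlet draw is a vector of normalized exponential variates), and it is not the package's implementation.

```julia
using Random

# Akaike-type weights: softmax of the ELPD estimates.
function akaike_weights(elpds::AbstractVector{<:Real})
    w = exp.(elpds .- maximum(elpds))  # subtract the maximum for numerical stability
    return w ./ sum(w)
end

# `pointwise` has shape (observations, models) and holds pointwise ELPD values.
function pseudo_bma_plus_sketch(
    rng::AbstractRNG, pointwise::AbstractMatrix{<:Real}; samples::Int=1_000
)
    nobs, nmodels = size(pointwise)
    weights = zeros(nmodels)
    for _ in 1:samples
        # Dirichlet(1, ..., 1) draw: normalized i.i.d. exponential variates.
        d = randexp(rng, nobs)
        d ./= sum(d)
        # Bayesian bootstrap replicate of each model's ELPD.
        elpd_boot = nobs .* (pointwise' * d)
        weights .+= akaike_weights(elpd_boot)
    end
    return weights ./ samples
end

pointwise = [-1.0 -1.2; -2.0 -1.9; -0.5 -0.8]  # made-up values for two models
w = pseudo_bma_plus_sketch(MersenneTwister(42), pointwise; samples=200)
```

Averaging over bootstrap replicates is what keeps the weights from collapsing onto one model when the ELPD difference is small relative to its uncertainty.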

Predictive checks

PosteriorStats.loo_pit — Function

loo_pit(y, y_pred, log_weights) -> Union{Real,AbstractArray}

Compute leave-one-out probability integral transform (LOO-PIT) checks.

Arguments

y: array of observations with shape (params...,)
y_pred: array of posterior predictive samples with shape (draws, chains, params...).
log_weights: array of normalized log LOO importance weights with shape (draws, chains, params...).

Returns

pitvals: LOO-PIT values with same size as y. If y is a scalar, then pitvals is a scalar.

LOO-PIT is a marginal posterior predictive check. If $y_{-i}$ is the array $y$ of observations with the $i$th observation left out, and $y_i^*$ is a posterior prediction of the $i$th observation, then the LOO-PIT value for the $i$th observation is defined as

\[P(y_i^* \le y_i \mid y_{-i}) = \int_{-\infty}^{y_i} p(y_i^* \mid y_{-i}) \mathrm{d} y_i^*\]

The LOO posterior predictions and the corresponding observations should have similar distributions, so if conditional predictive distributions are well-calibrated, then for continuous data, all LOO-PIT values should be approximately uniformly distributed on $[0, 1]$. [7]

Warning

For discrete data, the LOO-PIT values will typically not be uniformly distributed on $[0, 1]$, and this function is not recommended.
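For a single observation, the integral above can be approximated by the importance-weighted fraction of posterior predictive draws that do not exceed the observed value. The sketch below assumes flat vectors of draws and log weights; loo_pit_sketch is a hypothetical helper, not the package's loo_pit, and it renormalizes the weights defensively even though the documented input is already normalized.

```julia
# Hypothetical helper: weighted proportion of predictive draws at or below `y`.
function loo_pit_sketch(
    y::Real, y_pred::AbstractVector{<:Real}, log_weights::AbstractVector{<:Real}
)
    w = exp.(log_weights .- maximum(log_weights))
    w ./= sum(w)  # renormalize so the weights sum to one
    return sum(w[y_pred .<= y])
end

# Four equally weighted draws, one of which lies below the observation.
pit = loo_pit_sketch(0.5, [0.0, 1.0, 2.0, 3.0], fill(log(0.25), 4))  # 0.25
```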

Examples

Calculate LOO-PIT values using the observed values themselves as the test quantity.

julia> using ArviZExampleData

julia> idata = load_example_data("centered_eight");

julia> y = idata.observed_data.obs;

julia> y_pred = PermutedDimsArray(idata.posterior_predictive.obs, (:draw, :chain, :school));

julia> log_like = PermutedDimsArray(idata.log_likelihood.obs, (:draw, :chain, :school));

julia> log_weights = loo(log_like).psis_result.log_weights;

julia> loo_pit(y, y_pred, log_weights)
┌ 8-element DimArray{Float64, 1} ┐
├────────────────────────────────┴─────────────────────────────── dims ┐
  ↓ school Categorical{String} ["Choate", …, "Mt. Hermon"] Unordered
└──────────────────────────────────────────────────────────────────────┘
 "Choate"            0.942759
 "Deerfield"         0.641057
 "Phillips Andover"  0.32729
 "Phillips Exeter"   0.581451
 "Hotchkiss"         0.288523
 "Lawrenceville"     0.393741
 "St. Paul's"        0.886175
 "Mt. Hermon"        0.638821

Calculate LOO-PIT values using the square of the difference between each observation and mu as the test quantity.

julia> using Statistics

julia> mu = idata.posterior.mu;

julia> T = y .- median(mu);

julia> T_pred = y_pred .- mu;

julia> loo_pit(T .^ 2, T_pred .^ 2, log_weights)
┌ 8-element DimArray{Float64, 1} ┐
├────────────────────────────────┴─────────────────────────────── dims ┐
  ↓ school Categorical{String} ["Choate", …, "Mt. Hermon"] Unordered
└──────────────────────────────────────────────────────────────────────┘
 "Choate"            0.868148
 "Deerfield"         0.27421
 "Phillips Andover"  0.321719
 "Phillips Exeter"   0.193169
 "Hotchkiss"         0.370422
 "Lawrenceville"     0.195601
 "St. Paul's"        0.817408
 "Mt. Hermon"        0.326795

References

[7] Gabry et al. J. R. Stat. Soc. Ser. A Stat. Soc. 182 (2019).


PosteriorStats.r2_score — Function

r2_score(y_true::AbstractVector, y_pred::AbstractArray; kwargs...) -> (; r2, <ci_fun>)

$R²$ for linear Bayesian regression models.[8]

The $R²$, or coefficient of determination, is defined as the proportion of variance in the data that is explained by the model. For each draw, it is computed as the variance of the predicted values divided by the variance of the predicted values plus the variance of the residuals.

The distribution of the $R²$ scores can then be summarized using a point estimate and a credible interval (CI).
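The per-draw definition above can be written in a few lines with the Statistics standard library. This is a minimal sketch for predictions of shape (ndraws, noutputs); r2_draws_sketch is a hypothetical helper and not the package's r2_score.

```julia
using Statistics

# Hypothetical helper: one R² value per posterior draw (row of `y_pred`).
function r2_draws_sketch(y_true::AbstractVector{<:Real}, y_pred::AbstractMatrix{<:Real})
    var_pred = vec(var(y_pred; dims=2))             # variance of predicted values
    var_resid = vec(var(y_true' .- y_pred; dims=2))  # variance of residuals
    return var_pred ./ (var_pred .+ var_resid)
end

y_true = [1.0, 2.0, 3.0]
y_pred = [1.0 2.0 3.0; 2.0 2.0 2.0]  # draw 1 is a perfect fit, draw 2 is constant
r2 = r2_draws_sketch(y_true, y_pred)  # [1.0, 0.0]
```

Because the denominator contains the numerator, every value lies in [0, 1], and the resulting vector is what a point estimate and credible interval would then summarize.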

Arguments

y_true: Observed data of length noutputs
y_pred: Predicted data with size (ndraws[, nchains], noutputs)

Keywords

summary::Bool=true: Whether to return a summary or an array of $R²$ scores. The summary is a named tuple with the point estimate :r2 and the credible interval :<ci_fun>.
point_estimate=Statistics.mean: The function used to compute the point estimate of the $R²$ scores if summary is true. Supported options are:
- Statistics.mean (default)
- Statistics.median
- StatsBase.mode
ci_fun=eti: The function used to compute the credible interval if summary is true. Supported options are eti and hdi.
ci_prob=0.89: The probability mass to be contained in the credible interval.

Examples

julia> using ArviZExampleData

julia> idata = load_example_data("anes");

julia> y_true = idata.observed_data.vote;

julia> y_pred = PermutedDimsArray(idata.posterior_predictive.vote, (:draw, :chain, :__obs__));

julia> r2_score(y_true, y_pred)
(r2 = 0.4944850210319484, eti = 0.46184359652436546 .. 0.528018251711097)

References

[8] Gelman et al. The Am. Stat., 73(3) (2019)


Utilities

PosteriorStats.kde_reflected — Function

kde_reflected(data::AbstractVector{<:Real}; bounds=extrema(data), kwargs...)

Compute the boundary-corrected kernel density estimate (KDE) of data using reflection.

For $x \in (l, u)$, the reflected KDE has the density

\[\hat{f}_R(x) = \hat{f}(x) + \hat{f}(2l - x) + \hat{f}(2u - x),\]

where $\hat{f}$ is the usual KDE of data. This is equivalent to augmenting the original data with 2 additional copies of the data reflected around each bound, computing the usual KDE, trimming the KDE to the bounds, and renormalizing.

Any non-finite bounds are ignored. Remaining kwargs are passed to KernelDensity.kde. The default bandwidth is estimated using the Improved Sheather-Jones (ISJ) method [9].
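The reflection formula can be sketched pointwise with a Gaussian kernel and a fixed bandwidth. All names below are hypothetical, the bandwidth h is supplied by hand rather than estimated with ISJ, and the final renormalization step is omitted, so this is an illustration of the formula rather than the package's kde_reflected.

```julia
# Standard normal kernel.
gauss(z) = exp(-z^2 / 2) / sqrt(2π)

# Ordinary (unbounded) Gaussian KDE of `data` evaluated at `x` with bandwidth `h`.
kde_at(x, data, h) = sum(gauss((x - d) / h) for d in data) / (length(data) * h)

# Reflected KDE: add back the mass the ordinary KDE places beyond each finite bound.
function kde_reflected_at(x, data, h; bounds=extrema(data))
    l, u = bounds
    l <= x <= u || return 0.0  # density is zero outside the bounds
    f = kde_at(x, data, h)
    isfinite(l) && (f += kde_at(2l - x, data, h))
    isfinite(u) && (f += kde_at(2u - x, data, h))
    return f
end

data = [0.1, 0.5, 0.9]
f0 = kde_reflected_at(0.0, data, 0.2; bounds=(0.0, 1.0))
```

At a bound the reflected term coincides with the ordinary estimate, so the density there is roughly doubled instead of dropping off as it would with an uncorrected KDE.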

References

[9] Botev et al. Ann. Stat., 38: 5 (2010)


PosteriorStats.pointwise_conditional_loglikelihoods — Function

pointwise_conditional_loglikelihoods(y, dists)

Compute pointwise conditional log-likelihoods of y for non-factorized distributions.

A non-factorized observation model $p(y \mid \theta)$, where $y$ is an observation in its support and $\theta$ are model parameters, can be factorized as $p(y_i \mid y_{-i}, \theta) p(y_{-i} \mid \theta)$. However, completely factorizing into individual likelihood terms can be tedious, expensive, and poorly supported by a given PPL. This utility function computes $\log p(y_i \mid y_{-i}, \theta)$ terms for all $i$; the resulting pointwise conditional log-likelihoods can be used e.g. in loo.
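For a multivariate normal, the conditional terms have a closed form in terms of the precision matrix: with $\Lambda = \Sigma^{-1}$ and $g = \Lambda (y - \mu)$, the conditional distribution of $y_i$ given $y_{-i}$ is normal with variance $1 / \Lambda_{ii}$ and mean $y_i - g_i / \Lambda_{ii}$. The sketch below uses only LinearAlgebra; mvnormal_conditional_loglik is a hypothetical helper that handles a single draw and makes no claim about how the package computes these values.

```julia
using LinearAlgebra

# Hypothetical helper: log p(y_i | y_{-i}, μ, Σ) for every i of one MvNormal draw.
function mvnormal_conditional_loglik(y, μ, Σ)
    Λ = inv(Symmetric(Σ))  # precision matrix
    g = Λ * (y .- μ)
    λ = diag(Λ)            # conditional precisions
    return @. -0.5 * (log(2π) - log(λ) + g^2 / λ)
end

# Parameters of the first distribution in the example for this function.
y = [2.9, 0.4]
ll = mvnormal_conditional_loglik(y, [0.8, -0.9], [1.3 0.7; 0.7 0.5])
# ll ≈ [-0.4717, 0.0122]
```

Working with the precision matrix yields all conditional terms from a single matrix inverse, rather than one conditional distribution per observation.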

Arguments

y: observed value in the support of the distributions in dists. If the distribution is array-variate, y is an array with shape (params...,).
dists: array of shape (draws[, chains]) containing parametrized Distributions.Distribution objects representing a non-factorized observation model, one for each posterior draw. The following distributions are currently supported:
- Distributions.MvNormal [10]
- Distributions.MvNormalCanon
- Distributions.MatrixNormal
- Distributions.MvLogNormal
- Distributions.GenericMvTDist [10, but uses a more efficient implementation]
- Distributions.AbstractMixtureModel for mixtures of any of the above multivariate distributions
- Distributions.JointOrderStatistics for joint distributions of order statistics
- Distributions.ProductDistribution for products of univariate distributions and any of the above array-variate distributions
- Distributions.ReshapedDistribution for any of the above distributions reshaped
- Distributions.ProductNamedTupleDistribution for NamedTuple-variate distributions comprised of univariate distributions and any of the above distributions.

Returns

log_like: Array with pointwise conditional log-likelihood values. If the distributions are array-variate, then the shape is (draws[, chains], params...) with real values. Otherwise, the shape is (draws[, chains]), with values of a similar eltype to y.

Examples

julia> using Distributions

julia> dists = [
           MvNormal([ 0.8, -0.9], [1.3  0.7;  0.7 0.5])
           MvNormal([-0.9,  0.6], [2.7 -1.4; -1.4 1.5])
           MvNormal([-0.6,  0.4], [1.0  0.2;  0.2 0.2])
       ];

julia> y = [2.9, 0.4];

julia> PosteriorStats.pointwise_conditional_loglikelihoods(y, dists)
3×2 Matrix{Float64}:
 -0.471721   0.0121882
 -5.77002   -2.81539
 -8.46362   -1.5339

References

[10] Bürkner et al. Comput. Stat. 36 (2021).
[11] Vehtari et al. Leave-one-out cross-validation for non-factorized models
