API

Core functionality

PSIS.PSISResultType
PSISResult

Result of Pareto-smoothed importance sampling (PSIS) using psis.

Properties

  • log_weights: un-normalized Pareto-smoothed log weights
  • weights: normalized Pareto-smoothed weights (allocates a copy)
  • pareto_shape: Pareto $k=ξ$ shape parameter
  • nparams: number of parameters in log_weights
  • ndraws: number of draws in log_weights
  • nchains: number of chains in log_weights
  • reff: the ratio of the effective sample size of the unsmoothed importance ratios and the actual sample size.
  • ess: estimated effective sample size of estimate of mean using smoothed importance samples (see ess_is)
  • tail_length: length of the upper tail of log_weights that was smoothed
  • tail_dist: the generalized Pareto distribution that was fit to the tail of log_weights. Note that the tail weights are scaled to have a maximum of 1, so tail_dist * exp(maximum(log_ratios)) is the corresponding fit directly to the tail of log_ratios.
  • normalized::Bool:indicates whether log_weights are log-normalized along the sample dimensions.

Diagnostic

The pareto_shape parameter $k=ξ$ of the generalized Pareto distribution tail_dist can be used to diagnose reliability and convergence of estimates using the importance weights [1].

  • if $k < \frac{1}{3}$, importance sampling is stable, and importance sampling (IS) and PSIS both are reliable.
  • if $k ≤ \frac{1}{2}$, then the importance ratio distributon has finite variance, and the central limit theorem holds. As $k$ approaches the upper bound, IS becomes less reliable, while PSIS still works well but with a higher RMSE.
  • if $\frac{1}{2} < k ≤ 0.7$, then the variance is infinite, and IS can behave quite poorly. However, PSIS works well in this regime.
  • if $0.7 < k ≤ 1$, then it quickly becomes impractical to collect enough importance weights to reliably compute estimates, and importance sampling is not recommended.
  • if $k > 1$, then neither the variance nor the mean of the raw importance ratios exists. The convergence rate is close to zero, and bias can be large with practical sample sizes.

See PSISPlots.paretoshapeplot for a diagnostic plot.

References

  • [1] Vehtari et al. JMLR 25:72 (2021).
source
PSIS.psisFunction
psis(log_ratios, reff = 1.0; kwargs...) -> PSISResult
psis!(log_ratios, reff = 1.0; kwargs...) -> PSISResult

Compute Pareto smoothed importance sampling (PSIS) log weights [1].

While psis computes smoothed log weights out-of-place, psis! smooths them in-place.

Arguments

  • log_ratios: an array of logarithms of importance ratios, with size (draws, [chains, [parameters...]]), where chains>1 would be used when chains are generated using Markov chain Monte Carlo.
  • reff::Union{Real,AbstractArray}: the ratio(s) of effective sample size of log_ratios and the actual sample size reff = ess/(draws * chains), used to account for autocorrelation, e.g. due to Markov chain Monte Carlo. If an array, it must have the size (parameters...,) to match log_ratios.

Keywords

  • warn=true: If true, warning messages are delivered
  • normalize=true: If true, the log-weights will be log-normalized so that exp.(log_weights) sums to 1 along the sample dimensions.

Returns

  • result: a PSISResult object containing the results of the Pareto-smoothing.

A warning is raised if the Pareto shape parameter $k ≥ 0.7$. See PSISResult for details and PSISPlots.paretoshapeplot for a diagnostic plot.

Examples

Here we smooth log importance ratios for importance sampling 30 isotropic Student $t$-distributed parameters using standard normal distributions as proposals.

julia> using Distributions

julia> proposal, target = Normal(), TDist(7);

julia> x = rand(proposal, 1_000, 1, 30);  # (ndraws, nchains, nparams)

julia> log_ratios = @. logpdf(target, x) - logpdf(proposal, x);

julia> result = psis(log_ratios)
┌ Warning: 9 parameters had Pareto shape values 0.7 < k ≤ 1. Resulting importance sampling estimates are likely to be unstable.
└ @ PSIS ~/.julia/packages/PSIS/...
┌ Warning: 1 parameters had Pareto shape values k > 1. Corresponding importance sampling estimates are likely to be unstable and are unlikely to converge with additional samples.
└ @ PSIS ~/.julia/packages/PSIS/...
PSISResult with 1000 draws, 1 chains, and 30 parameters
Pareto shape (k) diagnostic values:
                        Count       Min. ESS
 (-Inf, 0.5]  good       7 (23.3%)  959
  (0.5, 0.7]  okay      13 (43.3%)  938
    (0.7, 1]  bad        9 (30.0%)  ——
    (1, Inf)  very bad   1 (3.3%)   ——

If the draws were generated using MCMC, we can compute the relative efficiency using MCMCDiagnosticTools.ess.

julia> using MCMCDiagnosticTools

julia> reff = ess(log_ratios; kind=:basic, split_chains=1, relative=true);

julia> result = psis(log_ratios, reff)
┌ Warning: 9 parameters had Pareto shape values 0.7 < k ≤ 1. Resulting importance sampling estimates are likely to be unstable.
└ @ PSIS ~/.julia/packages/PSIS/...
┌ Warning: 1 parameters had Pareto shape values k > 1. Corresponding importance sampling estimates are likely to be unstable and are unlikely to converge with additional samples.
└ @ PSIS ~/.julia/packages/PSIS/...
PSISResult with 1000 draws, 1 chains, and 30 parameters
Pareto shape (k) diagnostic values:
                        Count       Min. ESS
 (-Inf, 0.5]  good       9 (30.0%)  806
  (0.5, 0.7]  okay      11 (36.7%)  842
    (0.7, 1]  bad        9 (30.0%)  ——
    (1, Inf)  very bad   1 (3.3%)   ——

References

  • [1] Vehtari et al. JMLR 25:72 (2021).
source
PSIS.ess_isFunction
ess_is(weights; reff=1)

Estimate effective sample size (ESS) for importance sampling over the sample dimensions.

Given normalized weights $w_{1:n}$, the ESS is estimated using the L2-norm of the weights:

\[\mathrm{ESS}(w_{1:n}) = \frac{r_{\mathrm{eff}}}{\sum_{i=1}^n w_i^2}\]

where $r_{\mathrm{eff}}$ is the relative efficiency of the log_weights.

ess_is(result::PSISResult; bad_shape_nan=true)

Estimate ESS for Pareto-smoothed importance sampling.

Note

ESS estimates for Pareto shape values $k > 0.7$, which are unreliable and misleadingly high, are set to NaN. To avoid this, set bad_shape_nan=false.

source

Plotting

PSIS.PSISPlots.paretoshapeplotFunction
paretoshapeplot(values; showlines=false, ...)
paretoshapeplot!(values; showlines=false, kwargs...)

Plot shape parameters of fitted Pareto tail distributions for diagnosing convergence.

values may be either a vector of Pareto shape parameters or a PSIS.PSISResult.

If showlines==true, horizontal lines indicating relevant Pareto shape thresholds are drawn. See PSIS.PSISResult for an explanation of the thresholds.

All remaining kwargs are forwarded to the plotting function.

See psis, PSISResult.

Examples

using PSIS, Distributions, Plots
proposal = Normal()
target = TDist(7)
x = rand(proposal, 1_000, 100)
log_ratios = logpdf.(target, x) .- logpdf.(proposal, x)
result = psis(log_ratios)
paretoshapeplot(result)

We can also plot the Pareto shape parameters directly:

paretoshapeplot(result.pareto_shape)

We can also use plot directly:

plot(result.pareto_shape; showlines=true)
source