Datasets

The following shows the currently available example datasets:

using ArviZExampleData

println(describe_example_data())
centered_eight
==============

A centered parameterization of the eight schools model. Provided as an example of a model that NUTS has trouble fitting. Compare to `non_centered_eight`.

The eight schools model is a hierarchical model used for an analysis of the effectiveness of classes that were designed to improve students' performance on the Scholastic Aptitude Test.

See Bayesian Data Analysis (Gelman et al.) for more details.

local: /home/runner/.julia/artifacts/cb5639396962742e4eab4a5bd561126f19b9e6e3/arviz_example_data-0.4.3/data/centered_eight.nc

non_centered_eight
==================

A non-centered parameterization of the eight schools model. This is a hierarchical model where sampling problems may be fixed by a non-centered parameterization. Compare to `centered_eight`.

The eight schools model is a hierarchical model used for an analysis of the effectiveness of classes that were designed to improve students' performance on the Scholastic Aptitude Test.

See Bayesian Data Analysis (Gelman et al.) for more details.

local: /home/runner/.julia/artifacts/cb5639396962742e4eab4a5bd561126f19b9e6e3/arviz_example_data-0.4.3/data/non_centered_eight.nc

radon
=====

Radon is a radioactive gas that enters homes through contact points with the ground. It is a carcinogen that is the primary cause of lung cancer in non-smokers. Radon levels vary greatly from household to household.

This example uses an EPA study of radon levels in houses in Minnesota to construct a model with a hierarchy over households within a county. The model includes estimates (gamma) for contextual effects of the uranium per household.

See Gelman and Hill (2006) for details on the example, or https://docs.pymc.io/notebooks/multilevel_modeling.html by Chris Fonnesbeck for details on this implementation.

remote: https://ndownloader.figshare.com/files/24067472

rugby
=====

The Six Nations Championship is a yearly rugby competition between Italy, Ireland, Scotland, England, France and Wales. Fifteen games are played each year, representing all combinations of the six teams.

This example uses results from 2014 to 2017, comprising 60 games in total. It models latent parameters for each team's attack and defense, as well as a global parameter for home team advantage.

See https://github.com/arviz-devs/arviz_example_data/blob/main/code/rugby/rugby.ipynb for the whole model specification.

remote: https://ndownloader.figshare.com/files/44916469

rugby_field
===========

A variant of the 'rugby' example dataset. The Six Nations Championship is a yearly rugby competition between Italy, Ireland, Scotland, England, France and Wales. Fifteen games are played each year, representing all combinations of the six teams.

This example uses results from 2014 to 2017, comprising 60 games in total. It models latent parameters for each team's attack and defense, with each team having different values depending on whether it plays at home or away.

See https://github.com/arviz-devs/arviz_example_data/blob/main/code/rugby_field/rugby_field.ipynb for the whole model specification.

remote: https://ndownloader.figshare.com/files/44667112

glycan_torsion_angles
=====================

Torsion angles phi and psi are critical for determining the three-dimensional structure of bio-molecules. Combinations of phi and psi torsion angles that produce clashes between atoms in the bio-molecule result in high-energy, unlikely structures.

This model uses a von Mises distribution to propose torsion angles for the structure of a glycan molecule (pdb id: 2LIQ), and a Potential to estimate the proposed structure's energy. Said Potential is bound by Boltzmann's law.

remote: https://ndownloader.figshare.com/files/22882652

crabs_poisson
=============

Horseshoe crabs arrive at the beach in pairs for their spawning ritual. Solitary males gather around the nesting couples and vie to fertilize the eggs. These individuals, known as satellite males, often congregate near certain nesting pairs while disregarding others.

We use Bambi to create a hurdle-negative binomial model for the number of male satellites as a function of the carapace width and color of the female.

For details see https://doi.org/10.1111/j.1439-0310.1996.tb01099.x and https://bap.com.ar.

remote: https://ndownloader.figshare.com/files/53391239

crabs_hurdle_nb
===============

Horseshoe crabs arrive at the beach in pairs for their spawning ritual. Solitary males gather around the nesting couples and vie to fertilize the eggs. These individuals, known as satellite males, often congregate near certain nesting pairs while disregarding others.

We use Bambi to create a hurdle-negative binomial model for the number of male satellites as a function of the carapace width and color of the female.

For details see https://doi.org/10.1111/j.1439-0310.1996.tb01099.x and https://bap.com.ar.

remote: https://ndownloader.figshare.com/files/53391224

sbc
===

Simulation-based calibration with the eight schools dataset, using PyMC with the NUTS sampler. 100 very short simulations were carried out.

remote: https://ndownloader.figshare.com/files/52915271

anes
====

Logistic regression model for American National Election Studies (ANES) data. The model aims to predict voter intention for Clinton based on party identification and its interaction with age.

remote: https://ndownloader.figshare.com/files/53391248

periwinkles
===========

31 periwinkles (a kind of sea snail) were removed from their original place and released down shore. A von Mises likelihood is used to model the direction of motion as a function of the distance travelled by the periwinkles after being released.

remote: https://ndownloader.figshare.com/files/53391296

censored_cats
=============

Simple model of the survival curve for cat adoptions from an animal shelter in Austin, Texas. The dataset comes from the City of Austin Open Data Portal. For more details on the model, see Bambi's documentation.

remote: https://ndownloader.figshare.com/files/58114255

roaches_nb
==========

The roaches data example comes from Chapter 8.3 of Gelman and Hill (2007).

remote: https://ndownloader.figshare.com/files/59961428

roaches_zinb
============

The roaches data example comes from Chapter 8.3 of Gelman and Hill (2007).

remote: https://ndownloader.figshare.com/files/59961890
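
Any of the dataset names listed above can be passed to `load_example_data` to obtain the corresponding data. The following is a minimal sketch rather than a definitive recipe: it assumes that `load_example_data()` with no arguments returns a dictionary keyed by dataset name, and that datasets listed as `remote` need network access to be fetched, which is why the example loads `centered_eight`, a dataset listed above as `local`.

```julia
using ArviZExampleData

# Assumption: calling with no arguments returns a Dict keyed by dataset name.
datasets = load_example_data()
println(sort(collect(keys(datasets))))

# Load one dataset by name. "centered_eight" is listed as `local` above,
# so this call is not expected to require a download.
idata = load_example_data("centered_eight")
println(idata)
```

Replacing `"centered_eight"` with another name from the list, such as `"radon"`, should load that dataset instead, subject to the remote file being reachable.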