GitHub - ISAAKiel/bayCA: Bayesian modelling of contingency tables for Correspondence Analysis

Bayesian modelling of contingency tables for Correspondence Analysis

This is the code accompanying the paper "Modelling uncertainty in Correspondence Analysis: a Bayesian framework for imputation and credibility ellipses" by Nils Müller-Scheeßel, Martin Hinz and Andrea Göhring.

Abstract

Correspondence Analysis (CA) is widely used in archaeology to explore associations in contingency tables and to visualize underlying structural gradients. While bootstrapped confidence regions have been proposed to express sampling uncertainty in CA ordinations, missing data – ubiquitous in archaeological datasets – are usually handled separately by imputation, without accounting for the additional uncertainty introduced. As a consequence, combined workflows of imputation followed by bootstrapped CA tend to underestimate total uncertainty. In this paper, we present a fully Bayesian approach that integrates contingency table imputation and CA within a single probabilistic framework. Using a Poisson log-linear model estimated via Markov Chain Monte Carlo, missing cell counts are treated as parameters and sampled jointly with observed data. Repeated CAs on posterior predictive samples allow the construction of credibility regions that simultaneously reflect sampling variation and imputation uncertainty. Instabilities in axis order and sign are addressed systematically using assignment algorithms. The method is demonstrated on two archaeological case studies: Romano-British small-find assemblages and European Iron Age sites with single human bones. Results show that, in the absence of missing data, Bayesian credibility ellipses and bootstrapped confidence ellipses are broadly comparable. When missing data are present, however, bootstrapped ellipses remain unrealistically narrow, whereas Bayesian credibility regions expand appropriately and reflect both data scarcity and imputation uncertainty. We conclude that Bayesian CA offers a coherent and conservative framework for analyzing incomplete archaeological contingency tables. Its main advantage lies in enabling the joint visualization of uncertainty arising from multiple analytical steps.
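The workflow described in the abstract (repeated CAs on simulated tables, with axis order and sign aligned before summarising) can be illustrated with a minimal base-R sketch. This is not the repository's code and rests on several assumptions: the counts in `tab` are invented, `ca_rows()` and `align_axes()` are hypothetical helper names, the replicate tables are plain Poisson draws around the observed counts rather than posterior predictive samples from the MCMC-fitted log-linear model, and a brute-force search over the two possible axis orderings stands in for a general assignment algorithm.

```r
# Minimal sketch, NOT the repository's code: CA via SVD on replicated tables,
# with axis order and sign aligned to a reference solution.
set.seed(1)

# Hypothetical 4 x 3 contingency table (invented counts for illustration)
tab <- matrix(c(20,  5,  2,
                10, 12,  4,
                 3, 15,  9,
                 1,  6, 18), nrow = 4, byrow = TRUE)

# Row principal coordinates of a simple CA on the first k axes
ca_rows <- function(n, k = 2) {
  p   <- n / sum(n)
  rm  <- rowSums(p)
  cm  <- colSums(p)
  s   <- (p - outer(rm, cm)) / sqrt(outer(rm, cm))  # standardised residuals
  dec <- svd(s)
  # U %*% diag(d), then each row divided by the square root of its row mass
  sweep(dec$u[, 1:k, drop = FALSE], 2, dec$d[1:k], "*") / sqrt(rm)
}

# Reorder and flip the two axes of x so that they match ref as closely as possible
align_axes <- function(x, ref) {
  cc    <- cor(ref, x)                    # cc[i, j]: ref axis i vs. x axis j
  perms <- list(c(1, 2), c(2, 1))         # both orderings of two axes
  score <- sapply(perms, function(p) sum(abs(cc[cbind(1:2, p)])))
  p     <- perms[[which.max(score)]]      # best one-to-one axis matching
  sgn   <- sign(cc[cbind(1:2, p)])        # flip axes that are negatively correlated
  sweep(x[, p, drop = FALSE], 2, sgn, "*")
}

ref  <- ca_rows(tab)
reps <- replicate(200, {
  # Stand-in for a posterior predictive draw: Poisson counts around the data
  sim <- matrix(rpois(length(tab), tab), nrow = nrow(tab))
  align_axes(ca_rows(sim), ref)
}, simplify = FALSE)

coords <- simplify2array(reps)            # rows x axes x replicates
spread <- apply(coords, c(1, 2), sd)      # per-row spread on each axis
print(spread)
```

Without the alignment step, replicates whose axes come out swapped or mirrored would inflate the spread artificially; the aligned coordinates in `coords` are what an ellipse around each row point would be drawn from.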

Running the code

To reproduce the analysis, create a local RStudio project and work through the numbered R files in order, starting with 1_start.R …
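As a rough sketch of working through the numbered files from the R console, the helper below lists the scripts whose names start with a digit and end in `.R`, in alphabetical order. The function names `numbered_scripts()` and `run_in_order()` are hypothetical and not part of the repository, and it is an assumption that the scripts can be sourced unattended from the project root; stepping through them interactively in RStudio may be what the authors intend.

```r
# Hypothetical helper: find scripts such as 1_start.R or 2a_roman.R.
# Files like 4_supplementary.Rmd are skipped because they do not end in ".R".
numbered_scripts <- function(dir = ".") {
  sort(list.files(dir, pattern = "^[0-9].*\\.R$", full.names = TRUE))
}

# Hypothetical helper: source each script in turn, echoing the code as it runs.
# Assumes the working directory is the project root, so relative paths resolve.
run_in_order <- function(dir = ".") {
  for (f in numbered_scripts(dir)) {
    message("Running ", f)
    source(f, echo = TRUE)
  }
}

# List what would be run, without executing anything
print(numbered_scripts("."))
```

Calling `run_in_order()` from the project root would then execute the scripts one after another; the listing step alone is a cheap way to check the order first.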

Folders and files

_obsolete
data
functions
images
.gitignore
1_start.R
2a_roman.R
2b_roman_missing.R
3_iron_age_single_bones.R
4_supplementary.Rmd
4_supplementary.pdf
LICENSE
README.md
bayCA.Rproj
latex_code.Rmd
latex_code.html
latex_code.pdf
