Three levels of reproducible workflow remove barriers for archaeologists and increase accessibility
Removing Barriers to Reproducible Research in Archaeology
Recommendation: posted 21 November 2022, validated 21 November 2022
Over the last decade, a small but growing community of archaeologists, from a diversity of intellectual and demographic backgrounds, has been striving for computational reproducibility in their published research. In their survey of the accomplishments of this thriving community, Emma Karoune and Esther Plomp (2022) analyzed the wide variety of approaches researchers have taken to enhance the reproducibility of their research. A key contribution of this paper is their excellent synthesis of diverse approaches into three levels of increasing complexity. This is helpful because it provides multiple entry points for researchers new to the challenge of fortifying their research. Many researchers assume that computational reproducibility is only achievable if they have a high degree of technical skill with computers, or is only necessary if their work is very computationally intensive. Karoune and Plomp give three compelling reasons why reproducibility is important for all archaeological research, and through their three levels they demonstrate how reproducibility can be accomplished with basic, non-specialized computer skills and widely used free software. They showcase exemplary work from a variety of archaeologists to show how practical and achievable reproducible research is for all archaeologists. They advocate for archaeologists to use the most widely used and supported tools and services to support their reproducible research, such as the R and Python programming languages for data analysis, and Git and GitHub for collaboration.
This paper, with its extensive appendix including thoughtful responses to frequently asked questions about reproducible research in archaeology, is likely to have a wide reach and influence, beyond previous works on this topic that have largely focused on technical details. Karoune and Plomp have provided the on-ramp for a generation of archaeologists who will find their questions about reproducible research answered here. They will also find an agreeable entry point to reproducible research in one of the three levels described by the authors. Will every archaeologist embrace this way of working? Should they? The work of Leonelli (2018) can help us anticipate the answers to these questions. Leonelli asks where the limits to reproducibility are, and how the characteristics of different ways of knowing affect the desirability of reproducibility. Leonelli's work invites us to consider that there will be archaeologists coming from different epistemic cultures for whom the motivations presented by Karoune and Plomp will not resonate. For example, archaeologists engaged in mostly hermeneutical social science and humanities research, who do little or no quantitative analysis and statistics, are unlikely to see reproducibility as meaningful or desirable for their work. We can describe these researchers as working in interpretative or constructivist epistemic cultures. In these cultures, the particulars of how an individual researcher engages with their subject are exclusive and unique, and researchers would argue that these particulars cannot be fully captured or shared in a meaningful way (Elman and Kapiszewski 2017). Here, knowledge is situational, emerging from a specific, once-off combination of people and circumstances. One example in archaeology is the chaîne opératoire approach of stone artefact analysis, which Monnier and Missal (2014:61) describe as "based upon the analyst's experience and intuition, and it is not replicable, nor quantifiable".
To make sense of this example we can draw on Galison's (1997) concept of 'image traditions' and 'logic traditions'. An image tradition is a way of knowing that is qualitative, based on composing narratives from drawings and photographs. A logic tradition is based on the use of instruments and statistical methods to collect standardised quantitative data. Chaîne opératoire approaches fall into the image tradition, along with many other ways of working in archaeology that do not generate numbers or use them to support claims about the past. Archaeologists working in a logic tradition will find reproducible research to be more meaningful than those working in an image tradition.
We should be mindful not to claim that one epistemic culture is superior to another because reproducibility is not meaningful or attainable for researchers in one culture. Such a claim would threaten the plurality that is essential for the reliability of scientific knowledge (Massimi 2022). Instead we should identify those communities in archaeology where reproducible research is both meaningful and attainable, but has not yet been widely embraced. That is where the most beneficial effects can be expected. According to Leonelli's (2018) framework, we can recognise these communities by a few basic characteristics. For example: they are doing computationally intensive archaeology, such as using or writing software to collect, simulate, analyse or visualise data; they are doing experimental archaeology; or they are making knowledge claims that are supported by tables of numeric data and data visualisations. Archaeologists whose work shares one or more of these characteristics will find the guidance provided here by Karoune and Plomp to be highly instructive and relevant, and stand to benefit the most from it.
But it is not only individual archaeological scientists that have potential to benefit from how Karoune and Plomp have lowered the barriers to reproducible research. An especially important implication of this paper is that by lowering the barriers to reproducible research, Karoune and Plomp help us all to lower barriers to participation in archaeology in general. Documenting our research transparently, and sharing our materials (such as data and code and so on) openly, can profoundly change how others can participate in archaeology. By doing this, we are enabling students and researchers elsewhere, for example in low and middle income locations, to use our materials in their teaching and learning. Other researchers and students can apply our methods to their data, and combine their data with ours to achieve syntheses beyond what a single project can do. Similarly, for archaeologists working with local, descendant or marginalized communities, the tools of reproducible research are vital for enabling community members to have full access to the archaeological process, and thus reproducibility may be considered a necessity for decolonising the discipline. Karoune and Plomp present the CARE principles (Carroll et al. 2020) to guide archaeologists in ensuring community control of data so that reproducibility can be ethically accomplished with community safety and well-being as a priority. This may have a profoundly positive impact on the demographics of archaeology, as it lowers the barriers of meaningful participation by people far beyond our immediate groups of collaborators.
Making archaeology more accessible is of critical importance in stemming the negative social impacts of pseudoarchaeologists, who often claim that archaeologists actively suppress the truth of the archaeological record through secrecy, elitism, and exclusiveness. The harm in this is twofold. First, pseudoarchaeology typically erases Indigenous heritage by claiming that the past achievements of Indigenous people were due to an ancient, extinct advanced civilization. These claims are often adopted by white supremacists to support racist and antisemitic conspiracy theories (Turner and Turner 2021), which sometimes leads to prejudice, physical violence, radicalization and extremism. A second type of harm that can come from claims of secrecy and elitism is that it drains public trust in experts, leading to science denial. This affects not only trust in archaeologists, but trust in many kinds of experts, including those working on urgent contemporary issues such as public health and climate change. Karoune and Plomp's work is important here because it provides a practical and affordable pathway for archaeologists to fight claims of secrecy and elitism by sharing their work in ways that make it possible for non-academics to inspect the analyses and logic in detail. Claims of secrecy and elitism can be easily countered by openness, transparency and reproducibility by archaeologists. This is not only useful for tackling pseudoarchaeologists, but also in enacting an ethic of care, framing members of the public as people who not only care about archaeology as part of humanity's shared heritage, but also care for the construction of reliable interpretations of the archaeological record to provide secure and authentic foundations for their social identities and relationships (Wylie et al. 2018; de la Bellacasa 2011).
By striving for reproducible research in the way described by Karoune and Plomp, we are practicing a kind of reciprocal care among ourselves as archaeologists, and between archaeologists and members of the public as two communities who care about the human past.
Karoune, E., and Plomp, E. (2022). Removing Barriers to Reproducible Research in Archaeology. Zenodo, 7320029, ver. 5 peer-reviewed and recommended by Peer Community in Archaeology. https://doi.org/10.5281/zenodo.7320029
de la Bellacasa, M. P. (2011). Matters of care in technoscience: Assembling neglected things. Social Studies of Science, 41(1), 85–106. https://doi.org/10.1177/0306312710380301
Carroll, S. R., Garba, I., Figueroa-Rodríguez, O. L., Holbrook, J., Lovett, R., Materechera, S., Parsons, M., Raseroka, K., Rodriguez-Lonebear, D., Rowe, R., Sara, R., Walker, J. D., Anderson, J., and Hudson, M. (2020). The CARE Principles for Indigenous Data Governance. Data Science Journal, 19(1), Article 1. https://doi.org/10.5334/dsj-2020-043
Elman, C., and Kapiszewski, D. (2017). Benefits and Challenges of Making Qualitative Research More Transparent. Inside Higher Ed 2017, http://web.archive.org/web/20220407064134/https://www.insidehighered.com/blogs/rethinking-research/benefits-and-challenges-making-qualitative-research-more-transparent (accessed 21 Oct, 2022).
Galison, P. (1997). Image and logic: a material culture of microphysics. Chicago (IL): University of Chicago Press.
Leonelli, S. (2018). Re-Thinking Reproducibility as a Criterion for Research Quality [preprint]. Available online: http://philsci-archive.pitt.edu/id/eprint/14352 (Accessed 21 Oct 2022).
Massimi, M. (2022). Perspectival realism. Oxford University Press.
Monnier, G. F., and Missal, K. (2014). Another Mousterian debate? Bordian facies, chaîne opératoire technocomplexes, and patterns of lithic variability in the western European Middle and Upper Pleistocene. Quaternary International, 350, 59-83. https://doi.org/10.1016/j.quaint.2014.06.053
Turner, D. D., and Turner, M. I. (2021). “I’m Not Saying It Was Aliens”: An Archaeological and Philosophical Analysis of a Conspiracy Theory. In A. Killin and S. Allen-Hermanson (Eds.), Explorations in Archaeology and Philosophy (pp. 7–24). Springer International Publishing. https://doi.org/10.1007/978-3-030-61052-4_2
Wylie, C., Neeley, K., and Ferguson, S. (2018). Beyond Technological Literacy: Open Data as Active Democratic Engagement? Digital Culture & Society, 4(2), 157–182. https://doi.org/10.14361/dcs-2018-0209
Ben Marwick (2022) Three levels of reproducible workflow remove barriers for archaeologists and increase accessibility. Peer Community in Archaeology, 100022. https://doi.org/10.24072/pci.archaeo.100022
The recommender in charge of the evaluation of the article and the reviewers declared that they have no conflict of interest (as defined in the code of conduct of PCI) with the authors or with the content of the article.
Evaluation round #2
DOI or URL of the preprint: https://doi.org/10.5281/zenodo.7320029
Version of the preprint: v2
Author's Reply, 27 Oct 2022
Decision by Ben Marwick, posted 24 Oct 2022
Thank you Emma and Esther for your thoughtful responses and diligent revisions. I have made editorial changes to the text, starting with that MS Word document, and using track changes. In brief, I have made the main focus of the paper the three levels, and moved all the Q&A to the appendix. I think this gives the paper a more coherent and compelling logical structure, one that is consistent with what most readers will be expecting. Most readers are expecting a sustained, contextualised argument in a journal article, and I believe they come to PCI expecting to find pieces that closely resemble journal articles. In this case the main argument of this paper is that approaches to reproducible research can be organised into three levels. I think this is the most original and creative contribution in this paper, and it deserves a stronger focus.
My perspective is that the Q&A content, as it is currently written, detracts from the main argument and was not effective at contextualising the main argument. Some of the Q&A is just a list of links, which is fine for a workshop handout, but I believe inconsistent with most people's expectations of a journal article. I agree that the Q&A text is relevant to the paper, so I moved it all into the appendix. I've also divided the appendix into two appendices: the Q&A and the glossary.
I've edited the main text of the paper to make it consistent with my perspective on 'high scientific quality' because 'PCI Archaeology recommends only preprints of high scientific quality that are methodologically and ethically sound.' (https://archaeo.peercommunityin.org/help/guide_for_authors). Editing this paper is challenging because the writing frequently switches between passive third person and active second person. The active second person voice is very rare in journal articles, and I'm concerned that many readers will conflate its presence here with low quality scholarship. This is because readers are not accustomed to being so directly addressed in journal articles, and I believe some will find it a bit off-putting with the gap between the author and reader so small. To be clear, I think this paper contains high quality scholarship, and I want it to have an extensive readership and impact. I think one way we can support that is to satisfy readers' basic expectations. I believe they come to PCI Archaeology expecting to read journal articles, so we should tailor our writing to meet those expectations and follow some of the conventions of journal article writing. Otherwise the reader will question the credibility and reliability of what they are reading. So that's my main motivation for editing the main text.
I've only very lightly edited the appendixes, since I think readers have different expectations of those. I found the text formatting, e.g. size, bold and italics, inconsistently applied throughout, which gives the reading a feeling of disorder. I encourage you to take a very systematic approach to using those text decorations. The shifting uses of "I", "you", and "we" are also jarring throughout the appendix. I think this could be easily fixed by replacing "I" with "you" throughout.
If you are ok with my edits and can submit a version with all the tracked changes accepted and questions responded to, I'll mark it as 'recommended' and prepare a note to appear on PCI.
Evaluation round #1
DOI or URL of the preprint: https://doi.org/10.5281/zenodo.7256954
Author's Reply, 22 Oct 2022
Decision by Ben Marwick, posted 19 Aug 2022, validated 25 Oct 2022
Dear Dr Karoune and Dr Plomp,
Thank you for submitting your pre-print for review, and for providing an opportunity for a robust and stimulating discussion about reproducibility in archaeology. I have been so inspired by similar discussions in other disciplines (e.g. handy guides for beginners such as Alston and Rick 2020 in ecology and revealing surveys of barriers such as Stodden 2010 in computer science), and I believe that essays such as the one you have written will similarly inspire and guide many archaeologists to improve the reproducibility of their work. An especially motivating detail that you mention is the importance of reproducible research for supporting sustainability, inclusiveness, and equitable access to participating in archaeological research.
Thanks also to our four reviewers, who are some of the most skilled and experienced scholars on this topic. It's an honour to have input from these researchers who have pioneered reproducibility in many areas of archaeology, and whose own compendia of code and data should be among the first things junior scholars seek out as excellent examples of how to do this (e.g. Conrad et al. 2016; 2021; Leggett 2021; 2022; Lodwick 2019).
Dr Karoune and Dr Plomp, please do carefully study the thoughtful reviews and consider editing and expanding your paper as they recommend. There are many excellent suggestions that will greatly help in upgrading your pre-print from something of a workshop handout, as it is currently, to a substantial manuscript with broad relevance to archaeologists around the world that helps to advance reproducibility in archaeology. As you do your revisions, I hope you might be able to draw relevant and diverse examples of reproducible research from this list of 250+ archaeology articles spanning 10 years that include R code and data.
A technical note: many of the resources you cite are websites without persistent identifiers, and so there is a danger of link-rot in your paper that will be frustrating for future readers. For an ephemeral workshop handout, this is expected, but for a scholarly publication I think we should invest some effort into insuring against the risk of link-rot to make the paper useful to readers long into the future. I recommend including only the most relevant and stable links in your paper, and removing those that are already out of date (I found a few that reference outdated content) or less relevant to your central claims. Then I recommend, as much as is practical, replacing links in your text with traditional in-text citations, following a widely used style such as APA. This will help readers to find the websites if there are minor changes to the URLs, which happen often. Additionally I recommend including in the reference to each website an archive URL from a service such as https://perma.cc/ or https://web.archive.org/. Then readers can still access the content even after the original website has gone.
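To illustrate the archive URL suggestion, here is a minimal Python sketch, offered only as one possible approach. It assumes the Wayback Machine address pattern of a 14-digit timestamp followed by the original URL. The function names `wayback_url` and `format_web_reference` are hypothetical helpers written for this note, not part of any library, and the sketch does not check that a snapshot actually exists, which would still need to be confirmed by hand.

```python
from datetime import datetime

# Assumed Wayback Machine address pattern:
#   https://web.archive.org/web/<YYYYMMDDhhmmss>/<original URL>
WAYBACK_PREFIX = "https://web.archive.org/web"


def wayback_url(original_url: str, captured: datetime) -> str:
    """Build the archive address for a page captured at a known time.

    Hypothetical helper: it only formats the string and does not
    contact the archive or verify that the capture exists.
    """
    timestamp = captured.strftime("%Y%m%d%H%M%S")
    return f"{WAYBACK_PREFIX}/{timestamp}/{original_url}"


def format_web_reference(authors: str, year: int, title: str,
                         original_url: str, captured: datetime) -> str:
    """Format a website reference that carries both the live and archived URL."""
    archived = wayback_url(original_url, captured)
    return (f"{authors} ({year}). {title}. {original_url} "
            f"(archived at {archived})")


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(format_web_reference(
        "Author, A.", 2022, "Example page title",
        "https://example.org/page",
        datetime(2022, 10, 21, 12, 0, 0),
    ))
```

A reference built this way keeps the original address visible for readers while giving them a fallback if the live page moves or disappears.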
Alston, J. M., and Rick, J. A. (2020). A Beginner's Guide to Conducting Reproducible Research. Bull Ecol Soc Am 102(2):e01801. https://doi.org/10.1002/bes2.1801
Conrad, C., et al. (2021). Re-Evaluating Pleistocene–Holocene Occupation of Cave Sites in North-West Thailand: New Radiocarbon and Luminescence Dating. Antiquity https://doi.org/10.15184/aqy.2021.44
Conrad, C., et al. (2016). Paleoecology and Forager Subsistence Strategies During the Pleistocene-Holocene Transition: A Reinvestigation of the Zooarchaeological Assemblage from Spirit Cave, Mae Hong Son Province, Thailand. Asian Perspectives 55(1). https://www.jstor.org/stable/26357698
Leggett, S. (2022). A Hierarchical Meta-Analytical Approach to Western European Dietary Transitions in the First Millennium AD. European Journal of Archaeology, 1-21. https://doi.org/10.1017/eaa.2022.23
Leggett, S. (2021). Migration and cultural integration in the early medieval cemetery of Finglesham, Kent, through stable isotopes. Archaeol Anthropol Sci 13, 1. https://doi.org/10.1007/s12520-021-01429-7
Lodwick, L. (2019). Sowing the Seeds of Future Research: Data Sharing, Citation and Reuse in Archaeobotany. Open Quaternary, 5(1), 7. http://doi.org/10.5334/oq.62
Stodden, V. (2010). The Scientific Method in Practice: Reproducibility in the Computational Sciences. MIT Sloan Research Paper No. 4773-10. Available at SSRN: https://ssrn.com/abstract=1550193 or http://dx.doi.org/10.2139/ssrn.1550193