The paper outlines the ClaReNet project's exploration of computer-based methods for classifying Celtic coin series, specifically focusing on a hoard from Jersey [1]. The authors collaborated with Jersey Heritage and numismatists, utilising a large dataset of coin images. The process involves stages such as pre-sorting, size-based sorting, class/type identification, and die studies. They employed IT methods, including object detection and unsupervised learning, followed by supervised learning for data refinement, while collaboration with numismatic experts ensured data quality. The study highlighted challenges in classifying coins, suggesting techniques such as image matching alongside convolutional neural networks (CNNs). The results demonstrate the efficacy of semi-automatic processes in coin classification, emphasising the importance of human-computer collaboration for successful outcomes.
Overall, this is a good paper, showing how we as archaeologists and numismatists can use existing tools and fine-tune them for our purposes, without the need for huge domain-specific datasets. This research and related papers show how we can deal more effectively with the ever-growing volumes of data we handle, saving time on monotonous and labour-intensive tasks and leaving us more time for the big picture. An important strength of the work is the public software repository and dataset provided. The paper is well written, and a number of images illustrate the methodology as well as the objects used.
Reference
[1] Deligio, C., Tolle, K., and Wigg-Wolf, D. (2024). Supporting the analysis of a large coin hoard with AI-based methods. Zenodo, 8301464, ver. 4 peer-reviewed and recommended by Peer Community in Archaeology. https://doi.org/10.5281/zenodo.8301464
DOI or URL of the preprint: https://doi.org/10.5281/zenodo.8301561
Version of the preprint: 1
Dear reviewers,
Thank you for reviewing our paper. We really appreciate the comments and also understand that we may not have got our message across to our target audience.
We have rewritten a large part of the paper and also added important information to clarify our message. Our paper is aimed at numismatists who are dealing with similar problems. With our paper we did not try to develop new AI models or achieve higher benchmark scores; we wanted to show how to take existing tools from IT and apply them within the numismatic workflow. Our pipeline illustrates different tasks in a numismatist's journey: sorting by size, by class, or by die, or even starting an initial sort of an unknown dataset.
Thank you again.
Dear Authors,
Your paper describes the ClaReNet project, which analysed the extensive Le Câtillon II Celtic coin hoard, employing object recognition and scale analysis to group coins by size in the absence of detailed information. Results were presented at CAA 2022 and CAA 2023, showcasing the use of unsupervised methods and evaluating findings against expert classifications. The project also involved refining expert classifications and examining specific coin classes for die variations.
I enjoyed reading your paper, and I think it's a nice addition to the ongoing discussions around this topic. However, based on the reviewers' feedback, I feel that this paper needs major revisions to be suitable for publication.
Please check all the issues raised by the reviewers and either revise the paper or address them in your response. I hope you are not disheartened by these comments, as the research seems promising and could definitely contribute to the field, if well described and analysed.
Kind regards,
Alex Brandsen.
The ClaReNet project is dedicated to exploring computer-based methods applied to the analysis of three distinct Celtic coin series. One of these series comprises the staters attributed to the Coriosolitae, found in the Le Câtillon II hoard in Jersey in 2012.
The primary objective of the ClaReNet project was twofold. First, it aimed to support the numismatic process by leveraging machine learning-based methods. The project addressed various aspects, including pre-sorting, classifying, and recognizing different dies. A significant emphasis was placed on employing unsupervised methods to tackle the challenge of working with an unknown dataset, devoid of any information beyond the images themselves. Second, the project sought to compare its results with the classification created by the Jersey Heritage team and actively engaged numismatic experts in the evaluation process.
This paper not only presents the outcomes of the ClaReNet project but also showcases the tools, visualizations, and extensions developed during the project that proved to be valuable for facilitating communication with numismatists and integrating their expert opinions into the analysis.
The paper is well written, and a number of images illustrate the methodology applied as well as the objects used.
The paper does not provide a comprehensive state of the art, neither on coin image analysis nor on the AI-based methodology. While the paper covers important work on coin analysis, the motivation from both a numismatist's and a computer-vision point of view is not provided.
Furthermore, the novelty of the contribution needs clarification: how does numismatic research benefit from these findings? A main part of the paper is dedicated to (un/supervised) learning, yet the novelty of the contribution for computer vision is not clear.
The acquisition of the coin data is not clearly described: how was the data acquired? It seems that numismatic standards for image acquisition (e.g. uniform background or illumination) were not applied; why is that? How is the data specified, and is there something unique or special about it?
Quantitative measures and outputs are scarce; a comparison with the state of the art or an ablation study is missing. How do you define accuracy, and how is it measured?
The structure of the paper needs improvement: a clear problem formulation and motivation are missing.
Why do you need object recognition for the scale bar; is the aim size estimation? Or is the scale used while counting the coins?
Minor: naming the concrete implementation and acknowledging institutions within the description of the methodology is not appropriate.
For the given reasons, I recommend rejecting the paper.
This paper aims to employ various machine-learning methods for the detection of coin regions in given images, their classification, and die recognition. The work was based on images of a discovered hoard of Celtic coins, specifically the stater coins it contains.
The effort of the authors to bring this work into this publication is appreciated. The machine-learning methods were used effectively in the addressed tasks. An important strength of the work is the provided public software repository and the dataset. The primary drawback of the paper is its significantly low readability, which reveals some confusion in the presentation of computer science concepts. For this reason, I think this paper requires a significant revision.
Some notes are as follows:
1) Detection of coin region and its scale:
This is formulated as a 2-class object detection problem, where the classes are the "coin" and the "scale bar", and these two object classes are detected using a neural network. It was very confusing at the beginning what is meant by 'scale recognition', because it sounds like an automatic calculation of the size of the coin; however, what is aimed at is simply detecting the black-coloured bar region in a bounding box, and this object is called 'scale'. I suggest that the authors use a less confusing notation, perhaps "scale bar". Also, the variation in scale-bar appearance across the training and test sets should be clearly described: how many different scales are there, and what do the annotations for scale-bar recognition look like? I recommend presenting a block diagram of the operational steps to give the reader a better understanding of the approach. How does the detection of the scale bar yield the calculation of the coin size (this should be visible in the block diagram as well)? Also, what are the individual performances obtained for coin region detection and for scale-bar detection?
The model was trained on 100 training images and evaluated on a 25-image test set, after which it was used to determine the size of a wider dataset of coins; however, it is not mentioned which dataset the trained model was applied to afterwards. At the beginning of the paper (line 49), the focus was said to be the staters, but the scale differences in this analysis suggest it was executed on other data.
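For illustration only: assuming the detector returns bounding boxes in pixel coordinates and the physical length of the scale bar is known (the 20 mm value below is a hypothetical assumption, not a value taken from the paper), the coin size could be derived from the two detections roughly as follows:

```python
def coin_diameter_mm(coin_box, scale_box, scale_mm=20.0):
    """Estimate a coin's diameter from two detected bounding boxes.

    coin_box, scale_box: (x0, y0, x1, y1) in pixels.
    scale_mm: assumed real-world length of the scale bar (hypothetical value).
    """
    scale_px = scale_box[2] - scale_box[0]        # scale-bar length in pixels
    mm_per_px = scale_mm / scale_px               # pixel-to-millimetre factor
    coin_px = max(coin_box[2] - coin_box[0],      # take the larger box side
                  coin_box[3] - coin_box[1])      # as the coin's diameter
    return coin_px * mm_per_px

# A 440 px wide coin next to a 400 px scale bar representing 20 mm -> 22.0 mm.
print(coin_diameter_mm((100, 100, 540, 540), (50, 900, 450, 920)))
```

Spelling out this conversion step (or an equivalent one) in the text or in a block diagram would answer the question of how scale-bar detection leads to coin sizes.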
2) Clustering coins
- This section starts by mentioning that the stater group was selected as the dataset. In Figure 8 its connection with the previous section can be seen, but from the beginning it is very unclear how this analysis connects with the findings of the prior section; e.g., it is not clear whether the group of coins examined in this section was detected as staters in the previous section, or whether it is another dataset initially labelled by the numismatists. If Figure 8 and this divide-and-conquer approach had been presented earlier in the paper (even before object detection), the readability of the paper would have improved considerably.
- The number of clusters was chosen as k = 100; however, the results then show 25 clusters. What is the reason for this discrepancy?
3) Die study
- What is the name of the parameter with value 0.3 mentioned in line 360? I am guessing it is some kind of threshold for cutting dendrogram branches, but in a scientific paper such information should always be given in the text where the value first appears.
- which number of clusters was chosen for the DeepCluster application for the die study?
- Please state in the caption of Figure 15 which features were used to create this dendrogram (ORB or Method 2).
- Which distance metric was used to compute the similarity between the image features? In a scientific paper, this should be stated in the text.
I think the authors are using the wrong taxonomy for the adopted methodologies. It is mentioned on line 63 that the proposed approach involves three steps, namely 1) object detection, 2) unsupervised learning, and then 3) supervised learning. It would be better to specify "unsupervised learning of coin classes"; otherwise it does not look correct to categorise the approaches this way, since object detection was also done by supervised learning.
- I wonder whether the authors would consider publishing the die-study dataset and annotations, as there is a significant lack of research on this problem due to the lack of annotated datasets. This would certainly be a significant contribution.
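To illustrate the kind of information requested in the points above: the following is a minimal sketch of a die-study-style grouping, assuming binary descriptors (tiny stand-ins for real ORB descriptors), pairwise normalised Hamming distances, and a dendrogram cut at a distance threshold of 0.3. All values and choices here are illustrative assumptions, not the authors' actual settings.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import pdist

# Hypothetical 8-bit binary descriptors, one row per coin image
# (stand-ins for real ORB descriptors, which are 256-bit).
descriptors = np.array([
    [0, 0, 0, 0, 1, 1, 1, 1],   # coin A
    [0, 0, 0, 1, 1, 1, 1, 1],   # coin A' (near-duplicate of A)
    [1, 1, 1, 1, 0, 0, 0, 0],   # coin B
    [1, 1, 1, 0, 0, 0, 0, 0],   # coin B' (near-duplicate of B)
], dtype=bool)

# Normalised Hamming distance between every pair of descriptors.
dist = pdist(descriptors, metric="hamming")

# Average-linkage dendrogram, cut at distance threshold 0.3: every
# branch whose merge distance exceeds 0.3 becomes its own group.
tree = linkage(dist, method="average")
labels = fcluster(tree, t=0.3, criterion="distance")
print(labels)  # two groups: A/A' together, B/B' together
```

Stating these three choices explicitly in the paper (the feature type, the distance metric, and the meaning of the 0.3 threshold) would make the die-study section reproducible.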
Abstract:
From the following sentence, it is really not possible to understand what exactly is accomplished in this work: "First, we separated the dataset into groups of coins of different sizes using object recognition combined with the scale contained in the images. The main approach was to treat the coins independently of the underlying classification and analyze how an unsupervised method could group them." --> How does the first operation connect to the second? Please mention that the classification was conducted on the coin images obtained in the first operation, and explain why that prior step was needed for the second.