Recently, Google DeepMind published a paper in the journal Nature introducing Aeneas, an artificial intelligence model designed to change the way historians study ancient inscriptions. The tool helps researchers interpret, attribute, and restore fragmentary ancient texts.
In ancient Rome, writing was almost ubiquitous, found everywhere from imperial monuments to everyday objects. Covering everything from political graffiti to love poems, business transactions and even birthday invitations, these inscriptions provide modern historians with a rich window into everyday life in the Roman world. However, most of the approximately 1,500 new inscriptions discovered each year have suffered from the ravages of time, weathering, or human damage. Without contextual information, restoration, dating and provenance are almost impossible tasks.
Traditionally, historians have relied on personal expertise and specialized resources to find "similar texts", i.e., other inscriptions with similarities in wording, syntax, or provenance. Aeneas dramatically speeds up this time-consuming and labor-intensive process. It can search thousands of Latin inscriptions in seconds, retrieving textual and contextual similarities that support historians in their interpretative work.
Aeneas was developed by Google DeepMind and the University of Nottingham, in collaboration with researchers at the University of Warwick, the University of Oxford and the Athens University of Economics and Business. The project is not limited to Latin; its approach can also be applied to other ancient languages, scripts and media such as papyri and coins, with the potential to connect a wider range of historical evidence. To promote academic research, the team has made a free interactive version of Aeneas available at predictingthepast.com and has open-sourced its code and dataset.
Core competencies of Aeneas
Aeneas is named after the hero of Greco-Roman mythology. It builds on Ithaca, an earlier model for restoring, dating and locating ancient Greek inscriptions. But Aeneas goes further: it aims to help historians contextualize inscriptions, give meaning to isolated fragments, and ultimately piece together a more complete understanding of ancient history.
Its core competencies include:
- Similar text search: Using a technique called embeddings, Aeneas encodes the textual and contextual information of each inscription (e.g. language, place of origin, date) into a unique "historical fingerprint". This makes it possible to recognize deep connections across the vast number of Latin inscriptions, helping historians place individual inscriptions in a broader historical context.
- Multimodal input processing: Aeneas is the first model capable of using multimodal inputs (i.e., text and images) to determine the geographic origin of inscriptions. This takes its analysis beyond the limits of pure text.
- Restoration of gaps of unknown length: Aeneas is the first model to restore severely damaged text effectively when the number of missing characters is unknown. This makes it a more flexible and powerful tool for material in a poor state of preservation.
- State-of-the-art performance: Aeneas sets new benchmarks both in restoring damaged texts and in predicting when and where they were written.
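To make the "historical fingerprint" idea concrete, the sketch below ranks a few inscriptions by cosine similarity between embedding vectors. It is only an illustration of embedding-based retrieval in general, not the Aeneas implementation: the three-dimensional vectors, the inscription identifiers and the function names are all hypothetical, and the article does not say which similarity measure Aeneas uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query, corpus, k=3):
    """Return the k (identifier, score) pairs with the highest similarity."""
    scored = [(ident, cosine_similarity(query, vec)) for ident, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy "fingerprints"; real embeddings have far more dimensions.
corpus = {
    "inscription_a": [0.9, 0.1, 0.0],
    "inscription_b": [0.0, 1.0, 0.2],
    "inscription_c": [0.8, 0.2, 0.1],
}
query = [1.0, 0.1, 0.0]  # embedding of the fragment being studied (made up)

ranked = most_similar(query, corpus, k=2)
for ident, score in ranked:
    print(f"{ident}: {score:.3f}")
```

In this toy data the query points in nearly the same direction as inscription_a and inscription_c, so those two are returned ahead of inscription_b.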
Principle of operation and performance
Aeneas is a multimodal generative neural network. The research team first integrated three major inscription databases (EDR, EDH, and EDCS-ELT) to create a machine-readable dataset, the Latin Epigraphic Dataset (LED), containing over 176,000 Latin inscriptions.
The model uses a Transformer-based decoder to process the textual input and a specialized network for character restoration and dating. When performing geographic attribution, the model analyzes both the text and the image of the inscription.
In terms of performance, Aeneas is outstanding. The "historical fingerprints" it generates separate inscriptions by period much more clearly than those produced by general-purpose Latin large language models.
When restoring gaps of up to 10 characters, Aeneas includes the correct restoration among its top 20 candidates 73% of the time; even in the challenging task of unknown gap lengths, accuracy remains at 58%. By also using visual data, the model attributes an inscription to one of 62 ancient Roman provinces with 72% accuracy and keeps the dating error within 13 years.
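For readers unfamiliar with the metric, a top-20 accuracy figure is normally read as the share of test cases in which the true answer appears among the model's 20 highest-ranked candidates. The sketch below computes that quantity on made-up data; the candidate lists, the ground-truth strings and the function name are hypothetical, and the evaluation protocol in the paper may differ in its details.

```python
def top_k_accuracy(ranked_candidates, truths, k=20):
    """Fraction of cases whose true answer is among the first k candidates."""
    if len(ranked_candidates) != len(truths):
        raise ValueError("need exactly one candidate list per ground-truth answer")
    if not truths:
        return 0.0
    hits = sum(
        1
        for candidates, truth in zip(ranked_candidates, truths)
        if truth in candidates[:k]
    )
    return hits / len(truths)


# Hypothetical restorations, best guess first, for four damaged passages.
predictions = [
    ["senatus", "senator", "senatui"],
    ["populus", "populo"],
    ["caesar", "caesari"],
    ["augusto", "augusti"],
]
truths = ["senator", "populusque", "caesar", "augusti"]

acc_top1 = top_k_accuracy(predictions, truths, k=1)
acc_top2 = top_k_accuracy(predictions, truths, k=2)
print(acc_top1, acc_top2)
```

Widening k can only keep or raise the score, which is why a top-20 figure should not be compared directly with a top-1 figure.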
Providing new perspectives to the historical debate
To test Aeneas in a practical research setting, the team used it to analyze one of Rome's most famous inscriptions: the Res Gestae Divi Augusti ("Deeds of the Divine Augustus"). Written in the first person by the emperor Augustus, the inscription has an exact date that has long been a point of contention among historians.
Instead of giving a single fixed date, Aeneas generated a detailed probability distribution. The results show two distinct peaks: a smaller peak between 10 and 1 B.C., and a larger, higher-confidence peak between A.D. 10 and 20. This quantitative result closely mirrors the two dominant dating hypotheses in academia.
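A distribution over dates carries more information than a single estimate, because several plausible periods can be read off it at once. The sketch below finds the local peaks in a small, invented probability table over decade bins. The bin labels and probabilities are hypothetical values chosen to mimic a two-peak shape; they are not Aeneas output.

```python
def find_peaks(probs):
    """Indices whose value is strictly greater than every existing neighbor."""
    peaks = []
    for i, p in enumerate(probs):
        left = probs[i - 1] if i > 0 else float("-inf")
        right = probs[i + 1] if i < len(probs) - 1 else float("-inf")
        if p > left and p > right:
            peaks.append(i)
    return peaks


# Hypothetical decade bins with made-up probabilities that sum to 1.
labels = [
    "30-21 B.C.", "20-11 B.C.", "10-1 B.C.",
    "A.D. 1-10", "A.D. 11-20", "A.D. 21-30", "A.D. 31-40",
]
probs = [0.02, 0.05, 0.20, 0.08, 0.45, 0.15, 0.05]

peak_labels = [labels[i] for i in find_peaks(probs)]
most_likely = labels[max(range(len(probs)), key=probs.__getitem__)]
print(peak_labels, most_likely)
```

Reporting every peak, rather than only the most likely bin, keeps a competing hypothesis visible instead of collapsing it into one date.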
Aeneas's predictions are based on subtle linguistic features and historical markers in the text, such as official titles and monuments. By turning the dating problem into a probabilistic estimate based on linguistic and contextual data, the model offers a new quantitative approach to long-standing historical debates.
Promoting human-AI collaboration in historical research
In a large-scale collaborative study between historians and AI, 23 experts in epigraphy were invited to use Aeneas to process a batch of texts.
The evaluation showed that historians' efficiency and accuracy improved significantly when they used the contextual information Aeneas provided (e.g., similar texts) in conjunction with its predictions. One of the historians involved in the study stated anonymously, "The similar text Aeneas found completely changed my view of this inscription. The details it noted were decisive for the restoration and dating of the text."
By combining expert knowledge with machine learning, Aeneas is designed to fit into historians' existing workflows, offering new possibilities for connecting with humanity's past in an interpretable, collaborative way.