Artificial Intelligence for History Data
AI is bringing changes to many things, and the study of history is one of them. Running Reality is taking a balanced approach to AI as a tool to help us realize our vision of a digital time machine. Protecting your trust in our platform is critical for us. Therefore, we are using AI for select tasks where we can monitor the data quality, bring you more data than ever, and visualize history in more immersive ways.
Overview
AI tools are becoming ubiquitous in history education and research. Just a few years ago, AI for history meant niche research tools for Natural Language Processing (NLP) or image analysis. The new "transformer" models, now known collectively as "GenAI" or Generative AI, changed all that.
We are working to build an immersive time machine experience and keep that free for students, educators, and history enthusiasts.
The internet has now been flooded with what people are calling "AI slop" including images, articles, and videos aimed at history enthusiasts. Want a video of walking down a street in ancient Egypt? Want to insert yourself into a historical photo of Julius Caesar? It is just a click away. Now, documentaries and video games and pop movies have been recreating history for some time, and with a wide variation in quality and historical accuracy. Just because the movie "300" had only a loose connection to the real history doesn't mean that all movies are inaccurate. So, what does it look like to take these new tools to make compelling next-generation historical experiences that are also carefully accurate?
Here are some of our basic assumptions:
- AI is here to stay and will become a better tool over time.
- People's expectations for what immersive history is will rise.
- Good history tools can be built to meet those expectations.
| Data | Task | AI Tools | Output |
|---|---|---|---|
| Historical Text | Convert narrative text into structured data. | LLMs: OpenAI GPT, Anthropic Claude, Google Gemini | Factoids |
| Historical Maps | Identify polygonal map features. | Facebook SAM2 | 2D Features, Factoids |
| Historical Photos | Convert photos of buildings and people into 3D models. | Google Gemini (Nano Banana), Meshy.ai | 3D Models |
Documenting "Why"
Have you seen fully immersive 3D reconstructions of ancient cities on the internet? There are some really amazing and well-researched videos on YouTube. However, social media is also now being flooded with easy-to-make Generative AI videos purporting to be ancient scenes. These Generative AI videos are an extraordinary technical achievement and improve in their accuracy with each passing month. However, even if a video is fairly accurate, it offers no transparency, citation, or auditability. How can we keep up with Generative AI and hold to our principles of accuracy?
A fanciful view of the waterfront of Alexandria, Virginia, USA in 1780 as visualized by OpenAI's ChatGPT.
The central challenge for us is
Transformation versus Inference
Running Reality has always been an inference engine, since the first version of the engine in 1997. The history engine takes the factoids and is authorized to make inferences about movements, borders, populations, etc. in between those factoids. The distinction between factoids and inference has always been clear and is fundamental to the idea of the factoid. Many design choices about factoids have been made specifically to minimize any inference inherent to a factoid. AI is a more advanced form of inference and will be used in a clear way to make inferences better and richer.
This distinction between factoids and inference is central. For factoids, AI (and humans) may transform data from citable sources into the factoid format, but the result must be verifiable. For inference between factoids, the engine may use algorithms or AI. If either a factoid or an inference is historically inaccurate, either because of new discoveries or better research, then the solution is more, better factoids to guide the engine.
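The split between cited factoids and engine inference can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Factoid` and `Estimate` classes, the `estimate_at` function, and the use of linear interpolation are all hypothetical and are not Running Reality's actual schema or inference method. The numbers in the usage note are made-up placeholders, not historical data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Factoid:
    """A cited data point (hypothetical structure for illustration)."""
    year: int
    value: float
    citation: str


@dataclass(frozen=True)
class Estimate:
    """A value returned by the engine, flagged as cited or inferred."""
    year: int
    value: float
    inferred: bool
    basis: tuple  # citations of the factoids the value rests on


def estimate_at(year, factoids):
    """Return the cited value if one exists, else interpolate between neighbors."""
    ordered = sorted(factoids, key=lambda f: f.year)
    # An exact match is a factoid, so nothing is inferred.
    for f in ordered:
        if f.year == year:
            return Estimate(year, f.value, False, (f.citation,))
    # Otherwise infer linearly between the two surrounding factoids
    # (linear interpolation is an assumption made for this sketch).
    for before, after in zip(ordered, ordered[1:]):
        if before.year < year < after.year:
            t = (year - before.year) / (after.year - before.year)
            value = before.value + t * (after.value - before.value)
            return Estimate(year, value, True, (before.citation, after.citation))
    raise ValueError("year is outside the range covered by the factoids")
```

For example, with placeholder factoids `Factoid(1780, 2000.0, "Source A")` and `Factoid(1790, 3000.0, "Source B")`, a query for 1785 returns an estimate marked `inferred=True` whose `basis` names both sources. Adding a new cited factoid for 1785 would replace that inference with a cited value, which mirrors the idea that the fix for a bad inference is more, better factoids.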
Historical Maps
Historical Photos
Running Reality believes it is critical to make available high-quality 3D visualizations that balance the need for an immersive historical experience with the need for research rigor. We will strive to adhere to the principles of the London Charter while carefully using Generative AI. We will make sure to extend the citation and sourcing approach we have used for other data into the 3D world. We will be respectful of people's trust.
Running Reality is taking multiple approaches to representing uncertainty in its visualizations. Our migration to a more immersive 3D model is underway and we are doing it in steps to get feedback on each step.
We are being careful in implementing this system to factor in the following:
- The same history engine inference system must be able to interpolate and extrapolate data over time.
- The source of the data must remain transparent and auditable, even if the structure contains a mixture of provenance.
- Data may be uncertain and composite, where height, material, architectural style, and condition may be separately cited factoids.
- Data may be linked, such as interior spaces being linkable to people or businesses.
- Generative AI is making detailed 3D models more available, but with an extra burden of citation.
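One way to picture the considerations above is a structure whose attributes each carry their own citation, so that mixed provenance stays auditable. The sketch below is hypothetical: the `Attribute` and `Structure` classes, the field names, the file name `warehouse.glb`, and the placeholder citation strings are assumptions made for illustration, not Running Reality's data model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Attribute:
    """One separately cited property of a structure (hypothetical schema)."""
    name: str
    value: str
    citation: Optional[str]     # None marks a value with no source yet
    ai_generated: bool = False  # True when a generative tool produced the value


@dataclass
class Structure:
    """A building described by a mix of independently sourced attributes."""
    name: str
    attributes: list[Attribute] = field(default_factory=list)

    def provenance(self) -> dict[str, Optional[str]]:
        """Map each attribute name to its citation, for auditing."""
        return {a.name: a.citation for a in self.attributes}

    def needs_citation(self) -> list[str]:
        """Names of attributes that cannot yet be traced to a source."""
        return [a.name for a in self.attributes if a.citation is None]

    def ai_generated_parts(self) -> list[str]:
        """Names of attributes produced with generative AI."""
        return [a.name for a in self.attributes if a.ai_generated]


# Placeholder values and citations, for illustration only.
warehouse = Structure("Example warehouse", [
    Attribute("material", "wood", "excavation report (placeholder)"),
    Attribute("3d_model", "warehouse.glb", "artist's painting (placeholder)",
              ai_generated=True),
    Attribute("condition", "unknown", None),
])
```

Here `warehouse.needs_citation()` lists the attributes that still lack a source, and `warehouse.ai_generated_parts()` shows which parts carry the extra citation burden that comes with generative AI.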
Using generative AI with historical drawings, illustrations, and photographs offers a tremendous opportunity. With generative AI, we can quickly build 3D models that are at the same level of historical accuracy as the source material. This can strongly enhance the sense of immersion in history while introducing a defined, transparent, and traceable level of inference.
As an example, here is our basic 2D and basic 3D augmented reality view of the Carlyle Warehouse on the Alexandria, Virginia, waterfront in about 1780. The building's dimensions are known from excavation, as is the basic wooden material. The site of the building is now roughly located under the present-day Indigo Hotel. However, rendering the warehouse as a plain block of wood with those dimensions is not very compelling.
Our basic 2D map of the waterfront of Alexandria, VA.
The basic map data rendering in 3D augmented reality while standing in Alexandria, VA.
There is an artist's rendering of the Alexandria waterfront representing this era that hangs in the Alexandria Archaeological Museum two blocks away from the warehouse site. The artist has attempted to properly represent Alexandria at this time, and though there is some artistic interpretation, it is considered sufficiently accurate to hang in the museum to give people an immersive understanding of Alexandria. It is understood by visitors not to be data, but art serving a purpose.
We used generative AI to create a 3D model that follows from the artist's impression. This is similar to how such art was used to train the generative AI model used to create the fanciful scene at the top of this article. By asking the tool to perform the narrower task of creating this single warehouse from a specific, citable reference work, we have constrained the inferences it makes. Further, the 3D model can now be shown in Running Reality alongside other citable data points. When clicking the warehouse, the dates, locations, dimensions, and other data have their citations, and the 3D model has a citation to the artist's work.
An artist's painting of the Alexandria waterfront that hangs in the Alexandria Archaeological Museum.
The artist's impression of the warehouse was used to guide a generative AI tool to create a 3D model.
The updated 3D model shown in 3D augmented reality gives a better sense of what the waterfront was like.
Historical Text
Factoids generated by or with assistance from Artificial Intelligence (AI) are allowed, with the following conditions:
- A human submits the factoids. A human is always accountable for the correctness of the data and the citation.
- A human reviews the factoids. The human review of a factoid submission is a safety net and the submitter is the one to attest to the factoid's accuracy.
- The AI was used as a data transformer. If an AI transforms a Wikipedia page, then the resulting factoids have a citation and can be verified against the page.
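The three conditions above could be checked mechanically before a human reviewer ever sees a submission. The following is a minimal sketch, assuming a hypothetical `Submission` record and a hypothetical rule that each factoid carries a `supporting_quote` that must appear in the cited source text. None of these names come from Running Reality's actual submission process, and a check like this supplements human review rather than replacing it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    """A factoid submission (hypothetical fields for illustration)."""
    statement: str
    citation: str          # the specific source that was transformed
    supporting_quote: str  # passage in the source backing the statement (assumed field)
    submitted_by: str      # the accountable human
    reviewed_by: str       # the human reviewer
    ai_assisted: bool = False


def _normalize(text):
    """Lowercase and collapse whitespace so quotes match across line breaks."""
    return " ".join(text.lower().split())


def check_submission(sub, source_text):
    """Return a list of problems; an empty list means all checks passed."""
    problems = []
    if not sub.submitted_by.strip():
        problems.append("no accountable human submitter")
    if not sub.reviewed_by.strip():
        problems.append("no human reviewer")
    if not sub.citation.strip():
        problems.append("no citation to a specific source")
    quote = _normalize(sub.supporting_quote)
    if not quote or quote not in _normalize(source_text):
        problems.append("supporting quote not found in the cited source")
    return problems
```

A factoid produced from an AI's training data alone would fail the last two checks, because there is no specific source text to compare against.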
AI is now part of the Running Reality tool set. Running Reality is taking a balanced approach to AI. Generative AI is advancing rapidly, and our tests of its accuracy show it has progressed to the point where it can be used alongside our other tools. Just as a human would take a citable source and transform it into factoids, so can an AI.
All factoid data must be verifiable. Humans make mistakes with factoids and so will AI, but the goal is to keep such mistakes to a minimum and to have an open, auditable factoid trail so corrections can be made later. If either a human or AI-generated factoid is historically inaccurate, either because of new discoveries or better research, then the solution is more, better factoids to guide the engine.
Today, do not rely on AI to generate factoids from its training data; use it only to transform a specific source. This is because the specific item(s) from the AI’s training data that resulted in a given output factoid can’t be cited or audited. If this changes with future iterations of AI, this policy may evolve. AI Large Language Models have been trained on Wikipedia and thousands of history books, so their outputs may be correct and may be based on reliable sources. However, these factoids do not have a specific citation that can be used for verification.
- Example of transforming a source:
Generate factoids from the following article about the Battle of XYZ, with specific factoids for all people and military units mentioned in the article. [article text attached]
- Example of reliance on training data:
Who was at the Battle of XYZ? Create specific factoids for all people and military units who were there.
The London Charter for the Computer-based Visualisation of Cultural Heritage is the leading set of principles for digital history projects to ensure they follow sufficient methodological rigor. Running Reality attempts to follow these principles.
Conclusion
The rise of generative AI historical imagery, video, and 3D models is presenting researchers, history students, and enthusiasts with a challenge. We believe that using the principles of the London Charter will provide confidence in Running Reality and the historical visualizations it creates. This doesn't mean avoiding generative AI altogether, but it does mean integrating it carefully into an existing data-driven inference engine already tested and trusted for accuracy and transparency.