This repository was archived by the owner on Feb 1, 2026. It is now read-only.

CVidalG/workshop-TUMO2025


TUMO Workshop - Preserving the Past with AI

This repository contains the main code and tasks for the TUMO workshop "Preserving the Past with AI", led by Chahan Vidal-Gorène (Calfa) and Baptiste Queuche (Calfa), in partnership with the National Library of Armenia.

Case Study: Document enhancement and post-processing

The goal is to process newspaper digitizations produced by the National Library: quality assessment, image enhancement, and document classification.

Course objective 1: explore and discover AI techniques with Python

Develop, train, and evaluate a computer vision model from scratch: build datasets, install packages, define annotation strategies, and practice prompt engineering for document analysis and heritage materials (YOLO, Label Studio).

Course objective 2: project development experience

Develop an AI-based program for the National Library of Armenia using Streamlit, YOLO, and the OpenAI API.

Installation and running

git clone https://github.com/CVidalG/workshop-TUMO2025.git
cd workshop-TUMO2025/app/
python3 -m venv app-enhancement
source app-enhancement/bin/activate
pip install numpy opencv-python Pillow streamlit ultralytics openai pytesseract transformers

Make sure tesseract-ocr is installed on your computer: pytesseract is only a Python wrapper and needs the Tesseract executable to be available.

Then start the app:

streamlit run main.py
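Before launching the app, the Tesseract prerequisite can be checked with a few lines of Python. This is a minimal sketch and not part of the workshop code: the helper names `find_tesseract` and `tesseract_version` are hypothetical, and it assumes the executable is called `tesseract` and is reachable through `PATH`.

```python
import shutil
import subprocess
from typing import Optional


def find_tesseract(binary_name: str = "tesseract") -> Optional[str]:
    """Return the full path of the executable, or None if it is not on PATH."""
    return shutil.which(binary_name)


def tesseract_version(path: str) -> str:
    """Return the first line printed by `<path> --version`."""
    result = subprocess.run(
        [path, "--version"], capture_output=True, text=True, check=False
    )
    # Depending on the Tesseract release, the version banner goes to
    # stdout or stderr, so fall back to stderr when stdout is empty.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else "unknown version"


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract not found on PATH: install tesseract-ocr first")
    else:
        print(f"found {path} ({tesseract_version(path)})")
```

If the script reports that the executable is missing, install tesseract-ocr with your system's package manager and run the check again.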

Models and OpenAI API

You can change the models and set your OpenAI API key in config.py. The API key is only needed for the LLM part, which is optional.
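As an illustration of how such settings could be organised, here is a minimal sketch. It is not the actual content of config.py: the names `AppConfig`, `load_config`, `YOLO_WEIGHTS` and the placeholder weights path are all hypothetical. It also reads the key from the `OPENAI_API_KEY` environment variable rather than hard-coding it, which is a different approach from the one described above but keeps the key out of version control.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class AppConfig:
    yolo_weights: str               # path to the YOLO weights file (hypothetical setting)
    openai_api_key: Optional[str]   # None means the optional LLM part is disabled

    @property
    def llm_enabled(self) -> bool:
        """True only when a non-empty API key is available."""
        return bool(self.openai_api_key)


def load_config(env: Mapping[str, str] = os.environ) -> AppConfig:
    """Build the configuration from environment variables.

    OPENAI_API_KEY is the variable the openai package reads by default;
    YOLO_WEIGHTS and its default value are assumptions made for this sketch.
    """
    key = env.get("OPENAI_API_KEY", "").strip() or None
    weights = env.get("YOLO_WEIGHTS", "weights/placeholder.pt")
    return AppConfig(yolo_weights=weights, openai_api_key=key)


if __name__ == "__main__":
    config = load_config()
    print("YOLO weights:", config.yolo_weights)
    print("LLM part enabled:", config.llm_enabled)
```

With this layout the app could skip the LLM step whenever `llm_enabled` is false, which matches the optional status of the API key.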

How to cite

To cite this work, you can use:

TUMO Students. (2025). Preserving the Past with AI [GitHub repository]. Supervised by Chahan Vidal-Gorène & Baptiste Queuche, Calfa. https://github.com/CVidalG/workshop-TUMO2025.git

Or use the following BibTeX entry:

@misc{abc2025project,
  author       = {TUMO Students},
  title        = {Preserving the Past with AI},
  year         = {2025},
  howpublished = {\url{https://github.com/CVidalG/workshop-TUMO2025.git}},
  note         = {Supervised by Chahan Vidal-Gorène and Baptiste Queuche, Calfa},
}

The full pipeline and results are described in the Bulletin of Armenian Libraries. To cite this work, please use the following entry:

@article{tumo-workshop-banber2026,
author = {Grigoryan, Alvard and Yeghiazaryan, Aren and Ispiryan, Davit and Meliksetyan, Davit and Navasardyan, Davit and Nazluxanyan, Davit and Shahnazaryan, Hakob and Ananikyan, Levon and Aslanyan, Mari and Katvalyan, Maria and Arshakyan, Milena and Khosrovyan, Sona and Saghatelyan, Suren and Karapetyan, Vahe and Sevyan, Valeri},
volume={8},
url={https://journal.nla.am/index.php/banber/article/view/115},
DOI={10.52027/18294685-aa.2.25-12},
number={2},
journal={Bulletin of Armenian Libraries},
year={2026},
month={Jan.},
pages={34–43}
}

About

Learning lab about Damaged Document Analysis and Historical Document Enhancement
