CLARIAH-AT Summer School 2026

Machine Learning for Digital Scholarly Editions

21-25 September 2026
Elisabethstraße 59/III, 8010 Graz, Austria
Department of Digital Humanities

Call for Applications

Abstract

Machine learning is increasingly shaping research in the Digital Humanities, offering powerful tools for analyzing and enriching textual data. In this summer school, participants will use the Python library BERTopic to explore the individual steps of topic modeling. Building upon BERTopic’s modular architecture, students will be introduced to essential machine learning techniques, such as embedding, dimensionality reduction, and clustering. Through practical sessions using historical texts, students will learn to apply, interpret, and critically assess these techniques. The aim is to give non-experts a high-level practical overview of how to use the BERTopic library and the essential theory behind its modules. The school is intended for both students and researchers with an interest in the intersection of digital scholarly editing and machine learning. After attending the school, participants will have a basic understanding of machine learning algorithms and be able to assess their possible applications as well as their strengths and limitations. Participants will also be able to apply BERTopic to their own data.

Funding, Organizers and Hosting Institution

This one-week school is generously funded by CLARIAH-AT and the University of Graz. It will be hosted by the Department of Digital Humanities of the University of Graz in collaboration with the Institute for Documentology and Scholarly Editing (IDE).

Requirements

A basic understanding of digital editions, the Text Encoding Initiative (TEI), and Python programming is expected for participation. Participants will work with Jupyter Notebooks, Google Colab and GitHub.

Fees

There are no fees for the summer school. However, participants must arrange and cover the costs of their travel, accommodation, and meals.

Application

The school is limited to 24 participants. If you are interested in participating, please submit a letter of application including a short CV (max. 2 pages) to Roman Bleier (roman.bleier@uni-graz.at) and Martina Scholger (martina.scholger@uni-graz.at). The deadline for submissions is 15 May 2026. The summer school committee will evaluate each application and select participants by 31 May 2026.

About the Summer School

Machine learning is increasingly shaping research in the Digital Humanities, offering powerful tools for analyzing and enriching textual data. Using the Python library BERTopic, participants will explore various steps of topic modeling. Building upon BERTopic’s modular architecture, students will be introduced to several essential machine learning methods, such as embedding, dimensionality reduction, and clustering. Through practical sessions, students will learn to apply these techniques to historical texts. The aim is to give non-experts a high-level practical overview of how to use the BERTopic library and the essential theory behind its modules.

The school is intended for both students and researchers with an interest in the intersection of digital scholarly editing and machine learning. After attending the school, participants will have a basic understanding of machine learning algorithms and be able to assess their possible applications as well as their strengths and limitations. Participants will also be able to apply BERTopic to their own data.

Meet the Speakers

Roman Bleier

Roman Bleier is a postdoctoral researcher at the Department of Digital Humanities at the University of Graz. His research focuses on digital scholarly editing, text encoding, and digital history. He was part of the editorial team for The Imperial Diets of 1576 and is currently co-PI of the FWF-DFG project History as a Visual Concept: Peter of Poitiers' "Compendium historiae". Roman is also a member of the Institute for Documentology and Scholarly Editing.

Lucija Brozić

Lucija Brozić is a PhD student and university assistant at the University of Graz, specializing in Digital Humanities and Natural Language Processing. Her doctoral research examines attitudes towards migration and minority groups in Austrian historical newspapers. She has led a small CLARIAH-AT-funded project on sentiment annotation, developing annotation guidelines and training annotators. Her academic interests include machine learning for DH texts, topic-specific corpus building, annotation practices, sentiment analysis, and migration studies.

Selina Galka

Selina Galka is a research assistant at the Department of Digital Humanities at the University of Graz. Her research focuses on digital editing and data modelling. After completing master's degrees in “German Philology of the Middle Ages and Early Modern Period” and “Digital Humanities”, she is now a PhD candidate in digital humanities.

Martina Scholger

Martina Scholger is a senior scientist at the Department of Digital Humanities, University of Graz, where her research focuses on digital scholarly editing, text encoding, text mining, and LLM applications. She is co-PI of the FWF-DFG Early Manila Hokkien project and contributes to the digital edition of Joseph von Hammer-Purgstall’s correspondence, the Visual Archive Southeastern Europe, and Picturing Migrants' Lives. She has been an elected member of the TEI Technical Council since 2016, a member of the Institute for Documentology and Scholarly Editing since 2012, and is managing editor of RIDE (Review Journal for Scholarly Digital Editions and Resources).

Gunter Vasold

Gunter Vasold is a research software engineer in the Department of Digital Humanities at the University of Graz. Thirty years ago, he began working on the pioneering and highly ambitious critical digital edition project Fontes Civitatis Ratisponensis. While he enjoyed working with medieval documents, he discovered an even greater passion for developing software for the project. Since then, he has been involved in numerous research and software initiatives. Currently, his primary focus is on software engineering, research infrastructures, and the long-term preservation of research data. Additionally, Gunter is an award-winning lecturer with over 25 years of experience.

Klara Venglarova

Klara Venglarova is a PhD student in Linguistics and Digital Humanities at Palacký University in Olomouc, Czech Republic. She is involved in the FWF-funded project The Making of the Incredibly Differentiated Labor Market: Evidence from Job Offers from Ten Decades at the University of Graz (PI Jörn Kleinert), where she works on layout analysis, OCR, post-correction, information extraction, and other NLP and machine learning tasks.

Elisabeth Raunig

Elisabeth Raunig works as a Project Manager at the Department of Digital Humanities at the University of Graz, where she is also part of the Summer School’s organising team. Having completed her Master’s in Digital Humanities there, she now contributes to project and event management.

More information coming soon.

Schedule - coming soon