Grant Drives AI Research To Decipher Handwritten Jewish Manuscripts
In the world of historical document research, automatic reading of handwritten manuscripts and textual search of their contents have long been an unattainable dream. While optical character recognition (OCR) of printed material is commonplace today, deciphering and transcribing individual medieval handwriting has remained out of reach, especially for less common alphabets like Hebrew.
Hebrew textual studies are on the cusp of an immense leap forward with the announcement that the European Research Council (ERC) has awarded 10 million euros to the École pratique des hautes études (Paris Sciences-Lettres University), Tel Aviv University, and Bar-Ilan University, with the participation of Haifa University and the National Library of Israel (NLI). The six-year project will develop cutting-edge computational methods to analyze tens of thousands of medieval Hebrew manuscripts and fragments from the world’s collections, using their digital images curated in the NLI’s KTIV collection.
The newly funded project, called “MIDRASH – Migrations of Textual and Scribal Traditions via Large-Scale Computational Analysis of Medieval Manuscripts in Hebrew Script,” is the first ERC Synergy grant in Jewish studies and the first for computational manuscript studies.
It will be led jointly by four principal investigators: Daniel Stökl Ben Ezra (École pratique des hautes études, Paris Sciences-Lettres University), Judith Olszowy-Schlanger (École pratique des hautes études, Paris Sciences-Lettres University and Oxford University), Nachum Dershowitz (Tel Aviv University), and Avi Shmidman (Bar-Ilan University), with the participation of the National Library of Israel and Haifa University.
KTIV is a trailblazing initiative, spearheaded by the NLI, to enable global centralized digital access to the more than 100,000 known Hebrew manuscripts throughout the world. The vast majority of these manuscripts have already been digitized and are hosted by KTIV, including NLI’s own collection of 10,000 manuscripts. The computational analysis of this enormous corpus of millions of pages will reveal and map previously unsuspected networks of transmission and the migration of textual and paleographical features. (Paleography is the study of historic writing systems, including the deciphering, dating, and establishing of the provenance of manuscripts.)
The resulting AI-powered analysis will be made freely available to the public. For the very first time, scholars and laypersons will be able to conduct comprehensive queries across the entire corpus, combining intelligent full-text search with rich metadata filters.
ERC Synergy Grants support small groups of two to four principal investigators to jointly address ambitious research problems that could not be addressed by individual teams working alone. The projects should enable substantial advances at the frontiers of knowledge, stemming from cross-fertilization of scientific fields, from new productive lines of enquiry, or from new methods and techniques, including unconventional approaches at the interface between established disciplines.
Oren Weinberg, CEO of the National Library of Israel, said, “Winning this prestigious grant presents an extraordinary opportunity to add advanced capabilities to the KTIV project, the world’s only digital repository of approximately 100,000 Hebrew manuscripts housed at the National Library of Israel. In the foreseeable future, we will be able to use the ERC’s groundbreaking research to enable the deciphering of handwritten Hebrew manuscripts and their conversion into machine-readable text. This is an unprecedented technological achievement whose results will open new horizons for research in Jewish studies.”
Judith Olszowy-Schlanger, FBA, Professor of Hebrew Manuscript Studies at the École pratique des hautes études, PSL, and Professor at Oxford University, stated, “This collaborative project is a groundbreaking opportunity to create a new approach to the study of Hebrew paleography combining traditional, digital and computational methods.”
Daniel Stökl Ben Ezra, Professor of Ancient Hebrew and Aramaic at the École pratique des hautes études, PSL, stated, “We are proud that our open-source OCR platform eScriptorium, founded in a collaborative project inside PSL, will join forces with KTIV and now stands at the center of this collaboration between many large institutions combining computer sciences and digital humanities with paleography and philology to reconstruct the entire Jewish Medieval book culture from its remains.”
Professor Nachum Dershowitz, emeritus in the School of Computer Science at Tel Aviv University, says: “Since my student days 50 years ago, I have always been engaged in, among other things, making Jewish literature accessible through computational tools. This bold and daring project will allow anyone from any background to wander through this trove of millions of manuscript pages that, as part of the culture and heritage of the Jewish people, were hand-written and copied – even tracing the notes added in the margins of these writings – and handed down from generation to generation.”
Dr. Avi Shmidman, Senior Lecturer of Hebrew Literature at Bar-Ilan University, member of the Academy of the Hebrew Language, and Senior Researcher at DICTA – The Israel Center for Text Analysis, stated, “This will be a watershed moment in the field of Jewish Studies, nearly every aspect of which will be ripe for reconsideration and reevaluation in light of substantial supplementary pieces of evidence. This project is poised to be a game changer, deeply transforming our historical knowledge, and advancing our understanding of the Jewish literary heritage. It will set a new bar in the field of computational humanities, providing a state-of-the-art foundation of algorithms and models for the continued study of many additional manuscript cultures.”
Dr. Moshe Lavee, Head of the e-Lijah Lab (the Eliyahu Laboratory of digital humanities) at the University of Haifa, said, “This project is the fulfillment of a great dream. We are translating Ben-Gurion’s vision from the 1950s into the language of the 21st century, gathering together the images of all Hebrew manuscripts in Jerusalem, and with modern technologies, transforming them into an international vision. The project is a wonderful opportunity to bring to light the experience we have gained at the Eliyahu Laboratory over the last five years in manuscript accessibility and geospatial analysis of Jewish history.”