Machine Learning for Digital Scholarly Editions: The Case of eScriptorium
Digital and computational tools and methods are increasingly part of scholarly activity, including Digital Scholarly Editing. One example is the transcription of texts from manuscripts, where machine learning is becoming more and more effective. To this end, eScriptorium is being developed to leverage machine learning to help in transcription, whether automatic, semi-automatic or manual. In principle the software should be useful for any type of edition, in any language and script and from any date. In practice, however, this raises many questions, including to what extent AI can or should be employed in preparing editions, how much the expert should remain ‘in the loop’, and to what extent it is even possible to develop a single tool that can work for everything from Greek papyri to 20th-century notebooks to Old Vietnamese inscriptions and beyond. This talk will therefore present the current state of the art while also addressing some practical and theoretical questions that remain for the future.