I presented my research on automatic translation alignment at the Computational Humanities Research Conference 2024 (CHR 2024), held at Aarhus University in Denmark. The presentation described an automated pipeline for aligning translations in multilingual digital editions.

Research Overview

The project addresses a central problem in digital humanities: how to automatically align texts across multiple languages and editions so that they can be published together as multilingual digital scholarly editions.

Technical Approach

The pipeline integrates several computational methods:

  • Text preprocessing: Normalization and segmentation of source and target texts
  • Alignment algorithms: Identification of corresponding segments between source and target texts
  • Quality assessment: Automated evaluation of alignment accuracy
  • TEI integration: Export of aligned texts in TEI-compliant XML format
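
The four stages above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline from the paper: the alignment step uses a simple length-based dynamic programme (a simplified stand-in in the spirit of Gale-Church), and the function names and the `it_s1` / `en_s1` identifier scheme are hypothetical. The export step uses the TEI `linkGrp` and `link` elements to record which segments correspond.

```python
import re
import unicodedata
import xml.etree.ElementTree as ET


def normalize(text):
    """Apply Unicode NFC normalization and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", unicodedata.normalize("NFC", text)).strip()


def segment(text):
    """Naive sentence split after '.', '!' or '?' (assumption: good enough for a demo)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", normalize(text)) if s]


def align(src, tgt):
    """Monotonic alignment of two segment lists by dynamic programming.

    Allowed beads are 1-1, 1-0, 0-1, 2-1 and 1-2. The cost of a bead is the
    absolute difference in character length plus a fixed penalty for
    anything other than 1-1 (penalty values are arbitrary assumptions).
    Returns a list of (source_indices, target_indices) pairs.
    """
    inf = float("inf")
    n, m = len(src), len(tgt)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0
    beads = [(1, 1, 0), (1, 0, 20), (0, 1, 20), (2, 1, 5), (1, 2, 5)]
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj, penalty in beads:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                len_src = sum(len(s) for s in src[i:ni])
                len_tgt = sum(len(t) for t in tgt[j:nj])
                c = cost[i][j] + abs(len_src - len_tgt) + penalty
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (i, j)
    # Walk the back-pointers from the end to recover the cheapest path.
    pairs = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        pairs.append((list(range(pi, i)), list(range(pj, j))))
        i, j = pi, pj
    return pairs[::-1]


def to_tei_links(pairs, src_prefix="it", tgt_prefix="en"):
    """Serialize aligned pairs as a TEI <linkGrp> of <link> elements.

    Beads with an empty side are skipped, since a link needs at least
    two targets. The xml:id scheme (prefix_sN) is a hypothetical convention.
    """
    group = ET.Element("linkGrp", {"type": "translation"})
    for src_idx, tgt_idx in pairs:
        if not src_idx or not tgt_idx:
            continue
        targets = [f"#{src_prefix}_s{i + 1}" for i in src_idx]
        targets += [f"#{tgt_prefix}_s{j + 1}" for j in tgt_idx]
        ET.SubElement(group, "link", {"target": " ".join(targets)})
    return ET.tostring(group, encoding="unicode")


# Usage with two short, made-up passages (not taken from any edition).
source = segment("Hello world. This is a test.")
target = segment("Ciao mondo. Questo è un test.")
print(to_tei_links(align(source, target), src_prefix="en", tgt_prefix="it"))
```

Length-based costs are cheap and language-independent, which makes them a convenient baseline; a production pipeline would more plausibly score candidate pairs with lexical or embedding similarity, and the sketch leaves that choice open.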

Case Study: I Promessi Sposi

The methodology was tested on Alessandro Manzoni’s “I Promessi Sposi” and its translations, demonstrating:

  • Effectiveness across different language pairs
  • Handling of structural variations between editions
  • Integration with existing digital edition frameworks

Key Contributions

  • Development of a robust automatic alignment pipeline
  • Evaluation metrics for translation alignment quality
  • Integration strategies for multilingual TEI documents
  • Practical solutions for common alignment challenges
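
As an illustration of what an alignment quality metric can look like, the sketch below scores predicted segment links against a hand-checked gold standard using precision, recall, and F1. These are standard measures for alignment evaluation; whether they match the metrics reported in the paper is an assumption, and the function name and the sample links are hypothetical.

```python
def alignment_prf(predicted, gold):
    """Return (precision, recall, F1) for predicted vs. gold alignment links.

    Each link is a (source_index, target_index) tuple. A predicted link
    counts as correct only if the identical link appears in the gold set.
    """
    pred, ref = set(predicted), set(gold)
    true_positives = len(pred & ref)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up example: three predicted links, four gold links, two in common.
predicted_links = [(0, 0), (1, 1), (2, 3)]
gold_links = [(0, 0), (1, 1), (2, 2), (3, 3)]
p, r, f = alignment_prf(predicted_links, gold_links)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Exact-match scoring is strict: a many-to-one alignment that is only partly right earns no credit, so a more lenient variant may be preferable when segment boundaries differ between editions.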

Publication

The full paper is available in the CHR 2024 Proceedings: CEUR-WS Vol-3834, pp. 1086-1104.

Impact

This work helps make multilingual digital editions more accessible and scalable. It reduces the manual effort required to create aligned translations while maintaining the standards of digital textual scholarship.

The pipeline has been applied to the LeggoManzoni project and is being adapted for other multilingual digital edition projects.