Working with Aynat Rubinstein, of The Hebrew University of Jerusalem, we organized a corpus of early 1900s texts and integrated it with research tools. The corpus consists of plain text files as well as TEI documents. The work involved transforming the documents to TEI, using NLP tools to add the linguistic information requested by researchers, and uploading the data to an ANNIS site.
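To make the first step concrete, here is a minimal sketch of wrapping a plain text file in a TEI document using only the Python standard library. It is an illustration, not the project's actual conversion code: the function name `plain_text_to_tei`, the placeholder header statements, and the rule that blank lines separate paragraphs are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# The TEI P5 namespace; registering it keeps the output free of "ns0:" prefixes.
TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)


def _el(parent, tag, text=None):
    """Create a namespaced TEI child element, optionally with text content."""
    child = ET.SubElement(parent, f"{{{TEI_NS}}}{tag}")
    if text is not None:
        child.text = text
    return child


def plain_text_to_tei(title, raw_text):
    """Wrap plain text in a minimal TEI document (hypothetical helper).

    Assumes paragraphs in the source are separated by blank lines.
    """
    root = ET.Element(f"{{{TEI_NS}}}TEI")

    # A TEI header needs a title, a publication statement and a source description.
    header = _el(root, "teiHeader")
    file_desc = _el(header, "fileDesc")
    _el(_el(file_desc, "titleStmt"), "title", title)
    _el(_el(file_desc, "publicationStmt"), "p", "Unpublished working copy.")
    _el(_el(file_desc, "sourceDesc"), "p", "Converted from a plain text file.")

    # Each non-empty block of text becomes one <p> in the body.
    body = _el(_el(root, "text"), "body")
    for block in raw_text.split("\n\n"):
        paragraph = " ".join(block.split())  # collapse line breaks and extra spaces
        if paragraph:
            _el(body, "p", paragraph)

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(plain_text_to_tei("Sample", "First line\nwrapped.\n\nSecond paragraph."))
```

A real pipeline would carry richer metadata in the header and add the linguistic annotation in later steps; the sketch only shows the structural wrapping.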
A full discussion of this project was published in the journal Language Resources and Evaluation. A link to the full paper can be found on the publications page.