Monday, January 11, 2010

XML and Corpora

I was just asked to recommend tutorial materials on XML for a computer scientist who is starting a corpus encoding project. For the very basics, I like Greg Wilson's Software Carpentry lecture. For more advice on how to go about building up a corpus, see Developing Linguistic Corpora: a Guide to Good Practice, edited by Martin Wynne, with chapters by a bunch of Humanities Computing luminaries.

The project is going to work with children's books. That led me to the Comic Book Markup Language, which provides a tool for adding analytical markup to (wait for it) comic books. It uses TEI, which is great, but heavy-duty.
