Using normalize-space to fix Oxygen “pretty print” spacing problems

The journals I publish using TEI XML use the tei:figDesc element to populate the alt and title attributes of html:img.

Until today, this resulted in very odd-looking tool tips, with the text spread all over the place, e.g.

The problem was caused by the OxygenXML editor's pretty-print feature: the line breaks and indentation it adds to the source were being carried over into the title and alt attributes when the document was transformed.
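
A minimal sketch of the kind of fix the title describes, assuming the image is produced from a tei:figure that contains a tei:graphic with a url attribute. The template, its match pattern, and the source of the src attribute are hypothetical; only the use of normalize-space() on tei:figDesc comes from the post.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="tei">

  <!-- Hypothetical template: the match pattern and tei:graphic/@url are
       assumptions for illustration, not the journals' actual stylesheet. -->
  <xsl:template match="tei:figure">
    <!-- normalize-space() strips leading and trailing whitespace and
         collapses each internal run of spaces, tabs, and line breaks
         into a single space. [1] guards against multiple figDesc children. -->
    <xsl:variable name="desc" select="normalize-space(tei:figDesc[1])"/>
    <img src="{tei:graphic/@url}" alt="{$desc}" title="{$desc}"/>
  </xsl:template>

</xsl:stylesheet>
```

With this in place, a description that pretty-printing has broken across several indented lines should come out as a single line with single spaces between words.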

Siva Vaidhyanathan on the value of public research

A great statement today in Slate by Siva Vaidhyanathan about the value of public research:

We Americans take these institutions for granted. We assume that private enterprise generates what is so casually called “innovation” all by itself. It does not. The Web browser you are using to read this essay was invented at the University of Illinois at Urbana-Champaign. The code that makes this page possible was invented at a publicly funded academic research center in Switzerland. That search engine you use many times a day, Google, was made possible by a grant from the National Science Foundation to support Stanford University. You didn’t get polio in your youth because of research done in the early 1950s at Case Western Reserve. California wine is better because of the University of California at Davis. Hollywood movies are better because of UCLA. And your milk was not spoiled this morning because of work done at the University of Wisconsin at Madison.

 These things did not just happen because someone saw a market opportunity and investors and inventors rushed off to meet it. That’s what happens in business-school textbooks. In the real world, we roll along, healthy and strong, in the richest nation in the world because some very wise people decided decades ago to invest in institutions that serve no obvious short-term purpose. The results of the work we do can take decades to matter—if at all. Most of what we do fails. Some succeeds. The system is terribly inefficient. And it’s supposed to be that way.

Along the way, we share some time and energy with brilliant and ambitious young people from around the world.

It is important to realise that this is also a selective list. Other things generated in whole or in part by publicly funded researchers and institutions include Unicode and XML.

Can anybody think of others?

Extracting a catalogue of element names from a collection of XML documents using XSLT 2.0

We are trying to build a single stylesheet to work with the documents of two independent journals. To get a sense of the work involved, we wanted to create a catalogue of all the elements used in the published articles. This means loading whole directories of files as input documents, then extracting and sorting the elements across all of them.

Here’s the stylesheet that did it for us. It is probably not maximally optimised, but it currently does what we need.
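
For readers who want a starting point, the general technique can be sketched roughly as follows. This is an illustration, not the stylesheet referred to above. It assumes an XSLT 2.0 processor such as Saxon, whose collection() function accepts a directory URI with a ?select= query (a Saxon convention, not part of the standard). The dir parameter, its placeholder path, and the template name main are hypothetical.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Hypothetical parameter: directory holding the articles -->
  <xsl:param name="dir" select="'file:///path/to/articles'"/>

  <!-- Named entry point, since there is no single input document -->
  <xsl:template name="main">
    <!-- Assumption: Saxon-style collection URI; recurse=yes also
         searches subdirectories -->
    <xsl:variable name="docs"
        select="collection(concat($dir, '?select=*.xml;recurse=yes'))"/>
    <!-- Group every element in every document by its name -->
    <xsl:for-each-group select="$docs//*" group-by="name()">
      <xsl:sort select="current-grouping-key()"/>
      <!-- One line per element name: name, tab, number of occurrences -->
      <xsl:value-of select="concat(current-grouping-key(), '&#9;',
                                   count(current-group()), '&#10;')"/>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

With Saxon, a stylesheet like this would typically be started at the named template (for example with the -it:main option) rather than applied to a source file. Grouping on name() treats identically prefixed names as the same element, which is usually good enough for a first survey.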

Using Oxygen and Subversion client

Here are instructions for using Oxygen to access the Littlechief Project Subversion server.

An Anglo-Saxon Timeline

This contains a link to an experiment in constructing a timeline of the Anglo-Saxon period using XML. It is very much a work in progress at the moment. The ultimate goal is a synoptic overview and index that will allow students to click on major events, persons, or cultural artefacts and then see how they fit in with other milestones. At the moment, the chart only includes kings, and even then only in fairly rough fashion.

Transcription Guidelines

The following is a list of typographical conventions to use when transcribing medieval manuscripts in my classes.

A proposal for including chunk, inter, and div level children of tei:rdg and tei:lemma


The tei <app> element is used to group together readings that constitute a textual variation. The element has three children:

A Proposal for Revisions to choice, app, and model.pPartEdit

This document proposes a series of related revisions affecting

Disciplinary impact and technological obsolescence in digital medieval studies

First posted December 15, 2006. Published in The Blackwell Companion to the Digital Humanities, ed. Susan Schreibman and Ray Siemens. 2007.

In May 2004, I attended a lecture by Elizabeth Solopova at a workshop at the University of Calgary on the past and present of digital editions of medieval works1. The lecture looked at various approaches to the digitisation of medieval literary texts and discussed a representative sample of the most significant digital editions of English medieval works then available: the Wife of Bath’s Prologue from the Canterbury Tales Project (Robinson and Blake 1996), Murray McGillivray’s Book of the Duchess (McGillivray 1997), Kevin Kiernan’s Electronic Beowulf (Kiernan 1999), and the first volume of the Piers Plowman Electronic Archive (Adams et al. 2000). Solopova herself is an experienced digital scholar and the editions she was discussing had been produced by several of the most prominent digital editors then active. The result was a master class in humanities computing: an in-depth look at mark-up, imaging, navigation and interface design, and editorial practice in four exemplary editions.



