Using normalize-space to fix Oxygen “pretty print” spacing problems

This is a straightforward thing for people who know what they are doing. It is only a reminder to me, who didn’t.

The journals I publish using TEI XML use the tei:figDesc element to populate the alt and title attributes of html:img.

Until today, this resulted in very odd-looking tooltips, where the text was spread all over the place.

The problem was caused by the OxygenXML editor’s pretty-print feature: the line breaks and indentation it adds inside tei:figDesc were being carried over unchanged into the title and alt attributes.
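As a minimal sketch of the kind of fix the title refers to, the template below passes the description through normalize-space() before writing it to the attributes. The template itself is hypothetical: it assumes each tei:figure contains exactly one tei:graphic with a url attribute and one tei:figDesc, which may not match the journals’ actual stylesheets.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="tei">

  <!-- Hypothetical template: assumes one tei:graphic and one tei:figDesc per tei:figure -->
  <xsl:template match="tei:figure">
    <!-- normalize-space() strips leading and trailing whitespace and collapses
         internal runs (including pretty-print line breaks and indentation)
         into single spaces -->
    <xsl:variable name="desc" select="normalize-space(tei:figDesc)"/>
    <img src="{tei:graphic/@url}" alt="{$desc}" title="{$desc}"/>
  </xsl:template>

</xsl:stylesheet>
```

normalize-space() is a standard XPath function, so the same expression should also work in an XSLT 1.0 stylesheet.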

Extracting a catalogue of element names from a collection of XML documents using XSLT 2.0

We are trying to build a single stylesheet that works with the documents of two independent journals. To get a sense of the work involved, we wanted to create a catalogue of all the elements used in the published articles. This means loading whole directories of files as input and then extracting and sorting the element names across all the input documents.

Here’s the stylesheet that did it for us. It is probably not maximally optimised, but it currently does what we need.
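In outline, a stylesheet of this kind could look like the following sketch. It is an illustration under assumptions rather than the exact code: it assumes Saxon as the XSLT 2.0 processor (the ?select=...;recurse=yes query string on collection() is a Saxon convention, not part of the XSLT 2.0 standard), and the dir parameter, its placeholder path, and the tab-separated output format are all hypothetical.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Assumed parameter: URI of the directory holding the articles (placeholder path) -->
  <xsl:param name="dir" select="'file:///path/to/articles'"/>

  <!-- Saxon-specific collection URI: every *.xml file below $dir, searched recursively -->
  <xsl:variable name="docs"
      select="collection(concat($dir, '?select=*.xml;recurse=yes'))"/>

  <!-- Named entry point, so no single source document is required -->
  <xsl:template name="main">
    <!-- Group every element in every document by its (prefixed) name -->
    <xsl:for-each-group select="$docs//*" group-by="name()">
      <xsl:sort select="current-grouping-key()"/>
      <!-- One line per element name: name, tab, number of occurrences -->
      <xsl:value-of select="current-grouping-key()"/>
      <xsl:text>&#9;</xsl:text>
      <xsl:value-of select="count(current-group())"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

With Saxon, a stylesheet like this would typically be run from the named template (for example with the -it:main option) rather than against an input file. Grouping on name() keeps namespace prefixes as written in each file; if the two journals use different prefixes for the same namespace, grouping on local-name() together with namespace-uri() may give a cleaner catalogue.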


