Extracting a catalogue of element names from a collection of XML documents using XSLT 2.0
Posted: September 15, 2011 Filed under: Journal Incubator, Projects and Societies, Research, Technical Notes | Tags: Computers, digital humanities, journal incubator, Projects and Societies, scholarly publishing, Tips, xml, xslt
We are trying to build a single stylesheet to work with the documents of two independent journals. To get a sense of the work involved, we wanted to create a catalogue of all the elements used in the published articles. This means loading several directories’ worth of files as input and then extracting and sorting the element names across all of those documents.
Here’s the stylesheet that did it for us. It is probably not maximally optimised, but it currently does what we need.
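For readers who want the core idea in isolation, the approach comes down to three steps: load every file with collection(), gather all the elements, and emit the distinct element names in sorted order. What follows is a minimal sketch of that idea under stated assumptions, not the stylesheet we actually ran. It assumes Saxon, whose collection() function accepts a directory URI with a ?select= pattern. The directory path, the `dir` parameter, the `main` template name and the `catalogue`/`element` output names are all placeholders chosen for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Hypothetical parameter: URI of the directory holding the articles. -->
  <xsl:param name="dir" select="'file:///path/to/articles'"/>

  <!-- Named entry point, so that no primary input document is required. -->
  <xsl:template name="main">
    <!-- Assumption: Saxon-style collection URI that picks up every
         .xml file under $dir, including subdirectories. -->
    <xsl:variable name="docs"
        select="collection(concat($dir, '?select=*.xml;recurse=yes'))"/>
    <catalogue>
      <!-- Group every element in every document by its name,
           then write one entry per distinct name, sorted alphabetically. -->
      <xsl:for-each-group select="$docs//*" group-by="name()">
        <xsl:sort select="current-grouping-key()"/>
        <element name="{current-grouping-key()}"
                 count="{count(current-group())}"/>
      </xsl:for-each-group>
    </catalogue>
  </xsl:template>

</xsl:stylesheet>
```

With Saxon on the command line, a sketch like this would typically be started from the named template (the -it:main option) with the directory passed as dir=..., rather than being applied to a single source file. Grouping on name() keeps any namespace prefix in the key, which is usually what you want when comparing the tag sets of two journals. The count attribute is an extra that shows how heavily each element is used.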