A proposal for including chunk-, inter-, and div-level children of tei:rdg and tei:lem


The TEI <app> element is used to group together readings that constitute a textual variation. The element has three children:
  • lem (used optionally to encode the preferred or base text for a given variation)
  • rdg (used to encode a single variant form)
  • rdgGrp (used to group readings into sensible subdivisions).
An <app> must contain at least one <rdg>; <lem> and <rdgGrp> are optional but common. A relatively minimal apparatus might look like the following:
<app>
  <rdg wit="#El">Experience though noon Auctoritee</rdg>
  <rdg wit="#La">Experiment thogh noon Auctoritee</rdg>
  <rdg wit="#Ra2">Eryment though none auctorite</rdg>
</app>
A more complex apparatus might look like this:
<app type="substantive">
  <rdgGrp type="subvariants">
    <lem wit="#El #Hg">Experience</lem>
    <rdg wit="#Ha4">Experiens</rdg>
  </rdgGrp>
  <rdgGrp type="subvariants">
    <lem wit="#Cp #Ld1">Experiment</lem>
    <rdg wit="#La">Ex&p-underbar;iment</rdg>
  </rdgGrp>
  <rdgGrp type="subvariants">
    <lem>Eriment<wit>[unattested]</wit></lem>
    <rdg wit="#Ra2">Eryment</rdg>
  </rdgGrp>
</app>
The apparatus element can function as a traditional print apparatus, in which case it can be linked to the text in one of two ways: location-referenced, in which the apparatus is linked to the editorial text through references to a canonical numbering system (e.g. line numbers):
<text>
  <body>
    <div n="WBP" type="prologue">
      <head>The Prologe of the Wyves Tale of Bathe</head>
      <l n="1">Experience though noon Auctoritee</l>
      <l>Were in this world ...</l>
    </div>
    <div>
      <ab>
        <app loc="WBP 1">
          <rdg wit="#La">Experiment</rdg>
          <rdg wit="#Ra2">Eryment</rdg>
        </app>
      </ab>
    </div>
  </body>
</text>
Double-end-point-referenced, in which the apparatus is linked to the editorial text through an explicit reference to the IDREF for the start and (optionally) end points of the collation unit:
<text>
  <body>
    <div n="WBP" type="prologue">
      <head>The Prologe ... </head>
      <l n="1" xml:id="WBP.1">Experience<anchor xml:id="WBP-A2"/> though noon Auctoritee</l>
      <l>Were in this world ...</l>
    </div>
    <div>
      <ab>
        <app from="#WBP.1" to="#WBP-A2">
          <rdg wit="#La">Experiment</rdg>
          <rdg wit="#Ra2">Eryment</rdg>
        </app>
      </ab>
    </div>
  </body>
</text>
A third type of apparatus (*parallel segmentation*) dispenses with the base text altogether and records alternate readings in relation to each other. This is similar to the collation list that print editors use in preparing their editions and textual apparatus. The digital form is much more powerful, however, as projects can use it to render the collated text in a variety of different formats: as a parallel text edition, as a best text, or even as a standard text+apparatus:
<l n="1">
  <app>
    <lem wit="#El #Hg">Experience</lem>
    <rdg wit="#La">Experiment</rdg>
    <rdg wit="#Ra2">Eryment</rdg>
  </app> though noon Auctoritee</l>
<l>Were in this world ...</l>
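To illustrate the rendering claim, here is a minimal XSLT 1.0 sketch that extracts the text of a single witness from a parallel-segmentation encoding. It is only an illustration under stated assumptions: the source is assumed to be in the TEI namespace, the parameter name `wit` is my own choice, and only <lem> and <rdg> that are direct children of <app> are considered (readings nested inside <rdgGrp> are ignored).

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0">

  <xsl:output method="text"/>

  <!-- Hypothetical parameter: the witness pointer to extract, e.g. '#La' -->
  <xsl:param name="wit" select="'#La'"/>

  <!-- Process verse lines only; header and other content are skipped -->
  <xsl:template match="/">
    <xsl:apply-templates select="//tei:l"/>
  </xsl:template>

  <!-- Emit each line on its own row, with whitespace normalized -->
  <xsl:template match="tei:l">
    <xsl:variable name="line">
      <xsl:apply-templates/>
    </xsl:variable>
    <xsl:value-of select="normalize-space($line)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Within an app, keep only the first reading whose @wit list
       contains the requested witness as a whole token -->
  <xsl:template match="tei:app">
    <xsl:apply-templates
        select="(tei:lem | tei:rdg)[contains(concat(' ', normalize-space(@wit), ' '),
                                             concat(' ', $wit, ' '))][1]"/>
  </xsl:template>

</xsl:stylesheet>
```

Run against the two lines above (wrapped in a namespaced TEI document) with `wit` set to '#La', this should yield "Experiment though noon Auctoritee" followed by "Were in this world ...". A witness that is not listed on any reading of an <app> simply contributes nothing at that point, which is one reason a real project would want a more careful treatment of omissions.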

Problems with the current model

There are two problems with the current TEI apparatus element and its children. The first is that <app> is currently treated as a phrase-level element in the Guidelines; in practice, apparatus in print sources are commonly treated as separate chunks or list-like elements. As I argue elsewhere, <app> should almost certainly be treated as an inter-level, list-like element (i.e. part of model.listLike). The second problem has to do with the content model of <rdg> and <lem>: currently, these two exclude chunk-level elements (though, oddly, they can contain inter-level elements). As I shall argue below, there is no reason for this. There are many examples of apparatus that collate variants at the phrase, inter, chunk, and even div or (floating) text level. To anticipate the rest of this argument, I propose in this document revising the content model of <lem> and <rdg> (which are syntactically identical) from
element lem { att.global.attributes, att.textCritical.attributes, ( text | model.gLike | model.phrase | model.inter | model.global | model.rdgPart )* }
to
element lem { att.global.attributes, att.textCritical.attributes, ( text | model.gLike | model.phrase | model.inter | model.global | model.common | model.divLike | floatingText | model.rdgPart )* }
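One way to trial the proposed content model locally, without waiting for a change to the Guidelines, would be an ODD customization. The following is a minimal sketch, assuming a TEI P5 release that supports Pure ODD content models. The schema name `tei_appProposal` and the selection of modules are my own assumptions, and <rdg> would need the same change as <lem>.

```xml
<schemaSpec xmlns="http://www.tei-c.org/ns/1.0"
            ident="tei_appProposal" start="TEI">
  <!-- Assumed module selection; adjust to the project's needs -->
  <moduleRef key="tei"/>
  <moduleRef key="header"/>
  <moduleRef key="core"/>
  <moduleRef key="textstructure"/>
  <moduleRef key="linking"/>
  <moduleRef key="textcrit"/>

  <!-- Replace the content model of lem with the proposed one -->
  <elementSpec ident="lem" module="textcrit" mode="change">
    <content>
      <alternate minOccurs="0" maxOccurs="unbounded">
        <textNode/>
        <classRef key="model.gLike"/>
        <classRef key="model.phrase"/>
        <classRef key="model.inter"/>
        <classRef key="model.global"/>
        <!-- Additions proposed in this document -->
        <classRef key="model.common"/>
        <classRef key="model.divLike"/>
        <elementRef key="floatingText"/>
        <!-- Unchanged from the current model -->
        <classRef key="model.rdgPart"/>
      </alternate>
    </content>
  </elementSpec>

  <!-- A matching elementSpec for rdg, identical apart from ident,
       would be needed here as well -->
</schemaSpec>
```

A schema generated from a customization like this would accept the chunk-, div-, and floatingText-level examples in the next section, which the unmodified schema rejects.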

Do people really collate at the level of the chunk and the div?

Yes, and such cases are not hard to find. Here are examples, drawn from the apparatus to the Riverside edition of the Canterbury Tales, that show variants at all levels; the sample encodings assume that the proposed revisions to <rdg> and <lem> have been accepted:

Phrase (Clerk’s Prologue and Tale)

17 be] be that Ch El En Ha Ne Rob^2^ Bgh; om[itted]. Pw.
<app loc="ClT 17">
  <lem>be</lem>
  <rdg wit="#Ch #El #En #Ha #Ne #Rob2 #Bgh">be that</rdg>
  <rdg wit="#Pw"><gap/></rdg>
</app>

Chunk (Monk’s Prologue and Tale)

1957-58] Om. El.
<app from="#MkT.1957" to="#MkT.1958">
  <lem>
    <l>This maketh that oure heires been so sklendre</l>
    <l>And feble that they may nat wel engendre.</l>
  </lem>
  <rdg wit="#El"> <gap/> </rdg>
</app>

Div (Merchant’s Prologue and Tale)

1213-44] Om. Hg La Pw; on leaf now lost, Gg.
In fact, the missing lines correspond exactly to the Merchant’s Prologue, a fact not recorded in the original print apparatus but possible to capture in digital form:
<app from="#MtPro.1213" to="#MtPro.1244">
  <lem>
    <div n="Merchant's Prologue">
      <head>The Prologe of the Marchantes Tale</head>
      <lg>
        <l n="1213">"Wepyng and waylyng, care and oother sorwe</l>
        <l>I knowe ynogh, on even and a morwe"</l>
        <!-- ... continues for 28 lines ... -->
        <l n="1243">"Gladly," quod he, "but of myn owene soore,</l>
        <l> For soory herte, I telle may namoore."</l>
      </lg>
    </div>
  </lem>
  <rdg wit="#Hg #La #Pw"> <gap reason="omitted"/> </rdg>
  <rdg wit="#Gg"> <gap reason="lost leaf"/> </rdg>
</app>

FloatingText (Cook’s Tale/Gamelyn)

This example is recorded in narrative form in the textual apparatus of the Riverside Chaucer rather than as an apparatus entry (floating texts are hard to represent in a print apparatus). But the intent is clear: one tradition of the Canterbury Tales adds an extra floating text (the tale of Gamelyn). Since I don’t have access to the text of the tale, I can’t reproduce it here. The following suggests how this might be encoded, however:
<app from="4422">
  <lem><gap/></lem>
  <rdg wit="#cd">
    <floatingText>
      <body>
        <head>The Tale of Gamelyn</head>
        <lg>
          <l>Some poetic text</l>
          <l>goes here</l>
        </lg>
      </body>
    </floatingText>
  </rdg>
</app>


There are a few things to note about these examples:
  1. I have taken my examples from an existing print apparatus and have tried to stay close to the original format of this apparatus in my encodings. But in fact, we should use print apparatus only as evidence (compiled over several centuries) of the type of thing editors frequently collate, rather than as a limit on what we should do with the apparatus. Print is simply not good at capturing this type of material. In digital form, we can go beyond what it is possible to do in print, encoding intellectual information in our XML that print editors were only able to hint at. The test “can we reproduce the print apparatus?” is a baseline, not the upper limit. Print apparatus carry a lot of implicit information that we can make explicit in XML.
  2. I have given only one example of each level. As we have noted on tei-council-l, there are plenty of examples from all periods: texts are constantly being revised at the level of the phrase, paragraph/line, chapter, and book/section.


Currently we explicitly do not include <lem> in model.rdgPart. Given that <lem> and <rdg> have identical content models and are commonly used together, this seems to me an unnecessarily fine distinction: <lem> is a semantically privileged <rdg>; they should be in the same model.
