Fixing a problem with broken stylesheets in OJS 2.3.6
Posted: June 18, 2012
In recent days, we have encountered a problem at Digital Studies/Le champ numérique that affects the display of a number of our articles.
The symptom is that the article breadcrumb and menu bar appear below rather than beside the right navigation bar, as illustrated below.
After some investigation, we narrowed the problem down to an issue with how OJS handles HTML-encoded articles.
When you upload an article in HTML into OJS, you upload a complete and valid file, one that opens and closes with the <html> and </html> tags and has the two main HTML structural elements, <head> (where metadata and style information and the like goes) and <body> (where the actual content goes). In other words, a standard complete HTML file has a structure similar to this:
<html>
  <head>
    <title>title of the piece</title>
    ... various other meta data and links like stylesheets, etc. ...
  </head>
  <body>
    <h1>Title of the piece</h1>
    <p>Some character content...</p>
    ... various types of body content: headers, paragraphs, etc. intended to encode the actual content the reader sees ...
  </body>
</html>
When OJS serves out an article, it builds a new HTML file: the <head> is built from the metadata OJS has stored separately about the article (keywords, author information, abstract, etc.), and the <body> consists of the entire HTML file you uploaded earlier (i.e. with the original <html> and </html>, <head>, and <body>). In other words, the file OJS serves out looks something like this:
<html>
  <head>
    <title>title of the piece</title>
    ... various other meta data and links like stylesheets, etc. ...
  </head>
  <body>
    <html>
      <head>
        <title>title of the piece</title>
        ... various other meta data and links like stylesheets, etc. ...
      </head>
      <body>
        <h1>Title of the piece</h1>
        <p>Some character content...</p>
        ... various types of body content: headers, paragraphs, etc. intended to encode the actual content the reader sees ...
      </body>
    </html>
  </body>
</html>
There is no way around it. This is very poor practice, especially in an archival context like a scholarly or scientific journal. The nested <html>, <head>, and <body> elements in the above code are forbidden by the HTML DTD. This means that browsers have to fall back on error recovery to process the document (a validating XML processor, by contrast, is required to report a document like this as invalid). And it means that archivists cannot rely on the quality of data in their collections: future datamining or repurposing tools need to take into account the possibility that their data source contains invalid code.
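One way to check whether a served page has this nested structure is to count the structural start tags it contains. What follows is only a minimal sketch in Python, using the standard library's html.parser. The class and function names are our own, and the sample string is a cut-down stand-in for a page as OJS serves it, not real OJS output.

```python
from html.parser import HTMLParser

# Elements that should appear only once in a valid HTML document.
STRUCTURAL = {"html", "head", "body"}


class StructuralTagCounter(HTMLParser):
    """Count how often each structural start tag occurs."""

    def __init__(self):
        super().__init__()
        self.counts = {tag: 0 for tag in STRUCTURAL}

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag in self.counts:
            self.counts[tag] += 1


def repeated_structural_tags(markup):
    """Return the structural tags that are opened more than once."""
    parser = StructuralTagCounter()
    parser.feed(markup)
    parser.close()
    return sorted(tag for tag, n in parser.counts.items() if n > 1)


# Hypothetical stand-in for a served article page.
served = """<html><head><title>title of the piece</title></head><body>
<html><head><title>title of the piece</title></head><body>
<h1>Title of the piece</h1><p>Some character content...</p>
</body></html>
</body></html>"""

print(repeated_structural_tags(served))
```

An empty list means the page opens each of the three elements at most once. Any other result points to the kind of nesting described above.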
Until recently, this issue with the construction of the files didn’t really cause us much practical trouble. Browsers seemed to ignore the repeated head information and the articles displayed properly.
For reasons that aren’t clear, however, this has now changed: Firefox and Chrome browsers on both Windows and Linux are now displaying articles as in the screenshot on the left above.
After some experimentation, we discovered that the problem involves a repetition of stylesheet information in the <head> element of the original uploaded file. We include links to the standard stylesheets in our original files because we need them for proofing: authors want to see what an article will look like when they proof their texts. When OJS adds its additional frame, however, it also includes links to the same stylesheets in its own <head>. Somehow, the second set of stylesheet links is causing some kind of cascade that affects how the articles display.
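To see which articles are affected, it may help to list the stylesheets that a served page links to more than once. The following is a rough sketch along those lines, again in Python with the standard library. The names are ours, and the stylesheet paths in the sample are hypothetical, not the ones OJS actually uses.

```python
from collections import Counter
from html.parser import HTMLParser


class StylesheetLinkCollector(HTMLParser):
    """Collect the href of every <link rel="stylesheet"> in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # Self-closing <link ... /> tags also arrive here by default.
        if tag != "link":
            return
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if "stylesheet" in rel_values and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def duplicated_stylesheets(markup):
    """Return the stylesheet hrefs that are linked more than once."""
    collector = StylesheetLinkCollector()
    collector.feed(markup)
    collector.close()
    counts = Counter(collector.hrefs)
    return sorted(href for href, n in counts.items() if n > 1)


# Hypothetical served page: the outer <head> stands for the one OJS
# builds, the inner <head> for the one in the uploaded proof file.
page = """<html><head>
<link rel="stylesheet" href="styles/common.css" type="text/css" />
</head><body>
<html><head>
<link rel="stylesheet" href="styles/common.css" type="text/css" />
<link rel="stylesheet" href="styles/print.css" type="text/css" />
</head><body><h1>Title of the piece</h1></body></html>
</body></html>"""

print(duplicated_stylesheets(page))
```

The comparison is on the literal href value, so the same stylesheet reached through two different URLs would not be reported.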
One obvious solution is to edit the proof-files to remove the style information and upload them again without it. This does solve the display problem and it is what we are doing from now on, as loath as we are to change a file after it has been proofed by an author.
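Removing the links by hand from every proof file is tedious, so a small script can help. This is a sketch only, under the assumption that the style information consists of <link rel="stylesheet"> tags. It uses regular expressions, which are fragile on unusual markup, so the result should be checked before it is uploaded. The function name and sample file content are hypothetical.

```python
import re

# Any <link ...> tag, whether or not it is written as self-closing.
LINK_TAG = re.compile(r"<link\b[^>]*>", re.IGNORECASE)

# A rel attribute whose value includes the word "stylesheet".
REL_STYLESHEET = re.compile(
    r"""\brel\s*=\s*["']?[^"'>]*\bstylesheet\b""", re.IGNORECASE
)


def strip_stylesheet_links(markup):
    """Return markup with its stylesheet <link> tags removed."""

    def drop_if_stylesheet(match):
        tag = match.group(0)
        # Keep other kinds of link (icons, alternates without styles).
        return "" if REL_STYLESHEET.search(tag) else tag

    return LINK_TAG.sub(drop_if_stylesheet, markup)


# Hypothetical proof file content.
proof = (
    "<html><head><title>title of the piece</title>"
    '<link rel="stylesheet" href="styles/common.css" type="text/css" />'
    '<link rel="icon" href="favicon.ico">'
    "</head><body><h1>Title of the piece</h1></body></html>"
)

print(strip_stylesheet_links(proof))
```

It is safer to write the cleaned text to a copy and keep the proofed original, since the original is the version the author approved.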
But this solution has unexpected consequences for articles that have already been published.
- Replacing the already published galley using the normal interface results in the article having a new file number and name. This may break existing URLs that point to the old galley.
- Statistics (such as number of hits) associated with the old galleys are lost. In order to stop readers seeing the “bad” proofs, you need to delete the original galleys, meaning no record of the hits is kept (see screen shot below).
- Links to images seem to break. In one case, the links seemed to work without intervention; in others, we needed to download and re-upload all the image files again.
So what to do?
This is a situation in which it makes sense to use a feature of OJS that many, I suspect, have not tried out before: the files browser (available to the “Journal Manager”):
Using this utility, you can replace the existing galleys with the corrected set (i.e. without the stylesheet links) without changing the file name. The counter is not reset and the image links all seem to work.
This is not ideal. But very little of this process is. You are intervening in a significant aspect of the reader experience post publication, without providing an audit trail allowing users to recover the original published document; you are correcting a file after it has been authorised and proofed by the author. The opportunities for making a mistake at this point are quite high.
But it seems to be the only way of fixing the layout problem. And, since the HTML remains invalid due to the way it is processed by OJS, this is presumably only one of a series of data problems in the archive.