DeRose, Durand, Mylonas, and Renear’s “What is Text, Really?” includes two key concepts. One is the idea that texts are really, fundamentally, their structures of content objects, and these structures are hierarchical. For example, the highest content object or container might be a genre called the “letter.” A letter can hold various content objects like a “signature,” a “greeting,” and a “body,” but a careful reader will immediately sense that this casual model has something wrong with it. The “letter” consists, really, of “greeting,” “body,” and “signature” in that order. Therefore, according to this pragmatic thesis, text REALLY IS a series of content objects in a specific order.
But content objects have another quality. To continue with our example, the “body” of a letter may contain a “paragraph,” but a “paragraph” cannot contain a “body.” That is, letters have a hierarchy. That is why this thesis about text is known as the OHCO thesis, which stands for an ordered hierarchy of content objects. The other key concept is that the text in one form, input into a computer system, will be the same as the text in another form, printed out on a sheet of paper. The “content” is the same, but the “format” is a matter of processing that is unique to an output device. The example that I have chosen, the letter, is in fact DeRose, Durand, Mylonas, and Renear’s example, which appears in two forms as the article’s first figure. Their letter, from Scooter, is encoded in a form of simplified SGML, and Scooter, who has a computer, has apparently also been able to output the encoded letter to a printing device.
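To make the "ordered" and "hierarchy" parts concrete, here is a minimal sketch in Python of how such a model might be checked. It is my own illustration, not anything from the article: the containment table, the `is_valid` helper, and the placeholder body text are all assumptions, and only the element names and the greeting and signature wording come from the letter example.

```python
# Which content objects each container may hold.
# These rules are illustrative assumptions, not the article's own grammar.
ALLOWED_CHILDREN = {
    "letter": ["greeting", "body", "signature"],
    "body": ["paragraph"],
    "greeting": [],
    "paragraph": [],
    "signature": [],
}


def is_valid(node):
    """Check a (name, children) tree against the containment rules.

    Children that are plain strings count as text and are always allowed.
    """
    name, children = node
    child_names = [c[0] for c in children if not isinstance(c, str)]
    if name == "letter":
        # The top-level container fixes both membership and order.
        if child_names != ALLOWED_CHILDREN["letter"]:
            return False
    elif any(c not in ALLOWED_CHILDREN.get(name, []) for c in child_names):
        # Lower containers only restrict what may appear inside them.
        return False
    return all(is_valid(c) for c in children if not isinstance(c, str))


# A well-formed letter: greeting, body, signature, in that order.
good = ("letter", [
    ("greeting", ["Hi Mom"]),
    ("body", [("paragraph", ["(placeholder text)"])]),
    ("signature", ["Scooter"]),
])

# A malformed letter: a "paragraph" tries to contain a "body".
bad = ("letter", [
    ("greeting", ["Hi Mom"]),
    ("body", [("paragraph", [("body", [])])]),
    ("signature", ["Scooter"]),
])

print(is_valid(good))
print(is_valid(bad))
```

The first tree passes because its objects arrive in the required order and every container holds only what it is allowed to hold. The second fails at the point where a "paragraph" tries to hold a "body."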
The authors use Scooter’s letter to illustrate that “it is natural to use this model in helping children understand and create documents” (4). The two versions of Scooter’s letter in the Journal of Computing in Higher Education illustrate the point of this seminal essay, that form and content can be usefully distinguished for the purpose of computer modeling, which aids in data exchange.
But to my knowledge no one has noticed a little problem with the SGML encoding of Scooter’s letter: it’s a faulty translation. Either that or Scooter’s sorry little butt is lying, because the “content” of these two letters is NOT the same. In the printed letter, Scooter precedes his signature with “Sincerely,” as a closing. But the SGML-encoded letter has no “closing.” Scooter signs off with only his name. Is our little Scooter more sincere when he writes on paper than when he transmits by electronic message? Computer-Scooter ends his salutation “Hi Mom” with an exclamation point (!). Paper-Scooter is a much smoother dude: his “Hi Mom” has no exclamation.
Either Computer-Scooter was a bit too juiced about the OHCO text model, or DeRose et al. were already predicting that XSLT would alter the content of the text for the publication medium, with one template to strip the exclamation point from the end of the greeting when output to paper, another to add a standard closing.
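For the curious, the difference between formatting a text and altering it can be sketched in a few lines of Python. This is a stand-in for what a stylesheet might do, not real XSLT and not anything the authors wrote. The function names are hypothetical, the body text is a placeholder, and the two "paper" rules simply follow the differences between the two letters described above: the encoded greeting carries the exclamation point while the printed one does not, and only the printed letter has a closing.

```python
# Scooter's encoded letter as ordered (element, text) pairs.
# Only the greeting and signature wording come from the essay;
# the body text is a placeholder.
letter = [
    ("greeting", "Hi Mom!"),
    ("body", "(body text omitted)"),
    ("signature", "Scooter"),
]


def render_faithful(doc):
    """Format only: one output line per content object, text untouched."""
    return [text for _, text in doc]


def render_paper(doc):
    """Hypothetical 'paper' templates that change content, not just format."""
    lines = []
    for element, text in doc:
        if element == "greeting":
            # Template 1: drop the exclamation point from the greeting.
            text = text.rstrip("!")
        if element == "signature":
            # Template 2: supply a closing the encoded letter never had.
            lines.append("Sincerely,")
        lines.append(text)
    return lines


print(render_faithful(letter))
print(render_paper(letter))
```

The first renderer honors the claim that content survives a change of output device. The second yields a letter whose words differ from the encoded source, which is the situation the two Scooters find themselves in.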
DeRose, S. J., Durand, D. G., Mylonas, E., and Renear, A. H. (1990), “What is Text, Really?”, Journal of Computing in Higher Education, 1.2: 3-26.
