This is the third in a series of six, and possibly seven, posts with the provisional title “Marking Uncle Tom’s Cabin: Typography, Race, and Textual Transmission.” See Part I: In which a space is not a space, if you’d like to start at the beginning. This series includes much-revised versions of presentations at the Midwest MLA Conference (Minneapolis, 2008) and the Society for Textual Scholarship (New York, 2008). The revised version is intended as a draft for an article to be submitted to a journal. Comments are appreciated.
But even “does-n’t [sic]” is a form of abstraction that from some perspectives does not do justice to the documentary form of the text or to meanings independent of documentary forms. The word in the Jewett edition is divided by a line break, and the abstracted form omits the line break. Even a photographic reproduction does not do complete justice to the ontology of the object, understood as a whole range of material and social conditions that may have shaped the documentary artifact. A textual paradigm that attends to these factors has been called the “bibliographical orientation.” During the past two decades, this approach, often associated with theorists D. F. McKenzie and Jerome McGann, has unseated authorial intentionalist editing from its position as the dominant paradigm for Anglo-American editing. If taken as an editorial paradigm, the bibliographical orientation makes it difficult to take up the traditional editorial task of emending because, as Peter L. Shillingsburg notes, it “does not admit to any parts of the text or of the physical medium to be considered nonsignificant and therefore emendable” (SECA 23).
Strictly speaking, any act of editing involves judgment. To select a single copy of a work for photographic facsimile is still an act of judgment. Regardless, the bibliographical orientation has been associated with the practices of photographic facsimile and digital editing (and elements are embodied in McGann’s own Rossetti Archive), and the digital too is a physical condition. The physicality of digital media is often inaccessible to the user’s conscious attention, and theorists of media have argued that digital acts of remediation express “the desire to get past the limits of representation and to achieve the real” (Bolter-Grusin 53). But by outlining markers for line breaks in the oral-print divide, as well as markers along the contours both of an historical typographical condition and of some present-day digital typographical conditions, we can show that the conversion from oral to written and from typeset by hand to digital transcription depends upon multiple levels of abstraction. By investigating abstraction from the perspective of computable definitions embedded within encoding systems, we begin to see the consequences of translating a hyphen and a line break from one medium (print) into another (digital).
According to McGann, texts are “autopoietic mechanisms operating as self-generating feedback systems that cannot be separated from those who manipulate and use them. Their autopoiesis functions through a pair of interrelated textual embodiments we can study as systems of linguistic and bibliographical codings” (Textual 15). If one attends to bibliographical codings, a concept named by McGann for an approach pioneered by McKenzie, every physical aspect of a text is part of its ontology. This view challenged the dominant paradigm for Anglo-American editing (associated with Sir Walter W. Greg, Fredson Bowers, G. Thomas Tanselle, and the MLA’s Committee on Scholarly Editing) because the abstract work or version cannot be invoked to authorize emendation of the text of a particular document. The dispute is usually waged on the level of definition and ontology. Whether “text” is an abstraction that exists independently of any individual document depends finally on one’s philosophy of text. McGann, for example, insists that the only condition of texts is the physical (qtd. in Shillingsburg Resisting 40).
A classic work on how meaning is differently embedded in oral and documentary forms of the same work is D. F. McKenzie’s “The Sociology of a Text: Oral Culture, Literature, and Print in Early New Zealand.” McKenzie shows that the work known as the Treaty of Waitangi ceded sovereignty of Maori territory to Britain in the English documentary version. But McKenzie’s reconstruction shows that the Maori oral version of the treaty cannot have the same meaning. The belief in “high-level literacy of the Maoris in the 1830s,” which validates the English document of the treaty, “is a chimera, a fantasy creation of the European mind” (113). For the Maori, the oral discussion and agreement had higher authority: “the very form of public discourse and decision-making was oral and confirmed in the consensus not in the document” (117). The university classroom is also no stranger to such separation of oral performance and documentary authority. When undergraduates read poetry they dissipate the fantasy that a literate person translates from written to oral form without abstraction. When a reader pauses at the end of an enjambed line in an anthologized poem by Alexander Pope or John Milton, he or she reminds us that the skill of poetry recitation is not a quality of the natural being but a cultural convention. Every translation from oral to written or from written to oral, from making treaties to reading poetry, includes reminders that our seemingly seamless negotiation is a learned convention.
By focusing attention on the means of makers, whether the makers create space or end lines, we can see that translation of texts originally printed from type or from stereotype plates to machine-readable form differs little in kind from translations from oral to print or from print to oral–but the difference to me is in the nitty-gritty details untrodden by critical angels. I’ve already cried and dried the tears over my fallen state, but the maddening complexities of translating space and line break from stereotype impressions to transcribed digital texts offer a great deal of fodder for reflection. So, critical angels, hold tight while I drag you through my private slough of despond. And, print-shop devils, please curse at me when like Little Eva undirtied with the steamboat coal dust I emerge at the end with my hands still unmarked with printer’s ink.
A compositor in the mid-nineteenth century had in his type case–and it usually was a he at mid-century–physical objects with an extraordinary flexibility to justify lines of type. A compositor’s means to match the width of the line of type to his stick was elegant in its simplicity. Rather than reaching for a generic space, which did not exist, the typesetter reached for the space appropriate to the needs of the individual line. The basic spaces between words in a line of type had one of three lengths. But the width of a space is always relative to the size of the type font, thus to speak of the width of a typographical space is to speak in relative terms about the relationship of a space character to the standard measure for type, the em quad, a square piece of type. The typical space between words is known as a thick space. Three such spaces are the width of an em quad, so a thick space is three-to-the-em wide. A middle space, four to the em, is used for tighter spacing. And a thin space, usually five to the em, is used for punctuation and very tight spacing (Gaskell 45). The three basic widths of spaces between words–1/5 em or thin, 1/4 em or middle, and 1/3 em or thick–and end-of-line hyphenation offered the compositor flexibility and precision.
But even these three do not exhaust the spaces in the type case. The 1-em space, an em quad, is itself a space for the standard functions of indenting a paragraph and of separating sentences. A hair space (1/6 em) was used to separate punctuation. Another specialized space, the en quad, 2 to the em in width, was used for spacing headlines of small caps letters. And longer spaces, of 2-em or 3-em width, were used to pad out a line or for a blank line. A late-nineteenth-century printer’s guide advises that a compositor could also combine two or more spaces of different width (Southward 171, 166). To return then to the hyphen that is not a hyphen in Uncle Tom’s Cabin, the width of the “space” that the hyphen seemingly represents is not generic: the space separating parts of contractions in the two-volume Jewett edition is either a middle space (1/4 em) in normally spaced type or a thin space (1/5 em) in tightly spaced type.
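The widths quoted in the last two paragraphs can be tabulated as fractions of an em. What follows is only a sketch in Python: the names SPACES, width_in_points, and combine are invented for illustration, and the 12-point type size in the usage note is an assumed example, not a measurement of the Jewett edition.

```python
from fractions import Fraction

# Widths as fractions of an em, using the figures quoted in the text
# (thin, middle, thick from Gaskell; hair space and quads as described above).
SPACES = {
    "hair space": Fraction(1, 6),
    "thin space": Fraction(1, 5),
    "middle space": Fraction(1, 4),
    "thick space": Fraction(1, 3),
    "en quad": Fraction(1, 2),
    "em quad": Fraction(1, 1),
}


def width_in_points(name, type_size_pt):
    """Width of a named space for a given type size.

    Assumes the em equals the body size of the type, so every
    width is relative to the font, never absolute.
    """
    return SPACES[name] * type_size_pt


def combine(*names):
    """Total width, in ems, of several spaces set side by side."""
    return sum((SPACES[n] for n in names), Fraction(0))
```

In a hypothetical 12-point font, width_in_points("thick space", 12) comes to 4 points, and combine("thin space", "middle space") comes to 9/20 of an em, a width that no single piece in the case supplies.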
The form of the digital line break, abstracted out of my shorthand “does-n’t [sic],” depends both on an encoding system and on a computer implementation. In ASCII alone, a line break has three different forms. A line break might be signaled by a Line Feed character (LF, hexadecimal form 0x0A, decimal form 10), by a Carriage Return character (CR, 0x0D, 13), or by both characters in series, CR+LF. UNIX systems (including Mac OS X) use LF, pre-UNIX Macintosh systems (through Mac OS 9) use CR, and Windows systems use CR+LF (http://en.wikipedia.org/wiki/Newline, 29 May 2009 10:51 am EST). These conventions divide the basic operating systems according to a conceptual understanding of the relationship between text viewed on screen and text as printed matter, whether a carriage return is implicit in a line feed, whether a line feed is implicit in a carriage return, or whether we’re just going to get along by insisting on redundancy after the origins of a line break for on-screen display or for printout have faded from our digital conscience.
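The three ASCII conventions can be told apart by counting bytes. The following is a minimal Python sketch, not a robust tool; the function names line_break_style and normalize_to_lf are hypothetical, and the sample strings in the usage note are invented.

```python
def line_break_style(data: bytes) -> str:
    """Report which ASCII line-break convention a byte string uses."""
    crlf = data.count(b"\r\n")
    # Count only the CRs and LFs that are not part of a CR+LF pair.
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf
    counts = {"CR+LF": crlf, "CR": cr, "LF": lf}
    used = [style for style, n in counts.items() if n > 0]
    if not used:
        return "none"
    if len(used) > 1:
        return "mixed"
    return used[0]


def normalize_to_lf(data: bytes) -> bytes:
    """Rewrite every CR+LF or bare CR as a single LF."""
    # CR+LF must be replaced first, or each pair would become two LFs.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
```

Given the bytes b"does-\r\nn't", line_break_style reports "CR+LF"; after normalize_to_lf the same word is broken by a single LF. The line break survives the conversion, but its form has been silently edited.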
Such disagreements are not irresolvable–and note that I have simplified the extent of the disagreement by ignoring other systems–and the UNICODE standard resolves them (and the need to transfer files between systems) by specifying that UNICODE-conformant applications must recognize seven characters as line terminators: Line Feed, Carriage Return, Carriage Return followed by Line Feed, Next Line, Form Feed, Line Separator, and Paragraph Separator. Literary scholars can typically ignore these distinctions, but a textual editor or a student of digital humanities (who might want to alter texts with a Perl script) forgets these things at the price of days–and sometimes weeks–of frustration. In regular expressions, a general-purpose text-matching tool common to many programming languages, the dot (.), the general-purpose expression for every character, by default means any character EXCEPT a line break. One expression to match all characters uses classes: the class of all digits, signaled \d, and the class of all non-digits, signaled \D, are combined between brackets into a single expression ([\d\D]) that defines “every character” such that line breaks too are counted as characters (Schwartz 102, 106).
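The behavior of the dot can be checked directly. The sketch below uses Python’s re module rather than Perl, on the assumption that the default treatment of the dot is the same for this purpose (in Python the excluded character is specifically the Line Feed). The sample strings are invented, and str.splitlines is offered only as an analogue of the UNICODE list: it recognizes the terminators named above along with a few others.

```python
import re

# A word broken across a line, as in the shorthand discussed earlier.
text = "does-\nn't"

# The dot stops at the line feed, so only the first line is matched.
dot_match = re.match(r".*", text).group(0)

# Digits and non-digits together cover every character, line feed included.
class_match = re.match(r"[\d\D]*", text).group(0)

# Python also offers a flag that lets the dot match the line feed.
dotall_match = re.match(r".*", text, re.DOTALL).group(0)

# One invented string containing six different line terminators:
# LF, CR+LF, Line Separator, Paragraph Separator, Next Line, Form Feed.
sample = "one\ntwo\r\nthree\u2028four\u2029five\x85six\x0cseven"
lines = sample.splitlines()

print(repr(dot_match))    # 'does-'
print(repr(class_match))  # "does-\nn't"
print(lines)              # seven separate strings, 'one' through 'seven'
```

The first match ends at the hyphen; the second and third run across the break to the end of the word. A script that assumes the dot means “everything” will quietly treat each line of a transcription as a separate text.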
Unlike the complexity of line breaks–which separate OS partisans into lines with picks and axes–the complexity of space in historical typography has not raised the hackles of humanists. While only a handful of historians and literary scholars had realized that a digital revolution was in progress–and while knowledge of differences in typography remained a sign of a mind diseased with printing history and bibliographical lore–a mathematical genius solved most problems of space representation for digital typography. In Donald Knuth’s TeXbook (1984), the guide for the publishing system TeX, he offered on one page a series of control sequences that satisfactorily address historical printing practices in his digital typesetting system. He divided the em quad into 18 units of math glue: thin space (1/6 quad, 3 units of math glue), medium space (2/9 quad, 4 units of math glue), thick space (5/18 quad, 5 units of math glue), and negative thin space (no such physical being, -1/6 quad or -3 units of digital glue) (167). When genius leads, consensus follows, and the UNICODE standard now permits us to encode XML texts with the specificity that Knuth long ago allowed in digitally typeset text. See the UNICODE General Punctuation chart, where thick space equals three per em, mid space equals four per em, thin space equals six per em, and hair space equals thinner than thin. Those who would complain about minor discrepancies between Knuth’s definitions and Southward’s and UNICODE’s–Knuth’s 5/18 quad differs by an eighteenth of an em from Southward’s 1/3 em–are more persnickety than I, but the combination of multiple spaces, including the negative thin space permitted in TeX, really ought to satisfy everyone that historical typographical space can be represented in digital type.
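The size of these discrepancies is a matter of simple arithmetic. With 18 math units to the quad, Knuth’s 3, 4, and 5 units come to 1/6, 2/9, and 5/18 of a quad. The Python sketch below sets those values beside the hand-set widths of 1/5, 1/4, and 1/3 em given earlier; all of the names in it are hypothetical labels chosen for this illustration, and pairing TeX’s “medium” with the printer’s “middle” space is an assumption of convenience.

```python
from fractions import Fraction

MU_PER_QUAD = 18  # Knuth divides the quad into 18 math units

# Math units for each TeX space, as listed in the text.
TEX_SPACES_MU = {"thin": 3, "medium": 4, "thick": 5, "negative thin": -3}

# Hand-set widths in ems, as quoted earlier from the printing manuals.
HAND_SET = {
    "thin": Fraction(1, 5),
    "medium": Fraction(1, 4),
    "thick": Fraction(1, 3),
}


def tex_width(name):
    """Width of a TeX math space as a fraction of a quad."""
    return Fraction(TEX_SPACES_MU[name], MU_PER_QUAD)


def discrepancy(name):
    """Hand-set width minus TeX width, as a fraction of an em."""
    return HAND_SET[name] - tex_width(name)


for name in HAND_SET:
    print(name, tex_width(name), discrepancy(name))
```

The differences come to 1/30 em for the thin space, 1/36 em for the medium, and 1/18 em for the thick. The negative thin space has no hand-set counterpart to compare.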
These fine distinctions of historical texts could be translated into digital form, but in historical translations (with the allowance that online versions preceded successful implementations of UNICODE) one often finds the digital preference for presence or absence, for on or off, for 1 or 0. Most digital projects treat typographical space as either present or absent. So, for example, the Early American Fiction site, one of the most carefully transcribed digital collections, which includes Uncle Tom’s Cabin, does not distinguish among thin, medium, and thick spaces. Medium and thick spaces are usually transcribed as spaces, but thin spaces are sometimes transcribed as spaces, sometimes omitted.
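What the binary treatment discards can be made concrete. The Python sketch below builds an invented transcription that records space widths with characters from the UNICODE General Punctuation range, then collapses them to the generic space. The sample phrase, the variable names, and the function names are hypothetical; they do not reproduce the practice of any actual project.

```python
import unicodedata

THICK = "\u2004"  # THREE-PER-EM SPACE
MID = "\u2005"    # FOUR-PER-EM SPACE
THIN = "\u2009"   # THIN SPACE

# Invented example: a contraction set with a thin space, then a thick space.
fine = "does" + THIN + "n\u2019t" + THICK + "know"


def space_names(text):
    """UNICODE names of every space-separator character in the text."""
    return [unicodedata.name(ch) for ch in text
            if unicodedata.category(ch) == "Zs"]


def collapse_spaces(text):
    """Replace every space-separator character with the generic U+0020."""
    return "".join(" " if unicodedata.category(ch) == "Zs" else ch
                   for ch in text)


print(space_names(fine))                   # ['THIN SPACE', 'THREE-PER-EM SPACE']
print(space_names(collapse_spaces(fine)))  # ['SPACE', 'SPACE']
```

After the collapse the two spaces are indistinguishable, and nothing in the resulting text records that a distinction was ever there.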
Space widths may exist in the original documents, but if the encoding system for translation into digital form does not recognize the distinction present in print forms, the fine distinctions that I discussed are edited out of the transcription. And I am not aware that former generations of textual editors have treated space width as a significant feature, even if in their silent labors they may have fretted. Fredson Bowers, in “Some Principles for Scholarly Editions of Nineteenth-Century American Authors,” one of the notable statements on editorial practice, condemned the practice of modernization for 19th-century books, in the case of “spelling, punctuation, capitalization, word-division, or paragraphing” (SiB vol. 17, 223). But Bowers did not mention typographical spacing. Bowers imagined scholarly editions in the codex form, and any process of re-setting for a book edition will introduce new line breaks and thus demand new spacing if the scholarly edition would display professional quality printing.
But it cannot be said that typographical spacing is haphazard in any of the four versions of Stowe’s Uncle Tom’s Cabin that I’ve examined. Each edition has a style and a preference–a design. We might remind ourselves, before we dismiss this work as incidental craft, that a compositor in the hand-setting era expended labor on setting typographical space comparable to the labor of setting the letterforms themselves. Had someone edited Uncle Tom’s Cabin in a scholarly edition in print form–it is a curious feature of literary scholarship that no one yet has–the cost in paper to burden readers with expansive lists of minor alterations in typographical space would have militated against recording variations in typographical space. As we have seen, most editors who have reprinted the 2-volume Jewett edition treat the minor differences in typographical spacing as inconsequential, if they consciously note them at all. The process of editing present-day reprints for scholars is in part an inheritor of modernist ideas about type. Digitization too inherits modern ideals about type, and the process of digitizing, like the process of reprinting, renders invisible the work of earlier generations.
Coming soon: Part IV: The Modern Era: When Type Became Visible
Bolter, Jay David and Richard Grusin. Remediation: understanding new media. Cambridge Mass.: MIT Press, 2000. Print.
Gaskell, Philip. A new introduction to bibliography. New York: St. Paul’s Bibliographies and Oak Knoll Press, 1995. Print.
“General Punctuation Range: 2000-206F.” Unicode Home Page, Version 5.1. 20 Jun 2009 http://www.unicode.org/charts/PDF/U2000.pdf. Web.
Knuth, Donald. The TeXbook. Reading Mass.: Addison-Wesley, 1991. Print.
McGann, Jerome. The textual condition. Princeton N.J.: Princeton University Press, 1991. Print.
McKenzie, D. F. Bibliography and the sociology of texts. Cambridge: Cambridge Univ Press, 1999. Print.
“Newline – Wikipedia, the free encyclopedia.” 20 Jun 2009 http://en.wikipedia.org/wiki/Newline. Web.
Schwartz, Randal, Tom Phoenix, and brian d foy. Learning Perl. 4th ed. Sebastopol CA: O’Reilly & Associates, 1997. Print.
Shillingsburg, Peter. Resisting texts : authority and submission in constructions of meaning. Ann Arbor: University of Michigan Press, 1997. Print.
—. Scholarly editing in the computer age: theory and practice. 3rd ed. Ann Arbor: University of Michigan Press, 1996. Print.
Southward, John. Modern printing: a handbook of the principles and practice of typography and the auxiliary arts. 3rd ed. London: Raithby Lawrence & Co., 1912. Print.
Stowe, Harriet Beecher. Uncle Tom’s cabin, or, Life among the lowly (1852). University of Virginia Library Digital Collections. Boston: John P. Jewett, 2003. 20 Jun 2009. Web.