Counting Characters in the Gettysburg Address

Janet Maslin, in her 4 Dec. 2006 New York Times review of Plowing Hallowed Ground, Gabor Boritt’s book on the Gettysburg Address, offers a backhanded compliment:

“And he can split hairs with the most obsessive experts in this field. (How many characters were in the speech: 1,243? 1,397? 1,402?)”

You would think that scholars could at least figure that out, but it’s not as easy as it seems. For example, see the Library of Congress (LOC) exhibit of the Gettysburg Address drafts. Let’s take an itty bitty sample.

In the Nicolay manuscript draft, the words all men are created equal are between double quotes, but there is no period. So, let’s count the “characters” in the handwritten text of this quotation. I count 21 alphabetic characters and an opening and a closing quote mark, but no period. Therefore, 23 characters.

The Hay draft, however, has no quote marks. If Lincoln in this handwritten draft is citing the words of the principle from the Declaration of Independence without marking them as a quotation, then there are 22 characters: 21 alphabetic characters and the period. Does it matter whether Lincoln marks off the quotation? That depends on several considerations, beginning with Lincoln’s usual practice. Did he make a mistake when copying? Did he choose not to draw attention to the source in his speech? Might he have wavered or varied on whether to draw attention to the fact of quotation in his speech? Is the principle a quotation from another document or a principle that is applicable independently of the Declaration? Thus, the presence or absence of these quotes might be significant for reasons other than obsessiveness about character counts. And yet, these “accidental” variations (Sir Walter Greg’s term, in “The Rationale of Copy-Text,” for features such as punctuation) are not highlighted as significant in the transcription section of the LOC site.

What if, however, the speech is transcribed on a monument or set in type? New problems for the character count emerge. Unlike the type transcription reportedly in the Oval Office, the monument transcription, as you can see in this high-resolution photo on Wikipedia, is all caps, and it looks much more like this:

ALL [line break]

But not quite, and here blog technology fails as a representation. The “period” is not a type or computer period. It’s a monumental inscription period, which is raised to the middle line, after a space. Because it’s a monument inscription, does it really make sense to speak of “space” characters? So let’s say 22 characters.

Note. It depends. Contemporary monument scripts are generated (according to a stonecutter at the Small Special Collections Library) by sandblasting a die-cut piece of vinyl pasted onto the marble. The vinyl is generated from computer output. A space in monument script is thus, in the production sense, a space character. But the Lincoln monument predates computer typesetting.

The LOC site’s Oval Office version of the monument inscription is a bit loopy, which in part derives from the translation of monument script into type. In type there are space characters. When text is set in type by hand (or with Linotype, as this facsimile may be), a “space” is an uninked character. Spaces have varying widths (as individual metal pieces or as Linotype codes). But we’ll count each space between words as one “character.” No quotes and a period, so 26 characters.
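The four counts so far (23, 22, 22, 26) can be reproduced with a few lines of Python. This is only a sketch: the strings are stand-ins I typed for the documents discussed above, not transcriptions of the originals, and the names count_marks and count_with_spaces are made up for the illustration. The point is that the counting rule changes with the medium, so the same five words yield different totals.

```python
# Stand-in strings for the same five words in four representations.
# They are illustrative assumptions, not facsimiles of the documents.
PHRASE = "all men are created equal"


def count_marks(text: str) -> int:
    """Count every character except spaces (handwriting, inscription)."""
    return sum(1 for ch in text if not ch.isspace())


def count_with_spaces(text: str) -> int:
    """Count every character, with each word space as one character (type)."""
    return len(text)


nicolay = "\u201c" + PHRASE + "\u201d"  # opening and closing quote marks, no period
hay = PHRASE + "."                      # no quote marks, but a period
monument = PHRASE.upper() + " \u00b7"   # all caps, raised period after a space
typeset = PHRASE + "."                  # set in type: word spaces are characters

counts = {
    "Nicolay": count_marks(nicolay),
    "Hay": count_marks(hay),
    "monument": count_marks(monument),
    "type": count_with_spaces(typeset),
}
print(counts)
```

Change the rule (count the spaces in the handwritten drafts, say) and every number moves, which is the trouble with asking for “the” character count.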

Other Notes:
Should the word be “cannot” as in the type transcription of the monument or “CAN NOT” as in the monument inscription? Should it be “Fourscore” or “FOUR SCORE”? Notice on the monument inscription that ellipses (in the LOC print transcription) are tilde shapes and are indistinguishable (at least to me) from hyphens on that inscription. Do hyphen shapes matter? Ask an Emily Dickinson scholar.

We could go on. Is a line break a character? Is a hyphen? Is a ligatured combination fi or ff (occurs in type, not in monument script) two characters in handwriting and one character in hand-set type? Well, yes. Is a line break a character in a monument inscription? No. In hand-set type? If at justified line-end, no. If at unjustified line-end, yes.
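The ligature question survives into computer text. Unicode happens to include a single code point for the fi ligature (U+FB01), so the same word can be stored as four characters or five. A small Python sketch, using “final” only as a convenient word from the Address; nothing here is claimed about how any particular edition is actually encoded:

```python
import unicodedata

spelled = "final"        # f and i as two separate characters
ligated = "\ufb01nal"    # U+FB01 LATIN SMALL LIGATURE FI, then "nal"

print(len(spelled))      # characters when f and i are separate
print(len(ligated))      # characters when fi is a single ligature

# Compatibility normalization (NFKC) splits the ligature back into f + i,
# which is one way a character counter might decide to "translate" the text.
unfolded = unicodedata.normalize("NFKC", ligated)
print(unfolded == spelled)
```

Whether the counter normalizes first is an editorial decision, just like silently supplying a period.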

Computer representations bring other issues. Microsoft Word’s code display tells the user that a line break is one character, signaled by a manual line break or by a paragraph symbol. To display an HTML line break in a browser takes four characters, <br>, or three, <p>. Underneath (at the level of machine representation), it differs by operating system. Windows uses a <CR> and an <LF> character to break a line, Unix an <LF> only, and the classic Macintosh (before OS X) a <CR> only. Decimal values? Hexadecimal values? Ultimately, inscriptions on hard disks. Yes, different.
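To make the line-break arithmetic concrete, here is a rough Python sketch. The two lines of text and the place where they break are my own choice, not taken from any of the documents; the dictionary labels simply follow the three conventions named above.

```python
# Two lines of the Address, broken at an arbitrary point chosen for illustration.
lines = ["Four score and seven years ago", "our fathers brought forth"]

# Line-break conventions: CR is decimal 13 (hex 0d), LF is decimal 10 (hex 0a).
endings = {
    "Windows": "\r\n",            # <CR><LF>
    "Unix": "\n",                 # <LF>
    "classic Macintosh": "\r",    # <CR>
}

sizes = {}
for name, eol in endings.items():
    data = eol.join(lines).encode("ascii")  # the bytes that would go to disk
    sizes[name] = len(data)
    decimal = [ord(ch) for ch in eol]
    print(name, "decimal", decimal, "hex", eol.encode("ascii").hex(), "bytes", len(data))

# The HTML tags count as ordinary characters in the page source.
print(len("<br>"), len("<p>"))
```

The Windows version of the same two lines comes out one byte longer than the other two, before anyone has argued about a single quote mark.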

There are multiple texts of the Gettysburg Address. Because the texts vary, each will have its own character count. Handwritten documents must usually be translated into the “intended” utterance. For example, if in one’s judgment Lincoln probably “intended” to end the first paragraph with a period or “intended” to use quotation marks, an editor (or character counter) might choose to silently add the character for him. Does it then count as an additional character? The documents are also translated into multiple representation systems. Any representation in an alternate system (handwritten or hand-set type or Linotype or computer type or monument inscription) will produce alternate character counts. The difference is not imperceptible (and may be important) if one is willing to read original documents or to read text technology at the level of the character.

The alternative is to let the “most obsessive experts” do it for you. Are you going to trust a person who would stoop to count the characters in the Gettysburg Address? Or is it more cool to dismiss the work of such folks as probably irrelevant to the real work of thinking and criticism and remain unbothered with such trivialities as detail, accuracy, encodings, and alternate texts?

