Learning Python (Week 4)

After last week’s distressing discovery that CollateX is now only available as a Python module, I decided to hunker down and get serious about learning Python. I’m an educated person: I can learn a programming language, even if the only reason to do so is so that I can use CollateX. Should it prove that CollateX is not suitable for my work, then I console myself that automating regular expression routines (I use RegEx extensively within text editors) will itself be worth the investment in time and the new skill.

Below are the ways that I’m studying Python:

  • Learning about managing two different environments, Python 2.7 and Python 3.4, on the same machine. First, download and install Anaconda Python. Then take the 20-minute Test Drive, which explains how to switch between environments.
  • I need two Pythons because the Python course I’ve been taking from edX, Introduction to Programming and Computation, uses version 2.7, and CollateX is supported under 3.4. I keep returning to the course, even though I’m using it in archive mode (I’m on week 4 of an 8-week course that ended several weeks ago), because one still has access to lectures and auto-graded exercises.
  • Three other courses/tutorials I’ve been working through or sampling from are the official Python Tutorial, Learn Python the Hard Way (also a book), and Automate the Boring Stuff with Python (also a book). PS: Book history types, they’re both rather interesting to think about as book-course combos.

With another serious week of Python study (Python all the time for the last few days) under my belt, on top of this summer (when I spent 3 semi-serious weeks plus one desultory week with the Guttag-Grimson-Bell course before teaching responsibilities kicked in), I now turn back to figuring out how to run CollateX with Python, because that’s the option that remains available. I’m an English professor. If you wonder why I’d put myself through this, it’s kinda-sorta Digital Humanities but for the most traditional of purposes: I’m a philologist and want to learn how to use CollateX to help automate editing. For my initial panic, see CollateX, Python, Anaconda, Oh My: Or, What Have I Done? (Week 3 Reflections).

What’s next?

  1. Review Birnbaum’s Obdurodon instructions for importing files, and review Python instructions for manipulating files with RegEx and saving results as another file.
  2. Learn to read JSON and Dictionaries
  3. Encode UTC transcriptions as XML, preferably with sentences and verse lines individually numbered. (PS: I always think of that as turning Uncle Tom’s Cabin into a Bible. Book-chapter-verse numbering is an amazing invitation. Philosophy texts are often numbered by book and section, and scholarly versions add numbering to paragraphs for reference purposes.)
  4. Re-learn XSLT (for the nth time)
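
For item 1, here is a minimal sketch of what "manipulate a file with RegEx and save the result as another file" might look like in Python 3. The function name transform_file and the file names are my own placeholders, not anything from Birnbaum's instructions, and the pattern is only an example.

```python
import re
import tempfile
from pathlib import Path


def transform_file(source, target, pattern, replacement):
    """Apply one regex substitution to source and save the result as target.

    Returns the number of replacements made, which is worth writing down.
    """
    text = Path(source).read_text(encoding="utf-8")
    new_text, count = re.subn(pattern, replacement, text)
    Path(target).write_text(new_text, encoding="utf-8")
    return count


# Demonstration on a throwaway file so that no real transcription is touched.
workdir = Path(tempfile.mkdtemp())
original = workdir / "wit1.txt"        # hypothetical witness file
cleaned = workdir / "wit1_clean.txt"   # hypothetical output file
original.write_text("Uncle  Tom's   Cabin\n", encoding="utf-8")

# Collapse every run of two or more spaces into a single space.
changed = transform_file(original, cleaned, r" {2,}", " ")
print(changed, repr(cleaned.read_text(encoding="utf-8")))
```

Writing to a second file rather than overwriting the first keeps the original transcription intact, which matters when the source files are the archival record.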

CollateX, Python, Anaconda, Oh My: Or, What Have I Done? (Week 3 Reflections)

Somewhere in the previous two or three posts I explained that I want to engage more sophisticated collation tools, which for me includes CollateX. Therefore, I decided this past week to get really engaged. Upon turning to the site at CollateX.net, I find that there is no longer a Windows or Mac command line version, as there was in CollateX 1.5. Now with version 1.6, it’s something else, a Java archive, and I’m not really sure what that means.

I play a DH scholar on TV (I’ve encoded texts in XML, published peer-reviewed scholarship on Scholarly Editing and the Whitman Archive, and done some XSLT development for Blake Archive), but I’m still an English major at heart. Mostly I read things, so this is distressing to me. But, okay, I have the enormous privilege of a research leave semester. If I’m going to learn something new and technical, it’s going to be when I have intensive time to devote to it. So I may as well. Take a few deep breaths, and here I go.

First I tried figuring out what the heck to do with the new CollateX download. Being the naive sort, I went to the directory and tried a version of what worked with Version 1.5. This command worked in CollateX, version 1.5, when one had the files one wanted to compare in the bin directory:

./collatex wit1.txt wit2.txt > wit1wit2compare.txt

I wondered whether (but doubted that) it would work with CollateX, version 1.6, this new Java version. But I’m not expecting much cause there’s no longer a “bin” directory.

java -jar collatex-tools-1.6.1-jar wit1.txt wit2.txt > wit1wit2compare.txt

The result? not able to access jar file. OK, so that’s not going to work. A command displays the documentation:

java -jar collatex-tools-1.6.1-jar -h

This is what the “documentation” looks like:

[Screenshot of the CollateX help output, 20 September 2015]

I try what the documentation says, so:

collatex wit1.txt wit2.txt > wit1wit2compare.txt

And I get…nada. Command not found. I realize that there shall be more floundering in a technical hellscape. Then I recall something that Ronald Dekker had tweeted in reply to one of my Twitter questions:

I had started studying Python this summer. I’m glad I did, because now it looks like I may have no choice. I know the people who put together these tools are wicked smart, and, when they are academics, I’ve found them to be genuinely nice people, but (as I said) my English major heart is having palpitations.

Surely, I’m not the only one with this trouble. Isn’t Google search my friend? Yes, so I search for “collatex install” and hope for the best. Surprisingly, half-way down the first page of results is “Python 3 and CollateX installation instructions – oo.” The “oo” is a bit worrisome, but it is on the Obdurodon server, which I recognize. And I know the scholar from other work, his “Even Gentler Introduction to XML,” which I have assigned to students: it’s David Birnbaum. Alright, despair, away with ye for now.

I followed the Python and CollateX install instructions. This is not a namby-pamby version but serious geek tools, “an enterprise-ready Python distribution for large-scale data processing” called Anaconda. It’s not “clickable,” so to the command line. And I apparently have something called Pip, which is like HomeBrew or MacPorts, a command-line install routine (not so bad, I’ve been around this kind of block with LaTeX and installed Gimp with MacPorts). More Pip for Levenshtein (harmless process, will read about that later), and then GraphViz both as a separate program (had that, now updated) and something called Python bindings for GraphViz. No idea, just need to do it, but almost skipped that step. Some 3 hours later (minor problems with XCode install having hung, Python 2.7 showing up on iPython Notebook, moments of confusion about the Python 3 requirement interfering with Python 2.7, reading Anaconda documentation), I have CollateX installed and running in iPython Notebook.

Now, I’m trying to get my head round the fact that I now have an industrial-strength Python distribution (farewell Monty Python and Eric Idle jokes), what possible reason there is to launch a web server to run iPython Notebook, what iPython Notebook is, and where on God’s green earth (though I suspect in my file system) the files that I want to collate should be.

Side note here: I’m not trying to save the world. I’m trying to collate 5 very accurate transcriptions, transcribed by myself and others typing and read aloud for proofreading, in a very arts-and-crafts sense. I’m a bookish person: I treasure books as individual physical objects, and I gather up the fragments and put them in little baggies when old bindings or paper fragments crumble and totally sympathize with others who do the same thing. I need computers to automate my editorial work, not do it for me. But I’m spending an awful lot of time trying to figure out whether computers can do the work that I need them to do.

Now, time to begin the CollateX tutorial. Cause of course all that’s not the actual work, just the setup to be complete before the work can begin. An afternoon I spent working my way through this. Ooh, step by step with iPython, I think I can do this. This is all well and good, but my trouble is in 119, when transcribed texts are inserted into Python script and viewed on screen. That’s weird: no one would do that except a computer person teaching a tutorial for demonstration purposes. What I really need is to collate exterior files. And yeah, I see exterior files, sample collation files from Barbara Bordalejo’s neat Darwin Online project. But how do you pull in external files? This is what the tutorial says:

Part 3: Reading multiline input from files (watch this space)
Part 4: Creating XML output (watch this space)

You can expect me to be watching this space daily for the next two months. Enough for one day. Then, on Monday, I wake up and remember there’s email. So I (with hope) send David Birnbaum an email message. Not 30 minutes later–I kid you not; I just walked dog around block after sending–he replies with new instructions posted to GitHub, at https://github.com/ljo/collatex-tutorial/tree/master/unit5.

This is where my weekend work ended, perched between hope and fear, as I needed also to do other work, to demonstrate a class project at the library and to write a letter of recommendation. Maybe I’ll figure out how to do this, with its 180 lines of Python code, and maybe I won’t. And what will I do with JSON output, when I don’t even know what that is?
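
On both open questions (pulling in external files, and what JSON is), a small sketch may take some of the fear out. JSON is just a text format for nested lists and dictionaries, and Python's standard json module converts in both directions. The layout below, a "witnesses" list of "id" and "content" pairs, follows my understanding of the JSON input that CollateX accepts, so treat it as an assumption to check against the documentation for the installed version. The file names and sample sentences are invented.

```python
import json
import tempfile
from pathlib import Path


def witnesses_to_json(paths):
    """Read each transcription file and return one JSON string for all of them."""
    witnesses = []
    for path in paths:
        path = Path(path)
        witnesses.append({
            "id": path.stem,  # file name without extension, e.g. "wit1"
            "content": path.read_text(encoding="utf-8"),
        })
    return json.dumps({"witnesses": witnesses}, indent=2, ensure_ascii=False)


# Throwaway files standing in for real transcriptions (invented sample text).
workdir = Path(tempfile.mkdtemp())
(workdir / "wit1.txt").write_text("The cat sat on the mat.", encoding="utf-8")
(workdir / "wit2.txt").write_text("The cat sat upon the mat.", encoding="utf-8")

json_text = witnesses_to_json(sorted(workdir.glob("*.txt")))

# Reading JSON back gives ordinary Python dictionaries and lists.
data = json.loads(json_text)
for witness in data["witnesses"]:
    print(witness["id"], "->", witness["content"])
```

The point of the round trip is that JSON output from any tool can be opened the same way: json.loads turns it into dictionaries and lists that can be inspected one key at a time.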

The project that I had to finish up on Monday is the Drupal publication of letters of Alfred Chester, which students in my DH course got into near-publishable state at end of class. But we had to wait for permissions from Chester’s executor Edward Field, which he provided last month. Today I tried very much to figure out how to publish letter images with TEICHI Framework, but I was ultimately stymied. I don’t feel like I can go back to that, as it may take several days, and I need to focus on this Python CollateX work.

I am nervous and anxious again, especially after reading “Computer-supported collation of modern manuscripts: CollateX and the Beckett Digital Manuscript Project,” in which the idea is that automated collation is supposed to solve all kinds of problems for technically sophisticated projects. The projects that are imagined are well-funded and technically supported, not the work of a lonesome scholar at a regional state university. Gonna have to get more imaginative here, as I have no desire to be Boxer the horse from Orwell’s Animal Farm. I know, maybe I’ll use my annual $500 travel budget and travel as luggage and sleep on the street in Europe the next time that a CollateX seminar is offered. Or raid my high-school son’s college fund?

Caution (Profanity with Sexual Innuendo Follows)

You don’t have to keep reading, as no further information about collation follows.

And yes, it’s bawdy, but it’s really funny: Alfred Chester is a riot. And now that his letters are published online with the permission of his estate, I can share the funniest line, from his 22 May 1964 letter to Norman Glass, “Why do you grease your asshole if there is no one there to fuck it?” Since reading that letter, I have no longer cared about a tree falling unheard in the forest. Now, I think instead of Chester’s line (and chuckle to myself).


Reflections two weeks (and 2 days) into research semester

This post will be brief on last week’s tasks–for purpose of self-reminding of thoughts–so that I can focus on tasks ahead.

Last week turned again to business with the process of writing a book review, which I had promised to a journal editor by September 15. This grew out of someone noticing my work on Mother Whitman, and I was asked to review Gary Schmidgall’s Containing Multitudes: Walt Whitman and the British Literary Tradition. I was rather frustrated with the book, but I did not want to write a snarky review because I respect the effort that it takes to write a substantial scholarly book. Yet I want to make substantive objections clear when they interfere with the claims that the author is attempting to make or to hint at. I tried initially to write a measured draft, but it was just impossible as my true attitude kept ruining the tone. Eventually, over most of last Friday, I rewrote most of the near-complete draft without censoring my objections and produced a much better draft, though knowing I would revise again to be non-snarky.

The effort to remove the more pointed phrasings on Saturday and Sunday was not going very well, as I had become convinced of what I said. In exasperation, I turned to two sets of colleagues, Robert Trogdon, my department chair, and members of my hopefully-on-temporary-hiatus NE Ohio reading group (just reached out to regular members Jon Miller at Akron, Debra Rosenthal at John Carroll, Denise Kohn at Baldwin Wallace, though Robert Nowatzki and Adam Sonstegard at Cleveland State have also attended). Jon Miller and Robert Trogdon had time to provide some excellent advice, and I revised the draft Sunday evening and yesterday and forwarded it on to the Prose Studies editor. The process reminded me that when writing it’s often necessary to get something, anything done before the real work can begin. Much real writing begins after the notes are taken and the draft is complete. Also, stop underestimating how much work writing is.

The Uncle Tom’s Cabin scholarly project work was not abandoned, though the review cut into my Friday, Monday, and weekend free time. I completed the following:

  1. Using the Lindstrand Comparator to sight-collate 180 pages of the 1853 illustrated edition of Uncle Tom’s Cabin.
  2. Discovering that the encoding state of my 1853 Illustrated Edition and 1879 New Edition transcriptions was not quite up to snuff, and trying to figure out how to correct any that were not without causing more trouble.
  3. Having Google Drive scare the bejeebees out of me when I began to wonder whether it was randomly discarding old files.
  4. Breathing deeply, thinking slowly, making sure I understand what the issue is before I try to tear through files with REG-EX to fix it, writing notes to myself with reminders.
  5. Fixing problem 2. Two copies of the illustrated edition encoding were corrected against the original to conform to the more recent project standard, and the 1879 New Edition of 3 copies was rigorously checked to ensure that the encoding that I thought had been done was in fact done.
  6. Encoding several pages of Herman Melville’s manuscript of Billy Budd on TextLab. Cackling gleefully when I return to Stowe’s more legible MS.
  7. Reading quite a bit of John Locke’s Essay on Human Understanding and studying German in my free time.
  8. Doing student-related work and domestic chores and not reading nearly enough literary and historical criticism on Stowe, 19C, race, religious history, study of novel, etc.
  9. Getting angry at Kent State administration for faculty contract negotiation–because I’m an elected member of the AAUP Council.

With book review and encoding done, it’s now back to assessing the encoding standards for collation. Let me describe the editions briefly:

  • The encoding work began in 2006 while working on the National Era newspaper version. The copy has more errors than a book, but it’s a relatively clean draft.
  • The John P. Jewett first edition (1852) was corrected twice and type deteriorated over the first several printings that produced 160,000 copies or so. Another 140,000 copies or so would dribble out, and type would deteriorate badly. Very few errors, technically, but the declining quality of plates has led to errors in transcription, which I began to understand only after I started encoding Jewett Million edition.
  • In Million edition, which Stowe revised by adding a passage in chapter 20 and seems only to interest textual scholars, I again realized compositors used spaces in contractions and before punctuation. Allowed me, for example, to distinguish colon (with top dot lost) from period and semicolon (with top dot lost) from comma. Because a cheap edition, quite a few errors, but most again concern deterioration from banging on stereotype plates. Like flesh, even metal plates are heir to ills.
  • Illustrated edition: no spaces in contractions, many errors but little type damage.
  • 1879 New Edition: basically illustrations from a British edition by Nathaniel Cooke and the text of the 1852 Jewett first edition, including contraction spacing again, though more late-19C dialect and word style.  For example, Stowe and her mid-century printers considered a contraction one word. In 1879, contractions will have italics on one word in the contraction pair, i.e., considered as two words.

These are the complexities:

  • The National Era text encoding was initially intended for PC-CASE collator, but that application only runs on 32-bit Windows system. Did it for dissertation, but I need to move up to more modern textual editing system like TUSTEP or CollateX.
  • I continued to use similar encoding, which has been inflected by my use of LaTeX, to transcribe Jewett first edition, Million edition, ’53 Illustrated Edition, and ’79 New Edition.
  • The plan is to use Python and REGEX to transform the encoding into a form that will work with CollateX, which will require me to be extremely systematic because my transcriptions are extraordinarily detailed: special characters, verse indentation, small caps, italics, type damage, variants, contraction spacing, etc.
  • The challenge is to normalize and systematize encoding such that any loss of detail (some detail must be normalized to avoid collapsing into morass of detail about differences in type spacing) is processed systematically and documented systematically. To use a simple example, if I remove all spacing in contractions, those won’t be identified as variants. 99.7% of readers will know right away that they don’t care, and I can describe the normalization for the 0.29% of readers who pretend to care and provide archival access to the source files. When that 1 in 10,000 reader actually cares enough to check my archival source files, they’ll be there. If you want to see this done right, see Peter Shillingsburg’s statement on normalization in the scholarly edition of Thackeray’s The Newcomes.
  • Get good at this digital collating process, like Barbara Bordalejo in her kick-ass Variorum edition of Darwin’s Origin of Species.
  • Describe the important variants, like John Bryant does in The Fluid Text.
  • As always, say what I do and do what I say, the motto for scholarly editors.
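
The contraction example above can be sketched in Python as a normalization step that also keeps a record of every change, so that the loss of detail is documented rather than silent. This is only an illustration under my own assumptions: the function name close_contractions and the short list of contraction endings are invented for the sketch and do not reflect my actual encoding.

```python
import re

# A word, one space, then a contraction ending such as n't, 'll, 're, 've, 'd, 's, 'm.
# Both straight (') and curly (’) apostrophes are accepted.
SPACED_CONTRACTION = re.compile(r"\b(\w+) (n['’]t|['’](?:ll|re|ve|d|s|m))\b")


def close_contractions(text):
    """Remove the space inside spaced contractions and log each change."""
    changes = []

    def join(match):
        joined = match.group(1) + match.group(2)
        changes.append((match.group(0), joined))  # (as printed, as normalized)
        return joined

    return SPACED_CONTRACTION.sub(join, text), changes


normalized, changes = close_contractions("I did n't say you 're wrong.")
print(normalized)
for printed, closed in changes:
    print(repr(printed), "->", repr(closed))
```

The list of changes is the part that serves the motto: it can be saved alongside the normalized file as the record of what was regularized and where.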

This is it for now. I’m starting to ramble and need to go back to the hard work. Oh, by the way, I’ve restrained myself from tweeting for almost 20 days. I’m kind of proud of myself. But I broke down and started drafting clever tweets as the damn medium seems to have taken over my mind. Logging into Twitter periodically, which I still do, feels like stopping by a bar and not drinking, except that I don’t go to bars to drink and haven’t for years and have no clue whether it’s a good comparison. Back to figuring out how to normalize and regularize, and then figuring out how to automate some? most? appropriate amount? of the process with Python and CollateX.


Reflections one week into research semester

As last week’s post explains, I have completed the first week of one of two semesters devoted to research on Uncle Tom’s Cabin, in particular, on the editing of it. As I began, one of my first tasks was to unburden myself of other tasks and distractions. So far, I’ve been reasonably successful.

I signed off of Twitter, or, rather, tweeted that I was taking a Twitterbatical of six weeks. I miss Twitter less than I thought I would. And I work much more productively without Twitter, which distracts me. I follow many interesting people, so the distractions are usually worthwhile, but distractions still from what I consider my main work. The losses seem like more personal things. I had forgotten that I use Twitter to track the Friday night football game. Also, my spouse realized that conversations elicited by my Twitter feed are her secondary source of news, after NPR.

One of my next tasks was to set up a daily schedule, which has been reasonably effective so far.

  • MWF early mornings are for writing and late-mornings are for reading scholarship
  • MWF afternoons are for electronic collation, variant narratives, and textual scholarship tool refresh
  • TTh mornings are for reading, library book retrieval, and print collation.
  • TTh afternoons are for variant narratives and German
  • Evenings are for making up what was skipped.
  • Weekend mornings are for intensive technology, and afternoons are for German, reading, and service activities. All weekend things are optional.

I knew the initial schedule expressed an ideal, but I did not realize how much of one, as all of the following other tasks joined my week.

  • Service, urgent Melville TextLab testing and AAUP meeting, some 4 hours
  • Annual report on scholarly activities, to new reporting system
  • Several student matters, which cannot be avoided just because I am not teaching, as the responsibilities to students remain.
  • Computer repair for son, who had received new video card which, when installing, led to crazy silliness: no DVI cord, wrong style DVI cord from store, cracking of wireless card connecting and seeming crash, repair and ordering new wireless, reinstalling, probably 5 hours in all.
  • Bizarre Google Drive experience with missing files, which had to be restored from backup, probably 4 hours, after wasting 4 hours transcribing something that I had already transcribed but was missing inexplicably from Google Drive. Because it had been 3 years, I did not remember I had done it. [still have not figured out what weirdness was leading to lost Google Drive files, and now have a lot of anxiety and no good answers]
  • On trying to test electronic collation, discovered that my work on Illustrated Edition was remembered as more complete than it was, so had to go back to collate and correct files. [Over years, my encoding styles have changed, and I need to apply greater consistency].
  • Other personal and family responsibilities that I had not anticipated specifically, but are just life with a family.

Despite all those, I succeeded at correcting and updating encoding for 1.5 copies of the Illustrated edition, reading and sorting through some 40 Stowe letters, sight-collating half of an Illustrated edition, transcribing several pages of Melville’s Billy Budd, and reading several articles.

Below are a few notes gleaned from my return to close study of the textual detail of the Illustrated Edition, Stowe’s letters, and Melville.

  • The Stowe family had a dog named Carlo, same name as George Harris’s dog, killed by George Harris’s owner in chapter 3 of Uncle Tom’s Cabin.
  • Stowe’s early MS is in a very lightly punctuated style. Her manuscript pages from later in the book echo punctuation of Jewett edition.
  • H.B.S. gave Calvin hell about accepting the Amherst position. She believed that an adjunct position at Bowdoin would surely be converted into a permanent one. (Another time, indeed).
  • There are at least 3 legit variants, altered text, in 1853 Illustrated Edition, if one includes title page and illustrations. Also, several illustrations were moved slightly.
  • Harrison Hayford and Merton Sealts were freaking brilliant.
  • In illustrated edition, an em dash to signal omitted letters is flush with letters, so d–n. When used as syntactical punctuation, no thin space separates.

I apologize for cryptic notes. I constrain myself to protect family privacy, to honor obligations to rights holders, and to pursue informality, which is born of desire to publish notes to prod myself rather than to seek public consumption. I no longer delude myself about broad cultural interest in my scholarship–though I continue to believe framing can broaden public interest. But for this purpose, I assume that people who read here care about this stuff.


2015-2016 Research Plans for Scholarly Edition of Uncle Tom’s Cabin

On Monday August 31 I begin my first ever research leave, and I thought it might be salutary to post my rationale and plan for the work that I am to undertake here on ye olde blog, both to show a few people, if not the world, that university faculty engage in research and to give myself a public kick in the rear so that I will stay on task.

Before sharing the details of my original proposals (which have been modified slightly during transition from LaTeX to blog post format), let me explain the university lingo that decorates the discussions below. At Kent State, our name for a sabbatical, a semester release from teaching to pursue whatever work a faculty member chooses, is Faculty Professional Improvement Leave, which is abbreviated FPIL. I became eligible to apply for FPIL in Fall 2014, after seven years. Or, maybe, tomorrow (Monday) I could begin funded research leave, known as Research and Creative Activities. I am doubly fortunate in the following two terms to have applied for and received funding for a second semester of leave. The explanation when seeking that funding must be more formal, as it goes before a faculty committee to be judged: a proposal has to be pretty strong to get it. FPIL eligibility, by contrast, is determined by departmental programmatic need and rarely denied.

Personally, I consider this my first sabbatical to have followed 7 years of teaching at Kent State, 4 years of PhD study, and 2 years as a post-doc, though no one counts that other time as work. Regardless, it is humbling to have institutional endorsement that I am trusted to do work. Maybe trust isn’t the exact word. The university is confident that I’ve been co-opted, and my freedom is mostly an illusion. I am well aware of my privilege as a tenured faculty member, but there is a sense in which I cannot stop working. It’s now as much who I am as what I do. My main regret is that I don’t do much directly to bring justice into the world, but I rationalize that by spending my time illuminating the process by which a literary work with real social justice aims came into being I serve a somewhat higher purpose. On the other hand, the work has little or no potential profit. I’m doing work that needs to be done to study cultural heritage, and I’m trying to do it right.

Another semantic complication is that I was officially appointed to FPIL in the fall 2015 and Research and Creative Activities Leave in spring 2016. I asked my chair to reverse them so that I can serve on the departmental graduate committee that reviews applications (by policy, when on FPIL one is encouraged to abandon all departmental committee work). Our department faculty is stretched so thin that I wanted to arrange so I could read graduate school applications. Regardless of semantics and this somewhat minor impulse to self-martyrdom, these are my plans for the ensuing two semesters. I also apologize for the mélange of citation styles that follow, but no more hours are available for blog post prep.

Fall 2015 Research and Creative Activities Proposal

“Uncle Tom’s Cabin: A Critical Edition” will, for the first time, produce an authoritative edition that argues for the literary artistry of a work with few rivals for cultural significance in American history. A semester-long research appointment would enable me to submit an effective proposal for a print edition with Cambridge University Press [or comparable] and have the work of the entire edition well along so that I can finish the edition within an expeditious time frame of 12 months (including an FPIL semester and summer) after the proposal is approved. In the summer preceding the grant, I will research publication history, and during the grant period I will prepare a selection from the edition according to standards for an authoritative scholarly edition. The edition, if accepted for publication, will affirm Kent State University’s Institute for Bibliography and Editing (IBE) as a center for the preparation of authoritative editions of major English-language writers and will be able to compete effectively for external grant support for a digital project that draws from the work.

The growth of scholarly interest in Uncle Tom’s Cabin as a culturally significant literary work during the past three decades could hardly be overstated. When E. Bruce Kirkham published his pioneering and still standard textual study of Stowe’s novel in 1979, he apologized for its lack of “literary” interest: “No one would claim that Uncle Tom’s Cabin ranks as a literary work equal to Moby Dick or The Scarlet Letter, although its social and historical impact has been far greater” (viii). Two years later Jane Tompkins with “Sentimental Power” issued a call that redefined the terms of literary greatness: “the popular domestic novel of the nineteenth century,” she argued, “represents a monumental effort to reorganize culture from the woman’s point of view; […] in certain cases, it offers a critique of American society far more devastating than any delivered by better-known critics such as Hawthorne and Melville” (123; 124). Tompkins chose Uncle Tom’s Cabin as “the most dazzling exemplar” of the genre. Tompkins’s work—and that of scholars like Elizabeth Ammons and Ann Douglas—has helped to reshape the canon of American literature, an effort that scholars in the present day continue. Henry Louis Gates, who honors Tompkins, Ammons, and Douglas for having “resurrected the book,” has sought with Hollis Robbins in a recent edition to offer annotation that enriches our reading of the novel and has argued that Stowe’s novel is essential for the understanding of African American literature as well (6). Most emblematic of the shift toward acknowledging the work’s importance for literature and cultural studies in the nineteenth century is that three scholarly works in the last decade, a publication history, a history of illustration, and a cultural history, focus solely on Uncle Tom’s Cabin (Parfait; Morgan; Reynolds).

Despite the work’s prominence in literary and cultural study, scholar Susan Belasco has noted the inadequacy of tools to support the study of Harriet Beecher Stowe. Unlike authors established in the canon decades earlier, Stowe has neither an adequate bibliography of her publications nor of scholarly criticism, has no published complete edition of letters, and has not one of her works available in an authoritative scholarly edition (Belasco “Responsibility”). While I leave the first two tasks to other scholars, my project addresses the lack of a scholarly edition of Stowe’s foremost work. In present-day scholarship, concern for Uncle Tom’s Cabin as literature is predominantly for the first American book edition, the two-volume format by Boston publisher John P. Jewett (1852). Modern editions such as those by Kenneth S. Lynn (Harvard, 1962) and by Ann Douglas (Penguin, 1981) and the digital text of the first edition on Uncle Tom’s Cabin & American Culture are carelessly proofed, as I have shown in a book chapter (Raabe “Case Study”). Gates and Robbins’s Norton edition (2007), Jean Fagan Yellin’s Oxford World’s Classics (1998), and Harvard’s reissue of Lynn’s text (2009) are also below standards for scholarly study. Kathryn Kish Sklar (Library of America, 1982) and Elizabeth Ammons (Norton, 1996) have produced accurate reprints of the first edition, but no reprint has addressed bibliographical studies that show authorial correction during the printing of the Jewett edition. David Reynolds’s facsimile of the illustrated edition (Jewett, 1853; Oxford, 2011) is valuable for its alternate text, but the editor fails to report errors in the printing or to note an alternate illustration that portrayed the Almighty’s vengeance on a sinning nation as an arch-angel with an upraised scourge (Raabe “Trouble”).
Selective studies of alternate editions have been undertaken, but scholars routinely cite the first edition text as if Stowe’s work were fully represented by that version. In a continuing acknowledgment of its cultural significance, scholars have also offered major reconsiderations of the range of works written as revisionary responses to Stowe’s and have explored the novel’s extended life in stage adaptation in England and America (Jordan-Lake; Meer). While this scholarship has illuminated the importance of Stowe’s work by exploring the cultural responses to it, my purpose is to re-focus scholarly attention to literary study of its alternate publication forms, which reveal a more complex author than readers have suspected, one who I argue revised when preparing different publication forms for different readers.

My new print edition will include an authoritative text of the periodical installments in the National Era (1851–1852) with all substantive variants from the following forms represented in scholarly apparatus: manuscript texts, three editions by the initial publisher Jewett (two-volume, 1852; paperbound [N.B. That edition technically not paperbound, but I described it that way for nonspecialist proposal.], 1852/1853; illustrated, 1853), and Houghton Osgood’s New edition (1879), which was supervised by the author. The edition will also include thorough annotation drawn from Stowe’s Key to Uncle Tom’s Cabin (1853) and additional historical and cultural context. My work builds on Kirkham’s study of the work’s composition, and my broader investigation of the work’s printing history has already exposed previously unknown authorial revision. Jewett’s paperbound “Edition for the Million” includes a paragraph in which Topsy, a neglected slave child, supposes that she can achieve salvation with service to her heaven-bound mistress Ophelia. The presence of the passage was highlighted in an article published in the journal Documentary Editing, and its importance was demonstrated in a peer-reviewed digital project published by the online journal Scholarly Editing (Raabe “Fluid Text” 101–12; Raabe and Harrison).

The purpose of my application for semester funding is to work through and explain the textual variants and to assess the likelihood of Stowe’s authorial revision, which determines the placement of individual variants in the edition’s apparatus. Those variants that are deemed more likely to have originated in authorial preference will be placed alongside the reading text with an accompanying revision narrative, along the principles advocated by John Bryant in The Fluid Text (2002). For example, given the indisputable authorial revision in the “Edition for the Million,” every one of that text’s variants from the earlier two-volume edition must be taken seriously as a possible authorial correction or revision. I apologize to nonspecialists for delving into some detail, but editorial labor requires one to complete systematic analysis of individual variants to identify authorial revision, though any claim that a significant textual variant should be attributed to authorial preference is ultimately a matter of judgment. In most cases, the National Era will be treated as the copy-text (the text which according to the editorial tradition associated with Sir Walter W. Greg and Fredson Bowers is assumed authoritative in cases where an editor has no strong basis to assume that a later revision originates with the author) because the house styling of the newspaper publisher was light. However, in my analysis of the textual relationship between the National Era serial and the two-volume Jewett edition, I take the view that Stowe altered her text in response to different audiences and that the serial should not be considered uniformly as earlier than the book. Approximately two-thirds of serial installments served as setting copy for the book, but the 12 February–11 March 1852 serial installments were revised in manuscript while the book was in press, and the final three installments (18 March–1 April) are reprints of the book. 
In other words, even if one accepts that the National Era should serve as the preponderant authority for authorial preference (in minor matters, so-called accidentals) for approximately two-thirds of the serial text installments, the Jewett edition has preponderant authority (accidentals) for the final three chapters, and neither the newspaper nor the Jewett edition has preponderant authority for the chapters that were issued from the 12 February through the 11 March installments. Though I am not the first to reject Kirkham’s conclusion that the two-volume edition uniformly represents the author’s preference, the only previous statement of this view appeared in an unpublished dissertation (Madison). Now that my own scholarship has shown that the reprint “Edition for the Million” was revised by Stowe also, a complete analysis of all variants is essential to establish an authoritative text.

This return to Stowe’s Uncle Tom’s Cabin (work which began with my dissertation) follows the completion of a project that consumed the previous three years, “walter dear”: The Letters from Louisa Van Velsor Whitman to Her Son Walt. That project has now been published on the Whitman Archive and has been accepted into NINES after peer review (see vita). NINES (http://www.nines.org) is a federated repository that integrates digital scholarship devoted to the nineteenth century. From the 2010–2011 through the 2013–2014 academic year, I applied annually for funding by the NEH Scholarly Editions and Translations Grant to support a digital edition of Uncle Tom’s Cabin. These applications have been rejected for the following reasons: panelists urged collaboration with Stephen Railton’s Uncle Tom’s Cabin & American Culture, panelists did not believe Stowe’s work is worthy of a scholarly edition, and the applications contained errors. The funded applications tend overwhelmingly to favor digital editions that are associated with long-standing projects and solid institutional support in the form of digital humanities centers. After the last review, I concluded that I can only refute the first two objections by having a print edition published by a major press and that the institutional infrastructure for a digital Stowe project must either be established in-house at Kent State or attract external funding after it is backed by the cultural heft of a print edition. In the current state of grant funding for digital editorial projects, panelists expect the hosting university to have already in place an infrastructure of digital support personnel in centers or institutes. My need for research funding to support a dedicated period of work is also related to making a transition from now superseded technologies.
I must revise encoding that was originally prepared for a now orphaned digital collation application (a 16-bit tool, which I began using in 2004) and switch to the newly released open-source CollateX after thorough testing (http://collatex.net/). To prepare a print edition, I must concentrate on refreshing skills in a scripting language (Python), regular expressions, and the typesetting language LaTeX. My core textual work in establishing authoritative transcriptions of previously published historical editions is complete (and encoded to rigorous standards), but the sustained attention that is possible with funding support would ensure that the preparation of a proposal will have sufficient rigor to merit acceptance by a prestigious press. Furthermore, the refinement of processes for analyzing the text and building the apparatus, which will go far beyond the portion included in the proposal during the grant period, will ensure both that the edition will be completed after it is accepted for publication and that it will pass vetting by the Modern Language Association Committee on Scholarly Editions.

The remainder of this document sketches a plan for research during fall 2015, the term of the research appointment. This work benefits from the assistance of the IBE, where I am an appointed fellow and have been nominated as director.

Fall 2015 Work

One purpose of fall research work is to prepare a historical and textual introduction, reading text, and apparatus for three chapters, which will serve as sample chapters in the book proposal. The remaining time will be devoted to advancing similar work through approximately half of the novel, which will allow the proposal to include a brisk schedule toward submission of the full volume.

  1. I will prepare a book proposal for Cambridge University Press (or comparable publisher) with sample chapters suitable for review by the six members of my editorial board by the end of November 2015.
  2. The book proposal will rely on CollateX and LaTeX to prepare the historical introduction and textual introduction, three sample chapters with an editorially established text, footnotes for textual revision narratives and historical annotation, significant accidental variants (that affect meaning) in print appendix, manuscript text in print appendix, and a supplementary apparatus of minor variants that will be made available in an archival repository only.
  3. I will complete a textual introduction that draws from the analysis of variants, the introduction to the project at Scholarly Editing, my dissertation, previous editions of the text that I have published (see vita), my collection of Stowe’s letters gathered during a research trip to Hartford (with support of summer 2012 research grant) and from microfilm of the Huntington Library Stowe collection, and published work by E. Bruce Kirkham, Michael Winship, and Claire Parfait (Winship).
  4. I will prepare installment introductions that highlight the periodical publication context in approximately 500 words, one for each installment. These will be drawn from my dissertation and from similar efforts by other scholars on the periodical publication context (Smith “Serialization”; Hochman). My goal in fall 2015 is to draft all 45 of these chapter introductions.
  5. My aim during Fall 2015 is to bring the revision narratives for 20 chapters to publication form and to move the textual work well beyond the three sample chapters that will be submitted with the proposal to a university press. I have already prepared and published revision narratives for 38 of 41 serial installments, which represented approximately a quarter of the total number of revision narratives that will need to be drafted. These were published for a blog project headed by the Stowe Center (see vita). That version reported only first edition variants from the newspaper installments. Full revision narratives of all five publication forms (modeled on Bryant’s Fluid Text) were published for a single chapter in Scholarly Editing.

Spring 2015 FPIL Proposal

During the 2015–2016 academic year, I intend to devote a semester of Faculty Professional Improvement Leave to work on my scholarly edition of Stowe’s Uncle Tom’s Cabin, the project that began with my dissertation and that occupied both the personal research portion of my post-doc at Nebraska (2006–2008) and the first four years of my work at Kent State (2008–2012). Though the project was set on a back burner during the 2011–2012 and 2012–2013 academic years so that I could complete “walter dear”: The Letters from Louisa Van Velsor Whitman to her Son Walt (now a peer-reviewed digital publication both on the Walt Whitman Archive and accepted into NINES [http://nines.org]), I wish to use the 2015–2016 academic year to bring the project to a state at which it can be submitted as a book proposal (December or January), with the book published within 12 to 18 months of acceptance.

I have submitted a proposal for an academic-semester Research Activity Grant for the 2015–2016 academic year. If that application is not successful, I would use the period of FPIL to do instead the work that I proposed in the Research Activity Grant. As the two applications are different processes, I here summarize what that work entails: 1) Prepare a new textual and historical introduction to Uncle Tom’s Cabin that draws approximately 30 percent from previous editorial work (listed on my vita), other work in progress, and from research on publication history at Harvard University (am applying also for summer research grant for research in the Houghton Mifflin Collection—one fellowship is named after Stowe’s publisher); 2) Because the copy-text (one that serves as basis for my own for accidentals) will be the National Era newspaper installments, write installment introductions that describe newspaper publication context; 3) Prepare revision narratives for significant alternate readings in half of the novel (along principles advanced by John Bryant in The Fluid Text [2002]); and 4) Prepare full sample apparatus (annotation, emendation, substantive historical variants, line-end hyphens, etc.) for three chapters with collation tool CollateX.

This work would lead to a textual introduction and three edited sample chapters with sample apparatus that would be contributed as part of book proposal, which would be reviewed by my edition’s editorial board—a board is already in place—and submitted (probably) to Cambridge University Press.

If my academic-semester Research Activity Grant is funded, I will continue with exactly the same work during the FPIL period, though with confidence that I can promise a speedy completion of the edition (12 mos.) after a publication contract is signed. If the application for the Research Activity Grant is not funded, I will reapply during the 2015–2016 academic year and apply simultaneously to the NEH for a Scholarly Editions and Translations Grant as an alternate method to fund any remaining work necessary for the completion of the project.

On the matter of personal improvement, acquiring new skills and expertise that can further Kent State as an institution, I have been devoting myself intermittently during the last several years to studying German and have achieved reading comprehension near proficiency. My interest in German is in part because Stowe’s novel was translated into German by its American publisher and in part because one of the foremost tools for scholarly editing is a program called TUSTEP, which has documentation only in German. TUSTEP has now been released as open-access software, and I plan to use some FPIL time to improve my German, to read German editorial work and textual scholarship, and to test TUSTEP as an alternative to CollateX. Despite the widespread adoption of CollateX and TUSTEP in European editorial projects, acquisition of considerable expertise in these tools at Kent State’s Institute for Bibliography and Editing would set it apart from most U.S. institutions, in which editorial work (at least for writers in English) tends not to consider either of these tools.

Works Cited

Ashton, J.: Harriet Beecher Stowe: a Reference Guide, Boston: G.K. Hall, 1977

Belasco, S.: The Responsibility Is Ours: The Failure of Infrastructure and the Limits of Scholarship. In: Legacy 26 (2009) 329–336

Bryant, J.: The Fluid Text: A Theory of Revision and Editing for Book and Screen, Ann Arbor: Univ. of Michigan Press, 2002

Committee on Scholarly Editions, MLA: Guidelines for Editors of Scholarly Editions. In: Burnard, L.; O’Keeffe, K. O. & Unsworth, J. Electronic Textual Editing, New York: Modern Language Association of America, 2006, 23–46

Gates, H. L.; Robbins, H. & Jeff, M. Uncle Tom’s Cabin Reconsidered, New York Public Library, 2006

Hildreth, M.: Harriet Beecher Stowe: a Bibliography, Hamden, Conn.: Archon Books, 1976

Hochman, B.: Uncle Tom’s Cabin in the National Era: An Essay in Generic Norms and the Contexts of Reading. In: Book History 7 (2004) 143–169

Jordan-Lake, J.: Whitewashing Uncle Tom’s Cabin: Nineteenth-Century Women Novelists Respond to Stowe, Nashville, Tenn.: Vanderbilt Univ. Press, 2005.

Kirkham, E. B.: The Building of Uncle Tom’s Cabin, Knoxville: Univ. of Tennessee Press, 1977.

Madison, E. L.: A Parallel Text Edition of Uncle Tom’s Cabin: Materials for a Critical Text, Dissertation, Rhode Island, 1986.

McGann, J.: The Textual Condition, Princeton N.J.: Princeton Univ. Press, 1991.

Meer, S.: Uncle Tom Mania: Slavery, Minstrelsy, and Transatlantic Culture in the 1850s, Athens: Univ. of Georgia Press, 2005.

Morgan, J.-A.: Uncle Tom’s Cabin as Visual Culture, Columbia: Univ. of Missouri Press, 2007.

Parfait, C.: The Publishing History of Uncle Tom’s Cabin, 1852-2002, Burlington, VT: Ashgate, 2007.

Raabe, W.: Harriet Beecher Stowe’s Uncle Tom’s Cabin: A Case Study of Textual Transmission. In: Jewell, A. & Earhart, A. The American Literature Scholar in the Digital Age, Ann Arbor: University of Michigan Press, 2011, 63–83.

Raabe, W.: The Trouble with Facsimiles: Jewett’s Illustrated Edition (1853) of Uncle Tom’s Cabin. Fill His Head First with a Thousand Questions.

Raabe, W.: Editing Harriet Beecher Stowe’s Uncle Tom’s Cabin and The Fluid Text of Race. In: Documentary Editing 32 (2011), 101–12.

Raabe, W. & Harrison, L.: Selection from Harriet Beecher Stowe’s Uncle Tom’s Cabin: A Digital Critical Edition. In: Scholarly Editing 33 (2012).

Railton, S.: Uncle Tom’s Cabin and American Culture, 2009.

Reynolds, D.: Mightier than the sword: Uncle Tom’s cabin and the battle for America, New York: W. W. Norton & Co., 2011.

Smith, S. B.: Serialization and the Nature of Uncle Tom’s Cabin. In: Smith, S. B. & Price, K. M. Periodical Literature in Nineteenth-Century America. Virginia University Press, 1995.

Stowe, H. B.: Uncle Tom’s Cabin: or, Life among the Lowly. In: Douglas, A.: Uncle Tom’s Cabin. New York N.Y.: Penguin Books, 1981.

Stowe, H. B.: Uncle Tom’s Cabin. In: Gates, H. L. & Robbins, H.: The Annotated Uncle Tom’s Cabin. New York: Norton, 2007.

Stowe, H. B.: Uncle Tom’s Cabin: or, Life among the Lowly. Million ed., Boston: John P. Jewett, 1852/1853.

Stowe, H. B.: Uncle Tom’s Cabin: or, Life among the Lowly. Illus. ed., Boston: Houghton Osgood, 1879.

Stowe, H. B.: Uncle Tom’s Cabin: or, Life among the Lowly. Illus. ed., (1852), Boston: John P. Jewett, 1853.

Stowe, H. B.: Uncle Tom’s Cabin: or, Life among the Lowly. 2 vols. (1852), Boston: John P. Jewett, 1852.

Tompkins, J.: Sentimental Power: Uncle Tom’s Cabin and the Politics of Literary History. In: Glyph 8 (1981), 79–102.

Winship, M.: ‘The Greatest Book of Its Kind’: A Publishing History of Uncle Tom’s Cabin. In: Proceedings of the American Antiquarian Society 109 (1999), 309–332.


Build-Your-Own Damn Crash Course in Literary Theory

My areas of scholarly study are bibliography, editorial theory, and digital humanities. These areas of study inform my teaching practice and scholarship, but I do not expect graduate students to delve into my preferred area of theory to inform a paper in, say, an historical survey of American Literature, which I am teaching in summer 2015. Therefore, if you are an MA student and have not already chosen certain theoretical commitments that you want to continue working within, you need to quickly acquire mastery of “enough theory” to write a competent seminar paper. This post was written specifically for students in an early American literature survey, but I write it a bit more broadly so that it might serve students in other literary periods and traditions. This post was inspired by Natalia Cecire’s “Crash Courses for the Desperate” (more on that below).

First, let me specify what “enough theory” is not: it does not mean that you have identified an interpretive error in the reading of a canonical work of literature by three random critics, one that you in your seminar paper will correct by a close reading of several passages from the perspective of an illuminating theoretical lens. The rationale for such a paper is that this fresh theoretical lens has not yet been applied to the canonical text that you have selected. Don’t write that kind of paper. Why not? Because that is not likely to be better than a “B” paper. Yes, exceptional work of this type may be published as a brief item in Notes and Queries, which is roughly equivalent to a book review. Though this is valuable service, ambitious editors of literary texts often achieve as much with three or four annotation notes to an edition. If you’re interested in writing about 50 such annotations, talk to me about a different type of project than a paper. Scholarly service is valuable, but the purpose of a seminar paper is to write a strong draft of a paper that (after revision) could merit submission to a respected journal and would be received favorably enough to go to outside readers.

To aspire to publication (though it is not necessary to merit publication by end of semester: to earn an “A” your paper need only reasonably “aspire”), your paper needs to be strong in at least two of the following areas and competent in the third:

  1. Enough knowledge of an historical and cultural moment outside of the present, whether when the work was composed and first published or during a particular moment of reception. Sources include letters, contemporary reviews, the author’s other works, other works in the same genre, and scholarship that seeks to make sense of the work from such perspectives.
  2. Enough knowledge of the present as a particular moment in the critical heritage for the study of the text: Who are the major critics? Which are the major articles and books? What has been at stake when reading texts from this period? Sources include major scholarly monographs on related subjects (“major” is generously considered, from the past 3 decades) and present-day (past decade of) scholarly articles, book chapters, monographs, calls for papers, dissertations.
  3. Enough theory.

The recommended readings in this course will provide many of the first two, but I will not provide direction on “enough theory” in a formal way. Therefore, you should devise a crash course for yourself within a contemporary theoretical framework, such as one of the following: Marxism, Feminist Theory, Cultural Studies, Queer Theory, Race Theory, Postcolonial Theory, Psychoanalytic Theory, New Historicism, Formalism and Structuralism, Phenomenology and Hermeneutics, etc. Recall that I began by admitting that no one of these is my particular area of study. And yet, my names for theoretical schools were chosen deliberately: I opened the Norton Anthology of Theory and Criticism (2nd ed., 2010), edited by Vincent Leitch, and copied out section titles–some shortened–from the alternate Table of Contents, which is labeled “Modern and Contemporary Schools and Movements” (xviv-xxii). If you buy (or acquire by interlibrary loan) the Norton Anthology, select one of the schools in which you have an interest, and spend three days in a crash session reading from texts and head-notes under each section, you are on your way to “enough theory” to write a seminar paper in my class. To be more assured that you have a fuller grasp, read fuller versions of one or two “major texts” that are referenced in the anthology’s head-notes.

Do you have “enough theory” now? Maybe, but there is the risk that you have taken a flyer on the definition of an area of theory under the recommendation of a single scholar (Leitch) and are not yet able to connect your theoretical knowledge with one of the two other areas in which I suggested you need to write a competent paper: 1) “historical and cultural moment outside the present,” or 2) “the present as a particular moment in the critical heritage” for the text on which you want to write. The three backgrounds of your paper must be integrated. And in fact, a “crash course” is not really a shortcut: you need to allow all three things to percolate in your mind a bit. So long as you remain far from a thesis that is theoretically informed and that will be useful for the two other areas I suggested, continue enriching your understanding of all three areas. Boundaries between areas of theory are not rigid: they cross-pollinate, as even a schema like that in the Norton Anthology hints: Antonio Gramsci appears both in Cultural Studies and Marxism, Paula Gunn Allen appears both in Feminist Theory and Race and Ethnicity studies, etc. Such overlap is unavoidable: major shifts in scholarship often disrupt previously accepted boundaries. Some areas of theory are rising in importance, drawing from multiple antecedents, and gaining energy, and some are beginning to seem dated. Scholars who are serious about literary theory right now are asking whether my basic distinction between study of text in historic moment and the critically sophisticated present is not a little naive. For acknowledging the importance of boundary crossing only in passing but not exploring it, I plead pedagogical usefulness and my own bias toward editing and philology. The purpose of this post is not to teach you to make a contribution to cutting-edge theory but to advise you on putting together a crash course that allows you to write a first-year seminar paper in a graduate course.

To continue expanding your study of theory, I suggest consulting one or more of the following resources: Johns Hopkins Guide to Literary Theory and Criticism, ed. Michael Groden, Martin Kreiswirth, and Imre Szeman (2005), Continuum Encyclopedia of Modern Criticism and Theory, ed. Julian Wolfreys (2006), Literary Theory and Criticism: an Oxford Guide, ed. Patricia Waugh (2006). Again, these titles did not trip off my tongue. I consulted James L. Harner’s Literary Research Guide (Kent State Library, Online Copy), which has an entry on the Johns Hopkins Guide that mentions the Continuum Encyclopedia and the Oxford Guide as particularly valuable (and mentions other titles). Again, my purpose is not to advise how to make a serious contribution to theory: you only need “enough theory” to be recognized as aware of major texts that may inform your reading. If you wish to contribute to theory (or you wish to develop a better sense of how areas of theory may be divided or intersect) you would be consulting annual volumes from Year’s Work in Critical and Cultural Theory, reading widely in major theory journals, and probably studying with someone else in the department.

If this sounds intimidating because your interests in advanced study in English don’t fit easily under literature and theory—because, for example, your interest in English is from composition and rhetoric or linguistics; or your interests are philosophy or religion—I only ask that any work that draws from your favored discipline make accommodations to address readers outside of that discipline. As noted, this post was inspired by a comment on Cecire’s Crash Courses for the Desperate. Cecire prepared three one-page crash courses in “Queer Theory,” “Modernism,” and “History of Science.” Cecire writes as an expert on the areas that she has selected. Each of her crash courses consists of 5 or 6 important articles or chapters (with one or two sly of-course-you’ve-already-read-that hints), a series of questions to consider, two or three prominent journals from which to consult recent articles, and an advisory on building your own more extended reading lists. As no good deed by a female scholar in a public forum should be allowed to stand without queries for assistance mixed with presumptuous insult, a commenter promptly asked her to prepare additional reading lists.

It takes genuine expertise to devise a crash course on Cecire’s model. When initially enamored with her crash course proposal, I envisioned spending 8 or 10 hours apiece preparing a crash course in each of the following subjects: Early American Literature, Bibliography, Textual Criticism, Digital Humanities. I did not do this for the same reason that Cecire is not going to post crash courses on demand. Such work is most useful, when it is useful, when you take it up yourself. This post serves a different purpose: to offer a guide to some of the infrastructure that will allow you to prepare a crash course for yourself. Students in my classes should not be surprised if I try out asking them to develop a new crash course as a graded exercise.

To finish the job on Cecire’s crash course model, you will need to devise questions and pick important journals in an area of literary scholarship. Through the act of selecting and reading and consulting, you should have a reasonable grasp on the major journals. For more comprehensive work, see the MLA Directory of Periodicals if your library subscribes. If you do not have access to that service, the Humanities Journals Wiki does a pretty good job on its list of literary journals. On a particular general theory topic, pick six or eight articles from the most prominent journals to read, from the last two or three years. Year’s Work in Critical and Cultural Theory should again be consulted, as it may help you identify individual articles in the journals that regularly engage your area of study.

This is not career advice. But if it sounds like too much work for a seminar paper, then I do have career advice: quit graduate school. I welcome other useful hints and reference works for building up knowledge in a particular area. But for almost any scholar in English-related disciplines, Harner’s Literary Research Guide is an essential first-stop shop to gain your bearings in a new area of study.


Running CollateX on your Macintosh OSX (Mavericks)


This post was prepared originally when CollateX was at version 1.5. As of version 1.6, CollateX is no longer distributed as a Windows or Mac zip file. Instead, it is distributed as a compressed Java archive, a .jar file. You can still run it from the command line, but see the new instructions at CollateX downloads.

You can also run it as a Python 3 module. For instructions on installing, see David J. Birnbaum’s Computer-supported collation with CollateX. The installation instructions cover Python 3 Anaconda, CollateX, the Levenshtein Library, GraphViz, and GraphViz bindings for Python, all part of a suite of resources to support a one-day seminar. It’s a bit intimidating, but I was able to install the entire suite of tools in a little over an hour, though I already had MacPorts and XCode. You need to be reasonably comfortable with the command line and have Admin control over your OS.

Some troubleshooting that I had to do with setup (because I had Python 2.7 previously installed) included re-setting Python to version 3.4 (rather than 2.7) after 3.4 was installed, re-installing CollateX inside Anaconda 3.4, and re-installing GraphViz bindings in Anaconda. After each of those, restart Anaconda. Anaconda is a new thing to me, and I may refer wrongly to it. But go ahead and go to Birnbaum’s installation guide and course (I’m working my way through that now). It’s much more useful than the newbie instructions below.
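
That troubleshooting comes down to two questions: which Python is actually running, and can it see CollateX? Here is a minimal sketch of such a check. It assumes the module is importable under the name collatex, and the function name describe_environment is made up for illustration; it is not part of CollateX or Anaconda.

```python
import importlib.util
import sys


def describe_environment(module_name="collatex"):
    """Report the active interpreter and whether a module can be found from it.

    The default module name is an assumption; pass another name to check
    a different package.
    """
    return {
        # Full path of the interpreter that is running this script.
        "executable": sys.executable,
        # Major.minor version, e.g. "3.4".
        "version": "{}.{}".format(sys.version_info[0], sys.version_info[1]),
        "is_python3": sys.version_info[0] >= 3,
        # find_spec returns None when a top-level module is not installed here.
        "module_found": importlib.util.find_spec(module_name) is not None,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(key, "=", value)
```

If the path printed for executable is not inside the Anaconda environment you expected, or module_found is False, the environment probably needs to be switched or CollateX re-installed inside it. Under Python 2.7 the script should stop at the import of importlib.util, which is itself a sign that the wrong Python is active.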

Oh, Python scares the ever-loving bejeesus out of you? Me too. You can start at Learn Python the Hard Way. Also, John Guttag, Eric Grimson, and Ana Bell’s MITx course on Computation and Programming using Python. It’s really good, but that’s why I had 2.7 installed. Go learn it.

I believe scholarly editing is now unapologetically Tech Geekery. Sorry to be the bearer of bad news: that’s just the way it is. I hope you’re younger than I am, more tech-savvy than I am, so you’ll adjust more easily. I’m having a really hard time with this.

Here begins the old post:

I had CollateX running on my Macintosh OSX (Lion, probably), but after I installed Mavericks, CollateX no longer ran. So I had to work through the steps for getting CollateX to run from the command line on this new OS. I had for a while satisfied myself with running it from a Windows 7 PC, which still worked, but I need now to share these instructions for a graduate class anyway. These instructions are for newbies, and I welcome corrections if I use the wrong terms to describe what needs to be done from an OS or Java perspective, realms in which I have only achieved minimal literacy.

The basic steps are the following:

  • Download the application (a zip file with a directory structure) from http://collatex.net and place it anywhere on your computer.
  • Designate the downloaded CollateX program (named collatex on Macintosh) as an executable file so that the operating system will permit it to run.
    1. You must install the Java Development Kit, or JDK (not the same as the end-user Java, or JRE, for your browser; the CollateX page says it should work with the JRE, but it did not for me).
    2. You must define JAVA_HOME in your .bash_profile and restart the terminal, so CollateX can find the JDK.
  • With CollateX designated as an executable and able to find the JDK, you must issue the proper command to launch CollateX on the command line from within the “bin” directory of the unzipped CollateX directory. The command “./collatex” (do not omit the period or forward slash) will indicate the version and provide a list of commands for CollateX.

Download CollateX

  • Go to http://collatex.net/download/ and download “collatex-tools-1.5.zip”. Open it with the Archive Utility (or your other favorite unzip tool) and place the unzipped folder someplace where you can find it, such as “/Users/[your user name]/Documents/collatex-tools-1.5”.
  • When you unzip it, the archive will create a folder with the version name “collatex-tools-1.5”, and inside that directory one will find a “bin” and a “lib”. To run CollateX, you will need to enter your Terminal (shell) and navigate to the “bin” directory. If you’re experienced and feeling lucky, open Terminal, navigate to the CollateX bin, and try to run it with the command “./collatex”. If that didn’t work, continue with the steps below.

Designate CollateX an executable

  • In the Terminal (shell), use Unix commands to traverse the directories: ls to display the contents of the current directory, and cd followed by a path such as "folder name"/"folder name". Single or double quotes are needed in paths if directory names contain spaces; alternatively, escape each space with a backslash. You will not need quotation marks or escaped spaces if you used the recommended path above, so go to the following destination: /Users/[your user name]/Documents/collatex-tools-1.5/bin.
  • Within the bin directory, type the following command: “chmod +x collatex”. The command chmod has now designated the collatex file as an executable. By the way, if in Windows, you will use “collatex.bat” rather than the collatex file.
  • If Java was previously set up, the following command may work: “./collatex”. Tip: Don’t omit the period or slash before the command.
  • If CollateX fails to run and the error message explains that your OS can’t find Java, continue with the steps below.
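
If chmod feels like an incantation, here is a small sketch you can try safely. It does not touch the real CollateX download: it builds a throwaway folder (the names demo_dir and “my folder” are my own inventions for illustration) holding a stand-in script named collatex, then shows the two ways of handling a space in a path and what “chmod +x” changes.

```shell
# Scratch folder with a space in its name, so nothing here touches the real CollateX.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/my folder/bin"

# A stand-in for the collatex launcher (hypothetical; it only prints one line).
printf '#!/bin/sh\necho "stand-in for collatex"\n' > "$demo_dir/my folder/bin/collatex"

# Two equivalent ways to cd into a path that contains a space.
cd "$demo_dir/my folder/bin"      # quotes around the whole path
cd "$demo_dir"/my\ folder/bin     # backslash before the space

ls -l collatex       # the permissions column shows no "x" yet
chmod +x collatex    # the same command as in the step above
ls -l collatex       # the permissions column now includes "x"
./collatex           # prints: stand-in for collatex
```

With the real download, only the cd, chmod, and ./collatex lines matter, run inside the real bin directory.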

Install the Java SDK

If you attempt to run CollateX and it fails with a Java message like the one below, you probably need to install the JDK (technically, you may have it installed but not mapped to JAVA_HOME, but you are a newbie with this, so probably not).

JAVA_HOME is not defined correctly.
We cannot execute /System/Library/Frameworks/JavaVM.framework/Versions/CurrentJDK/Home/bin/java
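
As far as I can tell, the second line of that message is the clue: the launcher looked for a program called java in a bin folder under a Java home directory and could not run it. A rough way to check the same thing yourself, as a sketch only (java_status is just a variable name I made up, and this is not CollateX’s own test):

```shell
# Does JAVA_HOME name a folder that really contains a runnable bin/java?
if [ -n "${JAVA_HOME:-}" ] && [ -x "$JAVA_HOME/bin/java" ]; then
    java_status="found"
    "$JAVA_HOME/bin/java" -version    # prints the version of the Java it found
else
    java_status="missing"
    echo "JAVA_HOME is unset or does not point at a Java installation"
fi
echo "Java check: $java_status"
```

If it reports “missing”, carry on with the installation steps and the JAVA_HOME step that follow.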

  1. Go to the download site for the Java Development Kit.
  2. Click the button for the JDK Download.
  3. Accept the License Agreement.
  4. Download the appropriate one for your operating system: Macintosh OS X x64. If on Windows, select Windows x64 (Windows 7, 64 bit or Windows 8), or Windows x86 (earlier or Windows 7, 32 bit).
  5. Install the JDK.

Tell Macintosh OSX where to find Java

These steps are some kind of with-Java-you-must-work-with-the-force-not-against-it that I don’t actually understand, but they were necessary.

  1. Open your terminal.
  2. Enter the following command and press Enter, exactly as below:
    echo export "JAVA_HOME=\$(/usr/libexec/java_home)" >> ~/.bash_profile
  3. Why? I don’t know. But see this link and this link if you are interested in sussing out why.
  4. Then exit and close the terminal. If you don’t exit and close the terminal entirely, CollateX will continue to NOT WORK.
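
A partial answer to the “why,” offered tentatively: the command in step 2 does not set anything by itself. It only appends one line of text to the end of ~/.bash_profile, a file that the shell reads each time a new Terminal window opens, which would explain why the Terminal has to be closed and reopened. The sketch below runs the same echo but points it at a scratch file (the name scratch is mine) so you can see the appended line without touching your real profile.

```shell
# Same echo as in step 2, but writing to a scratch file instead of ~/.bash_profile.
scratch=$(mktemp)
echo export "JAVA_HOME=\$(/usr/libexec/java_home)" >> "$scratch"

# Show the line that was appended:
cat "$scratch"
# export JAVA_HOME=$(/usr/libexec/java_home)

# On OS X, /usr/libexec/java_home is a small utility that prints the folder
# of the default installed JDK. So in each new Terminal window, JAVA_HOME
# gets set to that folder. After reopening Terminal you can check with:
#   echo "$JAVA_HOME"
```

The backslash before the dollar sign is what keeps the shell from running java_home on the spot; the command is stored as text and runs later, each time the profile is read.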

Run CollateX

  1. Re-open the terminal (you closed it after mapping JAVA_HOME in the previous step).
  2. Using the Finder, place two files that you want to collate (wit1.txt and wit2.txt) in the bin directory. I recommend plain one-sentence text files. wit1.txt would have one line: “This is witness 1 of the work.” wit2.txt would have one line: “This is witness 2 of the same work.”
  3. Using the cd and ls commands, make your way to “/Users/[your user name]/Documents/collatex-tools-1.5/bin”. To make life easier for yourself, you can open the Finder and have a visual reminder of your location.
  4. To confirm that CollateX is functioning, enter the following command: “./collatex” (do not omit the period or forward slash). Because you did not tell CollateX to do anything, it will do nothing more than display the documentation.
  5. If wit1.txt and wit2.txt are in the bin directory, collatex is an executable file, collatex can find Java on your OS because you instructed it where JAVA_HOME is located, and the Terminal is running the command from within the bin directory, enter the following command: ./collatex wit1.txt wit2.txt > witness1and2compared.txt
  6. If that works and outputs a file witness1and2compared.txt, have at it. The full documentation is at http://collatex.net.
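
The whole run can also be done from the Terminal without the Finder. This is a sketch that assumes you used the recommended path above; bin_dir is my own variable name, and if that folder does not exist the sketch falls back to a scratch folder so that it still runs (in which case it only writes the two witness files).

```shell
# Assumed location of the CollateX bin directory (the recommended path above).
bin_dir="$HOME/Documents/collatex-tools-1.5/bin"
# Fallback for illustration only: use a scratch folder if CollateX is not there.
[ -d "$bin_dir" ] || bin_dir=$(mktemp -d)
cd "$bin_dir"

# Create the two one-sentence witness files.
printf 'This is witness 1 of the work.\n' > wit1.txt
printf 'This is witness 2 of the same work.\n' > wit2.txt

# Collate only if the launcher is present and executable.
if [ -x ./collatex ]; then
    ./collatex wit1.txt wit2.txt > witness1and2compared.txt
    cat witness1and2compared.txt
else
    echo "No executable collatex in $bin_dir; witness files written only."
fi
```

The “>” sends whatever CollateX would have printed to the screen into witness1and2compared.txt instead, which is why nothing appears until you open that file (or cat it, as here).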

If anyone asks what kind of work you do, you may now say with a straight face that you are a philologist.
