‘Real Humanists Make Tools’ from TADA

I’ve been procrastinating/conducting interdisciplinary research by looking around at text-analysis sites like TADA and TAPoR. Text analysis is a field of inquiry concerned with using computers to help analyse texts. It has a history in the ‘concordance’, as developed by Hugh of St. Cher in the 13th century. The concordance was a kind of indexing of themes in the Bible, but the practice now extends to searching for and retrieving information from texts. Text analysis appears to be the study of how relations between concepts can be found and represented.

But what I also found interesting is the definition and typology of electronic texts put forward by Geoffrey Rockwell and Ian Lancashire in ‘Electronic Texts and Text Analysis’. Rockwell and Lancashire’s descriptions appear to be representative of the field, and so they offer a fascinating insight into the lens through which one area of analysis views a subject that is common to our ‘artistic’ inquiry.

Electronic texts digitally represent oral or written language in a form suitable for analysis with a computer. Typically an electronic text is either an electronic version of a written work, an electronic version of a transcript of an oral event, or a document composed on the computer. In any case the information in an electronic text is meant to be in a natural language that can be read by humans when displayed properly.

Obviously they are looking at the material characteristics and the material source of the text. I find the description and typology a confluence of perspectives. The typology, for instance, is separated into four major forms:

  • A copy of a work that was originally on paper - a digital representation of a literary, dramatic, or other type of written work that was originally in analogue form.
  • A work composed on the computer that is stored in that form, but was intended to be printed, like a word-processing file or PDF (Portable Document Format) file.
  • A work composed on a computer that is meant to be accessed on a computer, like a WWW page, electronic text database, or hypertext.
  • A transcript of a conversation or other oral event.

The description and typology don’t seem to classify by a single, consistent variable. Hopefully someone out there can explain these for me. For now I’ll go through the inconsistencies I see. 1) A digital version of a work that was originally on paper = remediation from paper to digital text (regardless of content), a subject that is characterised by having another incarnation in a medium other than a digital one. 2) A digital work that was composed on computer but intended to be printed in another medium (paper) = pre-remediation, a subject that is characterised by being created in a digital medium but meant to be read in another.

The first two seem to share the following variable: a digital text that is either anterior or posterior to a paper form. The third is a digital text that has no relationship with paper other than negating it. I don’t know if I’m just tired, but what is the common factor between the first three? A digital text and a medium? Then to the last type, a transcript (digital text) of an oral event. This obviously refers to the origin of the content, which is another mode or channel type. All four types seem to be concerned with the medium in which the content was created: on paper, on computer, on computer, verbally. But they have other concerns too, such as the output: computer, paper, computer, computer.
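One way to check the point about mixed variables is to tabulate the four types along the two axes named above and see whether either axis alone tells them apart. This is only a rough sketch: the form labels and function names are my own shorthand, not Rockwell and Lancashire’s terms.

```python
# Hypothetical shorthand labels for the four forms. The values come from
# the "created" and "output" lists in the paragraph above.
FORMS = {
    "digitised paper work": {"created": "paper", "output": "computer"},
    "print-intended file": {"created": "computer", "output": "paper"},
    "screen-native work": {"created": "computer", "output": "computer"},
    "oral transcript": {"created": "verbal", "output": "computer"},
}


def separates_all(axis: str) -> bool:
    """Return True if every form has a distinct value on the given axis."""
    values = [facets[axis] for facets in FORMS.values()]
    return len(set(values)) == len(values)


def separates_all_jointly(axes: tuple[str, ...]) -> bool:
    """Return True if the combination of axes is distinct for every form."""
    combos = [tuple(facets[a] for a in axes) for facets in FORMS.values()]
    return len(set(combos)) == len(combos)


if __name__ == "__main__":
    for axis in ("created", "output"):
        print(axis, separates_all(axis))
    print("both", separates_all_jointly(("created", "output")))
```

Under these labels neither axis distinguishes all four forms on its own, but the pair does, which fits the impression that the typology is mixing two variables.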

I’ve tried to think of other sources, inputs and outputs of texts that may not be included in the typology in order to see if their typology is in fact sufficient.

Sources: material (paper, computer)
Input: image scan, OCR, keyboard, voice recognition…
Output: mediums (computer, paper), modes (listened to, read)…
Composition Medium: computer, verbal, written…
Remediation Degree: direct (one form to another. Example: a book that is then digitised), staggered (at least two forms and then another. Example: a script of an animation sequence which is then described in a digital text and printed.)
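To see whether these facets could sit together in one description, here is a tentative sketch of a record that carries them. The class name, field names and example values are all hypothetical, and the "none" degree for a text with a single form is my own assumption rather than part of the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextRecord:
    """Hypothetical record for one text, using the facets listed above."""
    composition_medium: str   # e.g. "computer", "verbal", "written"
    input_method: str         # e.g. "image scan", "OCR", "keyboard"
    output_medium: str        # e.g. "computer", "paper"
    forms: tuple[str, ...]    # successive forms the work has taken, in order

    @property
    def remediation_degree(self) -> str:
        # One form to another is "direct"; at least two forms and then
        # another is "staggered". "none" for a single form is an assumption
        # added here, not a category from the list above.
        if len(self.forms) <= 1:
            return "none"
        return "direct" if len(self.forms) == 2 else "staggered"


# Illustrative guesses at how the two examples might be recorded.
book = TextRecord("written", "OCR", "computer", ("book", "digital text"))
animation = TextRecord(
    "written", "keyboard", "paper",
    ("animation script", "digital description", "printout"),
)

if __name__ == "__main__":
    print(book.remediation_degree)
    print(animation.remediation_degree)
```

Deriving the degree from the chain of forms, rather than storing it as a separate field, keeps the record from contradicting itself.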

Is it just me or does the typology of Electronic Texts offered seem inconsistent?

5 Responses to “Forms of Electronic Texts”

  1. Jeremy Douglass

    I agree this typology is a description of common forms. The thinking seems to be “where did it come from, where is it going?” - with physical-analog, oral, and born-digital being the three main options as you said above.

    - Visual-Digital (image of a page to computer encoding)
    - Audio-Digital (sound of words to computer transcript)
    - Digital-Digital (computer to computer transcoding)

    The weird thing that jumps out at me is the fourth type, #2 on their list, the classification of Postscript / PDF and other print-intended formats. Where are they going? The analog world, but they haven’t gotten there yet, and so it can be studied as digital text.

    - Digital-Analog (encoding to ink representation)

    Which raises the question, does Digital-Audio also count?

    I like the transcription-typology approach, but I wonder if computer users with accessibility issues would tend to find such an approach helpful or just annoying, since transcription into audio etc. is a fact of life - in fact, all kinds of what you call direct or staggered remediations are post-processes that are easily available for anything digitally passing through.

    I’m wandering. I may need to go look up that passage with Lev Manovich’s description of the Universal Media Machine.

  2. Geoffrey Rockwell

    Just came across this entry - you should get a TADA wiki account and post an alternative definition - let’s see if we can refine and improve what we mean. Now to try to defend the “definition” we provided.

    First, the word “typically” - I was trying to describe the set of objects that get called electronic texts in humanities computing and classify them. I shouldn’t claim it is a coherent set - but the ones out there I see being treated as e-texts are typically electronic editions of literary works, linguistic corpora, born digital texts, and original files used for print. The passage you quote is followed by a list of examples. I wonder if we can come up with a more logical classification that avoids the material problems you point out while still respecting the way the term is used? Alternatively one could define e-texts as X and explain that one is therefore excluding, for example, transcripts of oral events.

    For me the more interesting issue is the problem we have with defining “text” - see the whole debate around Renear’s OHCO theory. A cheap definition of an e-text would be to say it is an electronic version of anything that can be called a text - and that is sort of what I try in the last sentence you quote. Could we end up using e-text for those things that are not remediations (which we would just call texts)? Anyway, I’m not happy with waving my hand about readability or linguistic objects. Any ideas?

  3. Christy Dena

    Thanks for coming by, Geoffrey. I’d love to participate in working out a definition of an etext, and I’m sure Jeremy (and Mark) would too. I’ve created an account with your TADAwiki but do not have permission to create a page. Would you like to keep discussing the idea here or on a wiki page? If the latter, could you create a page under EText or the like?

    I’ll be away for the next couple of weeks, but then I’d love to jump right in. Thanks for referring us to Renear’s OHCO ideas too.

  4. Geoffrey Rockwell

    Stéfan Sinclair can give you permission to edit. I’ll drop him a line. He organized the TADA conference where we all got T-shirts with the ironic logo “Real Humanists Build Tools”.

  5. Christy Dena

    Good stuff. I love that logo by the way. ;)

