

Evaluating electronic texts in the humanities - Libraries and the Internet: Education, Practice & Policy

Library Trends, Spring 1994, by Susan Hockey

INTRODUCTION

Electronic texts have been used for scholarly research in the humanities for the past forty years or so, ever since Roberto Busa began work on his Index Thomisticus in 1949. However, it is only in the last three to four years, and particularly with the advent of the Internet, that humanities electronic texts have moved into the center of the scholarly arena as libraries begin to collect them and provide access to them. In the humanities, as in other disciplines, electronic textual resources offer many more possibilities than print, but, in general, libraries do not yet have well-established practices for collecting and handling electronic texts as they do for print material. Shreeves (1992) discusses some of these questions from the perspective of the librarian, but there is also a need to look at what humanities scholars might want to do with the texts.


Electronic text is used here to mean primary source material in the humanities rather than journals and reference works. Such texts may be literary works (prose, verse, drama), historical papers, letters and memoranda, charters, papyri, inscriptions, and the like. The source material may be in any natural language and may be in print or manuscript form. The focus of this article is also on transcripts of text rather than digitized page or manuscript images. Images provide an exact reproduction of the original so that marginalia, annotations, parallel texts, illustrations, and the like are readily available. They can be used for access and preservation but the text cannot be searched or otherwise manipulated. A transcript of a text allows many more novel possibilities for research and teaching and exploits more fully the capabilities of electronic materials. In the future, a combination of image and text may well form the basis of the electronic library, where it will be possible to search the text and retrieve the image.

THE PRESENT SITUATION

The picture in the early 1990s is one of many humanities texts in many different places and in many different formats. The Georgetown Catalog of Projects in Electronic Texts lists over 300 institutions which hold electronic texts, but it does not list the texts themselves. From sources such as The Humanities Computing Yearbook 1989-90: A comprehensive guide to software and other resources (Lancashire, 1991), journals, and proceedings of annual conferences on humanities computing, the number of existing electronic texts in the humanities can be estimated at many thousands. The Internet gives access to a fraction of these, and the existence of most of the others is known only from articles which describe their use in specific projects.

Most of these texts are held by individuals or by research institutes (mainly in Europe) which have compiled them for their own research purposes. Examples include the Istituto di Linguistica Computazionale in Pisa, and the Institut für Deutsche Sprache, Mannheim. For a variety of reasons, most of the collections of these institutes are not available for others to use. The few exceptions include many of the texts which were compiled for the Trésor de la Langue Française at Nancy, which are now available from ARTFL (American Research on the Treasury of the French Language) in Chicago. The texts compiled for the Responsa Project at Bar-Ilan University are now available on CD-ROM as the Global Jewish Database, and the collection of Early Christian Latin at Louvain-la-Neuve has now been published as the CETEDOC CD-ROM.

The Thesaurus Linguae Graecae (TLG) was the earliest systematic attempt to create electronic versions of the complete literature of one language (Ancient Greek), and its 60-million-word task is now almost finished after twenty years of work. The Packard Humanities Institute (PHI) has completed a complementary collection of Classical Latin of about 8 million words. Both of these are distributed on CD-ROM.

The largest general purpose collection of electronic texts is the Oxford Text Archive (OTA), which was established at Oxford University Computing Services in 1976 in order to prevent texts from becoming "lost" once their compilers had finished with them. The bulk of its collection comes from donations from individual scholars. It is committed to maintaining any text which is deposited in it but does not actively pursue material to be added or correct errors within the texts. It now has some 1,200 texts in about thirty languages and makes these available at nominal cost provided that the compiler has given the appropriate permissions. Little information is known about the source of some OTA texts, and the OTA takes no responsibility for the accuracy of the texts. Some texts are available by FTP from <black.ox.ac.uk>.

It is estimated that about 95 percent of existing texts are plain text files--that is, ASCII files which are not indexed for any specific software. Those who use them must acquire or develop suitable software programs, depending on the nature of their application. Various software programs for humanities electronic texts are in widespread use, notably the Oxford Concordance Program (OCP) and Micro-OCP (a PC version), and TACT and WordCruncher (interactive text retrieval programs), all of which provide some basic facilities as well as more sophisticated tools tailored to the specific needs of the humanities.
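
To give a concrete sense of what "developing suitable software" for an unindexed plain text file might involve, the following is a minimal Python sketch of two basic facilities of the kind a concordance program offers: a keyword-in-context (KWIC) listing and a word frequency count. It is an illustration only and does not reproduce how OCP, TACT, or WordCruncher actually work; the function names `kwic` and `word_frequencies`, the `width` parameter, and the sample sentence are all assumptions introduced here.

```python
import re


def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per whole-word occurrence of keyword.

    Hypothetical helper for illustration; matching is case-insensitive.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        # Up to `width` characters of context on each side of the match.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines


def word_frequencies(text):
    """Count occurrences of each lowercased run of ASCII letters."""
    counts = {}
    for word in re.findall(r"[a-z]+", text.lower()):
        counts[word] = counts.get(word, 0) + 1
    return counts


# Sample text chosen for illustration only.
sample = (
    "The quality of mercy is not strained. "
    "It droppeth as the gentle rain from heaven."
)

if __name__ == "__main__":
    for line in kwic(sample, "the"):
        print(line)
    print(word_frequencies(sample)["the"])
```

Because the file is scanned afresh for every query, this approach suits small texts; the interactive retrieval programs mentioned above are described as working on indexed text, which is a different design from this linear scan.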