Content provided in partnership with
Thomson / Gale

Introduction - organizing the Internet

Library Trends,  Fall, 2003  by Andrew G. Torok

THE THEME OF "ORGANIZING THE INTERNET" brings to mind the late 1950s folk-rock singer Jimmie Rodgers's song titled "The World I Used to Know." A great many developments have transpired in the world of information science since the seminal works of S. C. Bradford, Claude Shannon, Vannevar Bush, and numerous other pioneers. To those of us who have been in the information science field for several decades, the peek-a-boo devices such as Termatrex, Mortimer Taube's Uniterm cards, and discussion of pre- and postcoordinate indexing have given way to the world of browsers, HTML, XML, and numerous other ways of coding text and multimedia. The Internet and the World Wide Web have had a profound impact on how we go about storing and retrieving information. Document integrity has become transient, with little assurance that the location, existence, or even the content of a publication will be the same tomorrow as it was even a few minutes ago. We are often hard-pressed to determine whether a failure to retrieve a publication stems from the publisher's network infrastructure. The dream of universal bibliographic control seems quite remote. By bypassing traditional publication channels, anyone can publish virtually at will. The situation becomes more chaotic when we consider the increasing redundancy of knowledge and the rampant proliferation of misinformation and disinformation, to say nothing of social concerns with pornography, copyright violations, and other flagrant intrusions on personal rights. Nevertheless, it behooves the information worker and the information user to impose some sense of order if good information is to remain the basis of learning and decision making, and if documents are to continue as an archive of human knowledge.

As I reflected on writing this introduction, I began to ask myself just how far we have come from the world I used to know. The biggest paradigm change has not been that of technological development. Rather, the Internet has enabled virtually anyone with access to a computer to become intimately involved with the entire information cycle, namely, publishing, acquiring, organizing, and retrieving information, thereby bypassing information intermediaries such as indexers, reference librarians, and publishers. There is no question that the technology is vastly different from the early days of information retrieval. At the same time, the paperless office never materialized, nor are libraries being phased out as a result of the public's ability to access information directly from the desktop. More importantly, we still do not understand what constitutes information or how people make relevance judgments. Information retrieval (IR) to most searchers consists of character string matching between a query and a data source. In some ways, IR has even regressed, since the trained search intermediary is no longer needed. The Internet is a vast, unchecked sea, and searching is referred to as "surfing." The issue is further complicated by the proliferation of document formats, incompatibility between generations of hardware, and questionable scalability of software. Even in the doctoral seminars that I teach, I find the need to explain Boolean logic and patiently teach students how to develop search strategies, formulate queries, and even compute the precision of searches. While the Internet has empowered the general public to perform tasks once done by professionals, it has also created a large body of knowledge needing organization. Vocabulary control is extremely limited at best. The average Web searcher has little understanding of the search process, much less a fundamental ability to determine the effectiveness or exhaustivity of a search.
People rely on a limited set of search tools, especially general search engines such as Google, not realizing that less than 20 percent of all indexable documents are being accessed. Beyond that, there are many electronic text and multimedia publications that are not indexed at all by Web crawler software. This part of the Internet goes by many names, such as the Invisible Web, the Opaque Web, the Hidden Web, the Dark Web, and so on.
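
The Boolean logic and precision computation mentioned above can be illustrated with a small sketch. It is not taken from the article: the toy index, the document numbers, and the relevance judgments are all hypothetical, and recall is included only as the conventional companion measure to precision, roughly corresponding to the exhaustivity of a search.

```python
def precision(retrieved, relevant):
    """Fraction of the retrieved documents that are relevant."""
    retrieved = set(retrieved)
    if not retrieved:
        return 0.0
    return len(retrieved & set(relevant)) / len(retrieved)


def recall(retrieved, relevant):
    """Fraction of the relevant documents that were retrieved."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(retrieved) & relevant) / len(relevant)


# Hypothetical toy index: each term maps to the documents containing it.
index = {
    "internet": {1, 2, 3, 5},
    "indexing": {2, 3, 4},
    "pornography": {5},
}

# Boolean query: internet AND indexing NOT pornography
retrieved = (index["internet"] & index["indexing"]) - index["pornography"]

# Assumed relevance judgments made by a human searcher.
relevant = {2, 4}

print(sorted(retrieved))               # [2, 3]
print(precision(retrieved, relevant))  # 0.5
print(recall(retrieved, relevant))     # 0.5
```

Here the query retrieves two documents, only one of which is judged relevant, so precision is 0.5. One of the two relevant documents is missed, so recall is also 0.5.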

In all fairness, the Internet, especially the Web, is still in its infancy. Techniques for publishing, organizing, and accessing content are changing rapidly as a result of new technological developments, the competitive information marketplace, and the growing sophistication of searchers. As always, libraries are instrumental in promoting access to online publications, especially to those that belong to the invisible Web. Librarians are also educating users through the cooperative development known as information literacy. Developed by AECT (the Association for Educational Communications and Technology) and AASL (the American Association of School Librarians), electronic information literacy standards are being taught to children and teachers alike. The ACRL (Association of College and Research Libraries) supports similar standards for higher education. The dynamic nature of the Internet is going to require methods of organization well beyond the relatively static classification schemes that have served libraries for many years. New methods of organization must take into consideration more sophisticated techniques for content description in order to minimize such problems as retrieving pornography and to make it possible to detect plagiarism and copyright violations. Eventually the exponential growth of the Web will itself subside. The Internet is not free. Market regulations will eventually restrict the free ride enjoyed by Web publishers. Publication patterns will be easier to recognize as publication activity becomes more linear. The end result will be that users can be more discriminating, both in specifying what they want and in avoiding the retrieval of unwanted items.