
Informetric theories and methods for exploring the Internet: an analytical survey of recent research literature

Library Trends, Winter 2002, by Judit Bar-Ilan, Bluma C. Peritz

Dean & Henzinger (1999) applied cocitation techniques in order to find "related pages" on the Web. A related page is one that addresses the same topic as the original page. One of their algorithms, the cocitation algorithm, looks for pages that link to the given page, and assumes that the links appearing near the link to the given page point to pages with similar topics. The pages reached through these nearby links were collected, their cocitation degree with the given page was computed, and those with the highest degrees were returned as the most related pages.
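The idea of ranking pages by cocitation degree can be illustrated with a minimal sketch. It is not Dean & Henzinger's implementation: the function name `related_pages`, the toy link graph, and the `top_k` parameter are assumptions made for illustration, and every link on a linking page is treated as "nearby," which is a simplification. Here the cocitation degree of a page is taken to be the number of pages that link both to it and to the given page.

```python
from collections import Counter


def related_pages(target, out_links, top_k=3):
    """Rank pages by how many linking pages cite them together with `target`.

    `out_links` maps each page to the list of pages it links to
    (a hypothetical, in-memory stand-in for a crawl of the Web).
    """
    cocitation = Counter()
    for parent, links in out_links.items():
        if target not in links:
            continue  # only pages that link to the target are considered
        # Simplification: every other link on the parent counts as "nearby".
        for sibling in set(links):
            if sibling != target:
                cocitation[sibling] += 1
    return cocitation.most_common(top_k)


# Hypothetical toy link graph: page -> pages it links to.
toy_graph = {
    "p1": ["A", "B", "C"],
    "p2": ["A", "B"],
    "p3": ["A", "C", "B"],
    "p4": ["B", "D"],
}

ranked = related_pages("A", toy_graph)
print(ranked)
```

In this toy graph, page B is cocited with A by three linking pages and page C by two, so B is ranked as the most related page. Page D is never linked from a page that also links to A, so it does not appear in the ranking.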

Kumar, Raghavan, Rajagopalan, & Tomkins (1999) used cocitation techniques in order to identify specific communities on the Web--groups of content creators sharing a common interest. The study exploits "cocitation in the Web graph to extract all communities that have taken shape on the Web, even before the participants have realized that they have formed a community" (p. 1483).

Ross & Wolfram (2000) used coword analysis to analyze term pair topics submitted to the search engine Excite. Their data were based on more than a million queries submitted to Excite on a single day. The most frequent term pairs were coded into thirty categories based on the semantic and pragmatic intent of the term pair; a term pair could belong to more than one category. Cluster analysis and MDS were used for the data analysis. A high proportion of the term pairs were for adult-oriented material.
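The first step of such a coword analysis, counting how often two terms appear in the same query, can be sketched as follows. This is an illustrative sketch only and not Ross & Wolfram's procedure: the function name `term_pair_counts`, the whitespace tokenization, and the toy queries are assumptions. The later steps (coding the pairs into categories, cluster analysis, and MDS) are not shown.

```python
from collections import Counter
from itertools import combinations


def term_pair_counts(queries):
    """Count unordered term pairs that cooccur within a single query."""
    counts = Counter()
    for query in queries:
        # Assumed tokenization: lowercase, split on whitespace, drop repeats.
        terms = sorted(set(query.lower().split()))
        # Sorting makes ("download", "music") and ("music", "download") one pair.
        counts.update(combinations(terms, 2))
    return counts


# Hypothetical toy query log.
toy_queries = [
    "free music download",
    "music download",
    "free games",
    "download free music",
]

pairs = term_pair_counts(toy_queries)
for pair, count in pairs.most_common():
    print(pair, count)
```

The most frequent pairs from a count of this kind are the ones that would then be coded into categories and analyzed further.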

Leydesdorff & Curran (2000) studied the cooccurrence of the terms "university," "industry," and "government" in Web pages in three different domains. The domains were: Brazil, the Netherlands, and the so-called top level domains (.com, .edu, .gov, .org, .net, .mil). They studied the growth over time of these cooccurrences, using Alta Vista's option to limit searches to given dates. The queries were presented both in English and in the local language. Similar trends were detected in all three domains.

Content Analysis

Content analyses of Web and Internet sources serve as exploratory tools for getting a better understanding of the Internet's content.

Bar-Ilan & Assouline (1997) analyzed the content of messages distributed by the PUBYAC (a discussion list for Children and Young Adult services) for a period of one month in spring 1997. Six content categories were defined (reference, library administration and policy, collection management, extension programs, announcements, and other). The most popular category was reference. The lifespan of topics, the number of active participants, and the productivity of the participants were also examined. From the answers received to a specific question sent to the participants of the discussion list, it seems that the librarians find the list very useful: "It helps them find answers to specific questions and assists in collection management and planning extension programs" (p. 170). Several other studies analyzed the content of discussion lists. Sometimes several groups were analyzed in parallel and their characteristics compared (e.g., Aires-de-Sousa, 1999; Schoch & White, 1997; Berman, 1996).


 


Content provided in partnership with Thompson Gale