Testing the calculation of a realistic h-index in Google Scholar, Scopus, and Web of Science for F. W. Lancaster

Library Trends, Spring, 2008 by Peter Jacso

ABSTRACT

This paper focuses on the practical limitations in the content and software of the databases that are used to calculate the h-index for assessing the publishing productivity and impact of researchers. To celebrate F. W. Lancaster's biological age of seventy-five, and "scientific age" of forty-five, this paper discusses the related features of Google Scholar, Scopus, and Web of Science (WoS), and demonstrates in the latter how a much more realistic and fair h-index can be computed for F. W. Lancaster than the one produced automatically. The cited reference index of the 1945-2007 edition of WoS has, in my estimate, over a hundred million "orphan references," which have no counterpart master records to be attached to, and "stray references," which cite papers that do have master records but cannot be identified by the matching algorithm because of errors of omission and commission in the references of the citing works. Browsing and searching this index can bring up hundreds of additional cited references that were given to the works of an accomplished author but are ignored in the automatic process of calculating the h-index. The partially manual process doubled the h-index value for F. W. Lancaster from 13 to 26, which is a much more realistic value for an information scientist and professor of his stature.

INTRODUCTION

The h-index was developed by Professor Jorge E. Hirsch of the Department of Physics at the University of California, San Diego. It was published in the prestigious Proceedings of the National Academy of Sciences (Hirsch, 2005) soon after its preprint appeared in arXiv, the excellent and widely used preprint repository focusing primarily on physics (http://arxiv.org/pdf/physics/0508025). It was welcomed much more widely and quickly than any earlier bibliometric or scientometric indicator (Lancaster, 1991).

Hirsch summarized the essence in a terse abstract: "I propose the index h, defined as the number of papers with citation number ≥ h, as a useful index to characterize the scientific output of a researcher." He then explains that "A scientist has index h if h of his or her N_p papers have at least h citations each and the other (N_p - h) papers have ≤ h citations each." This means that an author with h=16 has 16 publications, each of which received 16 or more citations. The h-index varies widely from discipline to discipline and even within disciplines and research areas. In library and information science, for example, an h-index of 16 is a high value, but in, say, astronomy and retrovirology, it is considered to be a relatively low value.
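The definition quoted above can be illustrated with a short calculation. The following is a minimal sketch in Python, not the procedure used by any of the databases discussed in this paper: the function name h_index and the citation counts are hypothetical and were invented for illustration only.

```python
def h_index(citation_counts):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank the papers from most cited to least cited.
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            # The paper at this rank still has at least `rank` citations.
            h = rank
        else:
            # Every later paper has even fewer citations, so stop here.
            break
    return h


# Hypothetical citation counts for nine papers (illustrative, not real data).
example_counts = [25, 18, 12, 7, 5, 5, 3, 1, 0]
print(h_index(example_counts))
```

The result depends entirely on the citation counts supplied to the calculation, so any citations that a database fails to attach to a paper can lower the computed value.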

SHORT LITERATURE OVERVIEW

Immediately after publication there was a flurry of formal and informal comments and reactions by researchers from various disciplines, with only a few dismissive and skeptical comments (Purvis, 2006; Ashkanasy, 2007; Berger, 2007) and plenty of supporting ones, in serious news sources, listserv fora, and blog sites, beyond the many academic journals. It was cited by more than sixty papers by the end of August 2007. The most telling sign of the importance and appreciation of the h-index was that the editors of Scientometrics found a way to squeeze a paper about the h-index into the December 2005 issue (Bornmann and Daniel, 2005), then dedicated the April 2006 issue to the topic, with several substantial articles by some of the most respected scientometricians. Three more followed in May, June, and July, then two more in 2007, in that journal alone. The papers approached the topic from a variety of theoretical (Egghe and Rousseau, 2006; Liang, 2006; Egghe, 2006, 2007a; Schubert, 2007; Glanzel, 2006) and practical angles (Costas & Bordons, 2007; Imperial & Rodriguez-Navarro, 2007; Vanclay, 2007).

There are several case studies that present the h-index for a variety of target groups. These include the prominent scholars, educators, and researchers in a specific field (Kelly & Jennions, 2006; Saad, 2006; Cronin & Meho, 2006; Oppenheim, 2007), lesser known researchers in the broad field of physics (Schreiber, 2007a), institutions within a country (Prathap, 2006), researchers of a discipline within a country (Salgado and Paez, 2007), researchers within a country in different fields (Imperial & Rodriguez-Navarro, 2007; Packer & Meneghini, 2006; Meneghini & Packer, 2006), researchers across countries in a field of specialization (Oelrich, Peters, and Jung, 2007), and, within scientometrics, the highly select group of winners of the award commemorating Derek John de Solla Price (Bar-Ilan, 2006a).

Some of the best papers about the h-index voiced reservations about the details of the proposed model, but they indicated their support of Hirsch's theory by suggesting variant and derivative indexes built on his idea (Batista et al., 2006; Egghe, 2006; Vanclay, 2006; Barendse, 2007; Jin et al., 2007). Several papers compared the h-index with other, traditional measures (van Raan, 2006; Barendse, 2007; Costas & Bordons, 2007).


 

