Lessons learned with Arc, an OAI-PMH service provider
Library Trends, Spring, 2005 by Xiaoming Liu, Kurt Maly, Michael L. Nelson, Mohammad Zubair
ABSTRACT
Web-based digital libraries have historically been built in isolation utilizing different technologies, protocols, and metadata. These differences hindered the development of digital library services that enable users to discover information from multiple libraries through a single unified interface. The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a major, international effort to address technical interoperability among distributed repositories. Arc debuted in 2000 as the first end-user OAI-PMH service provider. Since that time, Arc has grown to include nearly 7,000,000 metadata records. Arc has been deployed in a number of environments and has served as the basis for many other OAI-PMH projects, including Archon, Kepler, NCSTRL, and DP9. In this article we review the history of OAI-PMH and Arc, as well as some of the lessons learned while developing Arc and related OAI-PMH services.
**********
Interoperability is one of the significant research problems in the field of digital libraries (DLs) (Lynch & Garcia-Molina, 1995). The inability to federate, filter, and provide value-added services on remote content limits DLs to covering only local holdings. The Open Archives Initiative (OAI) is a major, international effort to address technical interoperability and facilitate discovery of content among distributed repositories. OAI differs from other interoperability approaches, such as Z39.50 (Lynch, 1997) or SDLIP (Paepcke et al., 2000), through its emphasis on a limited, simple, and easy-to-implement protocol that layers over an existing repository. The OAI framework defines two functional roles: data providers (also "repositories") and service providers (also "harvesters"). Service providers develop value-added services that are based on the metadata collected from data providers. These value-added services could take the form of cross-archive search engines, linking systems, and peer-review systems.
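The division of labor between the two roles can be illustrated with a minimal sketch of the harvester side, written here in Python. It is not Arc's code: the base URL, the sample response, and the function names are hypothetical and chosen only for illustration. The sketch assumes the OAI-PMH 2.0 ListRecords verb with unqualified Dublin Core (oai_dc) metadata, and it parses a canned response rather than contacting a real repository.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# XML namespaces used by OAI-PMH 2.0 responses and by Dublin Core elements.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, hand-written response from an imaginary data provider.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2002-06-01T19:20:30Z</responseDate>
  <request verb="ListRecords" metadataPrefix="oai_dc">http://repository.example.org/oai</request>
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1</identifier>
        <datestamp>2002-05-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A hypothetical e-print</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>token-123</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def build_list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build the request a service provider sends to a data provider."""
    if resumption_token:
        # A resumptionToken is sent on its own, without metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (records, resumption_token) extracted from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind("oai:ListRecords/oai:record", NS):
        identifier = rec.findtext("oai:header/oai:identifier", default="", namespaces=NS)
        titles = [t.text for t in rec.iterfind(".//dc:title", NS)]
        records.append({"identifier": identifier, "titles": titles})
    token = root.findtext("oai:ListRecords/oai:resumptionToken", default="", namespaces=NS)
    # An absent or empty token means the list is complete.
    return records, (token or None)


if __name__ == "__main__":
    print(build_list_records_url("http://repository.example.org/oai"))
    records, token = parse_list_records(SAMPLE_RESPONSE)
    print(records, token)
```

A service provider would repeat the request with each returned resumptionToken until none is returned, and then build its value-added service, such as a cross-archive search index, over the accumulated records.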
The roots of the OAI lie in a vision to stimulate the growth of open e-print repositories. This concept began to be developed with the Universal Preprint Service (UPS) prototype (Van de Sompel et al., 2000), and was further advanced with the Santa Fe Convention (Van de Sompel & Lagoze, 2000). The UPS prototype was the discussion piece during an invitation-only workshop in Santa Fe, New Mexico, in the fall of 1999. This workshop brought together many of the leaders in the e-print community for the purpose of fostering interoperability between the various author-contributed e-print servers and institutional repositories in use at the time. Contemporary approaches toward interoperability were ad hoc at best. One of the distinguishing factors for the Santa Fe Workshop was the collective experience in building DLs and the associated interoperability problems; earlier interoperability workshops (Scherlis, 1996) were comparatively premature. The immediate result of this workshop was the Santa Fe Convention, an intermediate step toward the metadata harvesting model that would become the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH).
Because the simple metadata harvesting idea appealed to a broader range of communities than those engaged in e-print publishing, version 1.0 of the OAI-PMH was released in January 2001. Following an extended period of evaluation and alpha and beta testing, version 2.0 of the OAI-PMH was released as a stable specification in June 2002 (Lagoze, Van de Sompel, Nelson, & Warner, 2002a). The development, history, impact, and secondary effects of OAI-PMH have been discussed in several publications, including Lynch (2001), Nelson (2001), Lagoze and Van de Sompel (2001), Van de Sompel and Lagoze (2002), and Lagoze and Van de Sompel (2003).
ARC
Arc (http://arc.cs.odu.edu) is the first end-user federated search service based on the OAI-PMH (Liu, Maly, Zubair, & Nelson, 2001). The Repository Explorer (Suleman, 2001) was released prior to Arc, but its target audience is mainly repository developers and maintainers, not end-users. Arc was initially released in October 2000 as an experimental service to investigate issues in metadata harvesting. The software developed for the Arc service (http://oaiarc.sourceforge.net/) was released as an open source system under an NCSA-style license in September 2002. It has been used in several production and research projects (see Table 1).
Arc was first developed as a proof-of-concept service for OAI-PMH; however, the development of Arc revealed interesting problems and inspired further research in these domains. In this article we introduce the development and architecture of the Arc system and follow-up research that attempted to improve or optimize the metadata harvesting system and search performance. We will discuss the Archon project for building value-added services to take advantage of rich metadata beyond Dublin Core (DC) (Weibel & Lagoze, 1997); the DP9 service to allow general search engines (Google, Yahoo, etc.) to index OAI-PMH compliant collections; and the recently funded Andrew Mellon Foundation DL Grid project for building a high-performance federated search service. When possible, interesting and general features resulting from these research projects are incorporated back into the publicly available Arc source code distribution.


