Lessons learned with Arc, an OAI-PMH service provider

Library Trends, Spring, 2005 by Xiaoming Liu, Kurt Maly, Michael L. Nelson, Mohammad Zubair

DEVELOPMENT OF ARC

Arc was initially released as an experimental service to investigate issues in metadata harvesting. It immediately attracted interest because it was the only vehicle to demonstrate the potential and promise of OAI-PMH at that time. As new data providers appeared, they often requested to be added to the Arc system for demonstration purposes; by continuously integrating various new data providers, the software was made stable and fault tolerant. Originally conceived as more of a tour de force, Arc has become a useful tool for helping new data providers to make their collections truly OAI-PMH-compliant by giving them feedback on errors during harvesting.

When applying the Arc software in various environments, we encountered a number of problems such as inconsistent metadata, lack of controlled vocabulary, and XML errors. Based on feedback from other adopters, we have been able to address these problems and have consequently added many new features for customization and installation. The architecture of the Arc system has been refined so that new functionality can be added or extended easily.

Arc is available for download (http://sourceforge.net/projects/oaiarc/) and can be used with either Oracle or MySQL. OAI-PMH uses unqualified Dublin Core as the default metadata set, and most Arc end-user services are implemented on the data provided in the DC metadata. The currently supported end-user services include simple search, advanced search, interactive search, an annotation service, and browse/navigation over search results. Arc has a Web-based administration interface, which allows users to configure various parameters for harvesting and to check harvester logs to handle error situations such as erroneous XML replies from data providers.
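To make the role of unqualified Dublin Core and of XML error handling concrete, the following is a minimal Python sketch, not Arc's own code. It reads an OAI-PMH ListRecords response, collects the DC elements of each record, and reports malformed XML instead of failing. The function name `extract_dc_records`, the `SAMPLE` response, and the `oai:example.org:1` identifier are hypothetical and used only for illustration; the XML namespaces are those defined by OAI-PMH 2.0 and Dublin Core.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by OAI-PMH 2.0 and unqualified Dublin Core
OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def extract_dc_records(xml_text):
    """Return (records, error) for a ListRecords response (hypothetical helper).

    Each record is a dict with the OAI identifier and a mapping from
    DC element name (e.g. "title") to a list of its values.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        # Erroneous XML from a data provider: report it so it can be logged
        return [], f"XML error: {exc}"

    dc_prefix = "{" + DC_NS + "}"
    records = []
    for record in root.iter("{" + OAI_NS + "}record"):
        fields = {}
        for elem in record.iter():
            if elem.tag.startswith(dc_prefix) and elem.text and elem.text.strip():
                name = elem.tag[len(dc_prefix):]
                fields.setdefault(name, []).append(elem.text.strip())
        identifier = record.findtext(
            "{%s}header/{%s}identifier" % (OAI_NS, OAI_NS)
        )
        records.append({"identifier": identifier, "dc": fields})
    return records, None


# Hypothetical response from a data provider, for illustration only
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2002-01-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A sample record</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

records, error = extract_dc_records(SAMPLE)
broken_records, broken_error = extract_dc_records("<OAI-PMH><unclosed>")
```

Returning the error as a value rather than raising it lets a harvester log the failing reply and continue with the next data provider, which is the kind of feedback described above.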

ARCHITECTURE OF ARC

OAI-PMH defines two basic components: the service provider and the data provider. Data providers administer systems that support the OAI-PMH as a means of exposing metadata, and service providers use metadata harvested via the OAI-PMH as a basis for building value-added services.
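As an illustration of the interface between the two components, a service provider talks to a data provider through HTTP requests that carry one of six protocol verbs. The Python sketch below builds such request URLs. The helper `build_request` and the base URL `http://example.org/oai` are assumptions made for this example; the verb names and the `metadataPrefix` and `from` arguments come from the OAI-PMH 2.0 specification.

```python
from urllib.parse import urlencode

# The six request verbs defined by OAI-PMH 2.0
VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def build_request(base_url, verb, **args):
    """Build an OAI-PMH request URL (hypothetical helper).

    Because `from` is a Python keyword, pass it as `from_`;
    a trailing underscore is stripped from argument names.
    """
    if verb not in VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    params = {"verb": verb}
    for name, value in args.items():
        if value is not None:
            params[name.rstrip("_")] = value
    return f"{base_url}?{urlencode(params)}"


# Ask a (hypothetical) data provider for Dublin Core records
# added or changed since 1 January 2002.
url = build_request(
    "http://example.org/oai",
    "ListRecords",
    metadataPrefix="oai_dc",
    from_="2002-01-01",
)
```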

[FIGURE 1 OMITTED]

The OAI-PMH focuses on the clear interface between data providers and service providers. In Figure 2 we define the Arc model for metadata harvesting that addresses many of these issues. The data provider maintains one repository for digital records. Then a number of service providers work together to conduct metadata harvesting. The harvester is the key service that uses OAI-PMH to maintain the synchronization between data providers and other services, such as centralized federation services, replication services, and citation linking services. In addition, the Arc system includes OAI-PMH proxy, cache, and gateway services to optimize the functioning of the model underlying the OAI-PMH technology (Liu, Brody, Harnad et al., 2002). These services provide an infrastructure that can be used by all other components to achieve interoperability, scalability, and reliability.
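The synchronization role of the harvester can be sketched as an incremental harvesting loop. The Python below is an illustration under stated assumptions, not Arc's implementation: `harvest`, `fake_fetch`, and `_page` are hypothetical names, and the fetch step is passed in as a function so that the sketch runs without a network connection. What it relies on from the protocol is the `from` argument, which restricts a request to records changed since a given date, and the `resumptionToken` element, which a data provider returns when a result list is split across several responses.

```python
import xml.etree.ElementTree as ET

OAI = "{http://www.openarchives.org/OAI/2.0/}"


def harvest(fetch, last_harvest=None, metadata_prefix="oai_dc"):
    """Yield (identifier, datestamp) for records changed since last_harvest.

    `fetch` is any callable that takes a dict of request arguments
    and returns the XML reply as a string.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if last_harvest:
        params["from"] = last_harvest
    while True:
        root = ET.fromstring(fetch(params))
        for header in root.iter(OAI + "header"):
            yield (
                header.findtext(OAI + "identifier"),
                header.findtext(OAI + "datestamp"),
            )
        token = root.findtext(".//" + OAI + "resumptionToken")
        if not token or not token.strip():
            break  # a missing or empty token means the list is complete
        # Follow-up requests carry only the verb and the token
        params = {"verb": "ListRecords", "resumptionToken": token.strip()}


def _page(ids, token=""):
    """Build a tiny ListRecords reply for the stand-in data provider."""
    records = "".join(
        f"<record><header><identifier>{i}</identifier>"
        f"<datestamp>2002-01-01</datestamp></header></record>"
        for i in ids
    )
    return (
        '<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/"><ListRecords>'
        f"{records}<resumptionToken>{token}</resumptionToken>"
        "</ListRecords></OAI-PMH>"
    )


def fake_fetch(params):
    """Stand-in data provider that splits three records over two replies."""
    if params.get("resumptionToken") == "page2":
        return _page(["oai:example.org:3"])
    return _page(["oai:example.org:1", "oai:example.org:2"], token="page2")


harvested = list(harvest(fake_fetch, last_harvest="2002-01-01"))
```

In a real deployment `fetch` would issue the HTTP request, and the date of the last successful harvest would be stored so that the next run only asks for newer records.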

[FIGURE 2 OMITTED]


Content provided in partnership with Thompson Gale