Access to biomedical information: the Unified Medical Language System

Library Trends, Summer, 1993 by Steven J. Squires

Abstract

The National Library of Medicine (NLM) is engaged in a long-term project to develop a Unified Medical Language System (UMLS) that will retrieve and integrate information from a variety of information resources. Two UMLS components use fundamental aspects of controlled vocabulary structure and management and their relationship to information retrieval that have general interest for librarianship. The UMLS project is described along with its initial deployment in retrieval environments.

Introduction

Bibliographic control of information has traditionally focused on locating and describing published documents, and indexing these in useful ways. In every subject domain, the problem of erecting a complete record of existing information is more or less acute, depending on the available support for and interest in comprehensive collections and provision of access. In the biomedical domain, due to its societal importance and to generous government support, the problem of finding and describing the published literature is not great in spite of the size of that literature. As chronicled by Adams (1981) and, more recently, as listed by Tilley (1990), massive government and private efforts are in place for building and maintaining bibliographic and reference databases and online systems that describe and index the biomedical literature.

Increasingly, however, important information has developed in forms other than the published record. In biomedicine, these include clinical databases and patient records. In these databases, the mechanisms for record creation, maintenance, access, and exchange are not as structured as for bibliographic data. The focus of bibliographic control has therefore had to widen to include describing and structuring records and retrieval tools that permit effective use of information in a large number of diverse information sources.

In spite of the ability of machines to search on any element of stored data, controlled vocabularies are still widely used to index information and to produce effective retrieval. Many different terminologies exist, even within the same subject domain, that have been created to organize and retrieve data for specific purposes. The Unified Medical Language System (UMLS) is conceived as a means of navigating among a disparate array of databases organized using different terminologies. Except perhaps for work in automated indexing, to which the UMLS is related, this effort is possibly the most important development in biomedical bibliographic control in recent years. This article will describe UMLS components, their potential uses, and some current efforts to incorporate them into retrieval environments. Efforts to evaluate UMLS are noted along with areas for future development.

Purpose of the Unified Medical Language System

In the mid-1980s, as the growth and development of electronic means of storing information progressed, and as computational and telecommunications resources for using that information proliferated, the National Library of Medicine recognized a need to assist the biomedical world in using the new resources and capabilities now more or less easily at hand. Its Long Range Plan of 1986 presents a comprehensive program of research, resource development, and educational endeavors to provide that assistance. A central part of that plan is the Unified Medical Language System.

Humphreys and Lindberg (1989) and Lindberg and Humphreys (1990) make the case for a UMLS. Their argument starts with the observations that useful biomedical information can be found among an increasingly large number of machine-readable databases, that these databases are different in important ways, and that these differences are among the barriers to effective use. Databases differ by content and by how that content is represented and described. They also differ by means of access. As users are confronted by the ever larger array of different databases, it is increasingly difficult to identify which databases have information relevant to a particular query. Users, too, have different ways of expressing the many concepts represented in databases and, as a result, formulate queries about those concepts differently. There is a lack of a universally recognized and accepted standard vocabulary for expressing biomedical phenomena and for recording health care events and transactions. Once information is found in a database, the need arises to organize it and possibly evaluate it for its intended use. The UMLS is meant to compensate for these problems, not by imposing uniformity on the diverse world of terminology and databases, but by minimizing the differences about which a user of information sources has to be aware (Lindberg & Humphreys, 1990, p. 121).

These problems are, of course, not new in the information world. Perhaps the most important aspect of the UMLS approach is its "unified" nature, its attempt to provide a single utility through which access to the variety of biomedical databases can be gained, and by which information from them can be easily retrieved and integrated.


 


Content provided in partnership with Thomson Gale