The epistemological foundations of knowledge representations

Library Trends, Winter 2004, by Elaine Svenonius

ABSTRACT

THIS PAPER LOOKS AT THE EPISTEMOLOGICAL FOUNDATIONS of knowledge representations embodied in retrieval languages. It considers questions such as the validity of knowledge representations and their effectiveness for the purposes of retrieval and automation. The knowledge representations it considers are derived from three theories of meaning that have dominated twentieth-century philosophy.

The discipline of philosophy impacts other knowledge disciplines, particularly in the theoretical constructs they employ. The purpose of this paper is to explore how epistemology, that branch of philosophy concerned with how and what we know, has contributed to the design of knowledge representations embodied in retrieval languages designed for organizing information. Different retrieval languages make different presuppositions about what is meant by knowledge. These differences give rise to questions such as

* How valid are the knowledge representations embodied in different retrieval languages? That is, how well do they do what they purport to do, namely, represent knowledge?

* How effective are they in facilitating the achievement of the objectives of a retrieval language: collocation, discrimination, and navigation? How amenable are they to automation and semantic interoperability?

In the course of the twentieth century, the problem of what and how we know has been dealt with through language analysis and theories of meaning. Three theories of meaning are especially relevant to the discussion of knowledge representations: Operationalism, the Referential or Picture theory of meaning, and the Contextual or Instrumental theory of meaning.

OPERATIONALISM

Operationalism is a theory of meaning emanating from the philosophy of logical positivism. Logical positivism, an extreme form of empiricism, dominated philosophy of science in the first decades of the twentieth century. Empiricism holds that all knowledge is derivable from experience, i.e., from sense perceptions. For instance, our knowledge of time, as used as a variable in a mathematical equation such as v = d/t, is ultimately derivable from propositions recording our sensory experience of time. The experience upon which knowledge is based must be objective. This condition is expressed by the Principle of Verifiability, which states that in order to be meaningful, a proposition must be capable of verification. The totality of knowledge consists of all meaningful propositions. Examples of nonmeaningful propositions are those of an ethical, religious, or "esthetic kind"; for example, "truth is beauty" is not meaningful because it cannot be verified, and it is therefore excluded from the corpus of knowledge.

For a proposition to be verified, the concepts within it need to be defined operationally, i.e., they need to be defined constructively. In practice, defining a concept operationally often means defining it as a variable. Defining concepts as variables enables a discipline to advance. The most celebrated example of this phenomenon is Einstein's use of operational definitions in his analysis of simultaneity (Bridgman, 1938, p. 7). A graphic example of the practicality of operational definitions is that of Eddington's elephant sliding down a hill of wet grass (Eddington, 1929, pp. 251 ff.). Eddington asks us to consider the mass of this sliding elephant. Conceivably it could be regarded as a property of the elephant ("a condition which we vaguely describe as 'ponderosity'") (p. 251); on the other hand, it could be regarded as a pointer reading on a scale, i.e., two tons. It may be intuitive to think of mass as a property, but Eddington observes: "we shall not get much further that way; the nature of the external world is inscrutable, and we shall only plunge into a quagmire of indescribables" (p. 251). He goes on to argue that it is more productive to regard mass as a pointer reading, i.e., as a value of a variable. Not only does this give a method for testing the proposition "the elephant weighs two tons"; it also enables the two tons of the elephant to be related to other pointer readings, i.e., to values of other variables, such as velocity, coefficient of friction, etc. Operational definitions, by providing empirical correlates for concepts in the form of variables, allow variables to be related one to another. Propositions that express relationships among variables are "scientific" in the sense that they take the form of generalizations and serve an explanatory function: if verified, they assume the character of laws; if awaiting verification, they have the status of hypotheses.

To the extent that problems of organizing and retrieving information are definitional in nature, solutions to them can be approached by introducing operational definitions. An example of a productive operational definition is the precision-recall measure, which was developed to measure the degree to which a given retrieval system does or does not achieve its discrimination and collocation objectives (Cleverdon, 1962). Precision measures the degree to which the system delivers only relevant documents and is defined as the number of relevant documents retrieved divided by the total number of documents retrieved, expressed as a ratio or percentage. Recall measures the degree to which the system delivers all relevant documents and is defined as the number of relevant documents retrieved divided by the total number of relevant documents, again expressed as a ratio or percentage. The use of these measures in quantifying the discrimination and collocation objectives makes it possible to generalize about the impact of various factors on retrieval effectiveness. One of the earliest factors studied was indexing depth, the number of index terms assigned to a document. The more index terms assigned (or, alternatively, the more access points a document admits of), the higher the recall and the lower the precision. This is, in part, the scientific explanation of why keyword searching nearly always results in infoglut.
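The two definitions can be made concrete with a small worked example. What follows is only a minimal sketch: the function name `precision_recall`, the document identifiers, and the two result sets are hypothetical, invented for illustration and not drawn from Cleverdon's data. Returning 0.0 when a denominator is zero is likewise an assumed convention, not part of the definitions given above.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    precision = relevant documents retrieved / all documents retrieved
    recall    = relevant documents retrieved / all relevant documents
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = len(retrieved & relevant)  # relevant documents actually retrieved
    # Assumed convention: report 0.0 rather than divide by zero.
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection: four documents are relevant to the query.
relevant = {"d1", "d2", "d3", "d4"}

# A narrow search (few access points) retrieves little.
narrow_search = ["d1", "d2", "d9"]
# A broad search (many access points) retrieves more of everything.
broad_search = ["d1", "d2", "d3", "d4", "d7", "d8", "d9", "d10"]

p_narrow, r_narrow = precision_recall(narrow_search, relevant)
p_broad, r_broad = precision_recall(broad_search, relevant)

print(f"narrow: precision={p_narrow:.2f} recall={r_narrow:.2f}")
print(f"broad:  precision={p_broad:.2f} recall={r_broad:.2f}")
```

In this invented case the broad search raises recall from 0.50 to 1.00 while precision falls from about 0.67 to 0.50, which is the direction of the trade-off described for indexing depth. The figures illustrate the arithmetic only and are not an empirical result.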


 
