Defending and extending difference score methods

Journal of Management, Fall, 1994 by John Tisak, Carlla S. Smith

We define difference scores as the difference between distinct but conceptually linked constructs. This definition should not be confused with change scores, or the difference between a single construct measured at two or more points in time.

In the disciplines of education and human development, the attack against difference scores has stemmed from their use for assessing change on multiple measurements of some within-person characteristic (e.g., changes in abilities or skills) over time, usually in response to some type of treatment. Critics note that these change or difference scores must have some variability to function as good predictors (or outcomes), which they often do not, and that they frequently correlate with the initial level of the characteristic measured. As a consequence of these problems, several researchers (e.g., Cronbach & Furby, 1970; Lord, 1958; Werts & Linn, 1970) suggest that difference measures should be abandoned in favor of other techniques, such as residualized gain scores and regression-based estimates of change (Cronbach & Furby, 1970). Other researchers (e.g., Rogosa, Brandt & Zimowski, 1982; Rogosa & Willett, 1983; Zimmerman, Brotohusodo & Williams, 1981), however, disagree with this position, claiming that difference scores provide unique information on intraindividual change and should not be dismissed simply because they may not always be useful.

The historical arguments against difference scores that have arisen in educational and developmental research, however, often do not directly translate to management research. For example, there are notable distinctions between the difference scores criticized by psychometricians and the difference scores used by organizational researchers. Traditional psychometric arguments have mostly concerned change scores, or scores on identical variables over time. These measures are usually single pre and post scores collected from individual subjects. The difference scores collected by organizational researchers are often composite (multiple item), multiple source measures collected at a single point in time. Many of the measurement concerns about single item, single source, longitudinal data are not as relevant to multiple item, multiple source, cross-sectional data. As further evidence of their utility, differences among measures are implicit in our commonly used statistical procedures, such as analysis of covariance and repeated measures analysis of variance. Therefore, difference scores in general are certainly useful and acceptable measures.

Most studies that have used some type of difference representation have operationalized difference scores by combining two or more measures into a single index. For example, the most common bivariate indices of agreement are the algebraic, absolute, and squared difference measures. The algebraic difference index is the algebraic difference between two measures ($X - Y$); the absolute difference index is simply the absolute difference between two measures ($|X - Y|$); and the squared difference index is the squared difference between two measures ($(X - Y)^2$). Although different types of difference scores may yield different patterns of results, the selection of a specific type, as far as we can discern, has typically not been based on any identifiable, objective criteria.
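As a minimal illustration of the three bivariate indices just described, the sketch below computes each one for a few pairs of scores. The function names and the example ratings are hypothetical and are not taken from any study cited here; they only show how the indices behave.

```python
def algebraic_difference(x, y):
    """Signed difference X - Y; keeps the direction of the discrepancy."""
    return x - y


def absolute_difference(x, y):
    """|X - Y|; keeps the size of the discrepancy but drops its direction."""
    return abs(x - y)


def squared_difference(x, y):
    """(X - Y)^2; drops direction and weights large discrepancies more heavily."""
    return (x - y) ** 2


# Hypothetical pairs of scores on two conceptually linked measures
# (illustrative values only, not data from the article).
pairs = [(5, 3), (3, 5), (4, 4), (7, 3)]

for x, y in pairs:
    print(
        x,
        y,
        algebraic_difference(x, y),
        absolute_difference(x, y),
        squared_difference(x, y),
    )
```

In this sketch the pairs (5, 3) and (3, 5) receive opposite algebraic differences but identical absolute and squared differences, which is one reason the choice of index can change the pattern of results.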

Although not as widely used as the simpler types of difference measures, the more complex profile similarity indices are often preferred because they consider profile (i.e., dimension) level, dispersion, and shape, whereas simpler indices consider only level. Furthermore, as we will discuss later, they also ameliorate some of the traditional criticisms (e.g., reliability and model evaluation) of difference scores. The most commonly used profile similarity indices are the sum of absolute differences ($\sum_i |X_i - Y_i|$); the sum of squared differences ($\sum_i (X_i - Y_i)^2$); the square root of the sum of squared differences ($[\sum_i (X_i - Y_i)^2]^{1/2}$); and the correlation between profiles of the component variables ($r_{xy}$, where the correlation is calculated between entities, e.g., respondents, rather than between measures).
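The four profile similarity indices listed above can be sketched as follows. This is an illustration under stated assumptions, not a procedure from the cited literature: the function names are hypothetical, the two profiles are invented values, and the correlation is an ordinary Pearson correlation computed across the dimensions of the two profiles.

```python
import math


def _check_lengths(x, y):
    # Both profiles must be measured on the same set of dimensions.
    if len(x) != len(y):
        raise ValueError("profiles must have the same number of dimensions")


def sum_abs_diff(x, y):
    """Sum over dimensions of |X_i - Y_i|."""
    _check_lengths(x, y)
    return sum(abs(a - b) for a, b in zip(x, y))


def sum_sq_diff(x, y):
    """Sum over dimensions of (X_i - Y_i)^2."""
    _check_lengths(x, y)
    return sum((a - b) ** 2 for a, b in zip(x, y))


def euclidean_distance(x, y):
    """Square root of the sum of squared differences."""
    return math.sqrt(sum_sq_diff(x, y))


def profile_correlation(x, y):
    """Pearson correlation between two profiles, taken across dimensions.

    Undefined (division by zero) when either profile is completely flat.
    """
    _check_lengths(x, y)
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical profiles on four dimensions: identical shape, different level.
profile_x = [2, 4, 6, 8]
profile_y = [4, 6, 8, 10]

print(sum_abs_diff(profile_x, profile_y))         # 8
print(sum_sq_diff(profile_x, profile_y))          # 16
print(euclidean_distance(profile_x, profile_y))   # 4.0
print(profile_correlation(profile_x, profile_y))  # 1.0
```

In this example the two profiles differ by a constant two points on every dimension, so the distance-type indices register a discrepancy while the profile correlation is perfect. The correlation responds to shape only, whereas the summed differences also respond to level.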

Regardless of type, difference scores have been roundly criticized. For example, Cronbach (1958) argued against the use of profile similarity measures in person perception research. Johns (1981) admonished researchers for using any type of simple difference or profile similarity measure. More recently, Edwards critically examined several types of difference and profile similarity measures, both within the theoretical framework of the person-environment fit model of stress (Edwards & Cooper, 1990) and in organizational behavior research in general (Edwards, in press). All of these authors raised several issues related to the use of difference scores, although their primary concerns involved either reliability or other measurement problems (e.g., validity, model evaluation).


 


Content provided in partnership with Thompson Gale