Is There a Way for Pathologists to Decrease Interobserver Variability in the Diagnosis of Dysplasia?
Archives of Pathology & Laboratory Medicine, Feb 2005 by Montgomery, Elizabeth
Many obstacles interfere with our efforts to screen patients with Barrett esophagus. Probably the largest is choosing the appropriate patient group for screening. Beyond this problem, sampling error on the part of endoscopists is probably a more serious problem than observer variation among pathologists reviewing patient samples. Pathologists agree well on lesions that merit close follow-up or other intervention (high-grade dysplasia and invasive carcinoma), although interobserver agreement among pathologists interpreting lesser lesions is not good. This lack of agreement is not likely to improve substantially, and many adjunct markers are being sought in an attempt to identify patients with lower-grade lesions that are most likely to progress, allowing doctors to identify patients who would benefit from upgraded surveillance.
(Arch Pathol Lab Med. 2005;129:174-176)
About 20 years ago, criteria for grading dysplasia in ulcerative colitis were developed by a group of observers.1 Cases were categorized as negative for dysplasia or as low-grade or high-grade dysplasia. However, these observers also noted that there was a subset of cases in which it was unclear whether the epithelial changes were truly neoplastic or simply the result of ongoing repair. Such cases tended to display abundant inflammation and were classified as indeterminate for dysplasia. When criteria were subsequently published for grading dysplasia in Barrett esophagus, the indeterminate category was conceptually retained but the preferred term has become indefinite.2 As such, currently Barrett esophagus cases are classified as negative for dysplasia; indefinite for dysplasia; dysplasia, low-grade; or dysplasia, high-grade.3
In 1988, to assess criteria for grading dysplasia in Barrett esophagus,2 71 test cases were circulated twice to 10 observers, and percent agreement was calculated. There was about 60% agreement in separating cases into 3 groups: negative for dysplasia, indefinite and low-grade dysplasia (lumped), and high-grade dysplasia and invasive carcinoma. The authors indicated that observer variation was a significant problem at the low end of the spectrum and suggested that such problems would be resolved by newer, more objective techniques emerging at the time. Using the criteria published in 1988 as a basis for our own review,3 my colleagues and I circulated 2 sets of 125 slides twice each to 12 observers. Between circulation of the 2 sets of slides, a consensus criteria meeting was held. With the benefit of previously published criteria, we were able to attain about 75% agreement in separating the same categories (negative vs indefinite and low grade vs high grade and cancer), but we continued to have difficulty with observer agreement at the lower end of the diagnostic spectrum. Our improvement after the consensus meeting was modest, and some observers had poorer agreement after the meeting. In evaluating the cases, we used κ statistics and percent agreement.
κ Statistics were initially developed to assess observations in psychiatric studies, which were believed to have the potential for subjectivity and for which there was a concern that any observed agreement might be accounted for by chance alone.4,5 They were therefore designed to correct observed agreement for the agreement expected by chance alone.4-6 Generated κ scores are quite unforgiving: they range from negative values (agreement worse than expected by chance) up to 1. Verbal scales have thus been developed together with the calculated numerical ones5,6: poor, from any negative value to 0; slight, 0 to 0.2; fair, 0.2 to 0.4; moderate, 0.4 to 0.6; substantial, 0.6 to 0.8; and almost perfect, 0.8 to 1.0.
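As a rough illustration of the chance correction described above, the following Python sketch computes a κ score for 2 observers reading the same cases and maps a score onto the verbal scale. It is a simplified 2-observer (Cohen) form, not necessarily the calculation used in the multiobserver studies discussed here; the function names are hypothetical, and the handling of the boundary values (each upper limit treated as belonging to the lower category) is an assumption, because the verbal scale does not specify it.

```python
from collections import Counter


def cohen_kappa(reads_a, reads_b):
    """Chance-corrected agreement between 2 observers (illustrative sketch)."""
    if len(reads_a) != len(reads_b) or not reads_a:
        raise ValueError("both observers must read the same non-empty case set")
    n = len(reads_a)
    # Observed agreement: fraction of cases given the same diagnosis.
    observed = sum(a == b for a, b in zip(reads_a, reads_b)) / n
    # Expected agreement by chance, from each observer's category frequencies.
    freq_a, freq_b = Counter(reads_a), Counter(reads_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both observers used one identical category throughout; kappa is
        # undefined here, and returning 1.0 is a convention of this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def verbal_label(kappa):
    """Map a kappa score to the verbal scale (boundary handling is assumed)."""
    if kappa <= 0.0:
        return "poor"
    for upper, label in [(0.2, "slight"), (0.4, "fair"), (0.6, "moderate"),
                         (0.8, "substantial"), (1.0, "almost perfect")]:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# Hypothetical reads of 10 cases by 2 observers (not data from any study).
observer_1 = ["neg", "neg", "neg", "neg", "hgd", "hgd", "hgd", "hgd", "neg", "hgd"]
observer_2 = ["neg", "neg", "neg", "hgd", "hgd", "hgd", "hgd", "neg", "neg", "hgd"]
kappa = cohen_kappa(observer_1, observer_2)
print(round(kappa, 2), verbal_label(0.65))
```

In this made-up example the observers agree on 8 of 10 cases (80% agreement), but because each calls half of the cases negative, 50% agreement is expected by chance, and the κ score falls to about 0.6. This is the sense in which κ scores are less forgiving than percent agreement.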
In our study, when κ scores were calculated by diagnostic category, our scores were 0.65 (substantial) for high-grade dysplasia/carcinoma and 0.58 (moderate to substantial) for Barrett esophagus without dysplasia, but the scores were 0.32 (fair) and 0.15 (slight) for low-grade dysplasia and indefinite for dysplasia, respectively.3 How do we obtain better κ scores?
Increases in κ scores probably are best accomplished by having fewer categories of readily separable entities, which is unrealistic for highly variable lesions such as Barrett esophagus dysplasia. For example, Cross et al7 reported κ scores of 0.84 to 0.98 in separating hyperplastic and adenomatous rectal polyps. However, they noted that one inexperienced observer in their study had a κ score of 0.46, which improved after a tutorial. A similar study in which observers were asked to separate serrated adenomas, sessile serrated polyps,8 hyperplastic polyps, and tubular adenomas probably would yield considerably lower κ scores. In a situation more similar to the separation of diagnostic categories in Barrett esophagus, a new classification for pancreatic intraepithelial neoplasia was published in 2001.9 In that study, κ scores of 0.43, 0.14, and 0.42 were obtained for the diagnostic categories of 1, 2, and 3, respectively.
In grading Barrett dysplasia, we developed an algorithm in which 4 overall categories were assessed: surface maturation, low-magnification architecture, cytologic features, and inflammation.3 Lesions classified as negative for dysplasia display surface maturation, ample lamina propria compared with glands, bland cytologic features, and typically little inflammation (Figure 1). Those lesions regarded as indefinite for dysplasia retain their architectural ratio of glands to lamina propria, have surface maturation (Figure 2), have nuclear alterations that are not particularly prominent, and tend to be complicated by inflammation (Figure 3). Low-grade dysplasia cases lose surface maturation but retain nuclear polarity (ie, the long axes of the nuclei remain perpendicular to the basement membrane in the pencillate fashion of tubular adenomas), display nuclear alterations, have minimal glandular crowding, and typically lack inflammation (Figure 4). In high-grade dysplasia, surface maturation is lost, glands become crowded (overrunning the lamina propria), nuclear alterations become striking (Figure 5), and abundant inflammation is not typical (but can be observed).
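The 4 features described above can be laid out as a small lookup table. The Python sketch below is only an illustration of how the typical profiles differ, not the published algorithm and not a diagnostic tool: the string codes, the reduction of each feature to a single typical value, and the `closest_categories` function are all assumptions made for this example.

```python
# Typical feature profile for each category, paraphrased from the text.
# The keys and string codes are hypothetical labels chosen for this sketch.
TYPICAL_PROFILES = {
    "negative for dysplasia": {
        "surface_maturation": True,
        "architecture": "preserved",
        "cytology": "bland",
        "inflammation": False,
    },
    "indefinite for dysplasia": {
        "surface_maturation": True,
        "architecture": "preserved",
        "cytology": "mild alterations",
        "inflammation": True,
    },
    "dysplasia, low-grade": {
        "surface_maturation": False,
        "architecture": "minimal crowding",
        "cytology": "alterations, polarity retained",
        "inflammation": False,
    },
    "dysplasia, high-grade": {
        "surface_maturation": False,
        "architecture": "crowded",
        "cytology": "striking alterations",
        "inflammation": False,  # not typical, but can be observed
    },
}


def closest_categories(observed):
    """Return the categories whose typical profile shares the most features
    with the observed findings; ties return more than one category."""
    scores = {
        category: sum(observed.get(key) == value for key, value in profile.items())
        for category, profile in TYPICAL_PROFILES.items()
    }
    best = max(scores.values())
    return [category for category, score in scores.items() if score == best]


# Hypothetical biopsy: crowded glands and striking nuclei, with inflammation.
findings = {
    "surface_maturation": False,
    "architecture": "crowded",
    "cytology": "striking alterations",
    "inflammation": True,
}
print(closest_categories(findings))
```

Counting matching features lets an atypical finding, such as inflammation in an otherwise high-grade lesion, lower the match without ruling out the category, which reflects the "typically" qualifiers in the description.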


