Muscle tension dysphonia in patients who use computerized speech recognition systems

Ear, Nose & Throat Journal, March, 2004 by David E.L. Olson, Raul M. Cruz, Krzysztof Izdebski, Tracey Baldwin

Abstract

The use of speech recognition systems as a replacement for other types of transcription systems is increasing rapidly, partly because many people are unable to use conventional keyboards as a result of upper-extremity repetitive strain injury (RSI). However, the frequent or continuous use of such systems can cause muscle tension dysphonia in some patients. The scientific literature suggests that there is an association between upper-extremity RSI and muscle tension dysphonia. We present a retrospective case series of five patients with workplace upper-extremity RSI who developed muscle tension dysphonia soon after they began using discrete computerized speech recognition software. The diagnosis of dysphonia was based on laryngovideostroboscopy, acoustic analyses, and voice load testing. All patients had normal voice when using everyday speech, but speaking into the computer resulted in the rapid onset of aperiodicity, strain, and a decrease in fundamental frequency. In three of the five patients, laryngovideostroboscopy showed posterior glottic overapproximation, but no other abnormalities. Treatment was centered on voice therapy and avoidance of long periods of using computerized speech recognition systems. The condition of three of the five patients improved with therapy. We conclude that computer speech recognition programs can lead to the onset of muscle tension dysphonia in some patients. These patients can be successfully treated with voice therapy.

Introduction

Computerized speech recognition systems receive voice input through a microphone and convert it into written text. Systems differ in the style of speech input they accept. Discrete systems require distinct enunciation of each word with a short pause between words. Continuous systems do not require such a pause and thus allow for a more natural rate and flow of speech input.

Speech recognition systems are assumed to be useful replacements for a variety of manual transcription systems, particularly for patients who have repetitive strain injury (RSI) of the upper extremity, a condition often related to computer keyboard use. However, frequent or continuous use of speech recognition systems can cause muscle tension dysphonia, which is another form of RSI. (1)

In this article, we describe our retrospective review of the cases of five patients with workplace RSI who developed muscle tension dysphonia soon after they began using a computerized speech recognition system. Although an association between workplace RSI and muscle tension dysphonia has been suggested previously, (2) to our knowledge, our report is the first to present objective evidence of vocal dysfunction and to describe treatment options.

Patients and methods

Two of the authors (K.I. and T.B.)--both speech pathologists--retrospectively reviewed the outpatient medical records of the five patients and noted epidemiologic characteristics, signs and symptoms, and related factors (table). The five patients--three men and two women--ranged in age from 33 to 53 years. All five had a history of upper-extremity RSI severe enough to limit use of a keyboard, and all had used a variety of discrete speech recognition systems to continue their work. One patient (patient 2) had gastroesophageal reflux disease and a history of vocal abuse; none of the others had a history of voice pathology, and none smoked or habitually drank alcohol.

All patients had been evaluated by objective and perceptual means. Laryngovideostroboscopy was performed on four of these patients during phonation to examine the symmetry and morphology of the vocal folds and surrounding structures and to measure the periodicity of the mucosal waveform. Laryngovideostroboscopy was performed with a stroboscopy unit (Kay Elemetrics; Lincoln Park, N.J.) and either a 70° or 90° rigid laryngoscope, which was passed transorally. Acoustic analysis of fundamental frequency was performed with a Computerized Speech Lab system (Model 4100; Kay Elemetrics). This analysis was performed on the patient's habitual voice as well as the "computer voice," which the patient produced when using a voice recognition system. Three of the five patients also underwent a voice load test, which evaluates voice over time in 5-minute intervals while the patient speaks all voiced segments repeatedly without breaks. (3)

Results

All five patients developed symptoms of dysphonia within 2 to 8 weeks after they began using a voice recognition system. The typical symptom pattern included hoarseness, which progressed to a strained and fatigued voice and in some cases proceeded to temporary aphonia. In addition, all patients complained of progressive odynophonia.

Objective data revealed several key similarities among patients. First, each patient's computer voice tended to differ from his or her normal speaking voice in both pitch and quality. The fundamental frequency of the computer voice was 5 to 30% lower than that of the normal speaking voice, and the computer voice had a monotonous quality. Second, in most patients, vocal function was relatively normal during natural speaking, but it quickly deteriorated into dysphonia when the computer voice was used.

 


Content provided in partnership with Thomson Gale