Sound and speech in information retrieval: An introduction
Bulletin of the American Society for Information Science, Jun/Jul 2000 by Goodrum, Abby, Rasmussen, Edie
Through most of its history, information retrieval was synonymous with retrieval of the printed word. The last 30 years have seen rapid progress in retrieval of digital information: first Boolean queries, then natural language, now moving toward question answering. New technologies, and in particular the introduction of the World Wide Web as an information delivery mechanism, have changed our expectations for information organization and retrieval in ways that go beyond simple text retrieval. As information in media such as image, video and sound proliferates, we demand new ways of organizing and searching it. We want to match a face or an image, hum a tune to locate it, instantly find the right video clip. Research into techniques for content-based retrieval of images has resulted in systems that deliver images based on their color, texture and shape; recognition and retrieval in domains such as faces, fingerprints and trademarks are available. Researchers working with video address problems such as segmenting video by scene or topic, creating abstracts and supporting rapid browsing. Audio retrieval, especially music and speech, is also an active research area.
Retrieval of sound was explored in two panels presented at the ASIS Annual Meeting in Washington, DC, in November 1999. The first, The Sound of Information: Auditory Browsing and Audio Information Retrieval, was organized and moderated by Abby Goodrum, with papers presented by Stephen Downie, University of Illinois; Marilyn Tremaine, Drexel University; and Myke Gluck, Florida State University. The second, Information Retrieval from Speech, was organized and moderated by Edie Rasmussen, with papers presented by Ellen Voorhees, NIST; Douglas Oard, University of Maryland; Matthew Siegler, Media-Site; and Lynn Connaway, University of Denver, and Bob Bruce, netLibrary. Recognizing the commonality of theme for the two panels - the use of sound in support of information retrieval - the presenters were invited to document their presentations for the Bulletin, and this special section is the result.
Speech retrieval presents all the problems of text retrieval, while adding a layer of its own. For instance, before retrieval operations can be conducted on speech, the speech must be transcribed into text. The high cost of manual transcription was long a barrier to this process. With automatic speech recognition, the cost and time barrier has fallen, but the recognition process is far from perfect (especially with multiple speakers and accented speech), which raises research questions about the impact of imperfect text on retrieval performance. Moreover, language patterns in speech differ from those in text and vary widely with the circumstances surrounding the speech, which raises questions about the effectiveness of retrieval techniques developed and tested on text. The combination of continuous speech recognition and information retrieval is referred to as spoken document retrieval. Speech is also a component of other media, such as video, which allows the retrieval process to draw on combined sources of evidence but adds complexity to the process.
Non-verbal audio is also a rich source of information whether we are talking about the subtle interplay of violins and cellos in a symphony or the familiar Doppler effect created by a race car as it speeds past. Music, for example, has its own semantics, calling for new forms of retrieval. The problems of digitizing and segmenting are joined by problems relating to the representation of non-textual, non-verbal information. Finding, for example, all instances of a certain pitch, harmony, rhythm or timbre challenges IR systems built to essentially match words in a query to words in a database.
Not only does audio convey its own information content, but it can also be used as an adjunct to other channels of information acquisition. Spreading our information retrieval and browsing abilities across sensory modalities allows visual attentiveness to be used elsewhere. For example, we use sound direction, echo and loudness as navigational tools and way-finding anchors.
In spite of its great potential - both as an information-bearing object and as an adjunct to support information seeking - research into audio retrieval and browsing is in its infancy. The purpose of this special section of the Bulletin is to provide a broad introductory perspective on the challenges and opportunities embodied in audio information retrieval.
Overview
Spoken document retrieval has been a research program (track) within the TREC (Text REtrieval Conference) since 1997. Its addition recognized the potential of this domain for retrieval in large multimedia collections. In her paper, "The TREC Spoken Document Retrieval Track," Voorhees describes the track and its success in providing a research infrastructure and impetus for improvement in retrieval performance from spoken documents. Oard, in his paper "User Interface Design for Speech-Based Retrieval," makes a compelling case for the future of information retrieval from the growing corpora of audio broadcasts. He describes current research programs and argues that interface design will be critical in formulating queries that take advantage of the varied features of speech.


