What is at stake with high stakes testing? A discussion of issues and research
The Ohio Journal of Science, April 2004, by Gregory J. Marchant
ABSTRACT. High stakes tests are defined as those tests that "carry serious consequences for students or educators." The consequences from standardized achievement tests range from grade retention for school children to rewards or punitive measures for schools and school districts. The nature of standardized achievement tests used in these situations poses validity problems for the decisions. Numerous unintended negative consequences for students, teachers, curriculum, and schools have been identified. Research has yet to establish clear benefits from these high stakes practices. Therefore, with little empirical support and financial and human costs high, a costs/benefits analysis suggests that the high stakes testing bandwagon, further fueled by No Child Left Behind, needs to be carefully evaluated before it continues to roll.
INTRODUCTION
For both advocates and opponents of the use of standardized tests in decisions regarding students, teachers, and educational policies, the answer to "what is at stake with high stakes testing?" is the same. The answer is, "everything." In an effort to implement accountability measures for districts, schools, teachers, and even individual students, testing originally designed to provide information regarding individual student achievement and ability for diagnostic/prescriptive teaching efforts is now being used as the measuring stick for evaluating the success of students, teachers, schools, districts, and even states. With important decisions resting on the results of certain test scores, it is important to know how well the scores reflect the quality of learning and education. It is also important to consider whether decisions based on these tests tend to reflect accurate interpretations and result in best practice. Even a potentially useful tool for education may be considered inappropriate if its use routinely results in harm to children.
This article began as a review of the current research to explore the results of high stakes testing; of particular interest was its effect on student learning. Surprisingly, and unfortunately, the impact of high stakes testing on student achievement has not been investigated. Therefore, this article reviews the research and concerns addressed in the literature regarding high stakes testing.
DEFINITION OF HIGH STAKES TESTING
A position statement issued by the American Educational Research Association in July of 2000 described high-stakes testing as follows:
Many states and school districts mandate testing programs to gather data about student achievement over time and to hold schools and students accountable. Certain uses of achievement test results are termed "high stakes" if they carry serious consequences for students or educators. Schools may be judged according to the school-wide average scores for their students. High school-wide scores may bring public praise or financial rewards; low scores may bring public embarrassment or heavy sanctions. For individual students, high scores may bring a special diploma attesting to exceptional academic accomplishment; low scores may result in students being held back in grade or denied a high school diploma.
The statement then identified the 1999 Standards for Educational and Psychological Testing as guidelines for high-stakes testing efforts. The guidelines include protection against high-stakes decisions based on a single test, full disclosure of likely negative consequences of high-stakes testing programs, alignment of the test and the curriculum, opportunities for remediation for those who fail, and appropriate attention to language differences and disabilities.
High-stakes tests are usually national or state-wide standardized achievement tests. If a test is "standardized," it has set rules for administration, such that everyone taking the test receives exactly the same directions and has the same restrictions of time and resources. Achievement tests are usually written for one specific grade level and designed to create a distribution of scores. Popular national standardized achievement tests are the Terra Nova and the Stanford-9. Many states have taken up the costly task of developing their own state achievement tests aligned with their state's standards. Some of these tests were developed in conjunction with national test makers and share items. The SAT is not an achievement test, but an aptitude test designed to predict college achievement; however, because of its influence on college admissions decisions, it is also considered a high-stakes test.
THE NATURE OF STANDARDIZED ACHIEVEMENT TESTS
Most standardized achievement tests are norm-referenced, in that how well an individual does on the test is determined by comparison with a large group of test takers. "Good" is relative to others at the same grade level. This is in contrast to a criterion-referenced test, which defines how well one does by whether criteria are met or a standard is mastered. High stakes decisions tend to involve either relative comparisons or reaching a pre-defined cut-off point. However, the decision as to where the cut-off point will be is almost always informed by norm-referenced information, such as the difficulty levels of the items selected or even the percentile rank of a score. Thus, if a cut-off score equates to the 40th percentile, the decision makers know that approximately 40% of the test-takers will not "pass" the test. The setting of the cut-off score is therefore very important on high-stakes tests that require passage. For example, if a state like Ohio, which averages 140,000 students at each grade level, were to raise the cut-off score for a required achievement test by 5 percentile points, approximately 7,000 more children would not reach the cut-off at each grade level.
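The arithmetic behind the Ohio example can be sketched in a few lines of Python. This is only an illustration, not a model of any actual state test. It assumes that the students taking the test are distributed like the norming group, so that a cut-off at a given percentile rank leaves roughly that percentage of test-takers below it. The function names are hypothetical, and the starting cut-off at the 40th percentile is borrowed from the earlier illustration rather than from any stated Ohio policy.

```python
def students_below_cutoff(cohort_size: int, cutoff_percentile: float) -> int:
    """Approximate count of students scoring below a cut-off set at a percentile rank.

    Assumes the tested cohort matches the norming distribution.
    """
    return round(cohort_size * cutoff_percentile / 100)


def additional_students_below(cohort_size: int, old_percentile: float,
                              new_percentile: float) -> int:
    """Extra students who fall below the cut-off when it moves between percentiles."""
    return (students_below_cutoff(cohort_size, new_percentile)
            - students_below_cutoff(cohort_size, old_percentile))


# Figure from the text: about 140,000 students per grade level in Ohio.
OHIO_STUDENTS_PER_GRADE = 140_000

# Assumed starting point (40th percentile), raised by 5 percentile points.
extra = additional_students_below(OHIO_STUDENTS_PER_GRADE, 40, 45)
print(extra)  # 7000
```

Under this simplifying assumption the size of the increase depends only on the cohort size and the 5-point shift, not on where the cut-off started, which is why a seemingly small adjustment affects thousands of children at every grade level.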


