Factors Associated With Persistence in Science and Engineering Majors: An Exploratory Study Using Classification Trees and Random Forests

Journal of Engineering Education, Jan 2008, by Mendez, Guillermo; Buskirk, Trent D.; Lohr, Sharon; Haag, Susan

ABSTRACT

Many students who start college intending to major in science or engineering do not graduate, or decide to switch to a non-science major. We used the recently developed statistical method of random forests to obtain a new perspective on the variables that are associated with persistence to a science or engineering degree. We describe classification trees and random forests and contrast the results from these methods with results from the more commonly used method of logistic regression. Among the variables available in Arizona State University data, high school and freshman year GPAs have the highest importance for predicting persistence; other variables, such as the number of science and engineering courses taken freshman year, are important for subgroups of the student population. The method used in this study could be employed in other settings to identify faculty practices, teaching methods, and other factors that are associated with high persistence to a degree.

Keywords: classification tree, logistic regression, random forest

(ProQuest: ... denotes formulae omitted.)

I. INTRODUCTION

Many studies have shown a lack of persistence among U.S. students pursuing a science or engineering degree (Besterfield-Sacre, Atman, and Shuman, 1997; Brainard and Carlin, 1997; Burtner, 2005; Grandy, 1998; May and Chubin, 2003; LeBold and Ward, 1988; Leslie, McClure, and Oaxaca, 1998; Levin and Wyckoff, 1991; Rayman and Brett, 1995; Seymour and Hewitt, 1997; White, 2005; Zhang, Anderson, Ohland, and Thorndyke, 2004). These studies have identified a number of variables, such as high school GPA, that are associated with persistence to a degree. Most previous work has identified factors related to persistence using standard statistical methods such as logistic regression. These methods work well for identifying simple relationships in the data. However, when predicting whether a student will graduate with an engineering degree, the relationships are often more complex. For example, female Hispanic students who participate in a mentorship program are more likely to persist to a degree, while some other groups of students in the program are less likely to persist. Such a relationship is easily missed when techniques such as logistic regression are used.

In this paper we use classification trees (Breiman, Friedman, Olshen, and Stone, 1984) to produce a new view of variables associated with persistence to earn a science, technology, engineering, or mathematics (STEM) degree. We also use the recently developed statistical method of random forests (Breiman, 2001), which is related to tree-based classification methods, to identify factors that may be related to persistence but that might not be identified by other statistical procedures such as logistic regression. The primary goal of this paper is to show how classification trees and random forests can be used to identify factors and interactions not found by other methods.
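
The mentorship-program example in the Introduction can be made concrete with a small sketch. The counts below are invented for illustration and are not from the study; the group labels "A" and "B" and the helper names are likewise hypothetical. The sketch uses Gini impurity, a common splitting criterion for classification trees, to show that a split on program participation alone may yield no gain when subgroup effects cancel, while the same split within each subgroup does.

```python
# Hypothetical counts (not study data): (group, in_program) -> (persisted, total)
COUNTS = {
    ("A", True): (80, 100),   # group A, in program: 80% persist
    ("A", False): (50, 100),  # group A, not in program: 50% persist
    ("B", True): (30, 100),   # group B, in program: 30% persist
    ("B", False): (60, 100),  # group B, not in program: 60% persist
}

def expand(counts):
    """Turn the count table into one (group, in_program, persisted) row per student."""
    rows = []
    for (group, in_program), (persisted, total) in counts.items():
        rows += [(group, in_program, True)] * persisted
        rows += [(group, in_program, False)] * (total - persisted)
    return rows

def gini(rows):
    """Gini impurity of the binary outcome 'persisted' within a set of rows."""
    if not rows:
        return 0.0
    p = sum(1 for r in rows if r[2]) / len(rows)
    return 2 * p * (1 - p)

def split_gain(rows, column):
    """Reduction in Gini impurity from splitting rows on a categorical column index."""
    weighted = 0.0
    for value in {r[column] for r in rows}:
        subset = [r for r in rows if r[column] == value]
        weighted += len(subset) / len(rows) * gini(subset)
    return gini(rows) - weighted

GROUP, PROGRAM = 0, 1
rows = expand(COUNTS)
group_a = [r for r in rows if r[GROUP] == "A"]
group_b = [r for r in rows if r[GROUP] == "B"]

# Splitting all students on program participation: opposite effects cancel out.
gain_program_all = split_gain(rows, PROGRAM)
# Splitting on group first, then on program within each group, exposes the effect.
gain_group_all = split_gain(rows, GROUP)
gain_program_a = split_gain(group_a, PROGRAM)
gain_program_b = split_gain(group_b, PROGRAM)

print("program split, all students:", round(gain_program_all, 4))
print("group split, all students:  ", round(gain_group_all, 4))
print("program split within A:     ", round(gain_program_a, 4))
print("program split within B:     ", round(gain_program_b, 4))
```

In this constructed example the program variable looks uninformative for the student body as a whole, so a model with only main effects would assign it no role, whereas recursive partitioning on group and then on program recovers the opposing subgroup effects.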

Zhang et al. (2004) suggested that high school GPA and SAT math scores predicted engineering student graduation. However, these two cognitive variables explained only a small fraction of the overall variability in student graduation persistence rates, suggesting that more predictors are needed to fully understand the nature of persistence in science and engineering. A recent study by Burtner (2005) supports the use of non-cognitive variables, such as confidence in college-level math/science ability, in models to predict student persistence. Other studies (Besterfield-Sacre, Atman, and Shuman, 1997; Brainard and Carlin, 1997) have supported Burtner's assertions by demonstrating associations between graduation rates and attitudinal and belief factors such as self-confidence and perceived ability in engineering, as well as other factors such as work status, high school ranking, and SAT scores. Levin and Wyckoff (1991) also reported that high school GPA and scores on college placement tests in Chemistry, along with grades in Calculus, Chemistry, and Physics courses, were all strong predictors of persistence through the second year of engineering programs. LeBold and Ward (1988) found that first and second semester grades, along with cumulative GPA, were strong predictors of persistence for freshman engineering majors.

The majority of studies investigating persistence in science and engineering have focused on engineering students. Enrollment and tracking of engineering majors may be two key factors related to the restricted scope of these studies. Students enter university engineering programs early in their tenure (i.e., as freshmen) and their progress is directly tracked by the engineering school or college. Tracking becomes more complicated for students majoring in STEM, since students in these majors can change their area of study to another STEM field among and within colleges offering STEM degrees. While the tracking of students may differ between STEM and engineering, both groups of students have similarly low retention rates. The use of cognitive and non-cognitive variables, previously shown to predict persistence in engineering, in models predicting graduation persistence in general STEM fields has been largely unexplored in the research literature up to this point.


 


Content provided in partnership with ProQuest