The basic idea is to work on this project in an open, reproducible format. Our initial goal is to develop an accurate algorithm for predicting bachelor's degree completion using data from the Education Longitudinal Study of 2002 (ELS:2002). We'd like to demystify what goes into creating these algorithms, and also describe how they can be used and misused.
To begin with, we'll use data on students who began at four-year institutions in academic year 2004, drawn from ELS:2002 via the online codebook.
This project will explore the use of algorithms to predict student success, and in particular how these algorithms reinforce existing inequality. Predicting who does and doesn't succeed in our current system says more about the institutions themselves than it does about students, yet existing patterns of student success are often used to justify decisions about who will and won't be given resources. We'd like to explore this tension.
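One way to make this tension concrete is to compare a model's errors across groups of students. The sketch below is only an illustration: the records are invented toy values, not ELS:2002 data, and the group labels and the function name `false_negative_rate_by_group` are hypothetical. It computes, for each group, the share of students who actually completed a degree but were predicted not to.

```python
from collections import defaultdict

# Invented toy records, NOT ELS:2002 data.
# Each row: (group label, completed degree, model predicted completion)
records = [
    ("A", True, True),
    ("A", True, True),
    ("A", True, False),
    ("A", False, False),
    ("B", True, False),
    ("B", True, False),
    ("B", True, True),
    ("B", False, False),
]


def false_negative_rate_by_group(rows):
    """Per group, the share of actual completers predicted not to complete."""
    completers = defaultdict(int)
    missed = defaultdict(int)
    for group, completed, predicted in rows:
        if completed:
            completers[group] += 1
            if not predicted:
                missed[group] += 1
    return {group: missed[group] / completers[group] for group in completers}


rates = false_negative_rate_by_group(records)
for group, rate in sorted(rates.items()):
    print(f"group {group}: false negative rate = {rate:.2f}")
```

In these toy records the model misses a larger share of completers in group B than in group A. If predictions like these were used to allocate resources, students in group B who would in fact succeed would be passed over more often.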
In addition to traditional academic venues, we anticipate displaying these results in interactive formats.
Will Doyle, Crystal Han, Ozan Jaquette, Patricia Martin, Monique Ositelu, Karina Salazar, Benjamin Skinner
Feel free to download the repository and use what we've posted. If you'd like to be included on this project, submit a pull request. Depending on the core team's assessment of the level of contribution, an accepted pull request will either be acknowledged or could lead to inclusion in some of the project's products.