CMPS142, Spring 2015, Section 01: Project

All information regarding the project will be posted here. See the attached project information page for details.

Projects can be done in groups, preferably of two or three, so you should start forming a group.


For the final project there are two options:

  • Do your own project: If you have a project of your own in mind, meet with the instructor as soon as possible to make sure the problem is well defined and doable within the limited timeframe.
  • Pick one from the suggestions below.



Suggested Options:

 1) Titanic: Machine Learning from Disaster.

     This looks like a good problem to work on as a project. It comes with a tutorial (in Python and R) to help people get started with the data. You will need to create a free Kaggle account. Look at the data, read about the problem, and see whether it interests your group.

 2) Prediction of first-year performance. I have a data file of students admitted in Fall '13 with some admissions data (SAT scores, high school GPA, etc.) and their first-year GPA. The goal is to predict the first-year GPA from the admissions data; do not use any of the first-year fields as input features. In addition to the training data, I have attached a test data file that contains the input data only, without the labels (FirstYrCumGpa). This is a somewhat more open-ended problem because the data is not "clean": there are several missing values (e.g. some students took the ACT exam rather than the SAT, and some students took both). The success metric is also flexible. You could define good performance as a sufficient GPA and turn this into a classification problem, or you could treat it as a regression problem.
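
     As a rough illustration of the two choices described above (handling missing test scores, and regression versus classification targets), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the toy rows are invented, the column names SAT, ACT, and HSGPA are hypothetical (only FirstYrCumGpa comes from the data description), mean imputation with a "missing" flag is just one of several reasonable strategies, and the 3.0 cutoff for a "sufficient" GPA is arbitrary.

```python
from statistics import mean

# Toy rows invented purely for illustration. The column names SAT, ACT and
# HSGPA are hypothetical; only FirstYrCumGpa is named in the project text.
rows = [
    {"SAT": 1800, "ACT": None, "HSGPA": 3.6, "FirstYrCumGpa": 3.2},
    {"SAT": None, "ACT": 28, "HSGPA": 3.9, "FirstYrCumGpa": 3.5},
    {"SAT": 1500, "ACT": 22, "HSGPA": 3.1, "FirstYrCumGpa": 2.4},
]


def impute_with_mean(rows, column):
    """Return copies of rows with missing `column` values replaced by the
    mean of the observed values, plus a 0/1 `<column>_missing` flag."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = mean(observed)
    result = []
    for r in rows:
        r = dict(r)  # copy so the original row is left untouched
        r[column + "_missing"] = int(r[column] is None)
        if r[column] is None:
            r[column] = fill
        result.append(r)
    return result


def to_label(gpa, threshold=3.0):
    """Binary label: 1 if the GPA meets the (arbitrary) threshold, else 0."""
    return int(gpa >= threshold)


cleaned = impute_with_mean(impute_with_mean(rows, "SAT"), "ACT")

# Input features use admissions fields only, never first-year fields.
features = [
    [r["SAT"], r["SAT_missing"], r["ACT"], r["ACT_missing"], r["HSGPA"]]
    for r in cleaned
]

# Regression view: predict the GPA itself.
regression_targets = [r["FirstYrCumGpa"] for r in cleaned]

# Classification view: predict whether the GPA is "sufficient".
classification_targets = [to_label(g) for g in regression_targets]
```

     On the real data, the fill values would be computed from the training file and then reused on the test file, since the test file has no labels and should not influence how the model is fit.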