CS221 Final Project Guidelines


In the final project, you will work in groups of one, two, three, or four to apply the techniques that you've learned in this class to a new setting that you're interested in. Note that the larger the group, the higher the expectations for the project.

You will build a system to solve a well-defined task. Which task you choose is completely open-ended, but the methods you use should draw on the ones from the course.

One thing we'd like you to think of is the social impact of your work. AI is more than just a collection of cool techniques; it has the potential for improving society (as well as doing harm). Please think about how you could use the techniques in this class for the former. Of course, sometimes there might be a trade-off between doing socially impactful work and doing technically interesting work, which is something your team should manage.

After you submit your proposal, you will be assigned one of the CAs as your official mentor. He or she will grade all your work and really get to know your project. You are encouraged to come to office hours often to discuss your project; you can also go to the instructor's or another CA's office hours to get a second opinion. Note that it will take several iterations to find the right project, so be patient; this exploration is an essential part of research, so learn from it. Have fun and don't wait until the last minute!

The final project should consist of the following stages:

Milestones

Throughout the quarter, there will be several milestones so that you can get adequate feedback on the project.

Note: if you need to, you can include an appendix with any figures/plots/examples beyond the maximum number of allowed pages.

Submission

Submit the milestones using the submit script as usual, but make sure that the same member of your group submits on behalf of the entire group each time.

All milestones are due at 11pm (23:00, not 23:59). Late days (up to 2) can be used except for the final report.

For each milestone, you should submit:

For p-proposal, you should also submit a Google Form on Gradescope, which includes project information, team member information, and mentor preference.

For p-final, you should also submit supplementary material. There are two ways to do this. First, you can just package it up: The file size limit is 20MB per file. If the data does not fit in the file size limit, submit a small but meaningful subset of the data.

We encourage you to submit your project as a CodaLab worksheet, which certifies that the experiments in your project are reproducible. With CodaLab, you upload your code and data and interleave the description of your experiments with their actual execution. Extra credit will be awarded to those who produce a meaningful CodaLab worksheet. Just include the link in the final report.

Grading rubric

Your project will be graded on the following dimensions:

Of course, the experiments may not always be successful. Getting negative results is normal, and as long as you make a reasonably well-motivated attempt and explain why the results came out negative, you will get credit.

An example strategy

This is a suggestion of how to approach the final project with an example.

Datasets

You are free to use existing datasets, but these are not necessarily the best match for your problem, in which case you are probably better off making your own dataset.

Libraries

You are free to use existing tools for parts of your project as long as you're clear about what you used. When you use existing tools, the expectation is that you will do more on other dimensions.

Some project ideas

You can also get inspiration from last spring's CS221 projects (student access only).

Frequently asked questions

Can I use the same project for CS221 and another class (CS229, etc.)? The short answer is that you cannot turn in the identical project for both classes, but you can share common infrastructure across the two classes. First, you should make sure that you follow the guidelines for the CS221 project, which are likely different from those of other classes. Second, if any part of the project is done for a purpose outside CS221 (for the final project in CS229 or other classes, or even for your own research), then in the progress and final reports, you must clearly indicate which part of the project was done for CS221 and which part was not. For example, if you're taking CS229, then you cannot turn in the same pure machine learning project for CS221. But you can work on the same broad problem (e.g., news recommendation) for both classes and share the same dataset / generic wrapper code. You should then explore the machine learning aspect of the problem for CS229 (e.g., classifying news relevance) and another topic for CS221 (e.g., optimizing diversity across news articles using search or CSPs).

Are there restrictions on who I can partner up with for the final project? The only hard requirement is that each member of your group must be enrolled in CS221. Thus, if you choose to use the same project for CS221 and another class, all of your partners must be in CS221. If you feel like you have a compelling case for an exception, please submit a request on Piazza detailing the parts of the project used for each class and the reasons for deviating from the project policies.

How do I choose a good baseline and oracle? Both baselines and oracles should be simple and not take much time. The point is not to do something fancy, but to work with the data / problem that you have in a substantive way and learn something from it. Here are some examples of baselines:

Guessing completely at random is technically a baseline, but it's a really bad one because it doesn't tell you much about how easy the problem is.

Here are some examples of oracles:

Always guessing the correct label is technically an oracle, but it's a really bad one, because you'd always get 100% and you don't learn anything from it.
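
As one illustration of the baseline advice above, here is a minimal Python sketch of a majority-class baseline for a classification task. This particular baseline is an assumption chosen for illustration, not one taken from the course's own list, and the toy spam/ham data and the helper names `majority_baseline` and `accuracy` are hypothetical. Unlike random guessing, it at least tells you how skewed the labels are, which gives a more informative floor to compare your system against.

```python
from collections import Counter


def majority_baseline(train_labels):
    """Return a predictor that always outputs the most common training label."""
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    # The predictor ignores its input entirely; that is what makes it a baseline.
    return lambda example: most_common_label


def accuracy(predict, examples, labels):
    """Fraction of examples for which the predictor returns the gold label."""
    correct = sum(1 for x, y in zip(examples, labels) if predict(x) == y)
    return correct / len(labels)


# Hypothetical toy data, made up purely for illustration.
train_labels = ["spam", "ham", "ham", "ham"]
test_examples = ["win cash now", "lunch at noon?", "meeting moved", "free prize"]
test_labels = ["spam", "ham", "ham", "spam"]

baseline = majority_baseline(train_labels)
baseline_accuracy = accuracy(baseline, test_examples, test_labels)
print(f"majority-class baseline accuracy: {baseline_accuracy:.2f}")
```

The same `accuracy` helper could be reused to score an oracle and your actual system, so that all three numbers are measured the same way and the gap between baseline and oracle is easy to read off.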