(17-745 Machine Learning in Production)
Students enrolled in the PhD-level version of this course (17-745, rather than 17-445/17-645/11-695) will do a research project in the second half of the semester instead of individual assignments 3 and 4.
The end goal of this project should be a 4-page research paper on a topic of your interest that relates to the course material.
We provide extensive flexibility to align the research project with your interests. Combining this with research you are already doing as part of your PhD is encouraged. Projects should relate to both software engineering and machine learning and have a research component. Projects that focus only on core machine-learning topics and pure engineering projects do not qualify.
The goal of this assignment is to perform some self-directed research in the field of the course. This can employ different research methods and take many forms, such as:
- A case study of a specific project, system, failure, or infrastructure component. Such a study could involve examining public artifacts, interviewing the people involved, or other methods.
- An empirical study across multiple projects, in the form of a survey, interviews, controlled experiment, or analyzing public artifacts (mining software repositories).
- An evaluation of a technical design or tradeoff, such as a systematic benchmark of different engineering solutions.
- A literature survey on a specific topic, including surveys of academic literature and surveys of grey literature. Surveys should follow a principled approach and synthesize new insights.
- A replication study of a different study in the field. This can be an exact replication or a conceptual replication that varies aspects of the original study.
Given the short time frame of this project, we do not expect a fully complete study. The outcome should be a self-contained paper and corresponding presentation that motivate the work, describe the research questions and the state of the art, give an overview of the study design, and possibly report preliminary results. For more empirical work, a description and justification of a study with very preliminary results or even without results is acceptable -- if you do not have results, include a section "Expected Outcomes" that describes what you would expect to find if you had more time. For a survey or replication study, we expect at least some preliminary results.
If you are interested, the instructors are happy to work with you beyond the end of the semester on this project to submit it for publication.
The presentation is due at the same time as those for the final team project. The paper is due at 11:59pm the same day.
We strongly suggest the following milestones to better scope the project together with the instructors:
- Sep 21: Reach out to instructors for an initial discussion of possible project ideas
- Oct 12: Send a brief description of the project (not more than 1 page) to the instructors. The description should include references to at least 2 pieces of related work and some initial progress.
- Nov 9: Send an initial draft of the paper to the instructors. It does not need to be complete, but should contain the entire structure, an initial complete draft of the introduction, and some initial content in other sections.
We are happy to discuss ideas or drafts at any time during the semester, not just at the milestones.
If you plan to conduct interviews or surveys as part of the project, please plan early for IRB review and allow time for that review.
Submit a paper and a presentation by the final deadline.
The paper should be in a form submittable to a new-idea track or workshop in the field. It should have at least an introduction motivating the research, one or more clear and motivated research questions, a discussion of the state of the art or related work, and a description of the conducted or planned research. While we do not enforce a specific page limit, we would typically expect around 4 pages in an ACM format, such as for the ICSE-NIER track.
The presentation should be no longer than 8 minutes. How you structure the presentation is up to you. You do not need to cover everything, but consider how to make this interesting to the audience. It will be presented in the same time slot as the presentations from the group project.
Send the paper and slides as attachments or links by email to the instructors.
This assignment is worth 200 points. For full credit, we expect:
- 40 points: The research described in the paper is in scope for the content of this course.
- 70 points: The research described in the paper makes a good faith attempt at exploring something novel in this field.
- 20 points: The paper states a clear research question and motivates why answering that research question is important and to whom.
- 30 points: The paper discusses the state of the art regarding the problem studied
- 10 points: The paper makes a good faith attempt at describing the research method proposed or used to answer the research question
- 10 points: The paper includes some preliminary results or a description of expected results
- 10 points: The paper is reasonably well written and self-contained and would be suitable for a workshop or conference.
- 10 points: A presentation to the entire class that describes the project. Slides were submitted.
We typically expect the research to be conducted individually, tailored to the interests of each student. We are open to discussing joint projects if there is a good justification, for example when two students bring together expertise from different backgrounds that a study needs. Group work must be approved by the instructors.