Your final project in this course will be a collaborative effort that uses each element of the data science process to answer questions on a topic of your choosing. Your team will be responsible for finding and cleaning data; producing visualizations and exploratory analyses; producing concrete data-centric deliverables; and disseminating results. You are expected to organize your work and to collaborate using best practices.
You will work closely with other classmates in a team of 4 on this project, and are free to form teams of your choosing. If you can’t find a team, or wish to form a team of a different size, please reach out to the instructor. In general, we do not anticipate that the grades for each group member will be different. We do, however, reserve the right to assign different grades to each group member based on peer assessments or public records of contribution (e.g. through commit histories).
Date | Description | Deliverable |
---|---|---|
November 6 by 1:00 | Form a team and submit a proposal | Written proposal document |
November 13-17 | Project review meeting | In-person meeting – no “deliverable” |
December 6 by 4:00 | R Markdown and HTML | Written report giving a detailed project description (as a GitHub repo) |
December 8 by 4:00 | Webpage and screencast | Webpage overview of project, with short explanatory video (published online) |
December 8 by 4:00 | Peer assessment | Brief assessment of your teammates' contributions (as a short document) |
December 11 | In-class discussion of projects | Enjoy hearing about projects! |
First, you will define your teams and propose a project. This proposal should include:
There should be one proposal per group, written collaboratively using Google Docs. Conceptually, this is a 10% review.
Based on the topic of your proposal, you will work with a member of the teaching team; this person will be your primary resource and will guide you through the rest of the project. In particular, you will schedule a project review meeting with your teaching team leader to discuss the proposal, anticipated stumbling blocks, and preliminary work. All team members are required to be present for the meeting. Conceptually, the project review meeting is a 30% review.
The R Markdown and HTML files produced by your team are central to this project. These files will detail how you completed your project, and should cover data collection and cleaning, exploratory analyses, alternative strategies, descriptions of approaches, and a discussion of results. We anticipate that your project will change somewhat over time; these changes and the reasons for them should be documented! You should write one R Markdown document per group, and be sure to include all group member names in the document.
Your R Markdown should include the following topics. Depending on your project type, the amount of discussion you devote to each of them will vary:
As this will be your only chance to describe your project in detail, make sure that your R Markdown file and compiled HTML file are standalone documents that fully describe your process and results. We also expect you to write high-quality code that is understandable to an outside reader. Coding collaboratively and actively reviewing code within the team will help with this!
Your R Markdown document and HTML file should be included in a GitHub repository, along with the data used for the analysis. If the data are too big to fit in the repository, make the data accessible somewhere online (e.g. Google Drive or a downloadable link). At the top of the R Markdown file, include instructions on where to access the data. If we cannot access your work or links because these directions are not followed correctly, we will not grade your work.
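One way to make externally hosted data easy to reach is to pair the written access instructions with a small setup chunk near the top of the R Markdown file. The sketch below is only an illustration: the helper name `fetch_data_if_missing`, the URL, and the file path are all hypothetical placeholders, and your team may prefer a different approach.

```r
# Hypothetical helper: download the data only when no local copy exists.
fetch_data_if_missing <- function(url, path) {
  if (!file.exists(path)) {
    # Create the target folder if needed, then download the file.
    dir.create(dirname(path), showWarnings = FALSE, recursive = TRUE)
    download.file(url, destfile = path, mode = "wb")
  }
  invisible(path)
}

# Placeholder URL and path; replace with the real location of your data.
data_url  <- "https://example.com/project_data.csv"
data_path <- file.path("data", "project_data.csv")

# Uncomment once data_url points to the real file:
# fetch_data_if_missing(data_url, data_path)
# project_data <- read.csv(data_path)
```

Because the download is skipped when the file already exists, the document can be re-knit without fetching the data again.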
You will create a webpage summarizing your project. This should not be as detailed as your submitted R Markdown and HTML files, but should give an overview of the project scope, data, approaches, visualizations, and other results. You should include a link to the GitHub repository containing your R Markdown document.
You will also create a two-minute narrated screencast illustrating your project (screencasts are videos of your computer screen with spoken audio explaining what is shown on the screen – see the RStudio webinar page for some examples). You may use slides, demonstrations, or any other content that is relevant to your project. Publish your screencast on YouTube, Vimeo, or another online platform, and embed the screencast in your website. The two-minute limit will be strictly enforced.
For both the website and the screencast, your audience is classmates who worked on other projects. It will be helpful to put yourself in their shoes, and ask what information you think will be most interesting. We suggest you emphasize motivation, questions, and results over methods; after all, interested folks can view your complete project on GitHub.
It is important to provide positive feedback to people who worked hard for the good of the team and to also make suggestions to those you perceived not to be working as effectively on team tasks. We ask you to provide an honest assessment of the contributions of the members of your team, including yourself. The feedback you provide should reflect your judgment of each team member:
The examples below are drawn from submissions in the Fall of 2017 to give an idea of the range of possible projects.