Final Project: general information

Author: Jelmer Poelstra
Published: October 11, 2025



Introduction

This page gives you general information about the Final Project that you will do for this course. Your Final Project is an independent project on a topic of your choice in which you should apply many of the things you have learned during this course. A number of aspects that focus on your process, documentation, and reproducibility are required to get a good grade (see Graded aspects below).

Across four checkpoints, your Final Project is worth 40 points (out of a total of 100 for the course). Click on the links below to read more about each checkpoint:

Checkpoint          Due      Points
Proposal            Nov 16   5
Progress report     Dec 1    5
Oral presentation   Dec 9    10
Final submission    Dec 12   20

Picking a topic

The topic and scope of the project are completely up to you – you don’t even necessarily need to work with omics data, for example. I recommend that you take advantage of this flexibility so your Final Project will be as useful as possible for your own research and/or personal development.

As such, it may be ideal to work with your own data or with a publicly available dataset that is similar to what you expect to work with (see footnote 1). It is also perfectly fine to use your Final Project to start working on a larger project, even if you won’t get far enough during the Final Project to draw biological conclusions. (If you’re not yet sure what kind of data you’ll work with in your research, you could pick something that interests you, and/or we could help you find something – see below.)

Alternatively, you may decide that it is most useful for you to do a Final Project that really focuses on practicing the process that you’ve learned (project structure, reproducibility, shell scripts that are run as Slurm batch jobs, etc.) and uses a small, trivial dataset that is not necessarily relevant to your research.

Contact the instructors to discuss Final Project ideas

We are happy to help you find a suitable topic and/or dataset. More generally, we highly recommend that you contact us to discuss your Final Project ideas (or lack thereof) before you settle on a project topic!

Graded aspects

Graded aspects of your Final Project revolve around appropriate usage of many of the tools and principles we covered during the course. To receive a high grade, your project should:

  • Be well-organized: contained in a single parent directory with a clear and sensible structure of subdirectories, descriptive file and directory names, no files floating around with unclear purpose or source, and so on.

  • Be well-documented, with at least one README in Markdown format in the root directory of your project.

  • Be version-controlled with Git throughout, with regular, meaningful commits. You will need to push to GitHub at least for the proposal, draft, and final submission checkpoints.

  • Contain shell scripts and R scripts/Quarto documents that do data processing and/or analysis. Beyond entering data, avoid doing manual work such as editing a data file in a text editor or Excel to fix column names, since this hinders reproducibility. Your shell scripts may of course run external programs like we’ve been practicing with.

  • Run one or more scripts as Slurm batch jobs at OSC (in general, you should do all Final Project work at OSC).

  • Be easily re-runnable using a “protocol” in one of your Markdown files. Note that after your final submission, we will actually try to rerun your project as part of the grading process!
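
As a rough illustration of several of the points above, and not a required layout, the commands below sketch a project skeleton with a root README, a rerun protocol, and a small shell script that carries a Slurm header. Every name here is a placeholder made up for this example: the project directory final_project_demo, the script count_lines.sh, the toy input file, and the account code PAS0000 (substitute your own OSC project code). The subdirectory names are just one possible structure.

```shell
# Create a project skeleton (all names below are hypothetical placeholders)
proj_dir=final_project_demo
mkdir -p "$proj_dir/data" "$proj_dir/scripts" "$proj_dir/results" "$proj_dir/logs"

# A README in the project root that documents the layout and the rerun protocol
cat > "$proj_dir/README.md" <<'EOF'
# Final Project demo

## Directory structure
- data/: input data
- scripts/: shell scripts
- results/: output files
- logs/: Slurm log files

## Protocol to rerun (from the project root)
    sbatch scripts/count_lines.sh data/toy.txt results/toy_linecount.txt
EOF

# A small shell script that can be submitted as a Slurm batch job
cat > "$proj_dir/scripts/count_lines.sh" <<'EOF'
#!/bin/bash
#SBATCH --account=PAS0000
#SBATCH --time=00:10:00
#SBATCH --cpus-per-task=1
#SBATCH --output=logs/slurm-count_lines-%j.out
set -euo pipefail

# Input and output files are passed as arguments rather than hard-coded
infile=$1
outfile=$2

# Create the output directory if needed, then count the lines in the input
mkdir -p "$(dirname "$outfile")"
wc -l < "$infile" > "$outfile"
echo "Done: line count for $infile written to $outfile"
EOF

# A tiny toy input file to try the script on
printf 'line1\nline2\nline3\n' > "$proj_dir/data/toy.txt"
```

From inside the project directory, the script could first be tried directly with bash (same arguments as in the README protocol) and then submitted with sbatch. Passing file names as arguments keeps the script reusable, and writing Slurm logs to a dedicated logs directory keeps stray files out of the project root. Running git init in the project root would then put the skeleton under version control.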

Finally, you are allowed to use generative AI for your Final Project, as long as you are still able to explain what your code does. For example, if gen-AI comes up with commands or constructs that we have not seen in the course, you should research those to make sure you understand what is being done. You may be asked to orally explain your code at several points.

Academic integrity

  • Use of generative AI tools (e.g. ChatGPT, Microsoft Copilot, Google Gemini) is permitted

  • Getting help on the assignment is not permitted

  • Collaborating, or completing the assignment with others, is not permitted

  • Copying or reusing previous work is not permitted

  • Open-book research for the assignment is permitted

  • APA citations and/or formatting for this assignment are not required

Footnotes

  1. Note that “subsetting” the data may be useful or necessary if you have a large genomic dataset, and I can help you with that.