Final project: submission

The submission of your final project is due on April 30th. [20 points]

How to submit

As usual, open a new GitHub issue for your repository and tag @jelmerp in the text body of the Issue.

What to submit

Your repository should now contain:

Graded aspects

Below is a long list of graded aspects and what to aim for if you want a perfect score. I’m providing a lot of detail here, so there are no surprises. The TLDR is that you should aim to have a reproducible, well-organized and well-documented workflow – workflow size/complexity on the other hand, is fairly unimportant. (See also the General Info page for the final project for some more general background.)

Category Max.
score
Max. score if your project (examples given):
Project organization 2
  • Has a clear and appropriate directory structure.
  • Has informative and appropriate directory and file names.
  • Does not mix data, scripts, and results in individual directories.
Project background and documentation 2
  • Has a clear description of its background and goals.
  • Has a clear description of how different scripts are being used to achieve these goals.
  • Where appropriate, indicates what is still a work-in-progress (and optionally future directions).
Script documentation 2
  • Uses extensive (yet succinct) comments to document what is being done within scripts.
Good practices in scripts 4
  • Uses no absolute paths in scripts.
  • Uses scripts that take arguments where appropriate and minimizes “hard-coding” of potentially variable things like input/output dirs, file names, and some software settings. Any hard-coded variables/constants that are present are clearly set at the top of scripts.
  • Has individual scripts that are not overly long and don’t do multiple unrelated things.
  • Has no or only clearly annotated lines that are “commented out” in scripts.
  • Uses Bash scripts with proper set settings, and similar good practices as taught in the course.
Coding quality and complexity 3
  • Has code that demonstrates an understanding of topics covered in the course.
  • Has code that is appropriately broken up in small parts within scripts, e.g. with functions in Python.
  • Uses tools and commands that are (by and large) appropriate to accomplish its goals. (I will not dig in to fine details and parameter settings.)
Workflow
reproducibility
3
  • Has a script or Snakefile that includes all steps in the workflow and that can be run by anybody with access to your repository and the raw data files.
  • Has information for the instructor (or any other reader of the project!) about where at OSC to find the raw data files and other details needed to try to rerun the analyses.
  • Bonus: good software management, e.g. Conda environments (preferred) or OSC modules and no manual installs unless necessary; YAML files describing environments.
SLURM jobs at OSC 2
  • Has one or more scripts that are run as OSC jobs at SLURM.
  • Uses appropriate SLURM directives – either in the scripts or in a Snakemake profile YAML file.

Version control

ddddddddddddddddd

2

dddddd

  • Has Git commit messages that are informative.
  • Has reasonably appropriate commits, e.g. individual Git commits don’t consist of multiple completely unrelated edits.
  • Has a single .gitignore file that ignores files like large raw data files, and in most cases, results files.
  • Bonus: has a Git tag for the submitted version.
dddddddddddddddddddddddddddddddddddddddddddddd dddddddd dddddddd

Questions and advice

Don’t hesitate to contact Jelmer (or, where fitting, Zach) for questions about topics like:

We’re happy to answer questions by e-mail or in a Zoom meeting!

Late submissions

Late submission may be accommodated depending on circumstances, but you will need to contact Jelmer before 3 pm on April 30th and we can take it from there.

For late submissions with no advance notice, 4 points will be subtracted for each day the submission is late. (Things like forgetting to open an Issue won’t lead to subtracted points.)

Good luck!!

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".