The submission of your final project is due on April 30th. [20 points]
How to submit
As usual, open a new GitHub issue for your repository and tag @jelmerp
in the text body of the Issue.
What to submit
Your repository should now contain:
A finished set of scripts.
Final documentation in one or more README files that clearly describes:
- What the project does as a whole.
- What each script does.
- Where to access the data at OSC, assuming that the data is not in your repository.
- How the project’s scripts can be rerun using a single script or Snakefile.
A single script or Snakefile that aims to rerun the full workflow.
A file (e.g. submission_notes.md
) or a section in your main README file that provides some additional information for the instructor to grade your project appropriately. Some hypothetical examples of things you may want to include:
Additional instructions the instructor will need to try and rerun your project.
You want to alert the instructor to some files files in the repository that should be ignored.
You want to explain why you don’t have a functioning script or Snakefile, or why you didn’t run any SLURM jobs (which can be acceptable in some cases).
Graded aspects
Below is a long list of graded aspects and what to aim for if you want a perfect score. I’m providing a lot of detail here, so there are no surprises. The TLDR is that you should aim to have a reproducible, well-organized and well-documented workflow – workflow size/complexity on the other hand, is fairly unimportant. (See also the General Info page for the final project for some more general background.)
Project organization |
2 |
- Has a clear and appropriate directory structure.
- Has informative and appropriate directory and file names.
- Does not mix data, scripts, and results in individual directories.
|
Project background and documentation |
2 |
- Has a clear description of its background and goals.
- Has a clear description of how different scripts are being used to achieve these goals.
- Where appropriate, indicates what is still a work-in-progress (and optionally future directions).
|
Script documentation |
2 |
- Uses extensive (yet succinct) comments to document what is being done within scripts.
|
Good practices in scripts |
4 |
- Uses no absolute paths in scripts.
- Uses scripts that take arguments where appropriate and minimizes “hard-coding” of potentially variable things like input/output dirs, file names, and some software settings. Any hard-coded variables/constants that are present are clearly set at the top of scripts.
- Has individual scripts that are not overly long and don’t do multiple unrelated things.
- Has no or only clearly annotated lines that are “commented out” in scripts.
- Uses Bash scripts with proper
set settings, and similar good practices as taught in the course.
|
Coding quality and complexity |
3 |
- Has code that demonstrates an understanding of topics covered in the course.
- Has code that is appropriately broken up in small parts within scripts, e.g. with functions in Python.
- Uses tools and commands that are (by and large) appropriate to accomplish its goals. (I will not dig in to fine details and parameter settings.)
|
Workflow reproducibility |
3 |
- Has a script or Snakefile that includes all steps in the workflow and that can be run by anybody with access to your repository and the raw data files.
- Has information for the instructor (or any other reader of the project!) about where at OSC to find the raw data files and other details needed to try to rerun the analyses.
- Bonus: good software management, e.g. Conda environments (preferred) or OSC modules and no manual installs unless necessary; YAML files describing environments.
|
SLURM jobs at OSC |
2 |
- Has one or more scripts that are run as OSC jobs at SLURM.
- Uses appropriate SLURM directives – either in the scripts or in a Snakemake profile YAML file.
|
Version control
ddddddddddddddddd |
2
dddddd |
- Has Git commit messages that are informative.
- Has reasonably appropriate commits, e.g. individual Git commits don’t consist of multiple completely unrelated edits.
- Has a single
.gitignore file that ignores files like large raw data files, and in most cases, results files.
- Bonus: has a Git tag for the submitted version.
dddddddddddddddddddddddddddddddddddddddddddddd dddddddd dddddddd |
Questions and advice
Don’t hesitate to contact Jelmer (or, where fitting, Zach) for questions about topics like:
- Specific expectations for the final project that are unclear to you.
- Whether you are on the right track in making some adjustments that I asked for after your progress report.
- Advice on how to code or organize aspects of your project.
We’re happy to answer questions by e-mail or in a Zoom meeting!
Late submissions
Late submission may be accommodated depending on circumstances, but you will need to contact Jelmer before 3 pm on April 30th and we can take it from there.
For late submissions with no advance notice, 4 points will be subtracted for each day the submission is late. (Things like forgetting to open an Issue won’t lead to subtracted points.)
Good luck!!