Graded Assignment 3: Software and shell scripts

Week 6 graded assignment – due Sunday Oct 5 at 11:59 pm

Author: Jelmer Poelstra

Published: September 26, 2025



Overview

This graded assignment is worth 10 points and is due on Sunday Oct 5th. You will practice with shell scripts, shell variables, containers, modules, and version control.

Directions and grading

Submission expectations

  • Deadline: Sunday Oct 5th at 11:59 pm.
  • Submission: Submit your assignment by tagging the instructor in an Issue in your GitHub repo, as described in the last step below.

Academic integrity

Gen-AI check

For this assignment, you are allowed to search the internet but not to use generative AI (so you cannot use, e.g., Google AI Mode output either).

If you use commands or code constructs we did not cover in class, you must provide the source (a webpage, book page, etc.). You may also argue that you already knew the command in question from previous self-study, but you should then be prepared to answer live questions about it over Zoom. If you don’t provide a source or otherwise explain your usage of such commands, your answer will be considered wrong and will not earn any points.


  • Use of generative AI tools (e.g., ChatGPT, Microsoft Copilot, Google Gemini) is not permitted.
  • Getting help on the assignment is not permitted.
  • Collaborating, or completing the assignment with others, is not permitted.
  • Copying or reusing previous work is not permitted.
  • Open-book research for the assignment is permitted.
  • APA citations and/or formatting are not required for this assignment.

Rubric

You can earn a total of 10 points across the 6 parts as follows:

Part Points
A 1.5
B 2
C 3
D 1
E 1.5
F 1

Detailed steps

General

In the Markdown file with your answers, copy the numbered instructions/questions into your document, and insert your answers in between. You only need to do this where you think it makes sense: e.g., it is not needed for questions 1 and 2.

The code parts of your answers should be entered in Markdown code blocks, and that code’s output should also be part of your answer. For example:

1. List the files in your current working dir.

   I used the following command:
     
   ```bash
   ls
   ```
     
   The output was:
     
   ```
   file1.txt   file2.txt   annot.gtf
   ```

However, you don’t need to record every Git command you used in your README.md. This would be tedious and, moreover, we can check much of your Git history in the repo itself.

Part A: Setting up

  1. Start a VS Code session at OSC in the folder /fs/ess/PAS2880/users/$USER. Create a new dir for this assignment, /fs/ess/PAS2880/users/$USER/GA3, and switch to that folder in VS Code using the “Open Folder” option. This should be your working dir for the entire assignment.

  2. Inside dir GA3, create a README.md file and open it in the VS Code editor. Use this file throughout the assignment to add your answers.

  3. Inside dir GA3, also create dirs data, scripts and results. This time, we’ll make your dir self-contained, as is typical for a research project, by copying some data into it:

    cp -v ../garrigos-data/fastq/ERR10802863* data/
    cp -v ../garrigos-data/meta/metadata.tsv data/
  4. Initialize a Git repository. Commit to the repo throughout the assignment as you see fit, but at least once for each “Part” of this assignment. Use and commit a .gitignore file as you see fit.
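
The directory layout from Part A can be sketched as follows. This is only an illustration under stated assumptions: a throwaway temp dir stands in for /fs/ess/PAS2880/users/$USER, the variable names `base_dir` and `project_dir` are made up here, and the data-copying and commit steps are left out.

```bash
#!/bin/bash
set -euo pipefail

# Assumption: a throwaway temp dir stands in for /fs/ess/PAS2880/users/$USER
base_dir=$(mktemp -d)
project_dir="$base_dir/GA3"

# Create the project dir with its three subdirs, plus an empty README.md
mkdir -p "$project_dir/data" "$project_dir/scripts" "$project_dir/results"
touch "$project_dir/README.md"

# Initialize a Git repo in the project dir (skipped if Git is not installed)
if command -v git > /dev/null 2>&1; then
    git -C "$project_dir" init --quiet
fi

# Show what was created, including hidden entries such as .git
ls -A "$project_dir"
```

In the assignment itself you would run the equivalent commands from inside your real GA3 dir rather than a temp dir.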

Part B: Running two scripts

Here, you’ll try to understand and run two tiny shell scripts.

  1. Save the script below as scripts/echo.sh. Look at the three commands below that run the script, and describe what output you expect from each and why. Then, test the commands and, if any output was not as expected, reconcile the differences.

    • The script:

      #!/bin/bash
      set -euo pipefail
      
      echo "$1"
      echo "$2"
      echo "$3"
      echo "Finished"
    • The commands:

      bash scripts/echo.sh Oct07 Oct08
      bash scripts/echo.sh Oct07 Oct08 Oct09 Oct10
      bash scripts/echo.sh Oct07 Oct08 "Oct09 Oct10"
  2. Save the script below as scripts/concat.sh. Run it to concatenate the two FASTQ files you copied to data/ earlier. (The input files can remain compressed, in which case the output file will be compressed as well.)

    #!/bin/bash
    set -euo pipefail
    
    cat "$1" "$2" > "$3"
    ls -lh "$1" "$2" "$3"
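
As background for the scripts in Part B, the sketch below illustrates how the shell splits a command line into positional arguments, and how `set -u` reacts when a script refers to an argument that was not passed. The `count_args` function and the names `demo_script` and `only_one_arg` are made up for illustration and are not part of the assignment scripts.

```bash
#!/bin/bash
set -euo pipefail

# Hypothetical helper: reports how many arguments it received
count_args() {
    echo "$# argument(s)"
}

count_args Oct07 Oct08 Oct09      # prints "3 argument(s)"
count_args Oct07 "Oct08 Oct09"    # prints "2 argument(s)": quotes make one argument

# With 'set -u', referring to an unset positional parameter is an error.
# The command runs in a child shell so its failure doesn't stop this script;
# 'demo_script' becomes $0 and 'only_one_arg' becomes $1, so $2 is unset.
if bash -c 'set -u; echo "$2"' demo_script only_one_arg 2> /dev/null; then
    echo "The second argument was set"
else
    echo "The second argument was unset, so the child shell exited with an error"
fi
```

Inside a script, `$#` holds the number of arguments in the same way, which can be handy when checking whether a script was called as intended.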

Part C: A shell script that prints a specific line

You’ll write a shell script that accepts arguments, and test whether it works.

  1. Write a shell script scripts/printline.sh that accepts two arguments, a file path and a line number, in order to print (not store in a file) the requested line from the specified file. Don’t make the script print anything other than the requested line from the file, and make sure to include the discussed best-practice header lines.

    Hint: Figure out a way to combine head and tail to print one specific line from a file. If you don’t manage to, your script can instead print all lines up until the specified line, so you can still move on to the next two steps.

  2. Test your script twice by making it print two different lines from the data/metadata.tsv file you copied above. In one of these tests, redirect the script’s output to an appropriately named file in results.

    (Add the code to run these tests and the one in the next question to your README.md.)

  3. Run a final test, but now:

    • First, store the file name (just metadata.tsv) in one variable and the line number you chose in another.
    • Then, use these two variables as the arguments that you give to the script.
    • Finally, redirect the script’s output to a file in results whose name includes both the file name and the line number. But instead of simply typing the literal output file name, use the two variables you created again, now to “build” the output file name programmatically. The output file name should not include any spaces.
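
Building a file name from variables, as asked in the last step, relies on how the shell decides where a variable name ends. The sketch below shows the general idea with made-up variables (`sample_id`, `read_nr`) and a made-up naming scheme, so it is not the expected answer for this part.

```bash
#!/bin/bash
set -euo pipefail

# Hypothetical variables, unrelated to the assignment's file and line number
sample_id=sampleA
read_nr=2

# Braces mark where each variable name ends. Without them, "$sample_id_R"
# would be read as one variable named 'sample_id_R', which is unset here
# and would trigger an error because of 'set -u'.
out_file="results/${sample_id}_R${read_nr}.txt"

echo "$out_file"    # prints "results/sampleA_R2.txt"
```

Quoting the variable wherever it is used, as in `echo "$out_file"`, keeps the name intact even if one of its parts were to contain a space.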

Part D: Containers

The program “Trim-Galore”, which you’ll use in the next few weeks of the course, trims and filters FASTQ files to remove adapters, poor quality bases, and short reads. Here, you’ll find and test-run containers with two different versions of this program.

  1. Go to https://seqera.io/containers and find a container image for the program (default, i.e. latest, version). Back in VS Code, test-run the container with the command trim_galore -v.

  2. It turns out that your collaborator has been using an older version of Trim-Galore to trim other FASTQ files in the same project, so you decide that it will be best if you use that same version as well. Find a container for Trim-Galore version 0.5.0 and test-run it with the same command as above. Are the versions printed in both cases as expected?

Part E: Modules and Pandoc

You’ll practice with OSC software modules and the program Pandoc, which can render Markdown files to HTML and PDF.

  1. See if Pandoc is available at OSC prior to loading anything, and if so, which version, by running pandoc -v. Then, search the internet to check if that Pandoc version is the most recent one.

  2. Check what other versions of Pandoc are available in OSC Lmod modules, and load the module with the most recent available Pandoc version.

  3. Use the Pandoc command below to create a PDF of your README.md, and check whether the PDF file is there.

    pandoc -o README.pdf README.md
  4. To view PDF files in VS Code, you’ll first need to install a VS Code extension. Install the extension “Papyrus PDF Preview”, and take a look at your PDF. Then, download the PDF file to your computer and also take a look at it there.

  5. Does everything in the rendered PDF look OK? Or did you make formatting errors, or did anything else not render as expected? If so, update your Markdown file, rerender the PDF, and check again.
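
A general way to check whether a program is available before relying on it is sketched below, using Pandoc as the example. The variable name `pandoc_status` is made up for illustration. The `module spider` line assumes a system with Lmod, such as OSC, and is skipped elsewhere.

```bash
#!/bin/bash
set -euo pipefail

# Check whether pandoc is on the PATH, and if so, keep the first line of
# its version output
if command -v pandoc > /dev/null 2>&1; then
    pandoc_status="found: $(pandoc -v | head -n 1)"
else
    pandoc_status="not found on PATH"
fi
echo "pandoc $pandoc_status"

# On systems with Lmod, list the module versions that can be loaded.
# '|| true' keeps a failed search from stopping the script.
if command -v module > /dev/null 2>&1; then
    module spider pandoc || true
fi
```

Running such a check both before and after `module load` makes it easy to confirm that loading the module changed which version is active.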

Part F: Publish your repo on GitHub

You’ll publish your Git repo on GitHub and “hand in” your assignment by creating an Issue.

  1. Create a repository on GitHub, connect it to your local repo, and push your local repo to GitHub.

  2. Create a new issue and tag GitHub user menukabh, asking Menuka to take a look at your assignment.

Solutions

Click here for the solutions to this assignment (added 2025-10-13)
