Week 3: Working with files in the Unix shell

Author: Jelmer Poelstra

Published: September 5, 2025



Overview

This week, we will focus on working with files in the Unix shell. Because we’re going to start working with (DNA/RNA/protein) sequence files, you’ll also get introduced to various types of sequence files. Finally, you’ll learn about the specific files from the example dataset you’ll work with throughout the course.

Learning goals & lectures

Sequence file types and the example dataset (Tuesday)
  • What FASTA, FASTQ, and GFF/GTF files are for and what they look like
  • How base-call quality scores are coded in FASTQ files
  • What the experimental design of the study behind the course’s example dataset looks like
  • Which files are in our example dataset
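
As a small illustration of how quality scores are coded, the sketch below decodes one quality character in the shell. It assumes the common Phred+33 encoding, where the quality score is the character's ASCII code minus 33; the character `I` is an arbitrary example and is not taken from the course dataset.

```bash
# Assumption: Phred+33 encoding (quality score = ASCII code - 33)
qual_char="I"                              # one example character from a FASTQ quality line

# printf with a leading single quote prints the numeric code of the character
ascii_code=$(printf '%d' "'$qual_char")

# Subtract the offset of 33 to get the Phred quality score
phred=$(( ascii_code - 33 ))

echo "$qual_char -> Q$phred"               # prints: I -> Q40
```

A Phred score of 40 corresponds to an estimated base-call error probability of 1 in 10,000.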
Working with files in the Unix shell I: The shell as a file browser (Tuesday/Thursday)

Learn to use Unix shell commands to:

  • Create, copy, move, rename and delete directories and files
  • Select multiple files with a shell wildcard
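
The operations listed above can be sketched as follows. This is a minimal example that runs in a throwaway directory; all directory and file names (such as `sampleA_R1.fastq`) are hypothetical and are not the course's actual files.

```bash
# Work in a temporary directory so that no real files are touched
workdir=$(mktemp -d)
cd "$workdir"

# Create directories (-p also creates missing parent directories)
mkdir -p data/fastq results

# Create three empty files with hypothetical names
touch data/fastq/sampleA_R1.fastq data/fastq/sampleA_R2.fastq data/fastq/sampleB_R1.fastq

# Copy a file into another directory
cp data/fastq/sampleA_R1.fastq results/

# Rename the copy (moving and renaming both use mv)
mv results/sampleA_R1.fastq results/sampleA_copy.fastq

# Select multiple files with a wildcard: * matches any number of characters
ls data/fastq/*_R1.fastq

# Delete the copied file, then the now-empty directory
rm results/sampleA_copy.fastq
rmdir results
```

Note that `rm` deletes files permanently, so it is worth double-checking a wildcard pattern with `ls` before passing it to `rm`.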
Working with files in the Unix shell II: Viewing, summarizing and manipulating files (Thursday)

Learn to use Unix shell commands to:

  • View the contents of text files in various ways
  • Use redirection and the pipe to flexibly process the output of commands
  • Search within, manipulate, and extract information from text files
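
As a rough preview of these skills, the sketch below creates a tiny FASTA-like file, views it, and combines `grep`, the pipe, and redirection. The file contents and names are made up for illustration and do not come from the course dataset.

```bash
# Create a small example file in a temporary directory (made-up content)
workdir=$(mktemp -d)
fasta="$workdir/example.fasta"
printf '>seq1\nACGT\n>seq2\nGGCC\n>seq3\nTTAA\n' > "$fasta"   # > redirects output to a file

# View the first two lines of the file
head -n 2 "$fasta"

# Search: count the lines that start with ">", i.e. the number of sequences
n_seqs=$(grep -c '^>' "$fasta")
echo "Number of sequences: $n_seqs"        # prints: Number of sequences: 3

# Pipe: extract the header lines, strip the leading ">", sort the IDs,
# and redirect the result to a new file
grep '^>' "$fasta" | cut -c 2- | sort > "$workdir/ids.txt"
cat "$workdir/ids.txt"
```

The pattern `'^>'` is quoted so that the shell does not interpret `>` as a redirection, which would overwrite a file instead of searching for the character.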

Readings

  • Garrigós et al. (2025): Two avian Plasmodium species trigger different transcriptional responses on their vector Culex pipiens
    This is the paper associated with the dataset we will use throughout the course, starting this week.

Assignments & exercises

Further resources


References

Buffalo, Vince. 2015. Bioinformatics Data Skills: Reproducible and Robust Research with Open Source Tools. First edition. Beijing: O’Reilly.
Garrigós, Marta, Guillem Ylla, Josué Martínez-de la Puente, Jordi Figuerola, and María José Ruiz-López. 2025. “Two Avian Plasmodium Species Trigger Different Transcriptional Responses on Their Vector Culex pipiens.” Molecular Ecology 34 (15): e17240. https://doi.org/10.1111/mec.17240.