This list of further resources is organized by the main topics covered in this course – see the Contents on the left-hand side.
The first two are available online at the OSU library:
The Linux Command Line: A Complete Introduction (William Shotts, 2019)
Linux Command Line and Shell Scripting Bible (Christine Bresnahan, 2015)
Command Line Kung Fu: Bash Scripting Tricks, Linux Shell Programming Tips, and Bash One-liners (Jason Cannon, 2014)
Nice collections of one-liners, mostly for bioinformatics
The Bash Guide by Maarten Billemont with separate very useful FAQ and pitfalls pages.
Bash Guide for Beginners by Machtel Garrels.
Ten simple rules for getting started with command-line bioinformatics (Brandies & Hogg 2021, PLoS Computational Biology)
Five reasons why researchers should learn to love the command line (Perkel 2021, Nature)
I wouldn’t necessarily recommend diving so deep into Git as to read a book about it, but this book provides an excellent reference, is quite accessible, and is freely available online:
Happy Git and GitHub for the useR
Somewhat R-centric, but a very accessible introduction to Git.
Git best practices by Seth Robertson
Atlassian Git tutorials
Atlassian is behind Bitbucket, an alternative to GitHub that also hosts Git repositories, and its Git tutorials are very useful.
A quick GitHub overview of some Git and GitHub functionality, including branching and Pull Requests, and how to use them in your browser at GitHub.
Git-it – a small application to learn and practice Git and GitHub basics.
Visualizing Git
These visualizations can help you build some intuition for Git. Note that at the prompt, you can only type Git commands, and since there are no actual files involved, you can't use git add; just commit straight away.
Some slides on undoing that we did not get to in our Git week.
How to undo (almost) anything with Git – by the GitHub blog
Oh Shit, Git!?! – by Katie Sylor-Miller
Git flight rules – by Kate Hudson.
Covers much more than undoing.
Excuse me, do you have a moment to talk about version control? (Bryan 2017, PeerJ)
Ten Simple Rules for Taking Advantage of Git and GitHub (Perez-Riverol et al. 2016, PLoS Computational Biology)
These are all available online, freely or via the OSU library:
Python for the life sciences: a gentle introduction to python for life scientists (Alexander Lancaster, 2019).
Very explicitly geared towards biologists with little or no programming experience, and takes a very practical and project-oriented approach. From what I’ve seen of the book, I can highly recommend it!
Python for Bioinformatics (Sebastian Bassi, 2018).
Starts with an introduction to Python and then has chapters that each describe practical problems/projects for Python. (Associated GitHub repository.)
Python programming for biology, bioinformatics, and beyond (Tim Stevens, 2015).
Starts with an introduction to Python and then has chapters on topics like “Pairwise sequence alignments”, “Sequence variation and evolution”, and “High-throughput sequences”.
Reproducible Bioinformatics with Python (Ken Youens-Clark, 2021).
A slightly more advanced book that does not start with an introduction to Python, but you should be able to follow the book with what you’ve learned in this course.
Python Data Science Handbook (Jake VanderPlas, 2016).
Focused on NumPy, Pandas, visualization with Matplotlib, and machine learning with scikit-learn. Freely available online.
Python for Data Analysis (Wes McKinney, 2017).
Focused on NumPy, Pandas, and visualization with Matplotlib.
Dive Into Python 3 (Mark Pilgrim, 2009).
Python for everybody – Includes course materials and lectures, and is also available at Coursera and edX.
Programming for Biology – This is the Cold Spring Harbor course that your TA Zach took, and the materials are available online.
Using Python for Research – A free edX course; you can also find links to just the videos towards the bottom of this page.
Videos from the MIT course “Introduction to Computer Science and Programming in Python”
You can also use Jupyter Notebooks / JupyterLab as an Interactive App at OSC OnDemand. If you’re interested in using this, I would recommend trying JupyterLab, which can run Jupyter Notebooks as well as regular Python scripts, a shell, and so forth.
This JupyterLab documentation provides a nice introduction to JupyterLab features (the link goes to documentation for a version close to the one at OSC).
For a general introduction to Jupyter Notebooks, see also How to Use Jupyter Notebook in 2020: A Beginner’s Tutorial.
Ten simple rules for writing and sharing computational analyses in Jupyter Notebooks – Rule et al. 2019, PLoS Computational Biology
The official Snakemake tutorial.
Can be run in your browser!
A Carpentries lesson on working on a compute cluster, with a large section on Snakemake starting on this page.
Workflow systems turn raw data into scientific knowledge (Perkel 2019, Nature “Toolbox” feature).
Sustainable data analysis with Snakemake (Mölder et al. 2020, Zenodo).
Practical Computing for Biologists (Haddock & Dunn, 2011)
Good enough practices in scientific computing (Wilson et al. 2017, PLoS Computational Biology)
Streamlining data-intensive biology with workflow systems (Reiter et al. 2021, GigaScience)
Reproducible Research: A Retrospective (Peng & Hicks 2021, Annu Rev Public Health)
Ten Simple Rules for Reproducible Computational Research (Sandve et al. 2013, PLoS Computational Biology)
The plain person’s guide to plain text social science (Kieran Healy)
A few courses on genomic data analysis with lots of great online material available:
Sites with online material for many computational biology workshops:
Rosalind – A website with bioinformatics exercises
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".