General Workshop Info & Signing Up

Overview

This workshop is geared towards people who would like to get started with analyzing genomic datasets.

It will be taught in person, with a video link between the Wooster and Columbus campuses of Ohio State. It is also possible to join online through Zoom.

We will have an instructor at each campus: Jelmer Poelstra from the Molecular and Cellular Imaging Center (MCIC) at the Wooster campus, and Mike Sovic from the Center for Applied Plant Sciences (CAPS) at the Columbus campus. We will also have two graduate student TAs: Menuka Bhandari (Center for Food Animal Health) and Camila Perdoncini Carvalho (Plant Pathology).

The workshop will be highly hands-on and take place across three afternoons:
Wed, Aug 17 - Fri, Aug 19, 2022.

  • Anyone affiliated with The Ohio State University or Wooster USDA can attend
  • Attendance is free
  • No prior experience with coding or genomic data is required
  • You will need to bring a laptop, but you don’t need to install anything before or during the workshop
  • We will work with example genomics data, but you are also welcome to bring your own data if you have any

See below for information about the contents of the workshop and to sign up.
For questions, please email Jelmer.

Contents of the workshop

The focus of the workshop is on building general skills for analyzing genomics data, such as RNAseq, metabarcoding, metagenomic shotgun sequencing, or whole-genome sequencing. These skills boil down to the ability to write small shell scripts that run command-line programs and submit these scripts to a compute cluster – in our case, at the Ohio Supercomputer Center (OSC).
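As a rough illustration of what such a script can look like, here is a minimal sketch of a shell script that could be submitted to the SLURM scheduler. It is not taken from the workshop materials: the account name `PAS0000`, the resource requests, and the `count_reads` function are hypothetical placeholders, and a real analysis script would typically call a dedicated genomics program rather than `wc`.

```bash
#!/bin/bash
#SBATCH --account=PAS0000      # Hypothetical project/account name
#SBATCH --time=00:10:00        # Requested wall time (10 minutes)
#SBATCH --cpus-per-task=1      # Number of cores requested
#SBATCH --output=slurm-%j.out  # Log file; %j is replaced by the job ID

# Count the reads in an uncompressed FASTQ file.
# Assumes the standard 4-lines-per-read FASTQ layout.
count_reads() {
    local fastq="$1"
    local n_lines
    n_lines=$(wc -l < "$fastq")
    echo $(( n_lines / 4 ))
}

# Only run the analysis when a file was passed as an argument
if [ "$#" -ge 1 ]; then
    echo "Number of reads in $1: $(count_reads "$1")"
fi
```

Assuming the script is saved as `count_reads.sh` (a hypothetical name), it could be run directly with `bash count_reads.sh sample.fastq`, or submitted to the cluster with `sbatch count_reads.sh sample.fastq`. When run directly, the `#SBATCH` lines are treated as ordinary comments.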

Topics

  • Introduction to the Ohio Supercomputer Center (OSC)
  • Using the VS Code text editor at OSC
  • Introduction to the Unix shell (= the terminal / command line)
  • Basics of shell scripts
  • Software at OSC with modules & Conda
  • Submitting your scripts using the SLURM scheduler
  • Putting it all together: practical examples of running analysis jobs at OSC

The modules will mix exercises with lectures that include “participatory live-coding”, in which the instructor slowly demonstrates and participants are expected to follow along on their own computers.

Some more background

Command-line programs are preferred for many of the steps in analyzing genomic sequencing data, such as quality control, trimming or adapter removal, and assembly or mapping. Such datasets also tend to contain a lot of data, and many analysis steps can be done independently for each sample. It therefore pays off, or may be necessary, to run your analyses not on a laptop or desktop but at a supercomputer center like OSC.
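Because each sample can be processed independently, a common pattern is to submit one job per sample using a loop. The sketch below illustrates the idea as a dry run: it only prints the `sbatch` commands rather than submitting anything. The function name and the example directory and script names are hypothetical.

```bash
#!/bin/bash
# Print one sbatch command per FASTQ file in a directory (dry run).
# To actually submit the jobs, remove the 'echo' before 'sbatch'.
print_submit_commands() {
    local fastq_dir="$1"   # Directory containing the FASTQ files
    local script="$2"      # Script to submit once for each file
    local fastq
    for fastq in "$fastq_dir"/*.fastq.gz; do
        # Skip the unexpanded pattern if no files match
        [ -e "$fastq" ] || continue
        echo sbatch "$script" "$fastq"
    done
}

# Only run when a directory and a script were passed as arguments
if [ "$#" -eq 2 ]; then
    print_submit_commands "$1" "$2"
fi
```

For example, calling `print_submit_commands data/fastq scripts/qc.sh` (both names hypothetical) would print one `sbatch` line for every `.fastq.gz` file in `data/fastq`. Since each job is independent, the scheduler can run many of them at the same time.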

Running your analysis with command-line programs at OSC involves a number of skills that may seem overwhelming at first. Fortunately, learning the basics of these skills does not take a lot of time, and will enable you to get up and running with your own genomic data! Keep in mind that these days, excellent programs are available for almost any genomics analysis, so you do not need to be able to code it all up from scratch. You just need to know how to run such programs efficiently, which is what this workshop aims to teach you.

Sign up!

To sign up for the workshop, please fill out the form below. There is no real selection procedure: we accept anyone affiliated with OSU or Wooster USDA who signs up before the maximum number of participants is reached.