Exercises: Week 9


Setup

Problems with the keyboard shortcut?

If this doesn’t work, check your keyboard shortcut by right-clicking in the script and looking for “Run Selection/Line In Python Interactive Window”.

You can also open the Command Palette (Ctrl+Shift+P), search for that command there, and change its shortcut if you want.


Intro to CSB exercises

From the CSB Chapter 3 preface to the exercises:

Here are some practical tips on how to approach the Python exercises (or any programming task):


Exercise CSB-1: Measles time series

In their article, Dalziel et al. (2016) provide a long time series reporting the numbers of cases of measles before mass vaccination, for many US cities. The data consist of cases in a given US city for a given year, and a given “biweek” of the year (i.e., first two weeks, second two weeks, etc.). The time series is contained in the file Dalziel2016_data.csv.

  1. Write a program that extracts the names of all the cities in the database (one entry per city).
Hints
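
As a hedged illustration of one way to collect each city once, the sketch below reads a small made-up sample instead of the real file. The column name "loc" for the city and the helper name city_names are assumptions, not taken from the exercise text; check the header of Dalziel2016_data.csv before reusing it.

```python
import csv
import io

# Made-up sample rows in the assumed layout of Dalziel2016_data.csv.
# The column name "loc" for the city is an assumption.
SAMPLE = """biweek,cases,year,loc,pop
1,10,1920,BALTIMORE,700000
2,12,1920,BALTIMORE,701000
1,5,1920,BOSTON,750000
"""


def city_names(handle):
    """Return a sorted list of the distinct city names in an open CSV file."""
    reader = csv.DictReader(handle)
    # A set keeps only one entry per city, however many rows it has.
    return sorted({row["loc"] for row in reader})


print(city_names(io.StringIO(SAMPLE)))
```

With the real data, the same function could be called on the file object returned by open("Dalziel2016_data.csv").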

  2. Write a program that creates a dictionary where the keys are the cities and the values are the number of records (rows) for that city in the data.
Hints
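
A minimal sketch of the counting dictionary, again on made-up sample rows rather than the real file. The column name "loc" and the helper name count_records are assumptions for illustration only.

```python
import csv
import io

# Made-up sample rows; the column name "loc" for the city is an assumption.
SAMPLE = """biweek,cases,year,loc,pop
1,10,1920,BALTIMORE,700000
2,12,1920,BALTIMORE,701000
1,5,1920,BOSTON,750000
"""


def count_records(handle):
    """Return a dictionary mapping each city to its number of rows."""
    counts = {}
    for row in csv.DictReader(handle):
        city = row["loc"]
        # dict.get returns 0 the first time a city is seen.
        counts[city] = counts.get(city, 0) + 1
    return counts


print(count_records(io.StringIO(SAMPLE)))
```

Using counts.get(city, 0) avoids a separate "is this city already a key?" test; an explicit if statement would work just as well.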

  3. Write a program that calculates the mean population for each city, obtained by averaging the values of pop.
Hints

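import csv

citypop = {}  # city -> [sum of pop, number of records]
with open("Dalziel2016_data.csv") as f:  # open data file for reading
    reader = csv.DictReader(f)  # set up dictionary reader
    for row in reader:
        my_city = row["loc"]  # assumes the city column is named "loc"
        my_pop = float(row["pop"])
        # if this is the first time you see this city, initialize:
        if my_city not in citypop:
            citypop[my_city] = [0.0, 0]
        citypop[my_city][0] += my_pop  # running sum of pop
        citypop[my_city][1] += 1  # running count of records

for city in citypop:
    # divide the first element by the second to obtain the mean
    print(city, citypop[city][0] / citypop[city][1])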

  4. Write a program that calculates the mean population for each city and year.
Hints

Note that the worked-out solution in the link below uses the first strategy.
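
One possible approach, not necessarily the strategy used in the worked-out solution, is to key a single dictionary on a (city, year) tuple. The sketch below runs on made-up sample rows; the column names "loc" and "year" and the helper name mean_pop_by_city_year are assumptions.

```python
import csv
import io

# Made-up sample rows; the column names "loc" and "year" are assumptions.
SAMPLE = """biweek,cases,year,loc,pop
1,10,1920,BALTIMORE,700000
2,12,1920,BALTIMORE,702000
1,8,1921,BALTIMORE,710000
1,5,1920,BOSTON,750000
"""


def mean_pop_by_city_year(handle):
    """Return a dictionary mapping (city, year) to the mean of pop."""
    totals = {}  # (city, year) -> [sum of pop, number of records]
    for row in csv.DictReader(handle):
        key = (row["loc"], int(row["year"]))
        if key not in totals:
            totals[key] = [0.0, 0]
        totals[key][0] += float(row["pop"])
        totals[key][1] += 1
    # Divide each sum by its count to obtain the mean.
    return {key: total / n for key, (total, n) in totals.items()}


for (city, year), mean_pop in mean_pop_by_city_year(io.StringIO(SAMPLE)).items():
    print(city, year, mean_pop)
```

Tuples work as dictionary keys because they are immutable; a dictionary of dictionaries (city first, then year) would be an equally reasonable layout.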


Bonus: Exercise CSB-2: Red queen in fruit flies

Singh et al. (2015) show that, when infected with a parasite, the four genetic lines of D. melanogaster respond by increasing the production of recombinant offspring (arguably, trying to produce new recombinants able to escape the parasite). They show that the same outcome is not achieved by artificially wounding the flies. The data needed to replicate the main claim (figure 2 of the original article) is contained in the file Singh2015_data.csv.

Open the file and compute the mean RecombinantFraction for each combination of Drosophila genetic line and InfectionStatus (W for wounded and I for infected).

Print the results in the following form:

Line 45 Average Recombination Rate:
W : 0.187
I : 0.191
Hints

For each Drosophila genetic line, you need to keep track of all the recombination rates for W (wounded) and I (infected).

For example, you could build a dictionary of dictionaries in which the first (outer) dictionary has a key for each line, and the inner dictionary has a key for each status (W or I) and a list of recombination rates as each value.

Then, you would calculate averages for each list at the end.
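
The dictionary-of-dictionaries idea described above could be sketched as follows, using made-up sample values rather than the real file. The column name "Line" and the helper names collect_rates and mean_rates are assumptions; InfectionStatus and RecombinantFraction are the names given in the exercise.

```python
import csv
import io

# Made-up sample values; the column name "Line" is an assumption.
SAMPLE = """Line,InfectionStatus,RecombinantFraction
45,W,0.18
45,W,0.20
45,I,0.19
21,W,0.10
21,I,0.30
"""


def collect_rates(handle):
    """Return {line: {status: [recombination rates]}}."""
    rates = {}
    for row in csv.DictReader(handle):
        # setdefault creates the inner dictionary or list on first use.
        by_status = rates.setdefault(row["Line"], {})
        by_status.setdefault(row["InfectionStatus"], []).append(
            float(row["RecombinantFraction"])
        )
    return rates


def mean_rates(rates):
    """Return {line: {status: mean rate}} from the collected lists."""
    return {
        line: {status: sum(vals) / len(vals) for status, vals in by_status.items()}
        for line, by_status in rates.items()
    }


means = mean_rates(collect_rates(io.StringIO(SAMPLE)))
for line, by_status in means.items():
    print("Line", line, "Average Recombination Rate:")
    for status in ("W", "I"):
        print(status, ":", round(by_status[status], 3))
```

Keeping the full lists until the end, as the hint suggests, makes it easy to compute other summaries later; storing a running sum and count per status would use less memory.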


CSB Solutions

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".