Open a new file and save it as week08_exercises.py, or something along those lines.
Type your commands in the script and send them to the prompt in the Python Interactive window by pressing Shift+Enter.
If this doesn’t work, check the keyboard shortcut by right-clicking in the script and looking for “Run Selection/Line In Python Interactive Window”.
You can also open the Command Palette (Ctrl+Shift+P), look up the shortcut there, and change it if you want.
Determine the type of the number 4.88.
We can use the function type():
type(4.88)
<class 'float'>
num = 4.88
type(num)
<class 'float'>
Assign n_samples = 658, and then extract the third character from n_samples.
You can’t index a number directly, as in n_samples[index], so you’ll first have to convert n_samples to a string. Also, recall that Python starts counting from 0!
n_samples = 658
str(n_samples)[2]
'8'
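As a minimal sketch of why the conversion is needed, the snippet below tries to index the integer directly, catches the resulting error, and then indexes the string version instead. The variable names error_message, third_char and third_digit are illustrative and not part of the exercise.

```python
n_samples = 658

# Indexing an integer directly raises a TypeError.
try:
    n_samples[2]
except TypeError as err:
    error_message = str(err)

# Converting to a string first makes each digit addressable by position.
third_char = str(n_samples)[2]   # still a string
third_digit = int(third_char)    # converted back to an integer, if needed

print(error_message)
print(repr(third_char), third_digit)
```

Note that the extracted character is a string; convert it back with int() only if you need to do arithmetic with it.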
Assign the string 'CTTATGGAAT' to a variable named adapter. Print the number of characters in adapter.
adapter = 'CTTATGGAAT'
len(adapter)
10
Replace all As by Ns in adapter and assign the resulting string to a new variable. Print the new variable.
Use the string method replace(), and recall that methods are called using the <object_name>.<method_name>() syntax.
bad_seq = adapter.replace('A', 'N')
bad_seq
'CTTNTGGNNT'
Find out what the replace() method does by using the built-in help.
If you are typing your commands in a script rather than directly in the console, some information already appears when you type the opening parenthesis of the method (pause briefly if necessary).
To get more help, you can use the ? notation (available in IPython-based consoles) or help(object.method).
help(adapter.replace)
# Or: "adapter.replace?"
# Or: "?adapter.replace"
Help on built-in function replace:
replace(old, new, count=-1, /) method of builtins.str instance
Return a copy with all occurrences of substring old replaced by new.
count
Maximum number of occurrences to replace.
-1 (the default value) means replace all occurrences.
If the optional argument count is given, only the first count occurrences are
replaced.
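The count parameter described in the help text can be tried out directly. The following is a small sketch, with illustrative variable names, that compares the default behavior against two explicit counts:

```python
adapter = 'CTTATGGAAT'

# Default (count=-1): every occurrence is replaced.
all_replaced = adapter.replace('A', 'N')

# With a count, only the first `count` occurrences (from the left) are replaced.
first_two = adapter.replace('A', 'N', 2)
first_one = adapter.replace('A', 'N', 1)

# Strings are immutable: replace() returns a new string and leaves adapter as it was.
print(all_replaced, first_two, first_one, adapter)
```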
As it turns out, the third argument, count, determines how many instances of the substring will be replaced.
Replace only the first two As in adapter by Ns.
We specify 2 as the third argument, which is the maximum number of occurrences that will be replaced:
adapter.replace('A', 'N', 2)
'CTTNTGGNAT'
Convert each of the following to a Boolean using bool(): "False" (with quotes), 0, 1, -1, "", None, and see if you can make sense of these results.
bool("False")
True
bool(1)
True
bool(0)
False
bool(-1)
True
As it turns out, among these numbers and the non-empty string, only 0 is interpreted as False, whereas everything else is interpreted as True.
bool("")
False
bool()
False
bool(None)
False
But an empty string, nothing at all between the parentheses, and None (Python’s keyword for a null value, or the lack of a value) are also interpreted as False.
Note that as soon as you quote "None", it is a string again and will be interpreted as True:
bool("None")
True
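The pattern above can be summarized in one loop. This sketch also includes a few values that are not part of the exercise (0.0 and some empty and non-empty containers), added here as an assumption about what is useful to compare; they follow the same rule that "empty" or "zero" values are False.

```python
# Values from the exercise, plus a few extra ones (0.0, [], [0], {}, set())
# that are added here only for comparison.
values = ["False", "None", "", 0, 0.0, 1, -1, None, [], [0], {}, set()]

# Map a printable label for each value to its truth value.
truthiness = {repr(value): bool(value) for value in values}

for label, result in truthiness.items():
    print(f"bool({label}) -> {result}")
```

Any non-empty string is True regardless of what it spells, and a list containing a single 0 is True because the list itself is not empty.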
Type adapter. (note the .) to see the methods that are available for this string. Can you find a method that returns the position of the last occurrence of a T in adapter?
The method rfind() searches from the right-hand side (hence the r), and therefore returns the index of the last occurrence of the specified substring.
adapter.rfind("T")
9
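For comparison, here is a short sketch that places rfind() next to its left-to-right counterpart find(), and shows what both return when the substring is absent. The variable names are illustrative.

```python
adapter = 'CTTATGGAAT'

first_t = adapter.find('T')     # index of the first T, searching from the left
last_t = adapter.rfind('T')     # index of the last T, searching from the right
missing = adapter.rfind('X')    # -1 signals that the substring was not found

print(first_t, last_t, missing)
```

Both methods return an index counted from the left, so the result of rfind() can be used directly for indexing or slicing.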
Split the sequence GAGTCCCTNNNAGCAACGTTNNTTCGTCATTAN by Ns.
Use the split() method for strings.
seq = "GAGTCCCTNNNAGCAACGTTNNTTCGTCATTAN"
split_seq = seq.split('N')
split_seq
['GAGTCCCT', '', '', 'AGCAACGTT', '', 'TTCGTCATTA', '']
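The empty strings in this result appear wherever two Ns are adjacent, or where an N sits at the end of the sequence. If only the non-empty fragments are of interest, one possible approach (a sketch, not part of the original exercise) is to filter them with a list comprehension:

```python
seq = "GAGTCCCTNNNAGCAACGTTNNTTCGTCATTAN"
split_seq = seq.split('N')

# Empty strings are falsy, so `if piece` keeps only the non-empty fragments.
fragments = [piece for piece in split_seq if piece]

print(fragments)
```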
Create a list named diseases that contains the items fruit_rot, leaf_blight, leaf_spots, stem_blight, canker, wilt, root_knot and root_rot.
diseases = ['fruit_rot', 'leaf_blight', 'leaf_spots', 'stem_blight',
            'canker', 'wilt', 'root_knot', 'root_rot']
Extract stem_blight from diseases by its index (position).
stem_blight is the fourth item and, because Python starts counting at 0, this is index number 3.
diseases[3]
'stem_blight'
Extract the first five items from diseases.
Recall that when using ranges, Python does not include the item corresponding to the last index.
Index 5 is the sixth item, but it is not included, so we specify 0:5 or :5 to extract the items up to and including the fifth one:
diseases[0:5]
['fruit_rot', 'leaf_blight', 'leaf_spots', 'stem_blight', 'canker']
Or:
diseases[:5]
['fruit_rot', 'leaf_blight', 'leaf_spots', 'stem_blight', 'canker']
Extract the last item from diseases.
Recall that you can use negative numbers to start counting from the end. Also, while 0 is the first index, -0 is not the last index: -0 is simply 0, so the last item has index -1.
diseases[-1]
'root_rot'
Extract the last three items from diseases.
Note that you’ll have to omit the number after the colon in this case, because [-3:-1] would not include the last item, and [-3:0] does not work either (it returns an empty list).
diseases[-3:]
['wilt', 'root_knot', 'root_rot']
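The slicing cases discussed above, including the two that do not give the desired result, can be checked side by side. This is a small sketch with illustrative variable names:

```python
diseases = ['fruit_rot', 'leaf_blight', 'leaf_spots', 'stem_blight',
            'canker', 'wilt', 'root_knot', 'root_rot']

first_five = diseases[:5]       # indices 0 through 4; index 5 is excluded
last_item = diseases[-1]        # negative indices count from the end
last_three = diseases[-3:]      # omitting the stop runs to the end of the list

stops_short = diseases[-3:-1]   # the stop index is excluded, so root_rot is dropped
empty = diseases[-3:0]          # the stop lies before the start, so nothing is returned
first_again = diseases[-0]      # -0 equals 0, so this is the first item

print(first_five, last_item, last_three, stops_short, empty, first_again, sep="\n")
```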
Create a dictionary named yield_current with the following items:
{"plotA_1": 12, "plotA_2": 18, "plotA_3": 2, "plotB_1": 33, "plotB_2": 28, "plotB_3": 57}
yield_current = {"plotA_1": 12, "plotA_2": 18, "plotA_3": 2,
                 "plotB_1": 33, "plotB_2": 28, "plotB_3": 57}
yield_current
{'plotA_1': 12, 'plotA_2': 18, 'plotA_3': 2, 'plotB_1': 33, 'plotB_2': 28, 'plotB_3': 57}
Print the value for plotA_3.
We can get the value for a specific key using the <dict>[<key>] notation:
yield_current["plotA_3"]
2
Change the value for plotB_2 to be 31 and check whether this worked.
We can simply assign a new value using =:
yield_current["plotB_2"] = 31
yield_current["plotB_2"]
31
Use the len() function.
len(yield_current)
6
Bonus: Create a dictionary obs_20210305 with keys plotA_3 and plotC_1, and values 18 and 3, respectively.
Then, update the yield_current dictionary with the obs_20210305 dictionary, and check whether this worked.
obs_20210305 = {"plotA_3": 18, "plotC_1": 3}
We use the update() method as follows:
yield_current.update(obs_20210305)
yield_current
{'plotA_1': 12, 'plotA_2': 18, 'plotA_3': 18, 'plotB_1': 33, 'plotB_2': 31, 'plotB_3': 57, 'plotC_1': 3}
Now, our dictionary has an updated value for key “plotA_3”, and an entirely new item with key “plotC_1”.
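To make explicit which keys update() overwrites and which it adds, the sketch below compares the key sets before updating. It also builds a merged copy with dictionary unpacking, which is an alternative not used in the exercise, shown here only to contrast with the in-place behavior of update(). The names merged_copy, overwritten, added and result are illustrative.

```python
yield_current = {"plotA_1": 12, "plotA_2": 18, "plotA_3": 2,
                 "plotB_1": 33, "plotB_2": 31, "plotB_3": 57}
obs_20210305 = {"plotA_3": 18, "plotC_1": 3}

# Build a merged copy first; this leaves yield_current untouched.
merged_copy = {**yield_current, **obs_20210305}

# Keys present in both dictionaries get overwritten; the others are added.
overwritten = sorted(yield_current.keys() & obs_20210305.keys())
added = sorted(obs_20210305.keys() - yield_current.keys())

# update() changes the dictionary in place and returns None.
result = yield_current.update(obs_20210305)

print(overwritten, added)
print(result, yield_current == merged_copy)
```

Because update() returns None, assigning its result back to yield_current would discard the dictionary.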
Extract the values with the values() method. Next, turn these values into a set to get the unique values. Finally, count the unique values with the len() function.
len(set(yield_current.values()))
6
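The dictionary now holds seven items but only six unique values. As a sketch of how to see which value is repeated, the snippet below uses collections.Counter from the standard library; this goes slightly beyond the exercise and the variable names are illustrative.

```python
from collections import Counter

yield_current = {"plotA_1": 12, "plotA_2": 18, "plotA_3": 18,
                 "plotB_1": 33, "plotB_2": 31, "plotB_3": 57, "plotC_1": 3}

values = list(yield_current.values())
unique_values = set(values)   # a set keeps each distinct value only once

# Count how often each value occurs and keep those seen more than once.
repeated = [value for value, n in Counter(values).items() if n > 1]

print(len(values), len(unique_values), repeated)
```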
Create a set named dna with 4 items: each of the 4 bases (single-letter abbreviations) in DNA.
Recall the use of curly braces to create a set.
The order of the bases doesn’t matter, because sets are unordered.
dna = {'A', 'G', 'C', 'T'}
Create a set named rna with 4 items: each of the 4 bases (single-letter abbreviations) in RNA.
rna = {'A', 'G', 'C', 'U'}
dna & rna
{'A', 'C', 'G'}
Or:
dna.intersection(rna)
{'A', 'C', 'G'}
dna | rna
{'G', 'C', 'T', 'A', 'U'}
Or:
dna.union(rna)
{'G', 'C', 'T', 'A', 'U'}
dna - rna
{'T'}
Or:
dna.difference(rna)
{'T'}
Create a set named purines with the two purine bases and a set named pyrimidines with the three pyrimidine bases.
purines = {'A', 'G'}
pyrimidines = {'C', 'T', 'U'}
You can combine more than two sets either by chaining methods or by adding another operator.
pyrimidines & dna & rna
{'C'}
Or:
pyrimidines.intersection(dna).intersection(rna)
{'C'}
(rna - dna) & pyrimidines
{'U'}
Or:
rna - dna & pyrimidines
{'U'}
Or:
rna.difference(dna).intersection(pyrimidines)
{'U'}
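The version without parentheses gives the same answer because, in Python, the - operator binds more tightly than &. The following sketch, with illustrative variable names, shows that the grouping still matters if you intend the other order:

```python
dna = {'A', 'G', 'C', 'T'}
rna = {'A', 'G', 'C', 'U'}
pyrimidines = {'C', 'T', 'U'}

# Without parentheses, the difference is evaluated before the intersection.
default_grouping = rna - dna & pyrimidines
explicit_left = (rna - dna) & pyrimidines

# Forcing the intersection first answers a different question:
# the RNA bases that are not DNA pyrimidines.
explicit_right = rna - (dna & pyrimidines)

print(sorted(default_grouping), sorted(explicit_left), sorted(explicit_right))
```

Adding the parentheses, or chaining the methods, keeps the intended order visible to the reader and avoids relying on operator precedence.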
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".