class:inverse middle center

# *Week 8 - First steps with Python*

----

# I: Python basics

<br> <br> <br> <br> <br>

### Jelmer Poelstra
### 2021/03/02 (updated: 2021-03-04)

---

## Overview of this week

Today, we will set up VS Code for Python and work our way through CSB chapter 3.1-3.4 to get started with Python.

Thursday will focus on section 3.4: Python data structures.

<br>

.content-box-info[
The GitHub website has a [list of useful Python resources](https://mcic-osu.github.io/pracs-sp21/w08_python-resources.html).
]

---
class: inverse middle center

# Overview of this presentation

----

.left[
- ### [What is Python and why Python](#python)
- ### [Python setup in VS Code](#setup)
- ### [CSB 3.3: Getting started with Python](#getting-started)
]

<br> <br> <br>

---
name: python

## What is Python?

- Python is an all-purpose programming language that was created by Guido van Rossum, first released in 1991, and named after Monty Python.

<br>

- It is an *interpreted* (or *scripting*) language, which means that its code does not need to be *compiled* prior to execution.
  Instead, we can run scripts as-is, like we did with Bash scripts.
  Moreover, we can open up an interactive prompt and run commands one-by-one, like we also did in the shell.

---

## What is Python?

- ["The Zen of Python"](https://www.python.org/dev/peps/pep-0020/) describes guidelines for Python code with aphorisms that include:

  - Beautiful is better than ugly.
  - Explicit is better than implicit.
  - Simple is better than complex.
  - Complex is better than complicated.

--

  - Readability counts.
  - If the implementation is hard to explain, it's a bad idea.
  - There should be one –and preferably only one– obvious way to do it.

<br>

- The last point relates to the notion of "Pythonic" code that follows these and other [style guidelines](https://pep8.org/).
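---

## What is Python? (cont.)

The aphorisms above ship with Python itself. As a minimal illustration (nothing here depends on this course's setup), importing the standard-library module `this` in any Python session prints the full text:

```py
# Importing the standard-library module `this` prints "The Zen of Python"
import this
```

The text is printed only the first time the module is imported in a given session.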
---

## Python versus Perl custom function <br> to calculate the average of a series of numbers

- Python:

  ```python
  def avg(data):
      if len(data) == 0:
          return 0
      else:
          return sum(data) / len(data)

  print(avg([1,2,3,4,5]))
  ```

- Perl:

  ```perl
  sub avg(@_) {
    $sum += $_ foreach @_;
    return $sum / @_ unless @_ == 0;
    return 0;
  }
  print avg((1..5))."\n";
  ```

---

## Why Python?

- Good language for new programmers due to its straightforward, highly readable style.

- Highly flexible, with e.g. good support for *object-oriented* and *functional* programming styles.

- Lots of built-in features that would require third-party libraries in other languages.

- Excellent availability of third-party libraries.

- Cross-platform, open source, and very popular.

---

## Popularity of Python

<figure>
<p align="center">
<img src=img/python-popularity_stackoverflow.png width="75%">
<figcaption><a href="https://stackoverflow.blog/2017/09/06/incredible-growth-python/">From Stack Overflow: "The Incredible Growth of Python" (2017)</a>
</figcaption>
</p>
</figure>

---

<figure>
<p align="center">
<img src=img/python-popularity_statista.png width="55%">
<figcaption><a href="https://www.statista.com/chart/21017/most-popular-programming-languages/">From statista.com (2020)</a>
</figcaption>
</p>
</figure>

---

## Usage of Python

Some of the most common broad usage categories for Python are:

- Data science – especially machine learning
- Engineering, astronomy, biology & bioinformatics
- Web development
- Electronics
- Software like the Dropbox client, games, movies, ...

<br>

--

For instance, it is:

- One of the 4 official languages of Google (along with Java, C++, and Go)

- Tightly integrated with Linux, e.g. Ubuntu Linux prefers community contributions to be in Python.

---

## Where to write and run Python code

**We will be working with Python in VS Code**, because:

- VS Code has very good Python support.

- This way, you don't have to learn a new environment along with a new language.
---

## Where to write and run Python code

- An even more common way to run Python interactively is with *Jupyter Notebooks* / *JupyterLab*.

  - Jupyter Notebooks combine code with prose in Markdown format, and can be run in OSC OnDemand as Interactive Apps.

  - You can also run full-fledged Jupyter Notebooks in VS Code.

.content-box-info[
The "Python Interactive" window we will use in VS Code in fact also uses the Jupyter infrastructure, so you will see annotations like "*Jupyter Server: local*".
]

--

- Finally, there are also IDEs (Integrated Development Environments) for Python that are similar to RStudio – the most popular are *PyCharm* and *Spyder*.

--

.content-box-diy[
I recommend that you start by following along in VS Code, but if you're not a fan, feel free to explore other environments – even within the course, you should be able to switch to another one.
]

---
name: setup

## Python setup in VS Code

- Open a terminal inside VS Code and start an interactive session:

  ```sh
  sinteractive -A PAS1855 -t 60
  ```

- Load the OSC Conda module, create a new Conda environment with Python 3.9, activate it, and install `notebook` and `pylint` into it:

  ```sh
  module load python/3.6-conda5.2
  conda create -y -n ipy-env -c conda-forge python=3.9
  source activate ipy-env
  conda install -y -c conda-forge notebook pylint
  ```

The Conda installation may take a while; in the meantime, you can do the next step:

- Click on the `Extensions` icon in the VS Code Activity Bar (the far left, narrow side bar).
  In the Extensions side bar's search box, type "Python".
  Then, click on the `Install` button for what should be the top hit, which is simply called `Python` (by `ms-python`, 14.9 M downloads).

---

## Python setup in VS Code (cont.)

- In the command palette (<kbd>Ctrl</kbd>+<kbd>Shift</kbd>+<kbd>P</kbd>), start typing "interpreter" and then select `Python: Select Interpreter to Start Jupyter Server`.
  Find the Conda environment you just created (`Python 3.9.2 64-bit (ipy-env: conda)`) and click on it.

- In the narrow bar all the way at the bottom, there should now be a Python environment specification.
  If it is not the one you selected above, but e.g. `Python 3.6.8 64-bit`, click on that, and *again select the Python 3.9.2 Conda environment*.
  Now, the bottom bar should display the Python 3.9.2 environment.

- Open the command palette again (<kbd>Ctrl</kbd>+<kbd>Shift</kbd>+<kbd>P</kbd>), type "interactive", and select `Python: Show Python Interactive Window`.

If it worked, you should have a "Python Interactive" window including a pale blue panel at the bottom, which says *"Type code here and press shift-enter to run"*.

---
class: center middle inverse
name: getting-started

# 3.3. Getting started with Python

-----

<br> <br> <br> <br>

---

## Simple calculations

- We can use Python as an oversized calculator:

  ```py
  3 + 3     # Addition
  #> 6
  3 * 3     # Multiplication
  #> 9
  3 / 2     # Division
  #> 1.5
  ```

--

- The usual precedence rules apply, but it's better to be explicit with parentheses:

  ```py
  2 * 3 ** 3      # Read as 2*(3**3)
  #> 54
  2 * (3 ** 3)
  #> 54
  (2 * 3) ** 3
  #> 216
  ```

---

## Simple calculations (cont.)

- **Integer division** (AKA floor division) is division in which the fractional part is discarded, i.e. the result is rounded down:

  ```py
  5 // 2
  #> 2
  ```

<br>

- **The modulo operator** returns the remainder of an integer division:

  ```py
  5 % 2
  #> 1
  ```

  _(As 5 = (2*2) + 1, 1 is the remainder of the integer division 5//2.)_

--

<br>

.content-box-info[
Like in Bash and other languages, anything after a pound sign `#` is ignored by Python and generally functions as a "comment".
]

---

## Logical operators: `True` and `False`

- Logical (comparison) operators return a Boolean value, which in Python is either `True` or `False` (with that exact case):

  ```py
  2 > 3     # Greater than
  #> False
  2 == 2    # Equals
  #> True
  2 != 2    # Does not equal
  #> False
  ```

---

## Quoting defines strings

- Quotes define strings, and you can use single, double, triple-single, or triple-double quotes – but always make sure to use the same closing quote as the opening quote.

  ```py
  'my string'
  "my string"
  '''my string'''
  """my string"""
  #> 'my string'
  ```

--

- Having both single and double quotes available helps when your string itself contains one of them:

  ```py
  "The tree's height"
  #> "The tree's height"
  ```

- And triple quotes are useful if your string contains both single and double quotes:

  ```py
  """The tree's height is 6'2"."""
  #> 'The tree\'s height is 6\'2".'
  ```

---

## Variable assignment

From working with Bash, we are familiar with the process of assigning variables (also known as "declaring variables") and then using them.

- This works the same in Python, but when recalling a variable, we don't need any special syntax (like `$` in Bash):

  ```py
  x = 5
  x
  #> 5
  ```

--

- Like in Bash, we can use variables to perform further calculations – they will simply be replaced by whatever value they hold:

  ```py
  x + 3
  #> 8
  y = 8
  x + y
  #> 13
  ```

.content-box-info[
Python is not sensitive to spaces around the `=` for assignment.
]

---

## Variable assignment (cont.)

- When we assign a simple variable like a number or a string to a *new* variable, **a new object is created**, and the two are *not* linked:

  ```py
  x
  #> 5
  y = x
  y
  #> 5
  y = y + 10
  y
  #> 15
  x
  #> 5
  ```

--

.content-box-info[
This behavior of "unlinked variables" may seem intuitive and expected – but in Python, this is only true for *immutable* objects!

We will see examples of *mutable* objects on Thursday, such as *lists*.
]

---

## Variable types

- Unlike in Bash, variables in Python have *types*, such as:
  - `int` – Integer (whole number)
  - `float` – Floating point number (number with a decimal point)
  - `str` – String
  - `bool` – Boolean (`True` or `False`)

--

- Python types variables *dynamically* (a concept closely related to "duck typing").
  This means you don't have to define in advance what type a variable should be.
  You can check *what Python has decided for you* using `type()`:

  ```py
  x = 2
  type(x)
  #> int
  x = "two"
  type(x)
  #> str
  ```

---

## Setup for Thursday

- **In the OnDemand form for VS Code, set the Working Directory to `/fs/ess/PAS1855/users/<your-username>`.**

- Open a new file and save it as `week08.py`.

- A Python environment should load: see the bottom bar.
  If it's not the Python in your Conda environment (`Python 3.9.2 64-bit (ipy-env: conda)`), click the bottom bar and select it.

- Type a dummy line like `5+5`.

- Right-click in the editor and select "Run Selection/Line in Python Interactive Window".
  *(The keyboard shortcut is probably <kbd>Shift</kbd>+<kbd>Enter</kbd>.)*

.content-box-info[
If this doesn't work, we can troubleshoot again after class – for now, open a terminal in VS Code and start a regular Python prompt with:

```sh
$ python3
```
]

---

<br>

.content-box-info[
### Alternative setup in VS Code

- In the command palette (<kbd>Ctrl</kbd>+<kbd>Shift</kbd>+<kbd>P</kbd>), start typing "interpreter" and then select **`Python: Select Interpreter to Start Jupyter Server`**.

- Find the Conda environment you created on Tuesday (should be: `Python 3.9.2 64-bit (ipy-env: conda)`) and click on it.

- If the Python version in the narrow bar at the bottom is not the one you selected, click on it, and *again select the Conda Python 3.9.2 environment*.

- Open the command palette again (<kbd>Ctrl</kbd>+<kbd>Shift</kbd>+<kbd>P</kbd>), type "interactive", and select **`Python: Show Python Interactive Window`**.
]

---

## Variable types (cont.)
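As a small sketch of dynamic typing (the variable name `measurement` is made up for this illustration), the same name can be bound to values of different types in turn, and `type()` or `isinstance()` reports the current type. Outside the interactive window, wrapping `type()` in `print()` shows the type as `<class '...'>`:

```py
measurement = 2                       # hypothetical variable name
print(type(measurement))              #> <class 'int'>

measurement = 2.0                     # the same name now holds a float
print(type(measurement))              #> <class 'float'>

measurement = "two"                   # ... and now a string
print(isinstance(measurement, str))   #> True
```

---

## Variable types (cont.)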
We can combine variables of the same type using operators like `+` and `*`.
We already saw this for numbers, but we can do the same for strings:

- Assign a string, and concatenate strings with `+`:

  ```py
  x = "The cell grew"
  x + " and is now larger"
  #> 'The cell grew and is now larger'
  ```

--

- This doesn't work with variables of different types:

  ```py
  y = 15
  x + y
  #> ---------------------------------------------
  #> TypeError
  #> Traceback (most recent call last)
  #> <ipython-input-8-b50c5120e24b> in <module>()
  #> ----> 1 x + y
  #> TypeError: Can't convert 'int' object to str implicitly
  ```

  _(The exact wording of the error message depends on the Python version.)_

---

## Variable types (cont.)

- But we can convert variable types on the fly; to a string with `str()`:

  ```py
  # x = "The cell grew"
  # y = 15
  x + " " + str(y) + " nm"
  #> 'The cell grew 15 nm'
  ```

--

- To an integer with `int()`:

  ```py
  z = "81"
  y + int(z)
  #> 96
  ```

- And conversion to a floating point number or a Boolean is done with `float()` and `bool()`, respectively.

.content-box-info[
Note that quotes always denote strings, so numbers in quotes, like `z` above, are interpreted as strings.
]

---

## Naming Python variables

- Use only alphanumeric characters and underscores in variable names.

- Numbers can't be used as the *first character* in variable names.

- To delimit words, preferably use underscores (`my_variable`) rather than "camel case" (`MyVariable`).
  (And note that Python is case-sensitive.)

- As we'll see in a minute, periods `.` have special meaning in Python and **cannot** be used in variable names.

---

## Python variables

.content-box-info[
To see which variables are in your environment, the VS Code interactive window has a nice feature at the top of the window: `Show variables active in Jupyter kernel`.
<figure>
<p align="center">
<img src=img/vscode-py-vars.png width="75%">
</p>
</figure>

<figure>
<p align="center">
<img src=img/vscode-py-vars2.png width="75%">
</p>
</figure>
]

---

## Built-in functions

- In the previous slides, we saw several examples of *functions*, like `str()`, which are called using parentheses:

  ```py
  # With no arguments specified:
  # function_name()

  # With arguments specified:
  # function_name(function_argument1, function_argument2)
  ```

--

- Another example is `len()`, which gets the length of (e.g.) a string:

  ```py
  s = "a long string"
  len(s)
  #> 13
  ```

--

- Or `round()`, to round a number, with an optional number of decimals to round to (the default is 0):

  ```py
  round(3.14159)
  #> 3
  round(3.14159, 3)
  #> 3.142
  ```

---

## Printing to screen

.content-box-info[
As we've seen, when we typed `x` or `x + y`, the outcome was printed to screen.
However, this only happens when we run Python code *interactively*.

To be more explicit, and to print in any context, we can use the function `print()`:

```py
print(x)
#> The cell grew
print(z)
#> 81
```
]

---

## Strings and methods

- **Python is "object-oriented"** – its variables are objects that contain:

  - The values/data assigned to them, _and_

  - Functions specifically associated with the object type. These are called **"methods"**.

- Methods are used with the syntax `<object_name>.<method_name>()` instead of `<method_name>(<object_name>)`.

--

<br>

- Let's call our first method – we'll try `find()`, which returns the position (counting from 0) of the first occurrence of a substring:

  ```py
  astring = "ATGCATG"
  astring.find("C")
  #> 3
  ```

---

## Getting help

- To see which methods are associated with any object, type its name followed by a period `.` and a list of available methods will appear.
  For instance, for our string `astring`, type:

  ```py
  astring.
  ```

<figure>
<p align="center">
<img src=img/show-methods.png width="75%">
</p>
</figure>

---

## Getting help (cont.)
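Where autocompletion is not available (for example, in a plain `python3` prompt), the built-in `dir()` function gives a similar listing. A minimal sketch, reusing the `astring` example; the name `methods` is made up for this illustration:

```py
astring = "ATGCATG"

# dir() returns a list with the names of all attributes and methods
methods = dir(astring)

print("find" in methods)    #> True
print("print" in methods)   #> False
```

---

## Getting help (cont.)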
- To get help about methods/objects, you can use:

  ```py
  # Help for a method:
  astring.find?    # or "?astring.find" or "help(astring.find)"

  # Help for an object, given its type:
  astring?         # or ?astring
  ```

<figure>
<p align="center">
<img src=img/method-doc.png width="85%">
</p>
</figure>

<br>

- On the internet, <https://docs.python.org> should be your go-to source besides Google.

---

## *Methods* versus (generic) *functions*

.content-box-info[
- Call the function `print()` on a string:

  ```py
  # astring = "ATGCATG"
  print(astring)
  #> ATGCATG
  ```

- Calling the method "print" returns an error message that tells us that strings have no method called `print`:

  ```py
  astring.print()
  #> ----------------------------------------------------------
  #> AttributeError: 'str' object has no attribute 'print'
  ```
]

---

## More methods for strings

- `.count()` to count occurrences of substrings:

  ```py
  # astring = "ATGCATG"
  astring.count("G")
  #> 2
  ```

<br>

- `.replace()` to replace substrings:

  ```py
  astring.replace("T", "U")
  #> 'AUGCAUG'
  ```

--

.content-box-q[
Has `astring` been modified by our `astring.replace()` call?
]

---

## More methods for strings

- `.strip()` to remove, by default, leading and trailing white space:

  ```py
  newstring = " Mus musculus "   # Note the spaces before and after
  newstring.strip()
  #> 'Mus musculus'
  ```

<br>

- `.split()` will split a string into components, using white space by default:

  ```py
  newstring.split()
  #> ['Mus', 'musculus']

  # Specify the character by which to split:
  newstring.split("u")
  #> [' M', 's m', 'sc', 'l', 's ']
  ```

<br>

---

## Other behavior of methods

- You can even use string methods without first assigning the string to a variable:

  ```py
  # Make uppercase with `.upper()`:
  "atgc".upper()
  #> 'ATGC'

  # Make lowercase with `.lower()`:
  "TGCA".lower()
  #> 'tgca'
  ```

--

<br>

- And you can **combine multiple methods** in a single call by chaining them together like so:

  ```py
  # astring = "ATGCATG"
  astring.replace("T", "U").count("U")
  #> 2
  ```

---

## Concatenating strings

- As we saw earlier, we can concatenate strings with the `+` sign:

  ```py
  genus = "Rattus"
  species = "norvegicus"
  genus + " " + species     # Separate with a space
  #> 'Rattus norvegicus'
  ```

--

- But repeatedly concatenating with `+` can be slow – the `.join()` method is often better:

  ```py
  # join takes a list (or other iterable) of strings as input:
  human = ["Homo", "sapiens"]
  " ".join(human)
  #> 'Homo sapiens'

  # You can specify any symbol as a delimiter:
  "->".join(["one", "leads", "2", "the", "other"])
  #> 'one->leads->2->the->other'
  ```

---

## More on methods

- Different object types have different methods:

  ```py
  my_int = 8
  my_int.      # See methods for integers

  my_float = 8.5
  my_float.    # See methods for floats

  my_string = "ATGC"
  my_string.   # See methods for strings
  ```

---

## Intermezzo 3.1

1. Initialize the string `s = "WHEN on board H.M.S. Beagle, as naturalist"`.

2. Apply a string method to count the number of occurrences of the character `b`.

3. Modify the command such that it counts both lowercase and uppercase `b`'s.

4. Replace `WHEN` with `When`.

---

## Intermezzo 3.1: Solutions

1. Initialize the string:

  ```py
  s = "WHEN on board H.M.S. Beagle, as naturalist"
  ```

2. Count the number of occurrences of the character `b`:

  ```py
  s.count("b")
  ```

3. Count both lowercase and uppercase `b`'s:

  ```py
  s.lower().count("b")
  ```

4. Replace `WHEN` with `When`:

  ```py
  s.replace("WHEN", "When")
  ```

---
class: center middle inverse

# Questions?

-----

<br> <br> <br> <br>
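---

## Intermezzo 3.1: Checking the solutions

As a quick check, the Intermezzo 3.1 solutions can be run as a script with `print()`, which also shows their output outside an interactive session. This is a sketch for self-study rather than part of the CSB text; the final line illustrates that string methods return a *new* string and leave the original unchanged:

```py
s = "WHEN on board H.M.S. Beagle, as naturalist"

print(s.count("b"))               #> 1
print(s.lower().count("b"))       #> 2
print(s.replace("WHEN", "When"))  #> When on board H.M.S. Beagle, as naturalist

# The original string is unchanged, because strings are immutable:
print(s)                          #> WHEN on board H.M.S. Beagle, as naturalist
```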