There is another instructional break this week, so we will only meet on Tuesday.
This week, you will learn about using regular expressions in Python. We already talked a fair bit about regular expressions in week 4, when we used them with grep and sed in the shell. Regular expressions are available in nearly all programming languages, but different languages use slightly different “dialects”. We’ll see that Python’s syntax, as exposed by the re module, is similar to but a bit easier to use than that in Bash (e.g., no need to turn on extended regex!), and that there is some additional functionality too.
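As a small preview of the "no extended regex" point, here is a minimal sketch. The example string and variable names are made up for illustration and do not come from the course materials. With grep, the `+` quantifier and the `|` alternation only act as metacharacters with the -E flag (or when backslash-escaped), whereas Python's re module treats them as metacharacters by default.

```python
import re

# Hypothetical example text, not from the course materials
line = "sample_01 ATGCGC sample_22 ATTTGC"

# "+" (one or more) works without any flag in Python
ids = re.findall(r"sample_[0-9]+", line)
print(ids)     # ['sample_01', 'sample_22']

# "|" (alternation) also works without any flag
motifs = re.findall(r"ATG|ATT", line)
print(motifs)  # ['ATG', 'ATT']
```

The rough shell equivalent of the first search would be something like grep -oE "sample_[0-9]+".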
Some of the things you will learn this week:
All common regular expression constructs and their syntax in Python: metacharacters, character classes (AKA sets), quantifiers, anchors, and alternations.
Why we need to define regular expressions as “raw strings”, and how to do so.
Different functions in Python’s re module that can be used to match and replace text.
How you can use and refer back to “groupings” in regular expressions for fine-grained matching and extraction of sub-matches within larger matches.
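To give a rough idea of what the last three topics look like in practice, here is a minimal sketch. The example text and variable names are hypothetical. The choice of re.search, re.findall, and re.sub is also an assumption about which functions are most relevant, so the chapter may emphasize others.

```python
import re

# Raw strings: in a normal string "\b" is a single backspace character,
# while r"\b" keeps the backslash, so re sees a word-boundary anchor.
print(len("\b"), len(r"\b"))  # 1 2

# Hypothetical example text, not from the course materials
text = "gene=BRCA1; gene=TP53"

# re.search returns a match object for the first match (or None)
m = re.search(r"gene=(\w+)", text)
whole = m.group(0)   # the entire match: 'gene=BRCA1'
first = m.group(1)   # only the first group: 'BRCA1'

# re.findall returns just the group contents when the pattern has one group
names = re.findall(r"gene=(\w+)", text)
print(names)         # ['BRCA1', 'TP53']

# re.sub replaces matches; \1 and \2 refer back to the groups
swapped = re.sub(r"(\w+)=(\w+)", r"\2:\1", text)
print(swapped)       # BRCA1:gene; TP53:gene
```

Note that the replacement string in re.sub is also written as a raw string, so that \1 and \2 reach the re module as backreferences.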
This week’s chapter, CSB’s “Regular Expressions”, is relatively short and straightforward. Yet learning how to use regular expressions can be extremely useful!
Dive Into Python is a nice and free online book on Python, and its chapter on regular expressions is particularly good.
Regex101 is a useful website to translate regular expressions into plain English and to test your code on example text.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".