Python and regular expressions

Using regular expressions with Python is easy. There is a built-in library called re which provides all of the functionality we will require. There are a lot of functions in this library, so we will only be looking at a small subset:

  • re.match()
  • re.search()
  • re.findall()
  • re.sub()

If you are interested in investigating the full range of functionality available in the re library then the Python documentation is waiting for you.

Post code validation

In the previous section we created a regular expression for validating U.K. post codes. If you managed to develop the full single expression to validate all six types of post code it might look like this:

[A-Z]([A-Z](\d{1,2}|\d[A-Z])|\d{1,2})\s\d[A-Z]{2}

Let's see how we could turn this into a program which would tell us whether we had entered a valid post code or not.
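A minimal sketch of such a program is shown here, using re.match() with the pattern above (the \d and \s escapes written out in full). The function name is_valid_postcode and the sample post codes are assumptions made for illustration. Because re.match() only anchors the pattern at the start of the string, this sketch also appends a $ to the pattern so that trailing characters are rejected; that anchor is an addition, not part of the expression above.

```python
import re

# Pattern from the text, written as a raw string so the backslashes survive.
# The trailing $ is an addition for this sketch: it stops extra characters
# after the post code from being accepted.
POSTCODE_PATTERN = r"[A-Z]([A-Z](\d{1,2}|\d[A-Z])|\d{1,2})\s\d[A-Z]{2}$"


def is_valid_postcode(postcode):
    """Return True if the whole string matches the post code pattern."""
    # re.match() returns a match object on success and None on failure.
    return re.match(POSTCODE_PATTERN, postcode) is not None


# Hypothetical sample inputs, chosen only to exercise the function.
for sample in ["SW1A 1AA", "M1 1AA", "12345", "M1 1AAX"]:
    if is_valid_postcode(sample):
        print(sample, "is a valid post code")
    else:
        print(sample, "is not a valid post code")
```

In a real program the sample list could be replaced with a call to input(), converting the text to upper case before testing it, since the pattern only accepts capital letters.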


Task 5

Use the above video to create validation programs for your regular expressions for telephone numbers and car registrations.


Time to get verbose

Reading regular expressions can be a bit of a nightmare, especially if you are not used to it and you are trying to focus on Python code at the same time. Thankfully there is another option available to us - verbose regular expressions. Verbose expressions in Python ignore most whitespace inside the pattern and can contain comments, so they are much more readable.

Use of verbose expressions should be encouraged, especially whilst learning. They are far more readable than compact expressions. However, you should be familiar with both.
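As an illustration, the post code expression from earlier could be laid out as a verbose expression roughly like this. It is only a sketch: the variable names and the wording of the comments are assumptions, and the pattern is passed to re.match() together with the re.VERBOSE flag so that Python ignores the layout whitespace and the comments.

```python
import re

# Compact form, for comparison.
POSTCODE_COMPACT = r"[A-Z]([A-Z](\d{1,2}|\d[A-Z])|\d{1,2})\s\d[A-Z]{2}"

# Verbose form of the same expression. Whitespace and anything after a #
# are ignored when the re.VERBOSE flag is used.
POSTCODE_VERBOSE = r"""
    [A-Z]               # first letter
    (
        [A-Z]           # either a second letter, followed by
        (
            \d{1,2}     #   one or two digits
          | \d[A-Z]     #   or a digit and then a letter
        )
      | \d{1,2}         # or one or two digits straight after the first letter
    )
    \s                  # a whitespace character between the two halves
    \d                  # a single digit
    [A-Z]{2}            # two letters to finish
"""

# Hypothetical sample inputs: both forms should agree on every one of them.
for sample in ["SW1A 1AA", "M1 1AA", "12345"]:
    compact_ok = re.match(POSTCODE_COMPACT, sample) is not None
    verbose_ok = re.match(POSTCODE_VERBOSE, sample, re.VERBOSE) is not None
    print(sample, compact_ok, verbose_ok)
```

Note that \s is still needed to match the space in the post code, because a literal space typed into a verbose pattern is ignored.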


Task 6

Use the above video to convert your telephone numbers and car registrations programs from Task 5 to verbose regular expressions.


Other Python re functions

The other functions mentioned, re.search(), re.findall() and re.sub(), all work in a similar way to the re.match() function. The table below indicates how they differ:

Function | Required Parameters | Optional Parameters | Return Value | Example | Explanation
re.search() | pattern, string | re.VERBOSE (passed as flags) | a match object, or None if the pattern is not found | re.search("dog","the dog was lying on the couch") | search() will find the pattern anywhere in the string rather than just at the start like match()
re.findall() | pattern, string | re.VERBOSE (passed as flags) | a list containing all occurrences of the pattern | re.findall("at","the cat was sat on the mat") | findall() returns a list containing all matches e.g. ["at","at","at"]
re.sub() | pattern, replacement text, string | re.VERBOSE (passed as flags) | string with replacements | re.sub("ham","beef","Beth ate a hamburger in her hammock") | sub() replaces the values matched by the pattern with the replacement text e.g. "Beth ate a beefburger in her beefmock"
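
The examples in the table can be run side by side to see the differences for yourself. This is a small sketch that reuses the strings from the table; the variable names are assumptions made for illustration.

```python
import re

sentence = "the dog was lying on the couch"

# match() only looks at the start of the string, so "dog" is not found.
match_result = re.match("dog", sentence)

# search() scans the whole string and returns a match object for the first hit.
search_result = re.search("dog", sentence)

# findall() returns every non-overlapping occurrence as a list of strings.
all_matches = re.findall("at", "the cat was sat on the mat")

# sub() returns a new string with every match replaced.
replaced = re.sub("ham", "beef", "Beth ate a hamburger in her hammock")

print(match_result)            # None
print(search_result.group())   # dog
print(search_result.start())   # 4
print(all_matches)             # ['at', 'at', 'at']
print(replaced)                # Beth ate a beefburger in her beefmock
```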

Task 7

Create a program that makes use of the following regular expression functions:

  1. re.match()
  2. re.findall()
  3. re.sub()

Time to move on

Over the past few pages we have looked at the purpose of regular expressions, been introduced to their syntax, and seen the differences between compact and verbose expressions. Now that we have used them in Python, it is time to look at how we can use regular expressions in conjunction with server-side scripting to help validate forms.