Key Terms

Where are regular expressions actually used? It is fine to learn about the concept, but what use are they in practice? It turns out they are used almost everywhere. Here are a few examples:

  • Validating form input from a web page
  • Rewriting URLs to a more user friendly format on a web server
  • Making sure files of a particular type are handled correctly on a web server
  • Find and replace in a text editor
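
Two of these uses, validating input and find and replace, can be sketched in a few lines of Python. This is only an illustration: the function names, the username rule and the sample text are assumptions made up for this example, not part of any real form or editor.

```python
import re

def is_valid_username(value):
    """Hypothetical rule: one or more lowercase letters, digits or underscores."""
    return re.fullmatch(r"[a-z0-9_]+", value) is not None

def normalise_spelling(text):
    """Find and replace: turn both 'color' and 'colour' into 'colour'."""
    return re.sub(r"colou?r", "colour", text)

print(is_valid_username("ada_lovelace"))   # True
print(is_valid_username("ada lovelace"))   # False, a space is not allowed
print(normalise_spelling("The color red and the colour blue"))
# The colour red and the colour blue
```
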
For example, the following nginx web server configuration uses a regular expression to decide how files of a particular type are handled:

server {
    #listen 80; ## listen for ipv4; this line is default and implied
    #listen [::]:80 default_server ipv6only=on; ## listen for ipv6

    root /usr/share/nginx/html;
    index index.html index.htm index.php;

    #Make site accessible from http://localhost/
    #server_name localhost;
    server_name python-school-server;

    location / {
        #First attempt to serve request as file, then
        #as directory, then fall back to displaying a 404.
        try_files $uri $uri/ =404;
        allow 192.168.1.0/24;
        allow 127.0.0.1;
        deny all;
        #Uncomment to enable naxsi on this location
        #include /etc/nginx/naxsi.rules
    }

    #Regular expression location: matches any request ending in .php
    location ~ \.php$ {
        try_files $uri $uri/ =404;
        allow 192.168.1.0/24;
        allow 127.0.0.1;
        deny all;
        include fastcgi_params;
        fastcgi_pass php5-fpm-sock;
    }
}

Notice that the example above contains the regular expression \.php$. The backslash escapes the dot so that it matches a literal full stop rather than any character, and $ matches the end of the string (remember?). Together this means that any request for a file ending in .php is passed to the PHP socket on this server.
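
To see what that pattern accepts, here is a minimal sketch that applies the same expression with Python's re module. The request paths are invented for illustration, and nginx uses its own regular expression engine, although a pattern this simple behaves the same way in both.

```python
import re

# Same pattern as the location block: a literal dot, then "php", then end of string.
php_pattern = re.compile(r"\.php$")

# Hypothetical request paths, made up for this example.
paths = ["/index.php", "/about.html", "/php/readme.txt", "/admin/login.php"]

# search() looks for the pattern anywhere in the string; $ pins it to the end.
php_requests = [path for path in paths if php_pattern.search(path)]
print(php_requests)  # ['/index.php', '/admin/login.php']
```

Note that /php/readme.txt is not matched: it contains "php", but not ".php" at the very end.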

Some key terms

Before we start using regular expressions in our Python code it is important that we recognise some of the key terms that are associated with the topic:

Key term          Explanation
Regex             Abbreviation of regular expression
Literal           A character in the regular expression that simply matches itself, e.g. a
Metacharacter     A character with a special meaning. There are 13 of these: | ? * + ( ) [ ] \ ^ $ . -
Alternation       The pipe separating alternatives, e.g. uk|us
Character class   A set of alternative characters expressed using square brackets, e.g. [a-z]
Grouping          Using round brackets to group elements together to define the scope and precedence of operators, e.g. b(a|e)d
Quantification    Using a quantifier (e.g. ? * +) to specify how many times the preceding element may occur
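
As a quick preview, each of these terms can be tried out in Python. The sketch below is only an illustration: the helper name matches and the sample strings are assumptions chosen for this example, and the patterns are the ones from the table plus a few small ones built the same way.

```python
import re

def matches(pattern, text):
    """Hypothetical helper: True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

print(matches(r"a", "a"))            # True:  literal, a matches itself
print(matches(r"3\.14", "3.14"))     # True:  escaped metacharacter, \. is a real dot
print(matches(r"3\.14", "3x14"))     # False: x is not a dot
print(matches(r"uk|us", "us"))       # True:  alternation, either uk or us
print(matches(r"[a-z]", "q"))        # True:  character class, one lowercase letter
print(matches(r"[a-z]", "Q"))        # False: uppercase is outside the class
print(matches(r"b(a|e)d", "bed"))    # True:  grouping limits the alternation to a or e
print(matches(r"b(a|e)d", "bid"))    # False: i is not one of the alternatives
print(matches(r"ab?c", "ac"))        # True:  ? allows zero or one b
print(matches(r"ab*c", "abbbc"))     # True:  * allows zero or more b
print(matches(r"ab+c", "ac"))        # False: + needs at least one b
```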

Now that we have been introduced to the syntax and terminology of regular expressions, and have created some of our own, it is time to use them in a practical situation. Let's look at regular expressions and Python...