Apache mod_rewrite
The Apache mod_rewrite module is a very powerful tool that lets you manipulate URLs before the server decides
what action to carry out, or what HTML file to send in reply. If you've ever wanted to offer different
URLs for the same file, or even redirect to a different website altogether, then mod_rewrite can help. It offers
more functionality than the mod_alias module, which can be used to perform some basic mapping between URLs and filenames.
The problem that I have found is that whilst the available documentation explains what can be achieved, it is a bit too
complex when you just want to carry out some basic operations. This is therefore a guide to mod_rewrite for people who
just want to get some quick rules up and working to improve their site.
Whilst many of the examples I use can be performed using mod_alias, this should give enough of an introduction to
mod_rewrite to allow more complex rules to be created for future requirements. This guide only covers some
basic concepts. After reading this and experimenting with your own server you should then read the
Apache mod_rewrite documentation to understand what is actually happening. [This link to the documentation is for Apache version 2.0].
The first thing to consider is that the Apache mod_rewrite module uses regular
expressions extensively, so it's well worth learning a little about regular
expressions. You can view my Beginner's guide to regular expressions.
The regular expression metacharacters are similar to those used by the Perl language.
If, however, the talk of regular expressions sends a cold shiver down your spine, it is still possible to create
some rewrite rules (and examples are provided) without really understanding regular expressions. But you will
only have limited functionality and will not be using the full power of the mod_rewrite module.
Getting Started with mod_rewrite
The rewrite rules can be placed either in the server configuration files or in .htaccess files on a per-folder basis. I am assuming
that they are defined in the server configuration files, which requires root access to edit.
If you are not using virtual hosts then you can add the entries to the main section of the server config file, outside any <Directory> block.
(Inside a <Directory> block or a .htaccess file the pattern is matched against the path relative to that directory, without the leading slash, so the patterns in this guide would need adjusting.)
The config file will normally be httpd.conf, httpd2.conf or commonhttpd.conf, which is in the /etc/httpd/conf directory.
If you are using virtual hosts then you can define the rules on a per-host basis within the Vhosts.conf file under each server.
The first entry is needed to turn the rewrite engine on. The entry should be:
RewriteEngine on
followed by one or more RewriteRule entries:
RewriteRule Pattern Substitution
As a basic example the following is a typical virtual hosts entry:
<VirtualHost *:80>
ServerName www.watkissonline.co.uk
DocumentRoot /var/www/html
RewriteEngine on
RewriteRule ^/index\.html$ /index.php [L]
</VirtualHost>
This example is for the virtual host www.watkissonline.co.uk. The first few lines are the normal definitions for the virtual host.
Next there is a RewriteEngine directive to enable the rewrite rules. In this example there is just one rewrite rule, but there
could be more by having multiple RewriteRule entries.
The Rewrite example can be read as follows:
RewriteRule ^/index\.html$ /index.php [L]
- ^/index\.html$ The pattern to find.
- /index.php What to substitute the entry with.
- [L] Options that control how this works.
This example is used because the index.html file has now been replaced with a PHP file, index.php. This is something
that I did with my website when I added the latest blog entry to the index page. So anyone who tries to access
http://www.watkissonline.co.uk/index.html will get
the result of http://www.watkissonline.co.uk/index.php.
The pattern matched is the string ^/index\.html$, which works as follows:
- ^ The match must start at the beginning of the URL path. In this example it means that the file must be /index.html
and not /info/index.html or any other combination.
- /index\.html The string to be matched must be /index.html. The \. means that it must be an explicit period character
and not any character, which is what an unescaped period means in regular expressions.
- $ Whilst not strictly required in this example, the dollar indicates that this must also be the end of the string.
So it will not match a path with anything after index.html, such as /index.html.bak. (The query string is not part of what a RewriteRule pattern is matched against, so /index.html?test is still matched.)
If the pattern is matched then the file /index.php will be returned.
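To see what each part of the pattern contributes, here is a small sketch of the same rule with one part removed at a time. The variants are commented out and are shown only for comparison; the .bak path is a made-up example.

```apache
# As used above: matches only the path /index.html
RewriteRule ^/index\.html$ /index.php [L]

# Without the leading ^ the match may start further along the path,
# so /info/index.html would also be rewritten:
# RewriteRule /index\.html$ /index.php [L]

# Without the trailing $ anything may follow, so /index.html.bak
# (a hypothetical path) would also be rewritten:
# RewriteRule ^/index\.html /index.php [L]

# Without the backslash the period matches any character,
# so /index5html would also be rewritten:
# RewriteRule ^/index.html$ /index.php [L]
```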
The [L] option indicates that if the match is successful then this is the Last substitution it should try.
So if the request was for /index.html, once this rule has been successfully applied any subsequent RewriteRule directives will be
ignored. Again this is not required for this example, but it is a good habit to always include [L] unless you want subsequent
rules to be applied (the flexibility of being able to apply subsequent RewriteRules to the modified URL is one of the powerful
features of this module).
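As a rough illustration of rules being chained, the sketch below leaves [L] off the first rule so that the second rule is applied to its output. The /old and /new paths are made up for illustration, and the rules are assumed to be in the server or virtual host configuration.

```apache
RewriteEngine on

# No [L] here, so processing carries on with the rewritten path.
# /old/guide.html becomes /new/guide.html
RewriteRule ^/old/(.*)$ /new/$1

# This rule is tested against the output of the rule above.
# /new/guide.html becomes /new/guide.php, and [L] stops processing here.
RewriteRule ^/new/(.*)\.html$ /new/$1.php [L]
```

If the first rule had [L], a request for /old/guide.html would stop at /new/guide.html and the second rule would never be tried for it.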
The user viewing the pages will not know that this rewriting of the URL has occurred. As far as they can see, they are still getting
index.html.
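This guide assumes the rules are in the server configuration files. For comparison, a sketch of how the same rule might look in a .htaccess file in the document root is shown below. It assumes that the server permits rewrite directives in .htaccess files (for example AllowOverride FileInfo, with Options FollowSymLinks enabled). In this context the path being matched has no leading slash.

```apache
# .htaccess in the document root (assumed here to be /var/www/html)
RewriteEngine on

# In per-directory context the leading slash is not part of the matched path
RewriteRule ^index\.html$ /index.php [L]
```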
A second example
A second entry is as follows:
RewriteRule ^/solaris/index.html$ /unix.html [R,L]
This is another real example, where I got rid of a specific entry for Solaris and merged it into a more generic UNIX page.
This is similar to the first example, although I have dropped the \. from the pattern. Remember that a period matches any character,
so /solaris/index5html would be matched as well (an unlikely occurrence). Note also that whilst the pattern is in a subdirectory,
the resulting file is in the root of the site. This can cause a problem if relative links or images are used in the HTML
file: if the requested URL is in a subdirectory then the browser will resolve relative links from that subdirectory rather
than from the location of the new file. This leads us on to the R option, which is new in this example. The R option sends the browser an HTTP redirect so that
the browser knows that the URL has changed, and the new URL is shown in the browser's location bar.
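By default the R option sends a temporary (302) redirect. If the page has moved for good, a status code can be given with the flag. A sketch of the same rule as a permanent redirect, this time with the period escaped:

```apache
# R=301 sends "Moved Permanently" instead of the default 302
RewriteRule ^/solaris/index\.html$ /unix.html [R=301,L]
```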
The next example shows how you can convert a user-friendly URL into one that can be used by a script.
RewriteRule ^/test/(.*) /test.php?page=$1 [L]
This example is a little more complex in that we are actually using the regular expression to copy part of the
matched URL into the substitution. We now have a part of the pattern in brackets, which is the part we want to copy
over to the substitution. This matches anything below the test directory and puts its result into the URL of
the /test.php page. So for example if we had /test/unix this would be rewritten to /test.php?page=unix .
If you had more bracketed sections then each bracket in the pattern will have a corresponding variable that can be used
in the substitution, i.e. the second bracket would be $2, the third $3 etc.
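A sketch of a rule with two bracketed sections follows. The /articles path, the article.php script and its year and title parameters are hypothetical names used only for illustration.

```apache
# /articles/2006/my-first-post  ->  /article.php?year=2006&title=my-first-post
# $1 is the first bracket (digits), $2 the second (lower case letters, digits and hyphens)
RewriteRule ^/articles/([0-9]+)/([a-z0-9-]+)$ /article.php?year=$1&title=$2 [L]
```

Because the substitution supplies its own query string, any query string on the original request is dropped unless the QSA flag is added, for example [QSA,L].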
Managing Rewrite Rules
The rewrite rules can be included in either the httpd config files (e.g. httpd.conf or Vhosts.conf) or, if
allowed, in .htaccess files. Another way of including entries in the web server configuration files is to
use the Include directive. This allows the rewrite rules to be stored in a separate file to make them easier
to manage. For example, in the guide to basic Intrusion Detection for Apache
the rewrite rules are contained in the file /etc/httpd/conf/idsrewrite.conf and included in the Vhosts.conf
file using the entry:
Include /etc/httpd/conf/idsrewrite.conf
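A sketch of how this could look using the rules from this guide. The file name /etc/httpd/conf/myrewrite.conf is hypothetical; the directives in an included file are treated as if they had been written at the point of the Include line.

```apache
# --- /etc/httpd/conf/myrewrite.conf (hypothetical file) ---
RewriteEngine on
RewriteRule ^/index\.html$ /index.php [L]
RewriteRule ^/solaris/index.html$ /unix.html [R,L]

# --- Vhosts.conf ---
<VirtualHost *:80>
    ServerName www.watkissonline.co.uk
    DocumentRoot /var/www/html
    # Pull the rewrite rules into this virtual host
    Include /etc/httpd/conf/myrewrite.conf
</VirtualHost>
```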
More Information
The following provide more information on using mod_rewrite: