htaccess – mod_rewrite, nice URLs, redirection

Apache is one of the most widely used web servers. A large number of add-ons (modules) extend its capabilities. This text describes one of them in particular: mod_rewrite.

Mod_rewrite rewrites the URLs of incoming requests at runtime. The translation is driven by rules written as regular expressions. In addition, rules can be made subject to further conditions.

A rewrite may look like this:
www.example.com/clanek/pocasi -> www.example.com/clanek.php?id=12342

Rewriting

A URL can be rewritten in two ways: by redirection or by substitution.

  • Redirection – The client browser receives an HTTP redirect response (status code 301 or 302) pointing to a new location. The user sees that the URL has changed.
  • Substitution – The server returns the content of the "new" URL to the browser without telling it. The user cannot tell that the address was substituted. The content may come from the same website, from another site on the same server, or from a completely different server.

What matters is when the rewriting actually happens. It takes place after the server has accepted the HTTP request (after parsing the request headers and so on) but before the requested URL is interpreted, i.e. before the server starts looking for the file to send or calls the scripting engine that should run the file. The server then interprets the already rewritten URL. It follows that either method lets us influence what the user finally receives in the browser.

On WEDOS web hosting, mod_rewrite rules can be written in the .htaccess file.

Before you start writing rewrite rules, you need to enable mod_rewrite with:

RewriteEngine on

RewriteBase

Use this directive to set the base URL path from which relative rewrite targets are derived.

RewriteBase path
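
As a hedged illustration, assume the site lives in a subdirectory served as /shop/ and the .htaccess file is placed there. The names shop, item and item.php are invented for this sketch and do not come from the hosting documentation.

```apache
# .htaccess in the (hypothetical) directory served as /shop/
RewriteEngine on
RewriteBase /shop/

# The relative target item.php is resolved against RewriteBase,
# so a request for /shop/item/42 is served by /shop/item.php?id=42
RewriteRule ^item/([0-9]+)$ item.php?id=$1 [L]
```

Note that the pattern contains only item/..., not /shop/item/..., because in a .htaccess file the pattern is matched against the path relative to the directory containing the file.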

RewriteRule

This directive defines the actual rewrite rule. It has the following syntax:

RewriteRule Pattern Substitution [flags]

Pattern is a regular expression that specifies when the rule applies. The syntax is Perl-compatible; in addition, the pattern can be negated with a leading "!". The pattern is compared ("matched") with the URL path requested by the client; in a .htaccess file this is the path relative to the directory containing the file, without a leading slash. If the pattern matches, the URL is rewritten to the substitution.

Substitution is the new URL of the page that is actually served to the client. The address can be absolute (starting with http or https) or relative. A relative path that starts with a slash is resolved from the root of the virtual host; otherwise it is resolved from the current directory (or from RewriteBase, if set).

The substitution can also refer to matched parts of the regular expression (sections enclosed in parentheses), to system variables, or to a mapping function. $N refers to the N-th group of the rule pattern. %N refers to the N-th group of the pattern in a RewriteCond condition (see below). %{VARNAME} refers to a system variable, and ${mapname:key|default} calls a mapping function.

The substitution can also work with URL parameters (the query string). If the substitution contains no question mark, the original parameters are appended unchanged to the rewritten URL. A single question mark at the end of the substitution deletes all parameters. New parameters can be written after the question mark, and matched parts of a pattern ($N and %N) can be used there as well; the new parameters then replace the original ones. If the QSA flag is also given (see below), both the original and the new parameters are passed on.

If the configuration contains several RewriteRule directives, they are processed in the order in which they are written. When a rule matches the URL, the rewrite is performed and the following rules work with the new URL, not the original one. The URL may thus be transformed gradually by several rewrites. Watch out for infinite loops here: the rules are evaluated from beginning to end again and again as long as any rewriting takes place.
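
A minimal sketch of the three kinds of references in one rule. The script profile.php and its parameters name, page and host are hypothetical and serve only to show where each value ends up.

```apache
RewriteEngine on

# %1 = digits captured from the query string by the condition
RewriteCond %{QUERY_STRING} ^page=([0-9]+)$
# $1 = user name captured by the rule pattern
# %{HTTP_HOST} = system variable (the requested host name)
RewriteRule ^user/([a-z]+)$ profile.php?name=$1&page=%1&host=%{HTTP_HOST} [L]
```

Under these assumptions, a request for user/anna?page=2 would be handled internally by profile.php with name=anna and page=2.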
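
The four cases can be sketched as follows, assuming a request for /old.php?a=1. The file names old.php and new.php and the parameters a and b are made up for the illustration, and the rules are alternatives: only one of them would be used at a time.

```apache
# Variant 1: no "?" in the target, the original query string is kept
#   -> new.php?a=1
RewriteRule ^old\.php$ new.php

# Variant 2: trailing "?", the original query string is dropped
#   -> new.php
RewriteRule ^old\.php$ new.php?

# Variant 3: the new query string replaces the original one
#   -> new.php?b=2
RewriteRule ^old\.php$ new.php?b=2

# Variant 4: QSA, the original parameters are appended to the new ones
#   -> new.php?b=2&a=1
RewriteRule ^old\.php$ new.php?b=2 [QSA]
```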
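
One common way to avoid such a loop is to exclude the rewrite target itself with a condition. The following is only a sketch: it assumes the .htaccess file sits in the document root and that a script named index.php with a path parameter exists, neither of which is stated in the text above.

```apache
RewriteEngine on

# Guard: skip requests that already point at the target script,
# so the rewritten URL is not rewritten again on the next pass
RewriteCond %{REQUEST_URI} !^/index\.php
RewriteRule ^(.*)$ /index.php?path=$1 [L,QSA]
```

Without the condition, the rewritten URL index.php would itself match ^(.*)$ when the rules are evaluated again.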

[flags] are optional parameters. The most frequently used are:

  • F – forbids the URL; returns an HTTP 403 (Forbidden) response
  • L – last rule; the following rules are not processed
  • NC – case-insensitive pattern matching
  • P – force proxy; the target is processed via mod_proxy
  • QSA – the original URL parameters are appended to the new ones
  • R[=code] – redirect to the new URL
    • 301 – Moved Permanently
    • 302 – Found (Moved Temporarily) – default
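
Several flags can be combined in one pair of brackets, separated by commas. A short sketch, with the hypothetical paths old-catalog, catalog and catalog.php:

```apache
RewriteEngine on

# Permanent, case-insensitive redirect; no further rules are processed
RewriteRule ^old-catalog/(.*)$ /catalog/$1 [NC,R=301,L]

# Internal substitution that keeps the original query string
RewriteRule ^catalog/([0-9]+)$ /catalog.php?id=$1 [QSA,L]
```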

Basic examples

Example 1 – Regular redirection

RewriteRule original-page1\.html new-page1.html [R]

This is a simple redirect from one page to another on the same site. It is typically used when the address of a page has changed and we want clients to start using the new address. Note that dots and other regular-expression metacharacters in the pattern must be escaped with a backslash, because an unescaped dot matches any character. The second parameter is not a regular expression, so no backslash is used there. The R flag is also required; without it a substitution would be performed instead of a redirect, and in this case we want the user to learn the new address.

Example 2 – Regular substitution

RewriteRule original-page2\.html new-page2.html

A similar situation, but without the R flag. Because the second parameter is a relative address, a substitution is performed implicitly. The URL shown in the browser does not change, but the user sees the new content.

Example 3 – redirect elsewhere

RewriteRule page\.html http://www.example.com/anotherpage.html

Here the redirect is implicit because the second parameter is an absolute URL. The [R] flag is not needed.

Example 4 – prohibition of pages

RewriteRule ^(.*/)?CVS/.* - [F]
RewriteRule ^(.*/)?\.svn/.* - [F]

The F flag forbids access to certain resources on the website. In this case, we do not want anyone to reach the contents of the CVS or Subversion working directories through the browser. It does not matter whether such a directory or file actually exists; what matters is that the URL matches the regular expression. With this flag there is no point in giving a target URL in the second parameter, but because the parameter is mandatory, a hyphen is written instead.
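
If preferred, the two rules could probably be folded into one with an alternation. This is a sketch rather than an exact equivalent: unlike the rules above, it also matches the directory name itself when requested without a trailing slash.

```apache
# Forbid anything inside (or named) CVS or .svn at any directory depth
RewriteRule (^|/)(CVS|\.svn)(/|$) - [F]
```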

Example 5 – setting the MIME-type of the document

RewriteRule ^(.+\.php)s$ $1 [T=application/x-httpd-php-source]

This trick lets visitors view the source code of all PHP files. When a .phps file is requested, the server sends the .php file of the same base name with a special MIME type, so the script is not executed and its contents are displayed instead.

Example 6 – language hidden in the URL

RewriteRule ^cs/(.*)$ $1?lang=cs [QSA]
RewriteRule ^en/(.*)$ $1?lang=en [QSA]

This way an important parameter (here the language code) can be carried in the URL path when we do not want to pass it as a parameter at the end of the address. Do not forget the [QSA] flag, which appends any parameters of the original URL after the language parameter.

Example 7 – redirect all

RewriteRule (.*) http://www.example.com/

Every request is redirected to the home page of www.example.com, whatever path was requested. The rule is meant for a domain other than www.example.com; the redirect is implicit because the target is an absolute URL.

RewriteCond

With RewriteCond directives you can specify one or more conditions that must be met for the following RewriteRule to be applied.

RewriteCond TestString CondPattern [flags]

TestString is the string that is matched against CondPattern. It can contain system variables or mapping functions.

CondPattern can be a standard regular expression, optionally negated. Alternatively, it can be one of several special forms. You can use a comparison (<, >, =), in which case TestString is compared lexicographically with the given string. You can also use the following tests, which always examine TestString:

  • -d – is TestString a directory?
  • -f – is TestString a regular file?
  • -s – is TestString a non-empty file?
  • -l – is TestString a symbolic link?
  • -x – does TestString have executable permission?

The condition holds if TestString matches the regular expression or satisfies the test in CondPattern. If several RewriteCond directives precede a RewriteRule, all of them must hold for the rewrite to be performed, as if they were joined by AND (this can be changed with the OR flag).

[flags] – two flags can be used in conditions:

  • NC – case-insensitive matching
  • OR – this condition is combined with the next one by "OR"; by default conditions are combined by "AND"
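
The file tests and the AND/OR logic can be sketched as follows. The script index.php and its path parameter are assumptions made for the example; the second block simply refuses two HTTP methods to show the OR flag.

```apache
RewriteEngine on

# Implicit AND: both conditions must hold.
# The request is not an existing file ...
RewriteCond %{REQUEST_FILENAME} !-f
# ... and not an existing directory
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?path=$1 [QSA,L]

# Explicit OR: either condition is enough
RewriteCond %{REQUEST_METHOD} ^TRACE$ [OR]
RewriteCond %{REQUEST_METHOD} ^TRACK$
RewriteRule ^ - [F]
```

Each group of conditions applies only to the single RewriteRule that follows it.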

In RewriteRule (in Substitution) and RewriteCond (in TestString) you can use system variables, written in the format %{VARIABLE_NAME}. Examples include:

  • HTTP headers – HTTP_USER_AGENT, HTTP_REFERER, HTTP_HOST, …
  • connection and request information – REMOTE_ADDR, REQUEST_METHOD, QUERY_STRING, …
  • server variables – DOCUMENT_ROOT, SERVER_NAME, …
  • date and time – TIME, TIME_YEAR, TIME_HOUR, TIME_WDAY, …
  • %{ENV:variable} – environment variables
  • %{SSL:variable} – SSL connection parameters
  • %{HTTP:header} – Any HTTP header
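
Two hedged examples of variables in conditions. The address range 192.0.2.0/24 is reserved for documentation, and the /cs/ directory is a hypothetical language version of the site; the sketch also assumes the .htaccess file is in the document root.

```apache
RewriteEngine on

# Refuse all requests coming from one address range
RewriteCond %{REMOTE_ADDR} ^192\.0\.2\.
RewriteRule ^ - [F]

# Send visitors whose browser prefers Czech from the home page to /cs/
RewriteCond %{HTTP:Accept-Language} ^cs [NC]
RewriteRule ^$ /cs/ [R=302,L]
```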

Advanced examples

Example 8 – file existence

# only if requested file really does not exist
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^forum/topic-([0-9]+)\.html$ forum-topic.php?id=$1 [QSA,L]

The substitution is performed only if the originally requested file does not exist. This is also an example of a rule for creating "pretty" URLs, where a parameter (e.g. the discussion topic number) is hidden directly in the file name and does not appear as a URL parameter. The QSA flag is needed here so that the original URL parameters are appended after the new one.

Example 9 – default script

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)\.html$ /unipage.php?page=$1 [L,QSA]

If the requested file actually exists, the rule is skipped and the file is sent to the browser. If it does not exist, the request is passed to a universal script, which can look up the relevant content by URL, for example in a database, or report an error (a way to catch and handle 404 errors).

Example 10 – running html files as PHP script

RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)\.html$ $1.php [L]

This rule lets URLs use the .html extension while the files are in fact .php files. The file is run as a PHP script, but to visitors it looks like a static .html page. The rule does not apply if the .html file exists. Whether the .php file exists is not checked; if it does not, the request usually ends with a 404 error.

Example 11 – redirecting by domain

# example.com -> www.example.com
RewriteCond %{HTTP_HOST} ^example\.com$
RewriteRule (.*) http://www.example.com/$1 [R=301]

The condition and the rule ensure that a visitor who uses the domain name without "www" is immediately redirected to the same page with "www" at the beginning. This is desirable because some search engines penalize sites whose content is available under multiple URLs. The HTTP code 301 indicates a permanent redirect.
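
The opposite direction can be sketched in the same way. This variant is an illustration, not part of the original example: it captures the host name without "www." in the condition and reuses it in the target via %1, so the domain does not have to be written out.

```apache
# www.example.com -> example.com
RewriteCond %{HTTP_HOST} ^www\.(.+)$ [NC]
RewriteRule ^(.*)$ http://%1/$1 [R=301,L]
```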

Example 12 – redirection to HTTPS

RewriteCond %{HTTPS} ^off$
RewriteRule (.*) https://%{HTTP_HOST}/$1 [R]

This ensures that the user is always redirected to a secure connection over HTTPS (SSL).
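
A slightly stricter variant, offered as a sketch: it uses a string comparison instead of a regular expression, makes the redirect permanent, and stops rule processing with L. Whether a permanent redirect is appropriate depends on the site.

```apache
# Redirect every non-HTTPS request to the same host and path over HTTPS
RewriteCond %{HTTPS} !=on
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
```

The query string is not part of %{REQUEST_URI}; it is carried over automatically because the target contains no question mark.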

Example 13 – different versions of pages by date/time

RewriteCond  %{TIME_HOUR}%{TIME_MIN}  >0700
RewriteCond  %{TIME_HOUR}%{TIME_MIN}  <1900
RewriteRule  ^foo\.html$              foo.day.html
RewriteRule  ^foo\.html$              foo.night.html

Some rewrites can depend on the current date and time. The example relies on the lexicographic comparison available in RewriteCond and on the fact that date and time components always have the same length (they are padded with zeros on the left). Here we substitute a different page depending on the time of day. Note that the two conditions apply only to the rule that immediately follows them and that they are joined by "AND". The second rule is not subject to any condition. Once the first rule has been applied, the second no longer matches, because processing continues with the changed URL.
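
The same idea works with other time variables. As a small sketch, the following rule serves a hypothetical foo.weekend.html on Saturdays and Sundays, relying on TIME_WDAY counting the days of the week from 0 (Sunday) to 6 (Saturday).

```apache
# Weekend version of the page (0 = Sunday, 6 = Saturday)
RewriteCond %{TIME_WDAY} ^[06]$
RewriteRule ^foo\.html$ foo.weekend.html [L]
```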

Example 14 – different versions of the pages by browser

RewriteCond  %{HTTP_USER_AGENT}  ^Lynx/.*          [OR]
RewriteCond  %{HTTP_USER_AGENT}  ^Mozilla/[12].*
RewriteRule  ^foo\.html$         foo.20.html       [L]
RewriteRule  ^foo\.html$         foo.32.html       [L]

A similar example that substitutes different versions of a page based on properties of the visitor's browser. Here a simpler version of the page is served to older browsers. The conditions are joined by "OR". The second rule acts as a default when the conditions of the first rule are not met.