URL Rewriting with Apache Web Server

The Apache web server, included on IBM i as HTTP Server for i, contains a powerful feature known as mod_rewrite that can convert URLs (API or web) from their original form to any format you need.

This article offers a small taste of what URL Rewriting can do.

Note: This topic is for advanced users of HTTP Server for i. For the basics, refer to Apache for IBM i: Where to Find Documentation, or request a copy of my Apache Web Server Magic slides.

What It Can Do

URL Rewriting helps make your APIs and web applications more secure and more accessible to users, other applications, and search engines. It allows these improvements without forcing you to change your site or application.

Essential Directives

These directives go in your configuration file (/www/sitename/conf/httpd.conf), which you might edit using Web Administration for i.

  • RewriteEngine: Turns rewriting on with the directive RewriteEngine On.
  • RewriteCond (“Rewrite Condition”): An optional test that decides whether the next RewriteRule runs. Its syntax is RewriteCond TestString CondPattern, where TestString is the string or variable to test (often a URL, but it can be a port, an environment variable, or another value) and CondPattern is the test to apply, usually a regular expression. Several RewriteCond lines can precede one rule; by default, all of them must pass.
  • RewriteRule: The workhorse of rewriting. Its syntax is RewriteRule Pattern Substitution [flags], where Pattern is a regular expression matched against the path of the incoming URL, Substitution is the resulting URL you want, and the optional flags in brackets modify how the rule behaves.
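As a rough sketch of how the three directives fit together, the fragment below pairs two conditions with one rule and then adds a second, unconditional rule. The host names old.example.com and www.example.com and the paths /status and /docs are hypothetical, chosen only for illustration.

```apache
# Enable the rewriting engine for this server (or virtual host).
RewriteEngine On

# Both conditions must be true (conditions are ANDed by default)...
RewriteCond %{HTTP_HOST} ^old\.example\.com$ [NC]
RewriteCond %{REQUEST_URI} !^/status
# ...for this one rule to run. The conditions do not carry over to later rules.
RewriteRule ^/(.*)$ http://www.example.com/$1 [R=301,L]

# This rule has no RewriteCond of its own, so it is evaluated for every request.
RewriteRule ^/docs$ /docs/ [R,L]
```

Each RewriteCond belongs to the RewriteRule that follows it, so a rule that needs the same test must repeat the condition.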

Enhance Security

URL Rewriting can enhance security, such as by hiding your server’s true directory structure. Another security measure is to redirect plain text requests to encrypted URLs with the https:// protocol. It is done like this:

RewriteEngine On
RewriteCond %{SERVER_PORT} !^443$
RewriteRule ^/(.*) https://%{SERVER_NAME}/$1 [NC,R,L]

Explanation: If the RewriteCond detects that the server port is not 443 (the port normally used for TLS encryption, represented in the browser by the “https” prefix), the RewriteRule redirects the browser to the same site with an “https” prefix. The rule captures any path information with the wildcard (.*), inserts it into the result at $1, and prefixes it with “https” and the server name. Flags: [NC]=not case sensitive; [R]=ask the browser to redirect to the new URL (a temporary 302 redirect unless you specify another status, such as R=301); [L]=last rule (don’t execute any more rewriting rules for the current request).

Example: The original URL is http://www.mytestsite.com. Apache redirects to https://www.mytestsite.com (notice the “s” in “https”).
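A possible variation, offered as a sketch rather than a tested IBM i configuration: stock Apache 2.4 exposes an %{HTTPS} variable to mod_rewrite, which lets the condition test for TLS directly instead of relying on the port number. This assumes your HTTP Server for i release behaves like stock Apache here. The R=301 flag makes the redirect permanent.

```apache
RewriteEngine On
# In stock Apache 2.4, %{HTTPS} is "on" for TLS connections and "off" otherwise.
RewriteCond %{HTTPS} !=on
# R=301 marks the redirect as permanent; a bare R sends a temporary 302.
# The original query string, if any, is carried over because the
# substitution does not contain a "?" of its own.
RewriteRule ^/(.*)$ https://%{SERVER_NAME}/$1 [R=301,L]
```

Testing the connection state rather than the port can help when TLS is served on a port other than 443.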

Simplify the URL of Your Home Page

The URL of a dynamically generated home page can be complex. Some software tools require several parameters. This example once appeared on a major retailer’s website (its name is disguised here): http://www.rdfrederick.com/cgi-bin/xyzweb?procfun+homeproc01+pghome+rdf+eng.

We should be able to reach the home page by a simple domain name (e.g., http://www.rdfrederick.com). URL rewriting provides the solution:

RewriteEngine On
RewriteRule ^/$ /cgi-bin/xyzweb?procfun+homeproc01+pghome+rdf+eng [PT,L]

The pattern ^/$ matches a request whose path is just “/”, which is what the server receives when a simple domain name is used without any further path or file data. When the rule matches, Apache substitutes the second parameter (/cgi-bin...). Because the flags in brackets at the end of the rule do not include “R,” no redirection takes place; the substitution of the longer URL occurs inside the web server. Although the desired program (xyzweb with parameters) is called, the user’s browser just shows http://www.rdfrederick.com. Flags: [PT]=pass through (hand the rewritten result to any other processing the web server might have to do, such as alias mapping); [L]=last rule.
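The same idea can cover more than one friendly address. The sketch below makes both the bare domain and a /home path reach the same program; the /home alias is a hypothetical addition, not part of the retailer’s original setup.

```apache
RewriteEngine On
# (home)? makes the word "home" optional, so the pattern matches
# both "/" and "/home" and nothing else.
RewriteRule ^/(home)?$ /cgi-bin/xyzweb?procfun+homeproc01+pghome+rdf+eng [PT,L]
```

Because there is still no R flag, the browser keeps showing whichever short address the user typed.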

Fit a Long URL on a Short Screen

Many 5250 emulators provide an easy way to integrate internet content, such as web pages and images, with text-based 5250 screens. By default, the emulator in IBM i Access Client Solutions (ACS) recognizes a displayed URL and converts it to a clickable “hotspot.” Clicking it launches the associated content in the default web browser. One problem: If the URL is longer than the screen width (80 characters on the default 24 x 80 screen), some of the URL will be cut off.

For example, our web-based invoice software could require a long URL that looks like this:

http://www.myinvsite.com/qsys.lib/wwwcgi.lib/softweb.pgm?procfun+myproc+func001+dev+eng+funcparms+stdrentry(A0010):Y+account(A0100):12345+invoice(A0050):22222+line(A0060):43

That’s a mouthful! We can reduce it to this dainty (and more readable) URL:

http://www.myinvsite.com/account/12345/invoice/22222/line/43

The conversion is managed with the following directives:

RewriteEngine On
RewriteRule ^/account/(.*)/invoice/(.*)/line/(.*) /qsys.lib/wwwcgi.lib/softweb.pgm?procfun+myproc+func001+dev+eng+funcparms+stdrentry(A0010):Y+account(A0100):$1+invoice(A0050):$2+line(A0060):$3 [PT,L]

Notice the three wildcards “(.*)”. Each one captures part of the original URL, and Apache substitutes the captured values for the “$1,” “$2,” and “$3” symbols in the replacement URL. The user and the 5250 emulator see the short URL, while the web server processes the long one.
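The wildcard (.*) is greedy and will also match slashes, so a stricter pattern may be safer in practice. The sketch below assumes the account, invoice, and line values are always numeric, as they are in the example; if yours contain letters, ([^/]+) would be the looser equivalent. The substitution is written as a path on the same server.

```apache
RewriteEngine On
# [0-9]+ accepts one or more digits and cannot run past the next slash,
# so /account/12345/invoice/22222/line/43 yields $1=12345, $2=22222, $3=43.
# The leading ^ and trailing $ require the whole path to fit this shape.
RewriteRule ^/account/([0-9]+)/invoice/([0-9]+)/line/([0-9]+)$ /qsys.lib/wwwcgi.lib/softweb.pgm?procfun+myproc+func001+dev+eng+funcparms+stdrentry(A0010):Y+account(A0100):$1+invoice(A0050):$2+line(A0060):$3 [PT,L]
```

With this version, a malformed request such as /account/abc/invoice/1/line/2 does not match the rule at all, so it never reaches the program.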

Incidentally, search engines seem to prefer simple URLs over complex ones. A site with long, complex URLs might improve its search engine rankings by simplifying its URLs using this technique.

More Ideas and Information

Many inspiring solutions can be found in Redirecting and Remapping and Advanced Topics. The study of regular expressions will aid the aspiring web wizard, as will this tutorial and the official mod_rewrite documentation.

(This article has been updated from its original published by MC Press Online.)
