The other day at work I was given the task of updating all page URLs to always use a trailing slash. On the surface it seemed like a pretty simple request: drop a RewriteCond and a RewriteRule into the host file and call it a day. For those who are just looking for the final code, here is what I used:
RewriteCond %{REQUEST_METHOD} !=POST
RewriteCond %{REQUEST_URI} !(.*)/$
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ http://%{HTTP_HOST}$1/ [R=301,NE]
Everyone else, read on as I explain how I got to that.
Version 1, tested on my virtual machine
# This looks at the URI for the page and if it does not have a trailing slash already it moves on to the next condition. If it does have a trailing slash, it stops checking and serves the page.
RewriteCond %{REQUEST_URI} !(.*)/$
# This checks that the requested path is not an existing file. If it is, just serve the file.
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-f
# If it gets to this point, redirect the url to the version with the trailing slash
RewriteRule ^(.*)$ http://%{HTTP_HOST}$1/ [R=301,NE]
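For comparison, here is a rough, untested sketch of how the same rule might look if it lived in a .htaccess file instead of the host file. That placement is an assumption made for illustration. In per-directory context the mod_rewrite documentation describes %{REQUEST_FILENAME} as the full filesystem path, so the %{DOCUMENT_ROOT} prefix is dropped. The leading slash is also stripped from what RewriteRule matches there, so the target is built from %{REQUEST_URI} rather than $1. The L flag is an addition that stops later rules from touching the redirect.

```apache
# Sketch for a .htaccess file in the document root (assumed placement;
# needs AllowOverride FileInfo so that mod_rewrite directives are honored).
RewriteEngine On
# Skip URLs that already end in a slash
RewriteCond %{REQUEST_URI} !/$
# In per-directory context this is already a filesystem path
RewriteCond %{REQUEST_FILENAME} !-f
# The pattern only has to match; the path itself comes from REQUEST_URI
RewriteRule ^ http://%{HTTP_HOST}%{REQUEST_URI}/ [R=301,NE,L]
```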
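* One thing to note: when I was reading up on checking whether the file exists, I found a lot of sources saying that Apache 2.2 needs the %{DOCUMENT_ROOT} prefix, but that in Apache 2.0 %{REQUEST_FILENAME} contained the absolute path. The mod_rewrite documentation also notes that in virtual host context %{REQUEST_FILENAME} can hold the same value as %{REQUEST_URI}, because the request has not been mapped to the filesystem yet. So if you are having trouble with that line, check the version of Apache and where the rules live, and double check which path the file is being looked up at.
I gave this a few test runs and noticed a few things wrong with it: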
- Some of our forms had been hard-coded to use URIs that did not have trailing slashes. This means that when the data was POSTed, the server would answer with a 301 redirect, and browsers typically follow that with a GET request that does NOT pass along the POST data. So the forms would be useless.
- Some of our code that referenced files actually had a RewriteRule further down that mapped the path to another directory. So the file check in the second RewriteCond did not find a file at the requested path, which let the redirect go ahead. For example: http://domain.com/directory/filename.xml mapped to http://domain.com/anotherdirectory/filename.xml. So when it didn't find the file at http://domain.com/directory/filename.xml, it would redirect to http://domain.com/directory/filename.xml/.
Fixing the form URIs
My options for fixing the form paths were to either update all of the form actions or to figure out a way to ignore POST requests. Sure enough, I found I could use %{REQUEST_METHOD} to ignore POST requests. I was turned on to it by this Stack Overflow question.
RewriteCond %{REQUEST_METHOD} !=POST
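A stricter variation, shown only as a sketch: instead of excluding POST alone, the condition can allow the redirect only for GET and HEAD, so that other methods that carry a request body (PUT, for example) are left alone as well. Whether a site needs this is an assumption; the rule above is what was actually used.

```apache
# Only redirect GET and HEAD requests; POST, PUT, DELETE and other
# methods fail this condition and are served without a redirect
RewriteCond %{REQUEST_METHOD} ^(GET|HEAD)$
```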
Fixing the file paths that have URL rewriting already in place
The situation was similar to this: a request for a path like http://domain.com/directory/filename.xml would be rewritten to /anotherdirectory/filename.xml. In my case the files in that directory were more often than not referenced with an absolute path ending in the file extension, so they never needed a trailing slash. That meant I was able to just add another RewriteCond that excluded URLs in that directory from being processed.
RewriteCond %{REQUEST_URI} !^/directory/(.*)
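Putting the pieces from this post together, the full rule set might look like the sketch below. /directory/ is the placeholder path from the example above and would need to be replaced with the real one. RewriteEngine On and the L flag are additions assumed here, not part of the original rules, and the slash check is written as !/$, which matches the same URLs as !(.*)/$.

```apache
RewriteEngine On
# Leave form submissions alone so POST data is not lost in a redirect
RewriteCond %{REQUEST_METHOD} !=POST
# Skip URLs that already end in a slash
RewriteCond %{REQUEST_URI} !/$
# Skip paths that another RewriteRule maps elsewhere (placeholder directory)
RewriteCond %{REQUEST_URI} !^/directory/
# Skip requests for files that exist on disk
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_FILENAME} !-f
# Everything else gets a permanent redirect to the URL with a trailing slash
RewriteRule ^(.*)$ http://%{HTTP_HOST}$1/ [R=301,NE,L]
```

The conditions are ANDed together by default, so a request is redirected only when every one of them passes.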
Debugging mod_rewrite
While working on this I found it useful to do some debugging whenever something didn't work as I expected. I enabled logging so that I could see the route an incoming URL takes through the rules. To turn it on, you can add this to the Apache conf file:
# This path must be writeable by the web server
RewriteLog /var/log/apache2/rewrite.log
RewriteLogLevel 5
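RewriteLog and RewriteLogLevel exist in Apache 2.2 and earlier. They were removed in Apache 2.4, where mod_rewrite logging is controlled through the LogLevel directive and written to the error log. A rough equivalent, assuming Apache 2.4 and that trace3 gives enough detail for the problem at hand:

```apache
# Apache 2.4+: rewrite tracing goes to the file named by ErrorLog
# (trace1 is the least verbose level, trace8 the most)
LogLevel alert rewrite:trace3
```

The rewrite entries in the error log are tagged with [rewrite:, which makes them easy to filter out from everything else.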
Once activated, you can tail the log file to watch it in real time as you access the page.
These were some helpful pages I used when I was working on this:
http://apache-http-server.18135.x6.nabble.com/mod-rewrite-loaded-but-not-writing-or-logging-td4746292.html
http://amandine.aupetit.info/135/apache2-mod_rewrite/