Update to nginx_alias_map


I've been doing a bunch of maintenance on my two blogs (company and personal), and one goal has been to track down malformed and mis-mapped URLs on both sites. Since each has been through a couple of changes in the underlying blog engine, there are multiple sets of URLs that point to the same content. Generally, because a long time passed between respins, the search engines picked up the changes in URLs, but occasionally I still see log entries for the two-engines-ago format. I'd like to fix those as well, especially since the same content is still on the site in most cases.

The most recent versions (the Cartographica blog was on SquareSpace and Gaige's Pages was on Drupal) are already mapped, and the mappings were simple thanks to good URI choices. However, both of these blogs originally ran on Geeklog, whose URL format was based on the article.php file and a query string.

As mentioned in my nginx_alias_map post last year (a year ago this week, it turns out), I had written a bit of code to produce nginx maps to handle redirections.

However, as written, the map I was generating used the $uri variable in nginx, which is "cooked": nginx normalizes it by removing relative directory traversals, double slashes, query strings, and so on. For most of my URIs, this is a much better fit, but it obviously won't work for the old Geeklog query string-based redirection, which needs the full $request_uri. Rather than give up the cooked form everywhere, I decided to complicate the plugin a bit and add support specifically for aliases that contain a ? (the indicator of a query string), and to process them in a second-stage map. It's a little more time-consuming, although it's not noticeable on my blogs.
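To make the difference between the two variables concrete, here is a rough Python sketch of the cooking described above. It is only an approximation for illustration, not nginx's actual algorithm: the function name approximate_uri is made up, and steps such as percent-decoding are deliberately left out.

```python
def approximate_uri(request_uri: str) -> str:
    """Roughly mimic how $uri differs from $request_uri (illustrative only)."""
    # $uri never includes the query string.
    path = request_uri.split("?", 1)[0]

    segments = []
    for segment in path.split("/"):
        if segment in ("", "."):
            # Empty segments come from doubled slashes; "." is a no-op.
            continue
        if segment == "..":
            # Relative traversal: step back up one level if possible.
            if segments:
                segments.pop()
            continue
        segments.append(segment)

    normalized = "/" + "/".join(segments)
    # Keep a trailing slash if the original path pointed at a directory.
    if path.endswith(("/", "/.", "/..")) and normalized != "/":
        normalized += "/"
    return normalized


if __name__ == "__main__":
    for raw in ("/article.php?story=2003042913413622", "//node/./4921"):
        print(raw, "->", approximate_uri(raw))
```

The first example shows why the Geeklog aliases cannot be matched against $uri: once the query string is gone, every old article collapses to the same /article.php path.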

The solution was to run the $uri map for any URIs not containing query strings and then run the $request_uri map for any URIs that do contain them. So, if you have an alias entry such as the one for Load up those album covers (header shown here):

Date: 2003-04-29 11:41
Alias: /node/4921,/article.php?story=2003042913413622
Tags:
Category: macintosh
Title: Load up those album covers

the code will generate entries in two maps:

map $uri $redirect_uri_1 {
    ~^/node/4921$ https://$server_name/load-up-those-album-covers.html;
}
map $request_uri $redirect_uri {
    default $redirect_uri_1;
    ~^/article\.php\?story=2003042913413622$ https://$server_name/load-up-those-album-covers.html;
}

Note that the first map sets $redirect_uri_1 and the second one sets $redirect_uri, with a default value of $redirect_uri_1, so a request that misses the query-string map falls back to whatever the $uri map found. Because of the way that nginx evaluates maps, you can't use $redirect_uri in both cases.
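The chaining of the two maps can be emulated in a few lines of Python. This is only a sketch of the lookup order, under a few assumptions: the names URI_MAP, REQUEST_URI_MAP, lookup and resolve_redirect are invented for the example, $server_name is hard-coded as example.server, and $uri is approximated by simply dropping the query string.

```python
import re

TARGET = "https://example.server/load-up-those-album-covers.html"

# Stage one: patterns matched against the cooked $uri.
URI_MAP = [(re.compile(r"^/node/4921$"), TARGET)]

# Stage two: patterns matched against the raw $request_uri.
REQUEST_URI_MAP = [
    (re.compile(r"^/article\.php\?story=2003042913413622$"), TARGET),
]


def lookup(patterns, value, default=""):
    """Return the target of the first matching pattern, else the default."""
    for pattern, target in patterns:
        if pattern.search(value):
            return target
    return default


def resolve_redirect(request_uri: str) -> str:
    """Emulate $redirect_uri: stage two falls back to stage one's result."""
    uri = request_uri.split("?", 1)[0]  # crude stand-in for $uri
    redirect_uri_1 = lookup(URI_MAP, uri)  # a map with no default yields ""
    return lookup(REQUEST_URI_MAP, request_uri, default=redirect_uri_1)


if __name__ == "__main__":
    for path in ("/node/4921", "/article.php?story=2003042913413622", "/other"):
        print(path, "->", resolve_redirect(path) or "(no redirect)")
```

An empty result corresponds to the case where neither map matches, which is why the server block can simply test whether $redirect_uri is non-empty.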

As with previous versions, you need to include the map in your http stanza in your nginx configuration, and you also need to check the value of $redirect_uri and send it back as a redirect if present:

include /opt/web/output/alias_map.txt;

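server {
    listen       *:80 ssl;
    server_name  example.server;

    # Redirection logic: $redirect_uri is empty unless one of the maps matched
    if ( $redirect_uri ) {
        return 301 $redirect_uri;
    }

    location / {
        # root rather than alias, so /foo.html resolves to /opt/web/output/foo.html
        root /opt/web/output;
    }
}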

Of course, if you only have one or the other type of redirection, the code creates just a single-stage map.
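For readers curious how the one-stage versus two-stage decision might look, here is a minimal Python sketch. It is not the plugin's actual code: build_alias_maps and render_map are hypothetical names, and the input is assumed to be a plain list of (alias, target path) pairs rather than the article metadata the plugin really works from.

```python
import re


def render_map(source, variable, entries, default=None):
    """Render one nginx map block as text."""
    lines = [f"map {source} ${variable} {{"]
    if default is not None:
        lines.append(f"    default {default};")
    for alias, target in entries:
        # Escape regex metacharacters such as "." and "?" in the old URL.
        lines.append(f"    ~^{re.escape(alias)}$ https://$server_name{target};")
    lines.append("}")
    return "\n".join(lines)


def build_alias_maps(aliases, final_var="redirect_uri"):
    """Split aliases on the presence of "?" and emit one or two maps."""
    plain = [(a, t) for a, t in aliases if "?" not in a]
    query = [(a, t) for a, t in aliases if "?" in a]

    if plain and query:
        # Two stages: the $request_uri map defaults to the $uri map's result.
        first_var = f"{final_var}_1"
        return "\n".join([
            render_map("$uri", first_var, plain),
            render_map("$request_uri", final_var, query, default=f"${first_var}"),
        ])
    if query:
        return render_map("$request_uri", final_var, query)
    if plain:
        return render_map("$uri", final_var, plain)
    return ""


if __name__ == "__main__":
    print(build_alias_maps([
        ("/node/4921", "/load-up-those-album-covers.html"),
        ("/article.php?story=2003042913413622", "/load-up-those-album-covers.html"),
    ]))
```

With only one kind of alias, the single map writes directly to $redirect_uri, so the server block does not need to change between the two cases.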

Updated code is now available as nginx_alias_map on GitHub.