Title mod_mirrorbrain fails when Apache configuration is for a symlink, not a directory
Priority bug Status resolved
Superseder Nosy List dfarning, poeml
Assigned To poeml Keywords

Created on 2009-10-26.20:04:47 by poeml, last changed by poeml.

msg230 (view) Author: poeml Date: 2010-09-06.14:24:49
Committed in r8114.

Will be included in 2.13.0.

David, thanks again for you help (and patience)!
msg229 (view) Author: poeml Date: 2010-09-06.14:18:36
The following patch would be the version that checks with each request:

@@ -1475,18 +1475,29 @@
     /* prepare the filename to look up */
-    char *ptr = canonicalize_file_name(r->filename);
+    char *ptr = realpath(r->filename, NULL);
     if (ptr == NULL) {
         ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r, 
                 "[mod_mirrorbrain] Error canonicalizing filename '%s'", r->filename);
-        setenv_give(r, "file");
-    /* XXX we should forbid symlinks in mirror_base */
     realfile = apr_pstrdup(r->pool, ptr);
-    /* strip the leading directory */
-    filename = realfile + strlen(cfg->mirror_base);
+    /* the basedir might contain symlinks. That needs to be taken into account. See issue #17 */
+    ptr = realpath(cfg->mirror_base, NULL);
+    if (ptr == NULL) {
+        /* this should never happen, because the MirrorBrainEngine directive would never
+         * be applied to a non-existing directories */
+        ap_log_rerror(APLOG_MARK, APLOG_ERR, 0, r, 
+                "[mod_mirrorbrain] Document root \'%s\' does not seem to "
+                "exist. Filesystem not mounted?", cfg->mirror_base);
+    }
+    /* the leading directory needs to be stripped from the file path */
+    /* a directory from Apache always ends in '/'; a result from realpath() doesn't */
+    filename = realfile + strlen(ptr) + 1;
+    free(ptr);
     debugLog(r, cfg, "Canonicalized file on disk: %s", realfile);
     debugLog(r, cfg, "SQL file to look up: %s", filename);
msg228 (view) Author: poeml Date: 2010-09-06.13:26:10
It might happen that users change the file system layout also during runtime. A directory might be moved away 
and replaced with a symlink. Or a filesystem might be accidentally unmounted at Apache start, or mounted at 
the wrong place, and the user might try a symlink. In this case, it isn't sufficient if Apache checks only at 

One might argue that canonicalizing the base directory shouldn't be done for each request.

One might also argue that the non-obviousness of the symptom, as well as the potentially saved hair is a good 
reason to do the check each time.

The following patch deals with symlinks that exist already when Apache starts, but doesn't deal with later 
changes, because the check runs only during merging the directory configuration:

@@ -471,8 +471,25 @@
     mb_dir_conf *cfg = (mb_dir_conf *) config;
     cfg->engine_on = flag;
-    cfg->mirror_base = apr_pstrdup(cmd->pool, cmd->path);
-    return NULL;
+    char *cn = canonicalize_file_name(cmd->path);
+    if (!cn) {
+        ap_log_error(APLOG_MARK, APLOG_WARNING, 0, NULL,
+                     "[mod_mirrorbrain] Document root \'%s\' does not seem to "
+                     "exist. Filesystem not mounted?",
+                     cmd->path);
+        /* if a symlinks is used, it must exist at Apache's start already */
+        cfg->mirror_base = apr_pstrdup(cmd->pool, cmd->path);
+        return NULL;
+    }
+    /* Directory from Apache ends in '/'. A canonicalized path not. */
+    if (strncmp(cmd->path, cn, strlen(cmd->path)-1) == 0) {
+        cfg->mirror_base = apr_pstrcat(cmd->pool, cn, "/", NULL);
+        return NULL; /* works in both cases */
+    } else {
+        /* when could this occur? */
+        return apr_psprintf(cmd->pool, "symlink? path: %s vs. %s", cmd->path, cn);
+    }
 static const char *mb_cmd_debug(cmd_parms *cmd, void *config, int flag)
msg226 (view) Author: poeml Date: 2010-09-06.01:38:38
Checking for a symlink could be done by stating the path and looking at st_mode with the 
POSIX macro S_ISLNK. But it might be a waste of resources to do this check too often (the 
MirrorBrainEngine directive needs to be merged for each request).

I can reproduce this bug by creating a symlink "/srv/ooo" pointing to "/srv/" 
which I moved away -- if I set the option FollowSymlinks on the directory "/srv":

[mod_mirrorbrain] MirrorBrainEngine On, mirror_base '/srv/ooo/'
[mod_mirrorbrain] URI: '/extended/iso/de/foo.txt.mirrorlist'
[mod_mirrorbrain] filename: '/srv/ooo/extended/iso/de/foo.txt.mirrorlist'
[mod_mirrorbrain] File does not exist according to r->finfo
[mod_mirrorbrain] Representation chosen by .mirrorlist extension
[mod_mirrorbrain] r->uri: '/extended/iso/de/foo.txt.mirrorlist'
[mod_mirrorbrain] r->uri: '/extended/iso/de/foo.txt'
[mod_mirrorbrain] could not resolve country
[mod_mirrorbrain] could not resolve continent
[mod_mirrorbrain] Country '--', Continent '--'
[mod_mirrorbrain] AS '--', Prefix '--'
[mod_mirrorbrain] Canonicalized file on disk: /srv/
[mod_mirrorbrain] SQL file to look up: off/extended/iso/de/foo.txt
[mod_mirrorbrain] Successfully acquired database connection.
[mod_mirrorbrain] classifying 0 mirrors: 0 prefix, 0 AS, 0 country, 0 region, 0 elsewhere
(-1)Unknown error 4294967295: [mod_mirrorbrain] Could not retrieve row from database for 
off/extended/iso/de/foo.txt (size: 18, mtime 1283190051):  Likely cause: there are no 
hashes in the database (yet).
[mod_mirrorbrain] no hashes found in database
[mod_mirrorbrain] Sending mirrorlist

Note the wrong path being looked up.

I can not reproduce the bug if the FollowSymlinks option is not set, because then 
something different happens:

[error] [client] Symbolic link not allowed or link target not accessible: 
[error] [client] no acceptable variant: 

The latter will probably lead the admin to change the configuration to achieve the former 

I'm tossing/testing a fix...
msg42 (view) Author: poeml Date: 2009-10-26.20:04:47
When the Apache config for mod_mirrorbrain is done on a <Directory> block that really is a 
symlink, the module fails to redirect correctly. The symptom is (thanks David Farning for the 
report, for digging into it and finding the culprit):

> Judging form the log files it looks like either mod_mirrorbrain of the
> is chopping the several character form the beginning of the string it
> is searching in the database.
> Once the the apache config was working pointing at a normal directory
> it worked correctly.

I wonder how we can make this more foolproof. I can think of two ways:

1) add documentation that explains this -- but how to make sure that it is not missed?

2) add a check to mod_mirrorbrain that is performed at Apache start time, which resolves the 
directory to its canonical path and checks whether they match. If the check fails, either 
prevent starting or log an error message.

The config typically looks like this:
    <Directory /srv/mirrors/openoffice>
        MirrorBrainEngine On

3) thinking more about 2), I realize that all that's needed is to canonicalize the directory 
name in the beginning, and use the canonicalized path instead of the directory where the 
config has been placed.

Looking at the code, I actually see a comment that confirms the above assumptions:

    /* XXX we should forbid symlinks in mirror_base */
    filename = apr_pstrdup(r->pool, ptr + strlen(cfg->mirror_base));

I think 3) is the way to go to prevent us from running into this again!
Date User Action Args
2010-09-06 14:24:49poemlsetstatus: testing -> resolved
nosy: + dfarning
messages: + msg230
2010-09-06 14:18:36poemlsetstatus: in-progress -> testing
messages: + msg229
2010-09-06 13:26:10poemlsetmessages: + msg228
2010-09-06 01:38:39poemlsetstatus: chatting -> in-progress
messages: + msg226
2009-10-26 20:04:47poemlcreate