The mirror probe doesn't detect certain error scenarios. The most common mirror failure is that the
HTTP server doesn't reply in time, or returns an unexpected HTTP status code (like 500). The mirror
probe detects this. The check is currently done only on the base URL (at the top of the tree).
However, if a mirror administrator reconfigures a mirror, e.g. by reinstalling the machine from
scratch, the base URL will often continue to work: a fresh webserver install can reply with a
200 OK response, which makes the mirror probe believe that everything is fine, even though the
file tree served before is gone.
This is what happened with the mirror http://opensuse-linuxmigratio.at/ and
rsync://opensuse-linuxmigratio.at/opensuse/
(As an additional bug that occurred with this mirror, the scanner didn't clean up the file list.
rsync wasn't reachable anymore, and normally the scanner should then fall back to HTTP for scanning.
But that didn't happen for this mirror. I will report that in a separate issue.)
What could be done to fix this is an additional check on some actual file in the tree. A file could
be chosen randomly from the database.
Checking on an actual file would suggest using a HEAD request, to avoid transferring the file. Right
now (when checking on the base URL), we use a GET request, because in the past a broken mirror would
still reply seemingly okay to HEAD requests but couldn't deliver files when requested with the GET
method.
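A minimal sketch of such a file check, in Python, might look like the following. It is not the mirror
probe's actual code: the names pick_random_path and probe_file, the 1024-byte read, the 20-second
timeout and the idea of comparing Content-Length against a size stored in the database are all
assumptions made for illustration. It uses GET rather than HEAD, for the reason given above, but
reads only the first bytes of the response so the whole file is not downloaded.

```python
import random
import urllib.request


def pick_random_path(known_paths, rng=random):
    """Pick one path that the database says the mirror should serve."""
    return rng.choice(list(known_paths))


def probe_file(base_url, path, expected_size=None, timeout=20,
               opener=urllib.request.urlopen):
    """Fetch one file with GET and return (ok, reason).

    `opener` is injectable so the function can be exercised without a network.
    """
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    request = urllib.request.Request(url, method="GET")
    try:
        with opener(request, timeout=timeout) as response:
            status = response.status
            length = response.headers.get("Content-Length")
            # Read only the first bytes: enough to see that data is delivered.
            first_chunk = response.read(1024)
    except OSError as exc:
        # Covers URLError, HTTPError (4xx/5xx responses) and socket timeouts.
        return False, "request failed: %s" % exc

    if status != 200:
        return False, "unexpected status %s" % status
    if expected_size is not None and length is not None:
        if not length.isdigit() or int(length) != expected_size:
            return False, "size mismatch: got %s, expected %d" % (length, expected_size)
    if expected_size and not first_chunk:
        return False, "empty response body"
    return True, "ok"
```

With a freshly installed webserver, the base URL would still answer 200 OK, but
probe_file(base_url, pick_random_path(paths_from_database)) would most likely fail with a 404,
where paths_from_database stands for whatever list of paths the scanner has stored for that mirror.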
Maybe the "deeper" check could be performed less frequently than the base URL check (not every
minute), but include more thorough functional checks in return. It would be a great improvement,
because simple plausibility checks could detect a lot more error conditions.
Another such check would be a consistency check between the file lists seen through the different
protocols (HTTP/FTP/rsync), to rule out misconfigurations that result in broken URLs.
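As a rough illustration of the consistency check, the Python sketch below compares file lists that
have already been collected per protocol. The function name compare_file_lists and the input format
(a dict mapping a protocol name to relative paths) are hypothetical; how the lists are obtained is
left to the scanner.

```python
def compare_file_lists(lists_by_protocol):
    """Report, per protocol, the paths that other protocols serve but this one lacks.

    `lists_by_protocol` maps a protocol name (e.g. "http") to an iterable of
    paths relative to the mirror's base URL for that protocol.
    """
    sets = {proto: set(paths) for proto, paths in lists_by_protocol.items()}
    # Every path seen through at least one protocol.
    union = set().union(*sets.values())
    return {proto: sorted(union - paths) for proto, paths in sets.items()}


def is_consistent(lists_by_protocol):
    """True if every protocol exposes exactly the same set of paths."""
    return not any(compare_file_lists(lists_by_protocol).values())
```

A mirror whose HTTP base URL points one directory too high or too low would show up here with
nearly every path reported as missing for "http", which is easy to tell apart from a mirror that is
merely a few files behind.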