[mirrorbrain-announce] Post-2.7 Work On The Toolchain.

From: Peter Poeml <poeml_at_cmdline.net>
Date: Tue, 10 Mar 2009 13:25:42 +0100

Since the database format changed with MirrorBrain 2.7, there have been
a number of interesting improvements in the toolchain. I'll list (and
explain) the changes below, since there has been no discussion of them
on the mailing list (it's not populated yet).

In the *scanner*, deletion of files from the mirror database is now
implemented for *subdirectory scans*. Previously this required a full
scan, because the database was too bloated to efficiently select the
affected files; the new database schema makes it possible. Very cool is
that this opens the door for better scanning, which is much more
directory-based now and can do cleanups whenever needed. This in turn
allows for a tighter integration of mirror syncing with the database
update. A (push) rsync could not only trigger a scan right after syncing
a directory, but could also enter the files directly into the database
-- and delete the ones that are obsolete.
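
To make the idea of a subdirectory cleanup concrete, here is a minimal
Python sketch. It is not the scanner's actual code; the function name
plan_subdir_update and the plain path lists are assumptions made purely
for illustration. It only shows the set logic: compare what the database
knows below one directory with what a scan of that directory found.

```python
def plan_subdir_update(db_paths, scanned_paths, subdir):
    """Return (to_add, to_delete) for one subdirectory.

    Hypothetical helper, not part of MirrorBrain: db_paths are the paths
    the database currently lists for a mirror, scanned_paths are the paths
    a fresh scan of `subdir` found on that mirror.
    """
    prefix = subdir.rstrip("/") + "/"
    # Only paths below the scanned subdirectory are affected.
    known = {p for p in db_paths if p.startswith(prefix)}
    found = {p for p in scanned_paths if p.startswith(prefix)}
    to_add = sorted(found - known)      # new on the mirror
    to_delete = sorted(known - found)   # obsolete in the database
    return to_add, to_delete


db = ["repo/a/old.rpm", "repo/a/kept.rpm", "repo/b/other.rpm"]
scan = ["repo/a/kept.rpm", "repo/a/new.rpm"]
to_add, to_delete = plan_subdir_update(db, scan, "repo/a")
print("add:", to_add)
print("delete:", to_delete)
```

Files outside the scanned directory (repo/b/other.rpm here) are left
alone, which is the point of not needing a full scan.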

A bug in the scanner which prevented the correct usage of
inclusion/exclusion of top-level directories in relation to subdirectory
scans has been fixed.

The *mirror choice* can now be influenced with a *query parameter*,
as=1234, appended to the URL. The number specifies the autonomous system
number which the server will base its mirror selection on, instead of
the AS of the client IP. Another possible parameter is country=XY, where
XY is a two-letter country code. As an example, you could look at the
following URLs::


The first URL gives a result depending on your location. The other two
generate a list for AS 680, or for the United Kingdom, respectively.
This shows some of the criteria for mirror selection that MirrorBrain
uses. (In reality, it uses more criteria for mirror selection; whatever
is available.)
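
As a rough illustration of how such a URL is put together, the following
Python sketch appends as= and/or country= to an existing URL. The host
mirrors.example.org, the file path and the helper name
with_selection_override are made up for this example; only the two
parameter names come from the description above.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def with_selection_override(url, asn=None, country=None):
    """Append as= and/or country= to a URL (hypothetical helper)."""
    params = {}
    if asn is not None:
        params["as"] = str(asn)
    if country is not None:
        params["country"] = country
    parts = urlsplit(url)
    # Keep any query string that is already present on the URL.
    query = "&".join(q for q in (parts.query, urlencode(params)) if q)
    return urlunsplit(parts._replace(query=query))


# Placeholder URL, not a real MirrorBrain instance.
BASE = "http://mirrors.example.org/pub/file.iso"
print(with_selection_override(BASE, asn=680))
print(with_selection_override(BASE, country="DE"))
```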

Just as the mirrorlist is more or less for human admins to see what's
going on, the as= and country= parameters are not meant for machine
clients to technically influence the mirror selection. For that, it
would be more appropriate to override the IP address "detection" in the
first place. The IP address, as looked up by mod_geoip and mod_asn,
could be passed via an *X-Forwarded-For* header, for instance. This
would allow frontend servers to influence the mirror selection
appropriately. mod_geoip already supports this. For mod_asn I plan to
add this in the future.
mod_mirrorbrain just lets mod_asn and mod_geoip do that work and uses
what it finds in Apache's subprocess environment.
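
For illustration only, this is roughly what a frontend would do when
passing the client address along. The sketch builds (but does not send)
a request with Python's standard library; the URL and the helper name
build_forwarded_request are assumptions, and 192.0.2.10 is a
documentation address. Whether the backend honours the header depends on
the module configuration, as described above.

```python
from urllib.request import Request


def build_forwarded_request(url, client_ip):
    """Build a request carrying the original client IP (hypothetical helper).

    The request is only constructed here, not sent.
    """
    return Request(url, headers={"X-Forwarded-For": client_ip})


# Placeholder backend URL and a documentation-range client address.
req = build_forwarded_request("http://mirrors.example.org/pub/file.iso",
                              "192.0.2.10")
# urllib normalizes the capitalization of header names, so compare
# case-insensitively when inspecting them.
headers = {name.lower(): value for name, value in req.header_items()}
print(headers["x-forwarded-for"])
```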

The "mb list" tool has new options to customize what's being
displayed when mirrors are listed, namely:

  --country --region --prefix --as --prio

The "mb file ls" tool can now probe files that were looked up in the
mirror database. So, contrary to "mb probefile", which probes for a
given file on all mirrors, "mb file ls --probe" looks up which mirrors
are known to have a certain file, or a certain list of files matching a
pattern. The --probe switch causes it to probe the file on each mirror,
and the --md5 switch to display the md5 hash of the returned content.
This can be used to check functionality of the mirrors:

  % mb file ls '*i586/ConsoleKit-0.2.10-63.8.i586.rpm' --probe --md5
  as vn  100 ok       ok   fpt.net                         200 140b82137811ee451929f9977266ab73
  eu de   50 ok       ok   widehat.opensuse.org            200 140b82137811ee451929f9977266ab73
  eu hu  100 ok       ok   fsn.hu                          200 140b82137811ee451929f9977266ab73
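
If you want to check such output automatically, a small script can group
the mirrors by the hash they returned. The Python sketch below works on
the three sample lines above. It assumes that the whitespace-separated
columns always appear in this order, with the mirror identifier sixth,
the HTTP status seventh and the md5 hash last; that is my reading of the
sample, not a documented format. The function name group_by_md5 is
likewise made up.

```python
from collections import defaultdict

SAMPLE = """\
as vn  100 ok       ok   fpt.net                         200 140b82137811ee451929f9977266ab73
eu de   50 ok       ok   widehat.opensuse.org            200 140b82137811ee451929f9977266ab73
eu hu  100 ok       ok   fsn.hu                          200 140b82137811ee451929f9977266ab73
"""


def group_by_md5(output):
    """Map each md5 hash to the mirrors that returned it with HTTP 200."""
    groups = defaultdict(list)
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or unexpected lines
        mirror, http_status, md5 = fields[5], fields[6], fields[7]
        if http_status == "200":
            groups[md5].append(mirror)
    return dict(groups)


groups = group_by_md5(SAMPLE)
# More than one distinct hash would mean the mirrors disagree on content.
consistent = len(groups) == 1
print(groups)
print("consistent:", consistent)
```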

"mb new" fills in some data automatically now (AS number and prefix).

A new tarball has been spun and "can be downloaded from the download

Received on Tue Mar 10 2009 - 12:25:43 GMT

This archive was generated by hypermail 2.2.0 : Tue Oct 20 2009 - 15:33:25 GMT