Re: [mirrorbrain] mirror list in plain text?

From: Per Jessen <>
Date: Wed, 16 May 2012 18:55:17 +0200

Peter Pöml wrote:

> Hi Per,
> On 16.05.2012 at 13:47, Per Jessen wrote:
>> is there a (standardized) way of retrieving the mirror list in plain
>> text?  I need this for setting up a URL rewriter for squid which will
>> help me cache segmented downloads.  For instance, I can retrieve the
>> HTML from and parse that
>> HTML quite easily, but I would prefer just getting a plain text file
>> straight from mirrorbrain.
> Do you mean all mirrors? If you have a certain file in mind, then
> appending .meta4 to the file's URL will give you parseable XML. Not
> plain text, though.

Hi Peter

Yep, I mean all mirrors. XML would be fine too, but here is an example - 

from I currently
generate/extract a list like this:

Getting the meta4 file could perhaps have worked, but I need to know the
mirrors before the URL is retrieved - otherwise I can't tell squid to
rewrite the URL when it is stored.
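
For reference, the per-file .meta4 document mentioned above is Metalink XML
(RFC 5854). A minimal sketch of pulling the mirror URLs out of such a
document follows; the sample document, its host names and the function name
are made up for illustration and are not taken from any real mirror list.

```python
import xml.etree.ElementTree as ET

# Metalink 4 XML namespace as defined in RFC 5854.
METALINK_NS = "urn:ietf:params:xml:ns:metalink"

# Hypothetical sample document; the hosts and file name are placeholders.
SAMPLE_META4 = """<metalink xmlns="urn:ietf:params:xml:ns:metalink">
  <file name="example.iso">
    <url location="ch" priority="2">http://mirror-b.example.net/linux/example.iso</url>
    <url location="de" priority="1">http://mirror-a.example.org/pub/example.iso</url>
  </file>
</metalink>
"""


def mirror_urls(meta4_text):
    """Return the <url> entries of a Metalink document, best priority first."""
    root = ET.fromstring(meta4_text)
    found = []
    for url in root.iter("{%s}url" % METALINK_NS):
        # A lower priority number means a more preferred mirror.
        priority = int(url.get("priority", "999999"))
        found.append((priority, url.text.strip()))
    return [text for _, text in sorted(found)]


if __name__ == "__main__":
    for u in mirror_urls(SAMPLE_META4):
        print(u)
```

This only helps once a file URL is known, which is exactly the limitation
described above.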

> The list of *all* mirrors can't be requested directly. It would be
> easy to implement that, but there are some things to keep in mind:
> Not all mirrors have all content, especially with openSUSE there is
> much variation between what the individual mirrors carry.

That is okay - I will use the list to rewrite all <mirror-urls>
to "".  If nothing is requested from a mirror,
there is no URL to rewrite/remap. 
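
A rough sketch of that remapping step, assuming the mirror list is already
available as a list of base URLs. The base URLs, the canonical prefix and the
function name below are placeholders, and the glue that reads requests from
squid and writes the answer back is left out because the helper protocol
depends on the squid version in use.

```python
# Hypothetical mirror base URLs; a real list would come from the mirror list.
MIRROR_BASES = [
    "http://mirror-a.example.org/pub/distro/",
    "http://mirror-b.example.net/linux/distro/",
]

# Placeholder canonical prefix that all mirror URLs are mapped onto.
CANONICAL_BASE = "http://canonical.example.invalid/"


def canonicalize(url, mirror_bases=MIRROR_BASES, canonical_base=CANONICAL_BASE):
    """Map a URL under any known mirror base onto the canonical base.

    URLs that do not belong to a known mirror are returned unchanged.
    """
    # Try longer bases first so the most specific prefix wins.
    for base in sorted(mirror_bases, key=len, reverse=True):
        if url.startswith(base):
            return canonical_base + url[len(base):]
    return url


if __name__ == "__main__":
    print(canonicalize("http://mirror-a.example.org/pub/distro/iso/file.iso"))
    print(canonicalize("http://unrelated.example.com/file.iso"))
```

Two requests for the same file on different mirrors then map to the same
string, so the cache can treat them as one object.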

> Some mirrors might want to remain private - which is the case for some
> mirrors located in countries with poor international connectivity,
> where requests from outside the country need to be avoided. There is
> already a hack in the "mb mirrorlist" command (which generates also
> to exclude such mirrors
> from the listing. That might not be relevant in your case - I don't
> know if the URL rewriter could be deployed in a country with such a
> mirror.

I don't think it is important - what I do is "react" to requests that
have already been made, so if the current setup works wrt the above, I
don't see my rewriting (plus some other trickery) affecting anything. 

> The data you want to retrieve is the base URL of the mirrors, or
> anything else?

No, that's it, just the base. 

> With the latest MirrorBrain (newer than what is deployed on
>, mirrors are also listed in HTTP headers on requesting a
> file (Link headers, RFC 6249). Maybe that would be convenient too. A
> head request would be sufficient to get a list of mirrors. (That list
> is limited to 5 entries at the moment.)
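
RFC 6249 marks mirror copies with Link headers carrying rel=duplicate, plus
optional pri and geo parameters. A minimal sketch of extracting them from the
Link header values of a response; the sample values and the function name are
invented for illustration, and the parser assumes one link per header value.

```python
# Hypothetical Link header values in the style of RFC 6249.
SAMPLE_LINKS = [
    "<http://mirror-b.example.net/linux/example.iso>; rel=duplicate; pri=2; geo=ch",
    "<http://mirror-a.example.org/pub/example.iso>; rel=duplicate; pri=1; geo=de",
    '<http://example.org/example.iso.meta4>; rel=describedby; type="application/metalink4+xml"',
]


def duplicate_links(link_values):
    """Return the rel=duplicate targets, best priority (lowest pri) first."""
    mirrors = []
    for value in link_values:
        # Split "<target>; param=value; ..." into the target and its parameters.
        target, _, params = value.partition(">")
        target = target.strip().lstrip("<")
        attrs = {}
        for part in params.split(";"):
            key, sep, val = part.partition("=")
            if sep:
                attrs[key.strip().lower()] = val.strip().strip('"')
        if attrs.get("rel") == "duplicate":
            mirrors.append((int(attrs.get("pri", "999999")), target))
    return [target for _, target in sorted(mirrors)]


if __name__ == "__main__":
    for m in duplicate_links(SAMPLE_LINKS):
        print(m)
```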

Interesting possibility, although I can't quite tell if it would be

> BTW, I noticed a GSOC project that might share a similar goal with
> yours, but with another proxy:

Also interesting, thanks.  I've got a working setup already, I'm just
dealing with the rough edges now :-)

Per Jessen, Zürich (7.6°C)

Received on Wed May 16 2012 - 16:55:32 GMT
