Peter Pöml wrote:

> Hi Per,
>
> On 16.05.2012 at 13:47, Per Jessen wrote:
>> is there a (standardized) way of retrieving the mirror list in plain
>> text? I need this for setting up a URL rewriter for squid which will
>> help me cache segmented downloads. For instance, I can retrieve the
>> HTML from http://mirrors.opensuse.org/list/all.html and parse that
>> HTML quite easily, but I would prefer just getting a plain text file
>> straight from mirrorbrain.
>
> Do you mean all mirrors? If you have a certain file in mind, then
> appending .meta4 to the file's URL will give you parseable XML. Not
> plain text, though.

Hi Peter

Yep, I mean all mirrors. XML would be fine too, but here is an example -
from http://mirrors.opensuse.org/list/all.html I currently
generate/extract a list like this:

http://repo.ugm.ac.id/opensuse/
http://dl2.foss-id.web.id/opensuse/
http://mirror.isoc.org.il/pub/opensuse/
http://ftp.jaist.ac.jp/pub/Linux/openSUSE/
http://ftp.kddilabs.jp/Linux/packages/opensuse/
http://ftp.novell.co.jp/pub/opensuse/
http://ftp.riken.jp/Linux/opensuse/
http://ftp.yz.yamagata-u.ac.jp/pub/linux/opensuse/
http://ftp.daum.net/opensuse/
http://ftp.kaist.ac.kr/pub/opensuse/
http://archive.mmu.edu.my/opensuse

Getting the meta4 file could perhaps have worked, but I need to know the
mirrors before the URL is retrieved - otherwise I can't tell squid to
rewrite the URL when it is stored.

> The list of *all* mirrors can't be requested directly. It would be
> easy to implement that, but there are some things to keep in mind:
>
> Not all mirrors have all content, especially with openSUSE there is
> much variation between what the individual mirrors carry.

That is okay - I will use the list to rewrite all <mirror-urls> to
"download.opensuse.org". If nothing is requested from a mirror, there is
no URL to rewrite/remap.
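
As a rough illustration of the remapping described above, here is a minimal Python sketch of the prefix-matching step only. It is not the actual rewriter used in this setup: the function names, the `CANONICAL_BASE` constant and the `http://` scheme on download.opensuse.org are assumptions made for the example, and the squid helper I/O loop is deliberately left out.

```python
from typing import Iterable, Optional

# Assumption: every mirror URL is remapped onto this one canonical base.
CANONICAL_BASE = "http://download.opensuse.org/"


def normalize_base(base: str) -> str:
    """Return the mirror base URL with exactly one trailing slash."""
    return base.rstrip("/") + "/"


def rewrite_url(url: str, mirror_bases: Iterable[str]) -> Optional[str]:
    """Map a mirror URL onto CANONICAL_BASE.

    Returns None when the URL does not start with any known mirror base,
    in which case there is nothing to rewrite.
    """
    # Try the longest bases first so the most specific prefix wins.
    bases = sorted((normalize_base(b) for b in mirror_bases), key=len, reverse=True)
    for base in bases:
        if url.startswith(base):
            # Keep the path below the mirror base, swap the base itself.
            return CANONICAL_BASE + url[len(base):]
    return None


if __name__ == "__main__":
    mirrors = [
        "http://ftp.kaist.ac.kr/pub/opensuse/",
        "http://archive.mmu.edu.my/opensuse",
    ]
    print(rewrite_url("http://ftp.kaist.ac.kr/pub/opensuse/some/file.rpm", mirrors))
```

A request that was never sent to a mirror simply yields None here, which matches the point above: if nothing is requested from a mirror, there is no URL to remap.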

> Some mirrors might want to remain private - which is the case for some
> mirrors located in countries with poor international connectivity,
> where requests from outside the country need to be avoided. There is
> already a hack in the "mb mirrorlist" command (which also generates
> http://mirrors.opensuse.org/list/all.html) to exclude such mirrors
> from the listing. That might not be relevant in your case - I don't
> know if the URL rewriter could be deployed in a country with such a
> mirror.

I don't think it is important - what I do is "react" to requests that
have already been made, so if the current setup works wrt the above, I
don't see my rewriting (plus some other trickery) affecting anything.

> The data you want to retrieve is the base URL of the mirrors, or
> anything else?

No, that's it, just the base.

> With the latest MirrorBrain (newer than what is deployed on
> openSUSE.org), mirrors are also listed in HTTP headers on requesting a
> file (Link headers, RFC 6249). Maybe that would be convenient too. A
> HEAD request would be sufficient to get a list of mirrors. (That list
> is limited to 5 entries at the moment.)

Interesting possibility, although I can't quite tell if it would be
useful.

> BTW, I noticed a GSOC project that might share a similar goal with
> yours, but with another proxy:
> http://www.google-melange.com/gsoc/proposal/review/google/gsoc2012/nottheoilrig/1

Also interesting, thanks. I've got a working setup already, I'm just
dealing with the rough edges now :-)

-- 
Per Jessen, Zürich (7.6°C)

_______________________________________________
mirrorbrain mailing list
Archive: http://mirrorbrain.org/archive/mirrorbrain/

Note: To remove yourself from this mailing list, send a mail with the
content unsubscribe to the address mirrorbrain-request_at_mirrorbrain.org

Received on Wed May 16 2012 - 16:55:32 GMT
This archive was generated by hypermail 2.3.0 : Fri May 18 2012 - 21:47:02 GMT