[mirror discuss] Re: Mirror selection for improved buffer cache utilization in mirrors of equal proximity and capacity (was: Re: Download problem)

From: Eberhard Moenkeberg <emoenke_at_gwdg.de>
Date: Sat, 25 Sep 2010 23:35:32 +0200 (CEST)
Hi Peter,

On Sat, 25 Sep 2010, Peter Pöml wrote:
> On 08.09.2010, at 22:00, Eberhard Moenkeberg wrote:

>> Can mirrorbrain test latencies?
>
> Not properly. The check that is done every minute doesn't pay much 
> attention to the response time; it just assesses whether a mirror 
> responded within a fixed time (usually 20s). However, it would be 
> straightforward to extend the script to do more with the response 
> time, like saving a history, plotting it, or pulling some trigger when 
> the response time suddenly grows.

Mirrorbrain could keep a "latency table" to see changes against yesterday's 
values.

Today bandwidth is usually not the biggest problem (at least in Europe); 
latency is. At light speed, a signal covers about 200 km per millisecond, 
a time in which a gigabit link transports about 120 kbytes.

So latency is always higher for far-away connections and lower for 
nearby ones. Mirrorbrain would need a "normal value" for each mirror, 
from its own point of view, to decide whether the current situation is 
out of the ordinary.
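
As an illustration, a minimal Python sketch of such a check (the
threshold factor and function names are just assumptions; a real version
would persist the per-mirror baselines):

  import time
  import urllib.request

  def measure_latency(url, timeout=20):
      # Time a simple HTTP probe; 20s matches the monitor's fixed limit.
      start = time.time()
      urllib.request.urlopen(url, timeout=timeout).close()
      return time.time() - start

  def out_of_normal(url, baseline, factor=2.0):
      # Compare against the mirror's previously observed "normal value".
      return measure_latency(url) > factor * baseline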

> I never made a directed effort to assess latencies, other than 
> subjective experimenting. I generally think that although latency is 
> easy to test, the interpretation could be difficult. The distance to the 
> tested mirror (from the point where the test is run) would play a role, 
> and the results would need to be related to it. Or one might measure the 
> latency relative to the "usual" (previously observed) latency. But 
> that's again also dependent on other factors. I don't have an easy 
> recipe, although I like the idea.

Yes, good.

>> For the efficiency of the cache effect, it would be good if mirrorbrain 
>> could, for each file request, prefer a server to which this request 
>> recently already got redirected. Is that possible?
>
> It could be done. Christoph Thiel, my predecessor and former colleague 
> at SUSE, also suggested it some years ago. The idea was to make better 
> use of the mirrors' buffer caches.

Yes, I have added Christoph to cc:.

> I brainstormed a little bit and came up with the following conclusions.
>
> It seems definitely possible, and if I'm right, actually feasible.
>
> It is principally an old problem to optimize the utilization of caches 
> which are located between users and servers. The goal is to distribute 
> requests equally into "buckets", close both to the users and to the 
> servers. Ideally in a stable fashion, so that if one bucket is added, or 
> one drops out, not all caches are flushed by a redistribution.
>
> Characteristically, the classic problem concerns a large number of 
> cached objects (say, web pages), compared to a relatively small number 
> of caches (intermediaries). In today's mirroring, we often have a few 
> particular files (the hot spots) that are comparably large but few in 
> number (let's say 2 OOo files, or 5 Linux iso images), and 1-10 
> mirrors per country. However, the most obvious approach, a simple hash 
> function that distributes the requests, won't work here. Why not? I'll 
> demonstrate below.
>
>
> First, there are conditions that must be met.
>
> 1) The mirrors considered must be in *equal proximity* to the user. It 
> would obviously not make much sense to send all requests for DVD #1 to 
> national mirrors, whereas DVD #2 is served from another continent.
>
> 2) The mirrors considered must have *equal capacity*. If there are two 
> mirrors nearby, but one of them has 10x the bandwidth, the requests can't 
> be split fifty-fifty. If there are two DVDs to be served, it wouldn't be 
> good to serve one from a strong mirror and one from a much weaker one.
>
> 3) If all the users are after just *one* file, the whole idea doesn't 
> apply.
>
> 4) The files to be distributed intelligently must be similar in size and 
> popularity. It would not work to send all requests for DVDs to mirror A 
> and all requests for the release notes to mirror B...
>
> 5) It is worth asking how many mirrors are really disk-bound in their 
> operation, and how many are rather bandwidth-bound. Maybe the situation 
> of some of the largest mirrors, whose bottleneck is disk throughput, is 
> not representative for everyone. There are mirrors, like in India for 
> instance, where sheer bandwidth seems to be the biggest bottleneck. So, 
> the first question is: what percentage of mirrors would benefit from 
> better buffer cache utilization at all? Is it worth putting time into 
> this?
>
>
> Implications and limitations:
>
> - Currently, mirror selection tries to alternate between available 
> mirrors; this means users have a certain chance to work around a broken 
> or slow mirror simply by repeating their request. The technique proposed 
> here would prevent that: mirror selection would be stuck to one mirror, 
> or to a smaller set of mirrors, and users would have no chance to get to 
> a better mirror with the next request.
>
> - Assuming the method works as designed, there can still be requests 
> that "disturb" the cache: like people accessing the mirror directly, and 
> "local" requests that are rightfully routed to a certain mirror because 
> they originate from the network or the autonomous system of that mirror. 
> Depending on the frequency of such requests, they might kill the whole 
> effort. I am not sure about the effect of these requests; it probably 
> depends on their absolute frequency.
>
> + Even if only a small effect is achieved in the attempt to split up the 
> requests, some mirrors will benefit very much from it.
>
> + Byterange requests (partial GETs), which can reach mirrors in random 
> order, are alleviated, because it is obviously better for mirrors if 
> they are limited to fewer files.
>
> + Intermediary proxy caches also benefit from the technique, 
> automatically, and for free.
>
>
> Now, how to go about it?
>
> * The technique should not require too much manual tweaking. Ideally, 
> none. A need to configure manually which files are redirected to which 
> mirror would not be very usable.
>
> * Obviously, mirrors that are considered for splitting requests up 
> between them need to be "grouped": mirrors of similar location (same 
> country) and similar capacity (same priority) can be put together into a 
> group (see the grouping sketch after this list). Inside such a group, 
> there is room for the "fanciness".
>
> * A classic mechanism to distribute requests would be to calculate a 
> quick hash of the URL or filename, and take the modulo (remainder) of a 
> division by the number of eligible mirrors. However, this alone will 
> result in an uneven distribution among the mirrors, up to the extreme 
> that if there is only one really popular file, all requests would be 
> sent to a single mirror, instead of to all mirrors. Or, considering 2 
> popular files and 7 mirrors, only 2 mirrors would get requests, as the 
> snippet after this list demonstrates. Thus, the usual "hash % number of 
> buckets" algorithm is not suitable. (It would have been easy to just 
> calculate "inode % # mirrors" and be done...)
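>
> A minimal sketch of the grouping step (assuming each mirror is a dict 
> with "country" and "priority" keys -- hypothetical field names):
>
>   from collections import defaultdict
>
>   def group_mirrors(mirrors):
>       # Mirrors of the same location and capacity form one group;
>       # request splitting is then applied within each group.
>       groups = defaultdict(list)
>       for m in mirrors:
>           groups[(m["country"], m["priority"])].append(m)
>       return groups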
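>
> And a tiny demonstration of why the naive scheme fails (file names are 
> made up):
>
>   import zlib
>
>   def naive_bucket(filename, n_mirrors):
>       # Classic "hash % buckets": all requests for the same file
>       # always land on the same single mirror.
>       return zlib.crc32(filename.encode()) % n_mirrors
>
>   # With 2 hot files and 7 mirrors, at most 2 of the 7 mirrors ever
>   # receive requests; the other 5 sit idle.
>   print({naive_bucket(f, 7) for f in ("dvd1.iso", "dvd2.iso")})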
>
>
> I think it is illustrative to look at the different constellations that 
> need to be handled. In any case, there is a number of files (for which 
> requests are to be split up) and a number of mirrors (that fall into the 
> same group of geo-proximity and capacity):
>
> - With only 1 file, the exercise is pointless.
>
> - With only 1 mirror, the exercise is pointless as well.
>
> - With 2 files, and an even number of mirrors, a perfect distribution is 
> possible.
>
> - With 2 files, and an odd number of mirrors, the requests can be split 
> between n-1 mirrors, leaving one mirror unconsidered, which should 
> receive a fair share of requests for all files.
>
> - With 3 files it's getting a little more complicated:
> With 2 mirrors, 30% of buffer cache can be saved by splitting 2 files and 
> sending the 3rd file to both mirrors. With 3 mirrors, we have the ideal 
> case where every mirror gets requests for one file. With 4 mirrors, we 
> need to ensure that mirror #4 does not get 3 times as many requests as 
> the other mirrors; 1/4 (one fourth) of the requests needs to go to this 
> "remaining" mirror. With 5 mirrors, it's principally the same as with 4 
> mirrors: split requests between 3 of them, and give the remaining two a 
> fair share of the requests (two fifths = 40%). With 6 mirrors, we have 
> an "ideal case" again. With 7 mirrors, ... you get the pattern by now:
>
> 7 mirrors % 3 files = 1
>
> which can be transcribed by stating that 1 of the 7 mirrors needs to get 
> 1/7 of the requests, while the rest of the requests are distributed 
> round-robin to the other 6 mirrors.
>
> Stable round-robin distribution to a "fitting" number of mirrors is 
> straightforward if the mirrors are ordered by some internal id that 
> doesn't change.
>
>
> So to cast that into pseudo code, the mirror selection could be 
> implemented like this:
>
>        nF = number of files
>        nM = number of mirrors
>         f = index of the file in question (integer between 0 and nF-1)
>  selected = index of the selected mirror (integer between 0 and nM-1)
>
> selected = int( rand(0..1) * (nM/nF) + f * (nM/nF) )
>
> This simple and light-weight computation does the job (note that nM/nF 
> is meant as real, non-truncating division). It is not even necessary to 
> take separate care of the remaining mirrors (nR = nM % nF), because they 
> are dealt with automatically.
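>
> For illustration, the same selection rule as a minimal, runnable Python 
> sketch (just the formula above, not the attached script):
>
>   import random
>
>   def select_mirror(f, n_files, n_mirrors):
>       # Requests for file f go to a contiguous band of
>       # n_mirrors/n_files mirrors; fractional bands overlap, so
>       # "remaining" mirrors get their fair share automatically.
>       band = n_mirrors / float(n_files)
>       return int(random.random() * band + f * band)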
>
>
> To see if this pencil-and-paper exercise is workable, I wrote a little 
> script. It is attached to this mail. It runs random "requests" to a 
> defined number of files, and uses the above algorithm to assign them to 
> a defined number of mirrors.
>
> It appears to work wonderfully to me.
>
> Below, you see the output of the attached test script with various 
> parameters - let's start with 1 file. The output is as expected. (Sorry, 
> no colourful diagrams ;-)
>
> % ./ra.py --files 1 --mirrors 1
> 100000 requests (100%) to mirror0 for file0
> % ./ra.py --files 1 --mirrors 2
> 50095 requests (50%) to mirror0 for file0
> 49905 requests (49%) to mirror1 for file0
> % ./ra.py --files 1 --mirrors 3
> 33459 requests (33%) to mirror0 for file0
> 33170 requests (33%) to mirror1 for file0
> 33371 requests (33%) to mirror2 for file0
>
> With 2 files:
>
> % ./ra.py --files 2 --mirrors 2
> 50237 requests (50%) to mirror0 for file0
> 49763 requests (49%) to mirror1 for file1
> % ./ra.py --files 2 --mirrors 3
> 33329 requests (33%) to mirror0 for file0
> 33381 requests (33%) to mirror1 for file1, file0
> 33290 requests (33%) to mirror2 for file1
> % ./ra.py --files 2 --mirrors 4
> 25081 requests (25%) to mirror0 for file0
> 24968 requests (24%) to mirror1 for file0
> 25054 requests (25%) to mirror2 for file1
> 24897 requests (24%) to mirror3 for file1
> % ./ra.py --files 2 --mirrors 5
> 19851 requests (19%) to mirror0 for file0
> 20040 requests (20%) to mirror1 for file0
> 20008 requests (20%) to mirror2 for file1, file0
> 20019 requests (20%) to mirror3 for file1
> 20082 requests (20%) to mirror4 for file1
> % ./ra.py --files 2 --mirrors 6
> 16645 requests (16%) to mirror0 for file0
> 16808 requests (16%) to mirror1 for file0
> 16804 requests (16%) to mirror2 for file0
> 16759 requests (16%) to mirror3 for file1
> 16554 requests (16%) to mirror4 for file1
> 16430 requests (16%) to mirror5 for file1
>
> Perfect!
>
> Now, how does it cope with 3 files?
>
> % ./ra.py --files 3 --mirrors 1
> 100000 requests (100%) to mirror0 for file2, file1, file0
> % ./ra.py --files 3 --mirrors 2
> 49806 requests (49%) to mirror0 for file1, file0
> 50194 requests (50%) to mirror1 for file2, file1
> % ./ra.py --files 3 --mirrors 3
> 33284 requests (33%) to mirror0 for file0
> 33323 requests (33%) to mirror1 for file1
> 33393 requests (33%) to mirror2 for file2
> % ./ra.py --files 3 --mirrors 4
> 25095 requests (25%) to mirror0 for file0
> 25024 requests (25%) to mirror1 for file1, file0
> 24912 requests (24%) to mirror2 for file2, file1
> 24969 requests (24%) to mirror3 for file2
> % ./ra.py --files 3 --mirrors 5
> 20026 requests (20%) to mirror0 for file0
> 20102 requests (20%) to mirror1 for file1, file0
> 19871 requests (19%) to mirror2 for file1
> 19991 requests (19%) to mirror3 for file2, file1
> 20010 requests (20%) to mirror4 for file2
> % ./ra.py --files 3 --mirrors 6
> 16613 requests (16%) to mirror0 for file0
> 16626 requests (16%) to mirror1 for file0
> 16598 requests (16%) to mirror2 for file1
> 16876 requests (16%) to mirror3 for file1
> 16743 requests (16%) to mirror4 for file2
> 16544 requests (16%) to mirror5 for file2
> % ./ra.py --files 3 --mirrors 7
> 14251 requests (14%) to mirror0 for file0
> 14125 requests (14%) to mirror1 for file0
> 14257 requests (14%) to mirror2 for file1, file0
> 14447 requests (14%) to mirror3 for file1
> 14308 requests (14%) to mirror4 for file2, file1
> 14252 requests (14%) to mirror5 for file2
> 14360 requests (14%) to mirror6 for file2
>
> Fine. This is as good as it can get, and it saves buffer cache on all mirrors.
>
> And with 4 files?
>
> % ./ra.py --files 4 --mirrors 1
> 100000 requests (100%) to mirror0 for file3, file2, file1, file0
> % ./ra.py --files 4 --mirrors 2
> 49845 requests (49%) to mirror0 for file1, file0
> 50155 requests (50%) to mirror1 for file3, file2
> % ./ra.py --files 4 --mirrors 3
> 33213 requests (33%) to mirror0 for file1, file0
> 33212 requests (33%) to mirror1 for file2, file1
> 33575 requests (33%) to mirror2 for file3, file2
> % ./ra.py --files 4 --mirrors 4
> 24978 requests (24%) to mirror0 for file0
> 25034 requests (25%) to mirror1 for file1
> 25137 requests (25%) to mirror2 for file2
> 24851 requests (24%) to mirror3 for file3
> % ./ra.py --files 4 --mirrors 5
> 20206 requests (20%) to mirror0 for file0
> 19819 requests (19%) to mirror1 for file1, file0
> 20070 requests (20%) to mirror2 for file2, file1
> 20103 requests (20%) to mirror3 for file3, file2
> 19802 requests (19%) to mirror4 for file3
> % ./ra.py --files 4 --mirrors 6
> 16518 requests (16%) to mirror0 for file0
> 16533 requests (16%) to mirror1 for file1, file0
> 16755 requests (16%) to mirror2 for file1
> 16761 requests (16%) to mirror3 for file2
> 16602 requests (16%) to mirror4 for file3, file2
> 16831 requests (16%) to mirror5 for file3
>
> Excellent!
>
> And with 5 files (is anyone still following this far?):
>
> % ./ra.py --files 5 --mirrors 1
> 100000 requests (100%) to mirror0 for file3, file2, file1, file0, file4
> % ./ra.py --files 5 --mirrors 2
> 50059 requests (50%) to mirror0 for file2, file1, file0
> 49941 requests (49%) to mirror1 for file3, file2, file4
> % ./ra.py --files 5 --mirrors 3
> 33404 requests (33%) to mirror0 for file1, file0
> 33249 requests (33%) to mirror1 for file3, file2, file1
> 33347 requests (33%) to mirror2 for file3, file4
> % ./ra.py --files 5 --mirrors 4
> 24904 requests (24%) to mirror0 for file1, file0
> 25022 requests (25%) to mirror1 for file2, file1
> 25109 requests (25%) to mirror2 for file3, file2
> 24965 requests (24%) to mirror3 for file3, file4
> % ./ra.py --files 5 --mirrors 5
> 20192 requests (20%) to mirror0 for file0
> 19836 requests (19%) to mirror1 for file1
> 20202 requests (20%) to mirror2 for file2
> 19869 requests (19%) to mirror3 for file3
> 19901 requests (19%) to mirror4 for file4
> % ./ra.py --files 5 --mirrors 6
> 16880 requests (16%) to mirror0 for file0
> 16654 requests (16%) to mirror1 for file1, file0
> 16731 requests (16%) to mirror2 for file2, file1
> 16593 requests (16%) to mirror3 for file3, file2
> 16592 requests (16%) to mirror4 for file3, file4
> 16550 requests (16%) to mirror5 for file4
>
> Fantastic.
>
> To my surprise, it seems to work beautifully even for larger numbers of 
> files! Or am I overshooting the mark now? See:
>
> % ./ra.py --files 100 --mirrors 5
> 20083 requests (20%) to mirror0 for file3, file12, file1, file10, file7, file15, file5, file4, file8, file11, file9, file18, file16, file0, file6, file13, file17, file2, file19, file14
> 19777 requests (19%) to mirror1 for file32, file23, file27, file35, file36, file34, file39, file38, file28, file29, file24, file31, file22, file30, file20, file21, file26, file33, file37, file25
> 19823 requests (19%) to mirror2 for file57, file59, file56, file47, file40, file41, file42, file43, file44, file45, file46, file58, file48, file49, file55, file54, file53, file52, file51, file50
> 20059 requests (20%) to mirror3 for file75, file63, file73, file67, file65, file61, file64, file66, file74, file77, file76, file62, file70, file60, file72, file71, file79, file78, file68, file69
> 20258 requests (20%) to mirror4 for file87, file94, file80, file82, file96, file93, file99, file98, file88, file89, file86, file92, file84, file85, file91, file90, file97, file81, file95, file83
>
>
> It seems fair and efficient.
>
> When mirrors drop out or are added, the distribution changes -- it is 
> not stable against such changes, but I suppose we can live with that. 
> (After all, we live with complete entropy at present!)
>
> This all works only under a further condition: files that should be 
> treated this way need to be indexed in some way. Maybe one would want to 
> apply the special handling only to certain "hotspots" in the tree, like 
> freshly released DVDs of a new distro. Practically, these files could be 
> (automatically) assigned an integer index during configuration. That 
> would be the missing number "f" in the formula. Or it could be the n-th 
> file inside a directory. That's an implementation-specific detail, I 
> guess. I don't know about other redirectors, but for MirrorBrain I would 
> have a rough idea how to solve it in an easy way.
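>
> One hypothetical way to derive the index "f" from the position of the 
> file within its sorted directory (purely an illustration, not how 
> MirrorBrain does it):
>
>   import os
>
>   def file_index(path):
>       # The n-th file inside its directory, in stable sorted order.
>       directory, name = os.path.split(path)
>       return sorted(os.listdir(directory)).index(name)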
>
> Tagging only certain "hotspot" files for this extra treatment would 
> allow sticking to a simpler mechanism for the mass of other files. 
> Because, not to forget, for the buffer-cache-friendly request splitting, 
> the mirrors need to be grouped in a sensible way. Which is not a big 
> task in itself, but the little things add up. In the picture that forms 
> before my eyes, I would expect the code not to be very costly, though.

Peter, you simply are one of the greatest. Work on, work on...

Archimedes' last words were: "noli tangere circulos meos" (don't touch my 
circles - he had drawn some geometric figures in the sand and was 
thinking about them) - take that as an impulse to go on.


Viele Gruesse
Eberhard Moenkeberg (emoenke_at_gwdg.de, em_at_kki.org)

-- 
Eberhard Moenkeberg
Arbeitsgruppe IT-Infrastruktur
E-Mail: emoenke_at_gwdg.de      Tel.: +49 (0)551 201-1551
-------------------------------------------------------------------------
Gesellschaft fuer wissenschaftliche Datenverarbeitung mbH Goettingen (GWDG)
Am Fassberg 11, 37077 Goettingen
URL:    http://www.gwdg.de             E-Mail: gwdg_at_gwdg.de
Tel.:   +49 (0)551 201-1510            Fax:    +49 (0)551 201-2150
Geschaeftsfuehrer:         Prof. Dr. Oswald Haan und Dr. Paul Suren
Aufsichtsratsvorsitzender: Dipl.-Kfm. Markus Hoppe
Sitz der Gesellschaft:     Goettingen
Registergericht:           Goettingen  Handelsregister-Nr. B 598
-------------------------------------------------------------------------

_______________________________________________
discuss mailing list
Archive: http://mirrorbrain.org/archive/discuss/

Note: To remove yourself from this mailing list, send a mail with the content
 	unsubscribe
to the address discuss-request_at_mirrorbrain.org
Received on Sat Sep 25 2010 - 22:11:10 GMT
