Re: [mirrorbrain] error with multiple ip addresses

From: Peter Poeml <poeml_at_cmdline.net>
Date: Mon, 21 Dec 2009 03:56:41 +0100
Hi,

On Sat, Nov 28, 2009 at 08:53:03AM -0600, dfarning_at_sugarlabs.org wrote:
> On Fri, Nov 27, 2009 at 1:35 PM, Peter Pöml <poeml_at_cmdline.net> wrote:
> >Hi David, hi Marten,
> >
> >Am 12.10.2009 um 22:56 schrieb David Farning:
> >
> >>Peter,
> >>Here is an interesting error.  The following site resolves to multiple
> >>IP addresses and causes mb new to throw an error.
> >
> >Sorry for late reply.
> 
> It looks like you are catching up after your vacation:)
> 
> I am also ccing this to systems_at_sl.o so there is a record of this thread in the archive.
> 
> >There are two separate issues to note here.
> >
> >>mb new nluug.nl -H http://ftp.nluug.nl/pub/os/Linux/distr/Sugar -F
> >>ftp://ftp.nluug.nl/pub/os/Linux/distr/Sugar
> >>warning: 'ftp.nluug.nl' resolves to a multiple IP addresses:
> >>192.87.102.42, 192.87.102.43
> >
> >First, this mirror's hostname resolves to two addresses, which means
> >that the clients use either of the addresses at random (DNS round-robin).
> >That's fine for load-balancing in general, but in our case it matters
> >whether the content on the two replica servers is identical. If the content
> >on the two servers behind the URL differs, then it is a problem. The file
> >list that the scanner sees could be different at each run, and a client that
> >is redirected might run into a "file not found".
> >
> >Having said that, with some mirrors this doesn't pose a problem, for
> >instance when the file tree is mounted from a shared network storage, so it
> >is in fact a single tree. It may also not be a problem if the content is
> >synced between two storage systems really really quickly and reliably (also
> >depending on the frequency of changes in the tree at all). But in many cases
> >that I have seen there are actually two (or more) different servers, with
> >different storage systems, sometimes different admins. Synchronization can
> >lag, or break.
> >
> >So, DNSrr is not necessarily a problem, but experience shows that it's a
> >source of trouble and best avoided.
> >
> >Therefore, "mb new" spits out the warning that the hostname ftp.nluug.nl
> >resolves to more than one address.
> >
> >It is best to add such a site as two separate mirrors, by using the "direct"
> >hostname that resolves to the individual IP addresses. There usually is an
> >individual hostname; if not, the IP itself could be used. The admins won't
> >mind since the load balancing is happening anyway.
> >
> >Thus, you would do:
> >
> >mb new ftp1.nluug.nl -H 'http://ftp1.nluug.nl/pub/os/Linux/distr/Sugar' -F
> >...
> >mb new ftp2.nluug.nl -H 'http://ftp2.nluug.nl/pub/os/Linux/distr/Sugar' -F
> >...
> >
> >
> >As a consequence, the mirrors will be scanned separately, and the content of
> >each be maintained in the database. For mirrors that have one actual storage
> >that may be unwanted, but if that's sure (because the admins confirm this)
> >then the mirror could still be entered as "one". As you did so far. That's
> >why "mb new" issues only a warning.
> >
> >As another consequence, the priority of the mirrors should be adjusted,
> >because if nluug.nl appears to the redirector as two mirrors, it'll get more
> >requests, and thus it makes sense to set the priority to e.g. 70 instead of
> >100. (The best number will depend on the total number of mirrors.)
> >
> >In the case of nluug.nl, I had them as two mirrors in the openSUSE database
> >because there was actually a case where inconsistencies had been seen (which
> >is not to blame nluug.nl -- such things happen). In fact, in the beginning
> >no problem was anticipated, because the two boxes were a cluster with shared
> >storage. (And, from my notes, there was also ftp.surfnet.nl running from the
> >same storage.) But eventually problems did occur, and inconsistencies led
> >to clients getting 403s (forbidden) when they reached a particular server.
> >This shows that difficulties are not necessarily in differences in the file
> >tree. I ended up scanning and monitoring each server individually; no
> >problem since then.
> >
> >The fact that mirror monitoring takes place for each server is an additional
> >advantage. DNSrr itself doesn't deal with failing hosts; DNS happily keeps
> >resolving to dead addresses.
> 
> Thanks for the complete answer.

I just realized more side effects of DNSrr'ed mirrors.
(For MirrorBrain and similar mirror redirectors.)

I have a Vietnamese mirror that is run on two machines, and there's a
single hostname that resolves to their IP addresses. It was treated like 
one mirror and we used the hostname for HTTP and FTP access (the latter
for scanning). For the past few days, one of the two boxes has been
reachable, and one hasn't.

Now, the following happened:

- the mirror probe gets random results when checking if the mirror is
  online, depending on which one it tries. Thus, it flips the mirror on
  and off.

- the clients have a 50% chance to either not be redirected to the
  mirror at all, since it's flagged offline, or they get redirected to
  it, when it happens to be flagged online.

- if a client is sent to the mirror, there's a 50% chance that it runs
  into a timeout, and a 50% chance that it gets a reply.

- the scanner has a 50% chance of reaching the mirror through FTP; if it
  doesn't reach it via FTP, it will try via HTTP (as fallback), from
  where it'll alternate between running into timeouts and getting replies
  (because the HTTP doesn't seem to use Keepalive).
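
The flapping described above goes away once each address is probed on
its own rather than through the shared hostname. A minimal sketch in
Python, assuming that a plain TCP connect is a good enough liveness
check; probe_address and probe_all are made-up names (not part of
MirrorBrain), and any addresses passed in are up to the caller:

```python
import socket


def probe_address(address, port=80, timeout=5.0,
                  connect=socket.create_connection):
    """Return True if a TCP connection to address:port succeeds in time.

    `connect` is injectable only so the function can be exercised
    without network access; by default it is socket.create_connection.
    """
    try:
        conn = connect((address, port), timeout)
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False
    conn.close()
    return True


def probe_all(addresses, **kwargs):
    """Probe every address separately and report one result per address."""
    return {address: probe_address(address, **kwargs)
            for address in addresses}
```

With one box down, probe_all() would report True for one address and
False for the other on every run, instead of a random answer for the
shared name.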


I actually only noticed this because the full scan of all mirrors
sometimes took 2 hours instead of 15 minutes over the last few days.

Of course, none of this is really acceptable.

To fix the situation for this mirror, I created separate entries in the
database for it, using the IP address to access the two hosts, which
luckily works with the Apache virtual host setup in this case. This is
also how we fixed it in all previous cases. (It's mostly not required to
use IP addresses, because usually the hosts already have address records
with which they can be accessed directly. In this case, I wanted a fix
first, and I'll later inquire whether there are separate names, or
arrange for them.)
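
For reference, accessing one host by IP address while still selecting
the right Apache virtual host boils down to sending the original name
in the Host header. A sketch using Python's http.client;
head_via_address is a hypothetical helper, not MirrorBrain code, and
the hostname, address and path are whatever the caller supplies:

```python
import http.client


def head_via_address(address, hostname, path, port=80, timeout=10.0,
                     connection_factory=http.client.HTTPConnection):
    """Send a HEAD request to one specific address and return the status.

    The TCP connection goes to `address`, but the Host header carries
    `hostname`, so a name-based virtual host can still be selected.
    `connection_factory` is injectable only to allow testing offline.
    """
    conn = connection_factory(address, port, timeout=timeout)
    try:
        # An explicit Host header replaces the one http.client would
        # otherwise derive from the connection address.
        conn.request("HEAD", path, headers={"Host": hostname})
        return conn.getresponse().status
    finally:
        conn.close()
```

Whether this works for a given mirror depends on its server
configuration; a host with its own address record makes the Host
header trick unnecessary.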


I'm thinking about how best to prevent these problems. I wonder if (and
how) one could deal with this automatically. At the moment, I tend to
think that round-robin DNS mirrors are generally best treated as
separate mirrors.

In all instances in the past, it was easy enough to work around once
one knew about the problem. However, it was mostly discovered only
_when_ problems occurred. When creating a new mirror, MirrorBrain gives a
warning when it sees multiple IP addresses, but there is always the
possibility that a mirror becomes DNSrr'd later.
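
Such a later change could in principle be caught by re-resolving each
mirror's hostname regularly and comparing the result with what was seen
before. A rough sketch, assuming IPv4 only and Python's
socket.getaddrinfo; the function names are invented for illustration
and are not existing MirrorBrain code:

```python
import socket


def resolve_addresses(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPv4 addresses of hostname.

    `resolver` is injectable only so the function can be tested
    without doing real DNS lookups.
    """
    infos = resolver(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is (address, port).
    return sorted({info[4][0] for info in infos})


def became_round_robin(previous, current):
    """True if a name that had at most one address now has several."""
    return len(previous) <= 1 and len(current) > 1
```

A periodic job could store the last address list per mirror and raise
the same kind of warning as "mb new" when became_round_robin() returns
True.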

I think that the "separate treatment" needs to be enforced more
actively. Relying on a one-time warning is not deterministic enough.

On the other hand, DNSrr is not uncommon -- more common than the
prevalence of resulting problems suggests. About every 10th mirror seems
to use it. And some mirrors have more than two addresses; for instance,
www.mirrorservice.org resolves to 8 IP addresses. I'm not aware that there
was ever a problem with them; it would be daunting to add them 8 times
to the database. It's certainly doable, but automatic handling would
still be nicer.
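
One conceivable form of automatic handling, purely as a sketch:
generate one "mb new" invocation per resolved address instead of typing
them by hand. It reuses only the -H and -F options shown earlier in
this thread; the "-N" identifier suffix, the use of bare IP addresses
in the URLs, and the assumption that HTTP and FTP share the same path
are choices made here for illustration. The commands are only built as
strings, not executed:

```python
import shlex


def mb_new_commands(identifier, addresses, path):
    """Build one 'mb new' command line per address of a DNSrr'd mirror.

    identifier: base name for the mirror entries (a '-N' suffix is
                appended; this naming scheme is hypothetical)
    addresses:  the individual IP addresses or direct hostnames
    path:       the base path, assumed identical for HTTP and FTP
    """
    commands = []
    for n, address in enumerate(addresses, start=1):
        argv = [
            "mb", "new", f"{identifier}-{n}",
            "-H", f"http://{address}{path}",
            "-F", f"ftp://{address}{path}",
        ]
        # shlex.join quotes arguments where the shell would need it.
        commands.append(shlex.join(argv))
    return commands
```

This only saves typing; it does not address priorities, which would
still need adjusting by hand when one site shows up as several mirrors.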

Maybe somebody else has an idea?

> >Taking this further, there is another level at which similar precautions are
> >in order: if a mirror uses GeoDNS instead of DNSrr. GeoDNS is not visible,
> >but depending on where you look from (resolve an address) you see something
> >different. It's important to know about it - because the individual servers
> >are far away from each other in this scenario, and a synchronisation lag (or
> >major setup differences) is the rule rather than the exception. The
> >treatment is according to the same principle: access each mirror separately
> >under its own address, thus bypassing GeoDNS. (kernel.org is the only mirror
> >doing this that I'm aware of.)

Thanks,
Peter



Received on Mon Dec 21 2009 - 02:56:48 GMT
