Re: [mirrorbrain] error with multiple ip addresses

From: Peter Pöml <poeml_at_cmdline.net>
Date: Fri, 27 Nov 2009 20:35:33 +0100
Hi David, hi Marten,

On 12.10.2009 at 22:56, David Farning wrote:

> Peter,
> Here is an interesting error.  The following site resolves to multiple
> ip addresses and causes mb new to throw and error.

Sorry for the late reply.

There are two separate issues to note here.

> mb new nluug.nl -H http://ftp.nluug.nl/pub/os/Linux/distr/Sugar -F
> ftp://ftp.nluug.nl/pub/os/Linux/distr/Sugar
> warning: 'ftp.nluug.nl' resolves to a multiple IP addresses:
> 192.87.102.42, 192.87.102.43

First, this mirror's hostname resolves to two addresses, which means
that clients pick one of the addresses at random (DNS round-robin).
That's fine for load balancing in general, but in our case it matters
whether the content on the two replica servers is identical. If the
content on the two servers behind the URL differs, it is a problem:
the file list that the scanner sees could be different at each run,
and a client that is redirected might run into a "file not found".
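
To check by hand whether a hostname is served via DNS round-robin,
something along these lines could be used. It is only a sketch built
on Python's standard socket module, not the code that "mb new" runs;
the function names are made up for illustration.

```python
import socket


def unique_addresses(addrinfo):
    """Return the distinct IP addresses from a getaddrinfo() result, sorted."""
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({entry[4][0] for entry in addrinfo})


def resolve_all(hostname, port=80):
    """Resolve hostname and return every distinct address it maps to."""
    info = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return unique_addresses(info)
```

Calling resolve_all("ftp.nluug.nl") returns a list of addresses; more
than one entry means that the name is backed by several addresses and
deserves a closer look.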

Having said that, with some mirrors this doesn't pose a problem, for
instance when the file tree is mounted from shared network storage,
so that it is in fact a single tree. It may also not be a problem if
the content is synced between two storage systems very quickly and
reliably (how quick is quick enough depends on how often the tree
changes). But in many cases that I have seen there are actually two
(or more) different servers, with different storage systems and
sometimes different admins. Synchronization can lag, or break.

So, DNSrr is not necessarily a problem, but experience shows that it's  
a source of trouble and best avoided.

Therefore, "mb new" spits out the warning that the hostname
ftp.nluug.nl resolves to more than one address.

It is best to add such a site as two separate mirrors, using the
"direct" hostnames that each resolve to a single IP address. There
usually is such an individual hostname; if not, the IP address itself
can be used. The admins won't mind, since the load balancing is
happening anyway.

Thus, you would do:

mb new ftp1.nluug.nl -H 'http://ftp1.nluug.nl/pub/os/Linux/distr/Sugar' -F ...
mb new ftp2.nluug.nl -H 'http://ftp2.nluug.nl/pub/os/Linux/distr/Sugar' -F ...


As a consequence, the mirrors will be scanned separately, and the
content of each will be tracked in the database. For mirrors that
really have a single storage backend this may be unwanted; if that is
certain (because the admins confirm it), the mirror can still be
entered as "one", as you have done so far. That's why "mb new" issues
only a warning.

As another consequence, the priority of the mirrors should be
adjusted, because if nluug.nl appears to the redirector as two
mirrors, it'll get more requests, and thus it makes sense to set the
priority to e.g. 70 instead of 100. (The best number will depend on
the total number of mirrors.)
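
To illustrate the arithmetic, assume, purely for the sake of the
example, that requests are handed out in proportion to the priority
values. This is a simplification and not necessarily how the
redirector really selects mirrors (geography is ignored, for one). The
function name and the three other mirrors at priority 100 are made up.

```python
def request_share(own_priorities, other_priorities):
    """Fraction of requests going to one site, assuming requests are
    distributed in proportion to priority (a simplifying assumption)."""
    own = sum(own_priorities)
    total = own + sum(other_priorities)
    return own / total


others = [100, 100, 100]  # three hypothetical other mirrors

as_one = request_share([100], others)        # entered as a single mirror
as_two = request_share([100, 100], others)   # two entries, priority untouched
as_two_70 = request_share([70, 70], others)  # two entries, priority lowered

print(f"one entry at 100:   {as_one:.0%}")
print(f"two entries at 100: {as_two:.0%}")
print(f"two entries at 70:  {as_two_70:.0%}")
```

Under this assumption, splitting the site into two entries raises its
share noticeably, and lowering the priority of both entries brings it
back towards the original level. How far to lower it depends on how
many other mirrors there are.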

In the case of nluug.nl, I had them as two mirrors in the openSUSE
database because there was actually a case where inconsistencies had
been seen (which is not to blame nluug.nl; such things happen). In
fact, in the beginning no problem was anticipated, because the two
boxes were a cluster with shared storage. (And, from my notes, there
was also ftp.surfnet.nl running from the same storage.) But eventually
problems did occur, and inconsistencies led to clients getting 403s
(forbidden) when they reached a particular server. This shows that the
trouble does not necessarily lie in differences in the file tree. I
ended up scanning and monitoring each server individually; no problem
since then.

The fact that mirror monitoring takes place for each server is an  
additional advantage. DNSrr itself doesn't deal with failing hosts;  
DNS happily keeps resolving to dead addresses.
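
The idea of checking each server on its own, rather than whatever
address DNS happens to hand out, can be sketched as follows. This is
not MirrorBrain's mirror monitoring, only a minimal TCP reachability
probe using the standard library; the function names are made up.

```python
import socket


def is_reachable(address, port=80, timeout=5.0):
    """Return True if a TCP connection to address:port can be opened."""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_each(addresses, port=80, timeout=5.0):
    """Probe every address on its own and return {address: reachable}."""
    return {addr: is_reachable(addr, port, timeout) for addr in addresses}
```

Passing in the two addresses from the warning above, e.g.
check_each(["192.87.102.42", "192.87.102.43"]), reports each server
separately, so a dead host shows up even though the round-robin name
still resolves.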


Taking this further, there is another level at which similar
precautions are in order: when a mirror uses GeoDNS instead of DNSrr.
GeoDNS is not directly visible; depending on where you look from
(i.e. where you resolve the address), you see a different result. It's
important to know about it, because in this scenario the individual
servers are far away from each other, and synchronisation lag (or
major setup differences) is the rule rather than the exception. The
treatment follows the same principle: access each mirror separately
under its own address, thus bypassing GeoDNS. (kernel.org is the only
mirror doing this that I'm aware of.)


(This is information that needs to appear in the documentation,  
eventually. As a quick workaround, I'll add a link to this posting to  
the warning message.)


Now to the second issue. In your case, the warning was a bit  
concealed, because it was closely followed by the traceback of a  
different problem:


> Traceback (most recent call last):
> [...]
> psycopg2.IntegrityError: duplicate key value violates unique
> constraint "server_identifier_key"

What happened here is that a mirror with the identifier "nluug.nl",
the same as the one you tried to create, already existed in the
database. Mirror identifiers must be unique, so the database rejects
the insert (the database schema enforces uniqueness).

I am fixing this in svn, so now there is a proper error message  
reported in this case:


  # mb new nluug.nl -H 'http://ftp.nluug.nl/pub/os/Linux/distr/Sugar'
Error: a mirror's identifier must be unique.
There is already a mirror using this identifier. See output of `mb show nluug.nl`.
Exiting.
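
The general pattern of turning a unique-constraint violation into a
readable message can be sketched like this. It is not the actual
change in svn: to stay self-contained it uses Python's built-in
sqlite3 instead of PostgreSQL with psycopg2, and the table layout
(server, identifier, baseurl) and the function name are assumptions
made for the example.

```python
import sqlite3


def add_mirror(conn, identifier, baseurl):
    """Insert a mirror; return an error message instead of raising
    when the identifier is already taken, or None on success."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO server (identifier, baseurl) VALUES (?, ?)",
                (identifier, baseurl),
            )
    except sqlite3.IntegrityError:
        return "Error: a mirror's identifier must be unique."
    return None


# Assumed, simplified schema; the UNIQUE constraint does the enforcing.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE server (identifier TEXT UNIQUE, baseurl TEXT)")

url = "http://ftp.nluug.nl/pub/os/Linux/distr/Sugar"
first = add_mirror(conn, "nluug.nl", url)
second = add_mirror(conn, "nluug.nl", url)
print(first)
print(second)
```

The first insert succeeds; the second one hits the constraint, and the
caller gets a message to print instead of a traceback.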


Thank you very much for the report!

Peter

Received on Fri Nov 27 2009 - 19:35:41 GMT
