[mirrorbrain] RFC: notify-after-sync for mirrors

From: Peter Pöml <peter_at_poeml.de>
Date: Mon, 10 Feb 2014 23:28:59 +0100
Hi,

it just occurred to me that it would be nice if mirrors could notify the
MirrorBrain scanning server when they have synced, in order to trigger a
scan. 

Even though it's a very logical thing to do, the question would be how
to implement it. The following issues come to mind:

Client side:
- something as simple as a mail could do the job, although a simple HTTP
  request would be way easier to implement than fiddling with mail
  filters.
- should be very easy to deploy for the mirror admin, ideally a single,
  simple command that is run right after the rsync call.
- shouldn't require the admin to install fancy stuff or give away
  local privileges
- might optionally supply a path name, when only a subtree has been
  synced, and a full scan isn't necessary
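
To make the client side a bit more concrete, here is a minimal sketch in
Python of what the "single, simple command" could boil down to. Nothing
like this exists in MirrorBrain yet: the endpoint URL, the "mirror" and
"path" parameter names, and both function names are made up for
illustration only.

```python
from typing import Optional
from urllib.parse import urlencode
from urllib.request import urlopen


def build_notify_url(base_url: str, mirror_id: str,
                     path: Optional[str] = None) -> str:
    """Build the notification URL (hypothetical parameter names).

    'path' is optional and limits the requested scan to a subtree.
    """
    params = {"mirror": mirror_id}
    if path:
        params["path"] = path
    return base_url + "?" + urlencode(params)


def notify(base_url: str, mirror_id: str, path: Optional[str] = None,
           timeout: float = 10.0) -> int:
    """Send the notification and return the HTTP status code."""
    url = build_notify_url(base_url, mirror_id, path)
    with urlopen(url, timeout=timeout) as response:
        return response.status
```

It only needs the Python standard library, so the admin would not have to
install anything extra, and it could be called from a small wrapper right
after the rsync command has finished successfully.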

Server side:
- should track when the last notification happened, so that scans are
  not scheduled too frequently (eating up server resources)
- needs some kind of access control (mirror source IP? password?)
- probably needs additional scans that are scheduled periodically, to
  supplement the triggered scans


BTW, rsync _server_ log scanning is something that I explored long ago,
because in principle the rsync server sees _every_ change on the mirrors
(at least as long as they don't sync from another source). However,
rsyncd can't be configured to log everything; deletes (afair) are not
logged. Otherwise, it would be possible to replace mirror scanning
completely with "rsyncd log surfing". (Which would be cool... maybe it's
worth hacking rsyncd so that it can log client-side deletes. I don't
know if that's possible though, because the rsync client might not even
tell the server about them; dunno.) The rsync server sees connects and
disconnects, though. Each disconnect could trigger a scan. However, the
source IP (and that's all that rsyncd can see, no name) often isn't the
same as the IP that the mirror resolves to. Also, a mirror might connect
much more frequently than a scan is wanted. (Yes, push mirroring is
better. Have you looked at
http://svn.mirrorbrain.org/viewvc/mirrorbrain/trunk/tools/push2mirrors?revision=8255&view=markup
? And see https://wiki.documentfoundation.org/Infra/Mirroring for more info.)


Rsync _client_ logs could of course serve as input for the scanner
(either by sending a file list or just a list of directories that were
changed). But that again requires running fancy scripts on the mirrors.
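
For the "list of directories" variant, the reduction step itself would be
small. A sketch in Python, assuming the mirror already has a list of
changed relative paths (how it gets them out of the rsync client is left
open here, and the function name is made up):

```python
import posixpath
from typing import Iterable, List


def changed_directories(paths: Iterable[str]) -> List[str]:
    """Reduce a list of changed relative paths to the affected directories."""
    dirs = set()
    for path in paths:
        path = path.strip()
        if not path:
            continue  # skip empty lines
        if path.endswith("/"):
            # The entry is a directory itself.
            dirs.add(path.rstrip("/") or "/")
        else:
            # The entry is a file: keep its parent ("." for the top level).
            dirs.add(posixpath.dirname(path) or ".")
    return sorted(dirs)
```

The resulting list is what would be sent along with the notification, so
that the scanner only has to look at those directories.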

Please add your thoughts.
Peter


Received on Mon Feb 10 2014 - 22:29:14 GMT