[mirrorbrain-commits] [opensuse-svn] r6676 - trunk/tools/download-redirector-v2

From: Novell Forge SVN <noreply_at_novell.com>
Date: Tue, 3 Mar 2009 15:14:11 -0700 (MST)
Author: poeml
Date: 2009-03-03 15:14:07 -0700 (Tue, 03 Mar 2009)
New Revision: 6676

ABOUT: turn this into a forwarding pointer

Modified: trunk/tools/download-redirector-v2/ABOUT
--- trunk/tools/download-redirector-v2/ABOUT	2009-03-03 20:19:38 UTC (rev 6675)
+++ trunk/tools/download-redirector-v2/ABOUT	2009-03-03 22:14:07 UTC (rev 6676)
@@ -1,161 +1,2 @@
-Note, this document is no longer maintained.
-See http://mirrorbrain.org/
-                  The openSUSE download redirector
-  The openSUSE download redirector (a.k.a. the MirrorBrain) automatically
-redirects clients (per HTTP redirection) to a mirror server near them. It works
-similarly to the systems employed by sourceforge.net, mozilla.com and other
-large organizations, which face more download requests than a single site can
-practically handle. To find a mirror close to the
-client, the redirector employs geolocation of the client's IP address. If
-several mirrors are suitable, the redirector load-balances requests to the
-mirrors based on their capabilities.
-  The core of the redirector is mod_mirrorbrain, a module for the Apache HTTP
-server, written in C, and designed for high performance and scalability, with
-security in mind.
-  The previous name of mod_mirrorbrain was mod_zrkadlo, pronounced "mod zurrcat
-low". Zrkadlo is Slovak for mirror.
-  Due to the fast-evolving nature of the file tree offered by the openSUSE
-project, the redirector doesn't simply choose one mirror for a client once, but
-decides at file level, because mirrors are known to be incomplete,
-especially if content changes often. To achieve this, the redirector is
-supported by an SQL database which knows the exact contents of each mirror. The
-database is periodically updated by scanning all mirrors with a scanner
-program. In addition, there is a probing program which intermittently checks
-each mirror for responsiveness, and which can disable or pause redirection to a
-certain mirror, should it fail.
- - works transparently to the client, through HTTP redirection
- - can optionally return metalinks (http://metalinker.org), or human readable
-   mirror lists
- - supports transparent negotiation of metalinks (see 
-   http://groups.google.com/group/metalink-discussion/web/transparent-metalinks)
- - operates with file level granularity
- - involves only a single database query per HTTP request, using a database
-   connection pool through the Apache DBD framework
- - mirror choice per country / continent, using GeoIP database
- - uses a randomized, weighted algorithm for mirror selection (each mirror
-   having a score)
- - optionally memorizes client<->mirror association through memcache daemon
- - can make sure that mirrors get only requests from the same country or region
-   (important for countries with poor internet connectivity)
- - mirrors can be special catch-all type, to integrate content delivery networks
- - is configurable in Apache style configuration, with automatic per-directory
-   configuration merging
- - optionally redirects dependent on file name pattern, file size, mime type,
-   user agent, request origin, ...
- - flexible logging options
- - has a debug mode which can be enabled directory-wise, and thus is 
-   "compatible" with running production
- - the client IP address can be overridden for diagnostic purposes
- - canonicalizes file pathnames before database lookup, so the database needs to
-   hold only real files, and is not blown up by symlinks.
-So how does the redirector Apache module work?
-  The page http://en.opensuse.org/Build_Service/Redirector shows pseudocode
-which gives an outline of how it works.
-Software requirements:
-Frontend (the redirector):
- - Apache HTTP server 2.2.6 or newer
- - libGeoIP, apr_memcache (or apr-util > 1.3.2), mod_memcache, and mod_form
- - memcache daemon
-Backend (rest of the MirrorBrain framework):
- - MySQL server. The tables should be InnoDB tables, because only that engine
-   offers row-based write locks. Due to optimizations of InnoDB engine for high
-   performance it makes sense to have a separate MySQL instance for this
-   database.
-   PostgreSQL should also work, but it hasn't been tested.
- - Python, python-mysql, python-sqlobject for the mirrorprobe and database
-   maintenance
- - Perl for the scanner process
-  There is a small mirror administration web frontend, built upon the TurboGears
-framework, but its development has just started.
-Hardware requirements:
-  File storage is attached to the webserver. (Running the redirector without
-attached file storage is a feature which is not implemented, but considered.)
-The openSUSE project currently hosts > 700.000 files using 850 GB.
-The webserver needs few computational resources. If it has other tasks, besides
-redirecting, those other tasks mainly determine the needed resources. However,
-to handle large numbers of redirects, such as hundreds per second, it is
-recommended to run Apache in a hybrid prefork/worker configuration, with e.g.
-32 threads per process, which results in a good pooling of database
-connections.
-  Most computational resources are needed by the database server. For large
-file trees such as that of the openSUSE project, which redirects for a total
-of > 500.000 files, they can be considerable. For performance reasons, the
-database server must be able to hold the database and indices completely in
-memory. The openSUSE redirector database is currently served by a 4-way Xeon
-3.4 GHz with 4 GB of RAM, which is sufficient for the MySQL server itself, as
-well as 12 parallel scanner processes, and can handle 1000-2000 requests per
-second.
-HA (High Availability) setup:
-  For HA, the webserver, the database server and the connecting infrastructure
-need to be redundant. A geographically distributed array of redirectors
-would be one way. Locally, it can be achieved by creating a hot standby for
-failover, or by running identical nodes with load sharing / balancing.  This
-could be implemented by 
- - deploying a hardware load balancer, which distributes requests to webserver
-   nodes, or using clusterip on the webserver nodes themselves to make them do
-   load sharing
- - two or more webserver nodes with identical setup for ease of maintenance
- - running one or more MySQL servers in slave configuration. Database queries
-   could be split so that write requests go only to the master, while read
-   requests go to master and slaves.
- - mysql-proxy does load balancing and r/w-splitting
-Obviously, there are different ways of achieving HA, which will vary to local
-This product includes GeoLite data created by MaxMind, available from
+This document has moved to the main project page (online at
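
The ABOUT text removed in this revision describes mirror choice per country /
continent combined with a randomized, weighted selection in which each mirror
has a score. The following is only a minimal sketch of that idea in Python, not
the code of mod_mirrorbrain (which is written in C). The names Mirror,
candidates and choose_mirror, the field layout, and the fallback order
(country, then continent, then any mirror) are assumptions made for
illustration.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Mirror:
    # Hypothetical record layout; the real database schema is not shown here.
    name: str
    country: str  # country code of the mirror, e.g. "de"
    region: str   # continent code of the mirror, e.g. "eu"
    score: int    # assumed positive; a higher score means a more capable mirror


def candidates(mirrors, country, region):
    """Prefer mirrors in the client's country, then its continent, then any."""
    same_country = [m for m in mirrors if m.country == country]
    if same_country:
        return same_country
    same_region = [m for m in mirrors if m.region == region]
    if same_region:
        return same_region
    return list(mirrors)


def choose_mirror(mirrors, country, region, rng=random):
    """Pick one candidate at random, with probability proportional to score."""
    pool = candidates(mirrors, country, region)
    if not pool:
        return None  # no mirror known at all; the caller would serve the file itself
    weights = [m.score for m in pool]
    return rng.choices(pool, weights=weights, k=1)[0]
```

With two mirrors in the client's country scored 100 and 50, the first would
receive roughly two thirds of the redirects under this sketch.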

Received on 2009-03-03Z22:14:31
