Related Links
Alternative solutions
Redirectors:
- Apache2::Geo::Mirror, a Perl module written by Randy Kobes and T. J. Mather that finds the closest mirror. In use by CPAN itself.
- Matt Domsch's MirrorManager, as used by the Fedora Project [trac]
- Mozilla.org's Bouncer
- Raphael Geissert is working on an HTTP-based redirector for Debian. See https://github.com/rgeissert/http-redirector and http://http.debian.net/
- Ryan Gordon's mod_offload
- Mandriva Linux has a download redirector that works together with their urpmi client. I don't know where to find it.
- The Apache Software Foundation uses a Perl script called closer.cgi to create their mirror lists
- openSUSE/MirrorCache resembles MirrorBrain and attempts to improve on it
DNS-based solutions:
- Apache Traffic Control, an Open Source CDN: https://trafficcontrol.apache.org/
- GeoDNS (geographically-aware DNS resolution), as used by e.g. http://kernel.org/. A little rigid, as it expects all servers to be tightly synced at all times.
- The Coral Content Distribution Network is DNS-based, but not transparent to client applications.
- DNS round-robin, as used by Debian: a simple scheme and a solid approach for its purpose, but it generally requires a client that is configured to access an appropriate server, and it requires servers to be tightly synced at all times.
- Another example of DNS round-robin is the Ring Server Project.
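To illustrate why DNS round-robin needs tightly synced servers, here is a minimal sketch in Python. It is a toy model, not a real DNS server: the class `RoundRobinZone`, the function `pick_server`, and the addresses (taken from the 192.0.2.0/24 documentation range) are all made up for this example. The assumption is that the server rotates its record set on every query and that a simple client just takes the first address.

```python
from collections import deque
from typing import List


class RoundRobinZone:
    """Toy model of a DNS zone that rotates its A records on every query."""

    def __init__(self, addresses: List[str]) -> None:
        if not addresses:
            raise ValueError("need at least one address")
        self._records = deque(addresses)

    def query(self) -> List[str]:
        """Return the records in their current order, then rotate them."""
        answer = list(self._records)
        self._records.rotate(-1)  # move the first record to the end
        return answer


def pick_server(answer: List[str]) -> str:
    """Assumption: a simple client uses the first address it is given."""
    return answer[0]


if __name__ == "__main__":
    zone = RoundRobinZone(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
    for _ in range(4):
        print(pick_server(zone.query()))
```

Successive clients land on different servers regardless of where they are or what each server holds, which is why every server has to carry the same content at all times.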
Proxy networks:
- CoDeeN is a proxy server system created at Princeton University and deployed for general use on PlanetLab. To access this system, URLs are prefixed with http://coblitz.codeen.org/.
- Cacheboy is software in development for building a system of interconnected proxies. It is based on a Squid 2 derivative and, according to the documentation, requires root access on all mirrors.
Comments/additions welcome!
About CDNs, theory, AS/IP mapping, ...
Some papers I found interesting.
- A Taxonomy and Survey of Content Delivery Networks (2007). A detailed overview of existing commercial and academic CDNs. http://www.cloudbus.org/reports/CDN-Taxonomy.pdf
- Tom Leighton from Akamai Technologies on Improving Performance on the Internet (2008): http://mags.acm.org/queue/200810/?pg=24
- Towards an Accurate AS-Level Traceroute Tool (2003): http://www.routeviews.org/papers/AS_traceroute.pdf
- Predicting Internet Network Distance with Coordinates-Based Approaches (2002): http://www.cs.rice.edu/~eugeneng/papers/INFOCOM02.pdf
- IDMaps: A Global Internet Host Distance Estimation Service (2001): http://idmaps.eecs.umich.edu/papers/ton01.pdf
- Geographic Locality of IP Prefixes: http://www.coralcdn.org/docs/ipgeo-imc05.pdf
- On the scale and performance of cooperative web proxy caching (1999): http://www-cse.ucsd.edu/classes/fa01/cse222/papers/wolman-coopcache-sosp99.pdf
- An interesting and related research topic is how reasonable P2P traffic is with regard to network topology, known as the "Application-Layer Traffic Optimization (ALTO)" problem. See http://alto.tilab.com/. (See P2P on Wikipedia for an overview of swarm downloading technologies.)
- IETF draft "Framework for CDN Interconnection" https://datatracker.ietf.org/doc/draft-ietf-cdni-framework/?include_text=1
About Autonomous Systems and Network Topology
- General introduction to autonomous systems: http://www.cisco.com/web/about/ac123/ac147/archived_issues/ipj_9-1/autonomous_system_numbers.html
- 32-bit AS numbers coming soon. Be prepared! http://www.menog.net/meetings/menog2/presentations/philip-smith-32bit-asn.pdf
- Useful reports on Geoff Huston's site, updated daily: the IPv6 / IPv4 Comparative Statistics, the 32-bit AS Number Report and the 16-bit AS Number Report, and an overview of AS resource allocation by ISO-3166 code
- CIDR report: http://www.cidr-report.org/as2.0/
- GeoIP: http://www.maxmind.com/app/ip-location
- Team Cymru's IP to ASN Mapping: http://www.team-cymru.org/Services/ip-to-asn.html
- AS number to name mapping: http://www.potaroo.net/bgp/iana/asn-ctl.txt
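As a rough illustration of IP-to-ASN mapping, the sketch below builds the DNS name that, as far as I understand Team Cymru's service, is queried for a TXT record: the reversed address followed by `origin.asn.cymru.com` (or `origin6.asn.cymru.com` with reversed nibbles for IPv6). It performs no lookup itself. The function names are made up for this example, and the field order assumed in `parse_origin_txt` ("ASN | prefix | country | ...") should be checked against the service's documentation before relying on it.

```python
import ipaddress
from typing import Dict, List, Union


def cymru_origin_qname(ip: str) -> str:
    """Build the TXT query name for an IPv4 or IPv6 address (no network I/O)."""
    addr = ipaddress.ip_address(ip)
    if addr.version == 4:
        reversed_octets = ".".join(reversed(str(addr).split(".")))
        return f"{reversed_octets}.origin.asn.cymru.com"
    # IPv6: all 32 hex nibbles, reversed, one per DNS label.
    nibbles = addr.exploded.replace(":", "")
    return ".".join(reversed(nibbles)) + ".origin6.asn.cymru.com"


def parse_origin_txt(txt: str) -> Dict[str, Union[List[int], str]]:
    """Split a TXT answer; the field order is an assumption, see above."""
    fields = [field.strip() for field in txt.strip('"').split("|")]
    if len(fields) < 3:
        raise ValueError(f"unexpected answer format: {txt!r}")
    return {
        # An answer may list more than one origin AS, separated by spaces.
        "asns": [int(asn) for asn in fields[0].split()],
        "prefix": fields[1],
        "country": fields[2],
    }


if __name__ == "__main__":
    print(cymru_origin_qname("192.0.2.10"))
```

The resulting name can then be handed to any DNS client that supports TXT queries, for instance `dig +short <name> TXT`.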
Metalinks
- RFC 5854 - The Metalink Download Description Format http://tools.ietf.org/html/rfc5854
- RFC 6249 - Metalink/HTTP: Mirrors and Hashes http://tools.ietf.org/html/rfc6249
- http://www.metalinker.org/
- Transparent negotiation of metalinks: http://groups.google.com/group/metalink-discussion/web/transparent-metalinks
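To give an idea of what Metalink/HTTP (RFC 6249) looks like on the wire: mirrors are advertised in `Link` response headers with `rel=duplicate`, optionally with a `pri` (priority, lower is preferred) and a `geo` parameter. The following is a minimal sketch of a client-side parser, not a complete implementation. The names `Mirror`, `parse_link` and `mirrors_from_links` are made up for this example, the fallback priority of 999999 is my assumption, and the naive split on ";" will break on quoted values that contain a semicolon.

```python
import re
from typing import Dict, List, NamedTuple, Optional


class Mirror(NamedTuple):
    url: str
    priority: int
    geo: Optional[str]


_LINK_RE = re.compile(r"^\s*<([^>]*)>(.*)$")


def parse_link(value: str) -> Optional[Dict[str, str]]:
    """Parse one Link header value of the form '<url>; name=value; ...'."""
    match = _LINK_RE.match(value)
    if match is None:
        return None
    params: Dict[str, str] = {"url": match.group(1)}
    for part in match.group(2).split(";"):
        part = part.strip()
        if not part:
            continue
        name, _, raw = part.partition("=")
        params[name.strip().lower()] = raw.strip().strip('"')
    return params


def mirrors_from_links(link_values: List[str]) -> List[Mirror]:
    """Keep only rel=duplicate links and sort them by ascending priority."""
    mirrors = []
    for value in link_values:
        params = parse_link(value)
        if params is None or params.get("rel") != "duplicate":
            continue
        pri = params.get("pri", "")
        priority = int(pri) if pri.isdigit() else 999999  # assumed fallback
        mirrors.append(Mirror(params["url"], priority, params.get("geo")))
    return sorted(mirrors, key=lambda mirror: mirror.priority)


if __name__ == "__main__":
    headers = [
        "<http://www2.example.com/example.ext>; rel=duplicate; pri=2; geo=gb",
        "<http://www1.example.com/example.ext>; rel=duplicate; pri=1",
        '<http://example.com/example.ext.meta4>; rel=describedby; '
        'type="application/metalink4+xml"',
    ]
    for mirror in mirrors_from_links(headers):
        print(mirror.priority, mirror.url)
```

A real client would also verify the download against the hash sent by the server, which this sketch leaves out.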
Practical hints for web server operators
- High performance web serving: http://www.stdlib.net/~colmmacc/Apachecon-EU2005/scaling-apache-handout.pdf
- http://kb.pert.geant2.net/PERTKB/ApacheScaling
- Clever load sharing and failover through IP clustering: http://www.linux-ha.org/ClusterIP
- Mirror setup and Apache DoS protection: https://en.opensuse.org/openSUSE:Mirror_howto#Protection_of_resources
Worldwide Internet Connectivity
- Broadband Internet access worldwide: http://en.wikipedia.org/wiki/Broadband_Internet_access_worldwide
- On the development of broadband access in rural and remote areas (2003): http://www.oecd.org/dataoecd/38/40/31718094.pdf
- Number of Internet users by country http://en.wikipedia.org/wiki/Demographics_of_the_Internet
Trivia
- The official ISO 3166-1 decoding table: http://www.iso.org/iso/iso-3166-1_decoding_table
- ISO 3166 country codes: http://en.wikipedia.org/wiki/ISO_3166-1
- Flag icons: http://www.famfamfam.com/lab/icons/flags/
- FTP vs. HTTP http://daniel.haxx.se/docs/ftp-vs-http.html
- A way to programmatically generate flow maps. Would be nice to play with. http://graphics.stanford.edu/papers/flow_map_layout/
Obsolete, but interesting stuff
- SuperSparrow: http://www.supersparrow.org/
- Globule: an Open-Source Content Distribution Network: http://www.globule.org/
- Netairt: a DNS redirector: http://www.globule.org/netairt/
Mirror Syncing
- rsync (of course)
- lsyncd - Live Syncing (Mirror) Daemon http://code.google.com/p/lsyncd/
- inosync http://github.com/hollow/inosync