On 02.01.2010 at 09:24, Cory Fields wrote:
> Yup, Mirrorbrain is up and working beautifully. We'll be deploying as
> soon as we can gather stats correctly and automatically. Hope to be
> using metalinks in the near-future as well on the client side for
> larger (~30mb+) downloads as well. They're working great via the file
> index, just need to work on integrating them into our client.

About gathering download stats, I intended to come up with enough helpful details for you to implement the same thing that I have done for OpenOffice.org...

I implemented most of http://mirrorbrain.org/download-statistics/ (code at http://svn.mirrorbrain.org/svn/mod_stats/trunk/ ). Unfortunately, I have so far failed to find the time to document it properly, and to iron out the little things in the code that are specific to this first deployment. Anyway, I'd like to resurrect it later.

mod_stats, the planned Apache module that counts in realtime, hasn't been written yet; as a quick solution, I came up with a script that crawls the logs once a day, and it is reasonably fast anyway:

    tools/dlcount.py tools/ooo.conf --db --db-home downloadstats \
        /var/log/apache2/download.example.com/$(date -d yesterday "+%Y/%m")/download.example.com-$(date -d yesterday "+%Y%m%d")-access_log.bz2
    processed 149016 lines in 97.5082769394 seconds
    found 3173 countables
    saved data in 79.724547863 seconds

The database offers (through Django) some views of the data, like http://download.services.openoffice.org/stats/csv/20100201.csv , and more views could easily be added.

(The `date -d yesterday` thing obviously requires a cronolog setup.)

To deploy the current codebase, one would primarily need to define the ruleset that parses the logs. This requires taking tools/ooo.conf or tools/go-oo.conf and adjusting the rules to fit the content of your logfile. I'd be happy to assist; this would obviously be made considerably easier if there were walk-through documentation that explains it all.
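
To make the idea of a log-parsing ruleset a bit more concrete, here is a minimal sketch in Python. It is not the rule format of tools/ooo.conf and not how dlcount.py works internally; the regexes, the /files/ paths, the choice of status codes, and the names REQUEST_RE, RULES, COUNTED_STATUSES and count_downloads are all assumptions made up for illustration.

```python
import re
from collections import Counter

# Pulls request path and status out of a common/combined-format log line.
REQUEST_RE = re.compile(r'"GET (?P<path>\S+) HTTP/[\d.]+" (?P<status>\d{3}) ')

# Assumption: a download counts when it is served (200) or redirected (302).
COUNTED_STATUSES = {"200", "302"}

# Hypothetical ruleset: a regex over the path, plus a template naming
# the "countable". Real rules have to fit your own file layout.
RULES = [
    (re.compile(r"^/files/(?P<product>[a-z]+)-(?P<version>[\d.]+)\.tar\.gz$"),
     "{product} {version}"),
    (re.compile(r"^/files/(?P<product>[a-z]+)-(?P<version>[\d.]+)-(?P<lang>[a-z]{2})\.exe$"),
     "{product} {version} ({lang})"),
]


def count_downloads(lines):
    """Return (Counter of countables, number of requests no rule matched)."""
    counts = Counter()
    unmatched = 0
    for line in lines:
        m = REQUEST_RE.search(line)
        if m is None or m.group("status") not in COUNTED_STATUSES:
            continue  # not a GET request we want to count
        path = m.group("path")
        for pattern, template in RULES:
            hit = pattern.match(path)
            if hit:
                counts[template.format(**hit.groupdict())] += 1
                break
        else:
            unmatched += 1  # request was fine, but no rule claimed it
    return counts, unmatched


if __name__ == "__main__":
    prefix = '192.0.2.1 - - [02/Feb/2010:10:00:00 +0000] '
    sample = [
        prefix + '"GET /files/foo-1.2.tar.gz HTTP/1.1" 302 310',
        prefix + '"GET /files/foo-1.2.tar.gz HTTP/1.1" 200 1024',
        prefix + '"GET /files/foo-1.2-de.exe HTTP/1.1" 302 312',
        prefix + '"GET /favicon.ico HTTP/1.1" 200 318',
        prefix + '"GET /files/foo-1.2.tar.gz HTTP/1.1" 404 0',
    ]
    counts, unmatched = count_downloads(sample)
    for name, n in sorted(counts.items()):
        print(name, n)
    print("unmatched:", unmatched)
```

Adjusting a ruleset to a new site then mostly means editing the patterns until the unmatched number only covers requests you really do not want to count.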

To work out the rules, I used the script tools/dlcount.py itself, adding various print statements to make visible what happens. Later I added the code that actually puts the counts into the database, so the script would benefit from a better separation of those steps. It should have some kind of debug mode that summarizes what has been counted and what fell through the cracks.

Peter

_______________________________________________
mirrorbrain mailing list
Archive: http://mirrorbrain.org/archive/mirrorbrain/

Note: To remove yourself from this mailing list, send a mail with the content
unsubscribe
to the address mirrorbrain-request_at_mirrorbrain.org

Received on Wed Feb 03 2010 - 21:41:16 GMT