2.7 Release: Smaller and Faster Database

Wed, 04 Mar 2009

[Quite a long text, which aims to explain the recent under-the-hood changes.]

MirrorBrain 2.7 has been released, with the main change being a huge improvement in the database structure.

I have been using a "classic" relational database schema for years now, and wasn't very happy with the relational table alone being 2-3G in size, including indexes, for the huge openSUSE file tree. For a small database that doesn't matter at all, but that file tree happens to be large enough (and growing) that 2,000,000 files on 200 mirrors produce so many rows in the relational table that this size is unavoidable. After optimizing out everything that wasn't needed, I still ended up with 48 bytes per row (two references to primary keys, and one timestamp column that was used to determine whether a file had been seen before or during the last scan).
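
Just to illustrate, here's a rough sketch of such a classic layout as a Python/psycopg2 snippet (the table and column names are made up and don't necessarily match what MirrorBrain actually uses):

    import psycopg2

    conn = psycopg2.connect("dbname=mirrorbrain")   # connection details are assumed
    cur = conn.cursor()
    cur.execute("""
        CREATE TABLE file   (id serial PRIMARY KEY, path text NOT NULL);
        CREATE TABLE server (id serial PRIMARY KEY, identifier text NOT NULL);
        -- one row per (file, mirror) pair: two primary-key references plus the
        -- timestamp that records when the file was last seen on that mirror
        CREATE TABLE file_server (
            fileid    integer   NOT NULL REFERENCES file(id),
            serverid  integer   NOT NULL REFERENCES server(id),
            last_seen timestamp NOT NULL
        );
    """)
    conn.commit()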

I had an idea for a completely different organization of this data, one that doesn't spend 48 bytes per file per mirror where, in theory, a single bit in a bit field would suffice. PostgreSQL offers something that comes close: the array datatype. The "list of mirrors per file" is now an array of two-byte integers that lives in a single column directly next to the path name. That way, only a single index remains.
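
A sketch of what such an array-based table could look like, again with hypothetical names that need not match the real MirrorBrain schema (psycopg2 maps a Python list of integers to a PostgreSQL array automatically):

    import psycopg2

    conn = psycopg2.connect("dbname=mirrorbrain")   # connection details are assumed
    cur = conn.cursor()
    cur.execute("""
        CREATE TABLE filearr (
            path    text PRIMARY KEY,                 -- the single remaining (b-tree) index
            mirrors smallint[] NOT NULL DEFAULT '{}'  -- mirror ids, two bytes each
        );
    """)
    cur.execute(
        "INSERT INTO filearr (path, mirrors) VALUES (%s, %s::smallint[])",
        ("some/subdir/example-file.iso", [3, 17, 42]),
    )
    conn.commit()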

All in all, the openSUSE database is now 5 times faster and 1/3 the size, which is exactly what I wanted.

The data is also structured more logically: looking up the mirrors for a file no longer requires table joins (which already were damn fast...), and the single remaining index is a fast b-tree, which is perfect for all needs. In particular, it is now easy to do efficient prefix matches on path names, something that would have required a join over huge tables in the past. (The smaller size helps a lot as well, of course.)
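
For example, a lookup and a prefix match against the hypothetical table sketched above could look like this (whether the LIKE 'prefix%' form can use the path index depends on the locale, or on an index created with text_pattern_ops):

    import psycopg2

    conn = psycopg2.connect("dbname=mirrorbrain")   # connection details are assumed
    cur = conn.cursor()

    # mirrors for one file: a single row, no join; the smallint[] column
    # comes back as a plain Python list of integers
    cur.execute("SELECT mirrors FROM filearr WHERE path = %s",
                ("some/subdir/example-file.iso",))
    row = cur.fetchone()
    mirrors = row[0] if row else []

    # prefix match on path names, served by the same b-tree index
    # (given a suitable locale or a text_pattern_ops index)
    cur.execute("SELECT path, mirrors FROM filearr WHERE path LIKE %s",
                ("some/subdir/%",))
    for path, mirror_ids in cur.fetchall():
        print(path, mirror_ids)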

This opens the door to fixing a previous shortcoming in the scanner: there was no efficient way to delete, within a single subdirectory only, the files that had disappeared between two scans. That's now straightforward to implement. It also opens the door to a tight integration of mirror syncing with database updating, which would work its way through a large tree directory by directory.

The scanner doesn't need a timestamp anymore. It now creates a temporary table with the list of known files at the beginning, crosses off everything it encounters while scanning, and at the end it simply deletes from the database whatever is still left in the temp table.
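
A rough sketch of that pattern, again against the hypothetical filearr table and with a made-up mirror id and scan result (adding a path LIKE 'some/subdir/%' condition to the first step gives exactly the per-subdirectory cleanup mentioned above):

    import psycopg2

    conn = psycopg2.connect("dbname=mirrorbrain")   # connection details are assumed
    cur = conn.cursor()
    mirror_id = 42   # hypothetical id of the mirror being scanned

    # 1. remember every path currently attributed to this mirror
    cur.execute("""
        CREATE TEMPORARY TABLE seen_before AS
            SELECT path FROM filearr WHERE %s = ANY(mirrors)
    """, (mirror_id,))

    # 2. stand-in for the real scan: tick off every path found on the mirror
    scanned_paths = ["some/subdir/example-file.iso", "some/subdir/other-file.iso"]
    for path in scanned_paths:
        cur.execute("DELETE FROM seen_before WHERE path = %s", (path,))

    # 3. whatever is left has disappeared from the mirror: drop this mirror's id
    #    from the affected arrays (array_remove() needs PostgreSQL 9.3 or later;
    #    older releases would use a small helper function instead)
    cur.execute("""
        UPDATE filearr
           SET mirrors = array_remove(mirrors, %s::smallint)
         WHERE path IN (SELECT path FROM seen_before)
    """, (mirror_id,))
    conn.commit()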

With this change, MySQL is no longer supported; at least not by the framework as a whole. The core, mod_mirrorbrain, will still work: it doesn't care about the database, it just runs a database query, and that query can be anything. The rest of the framework, however, has become quite closely tied to the PostgreSQL database schema.

Of course, if there's interest, MySQL support in the toolchain could be maintained as well. For now, nobody uses it.
