[mirrorbrain] Import and export metalinks?

From: Derek Hofmann <derek.hofmann_at_gmail.com>
Date: Mon, 12 Mar 2018 16:35:57 -0700
Hi, I'm thinking of setting up a MirrorBrain server that would be a
permanent way to locate files as their original locations vanish from
the web. Metalinks (RFC 5854) seem ideal for this because they support
multiple URLs.

I want to do this as cheaply as possible for multiple terabytes of
data, so I don't want to store the mirrored files locally except maybe
some that aren't yet available on the Internet or have few mirrors of
their own.

Q1. Is there a way to import .metalink files into the database? This
would be a quick way to populate the database with hashes even without
storing the files locally. For other mirror admins who do want to store
the files locally, this would save the step of hashing every file,
which could take hours depending on how many terabytes are involved.
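
To make Q1 concrete, here is a minimal sketch of the "read" half of such
an import: it pulls the name, size, hashes and URLs out of an RFC 5854
document using only the Python standard library. It does not touch
MirrorBrain or its database at all; the function name parse_metalink,
the record layout and the sample values are assumptions made up for
illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by RFC 5854 for Metalink documents.
NS = {"ml": "urn:ietf:params:xml:ns:metalink"}


def parse_metalink(xml_text):
    """Return one dict per <file> element (hypothetical record layout)."""
    root = ET.fromstring(xml_text)
    records = []
    for f in root.findall("ml:file", NS):
        size = f.findtext("ml:size", default=None, namespaces=NS)
        records.append({
            "name": f.get("name"),
            "size": int(size) if size is not None else None,
            # Map hash type (e.g. "sha-256") to its lowercase hex digest.
            "hashes": {
                h.get("type"): (h.text or "").strip().lower()
                for h in f.findall("ml:hash", NS)
            },
            "urls": [(u.text or "").strip() for u in f.findall("ml:url", NS)],
        })
    return records


# Illustrative input only; the hosts and values are placeholders.
SAMPLE = """<metalink xmlns="urn:ietf:params:xml:ns:metalink">
  <file name="example.iso">
    <size>1048576</size>
    <hash type="sha-256">E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855</hash>
    <url>http://mirror1.example.org/pub/example.iso</url>
    <url>ftp://mirror2.example.org/example.iso</url>
  </file>
</metalink>
"""

if __name__ == "__main__":
    for rec in parse_metalink(SAMPLE):
        print(rec["name"], rec["size"], rec["hashes"], rec["urls"])
```

The records this produces would still have to be written into whatever
tables MirrorBrain uses for hashes, which is the part the question is
really asking about.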

Q2. Is there a way to export the database as one large .metalink file
or maybe one large .metalink file per directory? It would be similar
to "mb file ls" but in XML format. I could make this file available
via http/ftp/rsync for someone who wants my metadata but doesn't use
MirrorBrain. (Because the Metalink format is formally specified in an
RFC, such files would be more portable than database dumps.) I would of
course compress the file first, or have a cron job commit it to a
GitHub repository.
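
For Q2, the kind of export I have in mind could look roughly like the
sketch below. It assumes the metadata has already been read out of the
database into plain dicts (the record layout, build_metalink and
by_directory are hypothetical names, not MirrorBrain functions) and
writes one RFC 5854 document, optionally gzip-compressed.

```python
import gzip
import posixpath
import xml.etree.ElementTree as ET
from collections import defaultdict

METALINK_NS = "urn:ietf:params:xml:ns:metalink"


def by_directory(records):
    """Group records by the directory part of their name."""
    groups = defaultdict(list)
    for rec in records:
        groups[posixpath.dirname(rec["name"])].append(rec)
    return dict(groups)


def build_metalink(records):
    """Serialize records (hypothetical layout) as one Metalink document."""
    # Emit the Metalink namespace as the default namespace, without a prefix.
    ET.register_namespace("", METALINK_NS)
    q = lambda tag: "{%s}%s" % (METALINK_NS, tag)
    root = ET.Element(q("metalink"))
    for rec in records:
        f = ET.SubElement(root, q("file"), {"name": rec["name"]})
        ET.SubElement(f, q("size")).text = str(rec["size"])
        for algo, digest in sorted(rec["hashes"].items()):
            ET.SubElement(f, q("hash"), {"type": algo}).text = digest
        for url in rec["urls"]:
            ET.SubElement(f, q("url")).text = url
    return ET.tostring(root, encoding="unicode")


# Placeholder data standing in for rows read from the database.
RECORDS = [
    {"name": "pub/a.iso", "size": 10, "hashes": {"sha-256": "aa" * 32},
     "urls": ["http://mirror1.example.org/pub/a.iso"]},
    {"name": "pub/b.iso", "size": 20, "hashes": {"sha-256": "bb" * 32},
     "urls": ["http://mirror2.example.org/files/b.iso"]},
    {"name": "doc/readme.txt", "size": 5, "hashes": {"sha-256": "cc" * 32},
     "urls": ["http://mirror1.example.org/doc/readme.txt"]},
]

if __name__ == "__main__":
    for directory, recs in sorted(by_directory(RECORDS).items()):
        xml_text = build_metalink(recs)
        compressed = gzip.compress(xml_text.encode("utf-8"))
        print(directory, len(xml_text), "bytes,", len(compressed), "compressed")
```

A cron job could write each compressed document to a path served over
http/ftp/rsync, or commit the uncompressed XML to a repository so that
diffs stay readable.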

Q3. Must the directory structure of a mirror match that of the other
mirrors, or can the files be located anywhere, with MirrorBrain or an
external tool using the filename/size/hash to resolve their actual
locations?

Q4. Must the filenames match across mirrors, or can MirrorBrain or an
external tool use other means (such as file size+hash) to match them
up?
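
To illustrate what I mean in Q3 and Q4 by matching on size+hash, here
is a minimal sketch of an external tool that ignores paths and names
entirely and groups mirror listings by content. The listing format (one
tuple of mirror, path, size and SHA-256 per file) and the function name
group_by_content are assumptions for illustration; I don't know whether
MirrorBrain can do anything like this itself.

```python
from collections import defaultdict


def group_by_content(listings):
    """Group (mirror, path) pairs that share the same size and digest.

    listings: iterable of (mirror, path, size, sha256_hex) tuples.
    Returns a dict keyed by (size, lowercase digest).
    """
    groups = defaultdict(list)
    for mirror, path, size, digest in listings:
        groups[(size, digest.lower())].append((mirror, path))
    return dict(groups)


# Placeholder listings: the same content under different paths and names.
LISTINGS = [
    ("mirror1.example.org", "/pub/linux/example.iso", 1048576, "AB" * 32),
    ("mirror2.example.org", "/archive/2018/example-1.0.iso", 1048576, "ab" * 32),
    ("mirror2.example.org", "/archive/2018/other.iso", 2048, "cd" * 32),
]

if __name__ == "__main__":
    for (size, digest), locations in group_by_content(LISTINGS).items():
        print(size, digest[:12], locations)
```

With the placeholder data above, the first two entries land in the same
group even though neither the directory nor the filename matches.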

    Thanks,
    Derek


Received on Tue Mar 13 2018 - 04:35:07 GMT