While thinking about curl, wget, Chrome & Firefox adding Metalink support, I realized there is no documentation for Metalink/XML clients, just the XML format in RFC 5854. What about publishers & caches?

Here are some initial thoughts. Obviously the publisher section is most important regarding MirrorBrain, but its interaction with caches could be too. What do you think? Is there more to add?

Metalink/XML publishers

MUST use the correct MIME type for metalink files.

SHOULD advertise the Metalink/XML file with a Link HTTP header field from the regular download, for "transparent metalink" usage:
  ( Link: <http://example.com/example.ext.meta4>; rel=describedby; type="application/metalink4+xml" )

SHOULD publish with chunk hashes if error recovery ability is desired (and the files meet certain criteria like "large enough"; there is no point for a 10k file).

MAY do Accept header transparent content negotiation (deprecated?).

SHOULD use the Metalink/XML origin element with dynamic="true" if updated metalinks will be offered.

Metalink proxy cache consumers

???

Whitelist for trusted sources by domain name (e.g. kde.org, ubuntu.com, fedoraproject.org).

Detect & log metalink usage, so that sources can be added to the whitelist.

SHOULD use preferred mirrors (those that are most cost efficient/better/local).

Should they repair errors or use hashes? I guess so, but the client will be verifying hashes too.

Metalink/XML clients

For some of the download behavior, RFC 6249, Section 7 could be edited & re-used.

MUST! sanitize directory traversal information as specified in RFC 5854, Section 4.1.2.1.

MUST process metalinks available by URI. MAY (or SHOULD?) process local metalinks (like aria2's -M option).

MUST recognize metalinks by MIME type. (What about a misconfigured/unupdated server that does not send the correct MIME type?) SHOULD(?) the client recognize a metalink by file extension as well?

If HTTP client, MUST(?) support "transparent metalink" usage, going from a regular download to the Metalink/XML advertised with a Link header:
  ( Link: <http://example.com/example.ext.meta4>; rel=describedby; type="application/metalink4+xml" )

If HTTP client, MAY do Accept header transparent content negotiation (deprecated?).

If a file with the same name already exists, SHOULD verify the full file hash and, if the hash is correct, not re-download the file?

If the file exists and the full file hash is incorrect, MAY repair the file if chunk hashes exist. Otherwise, MAY write to another file name (file_2 or file(2), like some apps already do).

SHOULD (or MUST?) verify the full file hash after the download completes. On error, MUST describe the file as corrupted, and MAY re-download or keep the download?

SHOULD verify chunk hashes if available and re-get the parts with errors. SHOULD (or MAY?) be done during the initial download process; MAY be done after the download has completed, or to repair a file downloaded another way?

SHOULD(?) use BitTorrent chunk hashes with HTTP/FTP downloads to repair the file if the client supports torrents? (What if chunk hashes are present in both the torrent and the metalink; should one be preferred?)

If the client supports Metalink/XML (3/4) AND Metalink/HTTP, which info should be preferred (in case they differ)?

SHOULD make use of the Metalink/XML origin element, if dynamic="true", to check for an updated metalink?

-- 
(( Anthony Bryan ... Metalink [ http://www.metalinker.org ]
  )) Easier, More Reliable, Self Healing Downloads

Received on Tue May 22 2012 - 22:55:45 GMT
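As a rough illustration of the client-side checks discussed above (name sanitizing, full file hash verification, and using chunk hashes to find the parts that need repair), here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function names are hypothetical, SHA-256 is assumed as the hash type, hashes are assumed to arrive as hex strings already parsed out of the metalink, and the name check is one strict reading of the directory traversal rule, not a complete implementation of it.

```python
import hashlib


def check_relative_name(name):
    """Reject names carrying directory traversal information.

    Hypothetical helper. It is stricter than strictly needed (it also
    rejects empty path segments) and is not exhaustive.
    """
    parts = name.replace("\\", "/").split("/")
    if not name or any(p in ("", ".", "..") for p in parts):
        raise ValueError("unsafe file name in metalink: %r" % name)
    return "/".join(parts)


def file_digest(path, algo="sha256", block_size=1 << 20):
    """Return the hex digest of the whole file, read in blocks."""
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            h.update(block)
    return h.hexdigest()


def bad_pieces(path, piece_length, piece_hashes, algo="sha256"):
    """Return the indexes of pieces whose hash does not match.

    Pieces missing from a truncated file are reported as bad too.
    """
    bad = []
    with open(path, "rb") as f:
        for index, expected in enumerate(piece_hashes):
            data = f.read(piece_length)
            if hashlib.new(algo, data).hexdigest() != expected.lower():
                bad.append(index)
    return bad


def plan_for_existing_file(path, full_hash, piece_length=0,
                           piece_hashes=(), algo="sha256"):
    """Decide what to do with a file that already exists locally.

    Returns ("keep", []) if the full hash matches, ("repair", indexes)
    if chunk hashes are available, else ("rename", []) meaning "download
    to another file name". An empty index list with "repair" means the
    chunk hashes found nothing wrong even though the full hash failed.
    """
    if file_digest(path, algo) == full_hash.lower():
        return ("keep", [])
    if piece_length > 0 and piece_hashes:
        return ("repair", bad_pieces(path, piece_length, piece_hashes, algo))
    return ("rename", [])
```

A client could call check_relative_name() on the file name from the metalink before touching the disk, then plan_for_existing_file() when the target already exists, and re-request only the byte ranges of the returned piece indexes from a mirror. The same bad_pieces() check could also run after a normal download, which is one way to read the "during or after the download" question above.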