Don't send 404 to files that still exist on some of the mirrors #21

Open
poeml opened this issue Jun 5, 2015 · 2 comments

@poeml
Owner

poeml commented Jun 5, 2015

Issue migrated (2015-06-05) from old issue tracker http://mirrorbrain.org/issues/issue26

Title: Don't send 404 to files that still exist on some of the mirrors
Priority: bug
Status: chatting
Superseder:
Nosy List: poeml, rhertzog
Assigned To: poeml
Keywords:

msg70 (view) Author: poeml Date: 2009-12-01.15:35:29

When experimenting with a MirrorBrain setup that uses a dummy file tree, I ran into the
situation that the file tree wasn't complete, and I got 404s (file not found) in the
client. The same would happen if the tree is not up to date, and some new files are not
present yet.

When trying to keep the system running under adverse circumstances, it doesn't make sense
to error out in such a case, and it would probably make sense to redirect such requests to
one of the fallback servers. (I mean the fallback servers that became configurable
recently, in r7880.) Or maybe to a different set of servers; I don't know yet.

An obvious disadvantage is that those fallback servers end up getting all requests that
would otherwise lead to a 404. Those mirror servers must be assumed to be fairly complete
for the whole thing to make sense.

On the plus side, this way the redirector could keep running even when it loses its file
tree (disk crash).

Not to forget, this feature (and similar ones) could be made configurable, so the
behaviour could be switched on only in an emergency, thereby minimizing negative
consequences. Or, touching a file in the filesystem could signal to Apache that it needs
to go into "degraded mode".
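
A minimal sketch of the flag-file idea, in Python for illustration only (the real redirector is an Apache module). The flag path, the function names, and the example URL are all hypothetical; nothing here is existing MirrorBrain code or configuration.

```python
import os


def in_degraded_mode(flag_path):
    """Return True when an operator has touched the (hypothetical) flag file."""
    return os.path.exists(flag_path)


def handle_missing_file(request_path, flag_path, fallback_mirrors):
    """Decide how to answer a request whose file is absent from the local tree.

    Returns a (status, location) tuple.
    """
    if fallback_mirrors and in_degraded_mode(flag_path):
        # Blind redirect: we cannot know whether the fallback has the file,
        # so the fallback mirrors must be assumed to be fairly complete.
        base = fallback_mirrors[0].rstrip("/")
        return 302, base + "/" + request_path.lstrip("/")
    # Normal mode: keep the current behaviour and report the file as missing.
    return 404, None
```

With this shape, switching the behaviour on or off needs no restart: the operator creates or deletes the flag file, and the check is repeated on each request that would otherwise end in a 404.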

As a slight variant of this, Apache could still do database lookups, even if the file tree
is gone. That would preserve the ability to redirect to all mirrors that have a requested
file, and only those that have it (and not blindly).
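
The database-lookup variant could be sketched roughly as follows. This uses an in-memory SQLite database purely as a stand-in; the tables `mirror` and `mirror_file`, the example URLs, and the function names are assumptions made for illustration and do not reflect MirrorBrain's actual schema or code.

```python
import sqlite3


def open_demo_db():
    """Build a tiny in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE mirror (id INTEGER PRIMARY KEY, base_url TEXT NOT NULL);
        CREATE TABLE mirror_file (mirror_id INTEGER NOT NULL, path TEXT NOT NULL);
    """)
    conn.executemany("INSERT INTO mirror VALUES (?, ?)", [
        (1, "http://mirror-a.example.org/debian/"),
        (2, "http://mirror-b.example.org/debian/"),
    ])
    conn.executemany("INSERT INTO mirror_file VALUES (?, ?)", [
        (1, "pool/package_1.0_all.deb"),   # only the lagging mirror has 1.0
        (1, "pool/package_2.0_all.deb"),
        (2, "pool/package_2.0_all.deb"),
    ])
    return conn


def mirrors_with_file(conn, path):
    """Return full URLs on every mirror the database lists as having `path`."""
    rows = conn.execute(
        "SELECT m.base_url FROM mirror m "
        "JOIN mirror_file f ON f.mirror_id = m.id "
        "WHERE f.path = ? ORDER BY m.id",
        (path,),
    )
    return [base_url + path for (base_url,) in rows]


def respond(conn, path):
    """Redirect if any mirror has the file, regardless of the local file tree."""
    candidates = mirrors_with_file(conn, path)
    if candidates:
        return 302, candidates[0]
    return 404, None
```

Because the decision depends only on the database, a request is redirected only to mirrors known to hold the file, and a 404 is returned only when no mirror has it.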

The feature would need to hook in earlier in the request phase. It should be relatively
straightforward to implement.

msg542 (view) Author: rhertzog Date: 2014-02-17.14:44:22

I don't have a strong need to send 404 to fallback servers but I have a real
need to not send 404 when the requested file is still available on some of the
mirrors in the database, even though it's gone from the master copy.

I'm using mirrorbrain on top of Debian package archives. Some servers tend to
lag behind for a few hours/days for various reasons. Imagine a situation where
the package list references package_1.0_all.deb and all servers are in sync. The
master copy is updated with a new package list and file tree that contains
package_2.0_all.deb but not package_1.0_all.deb. Until the various mirrors get in
sync, people will be redirected to an old package list and they will request the
old package but they will get back 404 because the old package is gone from the
master copy while the local mirror they are usually redirected to still has the
required file.

Thus the default setup is actively harmful in that regard. If you consider that
serving old files might be a security issue, you might want to add a
configuration parameter to limit how long redirects to obsolete files are
allowed. But we need some time period where this is allowed or things
will break.
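
One way such a time limit could work is sketched below. It assumes the redirector records when a file disappeared from the master copy (`removed_at`), which the thread does not say MirrorBrain does; the function name and the `max_age` parameter are likewise hypothetical.

```python
from datetime import datetime, timedelta


def may_redirect_to_obsolete(removed_at, now, max_age):
    """Allow redirects to a file gone from the master copy for at most max_age.

    removed_at: when the file vanished from the master copy (assumed to be tracked)
    now:        the current time
    max_age:    the grace period, e.g. a hypothetical configuration value
    """
    age = now - removed_at
    return timedelta(0) <= age <= max_age
```

For example, with `max_age=timedelta(days=2)` a package removed six hours ago would still be redirected to a lagging mirror, while one removed three days ago would get a 404.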

I'm thus taking the liberty to change the title because I believe that's the
better way to solve your initial problem too.

msg544 (view) Author: poeml Date: 2014-02-19.12:42:56

Thanks for this thoughtful comment. MirrorBrain is indeed a bit
narrow-minded in this regard, because it simply assumes that this case
doesn't occur (or is not wanted). I kind of accepted this, but I also see
the limitation. Your suggestion makes a lot of sense. It would be
very clever to simply do a database lookup in case of a request on a
non-existing file. There might even be hashes in the database for such a
file, which a client could use to verify file integrity.
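
As an illustration of the integrity check mentioned here, a client that obtained a hash from the redirector could compare it with the downloaded bytes. This sketch assumes SHA-256 and hypothetical function names; the thread does not specify which hash types are stored or how they would reach the client.

```python
import hashlib


def sha256_hex(data):
    """Return the hex SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def verify_download(data, expected_sha256):
    """Check downloaded bytes against the hash supplied by the redirector."""
    return sha256_hex(data) == expected_sha256.strip().lower()
```

This matters most for files that are gone from the master copy, since the client can no longer compare a mirror's copy against the original.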

I have to think about the implementation. No time right now, but I
wanted to at least reply briefly for now. I heard you :-)

msg545 (view) Author: rhertzog Date: 2014-02-19.14:46:43

Thanks for the answer. Let me know if I can help.

I don't know if I can bribe you to implement my requests (also #150) but I'd be
willing to upload (and maintain) mirrorbrain to the official Debian archive in
exchange (I'm a Debian developer). :-)

msg546 (view) Author: poeml Date: 2014-02-20.00:49:04

Of course you can bribe me :-) In fact, one of my biggest wishes would
come true! I actually started looking into becoming a Debian package
maintainer recently, because the lack of MirrorBrain packages had
already become evident (and I stumbled over your fine manual!). I would be
thrilled if you could help out with packages. Then I can actually use
the time to work on MirrorBrain itself.
Having said that, you are of course also welcome to join in the hacking; and as
one of the next steps I'll collect ideas for implementation, as it's
always hard to find the right places in the code when not being familiar
with it. (And it serves as a refresher for myself.) This also applies to
the other request you sent. So stay tuned...

History
         Date              User       Action                 Args
2014-02-20 00:49:04 poeml           set    messages: + msg546
2014-02-19 14:46:43 rhertzog        set    messages: + msg545
2014-02-19 12:42:57 poeml           set    messages: + msg544
                                             priority: wish -> bug
                                             nosy: + rhertzog
                                             messages: + msg542
2014-02-17 14:44:23 rhertzog        set    title: Send 404s to certain fallback
                                             mirrors? -> Don't send 404 to files
                                             that still exist on some of the
                                             mirrors
2014-02-17 14:31:30 rhertzog        set    files: - ul36.html
2013-10-27 16:09:19 funnycafeteria6 set    files: + ul36.html
2009-12-01 15:35:30 poeml           create

(end of migrated issue)
@poeml poeml added the bug label Jun 5, 2015
@rhertzog

rhertzog commented Jun 5, 2015

Just subscribed to the ticket. And I want to remind you that this one is important for distro users like Kali.

@rhertzog

Ping on this ticket. Is there anything we can do to help get this one fixed? Can we hire you or offer a bounty? Feel free to contact me privately if you want to discuss details.
