[MERGE][158972] Don't use timeout in HttpServer

Vincent Ladeuil v.ladeuil+lp at free.fr
Thu Nov 1 21:18:39 GMT 2007


>>>>> "john" == John Arbash Meinel <john at arbash-meinel.com> writes:

    john> Vincent Ladeuil wrote:
    >> We had a problem on Mac OS X since the first http test server
    >> implementation.
    >> 
    >> It turns out it's due to the fact that the python http test
    >> server we build upon uses socket.makefile() internally and since
    >> http://docs.python.org/lib/socket-objects.html says timeout and
    >> makefile should not be mixed, we shouldn't.
    >> 
    >> Why didn't it break more obviously? I don't know.
    >> 
    >> But this patch removes the timeout use and the test suite still
    >> passes, and, more importantly, webdav test suite is passing again :)
    >> 
    >> This does not address
    >> https://bugs.edge.launchpad.net/bzr/+bug/69978; it may make it
    >> more prone to occur, but testing under both Gutsy and OS X didn't
    >> trigger it.
    >> 
    >> Vincent

    john> I would actually rather avoid "makefile()" than
    john> settimeout.

I don't feel like rewriting python httplib, so I guess that rules
that out ;-)

    john> If we can't make that work, then I'm okay with this.

    john> From what I remember, the main difference is that on
    john> Linux, it can block waiting for the timeout, or issue
    john> EAGAIN.

This is what we have always thought. I had doubts about it on
OS X for a long time, I had to use the same trick to make the
webdav test suite work, and after re-reading
http://docs.python.org/lib/socket-objects.html, I just had a
light-bulb moment and tried that patch.

    john> But on Windows (and maybe Mac?) if it would block *at
    john> all*, then it raises an exception. So basically
    john> settimeout sets "raise exception on blocking" and
    john> doesn't actually do any timeout.  However, if it
    john> wouldn't actually block (there is already data in the
    john> pipe) then you don't get a failure.

That's the theory and that is how it works under Linux. But under
OS X, you just get EAGAIN as long as you want, but never any data
(at least when there is a lot of data, as in
test_readv_with_adjust_for_latency).
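
For reference, here is a minimal sketch of the two configurations being
discussed: a socket read through makefile() with and without a timeout.
It assumes a current Python 3 and a local socket.socketpair(), so it does
not reproduce the endless EAGAIN seen on OS X; the helper name
read_line_via_makefile is hypothetical and is not part of bzr or of the
python http server.

```python
import socket


def read_line_via_makefile(timeout=None, payload=b""):
    """Read one line through makefile(); return the bytes, or None on timeout.

    Hypothetical helper for illustration only. With timeout=None the socket
    stays fully blocking, so an empty payload would block forever.
    """
    reader, writer = socket.socketpair()
    try:
        # None keeps the socket in plain blocking mode (no timeout at all).
        reader.settimeout(timeout)
        if payload:
            writer.sendall(payload)
        stream = reader.makefile("rb")
        try:
            return stream.readline()
        except socket.timeout:
            # The timeout fired while the file object was waiting for data.
            return None
        finally:
            stream.close()
    finally:
        reader.close()
        writer.close()


if __name__ == "__main__":
    # Blocking socket plus makefile(): the configuration the patch moves to.
    print(read_line_via_makefile(timeout=None, payload=b"hello\n"))
    # Timeout plus makefile() with no data pending: the read gives up.
    print(read_line_via_makefile(timeout=0.05))
```

The blocking call only returns because data was already sent; with no
data it would wait forever, which is exactly why the hang then has to be
caught from outside the server.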

    john> I guess with those tests, it just stresses it a bit
    john> more, and it is slightly more likely that the consumer
    john> thread tries to read before the producer thread gives
    john> it data.

No. I could never get the data, even when the producer gave all
of it.

    john> The reason to have settimeout is because otherwise a
    john> bad test would cause the pqm to hang, etc.

I know that (hence the reference to bug #69978), but I think (as
mentioned in #158972) that the right approach is to have a
watchdog in the test suite to kill hanging tests, or, as you and
Martin mentioned, at least a watchdog for the whole test suite on
pqm. I would prefer a watchdog for each test so that we don't
throw away the whole suite.
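
A rough sketch of what a per-test watchdog could look like, assuming each
test can be run as a separate Python process. This is only an
illustration: run_with_watchdog and the inline code strings are
hypothetical, and nothing here reflects how the bzr test suite or pqm
actually run tests.

```python
import subprocess
import sys


def run_with_watchdog(test_code, timeout):
    """Run test_code in a child interpreter and classify the outcome.

    Returns "passed", "failed" or "hung". On timeout, subprocess.run()
    kills the child before raising TimeoutExpired, so a hanging test
    cannot block the rest of the suite.
    """
    try:
        result = subprocess.run(
            [sys.executable, "-c", test_code],
            timeout=timeout,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return "hung"
    return "passed" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    # Stand-ins for real tests: one passes, one fails, one never returns.
    tests = {
        "ok": "pass",
        "broken": "raise SystemExit(1)",
        "stuck": "import time; time.sleep(60)",
    }
    for name, code in tests.items():
        print(name, run_with_watchdog(code, timeout=2))
```

Only the stuck test is killed; the others still report their own result,
which is the advantage over a single watchdog for the whole run.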

    john> I think the Launchpad pqm has a watchdog running, that
    john> kills the commit if it takes too long. Maybe we just
    john> need to do something like that?

Yes, but I think that should be another bug.

     Vincent


