AW: AW: AW: AW: bzr selftest (on solaris 10): too many open files
Vincent Ladeuil
v.ladeuil+lp at free.fr
Wed Nov 19 18:34:36 GMT 2008
>>>>> "jam" == John Arbash Meinel <john at arbash-meinel.com> writes:
jam> ...
>> *and* this is also true with python-2.5.2.
>>
>> Interrupting the selftest and using pfiles <pid> reveals that in
>> fact the open files are sockets...
>>
>> I have yet to understand why this happens on Solaris and not on
>> Linux but it means that only selftest is concerned by the problem
>> and should not have consequences when using bzr itself.
>>
>> I'll run the test suite by parts to check which tests are really
>> failing and keep you informed, but the important result is that
>> you should be safe using bzr.
>>
>> Vincent
>>
>>
jam> I would guess that this is the "spawned threads" not
jam> getting cleaned up quickly. (Run the test suite on other
jam> platforms and it will tell you that we leaked XX
jam> threads.)
It says so there too :)
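For reference, the kind of check behind such a "leaked threads" message can be sketched as below. This is only an illustration and not the actual bzr selftest code; the names find_leaked_threads, leaky_test and clean_test are made up for the example.

```python
import threading


def find_leaked_threads(test_func):
    """Run test_func and return the threads it started that are still alive."""
    before = set(threading.enumerate())
    test_func()
    return [t for t in threading.enumerate() if t not in before]


# Hypothetical stand-in for a server thread that nobody tells to stop.
stop_event = threading.Event()


def leaky_test():
    # Starts a thread and returns without stopping or joining it.
    threading.Thread(target=stop_event.wait, daemon=True).start()


def clean_test():
    # Starts a thread and waits for it to finish before returning.
    worker = threading.Thread(target=lambda: None)
    worker.start()
    worker.join()


leaked = find_leaked_threads(leaky_test)
clean = find_leaked_threads(clean_test)
stop_event.set()  # release the leaked thread so the example exits promptly
```

Each leaked server thread in the real suite would also keep its socket open, which is how the file descriptor count grows.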
jam> This is generally caused by any Remote test, because it
jam> spawns a smart server in the second thread, which waits
jam> on a socket to respond to user requests. And as we don't
jam> have an explicit "close()" for remote connections, the
jam> service tends to stay around for a while.
Same goes for http tests and maybe some others.
jam> So one possible fix would be to add a timeout to the
jam> socket, and if there hasn't been a request for XX
jam> seconds, go ahead and shutdown cleanly. Also we can have
jam> the test suite itself notice that the test has finished
jam> and poke at the thread to tell it to shut down.
jam> As I understand it, this causes problems on Windows
jam> because you can't mix a socket with a timeout with the
jam> file-like wrappers on a socket. Also, adding a
jam> "timeout" on a socket on windows is actually just
jam> setting O_NDELAY which will raise an exception if the
jam> socket request *would have* blocked.
Not only on Windows: the Python docs say you shouldn't mix the
two (file-like and timeout), and I fixed a couple of bugs around
that.
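One way to get an idle timeout without setting a timeout on the socket itself is to let select() do the waiting, so the sockets stay in blocking mode and the file-like wrappers remain safe to use. The following is a minimal sketch under that assumption, written for current Python; serve_until_idle and the echo behaviour are invented for the example and are not bzr's smart server.

```python
import select
import socket
import threading


def serve_until_idle(listener, idle_timeout, handled):
    """Echo one line per connection; stop after idle_timeout seconds of quiet."""
    try:
        while True:
            # select() carries the timeout, so no socket timeout is ever set.
            ready, _, _ = select.select([listener], [], [], idle_timeout)
            if not ready:
                break  # nobody connected in time: shut down cleanly
            conn, _ = listener.accept()
            with conn:
                # The connection is a plain blocking socket, so makefile() is fine.
                stream = conn.makefile("rwb")
                line = stream.readline()
                stream.write(line)
                stream.flush()
                stream.close()
                handled.append(line)
    finally:
        listener.close()


listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(1)
addr = listener.getsockname()
handled = []
server_thread = threading.Thread(
    target=serve_until_idle, args=(listener, 0.5, handled))
server_thread.start()

with socket.create_connection(addr) as client:
    client.sendall(b"hello\n")
    with client.makefile("rb") as reader:
        reply = reader.readline()

server_thread.join(5)  # the server exits by itself once it has been idle
```

Note that a client which connects and then sends nothing would still block readline() here; guarding the per-connection reads would need the same select() treatment.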
I also tried at one point to force the socket shutdown but
stopped doing it as that slowed down the test suite too much (or
was too invasive or ugly, I don't remember which).
Instead, I relied on gc to get rid of the server threads (and
sockets) and got warned by Robert about it :) We settled on the
'xxx tests leak sockets' warning.
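The explicit alternative to relying on gc, where the test itself tells the server thread to stop and closes the socket, could look roughly like this. It is a sketch using the standard socketserver module in current Python, not the bzr test servers; ServerFixture and EchoHandler are hypothetical names.

```python
import socket
import socketserver
import threading


class EchoHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # Echo a single line back to the client.
        self.wfile.write(self.rfile.readline())


class ServerFixture:
    """Start a server in a second thread and tear it down explicitly."""

    def start(self):
        self.server = socketserver.TCPServer(("127.0.0.1", 0), EchoHandler)
        self.thread = threading.Thread(
            target=self.server.serve_forever,
            kwargs={"poll_interval": 0.05})
        self.thread.start()

    def stop(self):
        self.server.shutdown()      # ask serve_forever() to return and wait for it
        self.server.server_close()  # close the listening socket right away
        self.thread.join()          # no thread left behind


fixture = ServerFixture()
fixture.start()
with socket.create_connection(fixture.server.server_address) as client:
    client.sendall(b"ping\n")
    with client.makefile("rb") as reader:
        reply = reader.readline()
fixture.stop()
```

The cost is the wait inside shutdown() on every test, bounded here by poll_interval, which is the sort of slowdown that has to be weighed against the leaked sockets.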
It may be time to revisit the problem and its various solutions
(2.6 also changes some details in the socket servers used for http,
but I was still able to avoid addressing the core problem :-).
Vincent