[Merge] lp:~vila/bzr/leaking-tests into lp:bzr

John Arbash Meinel john at arbash-meinel.com
Thu Jun 17 15:04:55 BST 2010




...

> 
>     > Basically, I wrapped socket.socket() to produce an object that
>     > tracks the traceback when it was created, and then caches that
>     > when .connect() is called. 
> 
> Interesting, care to show us how you achieve that ? That would certainly
> help a lot.

The branch is at lp:~jameinel/bzr/leaking-test-experiment

It is pretty hacked up. The code that matters for this case is:
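import socket
import traceback

class WrappedSocket(object):
    """Proxy for a real socket that remembers the stack that created it."""

    def __init__(self, sock, stack):
        self.__socket = sock
        self.__stack = stack

    def __getattr__(self, name):
        # Anything not overridden here is delegated to the real socket.
        return getattr(self.__socket, name)

    def connect(self, *args, **kwargs):
        val = self.__socket.connect(*args, **kwargs)
        # Key on our local address: that is what the other end sees as
        # getpeername().
        socket._created_tb[self.__socket.getsockname()] = self.__stack
        return val

# Maps a connected socket's local address to its creation stack.
socket._created_tb = {}
_socket_create = socket.socket
def _tb_socket(*args, **kwargs):
    sock = _socket_create(*args, **kwargs)
    stack = traceback.extract_stack()
    return WrappedSocket(sock, stack)
socket.socket = _tb_socket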

And then at the exception site:
try:
    return osutils.read_bytes_from_socket(
        self.socket, self._report_activity)
except socket.timeout:
    peer = self.socket.getpeername()
    sys.stderr.write('timeout while wanting to read %d bytes on %s peer: %s\n'
                     % (desired_count, self.socket.getsockname(),
                        peer))
    sys.stderr.writelines(traceback.format_list(socket._created_tb[peer]))
    raise

It is definitely hackish, but it gets the job done.

Also, because of "possible_transports" it is possible that the bzrdir
probe is just the *first* time it is getting connected, and the socket is
actually in active use until the end.

I don't really know. I *do* know that we have a fair amount of trouble
with Branch being self-referential, so it's always in a gc cycle.
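
To illustrate the gc-cycle point, a small sketch assuming CPython's reference
counting plus cyclic collector. SelfReferential is a hypothetical stand-in,
not bzrlib's Branch. The reading here (an assumption, not something stated
above) is that anything held by such an object, a socket included, is
released only when the cyclic collector runs rather than when the last
outside reference goes away.

```python
import gc
import weakref


class SelfReferential(object):
    """Hypothetical stand-in for an object that refers to itself."""

    def __init__(self):
        self.me = self  # creates a reference cycle


def cycle_survives_del():
    """Return (alive after del, alive after gc.collect())."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the automatic collector out of the way
    try:
        obj = SelfReferential()
        ref = weakref.ref(obj)
        del obj  # last outside reference is gone, but the cycle remains
        alive_after_del = ref() is not None
        gc.collect()  # an explicit collection breaks the cycle
        alive_after_collect = ref() is not None
        return alive_after_del, alive_after_collect
    finally:
        if was_enabled:
            gc.enable()
```

On CPython the object outlives the del and only goes away once the collector
has run.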

I tried just instrumenting all the places that looked reasonable for
'connect', but they never found the connection which was failing.

As mentioned before, if I just run the one test:

 bzr selftest -s bb.test_branch.TestBranchStacked --verbose smart_server

Nothing hangs. It only hangs if I run all of the TestBranchStacked tests:

 bzr selftest -s bb.test_branch.TestBranchStacked --verbose

In both cases *6* client requests get closed down. It is just that in
the multi-test case, one of them fails to complete, which hints that
maybe the socket *is* getting closed and, because of racing threads, the
close is not being seen.

John
=:->


