gc.garbage from paramiko

Andrew Bennetts andrew at canonical.com
Wed Feb 8 01:17:13 GMT 2006


On Tue, Feb 07, 2006 at 10:39:27AM -0800, Robey Pointer wrote:
[...]
> >
> >What do you need to do on __del__?  socket.sockets already close  
> >themselves just
> >like files.
> 
> I guess that was kind of vague, sorry. :)  Python borrows a concept  
> from c++, and causes files & sockets to auto-close themselves as soon  
> as the last reference goes away.  Since paramiko supplies both socket- 
> like things (Channels) and file-like things (SFTPFiles), I've found  
> that python coders expect them to have the same del behavior as  
> builtins, allowing things like this:
> 
>     open('email.txt', 'w').write('jdoe at example.com')

But why do you need to do anything explicitly in paramiko to support this?  The
only external resource you have (that I'm aware of) is a socket, and python's
socket type already takes care of closing that when the last reference goes
away.

By implementing __del__, you can actually make the situation worse: the
transport might never be collected (because it might be in a cycle), so its
socket is never reclaimed, thus causing a file descriptor leak (as well as the
obvious memory leak).

So, I guess I'm missing what finalisation behaviour you want to happen that
doesn't already happen without __del__?  Sending some sort of polite "quit"
message to the server rather than just closing the socket?

In some cases it can be possible to replace __del__ with weakref callbacks
(which don't have this problem), although I can't say whether that's a
possibility here without understanding what you're using __del__ for :)

Another clever trick can be to replace the __del__ on your main object with a
__del__ on an attribute of that object that is never referenced elsewhere.
Twisted's Deferred objects do this -- for various reasons, it really helps to
log when a Deferred with an unhandled error is collected.  We used to have a
__del__ on Deferred, which then would tend to cause some Deferreds to get
trapped in gc.garbage (and their warnings about unhandled errors never to be
logged, either, because they were never collected) -- and it was always hard to
predict if any given Deferred would be in a cycle or not.  Now they have a
"_debugInfo" attribute, which holds a "DebugInfo" object with a __del__.  The
DebugInfo object has no reference to its Deferred.  The Deferred will update the
DebugInfo as appropriate, so that it can always log the right warning (if any).
The DebugInfo object has a __del__, but is extremely simple and only ever
referenced by one object, and only has references to simple strings.  Deferreds
no longer have a __del__, so there's no problem with them being in cycles.

> That said, I think the __del__ in Transport may be unnecessary.  (I  
> don't think a Transport can be collected until all its channels are  
> already gone.)  Though I'm still interested in cleaning up any  
> remaining cycles that get created, because after reading threads on  
> python-dev about this, I don't think the GC is going to be fixed  
> anytime soon.

Yeah, I don't expect the GC in CPython to change significantly any time soon.

> >Cycles are really, really hard to avoid.  If a traceback occurs,  
> >that can create
> >cycles.  There's lots of ways to unexpectedly make a cycle.
> 
> How could a traceback create a cycle?

See the warning at the top of http://www.python.org/doc/lib/inspect-stack.html

A traceback raised from inside paramiko and caught by a caller might be used in
all sorts of ways that paramiko can't know about -- including keeping a
reference to it (for later inspection, perhaps).  So even though paramiko itself
might not be creating cycles that it won't break, callers of paramiko can
easily make other cycles with paramiko inadvertently.

> >There's a pretty obvious cycle right here:
> >
> >        self.packetizer.set_keepalive(interval,
> >            lambda x=self: x.global_request('keepalive at lag.net',  
> >wait=False))
> 
> Oops, yeah.  Added to my to-do list. :)  Although I don't think  
> that's the one you're seeing, because this code isn't called by bzr.

Yeah, I don't think this is my culprit either.  But ensuring that your cycle
avoidance/breaking is perfect seems like much harder work than not using
__del__...

-Andrew.





More information about the bazaar mailing list