[RFC] locking decorators and iterators

Aaron Bentley aaron.bentley at utoronto.ca
Tue Feb 13 14:21:20 GMT 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> I ran into a nice bug today, and its structural in our code.
> 
> This is because while extras() requires a read lock, and even takes one,
> the lock expires before any code from extras() executes: calling the
> method simply creates the iterator object.

Definitely an anti-pattern I've run into a few times.  I agree that
iterators should not take locks.  In Python 2.5, iterators can have
try/finally blocks around a yield, so it's a different story, but the
lock still needs to be taken inside the iterator.

So yes, lock + iterator bad.  A related issue is that iterators should
not update progress bars, because they can't know what is happening to
the progress bar while they aren't running.

I'm not sure whether the bug with extras is actually related to locking,
though.  Let's say extras actually did take out and hold a read lock for
the duration of its iteration.

This would still work:

tree.lock_write()
extras = tree.extras()
tree.add('foo')
'foo' in extras => True

Because taking a read lock is always permitted when you already have a
write lock.  Locks are a defense against others, not a defense against
yourself.

So I think you've identified two problems:
1. putting a lock decorator on an iterator
2. failing to treat an iterator as an iterator

But wait-- isn't the behavior you've identified correct?

# foo is unknown at this point, so it goes in the list
extras = list(tree.extras())

# Now future calls to tree.extras won't return foo
tree.add('foo')

# But here, we're testing a copy of extras that was made before adding
# foo
'foo' in extras => True

> Secondly, I wonder if we should just start returning lists, to avoid
> such race conditions as that above.

I think iterators can have significant advantages, despite their
possible drawbacks.  I don't think a blanket prohibition would serve us
well, but I think it's useful to include iter_ in their name, to
telegraph the fact that we're not returning a list.

But in the case of extras, using an iterator allows us to behave in a
more responsive way, because we can update the screen between listdir
calls.  For local trees, this isn't a great advantage, but for similar
remote operations, it could be.

Aaron
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFF0clg0F+nu1YWqI0RAhbIAJ948a0JmK1EU2WXxkDWTUmHurNOVACfZu8r
TenGUOI5GvNbzc2hKs+Nqn8=
=Wk1X
-----END PGP SIGNATURE-----



More information about the bazaar mailing list