[MERGE] TreeTransform avoids many renames when constructing trees
John Arbash Meinel
john at arbash-meinel.com
Mon Jun 4 23:32:09 BST 2007
John Arbash Meinel has voted +1.
Status is now: Approved
Comment:
Adding self.rename_count doesn't seem strictly necessary. Is it just for
testing?
I find names like "frexpar" to be more confusing than just saying
"first_dir" or "second_dir". Having more meaningful names would make it
easier to figure out what the tests are doing.
This test seems a little odd:
+ def test_change_parent(self):
+ transform, root = self.get_transform()
+ frexpar = transform.new_directory('frexpar', root)
+ elphaba = transform.new_file('elphaba', frexpar, 'contents')
+ oz = transform.new_directory('oz', root)
+ transform.adjust_path('elphaba', oz, elphaba)
+ transform.apply()
+ self.failIfExists(self.wt.abspath('frexpar/elphaba'))
+ self.failUnlessExists(self.wt.abspath('oz/elphaba'))
+ # rename libbo/new1 => frexpar, rename limbo/new3 => oz
+ # no rename for elphaba
+ self.failUnlessEqual(2, transform.rename_count)
In that you create the file in one directory *and* rename it into
another path within one .apply(). It would make more sense to me if you
created it in one .apply() and then renamed in another.
Is this just because you want the api to support any set of odd changes
in one go?
I also don't understand how the 'new_file()' call wouldn't have created
'elphaba' in 'frexpar' to start with, and then renamed it into 'oz'
after the adjust_path() call. Though maybe you aren't counting renames
outside of the '.apply()' call.
At the very least, it should be "limbo/new1 =>" not 'libbo'.
+ transform, root = self.get_transform()
+ frexpar = transform.new_directory('frexpar', root)
+ elphaba = transform.new_file('elphaba', frexpar, 'contents')
+ nessarose = transform.new_file('nessarose', frexpar,
'contents')
+ try:
+ transform.cancel_creation(frexpar)
+ except OSError:
+ self.fail('Failed to move elphaba before deleting frexpar')
+ transform.cancel_creation(nessarose)
+ transform.create_directory(frexpar)
+ try:
+ transform.cancel_creation(frexpar)
+ except OSError:
+ self.fail('Transform still thinks nessarose is a child of
frexpar')
+ oz = transform.new_directory('oz', root)
+ transform.adjust_path('elphaba', oz, elphaba)
+ transform.apply()
+ self.failIfExists(self.wt.abspath('frexpar'))
+ self.failUnlessExists(self.wt.abspath('oz/elphaba'))
+ # rename limbo/new-3 => oz, rename limbo/new-2 => elphaba
+ self.failUnlessEqual(2, transform.rename_count)
The try/except clauses here don't seem to match what I think the code is
actually doing. They may have been correct at one point, but it could
also be because 'elphaba' was considered a child of 'frexpar'.
Maybe it should just say "Transform still thinks frexpar has children".
And it should probably print out the OSError just in case it is
something else.
test_adjust_and_cancel()
doesn't this need to check that 2 directories are created, but nothing
else happens?
test_noname_contents
I don't understand how "create_directory()" can work without giving it a
final name for that path. Is it just that TT doesn't need it at create_*
time, just at finalize/apply time?
Maybe just a docstring to indicate you are checking exactly that
"create_directory()" doesn't need a final name yet.
I would be more intrigued if you could create a target directory which
claims to be a child of oz-id and neither one had a name yet. Just doing
it at the top level means you haven't tried to do any nesting yet. (It
may not be possible to create a child in a directory that doesn't have a
name).
The more I look at it, I don't understand why "test_change_parent()" can
have the final file in the correct directory, but "test_reuse_name()"
can't. Since it seems whatever trick TT uses to know that adjusting the
path from frexpar/elphaba => oz/elphaba is sufficient to track that
oz/elphaba (2) => oz/galinda. (or effectively new-3=>oz/galinda before
the .apply happens).
Also, we don't really need the top-level directories to have their final
names until the rename. Since we have to "os.rename('limbo/*', 'wt/*')"
anyway. I don't know whether this simplifies your code or not. But you
could use "oz == 'new-2'" until you go to "rename limbo/new-2 => oz"
You don't actually have to build up the temporary list here:
- for trans_id, kind in self._new_contents.iteritems():
- path = self._limbo_name(trans_id)
+ entries = [(self._limbo_name(t), t, k) for t, k in
+ self._new_contents.iteritems()]
+ for path, trans_id, kind in sorted(entries, reverse=True):
You could do it as a generator, which saves a bit of memory. Or you
could do it as:
entries = sorted((self._limbo_name(t), t, k)
for t, k in self._new_contents.iteritems(),
reverse=True)
I don't think it is strictly a big deal, but this is O(tree) so it is
nice to avoid extra copies when possible.
I'm guessing TT._limbo_name() could be optimized a bit (like have it do
1 lookup with a try/except KeyError rather than 2 with a "x in y: return
y[x]". But that sort of depends on whether x is usually present or not.
And from what I could see, it is a 50/50 mix. (You use it the first time
you create something, and then when you are renaming everything into
place)
Anyway, this is mostly just nit-picking. I'm more than happy to have
this merged as is. But if you want to address some of this, feel free.
:)
For details, see:
http://bundlebuggy.aaronbentley.com/request/%3C466465DE.30108%40utoronto.ca%3E
More information about the bazaar
mailing list