nested trees design-approach : composite trees vs iter_changes
Robert Collins
robert.collins at canonical.com
Tue May 5 00:32:54 BST 2009
I've been concerned about CompositeTree as an approach for a while, but
we haven't been able to actually get anywhere in discussing it; today
Aaron asked broadly that I either veto, or stop complaining. I hadn't
wanted to veto the vote because that seems to block people vs trying to
change their mind; however Aaron feels that this will force some
discussion to resolve it... so I've rejected in BB, and this mail is
hopefully sufficient to get discussion going.
CompositeTree worries me for a number of reasons.
* Our behaviour will change in unexpected ways depending on whether
a CompositeTree is being used. For instance 'tree.add' with a
specific fileid will error on duplicates when a CompositeTree is
used, and not (with the same tree, same fileid) if the programmer
forgets to use CompositeTree (CT from here on in).
* Because of that we're going to end up using CT everywhere, always.
Perhaps this is good, but if so we should change the factory objects
that create working trees and revision trees, rather than having
commands know to start using CT. As it stands people writing to and
in our API will get it wrong all the time. (e.g. loggerhead, PQM,
launchpad).
* Because CT is separate to the tree objects, we appear to need a new
index of nested trees for performance, which is redundant data. Its
not necessarily wrong to have it; but having to have it to have the
feature work at all is rather concerning.
* Because trees won't be responsible for their own nested items, it
may/will be very tricky for bzr-svn to do the right thing with
externals [it may need to represent them as something different
again, though I'm not 100% sure about that].
There is another concern, which is more weakly coupled to CT itself. I
think that CT makes this harder to change in the future, which is why I
mention it now:
* The CT design enforces 'file ids are unique' across all the nested
trees; this perhaps not meant to be a long term constraint, but it
is one that we can't enforce - unlike our normal behaviour users
will be able to make bzr break, and it won't be clear why, or how
they should fix it (and arguably they won't have done anything
wrong that would need fixing).
Aaron has expressed concerns about supported nested trees inside our
tree code. As most recently expressed to me:
* it makes everyone pay the cost of the feature
* it makes recursion mandatory
* it hides what's really going on/it makes the code trickier to debug.
* It's an all-or-nothing sort of solution. We need an incremental one.
Now, I think these are all addressable. Specifically:
* we want all commands and features to be nested tree capable, so
everyone should be paying the cost - and the cost should be zero when a
tree has no nested children.
* Its not hard to add 'recursive=True' default parameters to things
like changes_from and iter_changes, which would make recursion
optional. Having to use a different class to enable/disable recursion
is a pretty big hammer.
* I don't think it hides whats going on any more or less than having a
separate class which pulls data out of all the children. If anything,
having a call stack that shows iter_changes from each tree nesting
down would be clearer, rather than having one composite that has to
manually implement that same structure, is less clear.
* We can address the all-or-nothing issue by making recursion
controlling parameters default to False initially, with a plan to
switch them to True as soon as that doesn't break the test suite.
-Rob
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 197 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20090505/47d000e2/attachment.pgp
More information about the bazaar
mailing list