non-recursive status of a directory?
Robert Collins
robertc at robertcollins.net
Sat Jun 7 03:55:12 BST 2008
On Sat, 2008-06-07 at 10:30 +1000, Mark Hammond wrote:
> > Given what I'm hearing, I don't really perceive non-recursive as a
> > need for the tbzr code:
>
> Sure - it's not a *need* - but please take my word for it that tbzr's
> implementation would be "faster", in terms of user responsiveness,
> with it, and quite a bit simpler. Similarly, the people running 'bzr
> status' also don't *need* non-recursive status, but the person who
> added the "todo" note, and the people who implemented svn thought it
> might be a helpful option to provide :)
>
> Note that svn allows recursive or non-recursive. tsvn explicitly
> chooses the non-recursive option for good reason. tsvn has lots of
> real-world based tuning tweaks, which is why I'm trying to follow
> their model as closely as possible. I think the reality is that on
> Windows, tbzr will be compared performance wise against tsvn and
> people will draw conclusions about the performance of bzr versus svn
> from that.
I certainly don't want to impede performance. Note, however, that
references to svn for performance are, well, problematic. Last I recall
checking, our status leaves svn st for dead; svn *requires* non-recursive
mode because of a fundamentally problematic approach to representing
branches. Neither of these makes the fact that svn has a non-recursive
facility a compelling reason for bzr (or tbzr) to have one.
The use case for 'I need an emblem for the contents of <dir>' is
certainly something to support. But that isn't the same as a
non-recursive iter_changes. Specifically you don't care about all the
changes in a non-visible directory, you only care *if* there are changes
anywhere down-tree from <path>. It's kind of like 'diff' vs 'cmp -s'; in
the former case you want the details, in the latter case you just want a
boolean.
> So no, I don't *need* it, but I believe I've excellent reasons for
> wanting it.
>
> > You describe an iterative process whereby details on a directory
> > accumulate, starting with 'not modified' and ending up with 'a
> > reasonable UI flag'.
>
> The thing is, in many cases, it is *not* necessary to recurse to the
> bottom of a tree to find the full status of a directory, so in some
> cases, the bottom children will *never* be looked at. As soon as you
> find a modified child, at any depth, you could present the status of
> that directory. Thus, asking bzr to recurse fully means far more
> operations than necessary would have occurred before the state can be
> shown to the user (or alternatively, more operations are wasted after
> the status is shown). On a large tree, this could be a significant
> win.
If I have a single directory with 10K files in it, the same argument
applies to stopping at the first modified file (when reporting on the
directory above).
And in this case I think:
    generator = tree.iter_changes(specific_files=['path_to_dir_to_scan'])
    try:
        generator.next()
    except StopIteration:
        modified = False
    else:
        modified = True
    del generator
will do the least possible work to answer the question. In a totally
unmodified tree it will stat everything. In a modified tree with a file
changed in the directory it will stop at the first file.
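
To make the early-exit behaviour concrete, here is a minimal, self-contained
sketch. It does not use bzrlib at all: fake_iter_changes is a hypothetical
stand-in for tree.iter_changes that lazily yields modified paths and records
which paths it had to examine, and has_changes is an illustrative helper
name. It assumes only that the real iterator is lazy in the same way.

```python
def fake_iter_changes(paths, modified, visited):
    """Hypothetical stand-in for tree.iter_changes: lazily yield modified paths."""
    for path in paths:
        visited.append(path)  # record every path that had to be examined
        if path in modified:
            yield path


def has_changes(changes):
    """Return True as soon as the iterator yields one change, else False."""
    for _ in changes:
        return True
    return False


# A single directory with 10K files in it.
paths = ["dir/file%d" % i for i in range(10000)]

# One modified file near the start: iteration stops right after finding it.
visited = []
dirty = has_changes(fake_iter_changes(paths, {"dir/file3"}, visited))

# Totally unmodified directory: every path has to be examined.
visited_clean = []
clean = has_changes(fake_iter_changes(paths, set(), visited_clean))
```

In the first case only the paths up to and including the modified file are
examined; in the second, all 10000 are, which matches the two cases
described above.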
> It's not yet clear to me that iterating file by file will give me
> "missing" items etc, and I'd really be surprised if that didn't come
> at a significant performance cost - but regardless, I'm a little
> confused by your position. Is it:
>
> * tbzr doesn't need, or even *want* non-recursive status, and while
> you think it does you don't understand the problem.
>
> or
>
> * bzr doesn't provide non-recursive status, and it's such an obscure
> requirement it is unlikely to do so in the short term. Please make
> alternative arrangements.
>
> or something else?
It's: 'directories are a poor proxy for dividing work up within the
system'.
-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.