non-recursive status of a directory?

Robert Collins robertc at robertcollins.net
Sat Jun 7 03:55:12 BST 2008


On Sat, 2008-06-07 at 10:30 +1000, Mark Hammond wrote:
> > Given what I'm hearing, I don't really perceive non-recursive as a
> > need for the tbzr code:
> 
> Sure - it's not a *need* - but please take my word for it that tbzr's
> implementation would be "faster", in terms of user responsiveness,
> with it, and quite a bit simpler.  Similarly, the people running 'bzr
> status' also don't *need* non-recursive status, but the person who
> added the "todo" note, and the people who implemented svn thought it
> might be a helpful option to provide :)
>
> Note that svn allows recursive or non-recursive.  tsvn explicitly
> chooses the non-recursive option for good reason.  tsvn has lots of
> real-world based tuning tweaks, which is why I'm trying to follow
> their model as closely as possible.  I think the reality is that on
> Windows, tbzr will be compared performance wise against tsvn and
> people will draw conclusions about the performance of bzr versus svn
> from that.

I certainly don't want to impede performance. Note however that
references to svn for performance are, well, problematic. Last I recall
checking, our status leaves svn st for dead; svn *requires* non-recursive
mode because of a fundamentally problematic approach to representing
branches. Neither of these makes the fact that svn has a non-recursive
facility a compelling reason for bzr (or tbzr) to have one.

The use case for 'I need an emblem for the contents of <dir>' is
certainly something to support. But that isn't the same as a
non-recursive iter_changes. Specifically you don't care about all the
changes in a non-visible directory, you only care *if* there are changes
anywhere down-tree from <path>. It's kind of like 'diff' vs 'cmp -s'; in
the former case you want the details, in the latter case you just want a
boolean.
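
To make the 'diff' vs 'cmp -s' distinction concrete, here is a minimal
sketch. It does not use bzrlib at all: `entries` is a hypothetical stand-in
for a tree, given as (path, recorded_value, current_value) tuples, and the
three function names are made up for illustration.

```python
def iter_changes_under(entries, prefix):
    """Lazily yield (path, old, new) for changed entries at or below prefix.

    `entries` is a hypothetical stand-in for a tree: an iterable of
    (path, recorded_value, current_value) tuples.
    """
    for path, old, new in entries:
        in_scope = path == prefix or path.startswith(prefix + "/")
        if in_scope and old != new:
            yield path, old, new


def list_changes(entries, prefix):
    # The 'diff' case: every change is wanted, so all entries are examined.
    return [path for path, _old, _new in iter_changes_under(entries, prefix)]


def has_changes(entries, prefix):
    # The 'cmp -s' case: any() stops consuming at the first change it sees.
    return any(True for _ in iter_changes_under(entries, prefix))
```

An emblem for a directory only needs has_changes(); list_changes() is what
a full status report would want.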

> So no, I don't *need* it, but I believe I've excellent reasons for
> wanting it.
> 
> > You describe an iterative process whereby details on a directory
> > accumulate, starting with 'not modified' and ending up with 'a
> > reasonable UI flag'.
> 
> The thing is, in many cases, it is *not* necessary to recurse to the
> bottom of a tree to find the full status of a directory, so in some
> cases, the bottom children will *never* be looked at.  As soon as you
> find a modified child, at any depth, you could present the status of
> that directory.  Thus, asking bzr to recurse fully means far more
> operations than necessary would have occurred before the state can be
> shown to the user (or alternatively, more operations are wasted after
> the status is shown).  On a large tree, this could be a significant
> win.

If I have a single directory with 10K files in it, the same argument
applies to stopping at the first modified file (when reporting on the
directory above).

And in this case I think:

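# Ask only for changes at or below the directory of interest.
generator = tree.iter_changes(specific_files=['path_to_dir_to_scan'])
try:
    # Pull just the first change; nothing further is examined after it.
    next(generator)
except StopIteration:
    # The generator was exhausted without reporting any change.
    modified = False
else:
    modified = True
del generator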

will do the least possible work to answer the question. In a totally
unmodified tree it will stat everything. In a modified tree with a file
changed in the directory it will stop at the first changed file.
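
As a rough illustration of how much work each case costs, the sketch below
counts how many files a lazy scan looks at before an answer is available.
Nothing here is bzrlib: `scan` and `is_modified` are hypothetical names, and
appending to `examined` stands in for a stat() call.

```python
def scan(files, examined):
    """Hypothetical stand-in for a lazy change iterator.

    `files` is a list of (name, changed) pairs. Each name is appended to
    `examined` when it is looked at, standing in for a stat() call.
    """
    for name, changed in files:
        examined.append(name)
        if changed:
            yield name


def is_modified(files):
    """Return (modified, number_of_files_examined)."""
    examined = []
    sentinel = object()
    # next() with a default pulls at most one change from the generator.
    first = next(scan(files, examined), sentinel)
    return first is not sentinel, len(examined)
```

With 10K unmodified files, is_modified() has to examine all 10K before it
can say False. If the second file is changed, it answers True after
examining two.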

> It's not yet clear to me that iterating file by file will give me
> "missing" items etc, and I'd really be surprised if that didn't come
> at a significant performance cost - but regardless, I'm a little
> confused by your position.  Is it:
> 
> * tbzr doesn't need, or even *want* non-recursive status, and while
> you think it does you don't understand the problem.
>
> or
> 
> * bzr doesn't provide non-recursive status, and it's such an obscure
> requirement it is unlikely to do so in the short term.  Please make
> alternative arrangements.
> 
> or something else?

It's: 'directories are a poor proxy for dividing work up within the
system'.

-Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.