non-recursive status of a directory?

Robert Collins robertc at robertcollins.net
Sat Jun 7 06:14:04 BST 2008


On Sat, 2008-06-07 at 14:13 +1000, Mark Hammond wrote:
> > I certainly don't want to impede performance. Note however that
> > references to svn for performance are - well problematic. Last I recall
> > checking our status leaves svn st for dead;
> 
> But isn't it likely that on a very large tree, asking svn for a
> non-recursive status of the root is likely to compare favorably to bzr
> doing a full status?

I'm guessing you mean 'it is not likely'? I don't know, as I don't
understand where svn's overhead comes from in sufficient detail to make
predictions.

> > In a totally unmodified tree it will stat everything.
> 
> And there's the rub!  tsvn has a very smart "crawler" - it crawls
> using a thread at idle priority, it's capable of knowing when Explorer
> is currently asking for items (so crawling is suspended), handling
> that the directory being requested is currently being crawled, uses a
> smart cache to prevent crawling directories it considers "fresh
> enough" (which works fine given that directories are "watched" so are
> efficiently notified of change), etc.

(speculation) I think it's possible that tsvn needs this smart crawler
because of other limitations of svn. In my experience it is a mistake to
look at another piece of software and clone its ideas and tricks
without thoroughly understanding what each one brings to the table.

Concretely: if you want to write such a crawler for tbzr, and you think
it's needed, I fully support that.
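
For illustration only, the "fresh enough" cache with watcher
invalidation described in the quoted text might look roughly like the
Python sketch below. It is not taken from tsvn or tbzr: the class name
DirStatusCache, the max_age and clock parameters, and the compute
callback are all assumptions made up for this example.

```python
import time


class DirStatusCache:
    """Hypothetical per-directory status cache with a freshness window."""

    def __init__(self, compute, max_age=5.0, clock=time.monotonic):
        self._compute = compute    # callable: directory path -> status value
        self._max_age = max_age    # seconds an entry counts as "fresh enough"
        self._clock = clock        # injectable so the cache can be tested
        self._entries = {}         # path -> (timestamp, status)

    def get(self, path):
        """Return the cached status if still fresh, else recompute it."""
        now = self._clock()
        entry = self._entries.get(path)
        if entry is not None and now - entry[0] <= self._max_age:
            return entry[1]
        status = self._compute(path)
        self._entries[path] = (now, status)
        return status

    def invalidate(self, path):
        """Drop an entry; a directory watcher would call this on change."""
        self._entries.pop(path, None)
```

A crawler would call get() for each directory it visits, and the
change-notification handler would call invalidate(), so unchanged
directories are only recomputed once the freshness window has expired.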

> > > but regardless, I'm a little confused by your position.  Is it:
> > >
> > > * tbzr doesn't need, or even *want* non-recursive status, and while
> > > you think it does you don't understand the problem.
> > >
> > > or
> > >
> > > * bzr doesn't provide non-recursive status, and it's such an obscure
> > > requirement it is unlikely to do so in the short term.  Please make
> > > alternative arrangements.
> > >
> > > or something else?
> > 
> > It's: 'directories are a poor proxy for dividing work up within the
> > system'.
> 
> That is the same as the first option: while your statement is correct
> in the general case, this isn't the general case.  I maintain that for
> tbzr, it *is* a more appropriate option.  If you think it is
> productive to continue thrashing out the tbzr design until that
> becomes evident to you (or conversely, it becomes evident to me how we
> can be as efficient without it) then I will reluctantly do so.  On the
> other hand, if it can be accepted that tbzr would benefit from
> non-recursive status updates, the question simply becomes whether tbzr
> can have that facility, and if so, how.

Mmm, I don't consider my answer the same as the first option;
nevertheless, I don't want to spin on this, as I don't actually have a
goal of convincing you that you are wrong; I've simply been expressing
my doubts.

The tree delta logic your trees will be dealing with is centralised in
bzrlib.workingtree_4, in the WorkingTree4.iter_changes function. This is
unfortunately highly performance-sensitive code: changes to it need to
be benchmarked with care, as it is easy to throw status performance out
the window (I've done it myself). I hope we can find some way to do what
you need without costing performance in the scenarios that users
benchmark us on (of which console status is one of the most common).
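
As a rough idea of what a careful before/after measurement could look
like, here is a minimal Python sketch. It does not use bzrlib or its
benchmark suite: stat_everything is only a stand-in workload that stats
every entry in a tree, and both function names are made up for this
example. Taking the fastest of several runs is a common convention for
reducing noise, not something prescribed by bzr.

```python
import os
import timeit


def stat_everything(root):
    """Stat every file and directory below root; return how many were statted."""
    count = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            os.lstat(os.path.join(dirpath, name))
            count += 1
    return count


def best_time(func, repeat=5):
    """Call func `repeat` times; return the fastest wall-clock time in seconds."""
    return min(timeit.repeat(func, number=1, repeat=repeat))
```

For example, best_time(lambda: stat_everything(".")) could be recorded
before and after a change, on the same tree and with a warm disk cache,
so that the two numbers are comparable.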

-Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.