[MERGE] Redo annotate more simply, using just the public interfaces for VersionedFiles.

Robert Collins robertc at robertcollins.net
Wed Jun 25 21:24:29 BST 2008


On Wed, 2008-06-25 at 09:21 -0500, John Arbash Meinel wrote:
> Robert Collins wrote:
> | This builds on the stacking knits patch to allow annotate to work.
> |
> | It's a lot simpler than the previous code [John suggested this
> | approach :P].
> |
> | It likely has higher memory use, but shouldn't AFAICT be slower, because
> | this will issue a single get_record_stream call (and that should be
> | doing optimisations anyway).
> |
> | -Rob
> |
> 
> It will be slower primarily because we can't use the cached matching
> blocks. So we have to re-diff every file.

There is self._extract_matching_blocks. Oh, the optimiser is missing
there on knits - I'll do a patch to reinstate that today. But bundle
generation uses that for make_mpdiffs and is still totally generic.

> I don't know about the specific memory use, but a lot of the complexity
> was allowing the records to come back in non-topological (pack optimal?)
> order, and deferring entries that still needed parents.

This conflicts with wanting to not cache all the texts :P. That said,
get_record_stream is expected to make that optimisation for us (to
balance memory use against optimal access to pack contents). I think it's
well worth doing better than get_record_stream does - but it should not
be in the annotate-specific code.
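
To make the "deferring entries that still needed parents" idea concrete,
here is a rough sketch of the technique on its own, outside annotate. It
is not bzrlib code: process_in_any_order, parent_map and handle are
made-up names for illustration, and it assumes every parent eventually
shows up in the stream.

```python
def process_in_any_order(records, parent_map, handle):
    """Process (key, text) records that may arrive in non-topological order.

    Hypothetical helper, not part of bzrlib. parent_map maps each key to a
    tuple of parent keys; handle(key, text, parent_results) returns the
    result for key once all of its parents have been handled.
    """
    done = {}      # key -> result of handle()
    pending = {}   # missing parent key -> [(key, text)] waiting on it
    order = []     # keys in the order they were actually handled

    for record in records:
        stack = [record]
        while stack:
            key, text = stack.pop()
            missing = [p for p in parent_map[key] if p not in done]
            if missing:
                # Defer until the first missing parent has been handled.
                pending.setdefault(missing[0], []).append((key, text))
                continue
            done[key] = handle(key, text, [done[p] for p in parent_map[key]])
            order.append(key)
            # Anything that was waiting on this key can be retried now.
            stack.extend(pending.pop(key, []))

    if pending:
        raise ValueError("parents never arrived: %r" % sorted(pending))
    return done, order
```

Note that the deferred texts are held in memory until their parents
arrive, which is exactly the tension with not caching all the texts.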

> Oh, and not caching all of the texts. That is actually a rather big
> memory issue. It tracked when texts would no longer be needed. But as I
> said on IRC, there is a difference between working at all and being
> fast if only it actually worked.

Ah yes, trimming cached_parents when there is no text left that has a
given parent as an ancestor. I'm happy to add that back in as part of
the review of this patch.
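
Roughly what I have in mind for the trimming, as a standalone sketch
rather than the actual patch: count how many not-yet-processed texts
still need each parent, and drop a cached text when that count hits
zero. The names (walk_with_trimmed_cache, get_text, visit) are invented
for the example, it counts direct children only, and it assumes the keys
come in topological order with every parent present.

```python
def walk_with_trimmed_cache(keys, parent_map, get_text, visit):
    """Visit texts parents-first, caching a text only while it is needed.

    Hypothetical helper, not part of bzrlib. keys must be in topological
    order; parent_map maps key -> tuple of parent keys (all in keys).
    Returns the peak number of texts held in the cache at once.
    """
    # How many still-unprocessed children reference each parent.
    children_left = {}
    for key in keys:
        for parent in parent_map[key]:
            children_left[parent] = children_left.get(parent, 0) + 1

    cached = {}
    peak = 0
    for key in keys:
        text = get_text(key)
        visit(key, text, [cached[p] for p in parent_map[key]])
        # This child is done, so each parent is needed by one fewer text.
        for parent in parent_map[key]:
            children_left[parent] -= 1
            if children_left[parent] == 0:
                del cached[parent]
        # Only keep this text if some later child will ask for it.
        if children_left.get(key, 0):
            cached[key] = text
        peak = max(peak, len(cached))
    return peak
```

On a linear history this keeps a single text cached at any time; wide
merge-heavy graphs will hold more.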

> Though I was thinking you would do it based on whether we were stacked
> or not, as this pretty much nukes all of Pack based annotation,
> regardless of whether you are stacking on another repository.

Well, I could put an if len(self._knit._fallback_vfs) condition in there
if you like. That will still nuke what I suspect will be the common case
on big projects. Getting an annotation cache back would be nice :).

I'd actually like the annotation for this to use annotate() at the end
of the reachable graph, but that's even more complex (and doesn't even
consider recursive graph edges).

-Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.