[MERGE] Redo annotate more simply, using just the public interfaces for VersionedFiles.
Robert Collins
robertc at robertcollins.net
Wed Jun 25 21:24:29 BST 2008
On Wed, 2008-06-25 at 09:21 -0500, John Arbash Meinel wrote:
>
> Robert Collins wrote:
> | This builds on the stacking knits patch to allow annotate to work.
> |
> | It's a lot simpler than the previous code [John suggested this
> | approach :P].
> |
> | It likely has higher memory use, but shouldn't AFAICT be slower, because
> | this will issue a single get_record_stream call (and that should be
> | doing optimisations anyway).
> |
> | -Rob
> |
>
> It will be slower primarily because we can't use the cached matching
> blocks. So we have to re-diff every file.
There is self._extract_matching_blocks. Oh, the optimiser is missing
there on knits - I'll do a patch to reinstate that today. But bundle
generation uses that for make_mpdiffs and is still totally generic.
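To make the cost being discussed concrete, here is a minimal sketch of annotating a child text from one parent, either by re-diffing or by reusing matching blocks that were computed earlier. It is not bzrlib code: `matching_blocks` and `annotate_child` are hypothetical names, only a single parent is handled, and Python's standard `difflib.SequenceMatcher` stands in for whatever matcher the knit code really uses.

```python
from difflib import SequenceMatcher


def matching_blocks(parent_lines, child_lines):
    """Return (parent_start, child_start, length) triples for shared runs."""
    matcher = SequenceMatcher(None, parent_lines, child_lines, autojunk=False)
    # get_matching_blocks() ends with a zero-length sentinel; drop it.
    return [tuple(b) for b in matcher.get_matching_blocks() if b.size]


def annotate_child(parent_annotated, child_lines, child_key, blocks=None):
    """Annotate child_lines against a single annotated parent (assumption).

    parent_annotated is a list of (origin_key, line) pairs. If cached
    blocks are supplied the diff is skipped; otherwise we re-diff.
    """
    if blocks is None:
        parent_lines = [line for _, line in parent_annotated]
        blocks = matching_blocks(parent_lines, child_lines)
    # Start by blaming every line on the child, then copy the parent's
    # blame over each run of lines the two texts share.
    result = [(child_key, line) for line in child_lines]
    for p_start, c_start, length in blocks:
        for offset in range(length):
            result[c_start + offset] = parent_annotated[p_start + offset]
    return result
```

Passing `blocks` skips the `SequenceMatcher` call entirely, which is the saving John is pointing at when he says every file has to be re-diffed.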
> I don't know about the specific memory use, but a lot of the complexity
> was allowing the records to come back in non-topological (pack optimal?)
> order, and deferring entries that still needed parents.
This conflicts with wanting to not cache all the texts :P. That said,
get_record_stream is expected to make that optimisation for us. (To
balance memory and optimal access to pack contents). I think it's well
worth doing better than get_record_stream does - but it should not be in
the annotate specific code.
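A rough sketch of the "defer entries that still need parents" idea, for records arriving in arbitrary order. Everything here is an assumption for illustration: `annotate_in_arrival_order` and `annotate_one` are hypothetical names, records are plain `(key, lines)` pairs rather than real record-stream objects, and every parent is assumed to show up in the stream eventually.

```python
def annotate_in_arrival_order(records, parent_map, annotate_one):
    """Annotate records that may arrive in non-topological order.

    records: iterable of (key, lines) pairs in any order.
    parent_map: dict mapping key -> tuple of parent keys.
    annotate_one: callable(key, lines, parent_annotations) -> annotation.
    """
    done = {}      # key -> finished annotation
    waiting = {}   # missing parent key -> [(key, lines), ...] blocked on it

    for record in records:
        stack = [record]
        while stack:
            key, lines = stack.pop()
            missing = [p for p in parent_map[key] if p not in done]
            if missing:
                # Park the text until the first missing parent is finished.
                waiting.setdefault(missing[0], []).append((key, lines))
                continue
            done[key] = annotate_one(
                key, lines, [done[p] for p in parent_map[key]])
            # Anything that was blocked on this key can be retried now.
            stack.extend(waiting.pop(key, []))

    if waiting:
        raise ValueError("parents never arrived: %r" % sorted(waiting))
    return done
```

Note that every deferred entry holds its full text in `waiting` until its parents turn up, which is exactly the tension with not caching all the texts.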
> Oh, and not caching all of the texts. That is actually a rather big
> memory issue. It tracked when texts would no longer be needed. But as I
> said on IRC, there is a difference between working at all and being
> fast if only it actually worked.
Ah yes, trimming cached_parents when there is no text left that has a
given parent as an ancestor. I'm happy to add that back in as part of
the review of this patch.
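One way the trimming could look, as a sketch only: count how many not-yet-annotated children still need each parent, and drop a cached annotation when that count reaches zero. The names `annotate_with_trimming`, `annotate_one` and `wanted` are hypothetical, keys are assumed to arrive in topological order with every parent present, and the count tracks direct children only.

```python
def annotate_with_trimming(ordered_keys, parent_map, annotate_one, wanted):
    """Annotate keys in topological order, trimming the parent cache.

    Returns (annotations for the wanted keys, peak cache size).
    """
    # How many children still need each parent's annotation.
    pending_children = {}
    for key in ordered_keys:
        for parent in parent_map[key]:
            pending_children[parent] = pending_children.get(parent, 0) + 1

    cache = {}
    results = {}
    peak = 0
    for key in ordered_keys:
        parents = parent_map[key]
        annotation = annotate_one(key, [cache[p] for p in parents])
        if key in wanted:
            results[key] = annotation
        if pending_children.get(key, 0):
            # Only cache what some later child will actually read.
            cache[key] = annotation
        peak = max(peak, len(cache))
        for parent in parents:
            pending_children[parent] -= 1
            if pending_children[parent] == 0:
                # No remaining child needs this parent; free it.
                del cache[parent]
    return results, peak
```

On a linear history this keeps at most two annotations alive at once, instead of one per revision.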
> Though I was thinking you would do it based on whether we were stacked
> or not, as this pretty much nukes all of Pack based annotation,
> regardless of whether you are stacking on another repository.
Well, I could put an if len(self._knit._fallback_vfs) condition in there
if you like. That will still nuke what I suspect will be the common case
on big projects. Getting an annotation cache back would be nice :).
I'd actually like the annotation for this to use annotate() at the end
of the reachable graph, but that's even more complex (and doesn't even
consider recursive graph edges).
-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.