brainstorming - what to do with annotations

Robert Collins robertc at robertcollins.net
Tue Sep 11 03:22:41 BST 2007


On Mon, 2007-09-10 at 11:48 -0400, Aaron Bentley wrote:

> So it's not a complete solution.  It seems to me that cached fulltext
> annotations are required somewhere, somehow.  With my algorithm, we can
> zip through tens of revisions quickly enough.  It's when we get to
> hundreds that performance becomes problematic.



> Given that annotations use data we have already calculated, (i.e, text
> comparisons), they ought to be cheap.  Perhaps it's merely the
> commingling of annotation and line that makes it expensive?  But
> annotations can be stored separately, and full-annotation-data doesn't
> have to be synchronized with fulltexts.

I've not spent much time on delta performance yet, but:
 - no-parent commits require an extra full content copy because we
mingle the data.
 - n-parent commits need, regardless of delta compression, the same
extra content copy, then one delta per parent to assign new lines, and
finally commingled serialisation again.

> > But what about ghosts
> > and shallow repositories -  do they still have full texts
> 
> I think every repository format needs fulltexts, so that it doesn't
> scale O(n) with the size of history.  But see above-- I think
> full-annotations and fulltexts can be decoupled.  Full-annotations might
> happen whenever a text has a ghost parent.  Or at least, annotations of
> the lines introduced.

I agree with decoupling. I think annotations into ghost regions are
interesting, because sometimes you'll want them (e.g. in shallow
branches) and other times you won't (nuclear launch code
scenarios).

> > and when
> > there is a full text in an mpdiff stream is it annotated? Can we store
> > annotations in a different way that's more efficient? ....
> 
> Some format that listed revision, followed by ranges, would probably be
> much more efficient:
> 
> abentley at home-randomstuff: 2-3, 5-7
> abentley at home-randomstuff: 1-1, 4-4

It would certainly be less data to handle.
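
To make the size argument concrete, here is a rough sketch of that
revision-plus-ranges layout. It is not existing bzrlib code: the
function names and the "rev-a"/"rev-b" revision ids are made up for
illustration, and it assumes 1-based inclusive line ranges as in the
example above.

```python
from itertools import groupby


def encode_ranges(line_revisions):
    """Collapse a per-line list of revision ids into
    {revision: [(start, end), ...]} using 1-based inclusive ranges."""
    ranges = {}
    line_no = 1
    for revision, run in groupby(line_revisions):
        length = sum(1 for _ in run)
        ranges.setdefault(revision, []).append((line_no, line_no + length - 1))
        line_no += length
    return ranges


def decode_ranges(ranges):
    """Expand {revision: [(start, end), ...]} back into a per-line list."""
    total = max((end for spans in ranges.values() for _, end in spans),
                default=0)
    lines = [None] * total
    for revision, spans in ranges.items():
        for start, end in spans:
            for line_no in range(start, end + 1):
                lines[line_no - 1] = revision
    return lines


def format_ranges(ranges):
    """Render one text line per revision, e.g. 'rev-a: 2-3, 5-7'."""
    return [
        "%s: %s" % (revision, ", ".join("%d-%d" % span for span in spans))
        for revision, spans in ranges.items()
    ]


# Hypothetical seven-line text: one entry per line, naming the revision
# that introduced it.
per_line = ["rev-b", "rev-a", "rev-a", "rev-b", "rev-a", "rev-a", "rev-a"]
encoded = encode_ranges(per_line)
for text in format_ranges(encoded):
    print(text)
```

The stored size then grows with the number of contiguous runs rather
than with the number of lines, and the per-line view can still be
rebuilt with decode_ranges() when it is needed.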

-Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.