[BUG] (trivial) duplicate text in "bzr pull --help"

John Arbash Meinel john at arbash-meinel.com
Wed Jun 28 14:41:48 BST 2006


Aaron Bentley wrote:
> John Arbash Meinel wrote:
>>> The 0.8 releases didn't pass the selftest suite on win32, though I'm to
>>> blame for breaking a little bit more recently. So I'm doing penance and
>>> trying to fix everything up.
>>>
>>> With any luck, I'll have it done this week... I don't know what Martin
>>> has in mind for an 0.9, but I could see us getting an rc1 out within 2
>>> weeks or so.
> 
> I'm also looking at some penance.  It seems knit caching was used
> 1. for converting weaves to knits
> 2. for fetching knits
> 
> So those things need to be fixed to avoid performance regressions.

Sure. Is it possible to add a benchmark for them? *ducks* :)

Actually, it shouldn't be that bad. Just add a parameter to
'make_heavily_merged_tree' that lets you set the format, then create
one in Weave format, upgrade it to Knit format, and see how long the
upgrade takes.
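Something with roughly this shape, maybe (the names here are stand-ins
I made up so the sketch runs on its own; the real benchmark would use
bzrlib's own test helpers and the actual upgrade code):

```python
import time

def make_heavily_merged_tree(fmt='weave'):
    # Stand-in for the real helper, with a hypothetical 'fmt'
    # parameter; here it just returns some line data.
    return [b'line %d\n' % i for i in range(1000)]

def upgrade_to_knit(tree):
    # Stand-in for the real weave-to-knit conversion.
    return list(tree)

def bench_upgrade():
    # Build the tree outside the timed region, then time the upgrade.
    tree = make_heavily_merged_tree(fmt='weave')
    start = time.time()
    upgrade_to_knit(tree)
    return time.time() - start
```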

> 
> One thing about knits that's troubling: It should be possible to copy
> knit records across directly, without rediffing.  But it's also possible
> for two knits to annotate lines differently, and I think that could lead
> to inconsistency in the annotations, since knit records contain
> annotation data.  But I don't think we have a cheap way of finding out
> whether the shortcut of copying knit records will create different
> annotations from those that would be produced by rediffing.
> 
> Aaron

Well, as you yourself have mentioned, no diff algorithm is perfect,
since there is no actual standard for perfection (especially when you
have duplicate lines).

So, I wouldn't worry about it too much. My biggest concern about
copying hunks directly is data integrity. You can copy the hunk, but
you should extract it first, make sure the sha1 is valid, and only
then insert the hunk you copied. I do believe that applying a hunk is
much faster than creating a new one (diff algorithms typically being
O(N^2) and all).

It would be really nice if we could generate a cheap Testament at the
same time. It has been discussed to remove the 'sha1=' field from
Revisions and change it to a 'testament_1_sha1'. Then we could do lots
more integrity checking, and doing signature validation becomes
relatively cheap. If we already checked the testament sha1 when we
inserted the data, we don't have to re-build it just to check the
signature; we can just grab it out of the revision entry.
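The check itself could be quite simple. Something like this sketch
(the extract/insert plumbing and names are invented here, and real
knit records also carry annotation data):

```python
import hashlib

def copy_hunk_checked(raw_hunk, expected_sha1, insert_hunk):
    """Verify a copied hunk's content against its recorded sha1
    before inserting it into local storage.

    raw_hunk: the fulltext bytes extracted from the copied record
    expected_sha1: the hex digest recorded for that version
    insert_hunk: callable that stores the raw record (hypothetical)
    """
    actual = hashlib.sha1(raw_hunk).hexdigest()
    if actual != expected_sha1:
        raise ValueError('sha1 mismatch: expected %s, got %s'
                         % (expected_sha1, actual))
    # Content checks out; insert the hunk we copied, unchanged.
    insert_hunk(raw_hunk)
```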

I'm not sure of the best route for enabling validation while
maintaining good performance. But I would really like to see integrity
checking before we copy knit hunks into our local storage, and maybe
that could call out to a validate function, which at this point might
just check a simple sha1 sum, but in the future could check signatures
at the same time.
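Concretely, the hook might look like this (a sketch only;
'make_validator' and 'check_signature' are invented names, not
anything in bzrlib today):

```python
import hashlib

def make_validator(check_signature=None):
    # Build the validate callable used by the copy path. For now it
    # only checks a sha1; passing a check_signature function
    # (hypothetical) would add signature verification in the same
    # pass over the data.
    def validate(raw_hunk, expected_sha1):
        if hashlib.sha1(raw_hunk).hexdigest() != expected_sha1:
            return False
        if check_signature is not None:
            return check_signature(raw_hunk)
        return True
    return validate
```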

John
=:->
