Interesting merge optimization
John A Meinel
john at arbash-meinel.com
Mon Dec 5 01:41:45 GMT 2005
Robert Collins wrote:
> On Fri, 2005-12-02 at 23:37 -0600, John Arbash Meinel wrote:
>
>> Speaking of knits, I was thinking that we might be trying for too much
>> by setting the requirement that we don't have to load the whole thing
>> into memory. I realize we would like to see a version which gets around
>> the O(n_lines) behavior by not having to load all lines.
>
> The key thing about not loading the entire thing is that we can copy
> remote revisions across without scaling per commit.
>
> I.e. if you have 60000 commits in a repo, and I have 59999 of those, how
> much data do I need to read to end up with 60000?
>
> Rob
That is a different part of the problem.
I'm saying that you still need to read all 60,000. You can just choose to
read the local ones rather than all of them from remote.
One of the things we were going to try to do with knits was say:
  I want to reproduce version 6123 in memory; to do so,
  I only need to read hunks 52, 55, 1123, 4432, 5524, and 6123
I don't know how easy that is going to be. And for Codeville weaves, it
isn't possible (they explicitly require loading of all previous lines).
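A minimal sketch of the "only read the hunks you need" idea, assuming each
version is stored either as a full text or as a line delta against a single
basis version. The names (index, records, delta_chain, build_text) and the
hunk layout are hypothetical and are not the actual knit format:

```python
def delta_chain(index, version):
    """Return the versions whose records must be read, oldest first.

    index maps a version id to its basis version id, or to None when the
    version is stored as a full text (hypothetical layout).
    """
    chain = []
    while version is not None:
        chain.append(version)
        version = index[version]
    chain.reverse()
    return chain


def apply_delta(lines, hunks):
    """Apply (start, end, new_lines) hunks expressed against the basis text."""
    out = list(lines)
    # Apply from the bottom up so earlier offsets stay valid.
    for start, end, new_lines in sorted(hunks, key=lambda h: h[0], reverse=True):
        out[start:end] = new_lines
    return out


def build_text(index, records, version):
    """Rebuild a version by reading only the records on its delta chain."""
    lines = []
    for v in delta_chain(index, version):
        if index[v] is None:
            lines = list(records[v])          # full text
        else:
            lines = apply_delta(lines, records[v])
    return lines


index = {52: None, 55: 52, 60: 52, 1123: 55, 6123: 1123}
records = {
    52: ["a\n", "b\n"],
    55: [(1, 2, ["B\n"])],
    60: [(0, 0, ["unrelated\n"])],
    1123: [(2, 2, ["c\n"])],
    6123: [(0, 1, [])],
}
text = build_text(index, records, 6123)
```

Here version 60 is never read when rebuilding 6123, because it is not on the
chain. The cost is that the chain can grow long when a text is made up of
many revisions.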
Having each revision be a diff against others, and recording the location
of each diff in an index, gives us append-only storage and the ability to
read only the remote chunks that we don't have locally.
I think that is actually more important than being able to recreate the
in-memory text by only reading a selected number of hunks.
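A rough sketch of how an index could make that work, assuming the index maps
each revision id to an (offset, length) pair in an append-only data file.
The function names and the read_remote callback are hypothetical and are
only meant to illustrate the idea:

```python
def missing_versions(local_index, remote_index):
    """Revision ids present remotely but not locally, in remote order."""
    return [v for v in remote_index if v not in local_index]


def pull(local_index, local_data, remote_index, read_remote):
    """Append the missing records to local_data; return bytes fetched.

    read_remote(offset, length) stands in for a ranged read of the remote
    data file. Existing local records are never rewritten.
    """
    fetched = 0
    for v in missing_versions(local_index, remote_index):
        offset, length = remote_index[v]
        chunk = read_remote(offset, length)
        local_index[v] = (len(local_data), len(chunk))
        local_data.extend(chunk)
        fetched += len(chunk)
    return fetched


remote_data = b"aaaabbbbbcc"
remote_index = {"r1": (0, 4), "r2": (4, 5), "r3": (9, 2)}

local_data = bytearray(b"aaaabbbbb")
local_index = {"r1": (0, 4), "r2": (4, 5)}

fetched = pull(local_index, local_data, remote_index,
               lambda offset, length: remote_data[offset:offset + length])
```

With this layout the amount of remote data read depends on the size of the
missing records (plus the index), not on the total number of commits.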
I'm not sure if I'm being clear. One question is how you add new
information to a knit. The other is how you regenerate the full text of
a revision. Even if you figure out how to regenerate a text by reading
in only specific hunks, I don't know that you will gain a lot, since
most texts will be made up of many, many revisions.
And if you only read revisions which contribute to the current text, you
lose any ability to have deleted lines affect where things go, which is
sometimes useful. Certainly for weave merge you need to know which lines
have been deleted.
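To illustrate why deleted lines matter, here is a simplified sketch of a
weave, assuming each entry records the line text, the revision that
inserted it, and the set of revisions that deleted it. This layout and the
extract function are hypothetical simplifications, not the real weave
format:

```python
def extract(weave, ancestry):
    """Return the lines visible in a text whose ancestry is the given set."""
    return [text for text, inserted, deleted in weave
            if inserted in ancestry and not (deleted & ancestry)]


# Revision 2 deleted "b"; revision 3 (which never saw that delete)
# inserted "c" right after "b".
weave = [
    ("a\n", 1, frozenset()),
    ("b\n", 1, frozenset({2})),
    ("c\n", 3, frozenset()),
    ("d\n", 1, frozenset()),
]

rev2 = extract(weave, {1, 2})
rev3 = extract(weave, {1, 3})
merged = extract(weave, {1, 2, 3})
```

The deleted line "b" stays in the weave as a position marker, so a merge of
revisions 2 and 3 can still place "c" between "a" and "d" and can see that
"b" was removed on one side.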
John
=:->