please check out weave-format branch

John Arbash Meinel john at arbash-meinel.com
Thu Sep 22 18:35:15 BST 2005


Martin Pool wrote:
> Hi,
> 
> My branch using the weave format is just about ready for adoption.  If
> you would like a copy you can either pull or rsync from
> http://bazaar-ng.org/bzr/bzr.newformat, or get this tarball:
> 
>   http://bazaar-ng.org/pkg/snapshot/bzr-0.1pre-1374.tgz
> 
> As test data, a copy of the bzr history converted to weaves is here:
> 
>   http://bazaar-ng.org/pkg/snapshot/bzr.newformat.asweaves-20050922.tgz
> 
> (The uncompressed history size shrunk from 101MB to 6.8MB).
> 
> Existing branches can be converted using the 'tools/history2weaves.py'
> script (which should be changed to run from the upgrade command.) 
> This upgrades a branch in place; make a copy first.
> 
> --
> Martin
> 
> 

4) What is supposed to be in "ancestry.weave"? How does it work when you
do a merge? Looking through the file, it seems each revision adds only
one new entry; shouldn't some of them add 2? If I understand the idea,
you are trying to have a file which can be queried to get the complete
ancestry for a given revision. But I don't see how it handles branching.

5) The new weave format means that all uploads to a remote store are
going to have to be done atomically, rather than by just appending. Have
you looked into a different weave format that might allow append-only
updates? For instance, you could give each line a unique number, and
then record which lines were added/removed in each revision.
For example:

1 This is some text
2 This is more text
3 And the third line
rev=1 +[1-3 at 0]
4 A different third line
rev=2 parent=1 -[3] +[4 at 2]
5 Changing the first line
6 And inserting some text
rev=3 parent=2 -[1] +[5-6 at 0]
rev=4 parent=3 -[2,4-6]

The syntax is a little bit odd, but I think it would handle what I have
in mind. Each line is numbered, and each revision tells you which lines
were added or removed. Removals just give the line numbers; additions
give the line numbers and where they were added. Runs of consecutive
line numbers are collapsed into ranges.
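
To make the idea concrete, here is a rough Python sketch that replays
the example log above. It is only an illustration under a few
assumptions that the example does not pin down: "at N" is read as a
0-based insert position applied after the removals, a revision starts
from its parent's line list, and the names parse_ranges and replay are
made up for this sketch.

```python
import re

LOG = """\
1 This is some text
2 This is more text
3 And the third line
rev=1 +[1-3 at 0]
4 A different third line
rev=2 parent=1 -[3] +[4 at 2]
5 Changing the first line
6 And inserting some text
rev=3 parent=2 -[1] +[5-6 at 0]
rev=4 parent=3 -[2,4-6]
"""

def parse_ranges(spec):
    """Expand a spec such as '2,4-6' into [2, 4, 5, 6]."""
    ids = []
    for part in spec.split(","):
        lo, _, hi = part.partition("-")
        ids.extend(range(int(lo), int(hi or lo) + 1))
    return ids

def replay(log):
    """Return (texts, revisions): line id -> text, rev id -> ordered line ids."""
    texts = {}
    revisions = {}
    for raw in log.splitlines():
        if not raw.strip():
            continue
        if not raw.startswith("rev="):
            # A numbered text line: "<id> <text>".
            num, _, text = raw.partition(" ")
            texts[int(num)] = text
            continue
        rev = int(re.match(r"rev=(\d+)", raw).group(1))
        parent = re.search(r"parent=(\d+)", raw)
        # Assumption: a revision starts from its parent's line list.
        order = list(revisions[int(parent.group(1))]) if parent else []
        # Apply removals first, then insertions.
        for m in re.finditer(r"-\[([\d,\-]+)\]", raw):
            removed = set(parse_ranges(m.group(1)))
            order = [i for i in order if i not in removed]
        for m in re.finditer(r"\+\[([\d,\-]+) at (\d+)\]", raw):
            pos = int(m.group(2))
            order[pos:pos] = parse_ranges(m.group(1))
        revisions[rev] = order
    return texts, revisions

texts, revisions = replay(LOG)
rev3_text = "\n".join(texts[i] for i in revisions[3])
```

Because nothing already written is ever modified, a new revision only
needs new text lines and one new "rev=" line appended at the end.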

Or you could explicitly list which lines exist in each revision, and in
what order, something like:
rev=1 1-3
rev=2 1-2,4
rev=3 5-6,2,4
rev=4
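
A small Python sketch of reading and writing this explicit form, again
only as an illustration. The helper names (expand, collapse,
parse_listing) are made up here, and it assumes a revision with no
lines is written as a bare "rev=N", as in the rev=4 line above.

```python
def expand(spec):
    """Expand '5-6,2,4' into [5, 6, 2, 4], keeping the listed order."""
    ids = []
    for part in spec.split(","):
        if part:
            lo, _, hi = part.partition("-")
            ids.extend(range(int(lo), int(hi or lo) + 1))
    return ids

def collapse(ids):
    """Collapse runs of consecutive ids: [5, 6, 2, 4] -> '5-6,2,4'."""
    parts = []
    i = 0
    while i < len(ids):
        j = i
        # Extend the run while the next id is exactly one larger.
        while j + 1 < len(ids) and ids[j + 1] == ids[j] + 1:
            j += 1
        parts.append(str(ids[i]) if i == j else f"{ids[i]}-{ids[j]}")
        i = j + 1
    return ",".join(parts)

def parse_listing(line):
    """Parse 'rev=3 5-6,2,4' into (3, [5, 6, 2, 4])."""
    head, _, spec = line.partition(" ")
    return int(head[len("rev="):]), expand(spec.strip())

listing = [parse_listing(l) for l in
           ["rev=1 1-3", "rev=2 1-2,4", "rev=3 5-6,2,4", "rev=4"]]
```

This form costs more bytes per revision than the add/remove form, but
any revision can be read back without replaying its ancestors.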

Either way could be broken out into 2 separate files (like the revfile
idea), which would let you keep the meta-information separate from the
line-by-line store.

I might be willing to look into something like this. I'm a little leery
of replace-in-place formats, since you can accidentally destroy your
entire history. With an append-only design, you can just truncate the
end of the file if it gets corrupted, and you still have the history up
to that point.
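
As a sketch of what that recovery could look like in Python, assuming
the add/remove syntax from my example and that only the tail of the
file is damaged. The regexes and the name recoverable_prefix are
hypothetical, and a real format would probably want checksums, since a
damaged line can still look well-formed.

```python
import re

# Hypothetical line shapes, taken from the example syntax only.
TEXT_LINE = re.compile(r"^\d+ .*$")
REV_LINE = re.compile(
    r"^rev=\d+( parent=\d+)?( -\[[\d,\-]+\])?( \+\[[\d,\-]+ at \d+\])?$"
)

def recoverable_prefix(lines):
    """Return the lines up to and including the last complete rev record.

    Scanning stops at the first line that is neither a numbered text line
    nor a well-formed "rev=" line; trailing text lines that no revision
    refers to yet are dropped as well.
    """
    keep = 0
    for n, line in enumerate(lines, start=1):
        if REV_LINE.match(line):
            keep = n
        elif not TEXT_LINE.match(line):
            break
    return lines[:keep]

good = [
    "1 This is some text",
    "2 This is more text",
    "3 And the third line",
    "rev=1 +[1-3 at 0]",
    "4 A different third line",
    "rev=2 parent=1 -[3] +[4 at 2]",
]
# Simulate an interrupted append: a new text line and a half-written rev line.
damaged = good + ["5 Changing the first line", "rev=3 parent=2 -[1] +[5-6 a"]
recovered = recoverable_prefix(damaged)
```

Everything up to rev=2 survives; only the interrupted append is lost.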

John
=:->