[RFC] Alternative to current push/pull semantics [use weave.join]

John Arbash Meinel john at arbash-meinel.com
Fri Dec 16 19:56:28 GMT 2005


Goffredo Baroncelli wrote:
> On Friday 16 December 2005 18:59, John Arbash Meinel wrote:
> [...]
> 
>>What I was thinking is that we probably could cheat, and instead of just
>>adding each text, just do a weave.join().
> 
> 
> weave.join() does this: it extracts every text that is not present, then adds it to the
> target repository.
> However, I think that for a non-merge revision (i.e. a revision with only one parent)
> it is possible to merge into a weave without extraction followed by addition...
> I need some time to write the code.
> 
> [...]

Yes, it does extract every text. But it does so in a single pass, rather than
doing it once, writing out the file, then reading it back in later and
doing it again.

I also agree that weave.join() should be optimizable to not require
re-doing the diffs. But it probably isn't worth the effort if we are
switching to something like knits.

> 
> 
>>But it would change the current semantics to:
>>	download the remote weave header
>>	see that it is missing the revision we want
>>	download the full remote weave
>>	read the local weave
>>	weave.join()
>>	save the remote weave
> 
> 
> 
>>After that, even if we don't cache what weaves have what revisions,
>>future steps would just be:
>>	download the remote weave header
>>	see it has the revision we care about
>>	no upload needed
>>
>>Right now we don't download just the header. I think we really could,
>>but since our buffer size (32k) is greater than the average file size
>>(8k), it doesn't gain us much.
> 
> 
> In order to know which revisions are in the weave, it should be sufficient to inspect
> the history weave: this file records both the file id and the
> revision id.
> If you want to know which revision ids of the README file are in the repository:
> 

My whole point is that we would add extra revisions inside a weave file
that may not be represented in the remote inventory or revision-store yet.
The idea is that when we get the chance, we add all the revisions for a
given file-id, in the belief that more likely than not we are going to
want them in the (near) future.

The constraint we are working under is that *if* a revision is present
in the revision-store, then its inventory and all associated texts are
also present. There is no constraint against extra information
in the weaves themselves.
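
As a rough illustration of that constraint, here is a minimal sketch using
plain dicts and sets. Everything in it (the function name, the three
containers, the "rev-1"/"README-id" values) is made up for the example and
is not bzrlib API. It only shows that extra versions in a weave do not
violate the rule, while a stored revision with a missing text does.

```python
# Hypothetical in-memory model; none of these names are bzrlib API.

def check_store_invariant(revision_store, inventories, weaves):
    """Return a list of problems; an empty list means the constraint holds.

    revision_store: set of revision ids
    inventories:    revision id -> {file id: text revision id}
    weaves:         file id -> set of text revision ids held in that weave
    """
    problems = []
    for rev_id in sorted(revision_store):
        inventory = inventories.get(rev_id)
        if inventory is None:
            problems.append((rev_id, "missing inventory"))
            continue
        for file_id, text_rev in sorted(inventory.items()):
            if text_rev not in weaves.get(file_id, set()):
                problems.append((rev_id, "missing text for " + file_id))
    return problems


revision_store = {"rev-1"}
inventories = {"rev-1": {"README-id": "rev-1"}}
# The weave also holds rev-2, which no stored revision references yet.
weaves = {"README-id": {"rev-1", "rev-2"}}
print(check_store_invariant(revision_store, inventories, weaves))
```

The check only ever walks from the revision-store outwards, so whatever
else sits in a weave is never looked at.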

I'm just making pull greedy with respect to each text weave it sees.
We could make pull say "these are the revisions I'm grabbing", and then
have weave.join accept a restricted set of revisions to merge in. But I
don't think it is worth the effort.
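
To make the greedy versus restricted difference concrete, here is a small
sketch. It models a weave as a dict of revision id -> lines, which ignores
how a real weave interleaves lines and tracks parents; the join() function
and its "only" parameter are assumptions for illustration, not the actual
weave.join signature.

```python
# Simplified model: a "weave" here is just {revision id: list of lines}.
# The 'only' parameter is hypothetical; it is not part of the real weave.join.

def join(target, source, only=None):
    """Copy versions present in source but missing from target.

    only: optional collection of revision ids to restrict the join to;
          None means greedy (take everything the source has).
    Returns the sorted list of revision ids that were added.
    """
    wanted = set(source) if only is None else set(only) & set(source)
    added = sorted(wanted - set(target))
    for rev_id in added:
        target[rev_id] = list(source[rev_id])
    return added


local = {"r1": ["a\n"]}
remote = {"r1": ["a\n"], "r2": ["a\n", "b\n"], "r3": ["a\n", "b\n", "c\n"]}

restricted = dict(local)
print(join(restricted, remote, only=["r2"]))   # only the requested revision

greedy = dict(local)
print(join(greedy, remote))                    # everything the remote has
```

The restricted form needs the caller to work out the revision list first;
the greedy form needs nothing beyond the two weaves.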

Coupled with some decent caching so we can remember what revisions we
have added to what weaves, I think we could get "bzr pull/bzr push"
times back on par with "bzr branch" (1-2 minutes, not 1-2 hours).
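
A sketch of how the cache and the greedy join could fit together for one
text weave during a push. Again this uses dicts in place of real weaves
and transports; push_weave and the cache layout are invented for the
example, and it assumes the wanted revision is already in the local weave.

```python
# Hypothetical sketch; dicts stand in for weave files and the remote transport.

def push_weave(file_id, rev_id, local, remote, cache):
    """Return True if the remote weave had to be rewritten.

    local, remote: file id -> {revision id: list of lines}
    cache:         file id -> set of revision ids known to be in the remote weave
    """
    known = cache.setdefault(file_id, set())
    if rev_id in known:
        return False                    # answered from the cache, remote untouched
    remote_weave = remote.setdefault(file_id, {})
    known.update(remote_weave)          # stands in for "download the remote weave header"
    if rev_id in known:
        return False                    # no upload needed
    # Greedy join: send every local version the remote lacks, not just rev_id.
    for rid, lines in local[file_id].items():
        remote_weave.setdefault(rid, list(lines))
    known.update(remote_weave)          # remember what the remote now holds
    return True


local = {"README-id": {"r1": ["a\n"], "r2": ["b\n"], "r3": ["c\n"]}}
remote = {"README-id": {"r1": ["a\n"]}}
cache = {}

print(push_weave("README-id", "r2", local, remote, cache))  # first push rewrites the weave
print(push_weave("README-id", "r3", local, remote, cache))  # r3 already went along with r2
```

The second call never touches the remote at all, which is where the time
saving would come from.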

John
=:->


> $ ./bzr inventory --show-ids | grep ^README
> README                                             README-20050309040720-8f368abf9f346b9d
> 
> $ grep README-20050309040720-8f368abf9f346b9d .bzr/inventory.weave  | \
> 	sed -e 's/^.*revision="//' -e 's/".*//' | sort | uniq
> mbp@sourcefrog.net-20050309040815-13242001617e4a06
> mbp@sourcefrog.net-20050314025438-d52099f915fe65fc
> mbp@sourcefrog.net-20050314030104-0116680aba6499b6
> mbp@sourcefrog.net-20050314070724-ba6c85db7d96c508
> mbp@sourcefrog.net-20050319235021-a4a900883ea8e2d8
> mbp@sourcefrog.net-20050326030328-350bb37fb45b5eb4
> mbp@sourcefrog.net-20050429004334-bbb9dc81ce0d9de3
> mbp@sourcefrog.net-20050509012317-a503ae2eed842146
> mbp@sourcefrog.net-20050510082309-92b2ba534866314e
> mbp@sourcefrog.net-20050510083156-166057e03c21c713
> mbp@sourcefrog.net-20050511010322-54654b917bbce05f
> mbp@sourcefrog.net-20050722233200-ccdeca985093a9fb
> mbp@sourcefrog.net-20051019070401-0536bd80d1282c7f
> mbp@sourcefrog.net-20051101215717-fce2860f5195075a
> mbp@sourcefrog.net-20051118081007-80523bf145eb319b
> robertc@robertcollins.net-20050919060519-f582f62146b0b458
> 
> Goffredo
> 
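
For reference, the grep/sed/sort/uniq pipeline quoted above could be
approximated in Python roughly as follows. The sample lines are invented
for the example (including the "rev-a" style ids and the attribute
layout); the only thing assumed about the real inventory.weave is what the
pipeline itself relies on, namely that a matching line contains the file
id and a revision="..." attribute.

```python
import re

# Matches the last revision="..." attribute on a line, like the greedy sed pattern.
REVISION_RE = re.compile(r'.*revision="([^"]*)')


def revisions_for_file(weave_lines, file_id):
    """Return the sorted, de-duplicated revision ids on lines mentioning file_id."""
    revisions = set()
    for line in weave_lines:
        if file_id not in line:
            continue
        match = REVISION_RE.search(line)
        if match:                      # unlike sed, lines without the attribute are skipped
            revisions.add(match.group(1))
    return sorted(revisions)


# Made-up sample lines, not copied from a real inventory.weave.
sample = [
    '<entry file_id="README-20050309040720-8f368abf9f346b9d" name="README" revision="rev-a" />',
    '<entry file_id="NEWS-id" name="NEWS" revision="rev-b" />',
    '<entry file_id="README-20050309040720-8f368abf9f346b9d" name="README" revision="rev-a" />',
    '<entry file_id="README-20050309040720-8f368abf9f346b9d" name="README" revision="rev-c" />',
]
print(revisions_for_file(sample, "README-20050309040720-8f368abf9f346b9d"))
```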
