[RFC] bzr.jrydberg.versionedfile
John Arbash Meinel
john at arbash-meinel.com
Wed Dec 21 20:20:21 GMT 2005
Goffredo Baroncelli wrote:
> On Wednesday 21 December 2005 20:03, you (John Arbash Meinel) wrote:
>
>
>>>Moreover, I see some disadvantages in keeping unreferenced revisions.
>>>1) If we want to know the changes related to a file, we can parse only that file; if
>>>instead we allow unreferenced revisions, we have to intersect the revisions recorded in
>>>the file with the "official" ones.
>>
>>I don't think there is a large gain here. The list of officially merged
>>revisions is small. And you always need to know which revision you want
>>the changes for, if only to annotate the current object.
>
> Annotate only shows the added/changed lines, not the deleted ones; as for the
> size, see below.
>
>
>>Since knits contain the ancestry information for the file, all I need to
>>know is the current revision, and then all of those extra changes are
>>hidden.
>
> Yes, but then you have to unpack the inventory...
Actually, no. The knit itself contains the ancestry. Or I assume knits
do. Weaves do already.
(In the table of contents is a revision-id => local-id mapping, along
with the sha1 sum for the full text, and the parents for that revision)
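A minimal sketch of that kind of table of contents, to make the point concrete. This is not the actual weave or knit on-disk format; `IndexEntry`, `build_index` and the field names are hypothetical, and only the three pieces of information listed above (local id, sha1 of the full text, parents) are modelled.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """One row of a hypothetical per-file table of contents."""
    local_id: int      # position of the version inside this file
    text_sha1: str     # sha1 of the full text of this version
    parents: tuple     # revision-ids of the parent versions


def build_index(versions):
    """Build a revision-id => IndexEntry mapping.

    `versions` is an iterable of (revision_id, full_text_bytes, parent_ids),
    assumed to be ordered so that parents come before children.
    """
    index = {}
    for revision_id, text, parent_ids in versions:
        missing = [p for p in parent_ids if p not in index]
        if missing:
            raise ValueError("unknown parents for %s: %r" % (revision_id, missing))
        index[revision_id] = IndexEntry(
            local_id=len(index),
            text_sha1=hashlib.sha1(text).hexdigest(),
            parents=tuple(parent_ids),
        )
    return index
```

With a mapping like this, the ancestry of a file version can be read from the file's own index, without touching the inventory.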
>
>
>>>2) If the project is big, with many contributors who pull from other
>>>contributors, we can have an explosion of the storage size.
>>
>>Not really. You rarely will get anything that you wouldn't get anyway.
>>The only things that are 'wasted' are merges/pulls which you then decide
>>that you don't want.
>
>
> What you say is true if the repository you merge from is in a clean state. That
> is not the case for the bazaar one, for example:
>
> $ grep README-20050309040720-8f368abf9f346b9d ../inventory.weave | sed -e 's/^.*revision="//' -e 's/".*//' | wc -l
> 16
> $ grep ^n README-20050309040720-8f368abf9f346b9d.weave | wc -l
> 26
>
> The example above shows that the README weave file contains 26 revisions, while the
> inventory references only 16: about 40% of the revisions are meaningless.
>
> $ grep "file_id" ../inventory.weave | \
> sed -e 's/^.*file_id="/ /' -e 's/".*revision="//' -e 's/".*//' | \
> sort | uniq | wc -l
> 6727
> $ grep -h ^n */*.weave | wc -l
> 9025
>
> The example above shows that the inventory references only 6727/9025 = 74% of the
> revisions present: about 1/4 of the repository serves no purpose. And this information is
> replicated in every developer's repository.
>
> Even though the other repository passes the "bzr check" control, it may still contain
> a lot of useless information.
>
>
>>Well, I'm not sure about Johan's specific implementation, but it would
>>be possible to supply the 'Knit.join()' command a list of revisions
>>which I think I'm going to be interested in. And it will only bring
>>those in.
>
> I thought the same.
>
>
>>If it turns out that I don't want them (I do 'bzr merge' and then
>>realize I don't want those changes), are you asking for them to be deleted?
>
> If I don't bring in what I don't want, I don't have to delete it :-)
>
>
>>If it is simply "I want to sanitize everything which has not actually
>>been added to this branch", we would want a command like that anyway.
>
>
> For me, the default for the pull command should be to merge only what
> is published. So why do I have to pull something that I don't use?
>
>
>>Because a repository is going to share knits, and if you make something
>>public, you may need to sanitize it.
>>But this would be more of a "take what I have, and generate a new
>>branch, stripping out all unreferenced information". It would be an
>>occasional pruning, not a common operation which might delete stuff.
>
>
> Again, I don't want to delete anything: I just don't want to merge what I don't
> want. I don't like the idea, but if you prefer, what about a switch like
> '--dont-fetch-all'? Or better, what about a default option set in the
> bzr.conf file?
>
>
>>John
>
> Goffredo
>
I believe I understand what you want: you *do* pull everything which is
an ancestor of what you want, but not the extra things that happen to be
present. (So that if I merge something and choose not to keep it, it may
stay in my files, but isn't pulled into yours when you merge me.)
Which, according to Johan's join() API, is what you would get.
His function is designed so that you can pass a list of revisions, which
will be pulled along with their associated ancestry. That means the
whole file won't get joined.
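
A rough sketch of that behaviour, as I understand it. This is not Johan's code or the real join() signature: the `ancestry` and `join` helpers are hypothetical, and each "file" is modelled as a plain dict mapping a revision-id to the tuple of its parents.

```python
def ancestry(parents_of, wanted):
    """Return the wanted revisions plus every ancestor reachable from them."""
    seen = set()
    stack = list(wanted)
    while stack:
        rev = stack.pop()
        if rev in seen:
            continue
        seen.add(rev)
        # Walk up to the parents recorded for this revision.
        stack.extend(parents_of[rev])
    return seen


def join(target, source, wanted):
    """Copy into `target` only the source revisions in the ancestry of `wanted`.

    Revisions already in `target` are left alone; revisions in `source`
    that are not ancestors of `wanted` are never copied.
    """
    needed = ancestry(source, wanted)
    # Iterate in source order (assumed parents-before-children) so that
    # `target` stays ordered the same way.
    for rev, parents in source.items():
        if rev in needed and rev not in target:
            target[rev] = parents
    return target


# "x" was merged into the source and then abandoned; it is never pulled.
source = {"a": (), "b": ("a",), "x": ("a",), "c": ("b",)}
mine = join({}, source, ["c"])
```

Here `mine` ends up holding "a", "b" and "c", while "x" stays only in the source.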
From your argument I thought you were opposed to it staying in *my* files,
which would require deleting things. Not having it in *yours* is
reasonable, and seems like it is probably implemented.
John
=:->