[RFC] bzr.jrydberg.versionedfile

Johan Rydberg jrydberg at gnu.org
Wed Dec 21 14:43:58 GMT 2005


John Arbash Meinel <john at arbash-meinel.com> writes:

>> I think having a generic API is worthy goal, and the user should be
>> well aware if there is risk for collisions; if so, she should use a
>> more specific key.
>
> The hard part about a generic API is that you also have to make sure
> everyone doing a particular action agrees on the 'more specific key', so
> that you get the most benefit from actually storing the object.
> Otherwise if one section of code says "Weave-" + file_id, and another
> says "weave-" + file_id, they won't actually share anything.
> Also, it lets you do interesting things at the map layer. For instance,
> I would like a "add_weave_prelude", which would add a weave which only
> had the prelude read. But if later someone calls "add_weave" with the
> same file-id, it will delete the prelude version, since we now are
> storing a more complete object.
> That is also hard to do from the outside because you have to 'remove'
> the old prelude, which is debatable if the map should have remove. (You
> might have a decent case for it, but my case above may not be sufficient).

Versioned files are owned by the store that they are located in, and
only the store should add files to the identity map.  Also, there may
be several stores using the same map, so using "weave-" + file_id as
key just don't cut it.  

Stepping back, the identity map is only used to cache read files
internally to the store, meaning that the store implementation only
has to agree with itself how to specify keys. 

>>>>I put in the remove_object method to handle cache invalidations.  This
>>>>originates from the need to flush the cache in WeaveStore.copy_file.
>>>>My personal opinion is that weaves should never be copied, but instead
>>>>always merged (using WeaveVersionedFile.join), but I did not know the
>>>>performance implications so I left it in.
>>>
>>>mmm, I'll need to look at this. I'll do so next time I get a good shot
>>>to eyeball the code.
>> 
>> 
>> I just think we should let copy_file go.
>
> We probably nead it for Weave, because "join()" extracts all texts. But
> that doesn't mean we need it for knits.

Yes, it can be a semi-private method for the WeaveStore, and only be
used by the weave fetcher.

>> I disagree.  I think get_text should be the primary operation, and
>> that it should return a list of lines.  If you want it as a string,
>> we'll provide a convenience method for you.
>> 
>> I do not think of 'text' as a stream of bytes.  I guess that is the
>> difference.
>
> I think of a text as a stream of bytes. Probably because of it being
> used that way so far in the code base. So I'm not stuck on it. But that
> is what "get_text()" indicates in my brain.

>From weave.py in bzr.dev/bzrlib :

    def add(self, name, parents, text, sha1=None):
        """Add a single text on top of the weave.
           ...
        text
            Sequence of lines to be added in the new version.
           ...
        """

But get_text returns a string.  There needs to be consistency.  I
suppose we only have to agree upon something and stick to that.  I
prefer "text" meaning a list of lines.  You and Robert disagree.

(my opinion originates mostly from esthetics)

~j





More information about the bazaar mailing list