[RFC] bzr.jrydberg.versionedfile

John Arbash Meinel john at arbash-meinel.com
Wed Dec 21 15:01:10 GMT 2005


Johan Rydberg wrote:
> John Arbash Meinel <john at arbash-meinel.com> writes:
> 
> 

...

> 
> Versioned files are owned by the store that they are located in, and
> only the store should add files to the identity map.  Also, there may
> be several stores using the same map, so using "weave-" + file_id as
> key just doesn't cut it.

What stores are sharing the identity map? Isn't there 1 map per branch?
So even if you have revision-store and text-store sharing the map, they
don't have overlapping ids.

> 
> Stepping back, the identity map is only used to cache read files
> internally to the store, meaning that the store implementation only
> has to agree with itself how to specify keys. 
> 

I'll concede that point.
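
For illustration, a minimal sketch of a store-private key scheme. Only
remove_object is a name from this thread; IdentityMap, CachingStore,
add_object and find_object are hypothetical names, not the API in the
branch. The assumption is that each store builds its own keys, so two
stores sharing one map collide only if they pick the same scheme.

```python
class IdentityMap:
    """Hypothetical cache mapping store-chosen keys to loaded objects."""

    def __init__(self):
        self._map = {}

    def add_object(self, key, obj):
        # Refuse to silently replace an object that is already cached.
        if key in self._map:
            raise KeyError("key %r already in identity map" % (key,))
        self._map[key] = obj

    def find_object(self, key):
        # Returns None when nothing is cached under this key.
        return self._map.get(key)

    def remove_object(self, key):
        # Cache invalidation; a missing key is not an error.
        self._map.pop(key, None)


class CachingStore:
    """Hypothetical store that caches what it reads in a shared map."""

    def __init__(self, name, identity_map):
        self._name = name
        self._map = identity_map
        self.loads = 0  # counts real reads, to make cache hits visible

    def _key(self, file_id):
        # The key format is private to the store; here it is a tuple
        # of the store's own name and the file id.
        return (self._name, file_id)

    def get(self, file_id):
        key = self._key(file_id)
        obj = self._map.find_object(key)
        if obj is None:
            obj = self._load(file_id)
            self._map.add_object(key, obj)
        return obj

    def _load(self, file_id):
        # Stand-in for reading a versioned file from disk.
        self.loads += 1
        return {"store": self._name, "file_id": file_id}
```

With one IdentityMap shared by a "texts" store and a "revisions" store,
the same file_id yields two distinct cached objects, one per store.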

> 
>>>>>I put in the remove_object method to handle cache invalidations.  This
>>>>>originates from the need to flush the cache in WeaveStore.copy_file.
>>>>>My personal opinion is that weaves should never be copied, but instead
>>>>>always merged (using WeaveVersionedFile.join), but I did not know the
>>>>>performance implications so I left it in.
>>>>
>>>>mmm, I'll need to look at this. I'll do so next time I get a good shot
>>>>to eyeball the code.
>>>
>>>
>>>I just think we should let copy_file go.
>>
>>We probably need it for Weave, because "join()" extracts all texts. But
>>that doesn't mean we need it for knits.
> 
> 
> Yes, it can be a semi-private method for the WeaveStore, and only be
> used by the weave fetcher.

Sure, you can even make it "_copy_file()" to ensure that people don't
think it is a public function.
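
A rough sketch of that naming, with the cache flush mentioned above.
FakeWeaveStore is a stand-in with made-up internals (dicts instead of
files on disk); put_weave and get_weave here are illustrative and are
not meant to describe the real WeaveStore.

```python
class FakeWeaveStore:
    """Stand-in store; weaves live in a dict rather than on disk."""

    def __init__(self):
        self._weaves = {}  # file_id -> list of lines (hypothetical storage)
        self._cache = {}   # file_id -> weave that has already been read

    def put_weave(self, file_id, lines):
        self._weaves[file_id] = list(lines)
        self._cache.pop(file_id, None)

    def get_weave(self, file_id):
        # Read through the cache, loading on first access.
        if file_id not in self._cache:
            self._cache[file_id] = list(self._weaves[file_id])
        return self._cache[file_id]

    def _copy_file(self, file_id, other):
        """Copy a weave wholesale from another store.

        Semi-private: the leading underscore signals that only the
        weave fetcher is expected to call this.
        """
        self._weaves[file_id] = list(other._weaves[file_id])
        # The cached copy is now stale, so drop it.
        self._cache.pop(file_id, None)
```

The underscore is only a convention; Python does not enforce it, but it
keeps the method out of what callers will treat as the public interface.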

> 
> 
>>>I disagree.  I think get_text should be the primary operation, and
>>>that it should return a list of lines.  If you want it as a string,
>>>we'll provide a convenience method for you.
>>>
>>>I do not think of 'text' as a stream of bytes.  I guess that is the
>>>difference.
>>
>>I think of a text as a stream of bytes. Probably because of it being
>>used that way so far in the code base. So I'm not stuck on it. But that
>>is what "get_text()" indicates in my brain.
> 
> 
> From weave.py in bzr.dev/bzrlib:
> 
>     def add(self, name, parents, text, sha1=None):
>         """Add a single text on top of the weave.
>            ...
>         text
>             Sequence of lines to be added in the new version.
>            ...
>         """
> 
> But get_text returns a string.  There needs to be consistency.  I
> suppose we only have to agree upon something and stick to that.  I
> prefer "text" meaning a list of lines.  You and Robert disagree.
> 
> (my opinion originates mostly from esthetics)

Actually, I think having "add(... text)" is bogus. Because it really
wants "lines".

I realized why I use the term 'text' to indicate a string. The problem
comes more from other languages where you have to declare your object
type. But in C++ string is already used. I can't write:

int myfunc(std::string string)
{
}

Technically I could probably get away with it because of the namespace.
But I digress.

In Python, 'string' is a standard module and the 'str' function is
builtin, and I don't like to overload existing names. (One of the
reasons I hate to see "dir" as a variable, so I tend to use
something_dir, or path).

So the next word to turn to (for me) is 'text'. And I think 'lines'
conveys the idea of a sequence of lines quite well.
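
A minimal sketch of that naming split, assuming 'lines' always means a
sequence of lines and 'text' always means a single string. The class
VersionedTexts and the methods add_lines and get_lines are hypothetical
names for illustration, not the existing weave API.

```python
class VersionedTexts:
    """Hypothetical container keeping each version as a list of lines."""

    def __init__(self):
        self._versions = {}

    def add_lines(self, name, lines):
        # 'lines' is a sequence of lines, each with its line ending.
        self._versions[name] = list(lines)

    def get_lines(self, name):
        # Primary accessor: returns a copy of the stored list of lines.
        return list(self._versions[name])

    def get_text(self, name):
        # Convenience accessor: the same content as one string.
        return "".join(self.get_lines(name))
```

Either accessor could be the primary one; the point is only that the
parameter and method names say which form they take or return.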

> 
> ~j

John
=:->