[RFC] bzr.jrydberg.versionedfile

John Arbash Meinel john at arbash-meinel.com
Wed Dec 21 14:18:58 GMT 2005


Johan Rydberg wrote:
> Robert Collins <robertc at robertcollins.net> writes:


...

> 
> I think having a generic API is worthy goal, and the user should be
> well aware if there is risk for collisions; if so, she should use a
> more specific key.

The hard part about a generic API is that you also have to make sure
everyone doing a particular action agrees on the 'more specific key', so
that you get the most benefit from actually storing the object.
Otherwise if one section of code says "Weave-" + file_id, and another
says "weave-" + file_id, they won't actually share anything.
Also, it lets you do interesting things at the map layer. For instance,
I would like a "add_weave_prelude", which would add a weave which only
had the prelude read. But if later someone calls "add_weave" with the
same file-id, it will delete the prelude version, since we now are
storing a more complete object.
That is also hard to do from the outside because you have to 'remove'
the old prelude, which is debatable if the map should have remove. (You
might have a decent case for it, but my case above may not be sufficient).

> 
> 
>>>>The remove_* functions should probably not exist either - the point of
>>>>the map is to exist for the whole transaction, when things are finished
>>>>with the whole map should go away when the transaction does.
>>>
>>>I put in the remove_object method to handle cache invalidations.  This
>>>originates from the need to flush the cache in WeaveStore.copy_file.
>>>My personal opinion is that weaves should never be copied, but instead
>>>always merged (using WeaveVersionedFile.join), but I did not know the
>>>performance implications so I left it in.
>>
>>mmm, I'll need to look at this. I'll do so next time I get a good shot
>>to eyeball the code.
> 
> 
> I just think we should let copy_file go.

We probably nead it for Weave, because "join()" extracts all texts. But
that doesn't mean we need it for knits.

> 
> 
> 
>>>>+    def get_text(self, file_id, version_id, transaction):
>>>>+        """Return a sequence of lines that makes out version-id of
>>>>+        specified file."""
>>>>+        file = self.get_file(file_id, transaction)
>>>>+        return file.get_text(version_id)
>>>>+
>>>>+    get_lines = get_text
>>>>+
>>>>
>>>>I'd expect get_text and get_lines to return (respectively) a string and
>>>>a sequence of lines. Having them the same is rather, confusing.
>>>
>>>I rather have get_text return a sequence of lines, and have a
>>>get_string method that returns the lines concatenated.
>>>
>>>The reason get_lines is there is that I am lazy and didn't want to
>>>change a large set of tests.
>>
>>If I have a variable 'foo_text' with no extra explanation, would you
>>assume that to be a string or a list of lines ? I'm pretty sure the
>>former would be the case -> and thats why I'm encouring get_text return
>>a string.
> 
> 
> I disagree.  I think get_text should be the primary operation, and
> that it should return a list of lines.  If you want it as a string,
> we'll provide a convenience method for you.
> 
> I do not think of 'text' as a stream of bytes.  I guess that is the
> difference.

I think of a text as a stream of bytes. Probably because of it being
used that way so far in the code base. So I'm not stuck on it. But that
is what "get_text()" indicates in my brain.

John
=:->

> 
> 
> ~j
> 
> 
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 256 bytes
Desc: OpenPGP digital signature
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20051221/92861807/attachment.pgp 


More information about the bazaar mailing list