merging in versioned-file

John A Meinel john at arbash-meinel.com
Mon Feb 20 18:28:49 GMT 2006


Robert Collins wrote:
> So I've started on the merge conflict resolution for this, and Martin
> and I [he's camping at my place for 'net until his new adsl is installed]
> looked at some of the structural changes involved.
> 
> A couple of obvious things:
> 
> The transport changes are not backwards compatible, so we're going to
> deprecate the existing names and provide new names like 'open_append'
> rather than replacing the meaning of 'append'. 
> 

Sounds reasonable. I assume you are completely switching over to the
"Transport returns a file object" version. So I would expect you to add
the functions "open_read", "open_write", and "open_append".
I think the current code has "put()" returning a file-like object to
write to. But that doesn't seem very nice, so I would like to see that
changed to 'open_write'.

Further, we need to make it clear that the file-like objects that are
returned can't rely on garbage collection to clean them up, so they must
be used in a 'try/finally' to make sure that they are fully uploaded and
then put into place atomically. (The old put() interface did that for
you; now all calling code needs to call close() in a finally block.)
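
To make the calling convention concrete, here is a minimal sketch of what
I have in mind. Everything except the name open_write is an assumption
for illustration: LocalTransport, AtomicWriteFile and store() are
hypothetical, and a local temp-file-plus-rename stands in for "upload,
then put into place".

```python
import os
import tempfile


class AtomicWriteFile:
    """Hypothetical file-like object: writes to a temp file, renames on close()."""

    def __init__(self, final_path):
        self._final_path = final_path
        directory = os.path.dirname(final_path) or "."
        # Create the temp file next to the target so the rename stays on
        # one filesystem.
        fd, self._tmp_path = tempfile.mkstemp(dir=directory)
        self._file = os.fdopen(fd, "wb")

    def write(self, data):
        self._file.write(data)

    def close(self):
        # Only at close() does the content appear under its final name.
        self._file.close()
        os.replace(self._tmp_path, self._final_path)


class LocalTransport:
    """Hypothetical stand-in for a transport rooted at a local directory."""

    def __init__(self, base):
        self._base = base

    def open_write(self, relpath):
        return AtomicWriteFile(os.path.join(self._base, relpath))


def store(transport, relpath, data):
    """Write data to relpath, closing the file even if write() raises."""
    f = transport.open_write(relpath)
    try:
        f.write(data)
    finally:
        # Callers now own this step; the old put() did it for them.
        f.close()
```

Note that in this sketch close() always renames, so a write that fails
part way still gets put into place. A real version would probably want
some kind of abort path as well.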

Also, if we are going to this form, I would really like to see the
file-like objects have a readv() function for reading just part of the
file, rather than using seek+read. I think it will be a lot nicer for
knits, where you could easily end up with a disjoint set of ranges that
you want to read. seek+read falls down once you do it twice, unless you
translate every read() request into a partial read. With readv() the
transport knows the request is a partial read, whereas read(1000) may
just be someone reading the whole file 1000 bytes at a time.
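
Roughly what I am picturing for readv(), as a sketch only. The signature
(a list of (offset, length) pairs in, (offset, data) pairs out) and the
RangeReadableFile wrapper are my assumptions, not existing code. The
local fallback just seeks and reads; the point is that a remote
transport sees the whole list of ranges at once and could batch them.

```python
import io


class RangeReadableFile:
    """Hypothetical wrapper giving a seekable file a readv() method."""

    def __init__(self, fileobj):
        self._file = fileobj

    def readv(self, ranges):
        """Yield (offset, data) for each (offset, length) pair in ranges.

        Ranges are served in ascending offset order. A remote transport
        could turn the whole list into one multi-range request; this
        local fallback simply seeks and reads for each range.
        """
        for offset, length in sorted(ranges):
            self._file.seek(offset)
            yield offset, self._file.read(length)


# Two disjoint ranges, as a knit index might ask for.
knit_data = io.BytesIO(b"0123456789abcdefghij")
f = RangeReadableFile(knit_data)
chunks = list(f.readv([(12, 3), (0, 4)]))
# chunks == [(0, b"0123"), (12, b"cde")]
```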


> Secondly, the fetcher logic is depending on the type code of the branch,
> and Martin and I have hammered out what seems to be a good solution to
> this recurring pattern - we have fast code paths between two
> implementations of an interface and a slower generic code path.
> 
> I.e. 'copy_multi_to' and 'fetch'.
> 
> So we are proposing that we introduce an 'InterTYPE' concept in these
> cases (specifically for the fetcher at this point). The InterTYPE
> interface will provide the methods that operate between two objects of
> TYPE, and will be subclassed or overridden as appropriate. We may
> provide a facade to make this look seamless (in fact, will have to for
> backwards compatibility).
> 
> So sample code without the facade will look something like:
> 
> fetcher = InterRepository.get(from_repo, to_repo)
> fetcher.fetch(revision_id=XXX, basis=YYY)
> 
> 
> fetcher here is the variable name, but the object returned will have
> methods on it for all of the inter-repository operations -
> copy_content_into, fetch (at this point). For transports for instance it
> would have the *into methods on it.
> 
> 
> Whether those methods have local variables or instance variables, or
> whether they use MethodObjects or not is up to the individual concrete
> classes. 
> 
> The goal is to allow plugins that add transports or repositories or ...
> to add fast code paths as appropriate rather than having the slow code
> path kick in as soon as someone extends the system. (I.e. if you add a
> weave based format that is compatible for 'fetch' you should be able to
> have fast weave fetches occur.)
> 
> 
> Rob

I like the idea of fast code paths based on the specific objects of
concern (weaves vs knits vs v4 storage) rather than based on the branch
format. So +1 to that concept.
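
For what it's worth, here is how I read the InterRepository.get(from_repo,
to_repo) lookup, as a small sketch. Only InterRepository.get and fetch
come from the sample above; the registry, register(), is_compatible(),
the toy Repository classes and the returned path name are all made up
for illustration.

```python
class Repository:
    """Hypothetical minimal repository: just a set of revision ids."""

    def __init__(self, revisions=()):
        self.revisions = set(revisions)


class WeaveRepository(Repository):
    """Hypothetical weave-backed format."""


class InterRepository:
    """Generic, slow path between any two repositories."""

    _optimisers = []  # hypothetical registry that plugins can append to
    path = "generic"

    def __init__(self, source, target):
        self.source = source
        self.target = target

    @staticmethod
    def is_compatible(source, target):
        return True

    @classmethod
    def register(cls, optimiser):
        InterRepository._optimisers.append(optimiser)

    @classmethod
    def get(cls, source, target):
        # Prefer the first registered fast path that accepts this pair,
        # and fall back to the generic implementation otherwise.
        for optimiser in InterRepository._optimisers:
            if optimiser.is_compatible(source, target):
                return optimiser(source, target)
        return InterRepository(source, target)

    def fetch(self, revision_id=None, basis=None):
        # basis is accepted to match the quoted call but is ignored here.
        if revision_id is None:
            wanted = set(self.source.revisions)
        else:
            wanted = {revision_id}
        self.target.revisions.update(wanted)
        return self.path  # lets the caller see which code path ran


class InterWeaveRepository(InterRepository):
    """Fast path that only applies when both ends are weave-based."""

    path = "weave"

    @staticmethod
    def is_compatible(source, target):
        return (isinstance(source, WeaveRepository)
                and isinstance(target, WeaveRepository))


# A plugin adding a new format would register its own fast path like this.
InterRepository.register(InterWeaveRepository)
```

The lookup keys off the two objects themselves rather than the branch
format, which is the part I like.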

John
=:->

