[MERGE] Re: merging in versioned-file

Robert Collins robertc at robertcollins.net
Tue Feb 21 00:24:10 GMT 2006


On Mon, 2006-02-20 at 12:28 -0600, John A Meinel wrote:
> Robert Collins wrote:
> > So I've started on the merge conflict resolution for this, and Martin
> > and I [he's camping at my place for 'net until his new ADSL is installed]
> > looked at some of the structural changes involved.
> > 
> > A couple of obvious things:
> > 
> > The transport changes are not backwards compatible, so we're going to
> > deprecate the existing names and provide new names like 'open_append'
> > rather than replacing the meaning of 'append'. 
> > 
> 
> Sounds reasonable. I assume you are completely switching over to the
> "Transport returns a file object" version. So I would expect you to add
> the functions "open_read", "open_write", and "open_append".
> I think the current code has "put()" returning a file-like object to
> write to. But that doesn't seem very nice, so I would like to see that
> changed to 'open_write'.

Yes, that's the plan.

> Further, we need to make it clear that the file-like objects that are
> returned, aren't garbage collected properly, so they must be used in a
> 'try/finally' to make sure that they are fully uploaded, and then put
> into place atomically. (The old put() interface did that for you, now
> all calling code needs to call close() in a finally block).

Well, they can't offer any stronger gc than Python does. I wonder
whether keeping the convenience methods is useful, though: keep
Transport.put() at the top level, undeprecated, and implement it in
terms of open_write(), with the write inside a try block and the
close() in the finally - something like the sketch below.
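
(Just a sketch of the shape I mean - I'm assuming open_write() takes the
relative path and hands back a file-like object whose close() finishes
the upload and moves the file into place atomically; none of that is
final.)

    def put(self, relpath, f):
        """Convenience wrapper: copy the file-like object f to relpath."""
        stream = self.open_write(relpath)
        try:
            stream.write(f.read())
        finally:
            # close() is what completes the upload and renames the file
            # into place, so it must run even if write() raises.
            stream.close()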

> Also, if we are going to this form, I would really like to see the
> file-like objects have a readv() function for just reading part of the
> file. Rather than using seek+read. I think it will be a lot nicer for
> knits where you could easily end up with a disjoint set of ranges that
> you want to read. And seek+read falls down once you do it 2 times.
> (unless you translate all read() requests into a partial read, rather
> than being able to optimize that readv() is a partial, and read(1000) is
> trying to read the whole thing, just 1000 bytes at a time).

I think readv() is a useful optimisation to implement, but it's
important to note that we do not know the required ranges a priori much
of the time, so having seek + read as well is fine IMO. Or are you
saying 'only provide readv, and if someone wants a stream interface,
they build it on top of readv'?
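
For the sake of discussion, the minimal shape I have in mind is
something like the following - the name readv, the (offset, length)
tuples and the open_read() call are all assumptions at this point, not
settled API:

    def readv(self, relpath, offsets):
        """Read a list of (offset, length) ranges from relpath.

        Naive fallback: one seek + read per requested range.  A smarter
        transport (e.g. http with Range headers) could coalesce adjacent
        ranges or batch the requests instead of looping.
        """
        fp = self.open_read(relpath)
        try:
            result = []
            for offset, length in offsets:
                fp.seek(offset)
                result.append((offset, fp.read(length)))
            return result
        finally:
            fp.close()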

..
> I like the idea of fast code paths based on the specific objects of
> concern (weaves vs knits vs v4 storage) rather than based on the branch
> format. So +1 to that concept.

Well, it's not necessarily down at that level yet, though as the concept
spreads I imagine we will get to that point, with InterStore and so on.

For now, it collapses the multiple format checks at the repository level
into a single check, which can be extended at runtime by plugins (either
derive from InterWeaveRepo and implement a new is_compatible, or write a
brand new InterXRepo class as needed). These are also run against a set
of conformance tests, so we can find bugs. One interesting finding is
that all the repository slow-code paths were completely untested: we
have only one repository layout that is writable, so the slow code paths
were never triggering. I've added a FIXME to reinstate testing of them
as soon as we have an incompatible format to exercise them against.
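
To make the plugin side concrete, a third-party optimiser would look
very roughly like this. XRepoFormat and the fetch body are made up for
the example, and register_optimiser() is my shorthand for whatever the
registration hook ends up being called - treat all the names as
provisional:

    from bzrlib.repository import InterRepository

    class XRepoFormat(object):
        """Stand-in for a new repository format a plugin might define."""

    class InterXRepo(InterRepository):
        """Optimised code paths between two X-format repositories."""

        @staticmethod
        def is_compatible(source, target):
            # Only claim the fast paths when both ends are a format we
            # understand; everything else falls back to the generic
            # InterRepository implementation.
            # (Assumes repositories expose their format as ._format.)
            return (isinstance(source._format, XRepoFormat) and
                    isinstance(target._format, XRepoFormat))

        def fetch(self, revision_id=None, pb=None):
            # The format-specific fast fetch would go here.
            raise NotImplementedError(self.fetch)

    # Registered at plugin import time so the InterRepository lookup can
    # find it when both repositories match:
    InterRepository.register_optimiser(InterXRepo)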

Seeking a +1 on the InterRepository implementation attached.

inter-repo-api$ wc -l inter-repo.patch 
1731 inter-repo.patch

inter-repo-api$ diffstat inter-repo.patch 
 NEWS                                                                 |    8 
 bzrlib/branch.py                                                     |   46 -
 bzrlib/fetch.py                                                      |  123 --
 bzrlib/merge.py                                                      |    8 
 bzrlib/repository.py                                                 |  439 +++++++---
 bzrlib/revisionspec.py                                               |    3 
 bzrlib/tests/__init__.py                                             |    4 
 bzrlib/tests/branch_implementations/test_branch.py                   |    8 
 bzrlib/tests/bzrdir_implementations/test_bzrdir.py                   |   90 +-
 bzrlib/tests/interrepository_implementations/__init__.py             |   54 +
 bzrlib/tests/interrepository_implementations/test_interrepository.py |  158 +++
 bzrlib/tests/repository_implementations/test_fileid_involved.py      |    4 
 bzrlib/tests/repository_implementations/test_repository.py           |   73 -
 bzrlib/tests/test_bzrdir.py                                          |    2 
 bzrlib/tests/test_commit_merge.py                                    |    7 
 bzrlib/tests/test_fetch.py                                           |   31 
 bzrlib/tests/test_merge.py                                           |    3 
 bzrlib/tests/test_repository.py                                      |   87 +
 bzrlib/tests/test_revision.py                                        |   13 
 bzrlib/tests/test_selftest.py                                        |   28 
 20 files changed, 846 insertions(+), 343 deletions(-)

Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: inter-repo.patch
Type: text/x-patch
Size: 77962 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20060221/a6ca23b8/attachment.bin 