[MERGE] Update to bzr.dev.

Mon Jun 16 07:01:38 BST 2008

On Thu, 2008-06-12 at 17:22 +1000, Andrew Bennetts wrote:
> Some more review comments:

> > === modified file 'bzrlib/bundle/serializer/v4.py'
> [...]
> >      def write_files(self):
> >          """Write bundle records for all revisions of all files"""
> > -        for vf, file_id, revision_ids in self.iter_file_revisions():
> > -            self.add_mp_records('file', file_id, vf, revision_ids)
> > +        texts = self.repository.texts
> > +        text_keys = []
> > +        for file_id, revision_ids in \
> > +            self.repository.fileids_altered_by_revision_ids(
> > +                self.revision_ids).iteritems():
> > +            revision_ids = list(revision_ids)
> > +            for revision_id in revision_ids:
> > +                text_keys.append((file_id, revision_id))
> > +        self.add_mp_records_keys('file', texts, text_keys)
> 
> This is a little confusing.  I don't see any reason why you list()ify
> revision_ids before iterating it.

I think it was for debugging use - print <generator> isn't very
informative. Removed.

> Also, a minor source of confusion here is that this method uses both
> “self.revision_ids” and “revision_ids” in the same 3-line statement.  Perhaps
> this would be a little clearer:
> 
>         fileids_altered = self.repository.fileids_altered_by_revision_ids(
>             self.revision_ids)
>         for file_id, revision_ids in fileids_altered.iteritems():
>             ...

I went with altered_fileids, it seemed clearer :).

> [...]
> > +    def add_mp_records_keys(self, repo_kind, vf, keys):
> >          """Add multi-parent diff records to a bundle"""
> > -        revision_ids = list(multiparent.topo_iter(vf, revision_ids))
> > -        mpdiffs = vf.make_mpdiffs(revision_ids)
> > -        sha1s = vf.get_sha1s(revision_ids)
> > -        parent_map = vf.get_parent_map(revision_ids)
> > -        for mpdiff, revision_id, sha1, in zip(mpdiffs, revision_ids, sha1s):
> > -            parents = parent_map[revision_id]
> > +        ordered_keys = list(multiparent.topo_iter_keys(vf, keys))
> > +        mpdiffs = vf.make_mpdiffs(ordered_keys)
> > +        sha1s = vf.get_sha1s(ordered_keys)
> > +        parent_map = vf.get_parent_map(ordered_keys)
> > +        for mpdiff, item_key, sha1, in zip(mpdiffs, ordered_keys, sha1s):
> > +            parents = [key[-1] for key in parent_map[item_key]]
> >              text = ''.join(mpdiff.to_patch())
> > +            # Infer file id records as appropriate.
> > +            if len(item_key) == 2:
> > +                file_id = item_key[0]
> > +            else:
> > +                file_id = None
> >              self.bundle.add_multiparent_record(text, sha1, parents, repo_kind,
> > -                                               revision_id, file_id)
> > +                                               item_key[-1], file_id)
> 
> It's a bit of a wart that (revid,) and similar is needed everywhere.  For
> example changing “revision_id” to “item_key[-1]” definitely seems like a step
> backwards for clarity.  Mostly the changes have tended to make the code more
> readable (e.g.  the new “texts” attribute rather than weave_store and friends),
> but this aspect of the patch is making me cringe a little bit every time.  Maybe
> the multi-element keys have spread through more of the API than they should
> have?  It feels like an implementation detail of a low level that has leaked
> into a few more layers than it needed to.
> 
> Signalling the type of a record by the number of elements in the key also feels
> a bit dirty to me.

I think it is the reverse actually - due to time constraints I haven't
propogated the changes as widely as they should be. Its things on the
edge here, like bundle serialization, where the add_multiparent_record
is working in string values. Note that the len() test is following suite
with e.g. decode_name in the same file. I can probably make that
function a little clearer with a comment or two - I shall do so.

> > === modified file 'bzrlib/bzrdir.py'
> [...]
> > -        _revision_store = TextRevisionStore(TextStore(revision_transport,
> > -                                                      prefixed=False,
> > -                                                      compressed=True))
> > +        from bzrlib.xml5 import serializer_v5
> > +        revision_store = RevisionTextStore(revision_transport,
> > +            serializer_v5, False, versionedfile.PrefixMapper(),
> > +            lambda:True, lambda:True)
> >          try:
> > -            transaction = WriteTransaction()
> >              for i, rev_id in enumerate(self.converted_revs):
> >                  self.pb.update('write revision', i, len(self.converted_revs))
> > -                _revision_store.add_revision(self.revisions[rev_id], transaction)
> > +                text = serializer_v5.write_revision_to_string(
> > +                    self.revisions[rev_id])
> > +                key = (rev_id,)
> > +                revision_store.add_lines(key, None, osutils.split_lines(text))
> 
> It seems a bit odd that you pass the serializer to the revision_store, but then
> you still need to serialize it manually before giving it to the revision_store.

It is a little odd I agree, but its how the layering of the thing worked
out. Note that this is converting really old repositories :).

> > +        def yield_roots():
> > +            for root_id, rev_id in root_id_order:
> > +                key = (root_id, rev_id)
> 
> It probably doesn't make a real difference, but it's slightly wasteful to unpack
> a tuple and then immediately make an identical tuple.

Thanks it is - fixed.

-Rob
-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20080616/bf7e7251/attachment.pgp