[MERGE] Move serialisation of inventories into CommitBuilder
Robert Collins
robertc at robertcollins.net
Wed Oct 3 00:59:35 BST 2007
On Tue, 2007-10-02 at 19:20 +1000, Andrew Bennetts wrote:
> bb:tweak
>
> This change seems reasonable to me. I'm not deeply familiar with CommitBuilder
> though, so if someone else wants to review the basic idea of this change that
> would be great.
>
> Further comments inline.
>
> > === modified file 'bzrlib/commit.py'
> > --- bzrlib/commit.py 2007-09-21 04:22:53 +0000
> > +++ bzrlib/commit.py 2007-10-02 01:06:55 +0000
> [...]
> > + content_summary = self.work_tree.path_content_summary(path)
> > + if not specific_files or is_inside_any(specific_files, path):
> > + if content_summary[0] == 'missing':
> > + deleted_paths.add(path)
> > + self.reporter.missing(path)
> > + deleted_ids.append(file_id)
> > + continue
> > + # TODO: have the builder do the nested commit just-in-time IF and
> > + # only if needed.
> > + if content_summary[0] == 'tree-reference':
> > + # enforce repository nested tree policy.
> > + if (not self.work_tree.supports_tree_reference() or
>
> Is it really possible that we could have a tree-reference here if work_tree
> doesn't support tree references? That seems odd. Can you explain to me how
> that might happen? (I'm guessing this seems odd to me because of my ignorance,
> rather than because it's a problem.)
path_content_summary reports the status of whats on disk, not the
logical support for a feature. The check here is a policy check, to keep
that method from having to lie ('no its not a subtree, even if it is'),
which keeps it general enough to be reused e.g. by add.
> > === modified file 'bzrlib/memorytree.py'
> > --- bzrlib/memorytree.py 2007-08-23 00:12:35 +0000
> > +++ bzrlib/memorytree.py 2007-09-05 05:51:34 +0000
> > @@ -98,6 +98,26 @@
> > return None, False, None
> > return entry.kind, entry.executable, None
> >
> > + def path_content_summary(self, path):
> > + """See Tree.path_content_summary."""
> > + id = self.path2id(path)
> > + if id is None:
> > + return 'missing', None, None, None
> > + kind = self.kind(id)
> > + if kind == 'file':
> > + bytes = self._file_transport.get_bytes(path)
> > + size = len(bytes)
> > + executable = self._inventory[id].executable
>
> It doesn't make sense to me to get the executable bit from the inventory but the
> file size from transport. Shouldn't you get both from the same source?
This is what we do for support on filesystems that are not natively
supporting the execute bit; as this is primarily test harness support it
seems sufficient unto the purpose to me.
> [...]
> > === modified file 'bzrlib/repository.py'
> > --- bzrlib/repository.py 2007-09-25 03:36:29 +0000
> > +++ bzrlib/repository.py 2007-09-27 21:11:38 +0000
> > @@ -200,14 +200,15 @@
> [...]
> > + :param content_summary: Summary data from the tree about the paths
> > + content - stat, length, exec, sha/link target. This is only
> > + accessed when the entry has a revision of None - that is when it is
> > + a candidate to commit.
> > :return: True if a new version of the entry has been recorded.
> > (Committing a merge where a file was only changed on the other side
> > will not return True.)
> > """
> > if self.new_inventory.root is None:
> > self._check_root(ie, parent_invs, tree)
> > + if ie.revision is None:
> > + kind = content_summary[0]
>
> I do wonder if it would be a little clearer to unpack the content_summary tuple
> rather than indexing it all the time. You've done a good job of commenting when
> necessary though so I'm not too worried about this. It might be fractionally
> quicker though ;)
It will be worth profiling when the body of this method gets to be a
larger chunk of overall time.
> > + else:
> > + # ie is carried over from a prior commit
> > + kind = ie.kind
> > + # XXX: repository specific check for nested tree support goes here - if
> > + # the repo doesn't want nested trees we skip it ?
> > + if (kind == 'tree-reference' and
> > + not self.repository._format.supports_tree_reference):
> > + # mismatch between commit builder logic and repository:
> > + # this needs the entry creation pushed down into the builder.
> > + raise NotImplementedError
>
> A brief message in this assert would be good, just to make sure we don't get
> mysterious bug reports from users if they do trigger this somehow.
"raise NotImplementedError('missing subtree support in the repository')"
> > + # transitional assert only, will remove before release.
> > + assert ie.kind == kind
>
> Is there a bug or something tracking this, to make sure we remove it before the
> release? Maybe stick a "or version != (0, 92)" in the condition ;)
Another couple of days and I hope to have a patch for it; its not
appropriate IMO to break the release via a condition like that, cute
though it may be.
> > + else:
> > + # Either there is only a hash change(no hash cache entry,
> > + # or same size content change), or there is no change on
> > + # this file at all.
> > + # There is a race condition when inserting content into the
> > + # knit though that can result in different content being
> > + # inserted so even though we may have had a hash cache hit
> > + # here we still tell the store the hash we would *not*
> > + # store a new text on, which means that it can avoid for us
> > + # without a race condition and without double-shaing the
> > + # lines.
> > + nostore_sha = parent_entry.text_sha1
>
> I don't quite follow that comment. That sixty-four word sentence is just a bit
> too convoluted for me to be able to detangle as I read it. Could you please try
> rephrasing it using shorter sentences?
Uhm, nothing comes to mind; but I've been up on the coal face on this
bit of code for some time. Care to suggest a rephrase?
> > + if store:
> > + nostore_sha = None
> > + try:
> > + ie.executable = content_summary[2]
> > + lines = tree.get_file(ie.file_id, path).readlines()
> > + ie.text_sha1, ie.text_size = self._add_text_to_weave(
> > + ie.file_id, lines, heads, nostore_sha)
> > + except errors.ExistingContent:
>
> That try block seems overly broad; surely only the final line can raise that
> exception?
Thanks, fixing.
> > + elif kind == 'symlink':
> > + current_link_target = content_summary[3]
> > + if not store:
> > + # symmlink target is not generic metadata, check if it has
>
> "symmlink" -> "symlink"
Ditto.
-Rob
--
GPG key available at: <http://www.robertcollins.net/keys.txt>.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: This is a digitally signed message part
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20071003/8fea1259/attachment.pgp
More information about the bazaar
mailing list