[MERGE] Refactor commit to prepare for population by tree walking

Ian Clatworthy ian.clatworthy at internode.on.net
Thu Jul 5 15:22:09 BST 2007


Aaron Bentley wrote:
>> The
>> CommitBuilder code needs the SHA value in order to test for some special
>> cases - 'changed' isn't enough.
> 
> That doesn't sound right to me.  I can see how it would want the SHA in
> order to record it, but not in order to test for special cases.  Could
> you explain further?

Sure. Here's the snippet of code from modified_file_text() in
repository.py (line 2026 or so):

         # special case to avoid diffing on renames or
         # reparenting
         if (len(file_parents) == 1
            and text_sha1 == file_parents.values()[0].text_sha1
            and text_size == file_parents.values()[0].text_size):
            previous_ie = file_parents.values()[0]
            versionedfile = self.repository.weave_store.get_weave(file_id,
                self.repository.get_transaction())
            versionedfile.clone_text(self._new_revision_id,
                previous_ie.revision, file_parents.keys())
            return text_sha1, text_size

The SHA is currently set earlier, inside _read_tree_state() in
inventory.py, and is used by _unchanged() in the InventoryFile class. I
can refactor the inventory code so that _unchanged() no longer needs to
be called, because iter_changes already tells me whether the entry
changed. I don't think I can skip the logic in modified_file_text()
though?
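
To show why 'changed' alone isn't enough for that test, here is a
minimal standalone sketch of the condition in the snippet above. The
names ParentEntry, sha1_of and can_reuse_parent_text are hypothetical
and are not bzrlib code; the sketch only mirrors the "single parent,
same SHA, same size" check.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ParentEntry:
    """Hypothetical stand-in for a parent inventory entry."""
    revision: str
    text_sha1: str
    text_size: int


def sha1_of(data: bytes) -> str:
    """Hex SHA-1 of the given bytes."""
    return hashlib.sha1(data).hexdigest()


def can_reuse_parent_text(file_parents, text_sha1, text_size):
    """Return the parent whose text can be reused as-is, or None.

    Mirrors the special case: exactly one parent, and that parent has
    the same SHA-1 and the same size as the text being committed.
    """
    if len(file_parents) != 1:
        return None
    parent = next(iter(file_parents.values()))
    if parent.text_sha1 == text_sha1 and parent.text_size == text_size:
        return parent
    return None


# A renamed file is reported as changed, yet its text is identical to
# the parent's, so the parent's text can be reused without diffing.
content = b"hello\n"
parents = {"rev-1": ParentEntry("rev-1", sha1_of(content), len(content))}
reusable = can_reuse_parent_text(parents, sha1_of(content), len(content))
```

Deciding between "reuse the parent's text" and "store a new text" needs
the content hash itself, not just a changed/unchanged flag.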

Right now, the profiler is telling me that the lookup of the SHA and
executable flag inside InventoryFile._read_tree_state() accounts for a
big part of the running time. I can avoid those lookups if
iter_commitable passes the values to me whenever the working tree
(dirstate) already knows them. (For new files, I need to calculate the
SHA at the end anyway, so there is no point doing it earlier.)
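
Roughly, the idea is something like the sketch below. WalkedFile and
resolve_sha1 are hypothetical names for illustration only, not the
actual iter_commitable interface: the walk hands over whatever the tree
already knows, and the SHA is only computed when no value was supplied.

```python
import hashlib
from typing import Callable, NamedTuple, Optional


class WalkedFile(NamedTuple):
    """Hypothetical record yielded by the tree walk."""
    path: str
    sha1: Optional[str]         # None if the tree has no known value
    executable: Optional[bool]  # None if the tree has no known value


def resolve_sha1(entry: WalkedFile,
                 read_bytes: Callable[[str], bytes]) -> str:
    """Use the SHA-1 supplied by the walk, else hash the file content."""
    if entry.sha1 is not None:
        return entry.sha1
    return hashlib.sha1(read_bytes(entry.path)).hexdigest()
```

With this shape, a file whose SHA is already known never has its
content read, while a new file is hashed exactly once, at the point
where the text is needed anyway.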

Ian C.


