Accelerate build_tree by using working tree files

John Arbash Meinel john at arbash-meinel.com
Fri Dec 21 14:06:24 GMT 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Aaron Bentley wrote:
> Ian Clatworthy wrote:
>> Aaron Bentley wrote:
> 
>>> I think these are sub-optimal results, because AFAICT, you're running
>>> against a dirty dirstate.  _iter_changes should have no reason to call
>>> osutils.sha_file_by_name, because it should all be in dirstate.  But in fact,
>>> it's being called 37 667 times!
>> That's the number of (real) files in the tree. At that point in the
>> script, the only things that have been done are:
> 
>> init repo
>> init
>> add
>> commit
> 
> If you add 'status' here, it should set up dirstate.  But I am surprised
> that it's not being updated as part of 'commit'.  Unless the files are
> all less than 2 seconds old when you commit.

No, commit doesn't update the dirstate. I'm pretty sure we have an open bug on
it. At the moment, we don't have a way to feed the sha1 from the commit logic
back into the dirstate logic.

The only way to update the dirstate is through "update_entry" which is called
by status.
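
To make the effect concrete, here is a toy model of a stat cache, not the
real dirstate code. ToyStatCache, its fields, and the 2 second cutoff are
all assumptions for illustration; only the name "update_entry" comes from
the discussion above. The point is that a file's sha1 is only remembered
once something calls update_entry on it, and only if the file is old enough
to trust its timestamp:

```python
import hashlib
import os
import time


class ToyStatCache:
    """Hypothetical stand-in for the dirstate's cached (stat, sha1) data."""

    def __init__(self, cutoff_seconds=2):
        # Assumed window in which a file is "too fresh" to cache safely.
        self.cutoff_seconds = cutoff_seconds
        self.entries = {}    # path -> (size, mtime, sha1)
        self.files_read = 0  # how many times file content had to be hashed

    def update_entry(self, path):
        """Return the sha1 of path, hashing the file only on a cache miss."""
        st = os.stat(path)
        cached = self.entries.get(path)
        if cached is not None and cached[:2] == (st.st_size, st.st_mtime):
            return cached[2]  # stat data matches, trust the cached sha1
        with open(path, "rb") as f:
            sha1 = hashlib.sha1(f.read()).hexdigest()
        self.files_read += 1
        # Only remember the hash if the file is older than the cutoff;
        # a very recent file could still change within the same timestamp.
        if time.time() - st.st_mtime > self.cutoff_seconds:
            self.entries[path] = (st.st_size, st.st_mtime, sha1)
        return sha1
```

In this model, a commit that never calls update_entry leaves entries empty,
so the next operation that needs sha1s has to hash every file.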

It isn't good layering; it is just how we have it right now. status
*shouldn't* update the dirstate where it currently does, as a side effect of
_iter_changes. Instead, _iter_changes should be able to return
"may_be_modified", and the higher-level code can check the file and hand back
"this is the new sha1".

That way higher levels that may want the file text only have to read it 1 time
(rather than reading to check the sha1, and then reading again to diff/commit
the text to the repository).
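
A rough sketch of the layering I have in mind, not working bzrlib code.
Everything here is made up for illustration: the Change tuple, the
"fingerprint" standing in for stat data, and the plain dict standing in for
the dirstate. iter_changes never reads file content; the caller reads each
uncertain file once, reuses that text, and hands the sha1 back:

```python
import hashlib
from collections import namedtuple

# Hypothetical result type; the real _iter_changes yields richer tuples.
Change = namedtuple("Change", "path status")


def iter_changes(tree_stat, cached):
    """Classify paths from stat data alone, never reading file content.

    tree_stat: path -> stat fingerprint of the file on disk
    cached:    path -> (fingerprint, sha1) remembered from earlier
    """
    for path, fingerprint in sorted(tree_stat.items()):
        entry = cached.get(path)
        if entry is not None and entry[0] == fingerprint:
            yield Change(path, "unchanged")
        else:
            yield Change(path, "may_be_modified")


def collect_modified_texts(tree_stat, cached, read_file):
    """Higher-level caller: read each uncertain file exactly once."""
    texts = {}
    for change in iter_changes(tree_stat, cached):
        if change.status != "may_be_modified":
            continue
        text = read_file(change.path)  # the single read of this file
        sha1 = hashlib.sha1(text).hexdigest()
        old = cached.get(change.path)
        if old is None or old[1] != sha1:
            texts[change.path] = text  # really changed; text is reused
        # Hand the new sha1 back so the next run can skip this file.
        cached[change.path] = (tree_stat[change.path], sha1)
    return texts
```

A commit or diff built this way gets the text and the sha1 from the same
read, and the cache ends up warm as a result rather than as a side effect
buried inside the iterator.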

> 
>> I don't have time right now to look any deeper as I have one or two
>> other things to wrap up before my break. Either commit isn't updating
>> the dirstate or the dirstate is being generated for the new branch, yes?
> 
> Generating the dirstate for the new branch wouldn't cause any sha1s to
> be calculated.  It's got to be that commit isn't updating dirstate.
> 
>> Can we and should we optimise the dirstate generation of the new branch
>> using information from the old one?
> 
> I think the only case where that would make sense is when hardlinking
> working trees.  But in that case, you should be able to copy the stat
> data and sha1s for hardlinked files from the old tree into the new one.
> 
> Aaron

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHa8ggJdeBCYSNAAMRAnLbAKDI/rGXEXd+3Tfa4igNm0lgyDzzqwCfR9SF
8YlxurO9QApy95iXoxU1otk=
=heqp
-----END PGP SIGNATURE-----


