[MERGE] Be more efficient with sha operations during commit

John Arbash Meinel john at arbash-meinel.com
Thu Sep 27 21:01:08 BST 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> On Mon, 2007-09-24 at 13:53 +1000, Ian Clatworthy wrote:
>> Robert Collins wrote:
>>> This saves 10 seconds or so on an initial commit of a mozilla tree by
>>> using sha_string, rather than sha_strings.
>> bb:tweak
>>
>>> As part of it I've removed an unused function from the _KnitData class,
>> As _KnitData is a private class, it's not strictly needed, though
>> mentioning in NEWS that add_record was removed would be nice.
> 
> As it really is internal, I think it's noise to do that.
> 
> 
>> Setting of size & bytes is common to both branches of the if statement
>> so pull it out as common code.
> 
> I have another patch that is not yet baked that makes the assignment be
> different in one branch, for another performance win.
> 
> -Rob

One quick comment here. If sha.new(''.join(strings)).hexdigest() is so much
more efficient than creating sha.new() and calling update() once per string,
then why don't we just change 'sha_strings' to use the more efficient form?
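
To make the comparison concrete, here is a minimal sketch of the two forms.
It assumes a modern Python, where hashlib.sha1 stands in for the old sha
module. The names sha_strings_incremental and sha_strings_joined are
hypothetical, chosen for illustration, and are not bzrlib's actual helpers.

```python
import hashlib


def sha_strings_incremental(strings):
    """Hash by feeding each string to the hash object (one update() per item)."""
    hasher = hashlib.sha1()
    for s in strings:
        hasher.update(s)
    return hasher.hexdigest()


def sha_strings_joined(strings):
    """Hash by joining first, which builds a temporary full-text copy."""
    return hashlib.sha1(b"".join(strings)).hexdigest()


# Hypothetical sample input: a file's content held as a list of lines.
lines = [b"first line\n", b"second line\n", b"third line\n"]

# Both forms hash the same byte stream, so the digests must match.
assert sha_strings_incremental(lines) == sha_strings_joined(lines)
```

Since both forms return the same digest, the joined form could presumably be
used inside 'sha_strings' itself without any change for its callers.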

Also, we should be aware that this is a big space versus time tradeoff, and I'm
not sure it is one we want to make. Joining requires creating another copy in
memory (the full text of the file, in addition to the list of its lines).

It may be okay, but it is one of those places that bloats us from holding 2
copies in memory to holding 3 (even if only briefly, while we compute the sha
hash).
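
One rough way to see the extra copy is to measure peak allocation around each
form. This is only a sketch under assumptions: it uses hashlib and tracemalloc
from a modern Python, the helper names (peak_bytes, hash_incremental,
hash_joined) are hypothetical, and the roughly 1 MB input is synthetic rather
than a real tree.

```python
import hashlib
import tracemalloc


def peak_bytes(func, *args):
    """Return the peak traced allocation, in bytes, while func(*args) runs."""
    tracemalloc.start()
    try:
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def hash_incremental(lines):
    """One update() call per line; no full-text copy is built."""
    hasher = hashlib.sha1()
    for line in lines:
        hasher.update(line)
    return hasher.hexdigest()


def hash_joined(lines):
    """Join first, then hash; the joined text is a temporary extra copy."""
    return hashlib.sha1(b"".join(lines)).hexdigest()


# Synthetic input built before tracing starts, so it is not counted.
lines = [b"x" * 99 + b"\n" for _ in range(10000)]
total = sum(len(line) for line in lines)

incremental_peak = peak_bytes(hash_incremental, lines)
joined_peak = peak_bytes(hash_joined, lines)

print("text size:        %d bytes" % total)
print("incremental peak: %d bytes" % incremental_peak)
print("joined peak:      %d bytes" % joined_peak)
```

The joined form should show a peak of at least the full text size, which is
the third copy described above. Whether that is acceptable depends on how
large the files being committed are.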

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG/AwEJdeBCYSNAAMRAk4QAKCHi6NtsoVOtRF5pZQPwuSRth+LFwCeJTpc
1mddvXfNVt9c9UAg0R7hKQo=
=jM7T
-----END PGP SIGNATURE-----


