[MERGE] Be more efficient with sha operations during commit

Robert Collins robertc at robertcollins.net
Thu Sep 27 22:51:15 BST 2007


On Thu, 2007-09-27 at 15:01 -0500, John Arbash Meinel wrote:
> 
> 
> One quick comment here. If sha.new(''.join(strings)).hexdigest() is so much
> more efficient than sha.new()...map(sha.update, strings), then why don't we
> just change 'sha_strings' to use the more efficient form?

For two reasons. First, once we've done the string join, we can reuse the
joined text for serialization in the full-text case (I have a further
patch that does this). Second, the space vs time tradeoff you mention
would be surprising for someone calling a _strings method - one expects
such methods to operate on the strings as given and not toss e.g. 2GB of
memory away on a joined copy.
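
To make the two forms concrete, here is a minimal sketch, not the code from
the patch. It assumes modern Python with hashlib in place of the old sha
module, and the function names sha_strings_incremental and
sha_joined_with_text are hypothetical, chosen only for illustration. The
first form never builds a joined copy; the second builds one and hands it
back so the caller can reuse it (e.g. for serialization).

```python
import hashlib


def sha_strings_incremental(strings):
    """Hash a sequence of byte strings without building a joined copy."""
    hasher = hashlib.sha1()
    for chunk in strings:
        hasher.update(chunk)
    return hasher.hexdigest()


def sha_joined_with_text(strings):
    """Join first, then hash in one call.

    Returns (hexdigest, joined_text). The joined text is an extra
    in-memory copy of the content, returned so the caller can reuse it.
    """
    text = b"".join(strings)
    return hashlib.sha1(text).hexdigest(), text


# Hypothetical sample input: a file held as a list of lines.
lines = [b"first line\n", b"second line\n", b"third line\n"]

digest_incremental = sha_strings_incremental(lines)
digest_joined, full_text = sha_joined_with_text(lines)
```

Both forms produce the same digest; the difference is that the joined form
holds one more full copy of the content while it runs, which is why it is
kept out of a method named _strings.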

> Also, we should be aware that this is a big space versus time tradeoff, and
> I'm not sure it is one we want to make. It requires creating another copy in
> memory (the full text of the file, versus a list of its lines).

Indeed it does. I thought I noted that somewhere in the patch.

> It may be okay, but it is one of those places that bloats us from holding 2
> copies in memory to holding 3 (even if only for a brief second while we
> compute the sha hash).

-Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.