performance note sha1 vs md5 in python

Robert Collins robertc at robertcollins.net
Mon Aug 6 05:27:52 BST 2007


On Mon, 2007-08-06 at 13:57 +1000, Martin Pool wrote:
> Those are interesting rules-of-thumb to keep in mind.
> 
> If we're going to use md5 then we could even use md4 and be faster
> still, though that would require providing our own C implementation.

It appears to be in Crypto.Hash - how available is that?
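
One rough way to answer that on a given machine is to probe for it. The
sketch below is only an illustration: the helper name md4_sources is
made up for this note, and it assumes the third-party Crypto package
(PyCrypto or PyCryptodome) exposes MD4 as Crypto.Hash.MD4. It also
checks whether the local hashlib/OpenSSL build offers md4, which is not
guaranteed.

```python
import hashlib


def md4_sources():
    """Return the names of MD4 providers usable in this interpreter.

    md4_sources is a hypothetical helper, not part of bzr or any library.
    """
    found = []
    try:
        # Third-party package; assumed to be PyCrypto or PyCryptodome.
        from Crypto.Hash import MD4  # noqa: F401
        found.append("Crypto.Hash.MD4")
    except ImportError:
        pass
    try:
        # hashlib only offers md4 when the underlying OpenSSL build does.
        hashlib.new("md4", b"")
        found.append("hashlib")
    except ValueError:
        pass
    return found


if __name__ == "__main__":
    print(md4_sources() or "no MD4 implementation found")
```

An empty list means we would have to ship our own implementation, as
Martin suggests.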

> There are slight advantages to SHA1 in: consistency, not answering
> questions about why we chose it or whether it's a problem, and having
> security even for spaces where it's not expected.
> 
> If we could do a 64MB commit or pull in less than 2.5s then changing
> from sha1 to md5 would be a 10% overall improvement, which would be
> noticeable.

Pulling the first 2000 mainline revisions from bzr.dev (pull -r 2000)
generates 33MB of data, of which 30MB is the pack. That's ~0.15 seconds
to md5sum, and 0.3 seconds to sha1sum. Currently it takes 36 seconds to
pull that from a knit repository, and 54 from another pack repository,
but knit to knit is 26 seconds. So we're clearly a long way from
2.5 seconds. However:
s$ time cp -a experimental{,2}

real    0m0.452s
user    0m0.004s
sys     0m0.208s

That's a lower limit to aim for :). And when the whole operation takes
half a second, 0.3 of a second spent hashing would be a serious hit.
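
For anyone wanting to reproduce the hashing numbers from inside Python
rather than with md5sum/sha1sum, here is a minimal sketch. The helper
names time_hash and share_of_total are invented for this note, the
30MB buffer of zero bytes merely stands in for the pack, and the
timings will differ from machine to machine.

```python
import hashlib
import time


def time_hash(name, data, repeats=3):
    """Best wall-clock time in seconds to hash data once with algorithm name."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        hashlib.new(name, data).digest()
        best = min(best, time.perf_counter() - start)
    return best


def share_of_total(hash_seconds, total_seconds):
    """Fraction of an operation's total time that is spent hashing."""
    return hash_seconds / total_seconds


if __name__ == "__main__":
    # Stand-in for the 30MB pack; real pack contents are an assumption here.
    data = b"\0" * (30 * 1024 * 1024)
    for name in ("md5", "sha1"):
        print("%s: %.3fs" % (name, time_hash(name, data)))
    # Using the figures quoted in this mail: 0.3s of sha1 against the
    # 0.452s cp and against the current 36s pull.
    print("vs cp:   %.0f%%" % (100 * share_of_total(0.3, 0.452)))
    print("vs pull: %.1f%%" % (100 * share_of_total(0.3, 36)))
```

With the figures above, 0.3 seconds is roughly two thirds of the cp
time but under 1% of the current 36 second pull, which is why the hash
choice only starts to matter once everything else gets much faster.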

-Rob

-- 
GPG key available at: <http://www.robertcollins.net/keys.txt>.