[MERGE] 10-15% faster inventory serialisation but changes canonical form

John Arbash Meinel john at arbash-meinel.com
Fri Sep 21 19:55:51 BST 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> bb:reject
> On Thu, 2007-09-13 at 08:25 +1000, Robert Collins wrote:
>> This patch does several things.
> ....
> 
> There is more test fallout - other things looking at the xml.
> 
> For 100ms on the initial commit benchmark I'm hacking on, this isn't
> worth spending more time on. So I'm going to leave it here - if someone
> wants to pick it up and polish it later that would be fine with me.
> 
> -Rob (who is looking for multi-second gains, not sub-second)

Well, you might consider rewriting it in a Pyrex extension.

Our current "append = out.append" hack exists because list appending is the
fastest thing we could find in Python. "if" checks get a lot cheaper in Pyrex.
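
For anyone who hasn't seen the hack, it looks roughly like the sketch
below. The function name, the (file_id, name) tuples and the XML-ish line
format are made up for illustration; this is not the actual serialiser code.

```python
def serialise_entries(entries):
    """Build output lines for (file_id, name) pairs.

    Hypothetical stand-in for the real serialiser; only the
    append-binding trick is the point here.
    """
    out = []
    # Bind the bound method once, so the loop does not repeat the
    # attribute lookup on every iteration.
    append = out.append
    for file_id, name in entries:
        append('<file file_id="%s" name="%s" />\n' % (file_id, name))
    return out


lines = serialise_entries([("a-id", "a.txt"), ("b-id", "b.txt")])
text = "".join(lines)
```

The list of lines is only joined at the very end, and only if a single
string is actually needed.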

But what is the total time for inventory serialization? I have the
feeling it is maybe 1s for a Moz tree, which means at best you can save
only 1 second of commit time. That starts to matter a bit for
incremental commits (because we are creating a full inventory and then
diffing, etc).

Oh, and you asked off-list about sha1 sums, etc. I'm pretty sure
that our "osutils.sha_strings()" function is quite fast. (Calculating
the sha1 of 485 files takes 190ms on my machine.)
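
The idea of hashing a list of strings without joining them first can be
sketched as follows. This is a stand-in written against the standard
hashlib module, not a copy of the osutils code, and the name
sha_strings_sketch is made up.

```python
import hashlib


def sha_strings_sketch(strings):
    """Return the hex sha1 of a sequence of byte strings.

    Each chunk is fed to the hash object in turn, so no joined
    copy of the whole text is ever built.
    """
    s = hashlib.sha1()
    for chunk in strings:
        s.update(chunk)
    return s.hexdigest()


chunks = [b"line one\n", b"line two\n"]
digest = sha_strings_sketch(chunks)
```

The digest is the same as hashing the joined text, since sha1 only sees
the byte stream and not the chunk boundaries.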

So I would guess that building up the Inventory into a list of strings
and passing that directly to the patience code would be better than
serializing all the way down to a pure string, and then building it back up.

The flip side is that you might end up creating 1 list per line (which
you then ''.join()) so it might cost a bit of malloc time, etc.
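
A rough sketch of what I mean by diffing the list of lines directly. It
uses the standard library's difflib.SequenceMatcher as a stand-in for the
patience matcher, and the inventory lines are invented for the example.

```python
import difflib


def changed_line_ranges(old_lines, new_lines):
    """Return the non-equal opcodes between two lists of lines.

    The matcher works on the lists as they are; nothing is joined
    into one big string and split apart again.
    """
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines)
    return [op for op in matcher.get_opcodes() if op[0] != "equal"]


old = ["<inventory>\n", '<file name="a" />\n', "</inventory>\n"]
new = [
    "<inventory>\n",
    '<file name="a" />\n',
    '<file name="b" />\n',
    "</inventory>\n",
]
ops = changed_line_ranges(old, new)
```

Here ops holds a single "insert" opcode for the new line. Whether the
per-line list building costs more than the join/split it avoids is
something that would have to be measured.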

John
=:->

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFG9BO3JdeBCYSNAAMRAh5pAJ49HkR6nEDfTLMBgO4Ho4dUQSqjbACgufVq
CRezntmjNeQD2zb5YaA3HfA=
=BKY3
-----END PGP SIGNATURE-----


