Rev 4747: Some more performance data. in http://bazaar.launchpad.net/~jameinel/bzr/2.1-static-tuple

John Arbash Meinel john at arbash-meinel.com
Fri Oct 2 22:46:18 BST 2009


At http://bazaar.launchpad.net/~jameinel/bzr/2.1-static-tuple

------------------------------------------------------------
revno: 4747
revision-id: john at arbash-meinel.com-20091002214612-rq4vb31klg6tqqr2
parent: john at arbash-meinel.com-20091002211441-8j4d2d00cjg4oi7p
committer: John Arbash Meinel <john at arbash-meinel.com>
branch nick: 2.1-static-tuple
timestamp: Fri 2009-10-02 16:46:12 -0500
message:
  Some more performance data.
  
  The #1 thing is that the StaticTupleInterner type is not actually tracked by the GC,
  because it doesn't have any pure 'object' attributes. As such it doesn't implement
  tp_traverse either, which means we can't track its size via Meliae as well as I
  would like.
  However, in an interesting result, memory consumption is down *and* speed is better.
  On LP: 259460KB => 243936KB (about 15MB saved) and 0m30.525s => 0m26.457s.
  My guess is that by not walking the intern dict, the gc has a lot less work to do,
  since it no longer walks all those keys that we know it will just ignore anyway.
  Adding the 'self.hash' attribute brings memory consumption back up a bit, to
  252724KB, and brings the time down barely at all, to 0m25.756s.
  Probably we could get more time performance out of StaticTupleInterner by making
  the internal table only point to StaticTuple objects, and then we could access
  their self->hash directly.
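
As an illustration of what "not in the GC" means at the Python level, gc.is_tracked() reports whether the cycle collector tracks a given object. This is only a sketch using built-in types and a Python version that provides gc.is_tracked(); it does not involve StaticTupleInterner itself:

```python
import gc

# Atomic objects cannot hold references to other objects, so the
# cycle collector does not track them.
int_tracked = gc.is_tracked(12345)
str_tracked = gc.is_tracked("a key")

# Containers that can hold arbitrary object references are tracked and
# get traversed during collection passes.
list_tracked = gc.is_tracked([])

print(int_tracked, str_tracked, list_tracked)
```

An untracked container is never visited by the collector, which is consistent with the guess above that skipping the intern table saves gc work.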
  
  However, seeing this means I can move StaticTuple itself into a pyrex function,
  since I'll only have C attributes and a PyObject* table. I just hope I can figure
  out how to get a tp_traverse that Meliae can use...
  
  Note that as a reference point, w/ bzr.dev:
    326984KB, 53.4s
  So we now have a 1.34:1 memory savings and a 2:1 speed improvement.
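
The ratios quoted here can be re-derived from the raw figures in this message. A quick check, assuming the 1.34:1 and 2:1 figures compare bzr.dev against the first StaticTupleInterner measurement (243936KB, 26.457s); the variable names are only for illustration:

```python
# Figures copied from this commit message.
bzr_dev_kb, bzr_dev_s = 326984, 53.4
interner_kb, interner_s = 243936, 26.457     # without the 'self.hash' attribute
with_hash_kb, with_hash_s = 252724, 25.756   # with the 'self.hash' attribute

mem_ratio = bzr_dev_kb / float(interner_kb)
speed_ratio = bzr_dev_s / interner_s
mem_ratio_hash = bzr_dev_kb / float(with_hash_kb)
speed_ratio_hash = bzr_dev_s / with_hash_s

print("no hash attr:   memory %.2f:1, speed %.2f:1" % (mem_ratio, speed_ratio))
print("with hash attr: memory %.2f:1, speed %.2f:1"
      % (mem_ratio_hash, speed_ratio_hash))
```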
  The other guess on why StaticTupleInterner works well is that the hash function
  really does spread things out evenly, so we rarely get collisions.
  And if you don't have a collision, then you don't care about the cost of
  hash(), because it is only computed for the new key (which you have to do anyway)
  and not for any of the entries already present.
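
The collision argument can be sketched with a toy open-addressing table that stores bare object references and no cached hashes. This is not the StaticTupleInterner implementation; Key and InternTable are hypothetical names, and the table uses linear probing with no resizing:

```python
class Key(object):
    """Hypothetical key type that counts how often its hash is requested."""
    hash_calls = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        Key.hash_calls += 1
        return hash(self.value)

    def __eq__(self, other):
        return isinstance(other, Key) and self.value == other.value


class InternTable(object):
    """Toy intern table: slots hold object references only."""

    def __init__(self, size=8):
        self._slots = [None] * size  # size must be a power of two

    def add(self, key):
        """Return the stored object equal to key, inserting key if absent."""
        mask = len(self._slots) - 1
        key_hash = hash(key)  # always needed for the incoming key
        i = key_hash & mask
        for _ in range(len(self._slots)):
            cur = self._slots[i]
            if cur is None:
                self._slots[i] = key
                return key
            # Occupied slot: only now is the stored entry's hash needed.
            if cur is key or (hash(cur) == key_hash and cur == key):
                return cur
            i = (i + 1) & mask  # linear probing
        raise RuntimeError("table full")


table = InternTable()
for value in (1, 2, 3):
    table.add(Key(value))
calls_without_collision = Key.hash_calls  # one hash() per inserted key

table.add(Key(9))  # 9 & 7 == 1, so this probes past the three stored keys
calls_after_collision = Key.hash_calls
```

Without collisions, each insert costs exactly one hash() call. The colliding insert pays one call for the new key plus one for every occupied slot it probes past.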
  
  Time for 'bzr log -r -1 -n0 bzr.dev' is 691ms, which is ~= bzr.dev.
  'bzr log -v bzr.dev' is 19.1s => 18.7s and 169MB => 197MB, which is a net loss... :(
  May have to revisit the CHKMap internals.
  'bzr log -n0 -v bzr.dev' is 2m25s => 2m21s and 235MB => 244MB, which isn't a great
  tradeoff.
  Strange that things don't seem to be a win in 'real-life', I wonder if
  somewhere we are casting things back into regular tuples that I missed.
  (tuple(tpl) is tpl, but tuple(static_tpl) is not static_tpl).
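
The identity difference noted in the parenthetical can be seen with a stand-in class. FakeStaticTuple below is hypothetical and only mimics a tuple-like sequence that is not a tuple subclass; the `tuple(tpl) is tpl` result is CPython behaviour for exact tuples:

```python
class FakeStaticTuple(object):
    """Hypothetical stand-in: tuple-like, but not a tuple subclass."""

    def __init__(self, *items):
        self._items = items

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]


tpl = ("file-id", "rev-id")
static_tpl = FakeStaticTuple("file-id", "rev-id")

same_object = tuple(tpl) is tpl   # CPython hands back the very same tuple
as_tuple = tuple(static_tpl)      # builds a brand-new plain tuple

print(same_object, type(as_tuple).__name__)
```

Any code path that calls tuple() on such an object allocates a fresh regular tuple, which would undo both the memory savings and the interning.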
  
  There is also the possibility that tuples are special-cased in more places, or
  that the custom __iter__ functionality is to blame, etc.
  
  Next up, mem testing on 64-bit.