[MERGE] Pyrex RIO implementation

Thu May 14 22:27:06 BST 2009

On Thu, May 14, 2009 at 03:53:21PM -0500, John Arbash Meinel wrote:
> >> value = PyUnicode_DecodeUTF8(buffer, bytes_used-1, "strict")

> >> and you don't have to worry about any of the list append calls, or
> >> decoding bits as you go, etc.

> >> You also get to *re-use* this buffer for every key:value pair, since it
> >> is only used until you finish.
> > I've implemented this, too.
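
For readers following along, here is a minimal sketch of the single-buffer idea in plain Python rather than Pyrex. It is not the code from the patch: the function name `read_stanza_pairs` and the simplified line format (a `key: value` line, with continuation lines starting with a tab) are assumptions for illustration. The raw bytes of each value are collected in one reusable buffer and decoded once with strict UTF-8 handling, instead of decoding and appending piece by piece.

```python
def read_stanza_pairs(lines):
    """Parse simplified RIO-like byte lines into (key, value) pairs.

    Hypothetical helper for illustration only; the real parser handles
    more cases (blank-line stanza separators, key validation, etc.).
    """
    buf = bytearray()  # one buffer, reused for every key:value pair
    key = None
    pairs = []

    def strip_newline(chunk):
        # Remove exactly one trailing newline, if present.
        return chunk[:-1] if chunk.endswith(b"\n") else chunk

    for line in lines:
        if line.startswith(b"\t"):
            # Continuation line: extend the current value in place.
            if key is None:
                raise ValueError("continuation line without a key")
            buf += b"\n"
            buf += strip_newline(line[1:])
        else:
            if key is not None:
                # Value is complete: decode the whole buffer in one call.
                pairs.append((key, buf.decode("utf-8", "strict")))
            raw_key, sep, rest = line.partition(b": ")
            if not sep:
                raise ValueError("expected 'key: value', got %r" % (line,))
            key = raw_key.decode("ascii")
            del buf[:]  # empty the buffer so it can be reused
            buf += strip_newline(rest)

    if key is not None:
        pairs.append((key, buf.decode("utf-8", "strict")))
    return pairs
```

In the C-level version discussed here, the single `decode` call corresponds to `PyUnicode_DecodeUTF8(buffer, length, "strict")` on the accumulated bytes.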

> >> You unfortunately don't get to share this buffer between Stanzas. You
> >> *could*, though I'd want to make sure to memory cap it to something
> >> reasonable, so that decoding a giant Revision text won't leave us with
> >> lots of wasted memory just sitting around.
> > I think the current improvements make things fast enough. Doing a
> > malloc (there'll usually just be one or two per revision) doesn't seem
> > like a huge problem to me.
> As you said in the other message:
> > With this patch, parsing all revision texts from bzr.dev now takes
> > ~1.35s with the XML serializer and ~0.91s with the RIO one.

> 0.91s certainly brings it out of the critical path for 'bzr log' time,
> though I'm curious where the remaining time is. (It may be in Stanza =>
> Revision, for example.)
Yeah, it's about 50/50 at the moment (50% utf8 -> stanza, 50% stanza
-> rev).

> Also, you don't need ';' at the end of lines...
Habits of a C programmer... :-)

> Otherwise it looks good to me.

> BB:tweak
Thanks for the quick and detailed reviews :-)

Cheers,

Jelmer