One reason to use single character states

John Arbash Meinel john at arbash-meinel.com
Mon May 7 22:53:02 BST 2007


I was digging through the Python source code the other day so I could
understand some of the C API. One thing I found was that the empty
string ('') and all single-character strings are automatically interned
and reused.

This means that if you do f.read(1), you will always get the same
object back for the same character. It also means that making the
'minikind' a single character does more than just save space in the
file. It also makes parsing faster, because we don't create 20,000
objects just for the minikind field.
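
As a rough illustration of the point above, here is a small sketch that
reads one-character fields with read(1) and counts how many distinct
objects come back. The helper names (count_distinct_objects,
read_fields) and the sample data are made up for this example and are
not the real dirstate parser. It uses bytes and io.BytesIO so it runs on
a current Python, and the sharing it measures is a CPython
implementation detail, so other interpreters may report a different
count.

```python
import io


def count_distinct_objects(values):
    """Return how many distinct objects (by identity) appear in values."""
    # Keep every value alive in a list while collecting ids, so that an
    # id cannot be recycled for a later object during the count.
    values = list(values)
    return len({id(v) for v in values})


def read_fields(stream, count):
    """Read `count` one-byte fields from a binary stream via read(1)."""
    return [stream.read(1) for _ in range(count)]


if __name__ == "__main__":
    # Hypothetical input: 20,000 one-character minikind codes.
    data = b"fdal" * 5000
    kinds = read_fields(io.BytesIO(data), len(data))
    # On CPython the distinct-object count is expected to be tiny compared
    # with the number of fields, because one-byte values are shared.
    print(len(kinds), "fields,", count_distinct_objects(kinds), "distinct objects")
```

Running the same count over longer fields (for example a multi-character
kind name read with read(4)) should show roughly one object per field,
which is the allocation cost the single-character field avoids.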

A similar thing should be happening for the "executable" field (since it
is 'y' or 'n'), and for the size field when the size is 0.

Anyway, it was just something interesting that I found. I'm working on
re-implementing some of the core parsing in Pyrex/C, and I realized I
don't have to optimize object reuse at that level.

John
=:->


