Switching from Pyrex to Cython

John Arbash Meinel john at arbash-meinel.com
Tue Jul 14 16:30:02 BST 2009


I've recently been writing some code in Pyrex, and I figured I'd try
Cython as well. So far, I've been really happy with it.

The compiler gives much better errors. For example, if I remove a
trailing ')' to create a syntax error, Pyrex reports:

pyrexc bzrlib/_annotator_pyx.pyx --> bzrlib/_annotator_pyx.c
c:\Users\jameinel\dev\bzr\bzr.dev\bzrlib/_annotator_pyx.pyx:255:8:
Expected ')'

versus Cython's:
Error converting Pyrex file to C:
------------------------------------------------------------
...
                                   this_annotation, parent_key):
        """Reannotate this text relative to a second (or more) parent."""
        (parent_annotations,
         matching_blocks) = self._get_parent_annotations_and_matches(
                                key, lines, parent_key
        _merge_annotations(this_annotation, annotations, parent_annotations,
       ^
------------------------------------------------------------

c:\Users\jameinel\dev\bzr\bzr.dev\bzrlib\_annotator_pyx.pyx:255:8:
Expected ')'


This is a modest improvement, but Cython also does somewhat better at
type checking. (It refuses to convert a double to an int without an
explicit cast, which I probably prefer anyway.)

I've also seen cases where Cython does compile-time checking of
variables and tells you that a variable has not been defined, rather
than waiting for a NameError at runtime. (Though we already have a habit
of needing tests to cover these cases anyway.)
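
To illustrate the difference in plain Python (a sketch only; the
function and variable names are made up for this example and are not
from bzrlib): the interpreter accepts a misspelled name when the module
is loaded and only complains once the offending line actually runs.

```python
def describe(lines, verbose=False):
    """Hypothetical helper: return a line count, or a message if verbose."""
    count = len(lines)
    if verbose:
        # Typo: 'cnt' is never defined. Python still accepts the function
        # definition; the mistake only surfaces if this branch executes.
        return "%d lines" % cnt
    return count


def call_verbose():
    """Exercise the buggy branch and report the kind of error it raises."""
    try:
        describe(["a", "b"], verbose=True)
    except NameError as exc:
        return type(exc).__name__
    return None


if __name__ == "__main__":
    print(describe(["a", "b"]))  # the common path never touches 'cnt'
    print(call_verbose())        # the rare path fails only when it runs
```

A compile-time check reports this kind of mistake without needing a test
that happens to reach the branch.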



The really big win with Cython, which we would also get if we just
required Pyrex 0.9.8+, is that instead of writing:

cdef extern from "Python.h":
  int PyList_Append(object, object) except -1

...

  mylist = []
  PyList_Append(mylist, new_val)

You can just write:

  cdef list mylist
  mylist = []
  mylist.append(new_val)

The compiler knows the type and can call PyList_Append directly where
appropriate. There are a few further advantages of Cython over Pyrex,
such as:

1) 'tuple' as a recognized type. I don't know if it generates optimized
code, but it at least lets you declare:
  cdef tuple mytuple

In Pyrex that is a type error.

2) Slightly faster tuple unpacking and for loops. I'm not sure this
would actually show up in profiling, but it does add checks like:

if (PyList_CheckExact(obj)):
  # Use PyList_GetItem instead of PyObject_GetIter

So it adds a pointer comparison to avoid doing a malloc on common cases
in inner loops. Similarly for:

  x, y, z = function()

It has a:
if (PyTuple_CheckExact())
on the return value.

It also uses the 'likely' and 'unlikely' macros, though I'm not sure how
much that really affects things.
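
As a rough Python-level picture of the exact-type checks described in 2)
(a sketch of the idea only, not the C that Cython actually emits; the
function names are invented for illustration): test for the exact
built-in type first, index into the object directly if it matches, and
fall back to the generic protocol otherwise.

```python
def sum_items(obj):
    """Sum the items of obj, with a separate path for exact lists."""
    total = 0
    if type(obj) is list:
        # Exact-type match (subclasses of list do not qualify), so the
        # items can be fetched by index without creating an iterator.
        for i in range(len(obj)):
            total += obj[i]
    else:
        # Generic path: ask the object for an iterator and walk it.
        for item in iter(obj):
            total += item
    return total


def unpack3(value):
    """Unpack three items, with a separate path for exact tuples."""
    if type(value) is tuple and len(value) == 3:
        # Exact tuple of the right size: read the slots directly.
        return value[0], value[1], value[2]
    # Generic path: ordinary iterator-based unpacking.
    x, y, z = value
    return x, y, z


if __name__ == "__main__":
    print(sum_items([1, 2, 3]))   # takes the exact-list path
    print(sum_items((1, 2, 3)))   # takes the generic path
    print(unpack3((1, 2, 3)))     # takes the exact-tuple path
```

In pure Python the first branch is not actually any faster; the sketch
only shows the shape of the dispatch. The saving comes at the C level,
where the type test is a pointer comparison and the generic path has to
allocate an iterator.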


Anyway, I'd *really* like to at least upgrade to a version of Pyrex that
supports
  cdef list myobject
and
  myval += 1

And if we are going to do that, we should consider upgrading to Cython
anyway.


Some other differences I've seen:

1) If you get an exception in Pyrex code, it gives you a traceback which
includes the lines in the .pyx file. Cython does this too, but it also
includes the lines in the .c file. (Potentially too much info, but it
can help when debugging.)

2) Pyrex adds code comments like:

/* C:/Path/to/bzr/bzrlib/foo.pyx:125 */

Which lets you know where the line came from. Cython goes a step further
with:

/* C:/Path/to/bzr/bzrlib/foo.pyx:125

   python code that
   was turned into this code	# <<<<<<<<<<
   and some context lines
 */

While this makes it a bit worse for versioning the auto-generated files,
it has been a bit useful when trying to debug how things are being
generated, without having to go back and forth between the files.

I believe the big reason we deferred back in the day was that packages
were not available at that time for the PQM machine. And right now the
latest version of Pyrex on Jaunty is still 0.9.7 (which supports neither
+= nor "cdef list foo").

I think, though, that there is a genuine benefit to moving to
*something* newer, and if we have to package things ourselves, it seems
like Cython is better than Pyrex 0.9.8...

John
=:->


