Bazaar on IronPython
Andrew Bennetts
andrew.bennetts at canonical.com
Mon Jun 29 01:23:11 BST 2009
Martin (gzlist) wrote:
> In short, see the attached patch for getting bazaar to run on
> IronPython. It only passes a subset of the tests, but can do some of
[...]
Interesting!
> Some misc observations from the process:
>
> Bazaar has a lot of platform switching code in various different
> places. I get a strong impression that anything touching sys.platform
> or errno should be migrated to osutils.
Yes, that's probably a good guideline. (I'm not sure it would be right to
make it a firm rule, though.)
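As a rough illustration of that guideline (a sketch only: the helper names below are hypothetical and are not bzrlib.osutils' actual API, and the syntax is modern Python rather than 2.4), platform and errno checks can sit behind a few small functions so that callers never touch sys.platform or errno directly:

```python
import errno
import sys

# Hypothetical helpers for illustration; not the real bzrlib.osutils API.
IS_WINDOWS = sys.platform == "win32"


def is_missing_file_error(exc):
    """Return True if exc is an OS-level error meaning 'no such file'."""
    return (isinstance(exc, EnvironmentError)
            and exc.errno in (errno.ENOENT, errno.ENOTDIR))


def read_if_exists(path):
    """Return the file's bytes, or None if the file does not exist."""
    try:
        with open(path, "rb") as f:
            return f.read()
    except EnvironmentError as exc:
        # Callers never inspect errno themselves; the check lives here.
        if is_missing_file_error(exc):
            return None
        raise
```

An alternative implementation then only has to get these few helpers right, rather than every call site.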
> Using ref-counted RAII is convenient. To change everything to
> finally-close would be too much pain for no actual benefit.
When Python 2.5 becomes common enough we might drop support for Python 2.4,
which would let us use the “with” statement, which would be even better than
relying on ref-counting or try-finally. It would certainly make some of our
lock/do-work/unlock code better.
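To make the lock/do-work/unlock point concrete, here is a minimal sketch. FakeLockable is a made-up stand-in for any object with lock_write() and unlock() methods, and write_locked is a hypothetical helper, not something bzrlib provides. (On Python 2.5 itself the "with" statement also needs "from __future__ import with_statement".)

```python
from contextlib import contextmanager


class FakeLockable(object):
    """Made-up stand-in for an object with lock_write()/unlock() methods."""

    def __init__(self):
        self.locked = False
        self.log = []

    def lock_write(self):
        self.locked = True
        self.log.append("lock")

    def unlock(self):
        self.locked = False
        self.log.append("unlock")


@contextmanager
def write_locked(obj):
    """Hypothetical helper: hold a write lock for the duration of a block."""
    obj.lock_write()
    try:
        yield obj
    finally:
        # Runs on normal exit and on exceptions; no ref-counting involved.
        obj.unlock()


def do_work(obj):
    with write_locked(obj):
        obj.log.append("work")
```

The unlock happens deterministically when the block exits, which is the property the ref-counted style depends on CPython to provide.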
> Required modules - I could get away with moving subprocess and bz2 to
> lazy import, but too much of the test suite mechanics depended on
> zlib. There's an emulation binary created by Jeff Hardy that resolved
> that, though it has various problems (noted in the patch near the
> workarounds).
> <http://jdhardy.blogspot.com/2008/12/solving-zlib-problem-ironpythonzlib.html>
> <http://cid-414fa1a9bd174b4b.skydrive.live.com/self.aspx/Public/IronPython.Zlib.zip>
Bazaar also relies on zlib for some disk formats, so having a working (and
fast) zlib implementation is pretty important.
zlib and, I think, bz2 are both used in parts of the network protocol, too.
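For anyone unfamiliar with the idea of moving a module to a lazy import, the simplest form is to defer the import to the point of first use. This is only a sketch of that general idea, not bzrlib's own lazy-import mechanism, and the function names are invented for the example:

```python
def compress_bz2(data):
    """Compress bytes with bz2, importing the module only when first called."""
    # Deferred import: merely importing the enclosing module does not require
    # bz2 to be available on the running Python implementation.
    import bz2
    return bz2.compress(data)


def decompress_bz2(data):
    """Inverse of compress_bz2, with the same deferred import."""
    import bz2
    return bz2.decompress(data)
```

Code paths that never call these functions never pay for, or fail on, the import. That works for optional modules, but not for something like zlib that the disk formats depend on.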
> The combination of bazaar trying to pretend that everything is a UTF-8
> byte string and IronPython trying to pretend the str type is unicode
> is a recipe for trouble. I think bazaar should give here, and sort out
> a better set of idioms for working with both binary data and
> non-english text - it'll be needed for Python 3 anyway.
> <http://www.infoq.com/news/2007/06/IronPython-STR>
> <http://www.smallshire.org.uk/sufficientlysmall/2009/06/18/string-compatibility-between-python-implementations/>
I disagree. Python 3, like Python 2, has a type designed to hold 8-bit byte
strings, and mostly Bazaar prefers to work in byte strings rather than
needlessly decoding then re-encoding them. Obviously user input is
typically text, not bytes, but much of the data Bazaar works with is bytes
from disk or the network. And while some of our data like revision IDs are
defined as being serialised as UTF-8, we almost never display them so it's
much more efficient to handle them as bytestrings (less memory consumption,
and no computation wasted on decoding/encoding). So I'd expect that Bazaar
implemented on Python 3 would make heavy use of the “bytes” type, but Bazaar
is implemented on Python 2, so that means “str”.
IronPython is broken here, IMO. Python 2 (and 1!) clearly defines “str” as
8-bit bytestrings, and always has. By choosing to implement them
differently IronPython has chosen to be arbitrarily incompatible. So it's
implementing a language that is rather like Python, but very definitely not
Python. Last time I chatted to an IronPython developer (over a year ago,
admittedly) I got the impression that they realised this was a mistake and
were considering how to fix it. Perhaps they're just waiting for everyone
to move to Python 3?
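A small sketch of the byte-string handling described above, written with the Python 3 style "bytes" name (on Python 2.6 and later "bytes" is simply an alias for "str"). The revision id and the function name are made up for illustration:

```python
def format_revision_id(revision_id):
    """Decode a UTF-8 revision id, only at the point it is shown to a user."""
    if not isinstance(revision_id, bytes):
        raise TypeError("revision ids are handled as byte strings")
    return revision_id.decode("utf-8")


# Hypothetical revision id, kept as bytes exactly as read from disk or network.
rev_id = b"someone@example.com-20090629012311-abc123"

# Storing, comparing and looking up never needs a decode/encode round trip.
index = {rev_id: b"serialised parent data"}
found = rev_id in index

# Text only appears at the user-interface boundary.
label = format_revision_id(rev_id)
```

An implementation where the string type is always unicode has to insert conversions somewhere in the middle of this, which is where the trouble starts.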
> IronPython tries in some places to be helpful by stubbing in various
> methods and raising NotImplementedError-s from them - but didn't get
> the signatures right, so you get some completely different cryptic
> exception instead.
D'oh :)
> This line at the bottom of bzrlib.builtins:
> from bzrlib.foreign import cmd_dpush
> pulls in a bunch of extra imports, and makes a difference of about a
> tenth of a second and a megabyte of disk read to `bzr rocks` on my
> machine. Or twenty four seconds for IronPython 2.0.0...
Ouch. Not sure why it would be so slow; bzrlib.foreign is a fairly slim
module, and doesn't import much that wouldn't already be imported, except
perhaps bzrlib.branch.
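One rough way to check what a single import actually drags in is to time it and compare sys.modules before and after. This is a generic sketch using modern Python (time.perf_counter and importlib are not available on 2.4), not a tool that ships with bzr, and time_import is a name invented here:

```python
import importlib
import sys
import time


def time_import(module_name):
    """Import module_name; return (seconds taken, newly loaded module names)."""
    before = set(sys.modules)
    start = time.perf_counter()
    importlib.import_module(module_name)
    elapsed = time.perf_counter() - start
    # Anything in sys.modules now that was absent before was pulled in
    # directly or indirectly by this import.
    return elapsed, sorted(set(sys.modules) - before)
```

Calling it in a fresh interpreter on the module in question would show which extra modules account for the time; a second call for the same module reports nothing new, since the module is already cached.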
> Likewise, it seems to make inspect_for_copy.py pointless. Is there any
> ongoing auditing of perf hacks like the lazy imports and regexps and
> this to ensure they remain net wins?
We don't specifically measure those all the time, but we regularly do
benchmark and profile our performance. I think inspect_for_copy has already
been showing signs of age against newer Pythons than 2.4, but we still want
to run (and run well) on 2.4.
> Overall, it was quite impressive how little needed altering, though
> the path to discovering those small changes was often winding, even
> when the root cause was the same as a previously handled issue.
Yeah, it's very interesting to see. Thanks!
-Andrew.