Bazaar on IronPython

Robert Collins robert.collins at canonical.com
Tue Jun 30 01:53:57 BST 2009


On Mon, 2009-06-29 at 22:59 +0100, Martin (gzlist) wrote:

...
> Bazaar expects to be able to str-format together any of: paths from
> the filesystem, messages from the OS, metadata from bazaar, the
> contents of files, and user input, then write it out to the terminal.
> There are a number of places that make the effort to do the right kind
> of conversions, but lots more that don't, so I frequently get junk
> output of one kind or another. The root cause is the same problem
> IronPython has - using a single type that treats random binary data
> and text interchangeably.

We use two types - byte strings and unicode. It's a defect in Python < 3
that when a bug occurs and we do an operation on objects of both types,
it succeeds :).
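
To illustrate the point, here is a minimal sketch (not bzrlib code; the
variable names are made up for the example) of how Python 3 behaves:
mixing the two types fails loudly, so the bug surfaces at the point
where the types meet, and the fix is an explicit decode at the boundary.

```python
# Hypothetical example values: a text label and UTF-8 bytes from disk.
label = "path: "
raw_name = b"caf\xc3\xa9"

# On Python 3, concatenating str and bytes raises TypeError instead of
# silently producing a mixed-up result.
try:
    label + raw_name
    mixed_ok = True
except TypeError:
    mixed_ok = False

# The caller has to say what the bytes are before using them as text.
explicit = label + raw_name.decode("utf-8")
```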


> This doesn't necessarily mean bazaar has to start using the unicode
> type everywhere, but it does need a clear differentiation between
> internal bytes and any other text from the environment. Relying on all
> inputs being UTF-8 already, and the terminal being UTF-8, or
> everything being ascii, does not work where you have a UTF-16
> filesystem, a CP1252 user environment, and a CP850 terminal.

We don't rely on this, and yes - you're completely right that relying on
that would suck :).
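
A small sketch of the scenario described in the quoted text, assuming
Python 3 and made-up sample data (none of these names come from
bzrlib): the same name arrives as different bytes depending on where
it came from, so each source is decoded with its own encoding, and
output is encoded for the terminal with a fallback for characters the
terminal cannot show.

```python
name = "caf\u00e9"

# The same text as it might arrive from two different sources.
fs_bytes = name.encode("utf-16-le")   # e.g. a UTF-16 filesystem API
user_bytes = name.encode("cp1252")    # e.g. a CP1252 user environment

# Decoding each with the encoding of its own source recovers the text.
from_fs = fs_bytes.decode("utf-16-le")
from_user = user_bytes.decode("cp1252")

# Decoding with the wrong codec does not fail here; it yields junk.
junk = user_bytes.decode("cp850")

# For a CP850 terminal, encode explicitly and replace what cannot be
# represented rather than raising or writing arbitrary bytes.
term_ok = name.encode("cp850", errors="replace")
term_fallback = "\u20ac".encode("cp850", errors="replace")
```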

> This is all resolvable, but will mean some changes to abstractions.
> I'm particularly averse to interfaces like
> bzrlib.xml8._encode_and_escape as commented in the patch - the caller
> of an api *has* to know the provenance of a string it supplies,
> after-the-fact heuristics are at best inefficient.

That's a deep internal function. We know the provenance of XML content in
our repositories.

I would say our biggest unicode vs ASCII tensions are:
 - outputting diffs (because garbage in, garbage out, and that messes up
   the terminal).
 - handling nonsense filesystem paths (e.g. a file '0x01').
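
For the path case, one possible approach is to escape anything
undecodable or unprintable before it reaches the terminal. This is
only a sketch assuming Python 3.5 or later; display_path is a
hypothetical helper, not an existing bzrlib function.

```python
def display_path(raw, encoding="utf-8"):
    """Return a printable rendering of a filesystem path given as bytes.

    Undecodable bytes become \\xNN escapes, and control or other
    non-printable characters are escaped too, so the result is safe to
    write to a terminal.
    """
    # backslashreplace keeps undecodable bytes visible instead of raising.
    text = raw.decode(encoding, errors="backslashreplace")
    parts = []
    for ch in text:
        if ch.isprintable():
            parts.append(ch)
        else:
            # unicode_escape renders e.g. U+0001 as the four characters \x01
            parts.append(ch.encode("unicode_escape").decode("ascii"))
    return "".join(parts)
```

The escaped form is for display only; the original bytes should be
kept for any actual filesystem operation.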

-Rob