[merge] Hacking updates
Martin Pool
mbp at canonical.com
Mon Jul 31 04:27:10 BST 2006
On 27 Jul 2006, John Arbash Meinel <john at arbash-meinel.com> wrote:
> >> +When a ``Command`` object is created, it is given a member variable
> >> +accessible by ``self.outf``. This is a file-like object, which is bound to
> >> +``sys.stdout``, and should be used to write information to the screen,
> >> +rather than directly writing to ``sys.stdout`` or calling ``print``.
> >> +This file has the ability to translate Unicode objects into the correct
> >> +representation, based on the console encoding. Also, the parameter
> >> +``self.encoding_type`` will affect how unprintable characters will be
> >> +handled. This parameter can take one of 3 values:
> >> +
> >> + replace
> >> + Unprintable characters will be represented with a simple '?', and no
> >> + exception will be raised. This is for any command which generates text
> >> + for the user to review, rather than for automated processing.
> >> + For example: ``bzr log`` should not fail if one of the entries has text
> >> + that cannot be displayed.
> >
> > Is it always '?' - I thought it might be a different character in utf-8?
> >
>
> Well, there are no Unicode characters that utf-8 can't represent. So it
> is never an escaped char. Now, your terminal might do all sorts of weird
> things depending on how it handles unicode characters.
>
> However, I suppose in certain code pages things could be represented
> differently. Reading the docs here:
> http://www.python.org/doc/current/lib/module-codecs.html
>
> It says:
> 'replace' (replace malformed data with a suitable replacement marker,
> such as "?")
>
> I can just use something like that.
You're quite right: any Unicode string can be represented as UTF-8 by
definition. I was thinking of input from UTF-8, where malformed bytes can
be replaced by \ufffd, REPLACEMENT CHARACTER, as an example of where
'replace' gives something other than '?'. That doesn't apply to output,
though.
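
A minimal sketch of that difference, assuming Python 3 string semantics
and the standard 'replace' error handler from the codecs docs linked
above. The variable names are only illustrative and are not taken from
bzrlib:

```python
# Illustrative only: shows how the 'replace' error handler behaves
# differently on output (encode) and on input (decode).
text = "caf\u00e9"  # 'e' with an acute accent, not representable in ASCII

# Output to a limited code page: the unencodable character becomes '?'.
ascii_bytes = text.encode("ascii", "replace")

# Output to UTF-8: the character is representable, so nothing is replaced.
utf8_bytes = text.encode("utf-8", "replace")

# Input from UTF-8: a malformed byte becomes U+FFFD, REPLACEMENT CHARACTER.
decoded = b"caf\xe9".decode("utf-8", "replace")

print(ascii_bytes)    # b'caf?'
print(utf8_bytes)     # b'caf\xc3\xa9'
print(repr(decoded))  # 'caf\ufffd'
```

So '?' only shows up when encoding to something narrower than UTF-8, and
\ufffd only shows up when decoding.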
--
Martin