[merge] Hacking updates

John Arbash Meinel john at arbash-meinel.com
Thu Jul 27 15:28:04 BST 2006


Martin Pool wrote:
> On 25 Jul 2006, John Arbash Meinel <john at arbash-meinel.com> wrote:
>> Robert Collins requested an update to HACKING, which described how to
>> use the new Command.outf file.
>>
>> The attached diff cleans up HACKING a bit, so that it can be converted
>> with 'rst2html' without any errors. And then adds a few paragraphs
>> discussing the new encoding stuff.
> 
> +1, thanks very much.
> 
>>  (The current version of this document is available in the file ``HACKING``
>> -in the source tree, or at http://bazaar-ng.org/hacking.html)
>> +in the source tree, or at http://bazaar-vcs.org/hacking.html)
> 
> Unfortunately that URL doesn't work at the moment.
> 
> Since the web site is coming along so nicely, I think we should try to
> get more of the rst documentation regularly available on the web.  Since
> the wiki really "wants" to own the whole vhost namespace, I think we
> should make a different virtual host, and have say
> 
>     http://doc.bazaar-vcs.org/0.9/hacking/
>     http://doc.bazaar-vcs.org/0.9/tutorial/
> 
>     http://doc.bazaar-vcs.org/current/hacking/
> 
> etc (vaguely similar to python.org)
> 
> So you can either change that now to
> 
>     http://doc.bazaar-vcs.org/current/hacking/
> 
> if you think it makes sense, or we can leave it and fix it later.

I like it a lot, done.

> 
>> +Unicode and Encoding Support
>> +============================
>> +
>> +This section discusses various techniques that Bazaar uses to handle
>> +characters that are outside the ASCII set.
> 
> very good
> 
>> +
>> +``Command.outf``
>> +----------------
>> +
>> +When a ``Command`` object is created, it is given a member variable
>> +accessible by ``self.outf``.  This is a file-like object, which is bound to
>> +``sys.stdout``, and should be used to write information to the screen,
>> +rather than directly writing to ``sys.stdout`` or calling ``print``.
>> +This file has the ability to translate Unicode objects into the correct
>> +representation, based on the console encoding.  Also, the parameter
>> +``self.encoding_type`` will affect how unprintable characters will be
>> +handled.  This parameter can take one of 3 values:
>> +
>> +  replace
>> +    Unprintable characters will be represented with a simple '?', and no
>> +    exception will be raised. This is for any command which generates text
>> +    for the user to review, rather than for automated processing.
>> +    For example: ``bzr log`` should not fail if one of the entries has text
>> +    that cannot be displayed.
> 
> Is it always '?' - I thought it might be a different character in utf-8?
> 

Well, there are no Unicode characters that UTF-8 can't represent, so on
a UTF-8 console nothing ever gets replaced. Your terminal might still do
all sorts of weird things, depending on how it renders Unicode characters.
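
As a rough illustration of the point above (a minimal sketch, written for
Python 3 rather than the Python of 2006, with a made-up sample string):
encoding to UTF-8 with 'replace' changes nothing, while encoding the same
text to ASCII substitutes one '?' per unencodable character.

```python
# Sample text with characters outside ASCII (illustrative only).
text = "caf\u00e9 \u65e5\u672c\u8a9e"

# UTF-8 can encode every character here, so 'replace' never triggers
# and the text round-trips unchanged.
utf8 = text.encode("utf-8", "replace")
print(utf8.decode("utf-8") == text)

# ASCII cannot encode the accented or CJK characters, so each one
# becomes a single '?'.
ascii_bytes = text.encode("ascii", "replace")
print(ascii_bytes)
```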

However, I suppose in certain code pages things could be represented
differently. Reading the docs here:
http://www.python.org/doc/current/lib/module-codecs.html

It says:
  'replace' (replace malformed data with a suitable replacement marker,
  such as "?")

I can just use wording like that in HACKING.
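
For what it's worth, the "suitable replacement marker" depends on the
direction. A small sketch, assuming Python 3's built-in codecs: encoding
substitutes '?', while decoding malformed bytes substitutes U+FFFD
(REPLACEMENT CHARACTER). The byte strings are invented for the example.

```python
# 0xff is never valid in UTF-8, so it counts as malformed data.
raw = b"caf\xff"

# Decoding with 'replace' inserts U+FFFD in place of the bad byte.
decoded = raw.decode("utf-8", "replace")
print(repr(decoded))

# Encoding with 'replace' inserts '?' for a character the target
# encoding cannot represent.
encoded = "caf\u00e9".encode("ascii", "replace")
print(encoded)
```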

I'll fix up self.encoding_type as per what Robert requested, and then
submit.
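
To make the idea concrete, here is a hedged sketch of a stream that
behaves the way HACKING describes self.outf. It is not the bzrlib code:
make_outf is a hypothetical helper, the BytesIO object stands in for the
console, and the error policy is passed straight through as a Python
codec error handler name.

```python
import codecs
import io


def make_outf(raw_stream, encoding, errors="replace"):
    """Hypothetical helper: wrap a byte stream so that text written to it
    is encoded with the given codec and error handler."""
    return codecs.getwriter(encoding)(raw_stream, errors=errors)


# Stand-in for a console that only understands latin-1.
console = io.BytesIO()
outf = make_outf(console, "latin-1")

# U+00E9 exists in latin-1; U+2603 does not and is replaced by '?'.
outf.write("r\u00e9sum\u00e9 \u2603\n")
print(console.getvalue())
```

With errors="strict" the same write would raise UnicodeEncodeError
instead, which is the failure 'replace' is meant to avoid for commands
like ``bzr log``.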

John
=:->
