[patch] encodings again

John A Meinel john at arbash-meinel.com
Thu Dec 22 23:05:20 GMT 2005


Alexander Belchenko wrote:
> In the last month bzr.dev was improved to support non-ascii filenames and
> messages, but there are still some bugs. The attached patches fix bugs in
> showing the log [1] (the log is now encoded based on the encoding of
> sys.stdout) and in commit with an external editor [2] (infotext should be
> encoded from unicode to flat strings). The author of the last patch is
> Ivan Vilata i Balaguer (see bug report page
> https://launchpad.net/products/bzr/+bug/5041). I have tested this patch
> and it seems to work correctly.
> 
> I hope someone (John?) includes these into his own integration branch.
> 
> [1] attached: log_encoding.diff
> [2] attached: msgeditor_encoding.diff
> 
> Alexander
> 

The changes look good. I'll give them a +1. To be confident this does the
right thing, though, we could really use some test cases.
I don't know if we could set EDITOR to something like 'cat', just to
have the actual encoding code path activated. I realize it is a little
bit hard to test, but we should be able to find a way to make sure this
isn't broken, rather than just expecting Alexander to run it and find
out where it breaks.
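
To make the EDITOR idea concrete, here is a rough sketch of the round trip
such a test would exercise. It is not bzrlib code: edit_commit_message and
IGNORE_LINE are hypothetical names, and it is written against the modern
Python 3 standard library rather than the Python of this patch. It only
assumes that the editor command accepts a file name as its last argument.

```python
import os
import shlex
import subprocess
import tempfile

# Placeholder marker; the real separator text in msgeditor.py may differ.
IGNORE_LINE = "---- this line and the following will be ignored ----"


def edit_commit_message(infotext, encoding, editor=None):
    """Hypothetical stand-in for the editor round trip (not bzrlib's API).

    Returns (message, info): the text before and after IGNORE_LINE.
    """
    editor = editor or os.environ.get("EDITOR", "cat")
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        # Encode the unicode template to bytes before writing, as the patch does.
        with os.fdopen(fd, "wb") as msgfile:
            template = "\n\n%s\n\n%s" % (IGNORE_LINE, infotext)
            msgfile.write(template.encode(encoding))
        # Run the "editor"; 'cat' only prints the file and leaves it unchanged.
        subprocess.call(shlex.split(editor) + [path],
                        stdout=subprocess.DEVNULL)
        # Decode the file again with the same encoding.
        with open(path, "r", encoding=encoding) as msgfile:
            text = msgfile.read()
    finally:
        os.unlink(path)
    message, _, info = text.partition(IGNORE_LINE)
    return message.strip(), info.strip()
```

With an editor that changes nothing, the returned message should be empty
and the info text should come back exactly as it went in, whatever the
encoding. A test could assert that for a few non-ascii strings.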

I know Robert did some i18n work, running some of the tests under a
special encoding. Can those tests be expanded somehow?

I'm just wondering if someone who works in a different encoding could
write up a test which sets the local encoding, then does some commits
using encoded characters, and then extracts them (possibly after
changing encoding again).
It would be nice to get some real tests in, rather than the back and
forth that we have right now.
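
Something along these lines is what I have in mind, as a sketch only. The
helpers store and show are made up for illustration and stand in for
"commit" and "log"; the only real calls are the standard codecs ones, and
the code assumes Python 3 rather than the Python of this patch.

```python
import codecs
import io


def store(message_bytes, input_encoding):
    """Decode bytes typed in the user's encoding into unicode for storage."""
    return message_bytes.decode(input_encoding)


def show(message, output_encoding):
    """Encode a stored unicode message for a terminal in output_encoding.

    errors='replace' mirrors the log patch: characters the terminal cannot
    represent become '?' instead of raising an exception.
    """
    buf = io.BytesIO()
    writer = codecs.getwriter(output_encoding)(buf, errors="replace")
    writer.write(message)
    return buf.getvalue()


# Commit under one encoding, then read the message back under others.
text = "Привет"
stored = store(text.encode("koi8-r"), "koi8-r")
same_text = show(stored, "cp1251").decode("cp1251")
degraded = show(stored, "ascii")
```

The interesting assertions are that same_text equals the original text even
though the encoding changed in between, and that the ascii case degrades
instead of aborting.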

John
=:->

PS> In case you missed the discussion, the current rule for the main
branches (Robert's and my integration branches, and probably bzr.dev) is
that we need 2 people to review a change and give it +1. If someone
else speaks up, I'll merge it into my integration branch.

> 
> ------------------------------------------------------------------------
> 
> === modified file 'bzr' (properties changed)
> === modified file 'bzrlib/builtins.py'
> --- bzrlib/builtins.py	
> +++ bzrlib/builtins.py	
> @@ -954,11 +954,15 @@
>          if rev1 > rev2:
>              (rev2, rev1) = (rev1, rev2)
>  
> -        mutter('encoding log as %r', bzrlib.user_encoding)
> +        if hasattr(sys.stdout, "encoding"):
> +            output_encoding = sys.stdout.encoding or bzrlib.user_encoding
> +        else:
> +            output_encoding = bzrlib.user_encoding
> +        mutter('encoding log as %r', output_encoding)
>  
>          # use 'replace' so that we don't abort if trying to write out
>          # in e.g. the default C locale.
> -        outf = codecs.getwriter(bzrlib.user_encoding)(sys.stdout, errors='replace')
> +        outf = codecs.getwriter(output_encoding)(sys.stdout, errors='replace')
>  
>          log_format = get_log_format(long=long, short=short, line=line)
>          lf = log_formatter(log_format,
> 
> 
> 
> 
> ------------------------------------------------------------------------
> 
> === modified file 'bzrlib/msgeditor.py'
> --- bzrlib/msgeditor.py	
> +++ bzrlib/msgeditor.py	
> @@ -19,9 +19,11 @@
>  
>  """Commit message editor support."""
>  
> +import codecs
>  import os
>  from subprocess import call
>  
> +import bzrlib
>  import bzrlib.config as config
>  from bzrlib.errors import BzrError
>  
> @@ -84,7 +86,8 @@
>          if infotext is not None and infotext != "":
>              hasinfo = True
>              msgfile = file(msgfilename, "w")
> -            msgfile.write("\n\n%s\n\n%s" % (ignoreline, infotext))
> +            msgfile.write("\n\n%s\n\n%s" % (ignoreline,
> +                                        infotext.encode(bzrlib.user_encoding)))
>              msgfile.close()
>          else:
>              hasinfo = False
> @@ -95,7 +98,7 @@
>          started = False
>          msg = []
>          lastline, nlines = 0, 0
> -        for line in file(msgfilename, "r"):
> +        for line in codecs.open(msgfilename, 'r', bzrlib.user_encoding):
>              stripped_line = line.strip()
>              # strip empty line before the log message starts
>              if not started:
> 
> 
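
For reference, the fallback in the log patch can be pulled out into a small
helper that is easy to test on its own. This is only a sketch:
pick_output_encoding is a hypothetical name, not something in bzrlib, and
the fallback argument stands in for bzrlib.user_encoding.

```python
def pick_output_encoding(stream, fallback):
    """Return the stream's encoding if it has a usable one, else fallback.

    Covers both cases the patch handles: streams with no 'encoding'
    attribute at all, and streams where it is None (e.g. output to a pipe).
    """
    return getattr(stream, "encoding", None) or fallback
```

A test can then pass in fake streams instead of depending on whatever
sys.stdout happens to be when the test suite runs.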

