[MERGE][Take two] Show the diff in the commit messages
Goffredo Baroncelli
kreijack at tiscalinet.it
Mon Jul 16 19:39:10 BST 2007
On the basis of the emails from Martin and Aaron, I changed the encoding logic
in the function make_commit_message_template():
- in the diff, if the line is a header, it is decoded as UTF8 (because the
header was encoded as UTF8 beforehand)
- otherwise the line is decoded as bzrlib.user_encoding
The rationale is that the message will be encoded as bzrlib.user_encoding when
it is written.
So the header, whose encoding is known, is decoded correctly. For the other
data, the decoding is the same as the one used when the file is written,
in order to minimize the encoding/decoding effect.
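
To make the rule above concrete, here is a minimal sketch of the per-line
decoding. It is not the code from the attached bundle: the names
HEADER_PREFIXES, decode_diff_line and decode_diff are hypothetical, the set of
prefixes that mark a header line is an assumption, and errors="replace" is
borrowed from Martin's suggestion quoted below rather than taken from the patch.

```python
# Assumption: a diff header line starts with one of these prefixes.
# A removed body line whose text begins with "-- " would also match "--- ",
# so this test is only an approximation.
HEADER_PREFIXES = (b"=== ", b"--- ", b"+++ ")


def decode_diff_line(line, user_encoding):
    """Decode one raw diff line (bytes) to unicode text."""
    if line.startswith(HEADER_PREFIXES):
        # Headers carry paths that were encoded as UTF-8 when the diff
        # was generated, so their encoding is known.
        return line.decode("utf-8", "replace")
    # Body lines have an unknown encoding; decode them with the same
    # encoding that will be used to write the template back out.
    return line.decode(user_encoding, "replace")


def decode_diff(diff_bytes, user_encoding):
    """Decode a whole diff, keeping the original line endings."""
    lines = diff_bytes.splitlines(True)
    return "".join(decode_diff_line(l, user_encoding) for l in lines)
```

Because body lines are decoded and later re-encoded with the same encoding,
they should round-trip unchanged whenever that encoding can represent every
byte (latin-1, for example).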
In order to move the user encoding to the UI level, the parameter
user_encoding is added to the following functions:
- make_commit_message_template
- _create_temp_file_with_commit_template
- edit_commit_message
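
The idea of passing the encoding down from the UI can be sketched as follows.
The functions build_template, write_template_file and ui_entry_point are
hypothetical stand-ins, not the real bzrlib signatures, and the separator text
is invented for the example; only the pattern (the caller decides the encoding,
the helpers receive it as a parameter) reflects the change described above.

```python
import locale
import os
import tempfile


def build_template(status_text, user_encoding):
    """Hypothetical helper: return the template as bytes in user_encoding."""
    text = "\n--- ignored below ---\n" + status_text
    return text.encode(user_encoding, "replace")


def write_template_file(status_text, user_encoding):
    """Hypothetical helper: write the template to a temp file, return its path."""
    fd, path = tempfile.mkstemp(prefix="commit_msg_", suffix=".txt")
    with os.fdopen(fd, "wb") as f:
        f.write(build_template(status_text, user_encoding))
    return path


def ui_entry_point(status_text):
    """Only the UI layer looks up the user's encoding."""
    user_encoding = locale.getpreferredencoding(False) or "ascii"
    return write_template_file(status_text, user_encoding)
```

With this layout the lower-level helpers never consult a global setting, so
they can be tested with any encoding.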
Finally, I also added a paragraph to the tutorial about the option, and added
another while I was there (thanks to James Westby).
I hope that my patch is ready for inclusion in the mainline.
In any case, I think that there is no _one_ "right" encoding for the diff.
Because the paths are generated from the info stored in the repository
(unicode), we have to encode/decode the header of the diff as appropriate (due
to the paths), and leave the data of the body untouched (due to the fact that
we don't know its encoding).
Moreover, we have the same problem for the "annotate" functions: the usernames
are unicode.
Goffredo
On Monday 16 July 2007, you (Martin Pool) wrote:
> On 7/14/07, Aaron Bentley <aaron.bentley at utoronto.ca> wrote:
> > -----BEGIN PGP SIGNED MESSAGE-----
> > Hash: SHA1
> >
> > Goffredo Baroncelli wrote:
> > >> Causes the diff to be interpreted as utf-8, instead of leaving it as
> > >> binary.
> > >
> > > Yes, because the paths in the diff header are utf8 encoded!
> >
> > Perhaps we should make the diff header encoding selectable.
>
> This actually came up just recently when Jonathan posted a patch to
> fix the display of filenames in the diff. If I understand correctly,
> what we really want is:
>
[...]
> 2- Interpret the diff as being in the user's encoding and read it into
> the commit message template with errors=replace. That has the benefit
> that they should actually be able to read all of it without complaints
> from their editor, and since the diff is just for display it doesn't
> matter so much if some data is lost. The main problem here is that
> people with non-utf8 locales and non-ascii filenames will get them
> mangled until we fix diff to use the user's encoding. But we should
> do that anyhow.
>
> So I think I like #2 best.
>
> In fact, just reading it as ascii, errors=replace would be pretty useful.
>
> --
> Martin
>
--
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo)
<kreijack_AT_inwind.it>
Key fingerprint = CE3C 7E01 6782 30A3 5B87 87C0 BB86 505C 6B2A CFF9
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bzr-ci-diff.bundle.diff
Type: text/x-diff
Size: 36302 bytes
Desc: not available
Url : https://lists.ubuntu.com/archives/bazaar/attachments/20070716/1c1c2b8c/attachment-0001.bin