Both hgwebdir.cgi and webserve-dir.cgi are mbcs broken

Matt Mackall mpm at selenic.com
Thu Feb 1 06:10:33 GMT 2007


On Thu, Feb 01, 2007 at 11:51:17AM +0800, Dongsheng Song wrote:
> For mercurial:
> When I use mbcs file name and logs, the hgwebdir.cgi DO NOT display
> use same char encoding, see:
>    http://www.foresee.com.cn:9999/hg/scratch

The problem is not particular to multi-byte character sets. Mercurial
makes no attempt to transcode anything beyond our internal metadata
(author, changeset message, etc.) as there's generally no guarantee that:

a) the committer's filenames or file contents match the committer's
   declared system encoding
b) the committer's filenames or file contents can be represented on a
   given destination system
c) transcoding filenames or file contents won't break build systems

So we just show the raw bytes and report your system locale's charset
as the HTML page encoding. Your system is claiming UTF-8, but your
filenames are not UTF-8.
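
To illustrate the mismatch, here is a minimal Python sketch. The
filename and the choice of GBK as the legacy encoding are assumptions
made up for this example, not taken from your repository. It only shows
that bytes written under one encoding need not be valid in the encoding
the page claims:

```python
# Hypothetical filename; GBK is an assumed legacy encoding for illustration.
name = "文件.txt"
raw = name.encode("gbk")  # the bytes as they would sit on disk


def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if data is a valid byte sequence in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# The same bytes are valid GBK but not valid UTF-8, so a page labelled
# UTF-8 cannot render them correctly.
print(decodes_as(raw, "utf-8"))
print(decodes_as(raw, "gbk"))
```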

If your project has a native encoding, you can set your hgweb up to
display it:

# If you'd like to serve pages with UTF-8 instead of your default
# locale charset, you can do so by uncommenting the following lines.
# Note that this will cause your .hgrc files to be interpreted in
# UTF-8 and all your repo files to be displayed using UTF-8.
#
#import os
#os.environ["HGENCODING"] = "UTF-8"

Just put your preferred encoding in place of UTF-8.
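
As a rough sketch of what that looks like once uncommented: the helper
name set_hg_encoding below is hypothetical (it is not part of Mercurial),
and the codec check is an extra I've added so that a misspelled charset
name fails early. As far as I know the variable should be set before any
mercurial modules are imported, which is where the sample script puts it.

```python
import codecs
import os


def set_hg_encoding(name: str) -> None:
    """Hypothetical helper: validate a charset name, then export HGENCODING."""
    # codecs.lookup raises LookupError if Python does not know the charset.
    codecs.lookup(name)
    os.environ["HGENCODING"] = name


# Replace "UTF-8" with your project's native encoding.
set_hg_encoding("UTF-8")

# The mercurial imports of the CGI script would follow from here.
```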

-- 
Mathematics is the supreme nostalgia of our time.


