[Bug 839609] Re: [11.10 beta1] UnicodeDecodeError crash on localized input in multiple encodings/languages
Zygmunt Krynicki
zygmunt.krynicki at canonical.com
Wed Sep 14 17:07:39 UTC 2011
This bug is actually caused by invalid handling of input (sys.argv), not
output. When a byte string (in UTF-8) is combined with unicode strings
(that are part of translated system messages), a UnicodeDecodeError is
raised because, by default, Python 2 coerces the byte string to unicode
assuming ASCII encoding.
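
To illustrate the failure, here is a minimal sketch. It assumes the
implicit coercion in Python 2 behaves like an explicit decode with the
default codec (ASCII). The helper name coerce_like_python2 is made up
for illustration, and the bytes are the UTF-8 encoding of U+6211, which
I assume is the character typed in the report.

```python
# UTF-8 bytes as they would arrive in sys.argv under Python 2 (assumed input).
raw = b"\xe6\x88\x91"


def coerce_like_python2(data):
    """Mimic the implicit str -> unicode coercion: decode as ASCII."""
    return data.decode("ascii")


try:
    coerce_like_python2(raw)
    failed = False
except UnicodeDecodeError:
    # Bytes above 0x7f are not valid ASCII, so the coercion fails.
    failed = True

# Decoding with the encoding the terminal actually used succeeds.
text = raw.decode("utf-8")
```

Written this way, the snippet should behave the same on Python 2 and
Python 3, since it never relies on the implicit coercion itself.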
A possible fix is to properly decode sys.argv arguments. I've tried this
by hard-coding UTF-8 input but it would be nice to fix this in general
too.
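
A general version might look roughly like the sketch below. This is not
the actual patch: decode_args is a hypothetical helper, and it assumes
the terminal input uses the locale's preferred encoding, falling back
to UTF-8 when none is reported.

```python
import locale
import sys


def decode_args(argv, encoding=None):
    """Return a copy of argv with every byte string decoded to text."""
    if encoding is None:
        # Assumption: arguments are encoded in the user's locale encoding.
        encoding = locale.getpreferredencoding() or "UTF-8"
    decoded = []
    for arg in argv:
        if isinstance(arg, bytes):
            # "replace" substitutes U+FFFD for malformed bytes instead of
            # raising, so bad input cannot crash the handler.
            arg = arg.decode(encoding, "replace")
        decoded.append(arg)
    return decoded
```

The script could then call args = decode_args(sys.argv) once at startup
and use only args afterwards, so that every later string operation
mixes unicode with unicode.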
** Changed in: command-not-found
Status: In Progress => Triaged
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to command-not-found in Ubuntu.
https://bugs.launchpad.net/bugs/839609
Title:
[11.10 beta1] UnicodeDecodeError crash on localized input in multiple
encodings/languages
Status in Automatic command lookup from available packages:
Triaged
Status in “command-not-found” package in Ubuntu:
Confirmed
Bug description:
The command-not-found package crashes on input of a Simplified Chinese
character representing a bogus command. The problem was found in
11.10 beta1, on both x86/i386 and amd64 systems. Debugging the
Python script in /usr/lib/command-not-found shows that a
UnicodeDecodeError is raised. The crash_guard() callback framework
catches this and reports the error.
Here are further observations.
(1) With the same Simplified Chinese input, 11.04 handles the test case gracefully, returning a message
explaining that the command is not found.
(2) Between these two series, Python has changed: 11.04 (Python 2.7.1+)
versus 11.10 beta1 (Python 2.7.2+).
To elaborate on this problem, the following files have been included:
(1) Screenshots showing step by step how to reproduce the bug. As switching to Simplified Chinese is
difficult to explain in words, a video was taken to show this process.
(2) A screenshot showing the /usr/lib/command-not-found script traced by means of the Python pdb module.
This shows the zh_CN.UTF-8 byte stream input and the point where UnicodeDecodeError is raised.
This issue was investigated on an 11.10 beta1 host running in VirtualBox.
===
Taken from To_Reproduce_Bug.txt attachment.
01_After_ISO_Installation.png   The VirtualBox VM with default English locale.
02_Open_Lanugage_Support.png    Prepare to switch to Simplified Chinese locale.
                                See the accompanying video for this process.
03_Enable_IBUS_Pinyin.png       After switching to Simplified Chinese. Note the locale
                                environment variables. Click the IBUS keyboard icon
                                and select Pinyin input.
04_Pinyin_Enabled.png           Ready for Pinyin input. Note the blue IBUS icon.
05_Type_Phonetic_Pinyin.png     Type in two letters: 'w' followed by 'o'. Phonetically these
                                correspond to the Chinese character representing 'I' or 'Myself'.
                                IBUS displays options. You want the first one. Hit the space
                                bar to choose it.
06_Chinese_Input_Complete.png   Chinese 'wo' in zh_CN.UTF-8 is ready to be passed to Bash.
                                Hit the return key to do so.
07_Crash_command_not_found.png  Bash calls command-not-found, which can't handle the input.
08_Disable_Pinyin_Input.png     Instruct IBUS to disable Simplified Chinese input.
ProblemType: Bug
DistroRelease: Ubuntu 11.10
Package: command-not-found 0.2.43ubuntu1 [modified: usr/lib/command-not-found]
ProcVersionSignature: Ubuntu 3.0.0-9.15-generic 3.0.3
Uname: Linux 3.0.0-9-generic x86_64
Architecture: amd64
Date: Fri Sep 2 10:23:08 2011
InstallationMedia: Ubuntu 11.10 "Oneiric Ocelot" - Beta amd64 (20110901)
PackageArchitecture: all
SourcePackage: command-not-found
UpgradeStatus: No upgrade log present (probably fresh install)
To manage notifications about this bug go to:
https://bugs.launchpad.net/command-not-found/+bug/839609/+subscriptions