[Bug 321656] [NEW] iso-8859-1 and/or utf-8 character not decoded properly

Zooko O'Whielacronx zooko at zooko.com
Mon Jan 26 21:57:59 UTC 2009


Public bug reported:

Binary package hint: kdebase-kde4

When I view this page:

http://eprint.iacr.org/2008/527

The non-ASCII character in the last name of the author "Michal Rjaško"
appears as a black diamond with a question mark in it.  If I do "View ->
View Document Information" then the resulting dialog box says "Document
encoding: UTF-8" *and* says "Content-Type: text/html;
charset=ISO-8859-1".  Inspecting the headers with 'wget --save-headers'
shows that the server is indeed specifying "Content-type: text/html;
charset=iso-8859-1".  Even more interesting, the "View -> Set Encoding"
option currently shows the radio button labelled "Western European ->
Autodetect".  If I change that radio button to "Western European ->
ISO-8859-1" then the author's name renders correctly.  It also renders
correctly if I change that radio button to "Unicode -> UTF-8".  So both
explicit settings work and only "Autodetect" fails.  I think the
auto-detection algorithm could use some work.  ;-)
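
For anyone triaging this, here is a rough Python sketch of the general
mechanism that tends to produce the black diamond (U+FFFD, the Unicode
replacement character): bytes decoded with a charset they are not valid
in.  It is only an illustration, not a reproduction of what Konqueror
does.  The sample bytes are an assumption (I have not dumped the real
page bytes), and declared_charset() and decode_report() are hypothetical
helper names made up for this example.

```python
from email.message import Message
from typing import Optional, Tuple


def declared_charset(content_type: str) -> Optional[str]:
    """Extract the (lowercased) charset parameter from a Content-Type value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def decode_report(raw: bytes, charset: str) -> Tuple[str, int]:
    """Decode raw bytes and count the U+FFFD replacement characters produced."""
    text = raw.decode(charset, errors="replace")
    return text, text.count("\ufffd")


if __name__ == "__main__":
    # ASSUMPTION: hypothetical sample bytes, not the actual page content.
    # 0x9A is "š" in Windows-1252 but is not valid on its own in UTF-8.
    sample = b"Michal Rja\x9ako"
    header = "text/html; charset=iso-8859-1"  # as reported by the server

    for name in ("utf-8", declared_charset(header), "cp1252"):
        text, bad = decode_report(sample, name)
        print(f"{name}: {text!r} ({bad} replacement characters)")
```

Running the same two helpers over the real response body saved with
wget would show which charset, if any, decodes the author's name without
replacement characters, which might help narrow down what the
autodetection is guessing.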

$ apt-cache policy konqueror-kde4
konqueror-kde4:
  Installed: 4:4.0.3-0ubuntu2
  Candidate: 4:4.0.3-0ubuntu2
  Version table:
 *** 4:4.0.3-0ubuntu2 0
        500 http://us.archive.ubuntu.com hardy/universe Packages
        100 /var/lib/dpkg/status

** Affects: kdebase-kde4 (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: encoding iso-8859-1 unicode

-- 
iso-8859-1 and/or utf-8 character not decoded properly
https://bugs.launchpad.net/bugs/321656
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.

-- 
ubuntu-bugs mailing list
ubuntu-bugs at lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

