Update on package import failures involving non-ascii filenames

James Westby james.westby at canonical.com
Wed Oct 26 12:52:02 UTC 2011


Hi Martin,

Thanks to you and the others that worked on this, it's great to have
some progress on this long-standing problem.

Thanks,

James


On Wed, 26 Oct 2011 09:56:58 +0100, Martin Packman <martin.packman at canonical.com> wrote:
> Yesterday we deployed a fix for <http://pad.lv/508258> to how the
> package importer handles non-ascii filenames. Some packages have now
> been successfully imported, and better feedback, including the problem
> filename, is now given where there are still errors.
> 
> 
> Jonathan Riddell correctly noted in the bug that some packages have
> filenames that aren't UTF-8, but the issue was also preventing some
> that did have names the importer could decode from succeeding, such
> as:
> 
> <http://anonscm.debian.org/gitweb/?p=bash-completion/debian.git;a=tree;f=test/fixtures/_filedir>
> 
> Several of the remaining failures look similarly to be from test
> suites, given the full filenames now listed in the error message. It's
> encouraging that programs care about this and want to test they handle
> non-ascii filenames correctly, though generating them at test time
> would be a better approach. :)
> 
> 
> The overall results are:
> * Before, there were 69 packages with failures across four similar
> UnicodeDecodeError signatures.
> * Now there are 36 across three BadFilenameEncoding signatures, plus 3
> new failures with InvalidNormalization, and 1 other issue exposed.
> 
> Filenames in encodings other than UTF-8:
> <http://package-import.ubuntu.com/status/08eff66a5fe37a967e2f2b06210cc608.html>
> <http://package-import.ubuntu.com/status/30ae559e05ce5bb20844bede31da986f.html>
> <http://package-import.ubuntu.com/status/85d4559836f13603269ee4914ae9d629.html>
> 
> Many of these are filenames in legacy single byte encodings, some are
> double byte, and some are clearly junk. A complete fix needs
> <http://pad.lv/63324> in bzr core resolving, but hacking some fallback
> into the importer would be possible if it was deemed worthwhile.
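
For illustration, a fallback of the kind described above might look roughly
like the sketch below. This is not the importer's or bzr's actual code: the
function name decode_filename and the choice of latin-1 as the fallback are
assumptions made for the example, and it is written for Python 3.

```python
def decode_filename(raw, fallbacks=("latin-1",)):
    """Decode a filename given as bytes, returning (text, encoding_used).

    UTF-8 is tried first; the encodings in `fallbacks` are tried in order
    only if that fails. Hypothetical helper, for illustration only.
    """
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        pass
    for encoding in fallbacks:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    raise ValueError("cannot decode filename: %r" % (raw,))


if __name__ == "__main__":
    print(decode_filename(b"caf\xc3\xa9"))  # valid UTF-8
    print(decode_filename(b"caf\xe9"))      # legacy single byte encoding
```

Note that latin-1 maps every byte to some character, so with it in the
fallback list the "clearly junk" names would decode too, just not to anything
meaningful. A guess like this only hides the problem rather than solving it.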
> 
> Packages with filenames that are not in unicode normal form 'NFC':
> <http://package-import.ubuntu.com/status/67718244ffa439a0d328d3ac8954ec61.html>
> 
> This is a symptom of the incomplete unicode normalisation code in bzr,
> which OSX <http://pad.lv/172383> also runs into.
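
To make the 'NFC' condition concrete, here is a minimal sketch using Python's
standard unicodedata module. It only shows what "not in normal form" means
for a filename; the helper name is_nfc is made up for the example and this is
not how bzr itself performs the check.

```python
import unicodedata


def is_nfc(name):
    """Return True if the unicode string `name` is already in NFC form."""
    return unicodedata.normalize("NFC", name) == name


if __name__ == "__main__":
    composed = "caf\u00e9"      # e-acute as a single code point
    decomposed = "cafe\u0301"   # plain 'e' followed by a combining acute
    print(is_nfc(composed), is_nfc(decomposed))
```

The two strings render identically but compare unequal, which is why a
filename stored in decomposed form trips up code that assumes NFC.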
> 
> Funky symlink issue:
> <http://package-import.ubuntu.com/status/8a7cf4cbdc3d56a34c087549cacbdee2.html>
> 
> Probably a bug in the bzr-builddeb (copied from bzrtools) import_dir function.
> 
> 
> There were a few wrinkles in getting this deployed. Some existing
> fallout from recent lp:udd changes needed tackling first, and then
> using the latest lp:bzr-builddeb broke a few things that needed
> interface updates on the udd side. Most excitingly, the first
> normalisation failure caused a loop that led to infinitely repeating
> tracebacks. Fortunately we were monitoring the process at the time, so
> we could kill it and fix the problem.
> 
> Martin
> 
> -- 
> ubuntu-distributed-devel mailing list
> ubuntu-distributed-devel at lists.ubuntu.com
> Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-distributed-devel
