import failures
John Arbash Meinel
john at arbash-meinel.com
Tue Jan 5 17:18:59 GMT 2010
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
...
> On Fri Dec 11 03:35:01 +0000 2009 james.westby wrote:
>> 639 packages failed
>>
>> 94 repeated reasons:
>>
>> 61 packages failed with reason
...
>> "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/util.py", line
>> 358, in find_extra_authors
>> match = extra_author_re.match(change.decode("utf-8"))
>> File "/usr/lib/python2.5/encodings/utf_8.py", line 16, in decode
>> return codecs.utf_8_decode(input, errors, True)
>> UnicodeDecodeError: 'utf8' codec can't decode bytes in position 9-14:
>> unsupported Unicode code range
>
> First of a set which is probably non-utf8 data in changelogs. There may be hacks
> we can do for this. Don't discount the possibility that it is faulty encoding
> handling though.
>
^- 'unsupported Unicode code range' sounds funny, but it may just be
that they have latin-1 chars in what should otherwise be a UTF-8 doc. Is
changelog *defined* as UTF-8? Or is it just '8-bit, put whatever feels
good to you' in there?
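To make the Latin-1 guess concrete, here is a minimal sketch in Python 3 (the traceback above comes from Python 2.5, so the exact error wording differs). The sample changelog line is invented for illustration; it only shows that Latin-1 bytes are rejected by a strict UTF-8 decode while the same text encoded as UTF-8 round-trips.

```python
# Hypothetical changelog line; "é" becomes the single byte 0xE9 in Latin-1.
text = "  * Fix by André"
latin1_line = text.encode("latin-1")
utf8_line = text.encode("utf-8")

try:
    latin1_line.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError as exc:
    # A strict UTF-8 decode refuses the lone 0xE9 byte.
    decoded_ok = False
    print("not valid UTF-8:", exc.reason)

# The UTF-8 encoded variant decodes back to the original text.
round_trip = utf8_line.decode("utf-8")
```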
>> 43 packages failed with reason
...
>> "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/import_dsc.py",
>> line 159, in import_dir
>> import_archive(tree, dir_file, file_ids_from=file_ids_from)
>> File
>> "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/import_dsc.py",
>> line 234, in import_archive
>> trans_id = tt.trans_id_tree_path(relative_path)
>> File "/usr/lib/python2.5/site-packages/bzrlib/transform.py", line 241, in
>> trans_id_tree_path
>> path = self.canonical_path(path)
>> File "/usr/lib/python2.5/site-packages/bzrlib/transform.py", line 1282, in
>> canonical_path
>> abs = self._tree.abspath(path)
>> File "/usr/lib/python2.5/site-packages/bzrlib/workingtree.py", line 394, in
>> abspath
>> return pathjoin(self.basedir, filename)
>> File "/usr/lib/python2.5/posixpath.py", line 65, in join
>> path += '/' + b
>> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 25:
>> ordinal not in range(128)
>
> This is usually that there is a filename from a different encoding in the
> package. We may not be able to get around this. I imagine some are utf-8
> though, so it may be a bug that it is trying to decode in ascii.
>
My guess is that you are handing us 8-bit paths, and inside bzrlib all
*paths* are supposed to be Unicode. And if you hand us an 8-bit string,
and we up-cast it to Unicode, then we fail because the upcast is
generally done via ascii.
So I would at least take a first look at the 'import_archive' code, and
make sure it is trying to work in Unicode paths, rather than 8-bit strings.
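A rough sketch of what "work in Unicode paths" could look like, written in Python 3. The helper name to_unicode_path is hypothetical and is not part of bzrlib or builddeb, and treating archive paths as UTF-8 is an assumption; the real encoding inside a given package is unknown.

```python
import posixpath

def to_unicode_path(raw_path, encoding="utf-8"):
    """Decode an 8-bit path to text; pass text through unchanged.

    Hypothetical helper; the default encoding is an assumption.
    """
    if isinstance(raw_path, bytes):
        return raw_path.decode(encoding)
    return raw_path

# Hypothetical archive member name; "ł" is the bytes 0xC5 0x82 in UTF-8.
raw = "docs/Michał.txt".encode("utf-8")

# An ASCII up-cast fails on the first byte above 0x7F.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Decoding explicitly at the boundary keeps the later join in Unicode.
joined = posixpath.join("/srv/tree", to_unicode_path(raw))
```

Decoding once, where the path enters the importer, means nothing further down has to guess an encoding.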
>> 36 packages failed with reason
>> 'launchpadlib.errors.HTTPError:<module>:main:get_versions:lp_call:__call__:_requ
>> est':
>
...
>> File "/usr/lib/python2.5/site-packages/launchpadlib/_browser.py", line 211,
>> in _request
>> raise HTTPError(response, content)
>> launchpadlib.errors.HTTPError: HTTP Error 503: Service Unavailable
>
> Launchpad doesn't like me. These 36 happened in the few hours I
> was working on this task.
>
Could this be related to the overloading of whatever machine that also
happened? Meaning running this stuff is hammering on a machine hard
enough that it times out occasionally? (Swapping, etc?)
...
>
>> 30 packages failed with reason
>> 'UnicodeDecodeError:<module>:main:import_package:import_package:_do_import_packa
>> ge:import_upstream:decode':
>>
>> /srv/package-import.canonical.com/new/scripts/python-debian/debian_bundle/change
>> log.py:274: UserWarning: Unexpected line while looking for next heading of EOF:
>> vim:ai:et:sts=2:sw=2:tw=78:
>> warnings.warn(message)
>> Traceback (most recent call last):
>> File "./import_package.py", line 884, in <module>
>> sys.exit(main(args[0]))
>> File "./import_package.py", line 849, in main
>> import_package(temp_dir, package, version, distro, release, pocket,
>> package_url, possible_transports=possible_transports)
>> File "./import_package.py", line 532, in import_package
>> use_time_from_changelog=True)
>> File
>> "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/import_dsc.py",
>> line 1555, in import_package
>> timestamp=timestamp, author=author)
>> File
>> "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/import_dsc.py",
>> line 1434, in _do_import_package
>> timestamp=timestamp, author=author)
>> File
>> "/srv/package-import.canonical.com/new/scripts/plugins/builddeb/import_dsc.py",
>> line 1155, in import_upstream
>> revprops['authors'] = author.decode("utf-8")
>> File "/usr/lib/python2.5/encodings/utf_8.py", line 16, in decode
>> return codecs.utf_8_decode(input, errors, True)
>> UnicodeDecodeError: 'utf8' codec can't decode bytes in position 10-12:
>> invalid data
>
> Changelog data again perhaps.
An author field that is non-ascii and not utf-8. There is always the:
def decode_as_best_you_can(s):
    try:
        return s.decode('utf-8')
    except UnicodeDecodeError:
        return s.decode('latin-1')
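The same fallback restated for Python 3, with a small usage check. The author strings are made up. Note that the Latin-1 branch can never raise, since every byte maps to a code point, so data in some other legacy encoding would come out as the wrong characters rather than as an error.

```python
def decode_as_best_you_can(raw):
    """Decode bytes as UTF-8, falling back to Latin-1 when that fails."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 accepts any byte sequence, so this is a last resort
        # that may mis-render text from other 8-bit encodings.
        return raw.decode("latin-1")

# Hypothetical author fields, one per encoding.
author = "José <jose@example.com>"
from_utf8 = decode_as_best_you_can(author.encode("utf-8"))
from_latin1 = decode_as_best_you_can(author.encode("latin-1"))
```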
>
...
>> "KnitPackRepository('lp-45193168:///~ubuntu-branches/ubuntu/karmic/awstats/karmi
>> c/.bzr/repository')\nis not compatible
>> with\nCHKInventoryRepository('lp-45193168:///~ubuntu-branches/ubuntu/jaunty/awst
>> ats/jaunty-updates/.bzr/repository')\ndifferent serializers")
>
> This is because there are some packages that were imported in an older format.
> We should upgrade them. It's failing as there are no smarts to pick a compatible
> format when we work on those packages.
>
Is it possible to get a query of old ones, and just run a bulk-update of
them?
...
>> line 1264, in import_debian
>> revprops['authors'] = "\n".join(authors).decode("utf-8")
>> File "/usr/lib/python2.5/encodings/utf_8.py", line 16, in decode
>> return codecs.utf_8_decode(input, errors, True)
>> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position
>> 3: ordinal not in range(128)
>
> Changelog data again?
The author field seems especially sensitive.
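One possible reading of that traceback, offered as an assumption rather than something confirmed in this thread: in Python 2, calling .decode() on a value that is already a unicode object first encodes it with the ASCII codec, which would explain a UnicodeEncodeError coming out of a decode call. A sketch in Python 3 that decodes only the entries that are still bytes follows; the helper name authors_to_text and the sample names are hypothetical.

```python
def authors_to_text(authors, encoding="utf-8"):
    """Join author names, decoding only the entries that are still bytes."""
    decoded = []
    for author in authors:
        if isinstance(author, bytes):
            author = author.decode(encoding)
        decoded.append(author)
    return "\n".join(decoded)

# Hypothetical mix: one raw byte string, one already-decoded name.
mixed = ["René Dupont".encode("utf-8"), "Zoë Smith"]
joined_authors = authors_to_text(mixed)
```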
...
>> "/usr/lib/python2.5/site-packages/bzrlib/repofmt/groupcompress_repo.py", line
>> 583, in _execute_pack_operations
>> packer.pack()
>> File "/usr/lib/python2.5/site-packages/bzrlib/repofmt/pack_repo.py", line
>> 749, in pack
>> return self._create_pack_from_packs()
>> File
>> "/usr/lib/python2.5/site-packages/bzrlib/repofmt/groupcompress_repo.py", line
>> 471, in _create_pack_from_packs
>> self._pack_collection.allocate(self.new_pack)
>> File "/usr/lib/python2.5/site-packages/bzrlib/repofmt/pack_repo.py", line
>> 1715, in allocate
>> 'Pack %r already exists in %s' % (a_new_pack.name, self))
>> bzrlib.errors.BzrError: Pack 'ac9506e4e5ddccb2730a2920256091bc' already
>> exists in <bzrlib.repofmt.groupcompress_repo.GCRepositoryPackCollection object
>> at 0x263e490>
>>
>> qmmp
>
> bzr bug?
Happens if you commit exactly the same data 2 times, or if you try to
autopack a single file. We've fixed a few of them, but having
reproducible data here would help.
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/
iEYEARECAAYFAktDdIMACgkQJdeBCYSNAAOw+gCggy6/FufD0H0jS9W7+R+dhX/Q
7moAnRdOPYPxThRdqdxkfobndm2CKrg+
=D23h
-----END PGP SIGNATURE-----