+1 Duty Report
Christian Ehrhardt
christian.ehrhardt at canonical.com
Fri Jan 15 10:39:49 UTC 2021
Hi,
on this +1 Duty I first looked for some things stuck in proposed that I
usually would not unblock, but where I might have an advantage since they are
close to packages I often work with. From there I regularly checked excuses
to identify the usual test/build issues or to add retry triggers.
But for the major efforts I wanted to try something else this time (last
time I picked plenty of low-hanging fruit), so this time I chose a few that
seemed to be turning into long debug-fests. Therefore the list isn't as long
as usual (also I got quite some NMIs), so I worked on:
#1 - GPSD dependencies
This came in as a sync from Debian, and I have worked on it in the past for
Chrony. But being a sync, the soname bump seems to have gone unnoticed. Sadly
some of the dependencies are of rather poor quality, so it is almost expected
that they need some help.
I found some related issues about a symbol that had been dropped in the past
and is now reintroduced with different arguments - I fixed Debian's .symbols
file for that.
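To illustrate the kind of change (just a sketch with placeholder names -
neither the soname nor the symbol below are the real gpsd ones): a
reintroduced symbol has to show up again in the debian/*.symbols file with
the version it reappeared in, roughly like

  libgps.so.NN libgpsNN #MINVER#
   gps_reintroduced_example@Base 3.NN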
Furthermore, since it is a transition that got fully synced, plenty of
packages blocked on this implicitly. There was a dependency chain
gpsd -> direwolf -> hamlib -> 3 FTBFSes:
ppc64:
https://launchpad.net/ubuntu/+source/grig/0.8.1-3build1
https://launchpad.net/ubuntu/+source/qsstv/9.4.4-1build2
arm64:
https://launchpad.net/ubuntu/+source/fldigi/4.1.14-1build1
Resolving these would unblock plenty of packages.
The "fix", if you want to call it that, was easy: it seems those three builds
died in a hiccup on the 6th of January without leaving a build log. A simple
retry of the builds got them working, and with that plenty of other packages
could migrate.
Another dependency around hamlib needed builds and tests, but those resolved
just fine without further help. The next blocker it was entangled with is
src:mapnik, which moved from 3.0 to 3.1. Several dependent packages needed
rebuilds - Locutus was faster at triggering those. But the transition then
needed some cleanup of "old binaries left" which was not done automatically,
so I pinged ubuntu-AA about it.
Various small transitions are intertwined atm; this story continues below
around gstreamer, octave, gdal and a few others.
#2 dpdk-kmods
This is code that was once part of DPDK but is now split out into a separate
package by upstream. It also came in via a sync and first of all needed to be
accepted from hirsute-new, so I started with some IRC pings.
After that everything else resolved over time and it migrated into
hirsute-release.
#3 gst-plugins-bad blocks plenty of things
While skimming over excuses I found that a no-change rebuild of
https://launchpad.net/ubuntu/+source/gst-plugins-bad1.0/1.18.2-1ubuntu2
FTBFSed on s390x and ppc64el.
In turn that blocked opencv, and that blocked digikam, gdal, ros-vision-opencv,
pytorch, actiona, .... TL;DR: plenty of (usually unwatched) packages could
benefit from this being resolved.
The log first shows a red herring in the form of a failing test, but that
return code is ignored. Later on, and even more suspicious, dh_gstscancodecs
leads to a core dump.
These two builds are of identical source, but two months apart:
https://launchpad.net/ubuntu/+source/gst-plugins-bad1.0/1.18.1-1ubuntu1/+build/20200835
https://launchpad.net/ubuntu/+source/gst-plugins-bad1.0/1.18.2-1ubuntu2/+build/20721799
I tracked this down to one particular lib, "libgstlv2.so", which was crashing
gst-codec-info-1.0 of the gstreamer dev package. Around that time vorlon sent
a mail about his +1 duty and he had looked at (and found) the same.
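For anyone who wants to poke at such a crash locally: one way (just a sketch,
not necessarily the exact reproducer - gst-inspect-1.0 comes from
gstreamer1.0-tools and the plugin path depends on the architecture triplet)
is to load the suspect plugin in isolation under gdb and grab a backtrace:

  # load only the suspect plugin and catch the crash with a backtrace
  gdb --args gst-inspect-1.0 \
      /usr/lib/s390x-linux-gnu/gstreamer-1.0/libgstlv2.so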
Locutus pinged me that he was looking at the case as well, and working
together we identified libsord 0.16.4-1 -> 0.16.6-1 as the critical update.
I compiled various sord versions from git and found it isn't the sord version
but the toolchain: gcc-9 works all the time, while gcc-10 at -O2 and higher
breaks. Further debugging showed that adding -fno-schedule-insns2 is enough
to get it working.
That is small enough for a fix, so I prepped bugs and an upload to resolve
things for now.
=> https://bugs.launchpad.net/ubuntu/+source/gcc-10/+bug/1911142
=> https://gitlab.com/drobilla/sord/-/issues/1
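For reference, a quick way to verify such a flag-level workaround locally is
a test build with the flag appended via dpkg-buildflags (a sketch, assuming
the package build honours dpkg-buildflags; the actual upload instead wires
the flag into debian/rules):

  # one-off local test build of sord with the scheduling pass disabled
  apt-get source sord
  cd sord-0.16.6
  DEB_CFLAGS_APPEND="-fno-schedule-insns2" dpkg-buildpackage -us -uc -b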
Overall this was involved not only in the gpsd transition, but also in gdal
and a bunch of others. Because of that I can't pretend to have done all of it
alone - plenty of people were involved turning knobs here and there. But
eventually things resolved \o/ and that is what matters.
=> https://launchpad.net/ubuntu/+source/sord/0.16.6-1ubuntu1
=> https://launchpad.net/ubuntu/+source/gst-plugins-bad1.0/1.18.2-1ubuntu2
#4 octave
This built fine
https://launchpad.net/ubuntu/+source/octave/6.1.1~hg.2020.12.27-3
but had autopkgtest regressions on all architectures:
https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-hirsute/hirsute/amd64/o/octave-parallel/20210108_160028_6467e@/log.gz
These tests belong to src:octave-parallel.
They had already been retried by several people with custom triggers around
other octave-* things like:
octave/6.1.1~hg.2020.12.27-3 octave-parallel/4.0.0-2build1
octave-struct/1.0.16-8
Unfortunately, if you just install things in hirsute or debian-sid the test
works fine. So much for easy debugging. OTOH on autopkgtest.ubuntu.com it
keeps failing, even with all-proposed.
Upgrading from manual tests to autopkgtest, but in local VMs, also works for
hirsute as-is, for hirsute-proposed, and for a selection of just
octave, octave-parallel and octave-struct.
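For reference, the kind of local run I mean is roughly (a sketch - the image
name is whatever autopkgtest-buildvm-ubuntu-cloud produced for hirsute on the
local machine):

  # run the test in a local qemu VM, pulling in hirsute-proposed first
  autopkgtest -U --apt-pocket=proposed octave-parallel -- \
      qemu autopkgtest-hirsute-amd64.img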
Worth noting: on armhf this worked just fine on autopkgtest.u.c.
The debugging of this went on and on, so I created a bug to track what I had
found, to share it with other debuggers and to be visible via the
update-excuse tag.
=> https://bugs.launchpad.net/ubuntu/+source/octave-parallel/+bug/1911400
I had some fun debugging this, and after some TIL moments and a long strange
trip it turned out to be rather simple: this is a lib for parallelization,
and the new version fails on a single vCPU - so it needs to be marked as a
big_package.
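That theory can be double-checked locally by pinning the qemu testbed to a
single CPU (again just a sketch, same image assumption as above):

  # mimic the small cloud workers: a testbed with only one vCPU
  autopkgtest octave-parallel -- qemu --cpus 1 autopkgtest-hirsute-amd64.img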
Nevertheless - since it is a regression from our POV - I also filed an
upstream bug about it.
=> https://savannah.gnu.org/bugs/index.php?59869
The MP to mark it as huge is here:
=> https://code.launchpad.net/~paelzer/autopkgtest-cloud/+git/autopkgtest-cloud/+merge/396300
After that was merged I submitted some custom triggers to get the set of
packages tested together as needed. I also checked the test logs to confirm
they used the new instance type - they did, and the results finally came back
good, unblocking many other things that were indirectly linked.
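For reference, such a custom trigger is essentially a request to
autopkgtest.ubuntu.com of roughly this shape (a sketch; the exact trigger
list comes from the excuses page):

  https://autopkgtest.ubuntu.com/request.cgi?release=hirsute&arch=amd64&package=octave-parallel&trigger=octave%2F6.1.1~hg.2020.12.27-3&trigger=octave-struct%2F1.0.16-8

The all-proposed retries mentioned above add an all-proposed=1 parameter on
top of that.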
Just when I thought things were happy I was told that a new dependency had
come in. Thanks for shattering my hopes Rick :-P. This was an installability
issue on riscv64 between nheko/fmtlib, which was taken care of by
vorlon/Rik/Gianfranco while it was nighttime for me. But the story continued:
after that was resolved, a new plasma-workspace joined the entanglement
party. Finally, at the end of this week, all of those bits moved in the next
britney run. \o/
#5 dune-* FTBFS
A bunch of dune-* packages were blocked on one of them failing to build.
The reason was an unclear segfault that seemed to be flaky. I found that
Locutus had already uploaded a delta to "Reduce parallelism to 3 in arm64",
only for it to now fail on armhf and ppc64.
I couldn't reach him, so I gave it a try in a PPA to see if that change would
help all architectures. But it still failed with that change (flaky, not
100%). Rebuilds as-is fail (2/2), and lowering the parallelism on all
architectures does not fix armhf (which formerly built fine). This really
seems to be a huge flaky-fest, or a situation that got worse with recent
toolchains.
To push this a bit further I did various other test builds, but all of them
were flaky. In a discussion on #ubuntu-release the last theory was that
(other than retrying forever) using gcc-9 might help, as it did in other
cases.
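For completeness, the kind of knobs being turned in those test builds looks
roughly like this (a sketch, run in the failing source tree; whether the dune
build honours a CC/CXX override is an assumption on my side):

  # reduced parallelism, as in the existing delta
  DEB_BUILD_OPTIONS=parallel=3 dpkg-buildpackage -us -uc -b
  # and/or a test build with the older toolchain
  CC=gcc-9 CXX=g++-9 dpkg-buildpackage -us -uc -b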
... it feels like not as much as usual, but still, at least some progress was
made.
See you next week.
--
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd