VirtualBox Postmortem: Second Large User-Facing Regression in the last 30 days

Simon Quigley simon at tsimonq2.net
Wed Jan 17 21:17:32 UTC 2024


Dear Ubuntu Technical Board and Ubuntu Stable Release Updates Team,

(I am writing to both teams in lieu of a proper escalation procedure 
within the Stable Release Updates Team for problems such as this. I 
understand it is a current action item for the Technical Board to work 
with the Stable Release Updates Team on a documented process for team 
membership. For the sake of the issue at hand, I will be ignoring this 
discussion and debate entirely.)

VirtualBox is an incredibly popular piece of cross-platform 
virtualization software, with a wide-ranging user base. On Ubuntu, 
VirtualBox is used as an easy alternative to Red Hat's Virtual Machine 
Manager, and could very well be used in servers with a GUI. While this 
is in Multiverse, there is a large demand for VirtualBox.

Gianfranco Costamagna effectively maintains VirtualBox in Debian and 
Ubuntu. He is an outstanding contributor with deep technical knowledge 
that surpasses mine at unexpected times. Gianfranco helps with 
transitions often, pinging in the appropriate channels to get eyes on 
issues from the appropriate teams. Even as my own knowledge continues to 
grow, when I am completely out of my depth, Gianfranco usually has the 
answer. I trust him wholeheartedly.

A situation occurred in which the latest HWE kernel was released. 
Unfortunately, this caused a showstopping regression in VirtualBox, 
which has a set of DKMS modules. Here is the timeline of events:
  - 20151104: Martin Pitt ACKs a VirtualBox SRU on a single-case basis, 
as a response to Gianfranco's MRE request: 
https://lists.ubuntu.com/archives/technical-board/2015-November/002177.html
  - 20230420: Gianfranco files a bug for tracking of a new update to 
VirtualBox: 
https://bugs.launchpad.net/ubuntu/mantic/+source/virtualbox-hwe/+bug/2017101
  - 20230421: Andreas Hasenack follows up on the bug asking for more 
information, Gianfranco follows up afterwards but receives no response: 
https://bugs.launchpad.net/ubuntu/mantic/+source/virtualbox-hwe/+bug/2017101/comments/4
  - 20230915 10:14 UTC: Robie agrees this should receive a followup on 
the mailing list: 
https://irclogs.ubuntu.com/2023/09/15/%23ubuntu-release.html#t10:14
  - 20230915 10:23 UTC: Gianfranco follows up on this request, and asks 
for a general discussion on the issue. No response is received on the 
mailing list: 
https://lists.ubuntu.com/archives/ubuntu-release/2023-September/005787.html
  - 20230916: Gianfranco indicates that these uploads are now in the 
queue: 
https://bugs.launchpad.net/ubuntu/mantic/+source/virtualbox-hwe/+bug/2017101/comments/8
  - 2023/09/25/#ubuntu-release [10:19] <LocutusOfBorg> hello, ping to 
check virtualbox SRU
  - 2023/09/28/#ubuntu-release [16:00] <LocutusOfBorg> can anybody 
please have a look at virtualbox SRU?
  - 20230928 16:22 UTC: Gianfranco indicates that he updated the 
exception page given feedback from Robie Basak: 
https://irclogs.ubuntu.com/2023/09/28/%23ubuntu-release.html#t16:22
  - 2023/11/06/#ubuntu-release [11:16] <LocutusOfBorg> hello SRU team, 
any news w.r.t. virtualbox SRU?
  - 2023/11/07/#ubuntu-release [15:27] <LocutusOfBorg> tjaalton, hello, 
do you think we can do something w.r.t virtualbox sru?
  - 2023/11/17/#ubuntu-release [09:02] <LocutusOfBorg> also virtualbox 
is waiting there since months
  - 20231125: 
https://bugs.launchpad.net/ubuntu/+source/virtualbox/+bug/2044598 is 
filed regarding incompatibilities between the Linux 6.5 HWE kernel and 
virtualbox as in the archive.
  - 20231127: The Kernel Team uploads the initial 6.5 HWE kernel to 
Jammy: 
https://launchpad.net/ubuntu/+source/linux-meta-hwe-6.5/6.5.0.14.14~22.04.6
  - 2023/12/11/#ubuntu-release [13:04] <LocutusOfBorg> also ubuntu SRU, 
is it normal to have virtualbox in unapproved queue since april?
  - 2023/12/11/#ubuntu-release [13:04] <LocutusOfBorg> 
https://bugs.launchpad.net/ubuntu/mantic/+source/virtualbox-hwe/+bug/2017101
  - 20231213 10:02:32 PM US Central: The 6.5 HWE kernel is promoted to 
Main: 
https://launchpad.net/ubuntu/+source/linux-meta-hwe-6.5/+publishinghistory
  - 20240110 04:00:23 AM US Central: 6.5 HWE kernel lands in -updates 
and starts phasing.
  - 20240110 06:40:28 AM US Central: 6.5 HWE kernel lands in -security 
which sets phasing to 100%.
  - 20240111 3:18 AM US Central: Graham Inggs pings Gianfranco in a 
private channel asking if he knows anything about virtualbox-dkms 
failures in Jammy. Gianfranco follows up with deep frustration within a 
minute. He indicates that he stopped pinging because nobody would listen 
to him.
  - 20240111 21:00 UTC: I catch up on the backlog and realize what is 
going on. I ping both the SRU Team and the Release Team (the latter 
simply for visibility): 
https://irclogs.ubuntu.com/2024/01/11/%23ubuntu-release.html#t21:00
  - 20240111 23:18 UTC to 20240112 01:16 UTC: Brian Murray reviews the 
SRUs in the queue.
  - 20240112 00:59 UTC: Aaron Rainbolt begins testing the package.
  - 20240112 18:48 UTC: I emphasize the gravity of the situation in 
#ubuntu-release and ask for Brian to carefully weigh the options: 
https://irclogs.ubuntu.com/2024/01/12/%23ubuntu-release.html#t18:48
  - 20240112 6:30 PM US Central: After several meetings, I look outside 
to see that the Wisconsin blizzard has my car almost fully covered. I 
have about 5-10 minutes to leave before I would be stuck at the 
cowork space for the night. Aaron Rainbolt begins his usual Sabbath 
observance, which runs from Friday at sunset to Saturday at sunset.
  - 20240112-20240115: I ping the Lubuntu Team asking for QA hands to 
give some extra help if they could. Nobody else besides me and Aaron 
ended up testing this.
  - 20240114 17:18 UTC: Aaron Rainbolt pings the SRU Team asking for 
acceptance of the package into -updates. Timo Aaltonen indicates that 
it's still a Sunday, and nobody will be around to accept it. I respond 
indicating that it was, in fact, Monday already for one SRU Team member, 
Chris Halse Rogers, who ignored my ping asking for a second set of eyes.
  - 20240115 12:20:39 PM US Central: The VirtualBox update was published 
to -updates, going through the normal phasing process, unlike the kernel 
update. Fortunately, the phasing has not been halted. Unfortunately, as 
of the time of writing, 20% of users are still facing this issue.
  - 20240115 20:28 UTC: Ubuntu Weekly Newsletter 822 was published, with 
two of the five top posts on Ask Ubuntu for the week being about this 
issue: 
https://lists.ubuntu.com/archives/ubuntu-news/2024-January/000898.html

Before I get into the various issues here, allow me to thank the people 
who have been involved with this. Gianfranco, thank you for your efforts 
in trying to get this addressed. Aaron, thank you for taking the large 
amount of time to test the majority of the packages in question. Brian, 
thank you for promptly acting on my loud ping, and addressing it the 
best you can (I don't expect you to work weekends). Robie, thank you for 
reviewing and guiding Gianfranco on the MRE process. And Martin, on the 
off chance you are reading this, thank you for your work on this in 2015.

Here are the rather large issues I am seeing with how this was handled:
  - The Ubuntu Stable Release Updates Team needs to at least acknowledge 
proposals when given to them, especially by prominent community members. 
Robie took a great first step, but that mailing list post is from 
September, and no followup of any kind on the mailing list was received.
  - Gianfranco Costamagna is a contributor who is more valuable to 
Ubuntu than anyone could put into words. Ignoring his pings for four 
months only demotivates him. I am glad he is happy about this being 
addressed, but what does this tell Gianfranco about future issues on a 
similar scale? The SRU Team as a whole owes him an apology.
  - Exact figures are not public, but VirtualBox is very widely used. 
Ubuntu's offering of VirtualBox needs to work, and user-facing 
regressions in an LTS release, let alone any stable release, are 
completely unacceptable. This cannot happen again. All of us, as 
Ubuntu, owe an apology to our users.
  - In the Ask Ubuntu articles linked in UWN, the answers for both were 
"just install VirtualBox from a third-party repository." This may not 
seem like a large issue at first, but think about it. If I were a 
malicious attacker, this would have been the perfect opportunity to spin 
up a repository with a dirty VirtualBox. Users just want to be 
unblocked; most of them do not spend time worrying about what exactly is 
in the third-party repository that they're enabling (which is a great 
argument for snaps in a general sense). We are VERY lucky this did not 
happen.
  - Canonical has customers that rely on Ubuntu to be secure and 
relatively bug-free. I do not work for Canonical at this point in time, 
and I am completely unfamiliar with current customer deployments. That 
being said, it would not surprise me if a paying Canonical customer uses 
VirtualBox and was angry about this regression. This not only hurt our 
image as Ubuntu; it also hurt Canonical as a company.

It really does pain me to write this email. I am not doing this for the 
sake of starting an argument; my genuine intent is to lay out all of the 
facts and discuss them, so we can move forward together and better. Nobody 
should take this as a personal attack; that is not how I mean it. That 
being said, this is the *second* large, user-facing regression in the 
last 30 days. The Desktop Team committed to providing a postmortem on 
the Mutter issue, where video playback was broken on all stable 23.10 
installs, but they have not delivered on that commitment. What is going 
on here?

I have one simple question for the Technical Board and the Stable 
Release Updates Team: how do we ensure this never happens again?

--
Simon Quigley
simon at tsimonq2.net
tsimonq2 on LiberaChat and OFTC
@tsimonq2:ubuntu.com on Matrix
5C7A BEA2 0F86 3045 9CC8
C8B5 E27F 2CF8 458C 2FA4