Release Process concerns (QA) and suggestions

Thu Aug 30 18:53:49 UTC 2012

On 08/28/2012 09:38 AM, Gema Gomez wrote:
> Hi all,

Hi,

> during the last release meeting[1] I brought my concern about how
> quality is not taken into account as much as I think it should be by the
> release process and was asked to produce an email with suggestions for
> improvement.
> 
> My main concern with the release process is that decisions on respins
> are made without *explicitly* considering that to be able to release a
> good image, there is a lot of testing that needs to be done after the
> respin. We are used to having just one run to verify that the system can
> install and then we are good to go. In my opinion, this is not enough
> anymore, especially if we want to increase the quality of our releases
> going forward.

I'm going to start by saying that we never respin for fun. When we
respin, there's a really good reason to do so.

The release team is in charge of releasing a pre-defined set of images,
for a given list of media at a given date. That's how things are.

When we unfortunately hit a bug at the last minute, as happened last
week, the release team needs to check how critical it is. If it's
considered a show-stopper, as was the case here, the only action to
take is to fix it as soon as possible, re-test and then release.

If we know it's technically impossible to get it re-tested in time, then
we need to release a day later, but that's a very last resort as
releasing on a Friday brings its own set of problems.

In the case of 12.04.1, we noticed on release day that an image didn't
actually fit on its target media and apparently no tester bothered to
actually burn it to a standard CD...

We found an obvious way of fixing it (removing a langpack) within just a
couple of hours, got the change reviewed, tested, the image rebuilt, the
content checked and then fully re-tested by 3 testers in less than 3 hours,
leaving us a good 10-12 hours before we actually released the set.

> The release process should be at the heart of good QA practices and
> needs to ensure beyond any doubt that testing has happened to a
> reasonable standard. The release engineer shouldn't be the only one
> verifying the software he has just fixed (because by definition he is
> going to be biased in the verification). Also, third party testers,
> people different from the developers or release engineers, are always
> recommended to validate a product. Also, we should avoid conflicts of
> interest, i.e. the release engineer wants to get the release out of the
> door as priority 1, the QA engineer wants that release to be of the
> highest quality as priority 1, and those two are often conflicting
> views, in my opinion the release manager should take those two bits of
> information into account and make an informed decision.

That's correct and we didn't have that.

I'm the one who noticed the issue (not QA ...), tracked it down and fixed it.

That fix was reviewed by Adam Conrad, landed by IS, then the resulting
image was tested by Jean-Baptiste, Nicholas and me.

So we had more peer review than required at every step. I really don't
understand what you're complaining about.

> The test cases we have at the moment are very minimal, better coverage
> is achieved by having people go through somewhat open ended test cases
> and hence ending up testing more than is written in the test case
> itself. This is not ideal from the QA perspective, but it is a good
> compromise when a more comprehensive set of test cases doesn't exist
> yet. This means that when we run through the mandatory test cases on a
> Thursday before releasing in a hurry, we are not getting as much
> coverage as when all the community is running open ended test cases at
> home on a varied set of hardware. Needless to say we are working hard to
> improve this situation by using more automated testing and allowing
> people to do more specialized testing, but we are not there yet.

That's a problem with the test cases, and so it is something the QA team
should be working on. By release team standards, we had full coverage by
more than one tester, which is good for release.

> I am not saying "let's not respin on release week/milestone", I am
> saying let's respin responsibly, and only if there is enough time to run
> enough testing to be satisfied that the quality of the images is reasonable. It is
> better to release with known errors than to release with unknown ones,
> quality is not about releasing with 0 errors, but about knowing what
> errors are there in what is being released.

We always respin responsibly. Believe it or not, respinning is at least
as much pain for the release team as it is for the testers, and we never
take such action lightly.

Critical installation bugs, security bugs, immediate post-install bugs
and CD size problems are usually considered show-stoppers as these can't
be worked around by the user. It'd be wrong not to respin for these.

Obviously I'm also dreaming of a world where software doesn't have bugs
or where issues are found months ahead of release, but sadly that world
doesn't exist and we have to live with reality.

> The release process[2] is not detailed enough, in my opinion. We have no
> guidelines as to what is a good reason to respin and what is not, and
> this leads to different engineers making different decisions (at least
> in my opinion, although the decision is normally to respin, even for
> corner cases such as [4], that only fails with upgrades without network
> connectivity) when presented with the same problem. The Release
> Validation Process[3] is ancient and hasn't been updated in the past two
> years, I take an action to work on that.

The "corner case" in [4] is a supported upgrade path, used by
governments and other internet-less environments. Leaving that bug
unfixed resulted in completely broken, unbootable systems, and as such
it definitely fits the respin criteria.
Our alternative would have been to drop support for these, which we
considered and decided against for 12.04. However, 12.04 is going to be
the last release where such an upgrade path is supported.

We have a limited set of people from the release team who are in charge
of spinning images; these are the ones making the decision.

You can see who's in charge of what here:
https://wiki.ubuntu.com/QuantalQuetzal/ReleaseTaskSignup

> I was asked to suggest possible improvements to the current process.
> Here they are:
> 
> - Let's document what constitutes a respin and what doesn't, so that
> whenever we see a bug we all know if that is going to trigger a respin
> or not, let's create guidelines for it.

I suppose we can do that. Ultimately it's always going to be up to the
release team to do a go/no-go on a case-by-case basis, but writing some
generic guidelines can't hurt.

At least for me, anything that fits one of the following is release
critical:
 - Security issues affecting the live/install environment
 - Kernel bugs preventing the boot of the image on commonly available
   hardware
 - Installer bugs leading to installation failure or a broken post-install
   experience without an obvious workaround
 - Upgrade bugs leading to broken/non-working systems that can't be
   fixed post-upgrade through SRU
 - Critical bugs affecting common software used immediately
   post-installation

I'm also always happy to respin for lower priority issues for images
that didn't get any testing or when explicitly asked by a product
manager (mostly for flavours).

> - Let's improve the static analysis of images so that we don't have the
> image size problem again, we are adding a job for this to Jenkins this week.

Can you also make sure someone actually burns the image on the supported
media?

I'm still amazed that for a whole week, nobody even tried to burn a CD
with our image...
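
For illustration, the kind of automated size check mentioned above could
look roughly like the following. This is only a minimal sketch in Python,
assuming a standard 80-minute CD of 360,000 sectors of 2,048 bytes each;
the function name, the constant and the image file name are hypothetical
and not part of any existing Ubuntu or Jenkins tooling.

```python
import os

# Assumption: a standard 80-minute CD holds 360,000 sectors of 2,048 bytes.
CD_CAPACITY_BYTES = 360_000 * 2048  # 737,280,000 bytes


def fits_on_media(image_path, capacity_bytes=CD_CAPACITY_BYTES):
    """Return True if the image file is no larger than the media capacity."""
    return os.path.getsize(image_path) <= capacity_bytes


if __name__ == "__main__":
    # Hypothetical file name, used only to show how the check would be called.
    image = "example-desktop-amd64.iso"
    if os.path.exists(image):
        print(image, "fits" if fits_on_media(image) else "is oversized")
```

A check like this only catches oversized images; it does not replace
burning the image to real media.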

> - Let's require more than just one run of the test cases to validate an
> image. What is reasonable in terms of ensuring reasonable HW coverage?
> I'd like to see at least 3 x 100% run rate with 100% pass rate on the
> current test cases, from people different from the release engineer.

For final images I usually look for at least 2 people testing the
various code paths. Unfortunately these code paths can't be easily
represented in the UI, so ultimately it's a release team decision to
know whether the threshold has been met.

In the case of amd64 and amd64+mac, the livefs is actually identical (as
in, really identical, same build, same checksum), so we essentially want
full coverage of amd64 and someone to make sure that amd64+mac works on
a mac.
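
As an illustration of what "same checksum" means in practice, here is a
minimal sketch in Python that compares two files by digest. SHA-256 is an
assumption (the checksum actually used is not stated here), and the
function names are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large images are never loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b):
    """Return True if both files have the same SHA-256 digest."""
    return sha256_of(path_a) == sha256_of(path_b)
```

If two livefs files compare equal this way, full test coverage of one
also covers the contents of the other, which leaves only the
image-specific parts (such as booting on a mac) to verify separately.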

> - Whenever a respin is going to happen, the release team gives the
> opportunity to disagree to the QA team (to whoever is leading the
> testing on that particular milestone), and to any other interested group.

I don't agree with this. We always welcome input and are always
discussing these things in the open in #ubuntu-release. Anyone is
welcome to comment there, but ultimately it's the release team's choice,
not anyone else's.

> - Or maybe we should simply have a three person group that makes the
> calls on respins during release/milestone week, with all the interested
> parties represented.

As I said, the call is the release team's, and more specifically that of
the release team members assigned to that specific release.
For 12.04.1, it was Kate, Dave and I who could make that call.

> Here are my thoughts as promised, thanks for reading. Many more things
> could probably be done, if you can think of any, please say so!
> 
> Cheers,
> Gema

Sorry for the rather long e-mail, but I hope it explains a bit better
how things work.

As I said on IRC and discussed with the rest of the team, I don't
believe anything was done wrong. I do believe our images were
sufficiently tested and, as far as I know, nothing actually went wrong.

You don't seem to agree and I can acknowledge that. However, I don't
intend to change anything and I don't think we should have.

> [1]
> http://ubottu.com/meetingology/logs/ubuntu-meeting/2012/ubuntu-meeting.2012-08-24-15.00.log.html
> [2] https://wiki.ubuntu.com/ReleaseProcess
> [3] https://wiki.ubuntu.com/ReleaseValidationProcess
> [4] https://bugs.launchpad.net/ubuntu/+source/fontconfig/+bug/1039828
> 

-- 
Stéphane Graber
Ubuntu developer
http://www.ubuntu.com
