Re: autopkgtest results influencing migration from unstable to testing




Hi Steve,

On 12-05-18 18:07, Steve Langasek wrote:
> I think the status quo is that we have a lot of autopkgtests that are
> useless as a CI gate.

I wonder. I estimate that about 15% of tests in Debian have never passed.
Obviously those are useless for anything, except perhaps to the careful
maintainer who reads all the results and uses them even when they fail (a
minority, I assume). The rest are usable for gating.

> In Ubuntu, we have 245 overrides in place to ignore "regressions" in tests
> that once passed but no longer do.  Many of these are tests that are simply
> flaky and only pass sometimes.  Some are tests that once succeeded but now
> fail because something else changed in the overall distribution, and wasn't
> / couldn't be caught by autopkgtest to prevent the regression.  Some are
> tests that are treated as "regressions" now because previously they couldn't
> be run on a particular architecture, but Ubuntu's infrastructure has
> improved to where they now can be (and are seen to fail).

In Debian we define regression differently than in Ubuntu. Ubuntu
compares the current result to all the results that were ever obtained.
In Debian we compare to the current status of testing (barring some
implementation details). All the flaky tests remain an issue (and we are
filing bugs¹ for those and ignoring the results² until they are fixed).
245 sounds like an awful lot. Do you know how many of these are flaky?
(This prompts me to check what you have, as I would love to reuse the
Ubuntu triaging effort in Debian wherever it applies to Debian.)
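
To make the difference between the two definitions concrete, here is a
minimal sketch. It is not how britney or the Ubuntu infrastructure
actually implement this (the implementation details mentioned above are
ignored), and the function names and the "pass"/"fail" strings are made
up for illustration only.

```python
from typing import Optional, Sequence

# Hypothetical result labels, chosen for this illustration only.
PASS = "pass"
FAIL = "fail"


def is_regression_ever_passed(history: Sequence[str], current: str) -> bool:
    """Ubuntu-style reading: a failure counts as a regression if the
    test has passed at any point in its recorded history."""
    return current == FAIL and PASS in history


def is_regression_vs_testing(testing_result: Optional[str], current: str) -> bool:
    """Debian-style reading: a failure counts as a regression only if the
    test currently passes in testing (None means no reference result)."""
    return current == FAIL and testing_result == PASS


# A test that passed once long ago, but already fails in testing today.
history = [PASS, FAIL, FAIL]
print(is_regression_ever_passed(history, FAIL))  # True
print(is_regression_vs_testing(FAIL, FAIL))      # False
```

With the second definition, a test that is already failing in testing
does not block anything; only a flaky test that happens to pass in
testing and fail for the candidate would.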

> And there has been no release of Ubuntu in which more than 93% of tests
> passed, on any architecture. (http://autopkgtest.ubuntu.com/statistics)

Debian is doing worse. Current status for unstable (on amd64 only) is
about 80-85%³. The numbers for testing aren't complete yet, and on top of
that the raw statistics are confusing, because they include the results of
the tests that prevent migration (for 10 days).

> I have also been submitting a lot of bug reports about autopkgtests that the
> maintainers have allowed to regress in new versions of the Debian package.
> 
>   https://bugs.debian.org/cgi-bin/pkgreport.cgi?users=ubuntu-devel@xxxxxxxxxxxxxxxx;tag=autopkgtest

That list (with open bugs) is shorter than I feared, as the list that I
started a couple of months ago is already longer⁴. But I expect that is
merely the effect of you not filing all issues (I don't blame you,
it's a lot of work). I expect Debian maintainers will start to put more
emphasis on passing results, as they now matter *in* Debian.

> It's fine if the raw number of tests goes down, if the overall quality of
> the tests - and therefore the quality of the release - goes up (and the time
> wasted hunting buggy tests goes down).

Fully ack on this though.

Paul

¹
https://bugs.debian.org/cgi-bin/pkgreport.cgi?users=debian-ci@xxxxxxxxxxxxxxxx;tag=flaky
² https://release.debian.org/britney/hints/elbrus
³ https://ci.debian.net/status/
⁴
https://bugs.debian.org/cgi-bin/pkgreport.cgi?users=debian-ci@xxxxxxxxxxxxxxxx;tag=regression;tag=breaks;tag=needs-update
