Web lists-archives.com

Re: CI system maintainability




On Thursday, 28 March 2019 09:50:47 CET Kevin Ottens wrote:
> Hello,
> 
> On Thursday, 28 March 2019 09:41:29 CET Luca Beltrame wrote:
> > In data giovedì 28 marzo 2019 09:29:22 CET, Kevin Ottens ha scritto:
> > > at your screen or pair with you" in the past. Clearly this compromise
> > > gets
> > > somewhat exploited and that's especially bad in the case of a fragile
> > > and
> > > central component like KDE PIM.
> > 
> > I'm not sure I agree. I can't speak for seasoned developers, but I've
> > found
> > myself in a situation (more than once) where the fix is trivial (compile
> > error, missing ";", etc) and being forced to go through review would (IMO)
> > unnecessarily raise friction.
> 
> I don't fully disagree with this (although I'd have stories on nefarious
> typos even for what was supposed to be a "trivial fix"). But it becomes a
> question of trade-off, IOW "how hurtful to the project mandatory reviews?"
> vs "how hurtful to the project a central component being very regularly
> broken?".
> 
> I'd argue we're loosing more with the current state of PIM than we'd loose
> with mandatory reviews.

Review all relevant changes is nice in theory, but for this to work you need 
at least two people spending comparable amount of time on this. I wish we had 
that luxury in KDE PIM.

It works to some extend for Akonadi between Dan and David, I don't see it 
working for Laurent on KMail or for me on KItinerary, there's simply not 
enough review bandwidth there.

Also, when looking at such drastic changes we should keep in mind the amount 
of changes that go in without trouble. There's always the risk of breakage 
when we change stuff with the best intentions of improving things, we need to 
deal with that no matter how many people review a change (see the nasty 
transaction lock regression in 18.12.x Akonadi).

What failed in this specific case is not the lack of review IMHO, I'm not sure 
I would have spotted the issue from the diff. It's also not that nobody cared, 
or that people ignored the issue. The underlying problem was fixed within 62 
minutes of its occurrence according to the Git log, it was however forgotten 
to push this to the 19.04 branch.

This resulted in a single build failure in the stable build for kcontacts, 
which I (and I guess others too) did not immediately act on. That does not 
mean it's being ignored, but for a change I did not do myself the overwhelming 
majority of failures is transient (as either the author fixes it upon being 
notified, or it's a dependency issue that will resolve itself with the next 
build).

That single build failure resulted in the dependent builds failing, which 
again I did not immediately act on as the error message made me believe it's 
the a dependency issue that will resolve itself. Combined with the fact that 
this is in the stable branch, which is less active than master, I had indeed 
not realized we have a persistent issue here that nobody else felt responsible 
for until I saw this email. At that point Laurent had already fixed it btw, 
which was a matter of a simple cherry-pick from master. If I missed other 
earlier communication about this somehow, I apologize of course.

So, yes, things went wrong and it's a valid question how to improve this going 
forward. But rather than questioning the usefulness of the CI or the entire 
development workflow, how about just simply pinging the people who feel 
responsible for the affected repos? It's not like we miss every single build 
issue after all. I simply might not always manage to keep the state of 120+ 
repos in various configurations in my head, and which of those needs most 
urgent attention (I for example broke half of binary factory's Android builds 
with a kpackage change recently, despite review, and yet have to find the time 
for a proper fix for that).

Regards,
Volker

Attachment: signature.asc
Description: This is a digitally signed message part.