Open Bug 1664462 Opened 5 years ago Updated 5 years ago

Don't run source-test-python-mozbuild-* (mbu) on all platforms

Categories

(Firefox Build System :: General, enhancement, P5)

Tracking

(Not tracked)

People

(Reporter: padenot, Unassigned)

References

(Blocks 1 open bug)

Details

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4f9cb061516043490fb87e68661f6bfa463dce31 shows that these jobs run on many platforms and account for a large share of the push's overall CPU time.

Product: Firefox → Firefox Build System

As worded -- "Don't run source-test-python-mozbuild-* (mbu) on all platforms" -- this bug is WONTFIX, imo. The mozbuild suite tests fundamental functionality that, in my (admittedly anecdotal) experience, 1) when broken, strongly suggests other things are probably broken as well, and 2) quite frequently breaks on only one platform without affecting the others. Tests meeting these two criteria are exactly the tests that SHOULD run on all platforms for a large set of pushes, if not all of them.

These tests weren't run on your push arbitrarily; they were run because your push touches modules/libpref/init, which triggers them -- see https://searchfox.org/mozilla-central/rev/eb9d5c97927aea75f0c8e38bbc5b5d288099e687/taskcluster/ci/source-test/python.yml#372. (Full disclosure, I added this code. :) )
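For context, the triggering mechanism is taskcluster's files-changed optimization: a source-test task lists path patterns, and the task is scheduled whenever a push touches a matching file. A simplified, illustrative sketch of such an entry (the task name, platforms, and exact paths here are placeholders, not the precise in-tree definition at the link above):

```yaml
# Illustrative sketch of a taskcluster source-test entry using the
# files-changed optimization; names and paths are placeholders.
mozbuild:
    description: python/mozbuild unit tests
    platform:
        - linux1804-64/opt
        - macosx1014-64/opt
        - windows10-64/opt
    when:
        files-changed:
            - 'python/mozbuild/**'
            # Pref definitions feed build-time code generation, so
            # changes here also schedule the mozbuild suite:
            - 'modules/libpref/init/**'
```

Any push touching a file under one of those patterns gets the task scheduled on every listed platform, which is why a libpref change lit up mozbuild jobs across the board.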

If we relax the actual feature request named in this bug: the mozbuild suite is indeed pretty monolithic and could maybe stand to be split up a bit, but I'm not confident about how to do that immediately without impacting the correctness of the tests or having an unexpected negative effect on cost.

Maybe it goes without saying, but the mozbuild suite takes so long to run because of a few long-tail tests; the large majority of the suite is unit tests that take very little time to run. Addressing those long-tail tests specifically (by optimizing them, removing them, or splitting them off into another suite) is a legitimate approach as well.

What about running them on Linux normally, and more rarely only on Windows and Mac?

I would point again to something I've already said:

> The mozbuild suite tests fundamental functionality that in my (admittedly anecdotal) experience... quite frequently breaks on only one platform without affecting another

If that's the case, then running the tests on all platforms is important and something that we should continue doing.

Again, this is anecdotal evidence and I don't have numbers to back this up... but then again, nobody else in this thread has brought up anything but anecdotal evidence either, so I don't know why we should take on a "cost-saving" measure on the back of someone else's anecdotal evidence over mine.

(In reply to Ricky Stewart from comment #4)

> I would point again to something I've already said:
>
> > The mozbuild suite tests fundamental functionality that in my (admittedly anecdotal) experience... quite frequently breaks on only one platform without affecting another
>
> If that's the case, then running the tests on all platforms is important and something that we should continue doing.
>
> Again, this is anecdotal evidence and I don't have numbers to back this up... but then again, nobody else in this thread has brought up anything but anecdotal evidence either, so I don't know why we should take on a "cost-saving" measure on the back of someone else's anecdotal evidence over mine.

Just to be clear, I'm not suggesting that we stop running the tests on all platforms. I'm suggesting running them more rarely, as we've done in many other cases.
In those other cases, even though we knew we were introducing the risk of noticing a regression somewhat later than landing time, we decided to accept that risk in exchange for cost savings.
In this case, the risk of running less frequently is likely lower than in some of those other cases where we decided the risk was worth it.

Anyway, I'll check what the actual risk is by counting how often these tests fail on autoland on Windows or Mac but not on Linux. We usually do that before making a decision.

Severity: -- → S3
Priority: -- → P5

According to the data I have from autoland, these jobs fail on Windows/Mac but not on Linux very infrequently (twice out of ~900 runs).
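For scale, two platform-specific failures out of roughly 900 runs is a failure rate on the order of 0.2%; a quick sanity check of that arithmetic:

```python
# Quick arithmetic check: rate of Windows/Mac-only failures on autoland.
# The counts (2 failures, ~900 runs) come from the comment above.
failures_windows_mac_only = 2
total_runs = 900

rate = failures_windows_mac_only / total_runs
print(f"{rate:.2%}")  # prints "0.22%"
```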

Yeah, I mean, I was talking about try (not autoland), like the OP was. But that's useful data to have as well. When we make these decisions do we always only look at autoland?

(In reply to Ricky Stewart from comment #7)

> Yeah, I mean, I was talking about try (not autoland), like the OP was. But that's useful data to have as well. When we make these decisions do we always only look at autoland?

It is not always representative, but so far yes.

Another reasonable approach is bug 1638395 (no manual decisions; instead rely on the same scheduling algorithm we use for test tasks).
