Closed Bug 1065501 Opened 5 years ago Closed 4 years ago
Un-hide debug Gaia python integration tests when they meet visibility standards
Gip on Linux64 debug is very timeout-prone (~20% of the time). It does so in completely random tests, making it nearly impossible to file useful orange bugs. I'm not sure who from the B2G team would even own fixing things up at this point, but I've hidden debug Gip on all trees where it currently runs until it can be made to pass reliably.
I have a feeling that we have some low-level issue that could be impacting Mnw, Gip, and Gij (bug 1037924). I will ping a few people to see if they have any ideas. I would be *so happy* if the same fix took care of this and bug 1037924.
See Also: → 1037924
Product: Webtools → Tree Management
Hi Geo, Do you know who can help to fix Gaia python integration tests? Thanks!
(In reply to Josh Cheng [:josh] from comment #3)
> Hi Geo,
> Do you know who can help to fix Gaia python integration tests?
> Thanks!

I'm switching the NI to John Dorlus, who has GIP as one of his areas of focus. The short answer is that we're actually trying to retire the Python Integration Tests, so--especially if there's a hairy low-level issue of some kind--it's more likely we'd just want to wontfix this.
Flags: needinfo?(gmealer) → needinfo?(jdorlus)
Hi Geo, Hi John, Do you know when we plan to officially retire GIP as a whole, so that we can WONTFIX this? Thanks!
(In reply to Josh Cheng [:josh] from comment #5)
> Hi Geo, Hi John,
> Do you know when we plan to officially retire GIP as a whole, so that we can
> WONTFIX this?
> Thanks!

I've got that discussion scheduled with John today already, but I plan on putting a request in to hide Gip across all platforms this week. Once we raise an acceptance run on our side we'll remove it completely. After I talk to John, we may go ahead and WONTFIX this one immediately.
How did that meeting go? Since the last update, we've ignored that a patch made a Gip test permaorange, merged it around to every trunk tree, noticed it was permaorange, backed it out of every trunk tree, and then despite that, as the only person who stars Gip failures I've decided that because of this bug I should just star all of them as "expected fail" without taking the time to look at them. It's already dead, let's kick it into the grave.
Gip will not disappear when it's gone from Treeherder, it just will mean that developers won't pay attention to those results anymore (because they are not on Treeherder). They will still run on Jenkins (Flame device automation) and presumably, we're going to kick off B2G desktop tests on Jenkins too, if Gip is going to be disabled on Treeherder.
And fwiw, I was trying to drive through the fix for bug 1068094, which would hopefully fix lots of these intermittent failures on Gip, but it has been quite difficult communicating that.
Maybe I'm missing something, but it seems to me that if you actually want to keep the suite alive, the path forward is quite clear and reasonably simple:

* Disable, disable, disable. I just looked at Gip(a) for the first time since I hid it, and from a brief look it looks like you could have it back visible just by disabling test_a11y_ftu_desktopb2g.py TestFtuAccessibility.test_a11y_ftu and test_a11y_cards_view.py TestCardsViewAccessibility.test_a11y_cards_view. Disable one test in f9, one in f10, one in f11, and maybe two or three in u, and you go from looking like an unowned suite that should be hidden and then shut off to a perfectly healthy suite. Those tests aren't "still doing some good by passing sometimes and thus showing they haven't been completely broken," they are doing immense harm by showing that nobody really cares whether they pass or fail or fail to run at all, and that nobody really cares how much work they cause by constantly failing, and that probably nobody really cares about additional bustage to other tests.

* Drop that horrible idea of running up to three times in the case of intermittent failure. If you want your tests to be treated as real tests that should be expected to pass, clearly saying "I don't really expect this stuff to pass more often than once in three times" isn't getting that idea across. The sheriffs *know* that there are intermittent failures which they have never seen, so any time something fails so often that it manages to get up to turning something orange, they aren't going to think bustage, they're going to think intermittent and coincidence.
Well, I don't look at the accessibility tests, and I wasn't supposed to look at those. I don't know who is. I don't know who decided to run it three times, and I don't think it's necessarily a good idea, either.
WONTFIX in favor of bug 1180903.
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WONTFIX