Closed Bug 1075787 Opened 9 years ago Closed 9 years ago

Gij failures are currently masked and reported as successful

Categories

(Release Engineering :: Applications: MozharnessCore, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: kgrandon, Assigned: jgriffin)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

Since splitting Gij, it seems that failures can currently be hidden.

I'm seeing runs like this: https://tbpl.mozilla.org/?tree=B2g-Inbound&showall=1&rev=cdc2bda1a15c

Clicking on the 'Gij' result will clearly show test failures, but yet it's reported as green. We need to fix this asap otherwise we may quickly regress a lot of tests. We may consider closing gaia in the meantime.
This problem occurs because mozharness uses the exit code of the process to determine the overall pass/failed status, and apparently Gij always exits with 0.

We can fix this on the mozharness side by making it take the number of passed/failed tests into account.
Assignee: nobody → jgriffin
When this lands, it will cause Gij to go perma-orange, unless the currently failing tests are fixed or hidden by the time this happens.
Thanks Jonathan! I'm currently working my way through the perma-failures. Perma-orange is much more preferable to us than the masked failures. Thanks for knocking this out so quickly.
Comment on attachment 8498407 [details] [diff] [review]
Make Gij take passed/failed test counts into account when setting tbpl status,

Review of attachment 8498407 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozharness/mozilla/testing/gaia_test.py
@@ +251,4 @@
>          """
>          pass
>  
> +    def publish(self, code, passed=None, failed=None):

Does this ever get called without passed/failed? If not, might as well make them required parameters.
Attachment #8498407 - Flags: review?(ahalberstadt) → review+
(In reply to Andrew Halberstadt [:ahal] from comment #5)
> Comment on attachment 8498407 [details] [diff] [review]
> Make Gij take passed/failed test counts into account when setting tbpl
> status,
> 
> Review of attachment 8498407 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: mozharness/mozilla/testing/gaia_test.py
> @@ +251,4 @@
> >          """
> >          pass
> >  
> > +    def publish(self, code, passed=None, failed=None):
> 
> Does this ever get called without passed/failed? If not, might as well make
> them required parameters.

It does...i.e., by gaia_unit tests (which don't need it, since they exit with the appropriate exit code).
Comment on attachment 8498407 [details] [diff] [review]
Make Gij take passed/failed test counts into account when setting tbpl status,

https://hg.mozilla.org/build/mozharness/rev/7030a62bf270

I'm going to retrigger on cedar to make sure this doesn't have any unintended side-effects before pushing to production.
Retrigger was good, pushed to production: https://hg.mozilla.org/build/mozharness/rev/244abc6524fa
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
We should also fix the harness returning zero if there were failures; ideally we'll move away from manually checking test run counts in the future. I'll file another bug for this.

Moving to releng, since TBPL only displays the results given to it by buildbot, and as such this was fixed by a mozharness patch (though curious how the changes in bug 1046694 made us regress this behaviour?).
Blocks: 1046694
No longer blocks: 1067892
Component: Tinderboxpushlog → Mozharness
OS: Mac OS X → All
Product: Webtools → Release Engineering
QA Contact: jlund
Hardware: x86 → All
Version: Trunk → unspecified
Depends on: 1076858
(In reply to Ed Morley [:edmorley] from comment #9)
> We should also fix the harness returning zero if there were failures;
> ideally we'll move away from manually checking test run counts in the
> future. I'll file another bug for this.
> 
> Moving to releng, since TBPL only displays the results given to it by
> buildbot, and as such this was fixed by a mozharness patch (though curious
> how the changes in bug 1046694 made us regress this behaviour?).

I've looked at bug 1046694 and don't believe it caused this regression.  More likely, Gij used to return a non-zero exit code on failure, but at some point stopped doing that.
Ah I was basing that from the timing and also the OP blocking bug 1067892 (which was the TBPL part for that bug).
No longer blocks: 1046694
I think that some of the work that went into splitting the Gij test suite surfaced this. I'm not sure exactly which bug surfaced it though. Apologies if I was wrong.
Merged to production, and deployed.
You need to log in before you can comment on or make changes to this bug.