Closed Bug 1521640 Opened 6 years ago Closed 5 years ago

Assertion mismatch did not cause test task to fail (Android, crashtest)

Categories

(Firefox for Android Graveyard :: Testing, defect, P1)

Tracking

(firefox71 fixed)

RESOLVED FIXED
Firefox 71

People

(Reporter: gbrown, Assigned: gbrown)

Attachments

(2 files)

I noticed this task was green despite:

https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=222783463&repo=autoland&lineNumber=1786

[task 2019-01-19T00:09:03.375Z] 00:09:03 INFO - REFTEST TEST-UNEXPECTED-FAIL | http://10.0.2.2:8854/tests/layout/generic/crashtests/1458028.html | assertion count 9 is more than expected 6 assertions

https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=222783463&repo=autoland&lineNumber=1839

[task 2019-01-19T00:09:45.094Z] 00:09:45 INFO - REFTEST TEST-UNEXPECTED-FAIL | http://10.0.2.2:8854/tests/layout/generic/crashtests/1488762-1.html | assertion count 14 is more than expected 9 assertions

...

[task 2019-01-19T00:37:36.603Z] 00:37:36 INFO - REFTEST INFO | Result summary:
[task 2019-01-19T00:37:36.603Z] 00:37:36 INFO - REFTEST INFO | Successful: 364 (0 pass, 364 load only)
[task 2019-01-19T00:37:36.604Z] 00:37:36 INFO - REFTEST INFO | Unexpected: 0 (0 unexpected fail, 0 unexpected pass, 0 unexpected asserts, 0 failed load, 0 exception)
[task 2019-01-19T00:37:36.604Z] 00:37:36 INFO - REFTEST INFO | Known problems: 2 (0 known fail, 0 known asserts, 0 random, 2 skipped, 0 slow)
[task 2019-01-19T00:37:36.604Z] 00:37:36 INFO - REFTEST SUITE-END | Shutdown

Priority: -- → P3
Assignee: nobody → gbrown

Adding "see also" for bug 1321127 where we had a similar issue in the past.

Also, as discussed in dupe bug 1585652, we seem to have accumulated ~13 crashtests that have more/fewer assertions than expected on Android. As part of fixing this success-vs-failure classification bug, we'll need to adjust those crashtests' asserts() ranges to reflect reality (assuming the ranges are stable), or else they'll become perma-orange.
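To make the range adjustment concrete, here is a minimal sketch of classifying an observed assertion count against an expected range. The helper name `classify_assertions` and its return labels are hypothetical and are not the reftest harness's API. The 6-9 range in the second call is an invented illustration, not a value taken from the actual patch.

```python
def classify_assertions(count, min_expected, max_expected):
    """Classify an observed assertion count against an expected range.

    Hypothetical helper for illustration only; not the reftest harness API.
    """
    if min_expected > max_expected:
        raise ValueError("min_expected must not exceed max_expected")
    if count < min_expected:
        return "fewer-than-expected"
    if count > max_expected:
        return "more-than-expected"
    return "within-range"


# First failure quoted in this bug: 9 assertions against an expected 6.
print(classify_assertions(9, 6, 6))
# With a hypothetical widened range of 6-9, the same count would be accepted.
print(classify_assertions(9, 6, 9))
```

A range only helps if the observed counts are stable. A test whose count drifts outside any reasonable range would still fail once mismatches are enforced.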

See Also: → 1321127

https://treeherder.mozilla.org/#/jobs?repo=try&tier=1%2C2%2C3&revision=2e579530868e033931e1e67cdf76054217041eb1

What's different between, say, Linux and Android?

Linux:

[task 2019-10-02T20:48:09.362Z] 20:48:09     INFO - REFTEST INFO | Result summary:
[task 2019-10-02T20:48:09.362Z] 20:48:09     INFO - REFTEST INFO | Successful: 3821 (3 pass, 3818 load only)
[task 2019-10-02T20:48:09.363Z] 20:48:09     INFO - REFTEST INFO | Unexpected: 0 (0 unexpected fail, 0 unexpected pass, 0 unexpected asserts, 0 failed load, 0 exception)
[task 2019-10-02T20:48:09.363Z] 20:48:09     INFO - REFTEST INFO | Known problems: 11 (0 known fail, 0 known asserts, 0 random, 11 skipped, 0 slow)
[task 2019-10-02T20:48:09.363Z] 20:48:09     INFO - REFTEST SUITE-END | Shutdown
[task 2019-10-02T20:48:09.363Z] 20:48:09     INFO - REFTEST INFO | Slowest test took 25669ms (file:///builds/worker/workspace/build/tests/reftest/tests/dom/svg/crashtests/1507961-1.html)
[task 2019-10-02T20:48:09.363Z] 20:48:09     INFO - REFTEST INFO | Total canvas count = 2
...
[task 2019-10-02T20:48:13.113Z] 20:48:13     INFO - REFTEST INFO | Process mode: e10s
[task 2019-10-02T20:48:13.113Z] 20:48:13  WARNING - leakcheck | refcount logging is off, so leaks can't be detected!
[task 2019-10-02T20:48:16.164Z] 20:48:16     INFO - Return code: 0
[task 2019-10-02T20:48:16.172Z] 20:48:16    ERROR - Got 1 unexpected statuses
[task 2019-10-02T20:48:16.172Z] 20:48:16     INFO - TinderboxPrint: reftest-crashtest<br/>11473/<em class="testfail">1</em>/38
[task 2019-10-02T20:48:16.173Z] 20:48:16  WARNING - # TBPL WARNING #
[task 2019-10-02T20:48:16.173Z] 20:48:16  WARNING - setting return code to 1
[task 2019-10-02T20:48:16.173Z] 20:48:16  WARNING - The reftest suite: crashtest ran with return status: WARNING
...
[task 2019-10-02T20:48:16.175Z] 20:48:16     INFO - [mozharness: 2019-10-02 20:48:16.174682Z] Finished run-tests step (success)
...
[task 2019-10-02T20:48:16.410Z] 20:48:16  WARNING - returning nonzero exit status 1
...
[task 2019-10-02T20:48:16.596Z] + exit 1
[taskcluster 2019-10-02 20:48:16.925Z] === Task Finished ===
[taskcluster 2019-10-02 20:48:21.527Z] Unsuccessful task run with exit code: 1 completed in 857.366 seconds

Android:

[task 2019-10-02T20:48:40.460Z] 20:48:40     INFO -  REFTEST INFO | Result summary:
[task 2019-10-02T20:48:40.461Z] 20:48:40     INFO -  REFTEST INFO | Successful: 3801 (1 pass, 3800 load only)
[task 2019-10-02T20:48:40.461Z] 20:48:40     INFO -  REFTEST INFO | Unexpected: 0 (0 unexpected fail, 0 unexpected pass, 0 unexpected asserts, 0 failed load, 0 exception)
[task 2019-10-02T20:48:40.461Z] 20:48:40     INFO -  REFTEST INFO | Known problems: 31 (0 known fail, 0 known asserts, 0 random, 31 skipped, 0 slow)
[task 2019-10-02T20:48:40.461Z] 20:48:40     INFO -  REFTEST SUITE-END | Shutdown
[task 2019-10-02T20:48:40.461Z] 20:48:40     INFO -  REFTEST INFO | Slowest test took 13038ms (http://10.0.2.2:8854/tests/gfx/tests/crashtests/306649-1.xml)
[task 2019-10-02T20:48:40.461Z] 20:48:40     INFO -  REFTEST INFO | Total canvas count = 2
...
[task 2019-10-02T20:48:44.121Z] 20:48:44     INFO - Return code: 0
[task 2019-10-02T20:48:44.121Z] 20:48:44     INFO - TinderboxPrint: crashtest<br/>3801/0/31
[task 2019-10-02T20:48:44.121Z] 20:48:44     INFO - ##### crashtest log ends
[task 2019-10-02T20:48:44.121Z] 20:48:44     INFO - # TBPL SUCCESS #
[task 2019-10-02T20:48:44.121Z] 20:48:44     INFO - The crashtest suite: crashtest ran with return status: SUCCESS
...
[task 2019-10-02T20:48:45.697Z] 20:48:45     INFO - [mozharness: 2019-10-02 20:48:45.697405Z] Finished run-tests step (success)
...
[task 2019-10-02T20:48:46.167Z] + exit 0
[taskcluster 2019-10-02 20:48:46.658Z] === Task Finished ===
[taskcluster 2019-10-02 20:48:51.825Z] Successful task run with exit code: 0 completed in 708.264 seconds

Linux:
[task 2019-10-02T20:35:28.526Z] 20:35:28 INFO - Structured output parser in use for reftest.

That line is missing from the Android log, so it looks like the Android output is being parsed as unstructured.
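As a rough illustration of the mismatch, the sketch below scans log lines for `TEST-UNEXPECTED-` markers and compares that with the total in the `Unexpected:` summary line. It is not mozharness's actual parser. The function names `scan_log` and `should_fail` and both regular expressions are assumptions made for this example, and the sample lines are copied from the log excerpts in the original report, with the task timestamps trimmed.

```python
import re

# Patterns are assumptions based on the log excerpts quoted in this bug.
UNEXPECTED_RE = re.compile(r"TEST-UNEXPECTED-(\w+)")
SUMMARY_RE = re.compile(r"REFTEST INFO \| Unexpected: (\d+)")


def scan_log(lines):
    """Return (unexpected_lines_seen, unexpected_total_in_summary).

    The second value is None when no summary line is present.
    """
    seen = 0
    summary = None
    for line in lines:
        if UNEXPECTED_RE.search(line):
            seen += 1
        match = SUMMARY_RE.search(line)
        if match:
            summary = int(match.group(1))
    return seen, summary


def should_fail(lines):
    """Fail if any unexpected result was logged, whatever the summary says."""
    seen, summary = scan_log(lines)
    return seen > 0 or (summary or 0) > 0


sample = [
    "REFTEST TEST-UNEXPECTED-FAIL | http://10.0.2.2:8854/tests/layout/generic/"
    "crashtests/1458028.html | assertion count 9 is more than expected 6 assertions",
    "REFTEST INFO | Unexpected: 0 (0 unexpected fail, 0 unexpected pass, "
    "0 unexpected asserts, 0 failed load, 0 exception)",
]
print(scan_log(sample))
print(should_fail(sample))
```

On the sample, one unexpected line is seen while the summary reports zero. That is the disagreement that let the Android task stay green.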

Priority: P3 → P1

Avoid assertion range mismatch errors in Android crashtests.

See Also: → 1586383

There are multiple issues here. I have a solution for one, am making progress on another, and have a patch for updating the assertion ranges. I hope to have a full solution this week.

Depends on: 1587139

These assertion counts were removed (accidentally?) by bug 1321127. As a
result, the reftest harness effectively stopped tracking assertion count
mismatches and instead relied on mozharness to fail tasks based on the
logged error messages. Restoring the counts ensures:

  • the reftest summary includes accurate assertion counts like
    REFTEST INFO | Unexpected: 12 (..., 11 unexpected asserts, ...)
    REFTEST INFO | Known problems: 64 (..., 31 known asserts, ...)
  • assertion mismatches cause the harness to exit with an error code so
    that the job fails even if the log parsing is broken (bug 1587139)
    or the tests are being run locally with mach.
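
The two guarantees above can be sketched as a small tally whose result drives the exit code. This is a minimal illustration and not the harness's implementation. The class `AssertionTracker`, its method names, and the summary format are hypothetical. Counting an in-range, non-zero result as a "known assert" is an assumption about the intended semantics.

```python
from collections import Counter


class AssertionTracker:
    """Hypothetical tally of assertion-count outcomes; not the harness's code."""

    def __init__(self):
        self.counts = Counter()

    def record(self, observed, min_expected, max_expected):
        """Record one test's observed assertion count against its range."""
        if observed < min_expected or observed > max_expected:
            self.counts["unexpected asserts"] += 1
        elif observed > 0:
            # Assumption: in-range, non-zero counts are "known asserts".
            self.counts["known asserts"] += 1

    def summary(self):
        return "unexpected asserts: {}, known asserts: {}".format(
            self.counts["unexpected asserts"], self.counts["known asserts"]
        )

    def exit_code(self):
        """Non-zero when any mismatch was seen, independent of log parsing."""
        return 1 if self.counts["unexpected asserts"] else 0


tracker = AssertionTracker()
tracker.record(9, 6, 6)    # from the report: 9 observed, 6 expected
tracker.record(14, 9, 9)   # from the report: 14 observed, 9 expected
tracker.record(3, 0, 5)    # hypothetical in-range result
print(tracker.summary())
print(tracker.exit_code())  # a real runner would pass this to sys.exit()
```

Because the exit code comes from the tally itself, a mismatch fails the run even when nothing downstream parses the log, which is the local `mach` case mentioned above.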

Pushed by gbrown@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/dc242b12d4e2
Update android crashtest assertion ranges; r=geckoview-reviewers,snorp

Pushed by gbrown@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f910bd336d93
Track assertion counts in reftest harness; r=jgraham

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 71

Product: Firefox for Android → Firefox for Android Graveyard