Closed Bug 1267020 Opened 4 years ago Closed 4 years ago

Assertions from Android reftests/crashtests don't appear to be getting reported back to the harness

Categories

(Testing :: Reftest, defect, critical)

Version 3
Unspecified
Android
defect
Not set
critical

Tracking

(firefox48 fixed, firefox49 fixed)

RESOLVED FIXED
mozilla49
Tracking Status
firefox48 --- fixed
firefox49 --- fixed

People

(Reporter: RyanVM, Assigned: gbrown)

References

Details

Attachments

(1 file)

See this log:
http://archive.mozilla.org/pub/mobile/try-builds/ryanvm@gmail.com-0850cd5129678911dd6f238a0ba06923754c6f41/try-android-api-15-debug/try_ubuntu64_vm_armv7_large-debug_test-crashtest-10-bm121-tests1-linux64-build60.txt.gz

The run was marked as failed by buildbot. Looking at the log, the following line appears:
10:19:32  WARNING -  REFTEST INFO | Unexpected: 1 (0 unexpected fail, 0 unexpected pass, 1 unexpected asserts, 0 failed load, 0 exception)

However, no individual tests have a TEST-UNEXPECTED-FAIL for exceeding their assertion allowance.

Looking at the logcat, it turns out that 464407-1.xhtml is our culprit:
http://mozilla-releng-blobs.s3.amazonaws.com/blobs/try/sha512/1ce7b789f8c4009038dff182ba17d33cc179aec7881d0eded94e56512f455e6f02563aac5bdf3460e08b289d55aaded18738b4f493bd914a65179f5799df63ff

REFTEST TEST-UNEXPECTED-FAIL | http://10.0.2.2:8854/tests/layout/xul/crashtests/464407-1.xhtml | assertion count 1 is more than expected 0 assertions

But the buildbot log doesn't show any issues?
10:04:41     INFO -  REFTEST TEST-START | http://10.0.2.2:8854/tests/layout/xul/crashtests/464407-1.xhtml
10:04:41     INFO -  REFTEST TEST-PASS | http://10.0.2.2:8854/tests/layout/xul/crashtests/464407-1.xhtml | (LOAD ONLY)
10:04:41     INFO -  REFTEST TEST-END | http://10.0.2.2:8854/tests/layout/xul/crashtests/464407-1.xhtml

This seems bad to me :)
When I tried retriggering back on the much more common C3 failure in 336744-1.html, which "we've" been starring as either "infra" or "intermittent" without a bug number for weeks now, I got back to somewhere around  what I think was April 3rd without finding a spot before it started failing like this.
Assignee: nobody → gbrown
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=20a5b1e76922&selectedJob=24087930 is probably the first naturally occurring one in crashtest-3, but that's getting back into the neighborhood of when we increased to 10 chunks of crashtest, so looking before that will require figuring out what chunk it used to be in, and may very well just find that it was changes from the increase in chunks that started it intermittently failing.

Plus this bug is actually about the failure to report unexpected assertions, not that particular one :)
I have verified that this bug is caused by the change landed in bug 1245092.

Prior to bug 1245092, the assertion mismatch - "assertion count ... is more than ..."- was logged via gDumpFn, which was implemented as a dump() + a file write (on android); 1245092 changed this to a simple dump(). I note that the log file is still opened (gLogFile) and hooked into gDumpRawLog and the StructuredLogger instance, so I don't quite understand where this is going wrong.

The same problem appears to affect other dump() calls in reftest.jsm, such as TEST-LOAD messages - http://hg.mozilla.org/mozilla-central/annotate/1c6385ae1fe7/layout/tools/reftest/reftest.jsm#l1368 - which also appear in logcat but are no longer in the test log.
...of course, all of this is going wrong on Android, but not on desktop platforms.
(In reply to Geoff Brown [:gbrown] from comment #6)
> the log file is still opened (gLogFile) and hooked into gDumpRawLog and
> the StructuredLogger instance

So it looks like everything will work on Android if StructuredLogger.log() is called. But I don't see how dump() would call into StructuredLogger.log().

:ahal -- How should this work?
Flags: needinfo?(ahalberstadt)
I noticed that :ahal may be away for a few days, so I'd like to go ahead with this good-enough-for-now fix.

This demonstrates the new logging for an unexpected assertion: https://treeherder.mozilla.org/#/jobs?repo=try&revision=8119f722aefb

I think there's a possibility that I'm missing a more elegant solution, so I'll keep the bug and needinfo open until ahal has a chance to comment.
Attachment #8745822 - Flags: review?(jmaher)
Comment on attachment 8745822 [details] [diff] [review]
ensure assertion count messages are written to log on Android

Review of attachment 8745822 [details] [diff] [review]:
-----------------------------------------------------------------

nice find
Attachment #8745822 - Flags: review?(jmaher) → review+
as a note, ahal is on pto until next month iirc.
See Also: → 1268050
https://hg.mozilla.org/mozilla-central/rev/bc92894e7890
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
Thanks for tackling this Geoff, this is the correct fix. On desktop, stdout is consumed which is how the dump messages are taken into account. On android, the logfile is used instead so they weren't being processed. Sorry for missing this.

The fact that we're using dump and bypassing the structured log protocol for this is a temporary workaround. Bug 1244739 tracks integrating assertions with structured logging properly.
Flags: needinfo?(ahalberstadt)
blech, this never got backported to other branches
See Also: → 1321127
See Also: 1321127
You need to log in before you can comment on or make changes to this bug.