Assertion failure: !assm->error() regression during Mochitest "test_webgl_conformance_test_suite.html"

RESOLVED FIXED

Status

Core
JavaScript Engine
6 years ago
6 years ago

People

(Reporter: drs, Unassigned)

Tracking

(Blocks: 1 bug, {assertion, regression})

Trunk
x86_64
Linux

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: js-triage-done)

Attachments

(1 attachment)

(Reporter)

Description

6 years ago
Steps to repro:

1) Build from mozilla-central
2) Run the following from the objdir on Linux (I haven't checked other platforms, but it may happen there too):
TEST_PATH=content/canvas/test/webgl make mochitest-plain

I don't think this happens on the TBPL builds, but locally, when I run the WebGL Mochitests, the whole run dies around the 3rd test with the following console output:

WebGL mochitest: starting page conformance/array-unit-tests.html
++DOMWINDOW == 18 (0x2aeb0383b878) [serial = 18] [outer = 0x2aeb03fdc400]
++DOMWINDOW == 19 (0x2aaaab90bc78) [serial = 19] [outer = 0x2aeb03fdc400]
Assertion failure: !assm->error(), at /home/dsherk/builds/mozilla-central4/js/src/jstracer.cpp:4666
TEST-UNEXPECTED-FAIL | automation.py | Exited with code -6 during test run
INFO | automation.py | Application ran for: 0:00:09.461703
INFO | automation.py | Reading PID log: /tmp/tmpt0yzG5pidlog

== BloatView: ALL (cumulative) LEAK AND BLOAT STATISTICS, default process 32465

     |<----------------Class--------------->|<-----Bytes------>|<----------------Objects---------------->|<--------------References-------------->|
                                              Per-Inst   Leaked    Total      Rem      Mean       StdDev     Total      Rem      Mean       StdDev
   0 TOTAL                                          44        0       10        0 (    1.20 +/-     0.70)       16        0 (    1.56 +/-     0.84)

nsTraceRefcntImpl::DumpStatistics: 2 entries
TEST-PASS | automationutils.processLeakLog() | no leaks detected!

INFO | runtests.py | Running tests: end.
make: *** [mochitest-plain] Error 250
Doug says this test worked for him in a build from ~3 weeks ago, so this seems to have regressed sometime in the last 3 weeks.
Keywords: assertion, regression
Hardware: x86 → x86_64
Summary: Assertion failure: !assm->error() regression on Mochitests → Assertion failure: !assm->error() regression during Mochitest "test_webgl_conformance_test_suite.html"
With some targeted builds, I've narrowed this (slightly) to this 10-day range:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=a61a0b7927ed&tochange=f4d78560721a
Er, that's a 13-day range, not a 10-day one.
Narrowed further:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=b7d269a291b6&tochange=54f6877c35a7
The first affected revision is: https://hg.mozilla.org/mozilla-central/rev/e7525561c309
e7525561c309	Josh Matthews — Bug 681392 - Remove about: exclusion from SpecialPowers creation. r=ted

I confirmed that a targeted backout of that cset will fix this in an up-to-date build, too.

That's a tweak to specialpowers.js (which is used in mochitests). It just exposed the bug; it's of course not the underlying JS-engine cause of the bug.
This isn't a recent regression in the JS engine, FWIW -- I was able to reproduce it in a build from July, based off of this cset:
   https://hg.mozilla.org/mozilla-central/pushloghtml?changeset=b529ffc1012b
with the specialpowers.js cset from comment 5 applied manually.
Blocks: 514067
Whiteboard: js-triage-needed
This doesn't repro on Windows, and it doesn't repro on Linux in a debugger. The assertion is pretty much a can't-happen (it asserts !assm->error() very soon after guarding on assm->error()). These facts together make me suspect some kind of memory corruption from WebGL. But if not, it's a tracejit bug, and we plan on disabling the tracejit soon. So I'm not sure any action is required here.
Whiteboard: js-triage-needed → js-triage-done
(In reply to David Mandelin from comment #7)
> and it doesn't repro on Linux in a debugger.

Repros just fine for me in a debugger.  I've got it caught in a debugger right now, if you'd like to take a look at all.
Created attachment 566370 [details]
backtrace

Here's the backtrace from GDB, when I hit the assertion-failure/abort.
(In reply to David Mandelin from comment #7)
> But if not, it's a tracejit bug,
> and we plan on disabling the tracejit soon. So I'm not sure any action is
> required here.

(ah, ok.  I'm happy to let you investigate it in a debugger on my machine at some point if you like.  Otherwise, it sounds like it's not a big deal, if tracejit is being disabled soon.)
FWIW, the assertion is:
>    Assembler *assm = traceMonitor->assembler;
>    JS_ASSERT(!assm->error());

When this fails, the value of assm->error() is "nanojit::BranchTooFar"
Thanks to Dan we figured out what this is. It's almost certainly a tracer bug where some paths that exit on assm->error() don't clear the error. So, it doesn't seem to be a WebGL bug. And the assertion is harmless. And it will go away when the tracer is pref'd off by default, which is tonight.
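To illustrate the failure mode described above, here is a minimal sketch of a sticky error flag that one bail-out path forgets to clear. Everything in it is hypothetical: `ToyAssembler`, `AsmError`, `compileBuggy`, `compileFixed`, `readyToRecord`, and the branch-range limit are made-up stand-ins and not the real nanojit or jstracer code. It only shows why a later `!assm->error()` check can fail even though the earlier code guarded on the error.

```cpp
#include <cassert>

// Hypothetical stand-ins; these are NOT the real nanojit/jstracer types.
enum class AsmError { None, BranchTooFar };

struct ToyAssembler {
    AsmError err = AsmError::None;

    AsmError error() const { return err; }
    void clearError() { err = AsmError::None; }

    // Pretend to emit a branch. The range limit is an arbitrary assumption
    // chosen only so that the error can be triggered in this sketch.
    void emitBranch(long displacement) {
        if (displacement > 32767 || displacement < -32768)
            err = AsmError::BranchTooFar;
    }
};

// Buggy path: guards on the error and bails out, but leaves the flag set,
// so the assembler is still "in error" when it is next used.
bool compileBuggy(ToyAssembler& assm, long displacement) {
    assm.emitBranch(displacement);
    if (assm.error() != AsmError::None)
        return false;  // error is never cleared on this exit
    return true;
}

// Fixed path: same guard, but the error is cleared before bailing out.
bool compileFixed(ToyAssembler& assm, long displacement) {
    assm.emitBranch(displacement);
    if (assm.error() != AsmError::None) {
        assm.clearError();
        return false;
    }
    return true;
}

// Mirrors the asserted precondition: the next use of the assembler
// expects to find no pending error.
bool readyToRecord(const ToyAssembler& assm) {
    return assm.error() == AsmError::None;
}
```

In this sketch, after `compileBuggy` fails on an out-of-range branch, `readyToRecord` returns false, which corresponds to the assertion firing on the next use. After `compileFixed` fails in the same way, `readyToRecord` returns true.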
Depends on: 693815

Comment 13

6 years ago
FWIW: I see this with a fresh pull from m-c (which includes the fix for bug 693815) in an xpcshell test on 64-bit Linux. The exact output is:

Assertion failure: !assm->error(), at /home/bjarne/Work/Mozilla/Trunk/mozilla-central/js/src/jstracer.cpp:4666

The test is part of a large, unfinished benchmark for the http-cache. I can email it directly if someone wants to reproduce...

Comment 14

6 years ago
Fwiw (again), I still see this on a clobber-build on m-c freshly updated today.
Me too (in webgl mochitests, using the command from comment 0). Apparently jstracer isn't disabled after all, even though bug 693815 has been resolved for nearly a week?

Comment 16

6 years ago
My particular issue seems to be fixed now.
Makes sense; a more thorough fix for bug 693815 (disabling the tracer) landed on m-c early this morning.

Resolving as FIXED by that bug.
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED