Perma Linux SM(tsan) js/src/jit-test/tests/wasm/tables.js | /builds/worker/checkouts/gecko/js/src/jit-test/lib/wasm.js line 12 > WebAssembly.Module:222:1 RuntimeError: indirect call to null (code 3, args " when Gecko 79 merges to Beta on 2020-06-29
Categories
(Core :: JavaScript: WebAssembly, defect, P1)
Tracking
Flag | Tracking | Status
---|---|---
firefox-esr68 | --- | unaffected
firefox-esr78 | --- | unaffected
firefox77 | --- | unaffected
firefox78 | --- | unaffected
firefox79 | + | fixed
People
(Reporter: aryx, Assigned: rhunt)
References
(Regression)
Details
(Keywords: regression)
central-as-beta simulation: https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&resultStatus=testfailed%2Cbusted%2Cexception%2Cretry%2Cusercancel%2Crunnable&revision=41d33c6c9caa97604eb73694179e1ec0d5d426f7&selectedTaskRun=O-MiVozyT_mSxGNPNsC9sQ.0
Log: https://treeherder.mozilla.org/logviewer.html#?job_id=305909020&repo=try
[task 2020-06-11T10:25:12.056Z] TEST-PASS | js/src/jit-test/tests/wasm/streaming.js | Success (code 0, args "--ion-eager --ion-check-range-analysis --ion-extra-checks --no-sse3") [3.1 s]
[task 2020-06-11T10:25:12.066Z] /builds/worker/checkouts/gecko/js/src/jit-test/lib/wasm.js line 12 > WebAssembly.Module:222:1 RuntimeError: indirect call to null
[task 2020-06-11T10:25:12.066Z] Stack:
[task 2020-06-11T10:25:12.066Z] call@/builds/worker/checkouts/gecko/js/src/jit-test/lib/wasm.js line 12 > WebAssembly.Module:wasm-function[7]:0xde
[task 2020-06-11T10:25:12.066Z] @/builds/worker/checkouts/gecko/js/src/jit-test/tests/wasm/tables.js:235:10
[task 2020-06-11T10:25:12.066Z] Exit code: 3
[task 2020-06-11T10:25:12.066Z] FAIL - wasm/tables.js
[task 2020-06-11T10:25:12.066Z] TEST-UNEXPECTED-FAIL | js/src/jit-test/tests/wasm/tables.js | /builds/worker/checkouts/gecko/js/src/jit-test/lib/wasm.js line 12 > WebAssembly.Module:222:1 RuntimeError: indirect call to null (code 3, args "--ion-eager --ion-check-range-analysis --ion-extra-checks --no-sse3") [1.0 s]
[task 2020-06-11T10:25:12.066Z] INFO exit-status : 3
[task 2020-06-11T10:25:12.066Z] INFO timed-out : False
[task 2020-06-11T10:25:12.066Z] INFO stderr 2> /builds/worker/checkouts/gecko/js/src/jit-test/lib/wasm.js line 12 > WebAssembly.Module:222:1 RuntimeError: indirect call to null
[task 2020-06-11T10:25:12.066Z] INFO stderr 2> Stack:
[task 2020-06-11T10:25:12.066Z] INFO stderr 2> call@/builds/worker/checkouts/gecko/js/src/jit-test/lib/wasm.js line 12 > WebAssembly.Module:wasm-function[7]:0xde
[task 2020-06-11T10:25:12.066Z] INFO stderr 2> @/builds/worker/checkouts/gecko/js/src/jit-test/tests/wasm/tables.js:235:10
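For context, "indirect call to null" is the trap raised when call_indirect goes through a table slot that holds no function (never initialized, or set to a null funcref). Below is a hypothetical minimal sketch of that situation, not the actual tables.js test, assuming the SpiderMonkey shell builtins wasmTextToBinary and print are available:

```js
// Hypothetical sketch (not the failing test itself): call_indirect through an
// uninitialized table slot should trap with "indirect call to null".
// Assumes the SpiderMonkey shell builtin wasmTextToBinary.
const bytes = wasmTextToBinary(`(module
  (type $v2v (func))
  (table 2 funcref)                      ;; both slots start out null
  (func (export "call") (param i32)
    (call_indirect (type $v2v) (local.get 0))))`);
const ins = new WebAssembly.Instance(new WebAssembly.Module(bytes));
try {
  ins.exports.call(0);                   // slot 0 was never initialized
} catch (e) {
  print(e);                              // expected: RuntimeError: indirect call to null
}
```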
Reporter
Comment 1•4 years ago
Lars, can you check the push log to see what regressed this, please?
Comment 2•4 years ago
The test case could indicate a new problem with call_indirect, possibly fallout from ongoing work. But since this is a problem in the beta simulation, it's more likely a problem with reference types? Ryan, could you take a look?
Updated•4 years ago
Assignee
Comment 3•4 years ago
Okay, I've tried everything I can think of to reproduce this locally and haven't been able to.
I'm on a 64-bit Linux system, building the linked try revision as a standalone SpiderMonkey shell with TSan enabled, and running with the args provided. I've also tried larger test runs, the default args, non-TSan builds, debug builds, and release builds. I also tried toggling wasm features on a normal build and wasn't seeing anything, so it's not obviously a feature issue.
It's a bit mysterious, but I feel like I must be missing something obvious. I'll keep trying to reproduce this, or maybe pull down the build to see if I can figure out what's going on from it.
Comment 4•4 years ago
Scanning the pushlog... We could try backing out Dmitry's patch since it futzes with the ABI for indirect calls and is a suspect in this matter: https://hg.mozilla.org/mozilla-central/rev/f1beae5af8565899c72dfccafe9a7eacdb0c708e. (I may not have time to get to that until Monday.)
Assignee
Comment 5•4 years ago
It doesn't appear to be that patch or the other two recent ABI changes [1] [2]. Continuing on...
[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=7bed4337881c89ee07bd545389716b64f2687bfc
[2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=4a5bbbe29b20bba120db50c4000306644a4eb880
Assignee
Comment 6•4 years ago
Started a bisection before the weekend, but it appears I messed up the dates a bit.
Bad [1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=14b4ec31bb799556dc8a8cd813090fa097dfe104
Good [2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=89ef1519245ad625bedfed4c0b9344766d812338
Good [3] https://treeherder.mozilla.org/#/jobs?repo=try&revision=eca04ac29c7915d566a9db4222b365e8a11bd774
So it looks like the regressor is between [1] and [2].
Comment 7•4 years ago
We frequently see failures on test hardware that don't reproduce locally because the test hardware is pretty feeble, especially with respect to core counts. We could perhaps try to lower the core count, either through the command-line switch or by manipulating the core affinity for the test case.
Comment 8•4 years ago
Ryan, could you clarify what breaks the test? What exactly do [1] bad and [2] good correspond to?
Updated•4 years ago
Reporter
Comment 9•4 years ago
The regressor should be in the push log mentioned in comment 0: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=3d1e9c77a42dec977bd7a22e2668af56b2587145&tochange=b2df79a80c0303df9d710800ae37dce56847eef5
First bad (ignore the lines at the top): https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&resultStatus=testfailed%2Cbusted%2Cexception%2Cretry%2Cusercancel%2Crunnable&revision=2148cea246618988ca0e4be17793b2622ce3e26f&selectedTaskRun=SNGLGUXgT--bfMSMJMXb7w.0
Base revision: https://hg.mozilla.org/mozilla-central/rev/b2df79a80c0303df9d710800ae37dce56847eef5
Last good: https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&resultStatus=testfailed%2Cbusted%2Cexception&revision=09ee4be5cc8391808c4f6c132a9733bfb2d9c99f&searchStr=SM%28tsan%29&selectedTaskRun=CHe8JyPGSdKwKslhetHY1g.0
Base revision: https://hg.mozilla.org/mozilla-central/rev/3d1e9c77a42dec977bd7a22e2668af56b2587145
Assignee
Comment 10•4 years ago
(In reply to Dmitry Bezhetskov from comment #8)
> Ryan, could you clarify what breaks the test? What exactly do [1] bad and [2] good correspond to?
I was trying to narrow down the changeset that caused the test failure, since I can't reproduce this locally. So [1] was a revision that failed the test and [2] was a revision that passed it, meaning the regressing changeset had to be somewhere in between. However, I messed up the commits I chose and there was a five-day gap between them, so there's not much we can tell from it.
(In reply to Sebastian Hengst [:aryx] (needinfo on intermittent or backout) from comment #9)
> The regressor should be in the push log mentioned in comment 0: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=3d1e9c77a42dec977bd7a22e2668af56b2587145&tochange=b2df79a80c0303df9d710800ae37dce56847eef5
> First bad (ignore the lines at the top): https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&resultStatus=testfailed%2Cbusted%2Cexception%2Cretry%2Cusercancel%2Crunnable&revision=2148cea246618988ca0e4be17793b2622ce3e26f&selectedTaskRun=SNGLGUXgT--bfMSMJMXb7w.0
> Base revision: https://hg.mozilla.org/mozilla-central/rev/b2df79a80c0303df9d710800ae37dce56847eef5
> Last good: https://treeherder.mozilla.org/#/jobs?repo=try&group_state=expanded&resultStatus=testfailed%2Cbusted%2Cexception&revision=09ee4be5cc8391808c4f6c132a9733bfb2d9c99f&searchStr=SM%28tsan%29&selectedTaskRun=CHe8JyPGSdKwKslhetHY1g.0
> Base revision: https://hg.mozilla.org/mozilla-central/rev/3d1e9c77a42dec977bd7a22e2668af56b2587145
Ah, thank you! I should have read more closely before setting out to do my own bisection.
Reporter
Comment 11•4 years ago
Sorry for the confusion; I noticed the revision for the last good central-as-beta sim was wrong. The actual pushlog (after some bisection) is https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=63dc5e9b1b02b0aebd6badfe5eaef7bb9aa8f430&tochange=7f7b983390650cbc7d736e92fd3e1f629a30ac02
df279d4082d84c4205c67a5903f808583c69c098 is already affected.
Updated•4 years ago
Reporter
Comment 12•4 years ago
Bisection shows this started when bug 1643013 landed.
Updated•4 years ago
Assignee
Comment 13•4 years ago
I have no idea how that patch could be causing this, but I also got the same result from my own bisection [1] [2].
[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=cc3c2391f75a33890d989473d92bacb5454d2ccf&selectedTaskRun=dst94XRLSrKE-KZjq63IFg.0
[2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=ec6fa46c8658d64e6b43ab0bb327119caab2c69d&selectedTaskRun=Mq_9aK-IRqCxy4ZCQdV1TA.0
Comment 14•4 years ago
That's amazing. Adam and I will look at it tomorrow.
It isn't even the one that added a slot to the global object...
Comment 15•4 years ago
And the feature that patch works on is preffed off by default.
Reporter
Comment 16•4 years ago
The issue is gone in beta simulations based on recent central revisions. Is further action needed here?
Assignee
Comment 17•4 years ago
That's incredibly weird. But if it's not showing up anymore, I don't think we have any hope of solving the issue unless it comes back. So I'd say we should close this.
Updated•4 years ago
Reporter
Updated•4 years ago