Closed
Bug 1297901
Opened 8 years ago
Closed 8 years ago
Intermittent js/src/jit-test/tests/wasm/jsapi.js | Timeout (code -9, args "--baseline-eager")
Categories
(Core :: JavaScript Engine, defect, P3)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla53
People
(Reporter: intermittent-bug-filer, Assigned: bbouvier)
References
Details
(Keywords: intermittent-failure)
Attachments
(1 file, 2 obsolete files)
3.25 KB,
patch
|
jonco
:
review+
|
Details | Diff | Splinter Review |
Filed by: philringnalda [at] gmail.com https://treeherder.mozilla.org/logviewer.html#?job_id=34593319&repo=mozilla-inbound https://queue.taskcluster.net/v1/task/FuR7WtzrQ3e4yD64DHuksw/runs/0/artifacts/public%2Flogs%2Flive_backing.log
Comment 1•8 years ago
|
||
Bulk assigning P3 to all open intermittent bugs without a priority set in Firefox components per bug 1298978.
Priority: -- → P3
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Updated•8 years ago
|
Flags: needinfo?(bbouvier)
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 6•8 years ago
|
||
Found it (thanks to rr!), taking.
Assignee: nobody → bbouvier
Status: NEW → ASSIGNED
Flags: needinfo?(bbouvier)
Assignee | ||
Comment 7•8 years ago
|
||
A first simple fix. Jon says we might even be able to remove the ShellAsyncTasks lock itself, I'm trying this now.
Assignee | ||
Comment 8•8 years ago
|
||
I've tried removing the ShellAsyncTask's mutex; we can't do that easily because AutoLockHelperThreadState is a LockGuard and ExclusiveData<>::Guard is a ExclusiveData::Guard, and there's no obvious way to convert from one to the other. Found the issue by running wasm/jsapi.js in rr with chaos mode; after running under rr a few times, one time a deadlock happened, and I could see the main thread was blocked in DrainJobQueue. Then, repeating the WebAssembly.compile tests from this file (which are the only ones to use drainJobQueue()) from this file made it trigger 100% of the time in rr with chaos mode. Asking review on the initial patch, with an explanation. Considering the following piece of code: while (true) { if (!sc->asyncTasks.lock()->outstanding) break; AutoLockHelperThreadState lock; HelperThreadState().wait(lock, GlobalHelperThreadState::CONSUMER); } `outstanding` is updated by the ShellFinishAsyncTaskCallback, which is called there: http://searchfox.org/mozilla-central/source/js/src/vm/HelperThreads.cpp#1455 So there's a mutex ordering in which we do the following: - main thread: takes asyncTasks lock, looks at outstanding: it's positive. - main thread: unlock asyncTasks (leaving the if) - helper thread: takes asyncTasks lock, updates outstanding to 0 - helper thread: takes GlobalHelperThreadState lock, notifies consumers - main thread: wait on consumer Thus the main thread waits forever, after the notification was sent. Proposed solution is to lock the HelperThreadState *before* the asyncTasks lock: - can't cause a deadlock with the other mutex, since the main thread takes the asyncTasks lock and releases it directly in the `if` condition. - this ensures the helper thread can't notify before the main thread is waiting Under rr chaos mode, this clears the issue right away.
Assignee | ||
Comment 9•8 years ago
|
||
Comment on attachment 8810469 [details] [diff] [review] fix-ordering.patch We've talked about this tonight, but for the record I've explained it precisely in the previous comment.
Attachment #8810469 -
Flags: review?(jcoppeard)
Comment 10•8 years ago
|
||
Nice! I actually tried to run wasm/jsapi.js in a loop a few times to try to reproduce this (to no avail); running in rr chaos mode was a fantastic idea. Thanks so much!
Comment 11•8 years ago
|
||
Comment on attachment 8810469 [details] [diff] [review] fix-ordering.patch Review of attachment 8810469 [details] [diff] [review]: ----------------------------------------------------------------- Looks good. ::: js/src/vm/MutexIDs.h @@ +20,5 @@ > _(RuntimeExclusiveAccess, 200) \ > \ > _(GlobalHelperThreadState, 300) \ > \ > + _(ShellAsyncTasks, 350) \ This works, but can you make this order 500 instead? It only needs to be greater than GlobalHelperThreadState to work, but reducing the number of distinct values makes it clear(er) that there are fewer possible combinations of mutexes that can be held at the same time because mutexes with the same order cannot be held at the same time. I should really have added a comment to this effect.
Attachment #8810469 -
Flags: review?(jcoppeard) → review+
Updated•8 years ago
|
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 13•8 years ago
|
||
Fwiw, this is not pushed yet because a try build shows that an order of 500 can't work: https://treeherder.mozilla.org/#/jobs?repo=try&revision=dd57aed530a9b5308a01d269d648fd086e9957f9 But even an order of 350 doesn't work, as shown by running the wasm/jsapi.js test with JS_GC_ZEAL=14,10.
Assignee | ||
Comment 14•8 years ago
|
||
This works locally and under the initial test case with rr chaos mode. Running a try build just to be sure...
Attachment #8810469 -
Attachment is obsolete: true
Attachment #8812730 -
Flags: review?(jcoppeard)
Assignee | ||
Comment 15•8 years ago
|
||
A better patch, which updates the comment and just Moves the finished queue.
Attachment #8812730 -
Attachment is obsolete: true
Attachment #8812730 -
Flags: review?(jcoppeard)
Attachment #8812747 -
Flags: review?(jcoppeard)
Assignee | ||
Comment 16•8 years ago
|
||
Fwiw, latest try build is all green and happy: https://treeherder.mozilla.org/#/jobs?repo=try&revision=1e440558ee52b00a8244fcfdc46dc055ba336f29
Comment 17•8 years ago
|
||
Comment on attachment 8812747 [details] [diff] [review] mutexes.patch Review of attachment 8812747 [details] [diff] [review]: ----------------------------------------------------------------- Great, thanks for the updates.
Attachment #8812747 -
Flags: review?(jcoppeard) → review+
Comment 18•8 years ago
|
||
Pushed by bbouvier@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/aa05debf2237 Fix ordering of taking mutexes for the shell tasks; r=jonco
Comment 19•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/aa05debf2237
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Comment 20•8 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-aurora/rev/45a3bdf8faec
Comment hidden (Intermittent Failures Robot) |
You need to log in
before you can comment on or make changes to this bug.
Description
•