Crashing spidermonkey tests do not always fail
Categories: Core :: JavaScript Engine, defect, P2
People: Reporter: sfink, Unassigned
References: Blocks 1 open bug
Details
This is what bug 1718819 was originally filed for, but it's still very much a problem.
This is most commonly seen in green debug SM(cgc) jobs: sometimes a run is reported as successful (all tests report as passing), but it leaves behind minidumps that show assertion failures of various kinds. It's not currently possible to identify the test that produced a minidump, but the minidumps do not appear to come from expected-crash tests.
Updated•2 years ago
Comment 1 (Reporter)•2 years ago
I did a try push with a set of patches that displays the command line.
The first example log shows:
Command line:
/builds/worker/workspace/obj-spider/dist/bin/js --dll /builds/worker/fetches/injector/libbreakpadinjector.so -f /builds/worker/checkouts/gecko/js/src/jit-test/lib/prologue.js --ion-eager --more-compartments --ion-offthread-compile=off --selfhosted-xdr-path /tmp/tmpyy8e62v1/shell.xdr --selfhosted-xdr-mode decode -e 'const platform='"'"'linux'"'"'' -e 'const libdir='"'"'/builds/worker/checkouts/gecko/js/src/jit-test/lib/'"'"'' -e 'const scriptdir='"'"'/builds/worker/checkouts/gecko/js/src/jit-test/tests/v8-v5/'"'"'' --module-load-path /builds/worker/checkouts/gecko/js/src/jit-test/modules/ -f /builds/worker/checkouts/gecko/js/src/jit-test/tests/v8-v5/check-splay.js
Crash reason: SIGABRT
Crash address: 0x3e80000105c
Thread 0 (crashed)
0 libpthread.so.0!__pthread_cond_signal [pthread_cond_signal.c : 94 + 0x11]
1 js!mozilla::detail::ConditionVariableImpl::notify_one() [ConditionVariable_posix.cpp:b57e20efe50c238d4439e2a5107844182e1221a3 : 95 + 0x4]
2 js!js::GlobalHelperThreadState::submitTask(js::GCParallelTask*, js::AutoLockHelperThreadState const&)
3 js!js::GCParallelTask::startOrRunIfIdle(js::AutoLockHelperThreadState&) [GCParallelTask.cpp:b57e20efe50c238d4439e2a5107844182e1221a3 : 66 + 0xa]
4 js!js::gc::GCRuntime::endSweepingSweepGroup(JSFreeOp*, js::SliceBudget&) [Sweeping.cpp:b57e20efe50c238d4439e2a5107844182e1221a3 : 1602 + 0x1b]
...
(this is the main thread).
The matching result lines are:
[task 2022-03-06T19:51:20.962Z] TEST-PASS | js/src/jit-test/tests/v8-v5/check-splay.js | Success (code 0, args "--baseline-eager") [21.2 s]
[task 2022-03-06T19:52:25.993Z] TEST-PASS | js/src/jit-test/tests/v8-v5/check-splay.js | Success (code 0, args "") [86.9 s]
[task 2022-03-06T19:53:29.672Z] TEST-PASS | js/src/jit-test/tests/v8-v5/check-splay.js | Success (code -6, args "--ion-eager --ion-offthread-compile=off --more-compartments") [150.1 s]
So it looks like the crashing one is the third, and it has a very suspicious runtime of 150.1 s. This looks like a mishandled 150 s timeout: it accurately reports an exit code of -6 (SIGABRT, matching the crash reason above) but considers the result to be a TEST-PASS.
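The check done by eye above can be sketched as a small log scan. This is only an illustration, not part of the test harness: the regular expression is an assumption based on the three result lines quoted above, and `suspicious_passes` is a hypothetical helper name. Expected-crash tests could legitimately pass with a nonzero exit code, so a hit is something to look at rather than proof of a bug.

```python
import re

# Pattern inferred from the result lines quoted in this comment (an assumption,
# not a documented log format).
RESULT_RE = re.compile(
    r'(?P<status>TEST-[A-Z-]+) \| (?P<test>\S+) \| .*?'
    r'\(code (?P<code>-?\d+), args "(?P<args>[^"]*)"\) \[(?P<secs>[\d.]+) s\]'
)


def suspicious_passes(lines):
    """Yield (test, exit_code, args, seconds) for TEST-PASS lines with a nonzero exit code."""
    for line in lines:
        m = RESULT_RE.search(line)
        if m and m.group("status") == "TEST-PASS" and int(m.group("code")) != 0:
            yield (m.group("test"), int(m.group("code")),
                   m.group("args"), float(m.group("secs")))


# The three result lines from the log excerpt above.
LOG = [
    '[task 2022-03-06T19:51:20.962Z] TEST-PASS | js/src/jit-test/tests/v8-v5/check-splay.js | Success (code 0, args "--baseline-eager") [21.2 s]',
    '[task 2022-03-06T19:52:25.993Z] TEST-PASS | js/src/jit-test/tests/v8-v5/check-splay.js | Success (code 0, args "") [86.9 s]',
    '[task 2022-03-06T19:53:29.672Z] TEST-PASS | js/src/jit-test/tests/v8-v5/check-splay.js | Success (code -6, args "--ion-eager --ion-offthread-compile=off --more-compartments") [150.1 s]',
]

hits = list(suspicious_passes(LOG))
for test, code, args, secs in hits:
    print(f"{test}: TEST-PASS with exit code {code} after {secs} s (args {args!r})")
```

Run against the excerpt, only the `--ion-eager` variant is flagged.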
Comment 2 (Reporter)•2 years ago
The other three examples in that push are the same test, with about the same duration. The crash stacks vary; in particular, the last one dies in js::CurrentThreadCanAccessRuntime(JSRuntime const*) [Runtime.cpp:b57e20efe50c238d4439e2a5107844182e1221a3 : 789 + 0x5], which has been freaking me out since I took that to mean we're doing an invalid access. But now it looks more like we probably just spend quite a bit of time there, and the timeout abort lands at a random point, so we end up dying there fairly often.
This is making me feel better. It's looking like when we time out, we kill the running test, which generates a minidump. Then for some reason we mark it as a pass instead of a timeout.
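To make the suspected behavior concrete, here is a minimal sketch of the classification one would expect. It is not the actual harness code: `classify` and its parameters are hypothetical, and the only facts taken from this bug are the exit code of -6 and the apparent 150 s timeout. On POSIX, Python's `subprocess` reports a child killed by signal N as return code -N, so -6 corresponds to SIGABRT.

```python
import signal


def classify(returncode, timed_out, expect_crash=False):
    """Hypothetical result classification for one test run.

    returncode: exit status as reported by subprocess (negative means the
                process was killed by that signal number, on POSIX).
    timed_out:  True if the harness itself killed the test for running too long.
    expect_crash: True for tests that are annotated as expected to crash.
    """
    # Check the timeout first: a test killed by the harness must never be
    # reported as a pass, whatever its exit status looks like.
    if timed_out:
        return "TIMEOUT"
    if returncode < 0:
        # Killed by a signal without the harness asking for it.
        return "PASS" if expect_crash else "CRASH"
    return "PASS" if returncode == 0 else "FAIL"


# The third check-splay.js run from comment 1: exit code -6 after about 150 s.
abort_code = -int(signal.SIGABRT)
print(classify(abort_code, timed_out=True))   # harness killed it
print(classify(abort_code, timed_out=False))  # aborted on its own
print(classify(0, timed_out=False))           # ordinary success
```

Under this ordering, the run in question would be reported as a timeout, or as a crash if the harness had not recorded the timeout, but in neither case as a pass.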
Comment 3•2 years ago
(In reply to Steve Fink [:sfink] [:s:] from comment #2)
Nice, that would explain it!