Closed Bug 1340193 Opened 8 years ago Closed 8 years ago

"Assertion failure: CurrentThreadCanAccessRuntime(this) (Should only be mutated by the active thread.)" in JSRuntime::ionLazyLinkList() caused by profiler

Categories

(Core :: Gecko Profiler, defect)

defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 1345262
Tracking Status
firefox54 --- affected

People

(Reporter: jseward, Unassigned)

References

(Blocks 1 open bug)

Details

STR: debug build, start profiler. Click on profiler button, then Settings, Threads; and enter "," (a single comma) in the box. Profiler crashes after a minute or so of use. Assertion failure: CurrentThreadCanAccessRuntime(this) (Should only be mutated by the active thread.), at /home/sewardj/MOZ/MC-TALL/js/src/vm/Runtime.cpp:818 Thread 1 "firefox-bin" received signal SIGSEGV, Segmentation fault. #0 0x00007fffe9bfa29a in JSRuntime::ionLazyLinkList() (this=this@entry=0x7fff201278c0) at /home/sewardj/MOZ/MC-TALL/js/src/vm/Runtime.cpp:817 #1 0x00007fffe9ba7d0b in js::CancelOffThreadIonCompile(mozilla::Variant<JSScript*, JSCompartment*, js::ZonesInState, JSRuntime*, js::AllCompilations> const&, bool) (selector=..., discardLazyLinkList=<optimized out>) at /home/sewardj/MOZ/MC-TALL/js/src/vm/HelperThreads.cpp:232 (gdb) up #1 0x00007fffe9ba7d0b in js::CancelOffThreadIonCompile (selector=..., discardLazyLinkList=<optimized out>) at /home/sewardj/MOZ/MC-TALL/js/src/vm/HelperThreads.cpp:232
These STR cause assertion failures elsewhere too. For example: Thread 65 "SamplerThread" received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7ffeebfff700 (LWP 23924)] 0x00007fffe7989a66 in mozilla::dom::workers::WorkerThread::Dispatch (this=0x4139a40, aRunnable=..., aFlags=<optimized out>) at /home/sewardj/MOZ/MC-TALL/dom/workers/WorkerThread.cpp:235 235 MOZ_ASSERT(cancelable, (gdb) where #0 0x00007fffe7989a66 in mozilla::dom::workers::WorkerThread::Dispatch(already_AddRefed<nsIRunnable>, unsigned int) (this=0x4139a40, aRunnable=..., aFlags=<optimized out>) at /home/sewardj/MOZ/MC-TALL/dom/workers/WorkerThread.cpp:235 #1 0x00007fffe8f5d6d9 in nsIEventTarget::Dispatch(nsIRunnable*, unsigned int) (aFlags=0, aEvent=<optimized out>, this=0x4139a40) at /home/sewardj/MOZ/MC-TALL/ff-Og-linux64-dbg/dist/include/nsIEventTarget.h:37 #2 0x00007fffe8f5d6d9 in ThreadResponsiveness::Update(bool, nsIThread*) (this=this@entry=0x7ffecc1bde80, aIsMainThread=<optimized out>, aThread=0x4139a40) at /home/sewardj/MOZ/MC-TALL/tools/profiler/gecko/ThreadResponsiveness.cpp:105 #3 0x00007fffe8f5d8e9 in ThreadInfo::UpdateThreadResponsiveness() (this=0x7ffecc1bdbe0) at /home/sewardj/MOZ/MC-TALL/tools/profiler/core/ThreadInfo.h:81 #4 0x00007fffe8f5d8e9 in SigprofSender(void*) (aArg=<optimized out>) at /home/sewardj/MOZ/MC-TALL/tools/profiler/core/platform-linux.cc:290 #5 0x00007ffff7bc06ca in start_thread (arg=0x7ffeebfff700) at pthread_create.c:333 #6 0x00007ffff6e4ef7f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:105
Blocks: 1329181
(In reply to Julian Seward [:jseward] from comment #0) > STR: debug build, start profiler. > Click on profiler button, then Settings, Threads; and enter > "," (a single comma) in the box. Profiler crashes after a > minute or so of use. > > Assertion failure: CurrentThreadCanAccessRuntime(this) (Should only be > mutated by the active thread.), at > /home/sewardj/MOZ/MC-TALL/js/src/vm/Runtime.cpp:818 I can reproduce this now. By default the profiler uses the following thread filter list: ["GeckoMain", "Compositor"], i.e. it only profiles two threads. Entering "," into the threads box results in the following thread filter list: ["", ""]. ThreadSelected() succeeds if a filter string is a substring of the thread name, and an empty string is a substring of any string, so ThreadSelected() always succeeds, so we end up profiling every thread. And then the assertion hits. So the profiler doesn't like one of the profiled threads. It must be one of the threads other than "GeckoMain" or "Compositor". And once you get the crash once, you keep getting it. There must be something in the profile that keeps setting the thread filter list to ["", ""]. Julian, does this help? Or am I just explaining things you have already worked out?
Flags: needinfo?(jseward)
(In reply to Nicholas Nethercote [:njn] from comment #2) > And once you get the crash once, you keep getting it. There must be > something in the profile that keeps setting the thread filter list to ["", > ""]. I think this is a separate bug that I've been meaning to file: Once you've profiled a thread, it'll always end up in the profile during the rest of the life of the process, even if it no longer matches the thread filter. Or this is a bug in the addon, but I think I ruled that out last time I looked.
(In reply to Nicholas Nethercote [:njn] from comment #2) > And then the assertion hits. So the profiler doesn't like one of the > profiled threads. It must be one of the threads other than "GeckoMain" or > "Compositor". Probably one of the DOM worker threads that the profiler addon launches for symbolication. It must be a thread that runs JS because we're crashing in JS code.
Here's a longer stack trace than the one in comment 0. > #0 0x00007fffe9d4e93a in JSRuntime::ionLazyLinkList (this=0x7fffc4602000) > at /home/njn/moz/mi4/js/src/vm/Runtime.cpp:839 > #1 0x00007fffe9cccd80 in js::CancelOffThreadIonCompile (selector=..., > discardLazyLinkList=true) > at /home/njn/moz/mi4/js/src/vm/HelperThreads.cpp:234 > #2 0x00007fffe9a0ba3a in js::CancelOffThreadIonCompile ( > runtime=0x7fffc4602000) at ../../../js/src/vm/HelperThreads.h:484 > #3 0x00007fffe9a5a66d in js::ReleaseAllJITCode (fop=0x7fffc587e470) > at /home/njn/moz/mi4/js/src/jsgc.cpp:7053 > #4 0x00007fffe9ca38f6 in js::GeckoProfiler::enable (this=0x7fffc4602230, > enabled=true) at /home/njn/moz/mi4/js/src/vm/GeckoProfiler.cpp:78 > #5 0x00007fffe9ca5e16 in js::EnableContextProfilingStack (cx=0x7fffc4e93000, > enabled=true) at /home/njn/moz/mi4/js/src/vm/GeckoProfiler.cpp:543 > #6 0x00007fffe2f0fe77 in PseudoStack::enableJSSampling (this=0x7fffc5133000) > at ../../dist/include/PseudoStack.h:334 > #7 0x00007fffe8b93398 in profiler_start (aProfileEntries=1000000, > aInterval=1, aFeatures=0x7fffc5ff8540, aFeatureCount=4, > aThreadNameFilters=0x7fffc28f8c80, aFilterCount=2) > at /home/njn/moz/mi4/tools/profiler/core/platform.cpp:2048 Let's make this bug about the CurrentThreadCanAccessRuntime(this) and I'll file a separate bug for the assertion failure in comment 1. bhackett, you've been touching the ionLazyLinkList stuff recently. The profiler is started on the main thread. It enables sampling for a number of different threads, which calls EnableContextProfilingStack(cx) for each of those threads, all from the main thread. I'm guessing that this isn't allowed. This is very similar to bug 1322468 comment 1. Any suggestions how to handle this?
Flags: needinfo?(bhackett1024)
Summary: Threading assertion failure in JSRuntime::ionLazyLinkList() caused by profiler → "Assertion failure: CurrentThreadCanAccessRuntime(this) (Should only be mutated by the active thread.)" in JSRuntime::ionLazyLinkList() caused by profiler
(In reply to Nicholas Nethercote [:njn] from comment #5) > > Let's make this bug about the CurrentThreadCanAccessRuntime(this) and I'll > file a separate bug for the assertion failure in comment 1. I filed bug 1341924 for that assertion failure.
(In reply to Nicholas Nethercote [:njn] from comment #5) > bhackett, you've been touching the ionLazyLinkList stuff recently. The > profiler is started on the main thread. It enables sampling for a number of > different threads, which calls EnableContextProfilingStack(cx) for each of > those threads, all from the main thread. I'm guessing that this isn't > allowed. This is very similar to bug 1322468 comment 1. Any suggestions how > to handle this? Is the main thread of the DOM worker suspended when the main thread is performing this activity? If so then this behavior is OK (not racy), we would just need to make the assertions understand what is happening. I've been meaning to add a mechanism for this sort of behavior; it affects more than just the profiler, as the JS interrupt mechanism also operates on runtimes from other threads on Windows (I had to disable the thread safety checks added by bug 1323066 on Windows as a result). I don't know the non-JS parts of the profiler really at all, though, if you could point me to the code which manages suspension of the other threads that would be great.
(In reply to Brian Hackett (:bhackett) from comment #7) > > Is the main thread of the DOM worker suspended when the main thread is > performing this activity? As far as I know, it's not suspended. Look for the enableJSSampling() call in tools/profiler/core/platform.cpp. It's in profiler_start(), which is called when the profiler is started or restarted, which can happen at arbitrary times.
(In reply to Nicholas Nethercote [:njn] from comment #8) > (In reply to Brian Hackett (:bhackett) from comment #7) > > > > Is the main thread of the DOM worker suspended when the main thread is > > performing this activity? > > As far as I know, it's not suspended. > > Look for the enableJSSampling() call in tools/profiler/core/platform.cpp. > It's in profiler_start(), which is called when the profiler is started or > restarted, which can happen at arbitrary times. Hmm, I don't think there's a way to make this safe without suspending the target thread. ReleaseAllJITCode is on the stack there; this method destroys all jitcode in the runtime, and if the runtime is doing things like executing jitcode while that is going on then bad things will happen. I think the best way to enable profiling from another thread is to set an interrupt on the target thread's runtime. During the interrupt handler the target thread can either enable profiling itself, or pause while another thread does so (though this will need the mechanism I described above to avoid assertions). If the target thread is idling at the top of its event loop or is blocked somewhere then the interrupt handler will not fire until it starts executing JS again, but maybe that's OK in this case.
Flags: needinfo?(bhackett1024)
Bug 1341317 is another manifestation of this same basic problem, but with a new assertion failure that is occurring earlier and more reliably than this bug's assertion failure.
Flags: needinfo?(jseward)
This is a slightly different manifestation of the basic problem covered by bug 1345262.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.