Closed Bug 1710408 Opened 4 years ago Closed 4 years ago

"Assertion failure: !mBufferPositionWhenReceivedJSContext (JSContext should have been cleared before the thread was unregistered), at tools/profiler/core/ProfiledThreadData.h:61" when adding a marker with rr in chaos mode

Categories

(Core :: Gecko Profiler, defect, P3)

defect

Tracking

()

RESOLVED FIXED
94 Branch
Tracking Status
firefox94 --- fixed

People

(Reporter: julienw, Assigned: sfink)

Details

Attachments

(1 file)

I got this assertion error when running rr to debug my failing test:

"Assertion failure: !mBufferPositionWhenReceivedJSContext (JSContext should have been cleared before the thread was unregistered), at tools/profiler/core/ProfiledThreadData.h:61"

Full backtrace:
#01: ProfiledThreadData::NotifyUnregistered(unsigned long) (/home/julien/travail/git/mozilla-central/tools/profiler/core/ProfiledThreadData.h:59)
#02: ActivePS::UnregisterThread(PSAutoLock const&, RegisteredThread*) (/home/julien/travail/git/mozilla-central/tools/profiler/core/platform.cpp:0)
#03: profiler_unregister_thread() (/home/julien/travail/git/mozilla-central/tools/profiler/core/platform.cpp:5322)
#04: nsThread::ThreadFunc(void*) (/home/julien/travail/git/mozilla-central/xpcom/threads/nsThread.cpp:433)
#05: _pt_root (/home/julien/travail/git/mozilla-central/nsprpub/pr/src/pthreads/ptthread.c:204)
#06: ??? (/lib/x86_64-linux-gnu/libpthread.so.0 + 0x7fa3)
#07: clone (/lib/x86_64-linux-gnu/libc.so.6 + 0xf94cf)

I'll share the pernosco link when I'll have it.

Oh, sorry, somehow I missed this bug, and the Pernosco link has now expired. 😢

If you see it again and can capture another rr trace, please NI:me.

Severity: -- → S3
Priority: -- → P3

Gerald, it works now :-)

Pernosco triggers a data generation if you access the link again after the expiration. You accessed it 2h ago, so now it works.

Flags: needinfo?(gsquelart)

I've investigated the issue with the Pernosco recording. Pernosco link with notes

In summary, we have the following sequence:

  1. A DOM Worker calls PROFILER_SET_JS_CONTEXT(cx), it's recorded in the appropriate RegisteredThread, and if the profiler is running, ProfiledThreadData::NotifyReceivedJSContext(), which sets the mBufferPositionWhenReceivedJSContext member variable.
  2. (optional) Profiler is stopped and restarted, during which locked_profiler_start sees the RegisteredThread's JS context, and calls ProfiledThreadData::NotifyReceivedJSContext() again.
  3. The DOM worker calls PROFILER_CLEAR_JS_CONTEXT(), but because the ActivePS::FeatureJS is off, it does NOT call profiledThreadData::NotifyAboutToLoseJSContext(), so mBufferPositionWhenReceivedJSContext stays set.
  4. The DOM worker thread shuts down, calls profiler_unregister_thread(), which calls ProfiledThreadData::NotifyUnregistered(), which checks that mBufferPositionWhenReceivedJSContext has been cleared, but it wasn't -> BOOM.

So there is a discrepancy between PROFILER_CLEAR_JS_CONTEXT() that depends on the FeatureJS, and others that don't.
We'll need to make all these more consistent, but I'm not sure yet which way is better (update mBufferPositionWhenReceivedJSContext always or only when FeatureJS is on.) I need to examine this code, and I'll think about a solution...

Note that this only happens when the profile doesn't profile JS stacks, which should be rare (it's enabled in all default presets), that's probably why we don't notice it that much.
In that recording, I see that the settings were only "threads" and "nostacksampling".

Flags: needinfo?(gsquelart)

Note that this only happens when the profile doesn't profile JS stacks, which should be rare (it's enabled in all default presets), that's probably why we don't notice it that much.
In that recording, I see that the settings were only "threads" and "nostacksampling".

Yes, I was profiling in a test.
It's a good thing we won't see this much in real life then!

I noticed this crash when I try profiling an AWSY test from startup with nostacksampling:

export MOZ_PROFILER_STARTUP=1
export MOZ_PROFILER_SHUTDOWN=awsy-profile.json
export MOZ_PROFILER_STARTUP_FEATURES="nostacksampling"
./mach awsy-test --tp6 --headless --iterations 1 --entities 10

I just hit this too, but I was running pretty much exactly what pbone was (I'm looking at the same issue). I thought it was the result of my changes, since I was adding some sketchy profiler markers, but it looks like probably not.

I'm going to vote for the JSContext registration to be conditional on ActivePS::FeatureJS. Partly because it seems easier to detect if you try to do JS-related stuff when that feature is off, and partly because the other caller of NotifyReceivedJSContext already does this check. (There's yet another caller, but it's conditional on having a stored JSContext which I think can be skipped as well, in which case it will be conditioned the same way albeit indirectly.)

I made the change, and it seems to work. Note that the profiler_set_js_context code has changed quite a bit since the above Pernosco recording.

Assignee: nobody → sphink
Status: NEW → ASSIGNED
Pushed by sfink@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/ee1acae97e9a Do not record JSContext info if the js feature is disabled r=gerald
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 94 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: