Closed Bug 1717386 Opened 2 years ago Closed 2 years ago

ThreadStackHelper::CollectProfilingStackFrame should not allocate memory

Categories

(Core :: XPCOM, defect)

defect

Tracking

()

RESOLVED FIXED
91 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox89 --- unaffected
firefox90 --- wontfix
firefox91 --- fixed

People

(Reporter: mozbugz, Assigned: emilio)

References

(Regression)

Details

(Keywords: regression)

Attachments

(1 file)

As seen in bug 1711769 comment 9:

Thread 9 (Thread 0x7f55d51be700 (LWP 178899)):
#0  __lll_lock_wait (futex=futex@entry=0x7f55e7b00020, private=0) at lowlevellock.c:52
#1  0x00007f55e81ed235 in __GI___pthread_mutex_lock (mutex=0x7f55e7b00020) at ../nptl/pthread_mutex_lock.c:135
#2  0x000055d34db48063 in BaseAllocator::malloc(unsigned long) ()
#3  0x000055d34db44cc0 in malloc ()
#4  0x00007f55e0e6910c in nsTSubstring<char>::SetLength(unsigned int) () at /home/jgraham/local/firefox/libxul.so
#5  0x00007f55e23358d1 in mozilla::ThreadStackHelper::CollectProfilingStackFrame(js::ProfilingStackFrame const&) () at /home/jgraham/local/firefox/libxul.so
#6  0x00007f55e2314703 in MergeStacks(unsigned int, bool, RegisteredThread const&, Registers const&, NativeStack const&, ProfilerStackCollector&, JS::ProfilingFrameIterator::Frame*, unsigned int) () at /home/jgraham/local/firefox/libxul.so
#7  0x00007f55dfa03897 in profiler_suspend_and_sample_thread(int, unsigned int, ProfilerStackCollector&, bool) () at /home/jgraham/local/firefox/libxul.so
#8  0x00007f55dfaae3f5 in mozilla::ThreadStackHelper::GetStack(mozilla::HangStack&, nsTSubstring<char>&, bool) () at /home/jgraham/local/firefox/libxul.so
#9  0x00007f55e233473f in mozilla::BackgroundHangManager::MonitorThread(void*) () at /home/jgraham/local/firefox/libxul.so
#10 0x00007f55e79e4acd in _pt_root () at /home/jgraham/local/firefox/libnspr4.so

ThreadStackHelper::CollectProfilingStackFrame is trying to expand a string (I'm not sure which one exactly), and that causes a memory allocation, which is forbidden during profiler_suspend_and_sample_thread, because it freezes a thread, which may leave any held mutex in an unstable state.

NI:Doug, you have worked on this recently, could you please have a look? Happy to help -- in particular, we should probably add ⚠ BIG WARNING COMMENTS ⚠ near profiler_suspend_and_sample_thread and any use of it.

Flags: needinfo?(dothayer)

This is almost surely AssignJSLinearString: https://searchfox.org/mozilla-central/rev/ac36d76c7aea37a18afc9dd094d121f40f7c5441/toolkit/components/backgroundhangmonitor/ThreadStackHelper.cpp#338 from bug 1586921, if the linear string length is bigger than the nsAutoCString capacity.

Which explains why bug 1711769 has only been seen on the parent process.

Flags: needinfo?(florian)
Regressed by: 1586921
Has Regression Range: --- → yes

Set release status flags based on info from the regressing bug 1586921

If we have a function bigger than the inline string capacity then we'd
allocate memory potentially causing deadlocks.

Assignee: nobody → emilio
Status: NEW → ASSIGNED
Flags: needinfo?(florian)
Flags: needinfo?(dothayer)
Pushed by ealvarez@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/597b394cea27
Don't allocate memory in CollectProfilingStackFrame. r=florian,gerald
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 91 Branch

I have been testing this with nightly, it's looking good so far.

I'm not 100% certain, but my uptime without crash has been much longer than before the fix.

You need to log in before you can comment on or make changes to this bug.