Closed Bug 1717386 Opened 3 years ago Closed 3 years ago

ThreadStackHelper::CollectProfilingStackFrame should not allocate memory

Categories

(Core :: XPCOM, defect)

defect

Tracking

()

RESOLVED FIXED
91 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox89 --- unaffected
firefox90 --- wontfix
firefox91 --- fixed

People

(Reporter: mozbugz, Assigned: emilio)

References

(Regression)

Details

(Keywords: regression)

Attachments

(1 file)

As seen in bug 1711769 comment 9:

Thread 9 (Thread 0x7f55d51be700 (LWP 178899)):
#0  __lll_lock_wait (futex=futex@entry=0x7f55e7b00020, private=0) at lowlevellock.c:52
#1  0x00007f55e81ed235 in __GI___pthread_mutex_lock (mutex=0x7f55e7b00020) at ../nptl/pthread_mutex_lock.c:135
#2  0x000055d34db48063 in BaseAllocator::malloc(unsigned long) ()
#3  0x000055d34db44cc0 in malloc ()
#4  0x00007f55e0e6910c in nsTSubstring<char>::SetLength(unsigned int) () at /home/jgraham/local/firefox/libxul.so
#5  0x00007f55e23358d1 in mozilla::ThreadStackHelper::CollectProfilingStackFrame(js::ProfilingStackFrame const&) () at /home/jgraham/local/firefox/libxul.so
#6  0x00007f55e2314703 in MergeStacks(unsigned int, bool, RegisteredThread const&, Registers const&, NativeStack const&, ProfilerStackCollector&, JS::ProfilingFrameIterator::Frame*, unsigned int) () at /home/jgraham/local/firefox/libxul.so
#7  0x00007f55dfa03897 in profiler_suspend_and_sample_thread(int, unsigned int, ProfilerStackCollector&, bool) () at /home/jgraham/local/firefox/libxul.so
#8  0x00007f55dfaae3f5 in mozilla::ThreadStackHelper::GetStack(mozilla::HangStack&, nsTSubstring<char>&, bool) () at /home/jgraham/local/firefox/libxul.so
#9  0x00007f55e233473f in mozilla::BackgroundHangManager::MonitorThread(void*) () at /home/jgraham/local/firefox/libxul.so
#10 0x00007f55e79e4acd in _pt_root () at /home/jgraham/local/firefox/libnspr4.so

ThreadStackHelper::CollectProfilingStackFrame is trying to expand a string (I'm not sure which one exactly), and that causes a memory allocation, which is forbidden during profiler_suspend_and_sample_thread, because it freezes a thread, which may leave any held mutex in an unstable state.

NI:Doug, you have worked on this recently, could you please have a look? Happy to help -- in particular, we should probably add ⚠ BIG WARNING COMMENTS ⚠ near profiler_suspend_and_sample_thread and any use of it.

Flags: needinfo?(dothayer)

This is almost surely AssignJSLinearString: https://searchfox.org/mozilla-central/rev/ac36d76c7aea37a18afc9dd094d121f40f7c5441/toolkit/components/backgroundhangmonitor/ThreadStackHelper.cpp#338 from bug 1586921, if the linear string length is bigger than the nsAutoCString capacity.

Which explains why bug 1711769 has only been seen on the parent process.

Flags: needinfo?(florian)
Regressed by: 1586921
Has Regression Range: --- → yes

Set release status flags based on info from the regressing bug 1586921

If we have a function bigger than the inline string capacity then we'd
allocate memory potentially causing deadlocks.

Assignee: nobody → emilio
Status: NEW → ASSIGNED
Flags: needinfo?(florian)
Flags: needinfo?(dothayer)
Pushed by ealvarez@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/597b394cea27
Don't allocate memory in CollectProfilingStackFrame. r=florian,gerald
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 91 Branch

I have been testing this with nightly, it's looking good so far.

I'm not 100% certain, but my uptime without crash has been much longer than before the fix.

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: