Closed Bug 1674299 Opened 4 years ago Closed 4 years ago

100MB of unreported memory in mTasks deque, from nsSegmentedBuffer::FreeOMT

Categories

(Core :: XPCOM, defect)

Tracking

RESOLVED DUPLICATE of bug 1675203

People

(Reporter: mstange, Unassigned)

Details

I collected a DMD report on my current regular Nightly build, and it lists 100MB of unreported memory from the mTasks.push(...) call in TaskQueue::DispatchLocked:

Unreported {
  24,776 blocks in heap block record 2 of 14,767
  101,482,496 bytes (101,482,496 requested / 0 slop)
  Individual block sizes: 4,096 x 24,776
  12.58% of the heap (36.38% cumulative)
  19.12% of unreported (55.27% cumulative)
  Allocated at {
    #01: replace_malloc(unsigned long) (/Applications/Firefox Nightly.app/Contents/MacOS/libmozglue.dylib + 0x2ca09)
    #02: moz_xmalloc (/Applications/Firefox Nightly.app/Contents/MacOS/libmozglue.dylib + 0x149f2)
    #03: mozilla::TaskQueue::DispatchLocked(nsCOMPtr<nsIRunnable>&, unsigned int, mozilla::AbstractThread::DispatchReason) (/Applications/Firefox Nightly.app/Contents/MacOS/XUL + 0x4f7b35)
    #04: mozilla::TaskQueue::Dispatch(already_AddRefed<nsIRunnable>, unsigned int) (/Applications/Firefox Nightly.app/Contents/MacOS/XUL + 0x50e7e6)
    #05: nsSegmentedBuffer::FreeOMT(void*) (/Applications/Firefox Nightly.app/Contents/MacOS/XUL + 0x4d0d24)
    #06: nsPipe::~nsPipe() (/Applications/Firefox Nightly.app/Contents/MacOS/XUL + 0x4cd0c3)
    #07: nsPipeInputStream::~nsPipeInputStream() (/Applications/Firefox Nightly.app/Contents/MacOS/XUL + 0x4cf720)
    #08: nsPipeInputStream::Release() (/Applications/Firefox Nightly.app/Contents/MacOS/XUL + 0x4ce79a)
  }
}

I don't know whether this means there are a lot of pending tasks at the time the DMD report is collected, or that the queue buffer was never shrunk after growing once, but either way it seems suspicious.

We should add a memory reporter to get more insight into this. nsDeque is an XPCOM replacement for std::deque that has memory reporting capabilities.

(I confirmed that 0x4f7b35 indeed refers to the push call. Here's the fully-resolved inline callstack at that address:)

% cargo run --release --example addr2line -- -f -i -C -e /Users/mstange/Downloads/XUL.dSYM/Contents/Resources/DWARF/XUL 0x4f7b34
operator new(unsigned long)
/builds/worker/workspace/obj-build/dist/include/mozilla/cxxalloc.h:33
std::__1::__libcpp_allocate(unsigned long, unsigned long)
/builds/worker/fetches/clang/bin/../include/c++/v1/new:253
std::__1::allocator<mozilla::TaskQueue::TaskStruct>::allocate(unsigned long)
/builds/worker/fetches/clang/bin/../include/c++/v1/memory:1789
std::__1::allocator_traits<std::__1::allocator<mozilla::TaskQueue::TaskStruct> >::allocate(std::__1::allocator<mozilla::TaskQueue::TaskStruct>&, unsigned long)
/builds/worker/fetches/clang/bin/../include/c++/v1/memory:1525
std::__1::deque<mozilla::TaskQueue::TaskStruct, std::__1::allocator<mozilla::TaskQueue::TaskStruct> >::__add_back_capacity()
/builds/worker/fetches/clang/bin/../include/c++/v1/deque:2604
std::__1::deque<mozilla::TaskQueue::TaskStruct, std::__1::allocator<mozilla::TaskQueue::TaskStruct> >::push_back(mozilla::TaskQueue::TaskStruct&&)
/builds/worker/fetches/clang/bin/../include/c++/v1/deque:1956
std::__1::queue<mozilla::TaskQueue::TaskStruct, std::__1::deque<mozilla::TaskQueue::TaskStruct, std::__1::allocator<mozilla::TaskQueue::TaskStruct> > >::push(mozilla::TaskQueue::TaskStruct&&)
/builds/worker/fetches/clang/bin/../include/c++/v1/queue:308
mozilla::TaskQueue::DispatchLocked(nsCOMPtr<nsIRunnable>&, unsigned int, mozilla::AbstractThread::DispatchReason)
/builds/worker/checkouts/gecko/xpcom/threads/TaskQueue.cpp:59

I also have 5MB of TaskQueue instances. I don't know what sizeof(TaskQueue) is, but 5MB is probably a lot of instances.

Unreported {
  82 blocks in heap block record 678 of 15,210
  27,552 bytes (26,896 requested / 656 slop)
  Individual block sizes: 336 x 82
  0.00% of the heap (74.25% cumulative)
  0.00% of unreported (97.55% cumulative)
  Allocated at {
    #01: replace_malloc(unsigned long) (/Applications/Firefox Nightly.app/Contents/MacOS/libmozglue.dylib + 0x2ca09)
    #02: moz_xmalloc (/Applications/Firefox Nightly.app/Contents/MacOS/libmozglue.dylib + 0x149f2)
    #03: NS_CreateBackgroundTaskQueue (/Applications/Firefox Nightly.app/Contents/MacOS/XUL + 0x50c878)
    #04: nsPipe::AdvanceReadSegment(nsPipeReadState&, mozilla::ReentrantMonitorAutoEnter const&) (/Applications/Firefox Nightly.app/Contents/MacOS/XUL + 0x4cd90c)
    #05: AutoReadSegment::~AutoReadSegment() (/Applications/Firefox Nightly.app/Contents/MacOS/XUL + 0x4d46b4)
    #06: nsPipeInputStream::ReadSegments(nsresult (*)(nsIInputStream*, void*, char const*, unsigned int, unsigned int, unsigned int*), void*, unsigned int, unsigned int*) (/Applications/Firefox Nightly.app/Contents/MacOS/XUL + 0x4cefe4)
    #07: mozilla::dom::XMLHttpRequestMainThread::OnDataAvailable(nsIRequest*, nsIInputStream*, unsigned long long, unsigned int) (/Applications/Firefox Nightly.app/Contents/MacOS/XUL + 0x1072e09)
    #08: nsJARChannel::OnDataAvailable(nsIRequest*, nsIInputStream*, unsigned long long, unsigned int) (/Applications/Firefox Nightly.app/Contents/MacOS/XUL + 0x8b61b3)
  }
}

Both of these come from nsSegmentedBuffer::FreeOMT. I wonder if nsSegmentedBuffer is leaking task queues somehow.

Summary: 100MB of unreported memory in mTasks deque → 100MB of unreported memory in mTasks deque, from nsSegmentedBuffer::FreeOMT

TaskQueue appears to be a std::queue<PendingTask> according to https://searchfox.org/mozilla-central/rev/277ab3925eae21b419167a34624ec0ab518d0c94/ipc/chromium/src/base/message_loop.h#326. This STL collection is known to be quite poorly behaved, as noted in bug 1641614, where we ran into large memory overhead in IPC code due to using std::queue.

Due to poor design decisions when the STL was defined, std::queue's default backing container (std::deque) isn't allowed to invalidate references to its elements while growing, so some implementations (such as the one used on macOS) allocate lists of 4 KiB pages for all queues, including empty ones, so that individual entries can be placed into those pages and never relocated as the queue is resized. For small queues, especially small queues of small objects, this is often unacceptable overhead.

This type is also poorly behaved as it doesn't report its memory usage to our memory reporters, meaning that we need tools like DMD in order to identify the high memory usage here.

We should replace std::queue in TaskQueue, and in all other task-scheduling code that uses it, with better-behaved collections like nsDeque.

ni? :bas

Flags: needinfo?(bas)

Correction on the previous comment: I linked the wrong type for our task queues. The real type being considered here is most likely https://searchfox.org/mozilla-central/rev/277ab3925eae21b419167a34624ec0ab518d0c94/xpcom/threads/TaskQueue.h#51 (/me facepalms)

Fortunately, my comment about not using std::queue still stands, as this type also uses one internally: https://searchfox.org/mozilla-central/rev/277ab3925eae21b419167a34624ec0ab518d0c94/xpcom/threads/TaskQueue.h#149. However, I needinfo'd the wrong person (sorry :bas!)

Redirecting ni to :kwright

Flags: needinfo?(bas) → needinfo?(kwright)

This bug might be a duplicate of bug 1675203.

We should still add memory reporters for this data structure, even though the regressor was likely backed out.

This bug is indeed a duplicate of bug 1675203, and this behavior was at least somewhat expected when dispatching free calls sequentially. The tasks got backed up more than anticipated, given that we can dispatch thousands of them in bursts, so the change was backed out and we're taking a different approach to fixing thread churn (bug 1670737 and bug 1667581).

Looking at the TaskQueue implementation, I'm fairly certain that we are only using std::queue for historical reasons, and it's been unpleasant to debug in the past anyway, so I'm all for using nsDeque instead. Save for our TaskQueues, the bulk of our threading mechanisms use event queues, so I'm pretty sure this is the only place we'll run into this.

I filed bug 1678149 so I can change this behavior. Note that this will not fix the underlying issue causing the FreeOMT task backup. The background event target (and nsThreadPool in general) doesn't handle massive volumes of short tasks like these very well, hence the issues we keep running into.

Status: NEW → RESOLVED
Closed: 4 years ago
Flags: needinfo?(kwright)
Resolution: --- → DUPLICATE