Bug 1644637 Comment 7 Edit History

Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.

(In reply to Mike Hommey [:glandium] from comment #6)
> I haven't looked at the patch yet, but see the discussion in the patch attached to bug 1609478.

Thanks! Yes, this is related, inasmuch as I'm pretty certain we have an unnecessary contention issue in `jemalloc`, and currently _suspect_ that the deallocation flow is the cause. Here's what I know for sure:

If I measure time immediately before this line and then immediately after this line, I've seen delays as long as 564 ms to do work that essentially boils down to allocating a single 128-byte buffer: https://searchfox.org/mozilla-central/rev/4bb2401ecbfce89af06fb2b4d0ea3557682bd8ff/ipc/chromium/src/chrome/common/ipc_channel_win.cc#409

In a single test run (loading and scrolling a set of 20 pages 5 times each), I measured this allocation taking in excess of 1 ms nearly 100 times (13 of these allocations took in excess of 100 ms to allocate buffers ranging in size from 128 bytes to 4 kB). Changing the underlying buffer to use the system `malloc()` eliminated these delays.

At the same time, I've observed that the IPC thread has been blocked on `NtWaitForAlertByThreadId` (the underlying Windows construct for mutexes) during windows of time in which the main thread was performing a long-running `memset()`.

So while I cannot conclusively prove that the deallocation-associated memory poisoning while the arena lock is held is the cause of some of our measured IPC delays, it's a prime candidate. In any case, this bug is intended to ferret out where in `jemalloc` this resource contention issue arises and propose a fix for it.

(In reply to Mike Hommey [:glandium] from comment #6)
> I haven't looked at the patch yet, but see the discussion in the patch attached to bug 1609478.

Thanks! Yes, this is related, inasmuch as I'm pretty certain we have an unnecessary contention issue in `jemalloc`, and currently _suspect_ that the deallocation flow is the cause. Here's what I know for sure:

If I measure time immediately before this line and then immediately after this line, I've seen delays as long as 564 ms to do work that essentially boils down to allocating a single 128-byte buffer: https://searchfox.org/mozilla-central/rev/4bb2401ecbfce89af06fb2b4d0ea3557682bd8ff/ipc/chromium/src/chrome/common/ipc_channel_win.cc#409

In a single test run (loading and scrolling a set of 20 pages 5 times each), I measured this allocation taking in excess of 10 ms nearly 100 times (13 of these allocations took in excess of 100 ms to allocate buffers ranging in size from 128 bytes to 4 kB). Changing the underlying buffer to use the system `malloc()` eliminated these delays.

At the same time, I've observed that the IPC thread has been blocked on `NtWaitForAlertByThreadId` (the underlying Windows construct for mutexes) during windows of time in which the main thread was performing a long-running `memset()`.

So while I cannot conclusively prove that the deallocation-associated memory poisoning while the arena lock is held is the cause of some of our measured IPC delays, it's a prime candidate. In any case, this bug is intended to ferret out where in `jemalloc` this resource contention issue arises and propose a fix for it.
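
For anyone wanting to reproduce this kind of measurement, here is a minimal sketch of timing a single call with a monotonic clock and tallying how often it exceeds a threshold. It is not the instrumentation actually used for the numbers above: `TimeCall`, `SlowCallStats`, and `TimeAllocation` are hypothetical names, the thresholds are left as parameters, and `std::malloc` stands in for whichever allocator happens to be linked.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <utility>

// Hypothetical helper (not part of the Firefox tree): runs `work` once and
// returns the elapsed wall-clock time, measured with a monotonic clock.
template <typename F>
std::chrono::microseconds TimeCall(F&& work) {
  const auto start = std::chrono::steady_clock::now();
  std::forward<F>(work)();
  const auto end = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::microseconds>(end - start);
}

// Hypothetical tally of how many timed calls exceeded two thresholds.
struct SlowCallStats {
  std::chrono::microseconds slow_threshold;
  std::chrono::microseconds very_slow_threshold;
  std::size_t total = 0;
  std::size_t slow = 0;
  std::size_t very_slow = 0;

  SlowCallStats(std::chrono::microseconds slow_at,
                std::chrono::microseconds very_slow_at)
      : slow_threshold(slow_at), very_slow_threshold(very_slow_at) {}

  // Counts a sample; "in excess of" is treated as strictly greater than.
  void Record(std::chrono::microseconds elapsed) {
    ++total;
    if (elapsed > slow_threshold) ++slow;
    if (elapsed > very_slow_threshold) ++very_slow;
  }
};

// Assumption: times one allocation of `size` bytes through std::malloc.
// The buffer is freed outside the timed region so only the allocation counts.
inline std::chrono::microseconds TimeAllocation(std::size_t size) {
  void* buffer = nullptr;
  const auto elapsed = TimeCall([&] { buffer = std::malloc(size); });
  std::free(buffer);
  return elapsed;
}
```

Feeding each `TimeAllocation(128)` result into `SlowCallStats::Record` over a test run would give counts comparable in shape to the ones quoted above, though a standalone loop like this will not reproduce the cross-thread contention by itself.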
