Closed
Bug 976363
Opened 11 years ago
Closed 11 years ago
MessageLoop can leak its pending tasks
Categories
(Core :: IPC, defect)
Tracking
()
RESOLVED
FIXED
mozilla30
People
(Reporter: ehsan.akhgari, Assigned: ehsan.akhgari)
References
Details
(Whiteboard: [MemShrink:P2])
Attachments
(1 file, 1 obsolete file)
This was uncovered with my work in bug 935778 where I got some instances of a WeakReference<MessageLoopIdleTask> leak.
So the chromium code that we imported here just throws up its hands and leaks these objects explicitly. I checked the current upstream code and the leaks are still there but the mentions of purify are removed.
I think we need to do better. It could be that deleting these objects can result to double-frees, but we should hopefully be able to find and fix those leaks. Also, note that this currently blocks me landing the checks in bug 935778.
Assignee | ||
Comment 1•11 years ago
|
||
Assignee | ||
Comment 2•11 years ago
|
||
Comment on attachment 8381012 [details] [diff] [review]
Do not leak chromium Task objects when a MessageLoop goes away
(I'm actually interested in both of your reviews here!)
Attachment #8381012 -
Flags: review?(bent.mozilla)
Attachment #8381012 -
Flags: review?(benjamin)
Assignee | ||
Comment 3•11 years ago
|
||
Updated•11 years ago
|
Whiteboard: [MemShrink]
Assignee | ||
Comment 4•11 years ago
|
||
ping? I'd really like to land this sooner than later before people start adding new leaks!
Comment on attachment 8381012 [details] [diff] [review]
Do not leak chromium Task objects when a MessageLoop goes away
Review of attachment 8381012 [details] [diff] [review]:
-----------------------------------------------------------------
This looks ok to me. I wish we had more confidence in this code...
Attachment #8381012 -
Flags: review?(bent.mozilla) → review+
Comment 6•11 years ago
|
||
I'm more than a little concerned about this. In what case do we actually have pending tasks in the MessageLoop destructor, and what kind are they? We take extraordinary pains for that never to happen in the XPCOM version of the event loop and it seems like we ought to be similarly anal about this version.
Updated•11 years ago
|
Flags: needinfo?(ehsan)
Assignee | ||
Comment 7•11 years ago
|
||
Here's a stack to the MessageLoop destructor. The only object that I've found which has this problem so far is MessageLoopIdleTask, but only because it inherits from SupportsWeakPtr, which means that its weak references are now leak-checked. We may run into this problem with other objects that go undetected too.
(lldb) bt
* thread #1: tid = 0x41d014, 0x00000001015d7cd0 XUL`(anonymous namespace)::MessageLoopIdleTask::~MessageLoopIdleTask(this=0x00000001216afb50) + 16 at nsMessageLoop.cpp:36, queue = 'com.apple.main-thread, stop reason = step in
frame #0: 0x00000001015d7cd0 XUL`(anonymous namespace)::MessageLoopIdleTask::~MessageLoopIdleTask(this=0x00000001216afb50) + 16 at nsMessageLoop.cpp:36
frame #1: 0x0000000101bc8e7d XUL`MessageLoop::DeletePendingTasks(this=0x00000001003e7330) + 381 at message_loop.cc:410
frame #2: 0x0000000101bc8ba4 XUL`MessageLoop::~MessageLoop(this=0x00000001003e7330) + 276 at message_loop.cc:168
frame #3: 0x00000001015a72c5 XUL`MessageLoopForUI::~MessageLoopForUI(this=0x00000001003e7330) + 21 at message_loop.h:459
frame #4: 0x00000001015a7275 XUL`MessageLoopForUI::~MessageLoopForUI(this=0x00000001003e7330) + 21 at message_loop.h:459
frame #5: 0x00000001015a7299 XUL`MessageLoopForUI::~MessageLoopForUI(this=0x00000001003e7330) + 25 at message_loop.h:459
frame #6: 0x000000010159db0e XUL`mozilla::ShutdownXPCOM(servMgr=0x0000000000000000) + 2206 at nsXPComInit.cpp:869
frame #7: 0x000000010159d265 XUL`NS_ShutdownXPCOM(servMgr=0x00000001003e74e8) + 21 at nsXPComInit.cpp:645
frame #8: 0x00000001052be0aa XUL`ScopedXPCOMStartup::~ScopedXPCOMStartup(this=0x000000010030e490) + 234 at nsAppRunner.cpp:1189
frame #9: 0x00000001052bdfb5 XUL`ScopedXPCOMStartup::~ScopedXPCOMStartup(this=0x000000010030e490) + 21 at nsAppRunner.cpp:1170
frame #10: 0x00000001052c7a07 XUL`XREMain::XRE_main(this=0x00007fff5fbfeae0, argc=5, argv=0x00007fff5fbff3e0, aAppData=0x00007fff5fbfed78) + 807 at nsAppRunner.cpp:4088
frame #11: 0x00000001052c7e0d XUL`XRE_main(argc=5, argv=0x00007fff5fbff3e0, aAppData=0x00007fff5fbfed78, aFlags=0) + 77 at nsAppRunner.cpp:4273
frame #12: 0x00000001000020d7 firefox`do_main(argc=5, argv=0x00007fff5fbff3e0, xreDirectory=0x0000000100327340) + 1639 at nsBrowserApp.cpp:282
frame #13: 0x0000000100001611 firefox`main(argc=5, argv=0x00007fff5fbff3e0) + 321 at nsBrowserApp.cpp:643
frame #14: 0x0000000100001074 firefox`start + 52
Flags: needinfo?(ehsan)
Assignee | ||
Comment 8•11 years ago
|
||
Note that for reproducing, all one has to do is to run the mochitests in dom/browser-element.
Updated•11 years ago
|
Whiteboard: [MemShrink] → [MemShrink:P2]
Assignee | ||
Comment 9•11 years ago
|
||
Andrew, have you found this through LeakSanitizer? It should happen on all mochitest-2 runs iirc (if those are the ones running dom/browser-element)
Flags: needinfo?(continuation)
Comment 10•11 years ago
|
||
I don't have LSAN working on mochitests yet.
Flags: needinfo?(continuation)
Comment 11•11 years ago
|
||
LSAN works on Mochitests now. The only things I saw containing MessageLoop looked like this, one in M4 and one in M5. Let me know if the allocation stacks wouldn't contain MessageLoop or something.
Indirect leak of 456 byte(s) in 3 object(s) allocated from:
#0 0x446434 in calloc /builds/slave/moz-toolchain/src/llvm/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:90
#1 0x7f047e591224 in event_mm_calloc_ /builds/slave/try-l64-asan-00000000000000000/build/ipc/chromium/src/third_party/libevent/event.c:2663
#2 0x7f047e591224 in event_base_once /builds/slave/try-l64-asan-00000000000000000/build/ipc/chromium/src/third_party/libevent/event.c:1684
#3 0x7f047e5cbde3 in base::MessagePumpLibevent::Run(base::MessagePump::Delegate*) /builds/slave/try-l64-asan-00000000000000000/build/ipc/chromium/src/base/message_pump_libevent.cc:339
#4 0x7f047e6187f3 in RunInternal /builds/slave/try-l64-asan-00000000000000000/build/ipc/chromium/src/base/message_loop.cc:226
#5 0x7f047e6187f3 in RunHandler /builds/slave/try-l64-asan-00000000000000000/build/ipc/chromium/src/base/message_loop.cc:219
#6 0x7f047e6187f3 in MessageLoop::Run() /builds/slave/try-l64-asan-00000000000000000/build/ipc/chromium/src/base/message_loop.cc:193
#7 0x7f047e63fa32 in base::Thread::ThreadMain() /builds/slave/try-l64-asan-00000000000000000/build/ipc/chromium/src/base/thread.cc:162
Assignee | ||
Comment 12•11 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #11)
> LSAN works on Mochitests now. The only things I saw containing MessageLoop
> looked like this, one in M4 and one in M5. Let me know if the allocation
> stacks wouldn't contain MessageLoop or something.
The leaking object is allocated here: <http://mxr.mozilla.org/mozilla-central/source/xpcom/base/nsMessageLoop.cpp#149>, with stacks resembling:
(lldb) bt
* thread #1: tid = 0x819644, 0x00000001015d6c73 XUL`nsMessageLoop::PostIdleTask(this=0x00000001196e52a0, aTask=0x00000001196e52e0, aEnsureRunsAfterMS=2000) + 19 at nsMessageLoop.cpp:148, queue = 'com.apple.main-thread, stop reason = breakpoint 1.1
frame #0: 0x00000001015d6c73 XUL`nsMessageLoop::PostIdleTask(this=0x00000001196e52a0, aTask=0x00000001196e52e0, aEnsureRunsAfterMS=2000) + 19 at nsMessageLoop.cpp:148
frame #1: 0x00000001016a8c69 XUL`NS_InvokeByIndex(that=0x00000001196e52a0, methodIndex=3, paramCount=2, params=0x00007fff5fbf4a58) + 537 at xptcinvoke_x86_64_unix.cpp:162
frame #2: 0x00000001032ba6f4 XUL`CallMethodHelper::Invoke(this=0x00007fff5fbf4a10) + 84 at XPCWrappedNative.cpp:2406
frame #3: 0x00000001032aecd7 XUL`CallMethodHelper::Call(this=0x00007fff5fbf4a10) + 263 at XPCWrappedNative.cpp:1747
frame #4: 0x000000010328e1b2 XUL`XPCWrappedNative::CallMethod(ccx=0x00007fff5fbf4c00, mode=CALL_METHOD) + 226 at XPCWrappedNative.cpp:1714
frame #5: 0x00000001032907ad XUL`XPC_WN_CallMethod(cx=0x00000001105b4170, argc=2, vp=0x0000000110f75170) + 909 at XPCWrappedNativeJSOps.cpp:1285
frame #6: 0x000000010630f005 XUL`js::CallJSNative(cx=0x00000001105b4170, native=0x0000000103290420, args=0x00007fff5fbf52a0)(JSContext*, unsigned int, JS::Value*), JS::CallArgs const&) + 165 at jscntxtinlines.h:239
frame #7: 0x00000001062cf57c XUL`js::Invoke(cx=0x00000001105b4170, args=CallArgs at 0x00007fff5fbf52a0, construct=NO_CONSTRUCT) + 1164 at Interpreter.cpp:476
frame #8: 0x00000001062c59ff XUL`Interpret(cx=0x00000001105b4170, state=0x00007fff5fbf8120) + 53311 at Interpreter.cpp:2614
frame #9: 0x00000001062b88b9 XUL`js::RunScript(cx=0x00000001105b4170, state=0x00007fff5fbf8120) + 505 at Interpreter.cpp:423
frame #10: 0x00000001062cf6b2 XUL`js::Invoke(cx=0x00000001105b4170, args=CallArgs at 0x00007fff5fbf81d0, construct=NO_CONSTRUCT) + 1474 at Interpreter.cpp:495
frame #11: 0x000000010604dc48 XUL`js_fun_apply(cx=0x00000001105b4170, argc=2, vp=0x0000000110f75098) + 1720 at jsfun.cpp:1021
frame #12: 0x000000010630f005 XUL`js::CallJSNative(cx=0x00000001105b4170, native=0x000000010604d590, args=0x00007fff5fbf8cd0)(JSContext*, unsigned int, JS::Value*), JS::CallArgs const&) + 165 at jscntxtinlines.h:239
frame #13: 0x00000001062cf57c XUL`js::Invoke(cx=0x00000001105b4170, args=CallArgs at 0x00007fff5fbf8cd0, construct=NO_CONSTRUCT) + 1164 at Interpreter.cpp:476
frame #14: 0x00000001062c59ff XUL`Interpret(cx=0x00000001105b4170, state=0x00007fff5fbfbb50) + 53311 at Interpreter.cpp:2614
frame #15: 0x00000001062b88b9 XUL`js::RunScript(cx=0x00000001105b4170, state=0x00007fff5fbfbb50) + 505 at Interpreter.cpp:423
frame #16: 0x00000001062cf6b2 XUL`js::Invoke(cx=0x00000001105b4170, args=CallArgs at 0x00007fff5fbfbc00, construct=NO_CONSTRUCT) + 1474 at Interpreter.cpp:495
frame #17: 0x00000001062cfead XUL`js::Invoke(cx=0x00000001105b4170, thisv=0x00007fff5fbfbe10, fval=0x00007fff5fbfc3b0, argc=1, argv=0x00007fff5fbfc2e0, rval=JS::MutableHandleValue at 0x00007fff5fbfbd80) + 829 at Interpreter.cpp:532
frame #18: 0x000000010602b35f XUL`JS_CallFunctionValue(cx=0x00000001105b4170, obj=JS::HandleObject at 0x00007fff5fbfbe50, fval=JS::HandleValue at 0x00007fff5fbfbe48, args=0x00007fff5fbfc260, rval=JS::MutableHandleValue at 0x00007fff5fbfbe38) + 335 at jsapi.cpp:4935
frame #19: 0x0000000103aa6789 XUL`nsFrameMessageManager::ReceiveMessage(this=0x00000001281def60, aTarget=0x000000011467e6c0, aMessage=0x000000012ae6f738, aIsSync=false, aCloneData=0x00007fff5fbfc9c8, aCpows=0x00007fff5fbfc9a0, aPrincipal=0x0000000000000000, aJSONRetVal=0x0000000000000000) + 7833 at nsFrameMessageManager.cpp:1046
frame #20: 0x0000000103a9b241 XUL`nsAsyncMessageToChild::Run(this=0x000000012ae6f710) + 513 at nsFrameLoader.cpp:2300
frame #21: 0x000000010168f299 XUL`nsThread::ProcessNextEvent(this=0x00000001003289c0, mayWait=false, result=0x00007fff5fbfcbd3) + 1561 at nsThread.cpp:643
frame #22: 0x000000010158fdab XUL`NS_ProcessPendingEvents(thread=0x00000001003289c0, timeout=20) + 171 at nsThreadUtils.cpp:210
frame #23: 0x0000000103148359 XUL`nsBaseAppShell::NativeEventCallback(this=0x00000001105ed480) + 201 at nsBaseAppShell.cpp:98
frame #24: 0x00000001030d3f4c XUL`nsAppShell::ProcessGeckoEvents(aInfo=0x00000001105ed480) + 428 at nsAppShell.mm:388
frame #25: 0x00007fff9324b731 CoreFoundation`__CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 17
frame #26: 0x00007fff9323cea2 CoreFoundation`__CFRunLoopDoSources0 + 242
frame #27: 0x00007fff9323c62f CoreFoundation`__CFRunLoopRun + 831
frame #28: 0x00007fff9323c0b5 CoreFoundation`CFRunLoopRunSpecific + 309
frame #29: 0x00007fff92ab2a0d HIToolbox`RunCurrentEventLoopInMode + 226
frame #30: 0x00007fff92ab2685 HIToolbox`ReceiveNextEventCommon + 173
frame #31: 0x00007fff92ab25bc HIToolbox`_BlockUntilNextEventMatchingListInModeWithFilter + 65
frame #32: 0x00007fff9441a3de AppKit`_DPSNextEvent + 1434
frame #33: 0x00007fff94419a2b AppKit`-[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] + 122
frame #34: 0x00000001030d2eb7 XUL`-[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:](self=0x0000000100322980, _cmd=0x00007fff94e4d5c3, mask=18446744073709551615, expiration=0x422d63c37f00000d, mode=0x00007fff7d491d00, flag='\x01') + 119 at nsAppShell.mm:165
frame #35: 0x00007fff9440db2c AppKit`-[NSApplication run] + 553
frame #36: 0x00000001030d4a32 XUL`nsAppShell::Run(this=0x00000001105ed480) + 162 at nsAppShell.mm:742
frame #37: 0x0000000104d9591c XUL`nsAppStartup::Run(this=0x0000000110f0c0b0) + 156 at nsAppStartup.cpp:276
frame #38: 0x0000000104c4eb44 XUL`XREMain::XRE_mainRun(this=0x00007fff5fbfeca0) + 6004 at nsAppRunner.cpp:4008
frame #39: 0x0000000104c4f349 XUL`XREMain::XRE_main(this=0x00007fff5fbfeca0, argc=5, argv=0x00007fff5fbff5a0, aAppData=0x00007fff5fbfef38) + 697 at nsAppRunner.cpp:4075
frame #40: 0x0000000104c4f7bd XUL`XRE_main(argc=5, argv=0x00007fff5fbff5a0, aAppData=0x00007fff5fbfef38, aFlags=0) + 77 at nsAppRunner.cpp:4285
frame #41: 0x0000000100002107 firefox`do_main(argc=5, argv=0x00007fff5fbff5a0, xreDirectory=0x0000000100327340) + 1639 at nsBrowserApp.cpp:282
frame #42: 0x0000000100001641 firefox`main(argc=5, argv=0x00007fff5fbff5a0) + 321 at nsBrowserApp.cpp:643
frame #43: 0x00000001000010a4 firefox`start + 52
Comment 13•11 years ago
|
||
Yeah, that doesn't look like the stack I saw in comment 11, which was the only stack I saw that contained the string "MessageLoop"... Oh, but wait, I've seen a bajillion stacks with NS_InvokeByIndex in it, and nothing above it, so it is possible that is at least some of what that is. I really need to try LSAN with non-opt builds.
Comment 14•11 years ago
|
||
Comment on attachment 8381012 [details] [diff] [review]
Do not leak chromium Task objects when a MessageLoop goes away
This code is really the ugliest hybrid of the XPCOM event loop and the chromium event loop I can imagine. The method in question is supposed to run a nsIRunnable when the event loop is idle and it does this by posting a chromium Task using PostIdleTask.
In practice when we shut down, the MessageLoop should be empty because we've already stopped all competing threads and spun the event loop to remove all pending tasks. This however doesn't process "idle" tasks so we're left leaking this one.
So what I'd like to do is this:
In MessageLoop::DeletePendingTasks:
* MOZ_ASSERT if work_queue_ isn't empty (replace the first part of the patch here)
* delete everything in deferred_non_nestable_work_queue_ (keep the second part of the patch here)
Ehsan can you try that and see if it passes try?
Attachment #8381012 -
Flags: review?(benjamin)
Assignee | ||
Comment 15•11 years ago
|
||
Seems to work locally, testing on try now.
Assignee | ||
Updated•11 years ago
|
Attachment #8381012 -
Attachment is obsolete: true
Assignee | ||
Comment 16•11 years ago
|
||
Assignee | ||
Comment 17•11 years ago
|
||
Comment on attachment 8388142 [details] [diff] [review]
Assert that the work queue is empty when a MessageLoop goes away, and delete the tasks in the deferred queue
This seems to work.
Attachment #8388142 -
Flags: review?(benjamin)
Updated•11 years ago
|
Attachment #8388142 -
Flags: review?(benjamin) → review+
Assignee | ||
Comment 18•11 years ago
|
||
Assignee | ||
Comment 19•11 years ago
|
||
(Also removed a now useless comment: https://hg.mozilla.org/integration/mozilla-inbound/rev/f16600ef3313)
Comment 20•11 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/bf5956ac5ba7
https://hg.mozilla.org/mozilla-central/rev/f16600ef3313
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
You need to log in
before you can comment on or make changes to this bug.
Description
•