Created attachment 721957 [details]
This issue appeared when I was testing play-dune.com, and closed the application after leaving the game running for multiple hours (~12). It appears that the full memory used by the game has been transferred to another process, and that this other process retain this memory and does not do anything with it.
It also appears that every application doing an oom after the fact is adding memory to this process.
My blind guess would be: The crash reporter which is present on the phone is having a oom due to the fact that it holds too much memory, but never show-up on the device and thus never get any chance to discard the total amount of memory accumulated.
This bug also cause what is observed in Bug 848605.
Possible duplicates: Bug 841774, Bug 840670?
-------- DMD report ---------
Unreported: ~101 blocks in stack trace record 1 of 718
~51,920,893 bytes (~51,837,919 requested / ~82,974 slop)
41.10% of the heap (41.10% cumulative); 49.96% of unreported (49.96% cumulative)
replace_malloc /root/B2G/gecko/memory/replace/dmd/DMD.cpp:1226 (0x4001d77a libdmd.so+0x377a)
malloc /root/B2G/gecko/memory/build/replace_malloc.c:152 (0x4016c2fa libmozglue.so+0x42fa)
ppalloc /root/B2G/objdir-gecko/gfx/angle/Tokenizer.cpp:2260 (0x40eb8246 libxul.so+0xd34246)
nsStringBuffer::Alloc(unsigned int) /root/B2G/gecko/xpcom/string/src/nsSubstring.cpp:178 (0x40ce2b80 libxul.so+0xb5eb80)
nsAString_internal::MutatePrep(unsigned int, unsigned short**, unsigned int*) /root/B2G/gecko/xpcom/string/src/nsTSubstring.cpp:131 (0x40ce2cba libxul.so+0xb5ecba)
nsAString_internal::ReplacePrepInternal(unsigned int, unsigned int, unsigned int, unsigned int) /root/B2G/gecko/xpcom/string/src/nsTSubstring.cpp:166 (0x40ce31d6 libxul.so+0xb5f1d6)
nsAString_internal::ReplacePrep(unsigned int, unsigned int, unsigned int) /root/B2G/objdir-gecko/dist/include/nsTSubstring.h:738 (0x40ce3298 libxul.so+0xb5f298)
nsAString_internal::Assign(unsigned short const*, unsigned int, mozilla::fallible_t const&) /root/B2G/gecko/xpcom/string/src/nsTSubstring.cpp:311 (0x40ce36b2 libxul.so+0xb5f6b2)
nsAString_internal::Assign(unsigned short const*, unsigned int) /root/B2G/gecko/xpcom/string/src/nsTSubstring.cpp:290 (0x40ce36c8 libxul.so+0xb5f6c8)
IPC::ParamTraits<nsAString_internal>::Read(IPC::Message const*, void**, nsAString_internal*) /root/B2G/objdir-gecko/dist/include/ipc/IPCMessageUtils.h:305 (0x40b18480 libxul.so+0x994480)
mozilla::dom::PContentParent::OnMessageReceived(IPC::Message const&) /root/B2G/objdir-gecko/ipc/ipdl/PContentParent.cpp:2089 (0x410d7e0e libxul.so+0x9bae0e)
mozilla::ipc::AsyncChannel::OnDispatchMessage(IPC::Message const&) /root/B2G/gecko/ipc/glue/AsyncChannel.cpp:473 (0x40b0ea50 libxul.so+0x98aa50)
mozilla::ipc::RPCChannel::OnMaybeDequeueOne() /root/B2G/gecko/ipc/glue/RPCChannel.cpp:404 (0x40b13ab6 libxul.so+0x98fab6)
RunnableMethod<mozilla::layers::ImageContainerChild, void (mozilla::layers::ImageContainerChild::*)(), Tuple0>::Run() /root/B2G/gecko/ipc/chromium/src/base/task.h:308 (0x40af63b6 libxul.so+0x9723b6)
mozilla::ipc::RPCChannel::DequeueTask::Run() /root/B2G/objdir-gecko/dist/include/mozilla/ipc/RPCChannel.h:448 (0x40b1246c libxul.so+0x98e46c)
MessageLoop::RunTask(Task*) /root/B2G/gecko/ipc/chromium/src/base/message_loop.cc:335 (0x40cf44d0 libxul.so+0xb704d0)
MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask const&) /root/B2G/gecko/ipc/chromium/src/base/message_loop.cc:345 (0x40cf5302 libxul.so+0xb71302)
MessageLoop::DoWork() /root/B2G/gecko/ipc/chromium/src/base/message_loop.cc:442 (0x40cf5ee0 libxul.so+0xb71ee0)
mozilla::ipc::DoWorkRunnable::Run() /root/B2G/gecko/ipc/glue/MessagePump.cpp:43 (0x40b11e2c libxul.so+0x98de2c)
nsThread::ProcessNextEvent(bool, bool*) /root/B2G/gecko/xpcom/threads/nsThread.cpp:627 (0x40cd16fa libxul.so+0xb4d6fa)
NS_ProcessNextEvent_P(nsIThread*, bool) /root/B2G/objdir-gecko/dist/include/nsError.h:179 (0x40cb1d6e libxul.so+0xb2dd6e)
mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) /root/B2G/gecko/ipc/glue/MessagePump.cpp:83 (0x40b11f3c libxul.so+0x98df3c)
MessageLoop::RunInternal() /root/B2G/gecko/ipc/chromium/src/base/message_loop.cc:217 (0x40cf448c libxul.so+0xb7048c)
~AutoRunState /root/B2G/gecko/ipc/chromium/src/base/message_loop.cc:503 (0x40cf4536 libxul.so+0xb70536)
Thanks a lot for this bug report.
To be clear, is this DMD report from the main process, or from the app process?
Also, how did you launch play-dune.com? Do you have some kind of wrapper app?
Looks like this is parent process, receiving 101+ PContent::Msg_ScriptError__ID messages from the child. The string in question is the 'sourceLine' string, I think.
Oh, these console messages just stick around forever.
bent suggested that we
* Add a memory reporter for the console service,
* Restrict the size of the messages it saves, and
* Disable it on B2G if we can.
Hope I'm not forgetting anything.
I would rather not disabling it totally, since this is where the offline cache errors end up, and it's really useful to see them.
(In reply to Fabrice Desré [:fabrice] from comment #5)
> I would rather not disabling it totally, since this is where the offline
> cache errors end up, and it's really useful to see them.
Can we disable the service without breaking logcat? At the very least maybe we could have it purge its ring buffer more frequently (e.g. after notifying all existing observers).
(In reply to Justin Lebar [:jlebar] (PTO 3/11-3/15, then GMT+9 until 3/24) from comment #1)
> To be clear, is this DMD report from the main process, or from the app
I have to processes, it appear as being the main process under about:memory report, but the dmd report is under dmd-warning and not dmd-homescreen, which are only 2 extracted. I can attach the about:memory report if you want, but I didn't found it as informative as the dmd-warning-110.txt.gz which is attached to this bug.
> Also, how did you launch play-dune.com? Do you have some kind of wrapper
I have a version of gecko in which I enabled IonMonkey, because I was checking for IonMonkey leaks. In the browser app, I bookmarked play-dune.com as a homescreen application before starting it.
Created attachment 723049 [details] [diff] [review]
With this patch we still get all the output in logcat, but we should not store any message. I guess we could also not pre-allocate the space for 250 string pointers since we don't use any...
Created attachment 723050 [details] [diff] [review]
Now with less debugging.
* Let's use a pref instead of ifdefs?
* Would it be worthwhile to place a limit on the size of strings we store here (maybe also specified as a pref?) We don't want to store 50mb of strings here on desktop or Fennec either.
Created attachment 723653 [details] [diff] [review]
Now using a preference to toggle the buffering of messages in nsConsoleService.
I did not implement a size limitation on message strings themselves. I don't think we want to truncate messages since they may be useless. We currently have a hard-coded count of 250 messages in the circular buffer, which looks reasonable but I can make it a pref also.
Comment on attachment 723653 [details] [diff] [review]
Review of attachment 723653 [details] [diff] [review]:
r=me with these fixed:
@@ +66,3 @@
> class AddConsoleEnabledPrefWatcher : public nsRunnable
Nit: I guess you should rename this now to something like "AddPrefWatchers" or something.
@@ +70,5 @@
> NS_IMETHOD Run()
> Preferences::AddBoolVarCache(&sLoggingEnabled, "consoleservice.enabled", true);
> + Preferences::AddBoolVarCache(&sLoggingBuffered, "consoleservice.buffered", true);
So, for a little while on the main thread sLoggingBuffered will be true while we wait for the pref service to tell us the value it has when this runnable runs. On B2G the pref service will eventually set it to false. However, if we log a message in the period in between then we will hold it alive forever. That's certainly better than what we have today but I think you could make this even better by calling Reset() here.
Pushing to try with comments addressed:
(In reply to Fabrice Desré [:fabrice] from comment #13)
> +class AddConsolePrefWatchers : public nsRunnable
> + nsConsoleService* mConsole;
You should use |nsRefPtr<nsConsoleService>| there to be safe.
I filed bug 852018 on the remaining issue on desktop.