Limit concurrency of e10s memory reports

RESOLVED FIXED in Firefox 40

Status

()

defect
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: jld, Assigned: jld)

Tracking

unspecified
mozilla40
All
Gonk (Firefox OS)
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(e10s-, firefox40 fixed)

Details

(Whiteboard: [MemShrink:P1])

Attachments

(2 attachments)

On B2G devices using zRAM (compression of less-recently-used memory) which are thereby memory-overcommitted, doing a memory report can have significant memory overhead due to uncompressing parts of the process's heap while gathering usage information.  When every process tries to run its memory reporters concurrently, the resulting memory pressure can cause some of them to be killed before they can report.  We should limit memory reporting concurrency to prevent that — on B2G, we probably want to just serialize reporting, at least on smaller devices.

See also bug 1145005 comment #5.

(This bug is just about the core memory reporting, not GC/CC logging.)
Whiteboard: [MemShrink]
Whiteboard: [MemShrink] → [MemShrink:P1]
tracking-e10s: --- → -
Comment on attachment 8599441 [details] [diff] [review]
Patch: Limit concurrency of e10s memory reporting.

Review of attachment 8599441 [details] [diff] [review]:
-----------------------------------------------------------------

r=me. 2 small comments.

::: dom/ipc/ContentParent.cpp
@@ -638,5 @@
>  static const char* sObserverTopics[] = {
>      "xpcom-shutdown",
>      "profile-before-change",
>      NS_IPC_IOSERVICE_SET_OFFLINE_TOPIC,
> -    "child-memory-reporter-request",

min: There's another reference to this string in a comment that should be purged: https://hg.mozilla.org/mozilla-central/annotate/34828fed1639/xpcom/base/nsMemoryReporterManager.cpp#l1510

::: modules/libpref/init/all.js
@@ +4711,5 @@
> +#ifdef MOZ_WIDGET_GONK
> +// Bug 1154053: Serialize B2G memory reports; smaller devices are
> +// usually overcommitted on memory by using zRAM, so memory reporting
> +// causes memory pressure from uncompressing cold heap memory.
> +pref("memory.report_concurrency", 1);

Just curious, did this seem to take significantly longer? If so should we adjust |kTimeoutLengthMS|?
Attachment #8599441 - Flags: review?(erahm) → review+
(In reply to Eric Rahm [:erahm] from comment #2)
> There's another reference to this string in a comment that should be purged:
> https://hg.mozilla.org/mozilla-central/annotate/34828fed1639/xpcom/base/
> nsMemoryReporterManager.cpp#l1510

Nice catch; thanks.
 
> ::: modules/libpref/init/all.js
> @@ +4711,5 @@
> > +#ifdef MOZ_WIDGET_GONK
> > +// Bug 1154053: Serialize B2G memory reports; smaller devices are
> > +// usually overcommitted on memory by using zRAM, so memory reporting
> > +// causes memory pressure from uncompressing cold heap memory.
> > +pref("memory.report_concurrency", 1);
> 
> Just curious, did this seem to take significantly longer? If so should we
> adjust |kTimeoutLengthMS|?

I just tested (made more time-consuming by some unreliability with the Nuwa/prelaunch stuff that made it hard to get the same arrangement of child processes before and after; might be bug 1151672) on my Flame, which has 2 cores, with the full GiB of memory allowed.  There's a slight increase in total elapsed time, but it's <10%.  (I also noticed that, even with 8 children, the parent report takes slightly more than half the total time.)  So we're probably safe, especially given that the devices that would hit the limit first would probably be single-core.
Carrying over r+; thanks for the review.  Rebased and comment edited as noted above.
Attachment #8602490 - Flags: review+
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f15f9f1de267 (this is the earlier version, but the only difference is a comment)
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/e05936099653
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
Summary: Limit concurrency on e10s memory reports → Limit concurrency of e10s memory reports
I find this just about as unlikely as you will, but, based on https://treeherder.mozilla.org/#/jobs?repo=fx-team&filter-searchStr=a665adcc8cbca924f012ec7bd55dd3f2653b80fe&fromchange=eac6ac60b5e6&tochange=e05936099653 and this being the only thing in that range which wasn't browser/ and thus not built by Android, I backed it out in https://hg.mozilla.org/mozilla-central/rev/356e735fa908

The spew of info about that crash (because it's Android, and nobody actually bothers to file bugs specifically about specific Android failures) goes through bug 1162833, bug 1161150, bug 1159360. At least.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(In reply to David Baron [:dbaron] ⏰UTC+2 (busy, returning May 21) from comment #9)
> and bug 1137183 has been backed out.

In case anyone reading this is confused, it looks like that was bug 1137483.
https://hg.mozilla.org/mozilla-central/rev/5df6a8eccc53
Status: REOPENED → RESOLVED
Closed: 4 years ago4 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.