Last Comment Bug 828887 - Add some tunable GC parameters to workers
: Add some tunable GC parameters to workers
Status: RESOLVED FIXED
[MemShrink]
:
Product: Core
Classification: Components
Component: DOM: Workers (show other bugs)
: unspecified
: All All
: -- normal (vote)
: mozilla21
Assigned To: Gregor Wagner [:gwagner]
:
Mentors:
Depends on: 841041 841962
Blocks: slim-fast 829482
  Show dependency treegraph
 
Reported: 2013-01-10 02:24 PST by James Lal [:lightsofapollo] (inactive)
Modified: 2013-02-15 16:12 PST (History)
10 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
-
wontfix
wontfix
+
fixed


Attachments
Partially through a sync operation before a OOM crash in FFOS. (1.32 MB, text/plain)
2013-01-10 02:24 PST, James Lal [:lightsofapollo] (inactive)
no flags Details
patch (14.50 KB, patch)
2013-01-10 06:20 PST, Gregor Wagner [:gwagner]
no flags Details | Diff | Review
better patch (16.94 KB, patch)
2013-01-10 14:01 PST, Gregor Wagner [:gwagner]
bent.mozilla: review+
justin.lebar+bug: approval‑mozilla‑b2g18+
Details | Diff | Review

Description James Lal [:lightsofapollo] (inactive) 2013-01-10 02:24:34 PST
Created attachment 700261 [details]
Partially through a sync operation before a OOM crash in FFOS.

Accounts for some of the memory issues in the gaia calendar app which essentially is constantly streaming data (via xhr + moz-chunked-text) and parsing xml/ical data.

In my case the worker is always busy with some CPU/memory intensive task until we kill it so it is never idle.
Comment 1 Justin Lebar (not reading bugmail) 2013-01-10 02:26:12 PST
Does your worker return to the event loop regularly, or does it get a whole bunch of data at the beginning and then process it all without ever returning to the event loop?

We need to make both cases work, but I specifically want to make sure we fix your case.
Comment 2 Justin Lebar (not reading bugmail) 2013-01-10 02:45:44 PST
Here's the bad part of attachment 700261 [details]:

> 61.04 MB (100.0%) -- explicit
> ├──31.24 MB (51.18%) -- workers/workers(gaiamobile.org)/worker(/caldav_worker.js?time=1357810196506, 43fc4800)
> │  ├──22.57 MB (36.97%) -- gc-heap
> │  │  ├──20.19 MB (33.08%) ── unused-arenas
> │  │  ├───2.00 MB (03.28%) ── unused-chunks

22.5mb of decommitable JS memory.  Ouch.
Comment 3 Gregor Wagner [:gwagner] 2013-01-10 03:05:29 PST
We currently have a 30MB allocation allowance before we start GCs in workers. We have many workers so we should definitely change this.
Comment 4 Lucas Adamski [:ladamski] 2013-01-10 05:07:47 PST
Its hard to determine bb+ with the information in this bug.  Can you help us better understand the user impact of this bug?  Just OOM crashing for Calendar app?  How frequent?
Comment 5 Gregor Wagner [:gwagner] 2013-01-10 05:17:54 PST
(In reply to Lucas Adamski from comment #4)
> Its hard to determine bb+ with the information in this bug.  Can you help us
> better understand the user impact of this bug?  Just OOM crashing for
> Calendar app?  How frequent?

I am still investigating. The problem is that JS workers get the default allocation threshold from desktop firefox. This is optimized for not performing GC during startup and is set to 30MB.

We have many workers on B2G and depending on how the worker is used it might always allocate 30MB. Note that the threshold is per worker and not for all the workers combined.

I tested a patch with James that reduces the GC threshold to 3MB for workers. The worker in question here reduced the footprint from 30MB to around 3-5 MB.

I haven't looked at other workers we are using but if we do some memory intensive computation we are in a much better situation with this patch.
Comment 6 Justin Lebar (not reading bugmail) 2013-01-10 05:19:53 PST
(In reply to Lucas Adamski from comment #4)
> Its hard to determine bb+ with the information in this bug.  Can you help us
> better understand the user impact of this bug?  Just OOM crashing for
> Calendar app?  How frequent?

It's also going to cause excessive memory usage in any app which uses workers to do complicated but finite tasks (e.g. syncing stuff).  So I expect we'd see high memory usage in the mail app as well as a result of this bug.

The user impact of high memory usage is crashes in the worst case, and otherwise reduced multitasking (e.g. your music app in the background crashes when you sync your e-mail).
Comment 7 Lucas Adamski [:ladamski] 2013-01-10 05:30:29 PST
Sounds like a very simple fix tho with a risk for regressions.  We'd definitely take a fix but don't think we'd stop ship of the phone for this.
Comment 8 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2013-01-10 05:57:36 PST
Do any workers in the chrome process allocate memory dynamically?
Comment 9 Gregor Wagner [:gwagner] 2013-01-10 06:20:45 PST
Created attachment 700376 [details] [diff] [review]
patch
Comment 10 Gregor Wagner [:gwagner] 2013-01-10 14:01:56 PST
Created attachment 700667 [details] [diff] [review]
better patch
Comment 11 Ben Turner (not reading bugmail, use the needinfo flag!) 2013-01-10 14:20:29 PST
Comment on attachment 700667 [details] [diff] [review]
better patch

Review of attachment 700667 [details] [diff] [review]:
-----------------------------------------------------------------

Awesome!

::: dom/workers/RuntimeService.cpp
@@ +1384,2 @@
>  {
> +  BROADCAST_ALL_WORKERS(UpdateJSWorkerMemoryParameter, key, GetDefaultJSWorkerMemoryParameter(key));

Nit: 80 chars wrapping

::: dom/workers/RuntimeService.h
@@ +168,5 @@
> +      case JSGC_ALLOCATION_THRESHOLD:
> +        return sDefaultJSAllocationThreshold;
> +      case JSGC_MAX_BYTES:
> +        return sDefaultJSRuntimeHeapSize;
> +      default:

MOZ_NOT_REACHED again.

::: dom/workers/WorkerPrivate.cpp
@@ +1839,5 @@
>      AssertIsOnMainThread();
>  
>      mJSContextOptions = RuntimeService::GetDefaultJSContextOptions();
> +    mJSRuntimeHeapSize = RuntimeService::GetDefaultJSWorkerMemoryParameter(JSGC_MAX_BYTES);
> +    mJSWorkerAllocationThreshold = RuntimeService::GetDefaultJSWorkerMemoryParameter(JSGC_ALLOCATION_THRESHOLD);

Nit: Wrap this so it fits under 80 chars.

@@ +2218,5 @@
> +    case JSGC_MAX_BYTES:
> +      mJSRuntimeHeapSize = aValue;
> +      break;
> +    default:
> +      break;

I'd just s/break/MOZ_NOT_REACHED/ here.

@@ +2222,5 @@
> +      break;
> +  }
> +
> +  nsRefPtr<UpdateJSWorkerMemoryParameterRunnable> runnable =
> +    new UpdateJSWorkerMemoryParameterRunnable(ParentAsWorkerPrivate(), aKey, aValue);

Nit: 80 chars?
Comment 13 Gregor Wagner [:gwagner] 2013-01-11 00:38:12 PST
Comment on attachment 700667 [details] [diff] [review]
better patch

[Approval Request Comment]
Bug caused by (feature/regressing bug #): none
User impact if declined: OOM situations and crashes when using workers (email, calendar, ...)
Testing completed: yes
Risk to taking this patch (and alternatives if risky): 
String or UUID changes made by this patch: none
Comment 14 Justin Lebar (not reading bugmail) 2013-01-11 01:43:36 PST
This patch doesn't fix the bug as described, right?  We do more incremental GCs with this patch (which is good), but this patch doesn't make us do more shrinking GCs, right?

Perhaps we should morph this bug and file a new bug for the symptoms originally described here.
Comment 15 Justin Lebar (not reading bugmail) 2013-01-11 02:03:42 PST
Comment on attachment 700667 [details] [diff] [review]
better patch

Considering how much we use workers, I think the reward here outweighs the risk.  All we're doing here is setting JS runtime params for worker runtimes that we already set for the main JS runtime.

Nobody has told me that my memory-fix approval powers have run out, so a=me, I guess!
Comment 16 Andrea Marchesini (:baku) 2013-01-11 02:43:01 PST
https://hg.mozilla.org/releases/mozilla-b2g18/rev/1fa736d4b087
Comment 17 Ed Morley [:emorley] 2013-01-11 06:29:51 PST
https://hg.mozilla.org/mozilla-central/rev/b5bf34b9a55e
Comment 18 Ed Morley [:emorley] 2013-01-11 06:58:43 PST
https://hg.mozilla.org/mozilla-central/rev/b5bf34b9a55e
Comment 19 Ed Morley [:emorley] 2013-01-11 07:46:57 PST
https://hg.mozilla.org/mozilla-central/rev/b5bf34b9a55e

Note You need to log in before you can comment on or make changes to this bug.