Last Comment Bug 718100 - Web workers should GC more
: Web workers should GC more
Status: VERIFIED FIXED
[MemShrink:P2][qa!]
:
Product: Core
Classification: Components
Component: DOM (show other bugs)
: unspecified
: All All
: -- normal (vote)
: mozilla12
Assigned To: Ben Turner (not reading bugmail, use the needinfo flag!)
:
Mentors:
http://syntensity.com/static/espeak.html
Depends on:
Blocks: 710501
  Show dependency treegraph
 
Reported: 2012-01-13 15:44 PST by Alon Zakai (:azakai)
Modified: 2013-10-10 09:54 PDT (History)
12 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
affected
affected
verified
verified


Attachments
Patch, v1 (18.07 KB, patch)
2012-01-16 14:41 PST, Ben Turner (not reading bugmail, use the needinfo flag!)
no flags Details | Diff | Review
Patch, v1 (18.07 KB, patch)
2012-01-16 15:01 PST, Ben Turner (not reading bugmail, use the needinfo flag!)
mrbkap: review+
akeybl: approval‑mozilla‑aurora+
Details | Diff | Review

Description Alon Zakai (:azakai) 2012-01-13 15:44:41 PST
STR:

1. Load http://syntensity.com/static/espeak.html
2. Press 'Go!'
3. Hear speech synthesis output
4. Look in about:memory. I see

│  ├──232.70 MB (62.50%) -- workers(syntensity.com)
│  │  └──232.70 MB (62.50%) -- worker(speakWorker.js, 0xb747d52c88536000)
│  │     ├──225.34 MB (60.53%) -- compartment(Web Worker)
│  │     │  ├──192.36 MB (51.67%) -- analysis-temporary

Furthermore, the tooltip on analysis-temporary says it should be cleared away on GC, but none of GC, CC, or Minimize memory usage do anything to it. It is only cleared up when the worker is shut down (when the page is closed).

This doesn't happen if the same code is *not* in a web worker. Without a worker, analysis-temporary remains very very small. Should I move this from JS Engine to the Workers component?
Comment 1 Brian Hackett (:bhackett) 2012-01-13 16:33:20 PST
Bug 706914 should help with reducing the analysis temporary information (that should land early next week, so can check then), but I would guess the about:memory GC buttons are not hooked into the worker runtimes (workers are basically totally isolated from the rest of the JS code in the browser) and that heuristics for triggering GCs in worker runtimes may need improvement.
Comment 2 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2012-01-13 16:37:09 PST
The Minimize memory usage button generates a memory pressure event, which potentially could trigger some behavior in workers.  I don't know if it actually does though.
Comment 3 Alon Zakai (:azakai) 2012-01-13 16:47:41 PST
bent, do you know what the GC heuristics are for workers? The data in this bug seems to imply we can hold on to lots of memory for too long.
Comment 4 Bill McCloskey (:billm) 2012-01-13 16:50:22 PST
Most of our GCs come from nsJSEnvironment, which I think doesn't do anything for workers. So we only GC them if they hit the allocation triggers. We should have the worker event loop call JS_MaybeGC.
Comment 5 Ben Turner (not reading bugmail, use the needinfo flag!) 2012-01-13 17:50:29 PST
Sorry, I was told that I didn't need to schedule explicit GC these days... Apparently that is not the case! I can fix this for workers.
Comment 6 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2012-01-14 06:44:17 PST
Nick, I think Bill was proposing a more general trigger than memory pressure: Workers should GC when the GC and Minimize Memory usage buttons are pressed, but also see if they should GC during the event loop.
Comment 7 Ben Turner (not reading bugmail, use the needinfo flag!) 2012-01-16 14:41:52 PST
Created attachment 589028 [details] [diff] [review]
Patch, v1

This patch does three things:

1. RuntimeService watches for "child-gc-request" and "memory-pressure" notifications. "child-gc-request" does a normal GC and "memory-pressure" does a shrinking GC on all workers.
2. Workers have a 30 second timer that will trigger a normal GC during continuous running scripts. If the event queue empties then...
3. Workers have a 5 second timer that will trigger a shrinking GC after the worker goes idle. If another event comes in then the timer is canceled and the 30 second timer starts again.

Sound like a decent strategy to hit all the use cases here?
Comment 8 Alon Zakai (:azakai) 2012-01-16 15:01:15 PST
Sounds like you're already doing this, but from a power usage perspective, please make sure the new timers here do not run all the time (even if just once per 30 seconds) if the worker is not active.
Comment 9 Ben Turner (not reading bugmail, use the needinfo flag!) 2012-01-16 15:01:58 PST
Created attachment 589037 [details] [diff] [review]
Patch, v1

Forgot to qref that patch.
Comment 10 Blake Kaplan (:mrbkap) (please use needinfo!) 2012-01-17 11:57:15 PST
Comment on attachment 589037 [details] [diff] [review]
Patch, v1

I asked bent for some more comments over IRC (seeing as how I'm stealing this review from sicking, I thought it was appropriate).

It'd be nice to change BROADCAST_ALL_WORKERS into a template, but my template-fu isn't strong enough to figure out the pointer-to-member-function jazz that'd be required.
Comment 11 Ben Turner (not reading bugmail, use the needinfo flag!) 2012-01-17 19:30:51 PST
https://hg.mozilla.org/integration/mozilla-inbound/rev/7080b6f34d32
Comment 13 Alon Zakai (:azakai) 2012-01-19 11:42:41 PST
I verified that this is fixed on today's nightly. Thanks everyone!
Comment 14 Ben Turner (not reading bugmail, use the needinfo flag!) 2012-01-25 02:09:20 PST
Comment on attachment 589037 [details] [diff] [review]
Patch, v1

[Approval Request Comment]
Regression caused by (bug #): bug 649537
User impact if declined: Workers don't GC properly, so memory usage can go way up and never come back down until leaving the page or closing the browser. Addons which make workers that last for the lifetime of the app will not free memory until shutdown.
Testing completed (on m-c, etc.): m-c, alon's emscripten testing
Risk to taking this patch (and alternatives if risky): No regressions on m-c and no bugs filed yet so I think risk is pretty low. If anything crops up we could always back this out later.
Comment 15 Alex Keybl [:akeybl] 2012-01-25 18:13:08 PST
(In reply to ben turner [:bent] from comment #14)
> Comment on attachment 589037 [details] [diff] [review]
> Patch, v1
> 
> [Approval Request Comment]
> Risk to taking this patch (and alternatives if risky): No regressions on m-c
> and no bugs filed yet so I think risk is pretty low. If anything crops up we
> could always back this out later.

Do we know how common this memory leak is in other sites/add-ons that use web workers?

Are web workers used enough in add-ons and the web to warrant uplifting?

I'm hesitant to uplift a memshrink code change like this without significant user impact (going by the # of dupes).
Comment 16 Alon Zakai (:azakai) 2012-01-25 18:27:44 PST
(In reply to Alex Keybl [:akeybl] from comment #15)
> (In reply to ben turner [:bent] from comment #14)
> > Comment on attachment 589037 [details] [diff] [review]
> > Patch, v1
> > 
> > [Approval Request Comment]
> > Risk to taking this patch (and alternatives if risky): No regressions on m-c
> > and no bugs filed yet so I think risk is pretty low. If anything crops up we
> > could always back this out later.
> 
> Do we know how common this memory leak is in other sites/add-ons that use
> web workers?

This memory leak happens in *every* website that uses web workers right now (unless the website has no garbage to clean up in the first place, I guess).

> 
> Are web workers used enough in add-ons and the web to warrant uplifting?
> 

I don't know about addons. But IE9 doesn't support web workers, so they aren't used in any major mainstream website AFAIK. (IE10 however, will support workers.)

I think that fact goes both ways, though: One the one hand, not having many web workers on the web means there is little urgent need for this fix. But on the other hand, since the fix only affects those few websites (since it only modifies web worker code, not anything else in the browser), it is low risk.

Note that while few websites use workers now, workers are very important to the future of the web. We really want people to use them, so that web pages are responsive and fast. And it's a very bad first impression if a developer tries web workers now and sees this leak.
Comment 17 Alex Keybl [:akeybl] 2012-01-27 16:32:54 PST
Comment on attachment 589037 [details] [diff] [review]
Patch, v1

(In reply to Alon Zakai (:azakai) from comment #16)
> This memory leak happens in *every* website that uses web workers right now
> (unless the website has no garbage to clean up in the first place, I guess).

[Triage Comment]
Given this and the low risk of regression given the number of sites using workers, approving for Aurora.
Comment 18 Ben Turner (not reading bugmail, use the needinfo flag!) 2012-01-28 10:36:46 PST
Thanks!

https://hg.mozilla.org/releases/mozilla-aurora/rev/cae62722a5e7
Comment 19 Ioana (away) 2012-02-28 06:02:24 PST
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:11.0) Gecko/20100101 Firefox/11.0
Mozilla/5.0 (X11; Linux i686; rv:11.0) Gecko/20100101 Firefox/11.0
Mozilla/5.0 (Windows NT 6.1; rv:11.0) Gecko/20100101 Firefox/11.0

Verified with the steps from comment 0. analysis-temporary is cleaned up about 5 seconds after the sound stops playing, or when the user clicks the "GC" button.
Comment 20 Ioana (away) 2012-03-16 07:48:04 PDT
Verified as fixed on:
Mozilla/5.0 (Windows NT 6.1; rv:12.0) Gecko/20100101 Firefox/12.0
Mozilla/5.0 (X11; Linux i686; rv:12.0) Gecko/20100101 Firefox/12.0
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:12.0) Gecko/20100101 Firefox/12.0
20120314195616
Comment 21 Matthew Raymond 2013-10-10 06:51:57 PDT
There appears to have been a regression for this bug. I'm using Firefox 24, and I've been noticing a similar behavior with a plug-in I maintain that uses speak.js. The web worker will often run out of memory and I have to either restart Firefox or go to about:memory and click "Minimize memory usage" in order to get the plug-in working again. Similarly, I have problems with demo pages using speak.js.

STR:

1. Load http://projects.mattytemple.com/speak-js/
2. Press "Say 'Hello World!'"
3. Hear speech synthesis output.
4. Go to about:memory in another tab and press "Measure". I have the following:

│  ├──133.01 MB (34.97%) -- workers(mattytemple.com)/worker(speakWorker.js, 0x5337400)
│  │  └──94.35 MB (24.81%) -- zone(0xebf3400)
│  │     ├──93.51 MB (24.59%) -- compartment(web-worker)

These numbers appear to stay at these levels indefinitely, but if I press "Minimize memory usage" and then "Measure", I get this:

│  ├──24.68 MB (08.64%) -- workers(mattytemple.com)/worker(speakWorker.js, 0x5337400)
│  │  └──15.30 MB (05.36%) -- zone(0xebf3400)
│  │     ├──15.11 MB (05.29%) -- compartment(web-worker)

Go back to the demo page and repeatedly press the "Say 'Hello World!'" button. You will see the memory used by the web worker continuously rise with each button press and never fall.
Comment 22 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2013-10-10 07:06:40 PDT
Please file a new bug for a new issue.  You can use the "clone this bug" in the lower right corner to get started with filing.  Thanks.
Comment 23 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2013-10-10 09:54:19 PDT
(Matthew Raymond filed bug 925343 for that issue.)

Note You need to log in before you can comment on or make changes to this bug.