Last Comment Bug 730753 - OOM crash with repeatedly calling canvas getContext
: OOM crash with repeatedly calling canvas getContext
Status: RESOLVED FIXED
[MemShrink:P2]
: crash, regression, testcase
Product: Core
Classification: Components
Component: DOM (show other bugs)
: 12 Branch
: x86 Windows 7
: -- critical (vote)
: mozilla13
Assigned To: Olli Pettay [:smaug]
:
Mentors:
Depends on: 730853
Blocks:
  Show dependency treegraph
 
Reported: 2012-02-26 20:56 PST by Alice0775 White
Modified: 2012-04-15 23:30 PDT (History)
10 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
-
affected
-


Attachments
patch (2.89 KB, patch)
2012-02-27 05:49 PST, Olli Pettay [:smaug]
continuation: review+
Details | Diff | Splinter Review
patch (3.37 KB, patch)
2012-02-27 06:48 PST, Olli Pettay [:smaug]
akeybl: approval‑mozilla‑aurora+
Details | Diff | Splinter Review
for aurora (3.37 KB, patch)
2012-02-28 02:42 PST, Olli Pettay [:smaug]
no flags Details | Diff | Splinter Review

Description Alice0775 White 2012-02-26 20:56:02 PST
Build Identifier:
http://hg.mozilla.org/mozilla-central/rev/ce20e9b47e9c
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20120225 Firefox/13.0a1 ID:20120225031723

When I test Bug 730698 attachment 600788 [details], browser crashes bp-20846c43-1e86-459b-a638-e25832120226


Reproducible: Always

Steps to Reproduce:
1. Start Firefox with new profile
2. Run firefoxcrashtest.html of attachment 600788 [details]

Actual Results:
  Browser crashes

Expected Results:
  No crash

Regression window(m-c)
Browser does not crash for about 1 min. And I can barely operate it:
http://hg.mozilla.org/mozilla-central/rev/f410bdf30132
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0a1) Gecko/20120130 Firefox/12.0a1 ID:20120130122131
Browser crashes within 30sec:
http://hg.mozilla.org/mozilla-central/rev/89bb79343f73
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:12.0a1) Gecko/20120130 Firefox/12.0a1 ID:20120130125731
Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=f410bdf30132&tochange=89bb79343f73
Comment 1 Boris Zbarsky [:bz] 2012-02-26 22:02:54 PST
This sounds like a CC regression...
Comment 2 Olli Pettay [:smaug] 2012-02-27 04:57:52 PST
My initial guess is that we end up calling CC fewer times, so we don't release memory
fast enough.
Comment 3 Olli Pettay [:smaug] 2012-02-27 05:05:27 PST
Unfortunately I can't reproduce this on linux.
Comment 4 Olli Pettay [:smaug] 2012-02-27 05:14:45 PST
I'll post a patch to tryserver. Hopefully someone can test it.
Comment 5 Olli Pettay [:smaug] 2012-02-27 05:49:45 PST
Created attachment 600877 [details] [diff] [review]
patch

This brings back similar behavior what we have in <FF12.
Yes, there can be some more CCs, but fortunately they are a lot faster.
http://mxr.mozilla.org/mozilla-beta/source/dom/base/nsJSEnvironment.cpp#3430
http://mxr.mozilla.org/mozilla-beta/source/dom/base/nsJSEnvironment.cpp#3339

https://tbpl.mozilla.org/?tree=Try&rev=26f8245eb5b0

Please test the tryserver build
Look for 26f8245eb5b0 in
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/
Comment 6 Andrew McCreight [:mccr8] 2012-02-27 06:37:35 PST
Comment on attachment 600877 [details] [diff] [review]
patch

Review of attachment 600877 [details] [diff] [review]:
-----------------------------------------------------------------

Makes sense.  We really have no idea how much garbage a GC could have released, so we should err on the side of caution.

::: dom/base/nsJSEnvironment.cpp
@@ +180,5 @@
>  static PRUint32 sForgetSkippableBeforeCC = 0;
>  static PRUint32 sPreviousSuspectedCount = 0;
>  
>  static bool sCleanupSinceLastGC = true;
> +static bool sNeedsFullCC = false;

Do you need to set this to false in that init function, too?
Comment 7 Olli Pettay [:smaug] 2012-02-27 06:39:21 PST
(In reply to Andrew McCreight [:mccr8] from comment #6)
> > +static bool sNeedsFullCC = false;
> 
> Do you need to set this to false in that init function, too?
I could do that, just to be consistent.

(Someone should cleanup nsJSEnvironment.cpp)
Comment 8 Olli Pettay [:smaug] 2012-02-27 06:48:20 PST
Created attachment 600897 [details] [diff] [review]
patch
Comment 9 Olli Pettay [:smaug] 2012-02-27 07:01:28 PST
Comment on attachment 600897 [details] [diff] [review]
patch

[Approval Request Comment]
Regression caused by (bug #): bug 721543
User impact if declined: possible OOM in rare cases
Testing completed (on m-c, etc.): just landed to m-c
Risk to taking this patch (and alternatives if risky): bringing back old behavior,
should be low risk.
String changes made by this patch: N/A
Comment 10 Olli Pettay [:smaug] 2012-02-27 07:02:47 PST
https://hg.mozilla.org/mozilla-central/rev/8ea5c983743f
Comment 11 Olli Pettay [:smaug] 2012-02-27 07:31:45 PST
Could someone please test http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/opettay@mozilla.com-26f8245eb5b0/
Comment 12 Andrew McCreight [:mccr8] 2012-02-27 08:14:06 PST
I wonder if this will help with the intermittent hang I was getting in bug 727965.
Comment 13 Alice0775 White 2012-02-27 08:21:56 PST
(In reply to Olli Pettay [:smaug] from comment #11)
> Could someone please test
> http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/opettay@mozilla.
> com-26f8245eb5b0/

The try build crashes after 15-30sec :(
Comment 14 Olli Pettay [:smaug] 2012-02-27 08:25:38 PST
Boo. Then it is about something else.
We should still use the patch.

Could you check if GC and/or CC runs while running the test.
Set javascript.options.mem.log to true and check error console Messages.
Comment 15 Alice0775 White 2012-02-27 08:28:53 PST
And latest m-c tinderbox-builds also crashes after 15-30sec.
bp-2176ffdf-c6e3-4216-9f32-625232120227

http://hg.mozilla.org/mozilla-central/rev/8ea5c983743f
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20120227 Firefox/13.0a1 ID:20120227065949
Comment 16 Alice0775 White 2012-02-27 08:30:43 PST
Sorry, Accidentally removed flags
Comment 17 Alice0775 White 2012-02-27 08:42:22 PST
(In reply to Olli Pettay [:smaug] from comment #14)
> Could you check if GC and/or CC runs while running the test.
> Set javascript.options.mem.log to true and check error console Messages.

It is difficult to capture the log because the browser crashes and error console closes at same time.


GC(T+12.8) TotalTime: 3.4ms, Type: compartment, MMU(20ms): 83%, MMU(50ms): 93%, Reason: TOO_MUCH_MALLOC, +chunks: 0, -chunks: 0
mark: 1.5, mark-roots: 1.1, mark-other: 0.0, sweep: 1.6, sweep-obj: 0.5, sweep-string: 0.0, sweep-script: 0.0, sweep-shape: 0.0, discard-code: 0.0, discard-analysis: 0.1, xpconnect: 0.6, deallocate: 0.0

GC(T+13.6) TotalTime: 2.7ms, Type: compartment, MMU(20ms): 86%, MMU(50ms): 94%, Reason: TOO_MUCH_MALLOC, +chunks: 0, -chunks: 0
mark: 1.2, mark-roots: 0.9, mark-other: 0.0, sweep: 1.3, sweep-obj: 0.4, sweep-string: 0.0, sweep-script: 0.0, sweep-shape: 0.0, discard-code: 0.0, discard-analysis: 0.1, xpconnect: 0.4, deallocate: 0.0

GC(T+14.5) TotalTime: 2.9ms, Type: compartment, MMU(20ms): 85%, MMU(50ms): 94%, Reason: TOO_MUCH_MALLOC, +chunks: 0, -chunks: 0
mark: 1.2, mark-roots: 0.9, mark-other: 0.0, sweep: 1.5, sweep-obj: 0.5, sweep-string: 0.0, sweep-script: 0.0, sweep-shape: 0.0, discard-code: 0.0, discard-analysis: 0.1, xpconnect: 0.5, deallocate: 0.0
Comment 18 Alice0775 White 2012-02-27 08:45:59 PST
Comment #17 is the Nightly nightly build.
http://hg.mozilla.org/mozilla-central/rev/d1b2fd680235
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:13.0) Gecko/20120227 Firefox/13.0a1 ID:20120227031120
Comment 19 Olli Pettay [:smaug] 2012-02-27 08:46:52 PST
Oh, only compartment GC. That doesn't trigger CC.
Comment 20 Olli Pettay [:smaug] 2012-02-27 08:48:43 PST
Have we changed when full GC is triggered?
Comment 21 Alex Keybl [:akeybl] 2012-02-27 16:32:30 PST
Comment on attachment 600897 [details] [diff] [review]
patch

[Triage Comment]
Not clear this is absolutely necessary, but restores old behavior so approving for Aurora 12.
Comment 22 Olli Pettay [:smaug] 2012-02-28 02:42:09 PST
Created attachment 601219 [details] [diff] [review]
for aurora
Comment 23 Scoobidiver (away) 2012-02-28 02:52:28 PST
Is it fixed in the trunk?
Comment 24 Olli Pettay [:smaug] 2012-02-28 02:53:36 PST
This depends on bug 730853. The patch fixes one issue.
Comment 26 Andrew McCreight [:mccr8] 2012-02-28 11:26:49 PST
With this patch (well, and my patch in bug 728460), I'm seeing about 80% of CCs being triggered by this force CC.  It kind of makes the CC hostage to the GC if we want to try to call the CC less.  Kind of a shame, but I think this patch is probably the way to go, because otherwise you'll hit situations like this that are problematic.
Comment 27 Andrew McCreight [:mccr8] 2012-02-28 14:31:18 PST
Olli: would it make sense to trigger a CC every 5 or 10 compartmental GCs?
Comment 28 Andrew McCreight [:mccr8] 2012-02-29 08:57:51 PST
This probably belongs in another bug, but another approach to getting the GC to trigger a CC would be to dump all wrapped natives (etc.) into the purple buffer at the conclusion of a GC.  Then we could filter out everything held by a marked object using the normal methods.  That would probably work for this particular case, where it is allocating a huge number of cross-compartment cycles.
Comment 29 Scoobidiver (away) 2012-03-01 01:46:27 PST
Bug 730853 didn't land in 12.0 so it's not fully fixed.
Comment 30 Alex Keybl [:akeybl] 2012-03-19 16:14:07 PDT
(In reply to Scoobidiver from comment #29)
> Bug 730853 didn't land in 12.0 so it's not fully fixed.

Given the original risk evaluation that this would only occur in rare cases, untracking for 12 since we wouldn't take any more forward change for this.
Comment 31 Nicholas Nethercote [:njn] 2012-04-15 23:30:48 PDT
I think this is finished, correct?

Note You need to log in before you can comment on or make changes to this bug.