If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

Crash in OOM | large | mozalloc_abort | mozalloc_handle_oom | moz_xrealloc | nsTArray_base<T>::EnsureCapacity<T> | mozilla::dom::CanvasRenderingContext2D::Save

VERIFIED FIXED in Firefox 50

Status

()

Core
Canvas: 2D
--
critical
VERIFIED FIXED
a year ago
11 months ago

People

(Reporter: ashughes, Assigned: eflores)

Tracking

({crash})

41 Branch
mozilla51
x86
Windows
crash
Points:
---
Bug Flags:
qe-verify +

Firefox Tracking Flags

(firefox46 wontfix, firefox47 wontfix, firefox48 wontfix, firefox49 wontfix, firefox-esr45 wontfix, firefox50 verified, firefox51 verified)

Details

(Whiteboard: [gfx-noted], crash signature)

Attachments

(2 attachments)

(Reporter)

Description

a year ago
This bug was filed from the Socorro interface and is 
report bp-45a44ef8-2fea-4a73-83af-cdd8d2160511.
=============================================================
0 	mozglue.dll 	mozalloc_abort(char const* const) 	memory/mozalloc/mozalloc_abort.cpp:33
1 	mozglue.dll 	mozalloc_handle_oom(unsigned int) 	memory/mozalloc/mozalloc_oom.cpp:46
2 	mozglue.dll 	moz_xrealloc 	memory/mozalloc/mozalloc.cpp:107
3 	xul.dll 	nsTArray_base<nsTArrayInfallibleAllocator, nsTArray_CopyWithMemutils>::EnsureCapacity<nsTArrayInfallibleAllocator>(unsigned int, unsigned int) 	xpcom/glue/nsTArray-inl.h:183
4 	xul.dll 	mozilla::dom::CanvasRenderingContext2D::Save() 	dom/canvas/CanvasRenderingContext2D.cpp:1734
5 	xul.dll 	mozilla::dom::CanvasRenderingContext2DBinding::save 	obj-firefox/dom/bindings/CanvasRenderingContext2DBinding.cpp:1960
6 	xul.dll 	mozilla::dom::GenericBindingMethod(JSContext*, unsigned int, JS::Value*) 	dom/bindings/BindingUtils.cpp:2717
7 	xul.dll 	mozilla::SimpleTimer::Cancel() 	dom/media/VideoUtils.cpp:380
8 		@0x2fc6305f
=============================================================
More reports: https://crash-stats.mozilla.com/signature/?product=Firefox&signature=OOM+%7C+large+%7C+mozalloc_abort+%7C+mozalloc_handle_oom+%7C+moz_xrealloc+%7C+nsTArray_base%3CT%3E%3A%3AEnsureCapacity%3CT%3E+%7C+mozilla%3A%3Adom%3A%3ACanvasRenderingContext2D%3A%3ASave 	

These crashes show up going back as far as Firefox 41 so I don't think it's a recent regression but it does seem to be rising by a significant margin starting around March 23rd.
(Reporter)

Comment 1

a year ago
Created attachment 8751459 [details]
Crash Trend as of 2016-05-09
(Reporter)

Comment 2

a year ago
Multiple comments mention WeTransfer.com so that might be a lead. There does not appear to be a strong correlation to a particular system configuration (OS, GPU, Driver, etc).

Comment 3

a year ago
(In reply to Anthony Hughes (:ashughes) [GFX][QA][Mentor] from comment #2)
> Multiple comments mention WeTransfer.com so that might be a lead. There does
> not appear to be a strong correlation to a particular system configuration
> (OS, GPU, Driver, etc).

Do you mean going to WeTransfer.com in browser can reproduce it?
Flags: needinfo?(anthony.s.hughes)
(Reporter)

Comment 4

a year ago
(In reply to Vincent Liu[:vliu] from comment #3)
> Do you mean going to WeTransfer.com in browser can reproduce it?

For me personally, no. All I mean is that multiple users who report this crash have a page on wetransfer.com open when the crash occurs. I cannot say for certain that wetransfer.com is the cause. I only mention it in case people are looking for possible leads to test.
Flags: needinfo?(anthony.s.hughes)
Managed to reproduce runaway memory usage on WeTransfer.com.

One of their animations (progress bar, I think) calls CanvasRenderingContext2D.save() on every frame without a matching restore().

This causes our style state stack at [1] to grow slowly, but without bound.

We should probably bound the size of the stack and make it circular.

[1] https://dxr.mozilla.org/mozilla-central/source/dom/canvas/CanvasRenderingContext2D.cpp#1782
Assignee: nobody → edwin
Created attachment 8765973 [details] [diff] [review]
1272123.patch
Attachment #8765973 - Flags: review?(mstange)
I'm not sure that this is the best course of action. We could also just ignore the save() call.

Does this page modify any canvas state after the save call? If not, we don't have to actually allocate a new array element for the state. We don't have any optimization for that case but we could add one if it helps this page. It's what Chrome seems to do; look for BaseRenderingContext2D::realizeSaves(). (I didn't find any state array length checks in Chrome code.)
(In reply to Markus Stange [:mstange] from comment #7)
> I'm not sure that this is the best course of action. We could also just
> ignore the save() call.

I considered that, but this way it's more likely that the result will be correct.

> Does this page modify any canvas state after the save call? If not, we don't
> have to actually allocate a new array element for the state. We don't have
> any optimization for that case but we could add one if it helps this page.
> It's what Chrome seems to do; look for
> BaseRenderingContext2D::realizeSaves(). (I didn't find any state array
> length checks in Chrome code.)

The style does appear to be modified in between save() calls, but always goes back to the same state just before the call is made. We might be able to get away with comparing the two elements at the top of the stack.

I compared to Chromium, which doesn't exhibit the same runaway memory usage. AFAICT they *should* have the same problem; currently investigating why the don't.
Comment on attachment 8765973 [details] [diff] [review]
1272123.patch

Removing review request until you've found out why Chrome doesn't have this problem.
Attachment #8765973 - Flags: review?(mstange)
Crash volume for signature 'OOM | large | mozalloc_abort | mozalloc_handle_oom | moz_xrealloc | nsTArray_base<T>::EnsureCapacity<T> | mozilla::dom::CanvasRenderingContext2D::Save':
 - nightly (version 50): 5 crashes from 2016-06-06.
 - aurora  (version 49): 54 crashes from 2016-06-07.
 - beta    (version 48): 676 crashes from 2016-06-06.
 - release (version 47): 8073 crashes from 2016-05-31.
 - esr     (version 45): 983 crashes from 2016-04-07.

Crash volume on the last weeks:
             Week N-1   Week N-2   Week N-3   Week N-4   Week N-5   Week N-6   Week N-7
 - nightly          0          1          1          0          1          2          0
 - aurora           6          9         10          8          9          8          1
 - beta           106        112        101        115        108         79         19
 - release       1326       1616       1125       1191       1180        885        255
 - esr            137        101        114        125        109        101         69

Affected platforms: Windows, Linux
status-firefox49: --- → affected
status-firefox50: --- → affected
Ack. My bad.

Chrome does share this problem; it just didn't show up in my minimal test case (which didn't dirty the state in between save() calls) because of that optimisation.
Is there any way for a user to work around this problem until we get a good long term fix?
Flags: needinfo?(edwin)
Comment on attachment 8765973 [details] [diff] [review]
1272123.patch

AFAICT it's either this or comparing the whole structure on every possibly-dirty save(). Re-requesting review.
Flags: needinfo?(edwin)
Attachment #8765973 - Flags: review?(mstange)
Comment on attachment 8765973 [details] [diff] [review]
1272123.patch

Review of attachment 8765973 [details] [diff] [review]:
-----------------------------------------------------------------

Ok. Can you add a crashtest?
Attachment #8765973 - Flags: review?(mstange) → review+
> Ok. Can you add a crashtest?

I'm not sure we can crash fast enough. This is a super slow OOM that takes about 30 to 60s to crash on my machine; probably much, much slower on test infra.

Comment 16

a year ago
Pushed by eflores@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/ac598dfc00b2
Limit the size of CanvasRenderingContext2D::mStyleStack - r=mstange

Comment 17

a year ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/ac598dfc00b2
Status: NEW → RESOLVED
Last Resolved: a year ago
status-firefox51: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
Hi Edwin, could you please nominate for uplift to Aurora50?
Flags: needinfo?(edwin)
We might want that uplifted to beta too as the patch seems safe
status-firefox46: affected → wontfix
status-firefox47: affected → wontfix
status-firefox48: affected → wontfix
status-firefox-esr45: affected → wontfix
Too late for 49, do you want to request uplift to aurora/50?
status-firefox49: affected → wontfix
Flags: needinfo?(edwin)
mstange, eflores, This still could make it into 50.
Flags: needinfo?(mstange)
Flags: needinfo?(edwin)
Comment on attachment 8765973 [details] [diff] [review]
1272123.patch

Approval Request Comment
[Feature/regressing bug #]: Canvas 2D.
[User impact if declined]: Crashes on some tablets when viewing 2D canvas content.
[Describe test coverage new/current, TreeHerder]: Been on m-c for about a month.
[Risks and why]: Very low risk, if any.
[String/UUID change made/needed]: None.
Flags: needinfo?(mstange)
Flags: needinfo?(edwin)
Attachment #8765973 - Flags: approval-mozilla-aurora?
Comment on attachment 8765973 [details] [diff] [review]
1272123.patch

Crash fix, Aurora50+
Attachment #8765973 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+

Comment 24

a year ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-aurora/rev/19ba3c80b6f3
status-firefox50: affected → fixed
This has started spiking recently for users with the ru locale (~42% of crashes with this signature have useragent_locale=ru vs ~9% overall in 48).
Probably because of this website: http://kinogo.club/
Flags: qe-verify+
I couldn't reproduce the crash on Microsoft Surface Pro with Windows 10 64-bit.
Still, there are no new crashes in Socorro after the fix landed so I'm marking this as verified based on that.
Status: RESOLVED → VERIFIED
status-firefox50: fixed → verified
status-firefox51: fixed → verified
You need to log in before you can comment on or make changes to this bug.