Closed Bug 598498 Opened 9 years ago Closed Last year

OOM thewildernessdowntown.com [@ cairo_d2d_create_brush_for_pattern ] [@ _cairo_d2d_create_brush_for_pattern(_cairo_d2d_surface*, _cairo_pattern const*, bool) ] [@ mozalloc_handle_oom() ]

Categories

(Core :: General, defect, critical)

x86
Windows 7
defect
Not set
critical

Tracking

()

RESOLVED WONTFIX
Tracking Status
blocking2.0 --- -

People

(Reporter: cade, Unassigned)

References

()

Details

(Keywords: crash, Whiteboard: [chromeexperiments])

Crash Data

Attachments

(3 files)

User-Agent:       Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b7pre) Gecko/20100921 Firefox/4.0b7pre
Build Identifier:  Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0b7pre) Gecko/20100921 Firefox/4.0b7pre

While running the Wilderness Downtown experiment, Minefield has been crashing at the same point in the song. I have reproduced it 4 times on my computer.

crash reports:  
http://bit.ly/9uLKNz
http://bit.ly/dBMXwt
http://bit.ly/ci8n77

Reproducible: Always

Steps to Reproduce:
1. navigate to www.thewildernessdowntown.com and enter an address.
2. press play, wait until after the first verse
3. browser should crash after getting very slow and skipping a lot in the playing videos.
Whiteboard: [chromeexperiments]
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: crash
Version: unspecified → Trunk
does not crash on: September 15, 2010
       crashes on: September 16, 2010

Change log: http://bit.ly/at87GI


ALSO: I noticed the crash happens when a window that is supposed to display google street view opens.
Hrm.  We had a _lot_ of stuff land there... and those crash reports seem to have no symbols.  Ted, any idea what's up with that?
We do have symbols for that build:
http://symbols.mozilla.org/firefox/firefox-4.0b7pre-WINNT-20100921041551-symbols.txt

I'm not sure why they weren't picked up for these crashes. I'll try updating to that build and test crashing.

The top frame is in _cairo_d2d_create_brush_for_pattern:
FUNC 649afc bd5 c _cairo_d2d_create_brush_for_pattern(_cairo_d2d_surface *,_cairo_pattern const *,bool)
Possible dupe of bug 589195 based on that.
I got symbols for my test crash, so it must have been a temporary Socorro issue:
http://crash-stats.mozilla.com/report/index/bp-81fc0fdb-6941-46f6-bc5e-585f62100922
Dan, what do you think of this crash, is this ctypes fallout?
No, that was just ted's test crash, which uses ctypes. The crash in comment 0 is different.
Ah, gotcha.  Thanks.
Chris, are you willing to try crashing on this again in hopes of getting symbols this time?
I can reproduce also on Windows 7 x64. With b7pre/20100922 build, I have the same crash with a different signature : bp-9aff732c-22fc-4a4e-ab02-2d8c82100922

This is not a temporary Socorro issue: see bug 593779 with a crash daily rate of 20-40 crashes/day that lasts for two weeks.
OK, sounds like we seriously need to sort the symbols thing out here.... or catch it in a debugger in a build with symbols.  Someone want to do the latter?
blocking2.0: --- → ?
Component: General → Graphics
QA Contact: general → thebes
Without symbols the signature will always vary between builds, because functions are not at the same offset, FYI.

Thanks for the info, I'll look into the symbols issue.

bz: the signature I gave in comment 3 should be right. You could also download the raw minidump from Socorro and use the symbol server to get a stack.
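For anyone following along, here is a minimal sketch of how a raw module offset maps back to a function name using a FUNC record like the one quoted in comment 3. It assumes the record layout "FUNC <address> <size> <param_size> <name>" with hex fields; the helper names are hypothetical and this is not Socorro's actual code. The offset 0x649e08 is the one from this bug's original summary (xul.dll@0x649e08).

```python
# Sketch only: resolve a module offset against Breakpad-style FUNC records.
# parse_func_record and symbolicate are hypothetical helper names.

FUNC_LINE = ("FUNC 649afc bd5 c _cairo_d2d_create_brush_for_pattern"
             "(_cairo_d2d_surface *,_cairo_pattern const *,bool)")


def parse_func_record(line):
    """Split a FUNC record into (start address, size, function name)."""
    tag, address, size, _param_size, name = line.split(" ", 4)
    if tag != "FUNC":
        raise ValueError("not a FUNC record: %r" % line)
    return int(address, 16), int(size, 16), name


def symbolicate(offset, records):
    """Return 'name+0xNN' for the record containing offset, else None."""
    for start, size, name in records:
        if start <= offset < start + size:
            return "%s+0x%x" % (name, offset - start)
    return None


records = [parse_func_record(FUNC_LINE)]
print(symbolicate(0x649e08, records))
```

Without the symbol file, all the crash processor has is the raw offset, and that offset moves whenever the function moves between builds, which is why the signature keeps changing.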
blocking2.0: ? → ---
Component: Graphics → General
bz: I tried crashing it again, but still did not get any symbols.

here's the crash report:
http://crash-stats.mozilla.com/report/index/bp-0914133f-402f-4f3e-bc6b-d7eb92100922
(In reply to comment #12)
> bz: I tried crashing it again, but still did not get any symbols.

Chris, find me on irc and I'll walk you through building this locally, so you can get a better stack.
Ok, the missing symbols is bug 559661. We've seen it before. For whatever reason this crash consistently hits it. I have a potential solution, I just need to get a few things in place to fix it.

(In reply to comment #0)
> crash reports:  
> http://bit.ly/9uLKNz

Stack:
ChildEBP RetAddr  
004284c8 6a935849 xul!_cairo_d2d_create_brush_for_pattern(struct _cairo_d2d_surface * d2dsurf = 0x0e4a2790, struct _cairo_pattern * pattern = 0x004285c4, bool unique = false)+0x30c [e:\builds\moz2_slave\mozilla-central-win32-nightly\build\gfx\cairo\cairo\src\cairo-d2d-surface.cpp @ 1729]
00428548 6a616de2 xul!_cairo_d2d_fill(void * surface = 0x0e4a2790, _cairo_operator op = CAIRO_OPERATOR_OVER (2), struct _cairo_pattern * source = 0x004285c4, struct _cairo_path_fixed * path = 0x0f703aa4, _cairo_fill_rule fill_rule = CAIRO_FILL_RULE_WINDING (0), double tolerance = 0.10000000000000001, _cairo_antialias antialias = CAIRO_ANTIALIAS_DEFAULT (0), struct _cairo_clip * clip = 0x00000000)+0x196 [e:\builds\moz2_slave\mozilla-central-win32-nightly\build\gfx\cairo\cairo\src\cairo-d2d-surface.cpp @ 3233]
00428580 6a639561 xul!_cairo_surface_fill(struct _cairo_surface * surface = <Memory access error>, _cairo_operator op = <Memory access error>, struct _cairo_pattern * source = <Memory access error>, struct _cairo_path_fixed * path = <Memory access error>, _cairo_fill_rule fill_rule = <Memory access error>, double tolerance = <Memory access error>, _cairo_antialias antialias = <Memory access error>, struct _cairo_clip * clip = <Memory access error>)+0x72 [e:\builds\moz2_slave\mozilla-central-win32-nightly\build\gfx\cairo\cairo\src\cairo-surface.c @ 2216]
004286a8 6a618b8b xul!_cairo_gstate_fill(struct _cairo_gstate * gstate = <Memory access error>, struct _cairo_path_fixed * path = <Memory access error>)+0xf1 [e:\builds\moz2_slave\mozilla-central-win32-nightly\build\gfx\cairo\cairo\src\cairo-gstate.c @ 1184]
004286b8 6a828daf xul!_moz_cairo_fill_preserve(struct _cairo * cr = 0x6abb8241)+0x1b [e:\builds\moz2_slave\mozilla-central-win32-nightly\build\gfx\cairo\cairo\src\cairo.c @ 2338]
004286c0 6aac5d91 xul!gfxContext::Fill(void)+0x8 [e:\builds\moz2_slave\mozilla-central-win32-nightly\build\gfx\thebes\gfxcontext.cpp @ 151]
0042882c 6abb8241 xul!nsCanvasRenderingContext2D::DrawImage(class nsIDOMElement * imgElt = 0x09165ef4, float a1 = 650.0098267, float a2 = 265.4313049, float a3 = 1, float a4 = 287.3243408, float a5 = 102, float a6 = 0, float a7 = 1, float a8 = 300, unsigned char optional_argc = 0x06 '')+0x522 [e:\builds\moz2_slave\mozilla-central-win32-nightly\build\content\canvas\src\nscanvasrenderingcontext2d.cpp @ 3570]
00428960 6a447f54 xul!nsIDOMCanvasRenderingContext2D_DrawImage(struct JSContext * cx = 0x076d9a80, unsigned int argc = 9, unsigned int64 * vp = 0x004288f8)+0x295 [e:\builds\moz2_slave\mozilla-central-win32-nightly\build\obj-firefox\js\src\xpconnect\src\dom_quickstubs.cpp @ 3517]
00428a00 6a4482ec xul!js::ExecuteTree(struct JSContext * cx = 0x02cf7170, struct js::TreeFragment * f = 0x6a447f54, unsigned int * inlineCallCount = 0x14855240, struct js::VMSideExit ** innermostNestedGuardp = 0x00428a58, struct js::VMSideExit ** lrp = 0x0403109c)+0x204 [e:\builds\moz2_slave\mozilla-central-win32-nightly\build\js\src\jstracer.cpp @ 6622]
00428a58 6a449c90 xul!js::MonitorTracePoint(struct JSContext * cx = 0x076d9a80, unsigned int * inlineCallCount = 0x00428a7c, bool * blacklist = 0x00428a84)+0x29c [e:\builds\moz2_slave\mozilla-central-win32-nightly\build\js\src\jstracer.cpp @ 16254]
00428a7c 6a441a08 xul!RunTracer(struct js::VMFrame * f = 0x6a49e21d, struct js::mjit::ic::MICInfo * mic = 0x076d9a80)+0x30 [e:\builds\moz2_slave\mozilla-central-win32-nightly\build\js\src\methodjit\invokehelpers.cpp @ 820]
00428acc 6a49e21d xul!js::mjit::stubs::InvokeTracer(struct js::VMFrame * f = 0x00000001, unsigned long index = 0x428490)+0x28 [e:\builds\moz2_slave\mozilla-central-win32-nightly\build\js\src\methodjit\invokehelpers.cpp @ 923]
00428af0 73048cb3 xul!js::Invoke(struct JSContext * cx = 0x076d9a80, struct js::CallArgs * argsRef = 0x044f0118, unsigned long flags = 0x17a30bb0)+0x2dd [e:\builds\moz2_slave\mozilla-central-win32-nightly\build\js\src\jsinterp.cpp @ 592]
00428b74 6a4c9993 mozcrt19!free(void * ptr = <Memory access error>)+0x13 [e:\builds\moz2_slave\mozilla-central-win32-nightly\build\obj-firefox\memory\jemalloc\crtsrc\jemalloc.c @ 6121]
00428b7c 40000000 xul!nsSprocketLayout::PopulateBoxSizes(class nsIFrame * aBox = <Memory access error>, class nsBoxLayoutState * aState = <Memory access error>, class nsBoxSize ** aBoxSizes = <Memory access error>, int * aMinSize = <Memory access error>, int * aMaxSize = <Memory access error>, int * aFlexes = <Memory access error>)+0x773 [e:\builds\moz2_slave\mozilla-central-win32-nightly\build\layout\xul\base\src\nssprocketlayout.cpp @ 924]
Ted, thanks!
blocking2.0: --- → ?
Component: General → Graphics
Depends on: 589195
Summary: Crash @ [@ xul.dll@0x649e08 ] → Wilderness Downtown experiment crash [@ cairo_d2d_create_brush_for_pattern ]
Bas, can you reproduce this?
Summary: Wilderness Downtown experiment crash [@ cairo_d2d_create_brush_for_pattern ] → [D2D] Wilderness Downtown experiment crash [@ cairo_d2d_create_brush_for_pattern ]
I cannot reproduce this. I might add that this is a likely place to crash in an out-of-memory situation, since it allocates a relatively large block. The test is also quite resource heavy; I wonder if I'm not crashing because I've got more resources.
This is a bit common:
1 relevant comment: "Everytime I scroll through google images, it crashes. :("
https://crash-stats.mozilla.com/report/list?product=Firefox&build_id=&query_search=signature&query_type=contains&query=cairo_d2d_create_brush_for_pattern&date=09%2F22%2F2010%2009%3A34%3A47&range_value=1&range_unit=weeks&hang_type=any&process_type=any&plugin_field=&plugin_query_type=&plugin_query=&do_query=1&signature=_cairo_d2d_create_brush_for_pattern%28_cairo_d2d_surface*%2C%20_cairo_pattern%20const*%2C%20bool%29&missing_sig=&page=1

It's consistently a null-deref crash. Assuming we can reproduce it reliably and it's not real OOM, it should block.
Summary: [D2D] Wilderness Downtown experiment crash [@ cairo_d2d_create_brush_for_pattern ] → [D2D] Wilderness Downtown experiment crash [@ cairo_d2d_create_brush_for_pattern ][@ _cairo_d2d_create_brush_for_pattern(_cairo_d2d_surface*, _cairo_pattern const*, bool) ]
This video basically causes us to happily go on creating D2D surfaces into infinity. I will investigate this. At the moment I have no idea why.
Someone should probably dupe bug 589195 one way or the other.
Assignee: nobody → bas.schouten
Status: NEW → ASSIGNED
Bug 589195 reports two problems: memory usage that increases exponentially, and the crash. Only the latter is a dupe of this bug, not the former.
> since it allocates a relatively large block

Bas, is it a fixed-size block?  Or is its size under content control?
(In reply to comment #19)
> This video basically causes us to happily go on creating D2D surfaces into
> infinity. I will investigate this. At the moment I have no idea why.

Ignore this comment; it seems the memory reporting with D3D9/D2D interop is flaky. I'm fixing this, but sadly it leaves us with no lead to the real OOM. I'm still not crashing, fwiw.
(In reply to comment #22)
> > since it allocates a relatively large block
> 
> Bas, is it a fixed-size block?  Or is its size under content control?

It depends on the size of the gfxImageSurface that it's mirroring.
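To put a rough number on "depends on the size of the image surface", here is a back-of-the-envelope sketch. It assumes 4 bytes per pixel (a 32-bit ARGB format) and no row padding; both are assumptions for illustration, not values taken from the cairo or D2D source, and the function name is hypothetical.

```python
# Sketch only: estimate the single block needed to mirror an image surface.
# Assumes 4 bytes per pixel and no stride padding (both are assumptions).

BYTES_PER_PIXEL = 4


def mirrored_block_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Bytes needed for one contiguous copy of a width x height surface."""
    return width * height * bytes_per_pixel


for width, height in [(640, 360), (1920, 1080), (4096, 4096)]:
    size = mirrored_block_bytes(width, height)
    print("%dx%d -> %.1f MiB" % (width, height, size / (1024.0 * 1024.0)))
```

Since the dimensions come from whatever the page draws, the block size is effectively under content control, and each one has to be contiguous.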
(In reply to comment #20)
> Someone should probably dupe bug 589195 one way or the other.

Note that in that bug it was a real OOM; I included in the bug where the memory was being allocated from.
> It depends on the size of the gfxImageSurface that it's mirroring.

OK.  So we're using a fallible allocation for it, right?
(In reply to comment #26)
> > It depends on the size of the gfxImageSurface that it's mirroring.
> 
> OK.  So we're using a fallible allocation for it, right?

Yes, it's internal to D2D, we could in theory throw if that allocation returns an E_OUTOFMEMORY.

Except for a bug in D2D/D3D9 memory reporting, I'm seeing no indication of an excessive amount of memory being used (well, roughly 500MB, but the test uses that no matter what prefs I use). Nor can I reproduce the crash that suggests an OOM.
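As an illustration of the fallible vs. infallible distinction discussed above, here is a minimal sketch. Everything in it is a stand-in: OutOfMemory models an E_OUTOFMEMORY-style failure, fake_allocator models an allocator with a size limit, and the two create_block_* functions are hypothetical names, not the cairo or D2D API.

```python
# Sketch only: fallible vs. infallible handling of one large allocation.
# All names here are hypothetical stand-ins for illustration.


class OutOfMemory(Exception):
    """Stand-in for an allocation that reports an out-of-memory failure."""


def fake_allocator(limit_bytes):
    """Build an allocator that fails for requests above limit_bytes."""
    def allocate(size):
        if size > limit_bytes:
            raise OutOfMemory(size)
        return bytearray(size)
    return allocate


def create_block_fallible(allocate, size):
    """Fallible path: hand the failure back so the caller can recover."""
    try:
        return allocate(size)
    except OutOfMemory:
        return None  # caller may skip the draw or throw to content


def create_block_infallible(allocate, size):
    """Infallible path: any failure ends the process (the OOM abort)."""
    try:
        return allocate(size)
    except OutOfMemory:
        raise SystemExit("out of memory allocating %d bytes" % size)


allocate = fake_allocator(limit_bytes=1024)
print(create_block_fallible(allocate, 4096) is None)
```

With the fallible shape, a failed allocation for one draw call costs a missing image rather than the whole browser session.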
Assignee: bas.schouten → nobody
Status: ASSIGNED → UNCONFIRMED
Ever confirmed: false
In FF 4.0b6, a lot of OOM crashes have this crashing thread :
0 	mozalloc.dll 	mozalloc_abort 	memory/mozalloc/mozalloc_abort.cpp:77
1 	mozalloc.dll 	mozalloc_handle_oom 	memory/mozalloc/mozalloc_oom.cpp:54
2 	xul.dll 	_cairo_d2d_create_brush_for_pattern 	gfx/cairo/cairo/src/cairo-d2d-surface.cpp:1896
3 	xul.dll 	_cairo_d2d_fill 	gfx/cairo/cairo/src/cairo-d2d-surface.cpp:3226
4 	xul.dll 	_cairo_surface_fill 	gfx/cairo/cairo/src/cairo-surface.c:2216
5 	xul.dll 	_cairo_gstate_fill 	gfx/cairo/cairo/src/cairo-gstate.c:1184
6 	xul.dll 	_moz_cairo_fill_preserve 	gfx/cairo/cairo/src/cairo.c:2338
7 	xul.dll 	gfxWindowsNativeDrawing::PaintToContext 	gfx/thebes/gfxWindowsNativeDrawing.cpp:310
8 	xul.dll 	nsObjectFrame::PaintPlugin 	layout/generic/nsObjectFrame.cpp:1799
9 	xul.dll 	nsDisplayPlugin::Paint 	layout/generic/nsObjectFrame.cpp:1184
10 	xul.dll 	mozilla::FrameLayerBuilder::DrawThebesLayer 	layout/base/FrameLayerBuilder.cpp:1507
....

So it is typical of an OOM crash when D2D is enabled.

More reports at:
http://crash-stats.mozilla.com/report/list?range_value=2&range_unit=weeks&date=2010-09-22%2012%3A00%3A00&signature=mozalloc_handle_oom%28%29&version=Firefox%3A4.0b6
Summary: [D2D] Wilderness Downtown experiment crash [@ cairo_d2d_create_brush_for_pattern ][@ _cairo_d2d_create_brush_for_pattern(_cairo_d2d_surface*, _cairo_pattern const*, bool) ] → [D2D] OOM crash [@ cairo_d2d_create_brush_for_pattern ] [@ _cairo_d2d_create_brush_for_pattern(_cairo_d2d_surface*, _cairo_pattern const*, bool) ] [@ mozalloc_handle_oom() ]
Removing D2D as I have no indication at the moment that the actual OOM is related to D2D, nor does any of the comments on the bug suggest so. If anybody else reproduces this consistently with D2D, but not without it, please re-add it.
Summary: [D2D] OOM crash [@ cairo_d2d_create_brush_for_pattern ] [@ _cairo_d2d_create_brush_for_pattern(_cairo_d2d_surface*, _cairo_pattern const*, bool) ] [@ mozalloc_handle_oom() ] → OOM thewildernessdowntown.com [@ cairo_d2d_create_brush_for_pattern ] [@ _cairo_d2d_create_brush_for_pattern(_cairo_d2d_surface*, _cairo_pattern const*, bool) ] [@ mozalloc_handle_oom() ]
> If anybody else reproduces this consistently with D2D, but not without it,
> please re-add it.
I will.
CPU : Pentium T4300, GPU : Intel GMA 4500MHD
* Without D2D (new profile, gfx.direct2d.disabled set to true), it does not consistently crash:
average CPU usage: 60%, RAM usage: 1.7GB
* With D2D (new profile), it crashes consistently :
average CPU usage: 60%, RAM usage: 2GB with a sudden peak to 2.45 GB just before crash happens
I have more information for the crash condition :
It always happens 1 min 30 sec after the beginning of the film, when a Google Street view popup window appears for the first time.
(In reply to comment #30)
> > If anybody else reproduces this consistently with D2D, but not without it,
> > please re-add it.
> I will.
> CPU : Pentium T4300, GPU : Intel GMA 4500MHD
> * Without D2D (new profile, gfx.direct2d.disabled set to true), it does not
> consistently crash:
> average CPU usage: 60%, RAM usage: 1.7GB
> * With D2D (new profile), it crashes consistently :
> average CPU usage: 60%, RAM usage: 2GB with a sudden peak to 2.45 GB just
> before crash happens

Okay, so I think it's still not D2D related. The thing is D2D uses slightly more memory for the internal image cache and some other stuff. 1.7GB is on the edge of the limit of our address space. But what these results look like is that we're just using way too much memory, and D2D is giving us the 15-20% extra needed to push us over the edge.
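The arithmetic behind that, as a quick sketch. The 2 GiB limit below is an assumption (the default user address space for a 32-bit Windows process; a large-address-aware build on 64-bit Windows can get up to 4 GiB), and the function name is hypothetical. The 1.7 GB baseline and the 15-20% overhead are the figures from the comments above.

```python
# Sketch only: does a 15-20% overhead on a 1.7 GB baseline cross the limit?
# ADDRESS_SPACE_LIMIT_GB is an assumption, not a measured value.

ADDRESS_SPACE_LIMIT_GB = 2.0


def projected_usage_gb(baseline_gb, overhead_fraction):
    """Baseline memory usage scaled up by a fractional overhead."""
    return baseline_gb * (1.0 + overhead_fraction)


for overhead in (0.15, 0.20):
    usage = projected_usage_gb(1.7, overhead)
    verdict = "over" if usage > ADDRESS_SPACE_LIMIT_GB else "under"
    print("+%d%% -> %.3f GB (%s the assumed limit)"
          % (round(overhead * 100), usage, verdict))
```

Under that assumption the 15% case lands just under the limit and the 20% case just over it, so a baseline this high leaves essentially no headroom either way.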
As a minor detail I might add memory usage doesn't go over 700 MB, with, or without D2D, it seems.
So you think the fact that it 100% reliably OOMs in D2D code for these users is just a fluke?
Status: UNCONFIRMED → NEW
Ever confirmed: true
(In reply to comment #34)
> So you think the fact that is 100% reliably OOMs in D2D code for these users is
> just a fluke?

I'm pretty sure this stack trace is one of the biggest and most frequent single block allocations we do. If we do get an OOM, this is by far the most likely stack trace.
(In reply to comment #35)
> Created attachment 477904 [details]
> Memory usage with D2D ON during TWDT

Did you happen to use taskbar previews when taking this screenshot? I think this is a bit polluted by the taskbar preview memory usage problem we have with D3D. This could explain the relatively high working set.

In any case neither of these screenshots should be sufficient memory usage to cause an OOM. The significantly higher private byte usage in the D2D case is somewhat surprising, but so is the fact malloc/mapped is higher on one screenshot than the other, since I don't think D2D affects that particular statistic. And that would add to our private bytes.
I compared D2D with D3D9 to GDI + BasicLayers, to ensure HW accel was not a factor at all. I tried to keep everything constant but the memory usage fluctuates quite actively making it a little tricky. The memory usage wasn't too different, although 15-20% higher with D2D I'd say. However the performance with D2D was significantly worse, opening a separate bug on that.
Since the landing of bug 599118 in b7pre/20100925, it does not crash anymore for me.
(In reply to comment #41)
> Since the landing of bug 599118 in b7pre/20100925, it does not crash anymore
> for me.

Note that the -potential- is still there. Bug 599118 reduces the memory usage of our drawing code a bit. The demo presumably still consumed a large amount of memory for the people it was crashing for before.
We don't believe there's anything particularly graphics-specific in this bug.
Component: Graphics → General
QA Contact: thebes → general
Depends on: 601183
No longer depends on: 601183
Duplicate of this bug: 603360
At this point, we're not going to block on the meta-bug. We've found much better data in several other bugs which are blockers (real memory usage, and VM usage due to memory-mapped font files). I believe this problem will mostly resolve itself with those.
blocking2.0: ? → -
Sadly, I can't seem to get any address to load.
(In reply to comment #47)
> Sadly, I can't seem to get any address to load.

Try this

16 Empire Way, Thornlie WA Australia
(In reply to comment #48)
> (In reply to comment #47)
> > Sadly, I can't seem to get any address to load.
> 
> Try this
> 
> 16 Empire Way, Thornlie WA Australia

Stuck at 94% like with any other address :s
(In reply to comment #49)
> Stuck at 94% like with any other address :s

For me, with HW acceleration, FF gets stuck at 94% too. If I reload the tab and enter the same address, it gets as far as 97%.
And sometimes (while the movie is loading) FF crashes.

In Safe Mode the movie (for the same address) loads to 100% and displays fine.
Crash Signature: [@ cairo_d2d_create_brush_for_pattern ] [@ _cairo_d2d_create_brush_for_pattern(_cairo_d2d_surface*, _cairo_pattern const*, bool) ] [@ mozalloc_handle_oom() ]
Severity: normal → critical
Crash Signature: [@ cairo_d2d_create_brush_for_pattern ] [@ _cairo_d2d_create_brush_for_pattern(_cairo_d2d_surface*, _cairo_pattern const*, bool) ] [@ mozalloc_handle_oom() ] → [@ cairo_d2d_create_brush_for_pattern ] [@ _cairo_d2d_create_brush_for_pattern(_cairo_d2d_surface*, _cairo_pattern const*, bool) ] [@ mozalloc_handle_oom() ]
is this another example?  
mozalloc_abort(char const* const) | mozalloc_handle_oom() | _cairo_d2d_create_brush_for_pattern
bp-3ca2d8bd-d9b5-4837-a31f-d465c2110903
0	mozalloc.dll	mozalloc_abort	memory/mozalloc/mozalloc_abort.cpp:78
1	mozalloc.dll	mozalloc_handle_oom	memory/mozalloc/mozalloc_oom.cpp:54
2	xul.dll	_cairo_d2d_create_brush_for_pattern	gfx/cairo/cairo/src/cairo-d2d-surface.cpp:1980
3	xul.dll	_cairo_d2d_fill	gfx/cairo/cairo/src/cairo-d2d-surface.cpp:3545
4	xul.dll	_cairo_surface_fill	gfx/cairo/cairo/src/cairo-surface.c:2219
5	xul.dll	_cairo_gstate_fill	gfx/cairo/cairo/src/cairo-gstate.c:1184
Crash Signature: [@ cairo_d2d_create_brush_for_pattern ] [@ _cairo_d2d_create_brush_for_pattern(_cairo_d2d_surface*, _cairo_pattern const*, bool) ] [@ mozalloc_handle_oom() ] → [@ cairo_d2d_create_brush_for_pattern ] [@ _cairo_d2d_create_brush_for_pattern(_cairo_d2d_surface*, _cairo_pattern const*, bool) ] [@ mozalloc_handle_oom() ] [@ _cairo_d2d_create_brush_for_pattern ] [@ mozalloc_handle_oom ]
Closing because no crashes have been reported in the last 12 weeks.
Status: NEW → RESOLVED
Closed: Last year
Resolution: --- → WONTFIX