Closed Bug 985122 Opened 10 years ago Closed 10 years ago

Cupcakes vs Veggies crashes on GL_OUT_OF_MEMORY on v1.4

Categories

(Core :: Graphics: CanvasWebGL, defect)

30 Branch
ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 982237
blocking-b2g 1.4+

People

(Reporter: bkelly, Assigned: milan)

Details

(Keywords: regression)

While working on bug 982237 I noticed that Cupcakes vs. Veggies seems to die on 1.4 in a different way than on 1.3.

In my logs I see the following for the game:

I/Gecko   (  742): Attempting load of libEGL.so
I/Adreno200-EGL(  742): <qeglDrvAPI_eglInitialize:295>: EGL 1.4 QUALCOMM build: Nondeterministic AU_msm7627a_B2G/ICS_STRAWBERRY_Merge_release_AU (Merge)
I/Adreno200-EGL(  742): Build Date: 11/24/13 Sun
I/Adreno200-EGL(  742): Local Branch: mybranch1997719
I/Adreno200-EGL(  742): Remote Branch: quic/b2g_ics_1.2
I/Adreno200-EGL(  742): Local Patches: NONE
I/Adreno200-EGL(  742): Reconstruct Branch: NOTHING
E/QCALOG  (  186): [MessageQ] ProcessNewMessage: [XTWiFi-PE] unknown deliver target [OS-Agent]
W/Adreno200-ES20(  742): <qgl2DrvAPI_glFramebufferRenderbuffer:2569>: GL_OUT_OF_MEMORY
W/Adreno200-ES20(  742): <qgl2DrvAPI_glFramebufferRenderbuffer:2569>: GL_OUT_OF_MEMORY
I/Gecko   (  526):
I/Gecko   (  526): ###!!! [Parent][MessageChannel] Error: Channel error: cannot send/recv

Interestingly, I see that the parent process already opened libEGL.so successfully:

I/Gecko   (  526): Attempting load of libEGL.so
I/Adreno200-EGL(  526): <qeglDrvAPI_eglInitialize:295>: EGL 1.4 QUALCOMM build: Nondeterministic AU_msm7627a_B2G/ICS_STRAWBERRY_Merge_release_AU (Merge)
I/Adreno200-EGL(  526): Build Date: 11/24/13 Sun
I/Adreno200-EGL(  526): Local Branch: mybranch1997719
I/Adreno200-EGL(  526): Remote Branch: quic/b2g_ics_1.2
I/Adreno200-EGL(  526): Local Patches: NONE
I/Adreno200-EGL(  526): Reconstruct Branch: NOTHING
E/QCALOG  (  186): [MessageQ] ProcessNewMessage: [XTWiFi-PE] unknown deliver target [OS-Agent]
I/HWComposer(  526): Creating new instance
E/GeckoConsole(  526): OpenGL compositor Initialized Succesfully.
E/GeckoConsole(  526): Version: OpenGL ES 2.0 V@6.0 AU@ (CL@)
E/GeckoConsole(  526): Vendor: Qualcomm
E/GeckoConsole(  526): Renderer: Adreno (TM) 200
E/GeckoConsole(  526): FBO Texture Target: TEXTURE_2D

I don't understand why an app would end up opening libEGL.so.  I thought we tried to keep all GL calls in the compositor thread in the parent.

This is with:

mozilla-central gecko:  803a735d9cf2
gaia master:            6826cb8e2589651fbe3ec15b4fa9cb9a372de4cf
If I understood correctly, Benoit tells me that if WebGL is used then the client process will open libEGL.so itself.
See Also: → 982237
Component: General → Canvas: WebGL
Product: Firefox OS → Core
Version: unspecified → 30 Branch
Keywords: regression
Keywords: perf
Whiteboard: [MemShrink]
Whiteboard: [MemShrink]
Nicholas - This is an OOM, which is a memory-based issue. It's the 1.4 version of bug 982237.
Whiteboard: [MemShrink]
I think we were looking to have someone from #gfx take a look and understand why we are getting errors from libEGL.so before treating this as a general memory issue.  Graphics memory is a bit of a different beast.

For example, I did not see any large gralloc allocations occurring with instrumentation enabled, so it's not clear that Gecko is the source of the issue.  This could be coming from within libEGL.so.
Fair enough, I'll hold off on the flags then.
Keywords: perf
Whiteboard: [MemShrink]
Bas, do you know who might be able to look at this or who I should ask?  I understand Milan is still out.
Flags: needinfo?(bas)
(In reply to Ben Kelly [:bkelly] from comment #5)
> Bas, do you know who might be able to look at this or who I should ask?  I
> understand Milan is still out.

I'll try and figure out later today who can have a look.
Flags: needinfo?(bas)
Ben, the fix you're proposing in bug 982237 doesn't help here?
On gonk, GL_OUT_OF_MEMORY happened in Bug 915001. Bug 915001 Comment 16 might help to understand the problem.
blocking-b2g: 1.4? → 1.4+
Assignee: nobody → milan
(In reply to Milan Sreckovic [:milan] from comment #7)
> Ben, the fix you're proposing in bug 982237 doesn't help here?

No.  The app doesn't even get to the point of trying the 2048x2048 canvas as far as I can tell.  I never see my instrumentation show a gralloc request of that size.  So my proposed fix in bug 982237 won't come into play.

I haven't investigated any further than that here because my main focus was trying to get the fix in for the 1.3 blocker.
All of the action on this issue is in the 1.3 bug 982237.
I still believe this is a different failure than bug 982237.

One possible clue:

James mentioned in bug 982237 comment 73 that we now share a GLContext in CanvasRenderingContext2D::EnsureTarget() instead of creating a new one after bug 939276.
I see it seemingly allocating two 2048x2048 canvases and then a 960x536 one:

I/Gecko   (  714): SNORP: resetting canvas 0x43bd0400
W/Adreno200-ES20(  714): <qgl2DrvAPI_glFramebufferRenderbuffer:2569>: GL_OUT_OF_MEMORY
I/Gecko   (  714): SNORP: created new GL draw target 2048x2048 for canvas 0x43bd0400
I/Gecko   (  714): SNORP: resetting canvas 0x42f3d000
W/Adreno200-ES20(  714): <qgl2DrvAPI_glFramebufferRenderbuffer:2569>: GL_OUT_OF_MEMORY
I/Gecko   (  714): SNORP: created new GL draw target 2048x2048 for canvas 0x42f3d000
I/Gecko   (  638): SNORP: resetting canvas 0x43adf000
I/Gonk    (  139): Setting nice for pid 739 to 18
I/Gonk    (  139): Changed nice for pid 739 from 0 to 18.
I/Gecko   (  714): SNORP: created new GL draw target 960x536 for canvas 0x43155800

The GL_OUT_OF_MEMORY errors would be in response to creating the backing texture in DrawTargetSkia. This app really just needs to be fixed so that it does not attempt to destroy the phone. I am not sure if we can work around every possible attempt to run the device out of memory.
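For a rough sense of scale, the sizes in the log above can be turned into byte counts. This is only a sketch: it assumes 4 bytes per pixel (RGBA8888) and ignores any driver padding or extra buffers, neither of which is stated in this bug. `BackingStoreBytes` is a hypothetical helper, not a Gecko function.

```cpp
#include <cstdint>

// Hypothetical helper: approximate bytes for one canvas backing store.
// Assumes 4 bytes per pixel (RGBA8888) unless told otherwise; real
// drivers may pad rows or allocate additional buffers.
constexpr std::uint64_t BackingStoreBytes(std::uint32_t width,
                                          std::uint32_t height,
                                          std::uint32_t bytesPerPixel = 4) {
  return static_cast<std::uint64_t>(width) * height * bytesPerPixel;
}

constexpr std::uint64_t kMiB = 1024 * 1024;

// Each 2048x2048 draw target is 16 MiB under the 4-bytes-per-pixel assumption.
static_assert(BackingStoreBytes(2048, 2048) == 16 * kMiB,
              "2048x2048 RGBA should be 16 MiB");

// Two 2048x2048 targets plus one 960x536 target, as seen in the log.
constexpr std::uint64_t kLoggedTotalBytes =
    2 * BackingStoreBytes(2048, 2048) + BackingStoreBytes(960, 536);
```

Under those assumptions the three draw targets in the log add up to roughly 34 MiB of backing store before any other graphics allocations are counted.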
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #12)
> the GL_OUT_OF_MEMORY errors would be in response to creating the backing
> texture in DrawTargetSkia. This app really just needs fixed to not attempt
> to destroy the phone. I am not sure if we can work around every possible
> attempt to run the device out of memory.

That's a reasonable position in general, but this app used to work in v1.1.  It would seem that our limitations on the app should not become more constrained over time.

Would it be possible to detect "unreasonable" behavior and fall back to a software canvas?
(In reply to Ben Kelly [:bkelly] from comment #13)
> (In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #12)
> > the GL_OUT_OF_MEMORY errors would be in response to creating the backing
> > texture in DrawTargetSkia. This app really just needs fixed to not attempt
> > to destroy the phone. I am not sure if we can work around every possible
> > attempt to run the device out of memory.
> 
> That's a reasonable position in general, but this app used to work in v1.1. 
> It would seem that our limitations on the app should not become more
> constrained over time.

It's a trade-off, right? Sometimes in order to do something faster, you need to use more resources.

> 
> Would it be possible to detect "unreasonable" behavior and fallback to a
> software canvas?

I was thinking today that maybe (at least on b2g) we should default to a software canvas and then only promote it to a hardware one later on if we decide that it's worthwhile, based on app workload (lots of drawImage, for instance) and available system resources. In the very short term, maybe it would help to put a maximum size limit for an accelerated canvas. Say...no larger than the screen?
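A minimal sketch of the "start in software, promote later" idea might look like the following. Everything here is hypothetical: the `CanvasWorkload` struct, the `ShouldPromoteToHardware` function, and both thresholds are illustrative names and parameters, not existing Gecko code, and how the workload and memory budget would actually be measured is left open.

```cpp
#include <cstdint>

// Hypothetical snapshot of what a canvas has been doing recently.
struct CanvasWorkload {
  std::uint32_t drawImageCalls;     // drawImage calls seen in some recent window
  std::uint64_t backingStoreBytes;  // estimated size of an accelerated target
};

// Hypothetical policy: promote a software canvas to a hardware one only if
// it is doing enough drawImage work to benefit AND its backing store would
// fit inside the memory budget the caller considers safe.
bool ShouldPromoteToHardware(const CanvasWorkload& workload,
                             std::uint32_t minDrawImageCalls,
                             std::uint64_t memoryBudgetBytes) {
  const bool busyEnough = workload.drawImageCalls >= minDrawImageCalls;
  const bool fitsBudget = workload.backingStoreBytes <= memoryBudgetBytes;
  return busyEnough && fitsBudget;
}
```

With a policy of this shape, a canvas that is large but rarely drawn to would simply stay on the software path rather than failing a GL allocation.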
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #14)
> > Would it be possible to detect "unreasonable" behavior and fallback to a
> > software canvas?
> 
> I was thinking today that maybe (at least on b2g) we should default to a
> software canvas and then only promote it to a hardware one later on if we
> decide that it's worthwhile, based on app workload (lots of drawImage, for
> instance) and available system resources. In the very short term, maybe it
> would help to put a maximum size limit for an accelerated canvas. Say...no
> larger than the screen?

I think that makes a lot of sense.  It would probably solve the issues here and in bug 982237.

I can work up a patch and test today if you'd like.  I assume the check would be in EnsureTarget()?  Is that OMT, so can use prefs easily?
Hmm, probably just easiest to add a max value to the conditional in |gfxPlatform::CheckSizeForSkiaGL()|.
Oops.  That should be |CanvasRenderingContext2D::CheckSizeForSkiaGL()|.
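As an illustration of adding a maximum to a size conditional, here is a standalone sketch. It is not the actual body of |CanvasRenderingContext2D::CheckSizeForSkiaGL()|; the `CanvasSize` struct, the function name, and the `minSize`/`maxSize` parameters are assumptions made for the example, and where the real limits would come from (prefs, screen size) is not settled in this bug.

```cpp
#include <cstdint>

// Hypothetical stand-in for a canvas size type.
struct CanvasSize {
  std::int32_t width;
  std::int32_t height;
};

// Hypothetical check: allow the accelerated (SkiaGL) path only when both
// dimensions are at least minSize and at most maxSize. Anything outside
// that range would be expected to use the software canvas instead.
bool SizeAllowsAcceleratedCanvas(CanvasSize size,
                                 std::int32_t minSize,
                                 std::int32_t maxSize) {
  const bool bigEnough = size.width >= minSize && size.height >= minSize;
  const bool smallEnough = size.width <= maxSize && size.height <= maxSize;
  return bigEnough && smallEnough;
}
```

With an assumed upper bound of 1024, for example, a 960x536 canvas would stay accelerated while a 2048x2048 one would not.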
While the failures are different, the patch landing in bug 982237 fixes this as well.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE