Closed Bug 905227 Opened 6 years ago Closed 6 years ago

Turn on Skia/GL on B2G for large enough canvases

Categories

(Core :: Canvas: 2D, defect)

ARM
Gonk (Firefox OS)
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla26

People

(Reporter: bjacob, Assigned: bjacob)

References

(Depends on 1 open bug)

Details

Attachments

(5 files, 2 obsolete files)

We need to avoid using Skia/GL, and instead fall back to software Skia, for sufficiently small canvases.

Reasons include: small canvases don't benefit much from hw acceleration anyway, and there can even be hw decceleration; and (that point stands until we get GLContext multiplexing) each canvas currently uses a separate OpenGL context, which is about 1 M of extra memory usage by itself on Adreno drivers at least.

The particular choice of threshold size can be debated; "128x128" has been proposed and sounds sane; we should look into what threshold other browsers (especially Chrome/Android) use and consider using the same; another data point is that the icons on the Everything.me homescreen are made of THREE canvases each, so we definitely want our threshold to be larger than their size so that they are using software skia.
Blocks: 905214
Try push in comment 1 has most reftests failing with one of the two pages rendered blank. One hypothesis is that this is caused by our aggressive demotion of 'old' contexts (reftests use 800x1000 contexts, so they still use skia/GL). Here's a try push with the number of live contexts upped from 2 to 4 contexts:

https://tbpl.mozilla.org/?tree=Try&rev=f6aceabf9055
Now everything is back to failing with blank canvases with random bad pixels at the bottom... exactly like before we had BGRA support in the emulator on the test slaves :-/
Can't debug much locally because of regressions on b2g master, but here is a tryserver push that should log to the dumpfile some relevant info, in particular whether the OpenGL implementation it's running has the BGRA extension...

https://tbpl.mozilla.org/?tree=Try&rev=c0f5bef47355
The log at least confirms that the BGRA extension is deployed on the slave.

But the failures are really strange, and the log seems incomplete: it only has IMAGE 1 but not IMAGE 2 to compare with.

For example:

https://tbpl.mozilla.org/php/getParsedLog.php?id=27347517&tree=Try&full=1#error21
Of course, locally, everything passes... so my ability to test things locally is useless.
This new try push,
https://tbpl.mozilla.org/?tree=Try&rev=2ffdb460b644

has a ton of logging. It should be able to log things directly from inside the skia implementation, while running reftests, on the test slaves.

Also, it has basically MOZ_GL_DEBUG_ABORT_ON_ERRROR hard-enabled, and some logging of texImage2D and readPixels calls in GLContext.h.

Locally everything passes, so it should be interesting to compare logs.
Woohoo! Now here's something concrete. We get a GL_ERROR_INVALID_FRAMEBUFFER_OPERATION, which explains why nothing gets rendered correctly.

https://tbpl.mozilla.org/php/getParsedLog.php?id=27379141&tree=Try&full=1#error3

09:23:01     INFO -  XXX DrawWindow on Canvas2D 0x44c20400
09:23:01     INFO -  XXX ENTER Canvas2D 0x44c20400 EnsureTarget size=800x1000
09:23:01     INFO -  XXX CheckSizeForSkiaGL size=800x1000 minsize=128 result=1
09:23:01     INFO -  EGL Config: 0 [0x0]
09:23:01     INFO -    BUFFER_SIZE: 16 (0x0010)
09:23:01     INFO -    ALPHA_SIZE: 0 (0x0000)
09:23:01     INFO -    BLUE_SIZE: 5 (0x0005)
09:23:01     INFO -    GREEN_SIZE: 6 (0x0006)
09:23:01     INFO -    RED_SIZE: 5 (0x0005)
09:23:01     INFO -    DEPTH_SIZE: 0 (0x0000)
09:23:01     INFO -    STENCIL_SIZE: 0 (0x0000)
09:23:01     INFO -    CONFIG_CAVEAT: 12344 (0x3038)
09:23:01     INFO -    CONFIG_ID: 1 (0x0001)
09:23:01     INFO -    LEVEL: 0 (0x0000)
09:23:01     INFO -    MAX_PBUFFER_HEIGHT: 8192 (0x2000)
09:23:01     INFO -    MAX_PBUFFER_PIXELS: 8192 (0x2000)
09:23:01     INFO -    MAX_PBUFFER_WIDTH: 8192 (0x2000)
09:23:01     INFO -    NATIVE_RENDERABLE: 0 (0x0000)
09:23:01     INFO -    NATIVE_VISUAL_ID: 4 (0x0004)
09:23:01     INFO -    NATIVE_VISUAL_TYPE: 12344 (0x3038)
09:23:01     INFO -    PRESERVED_RESOURCES: ERROR (0x3004)
09:23:01     INFO -    SAMPLES: 0 (0x0000)
09:23:01     INFO -    SAMPLE_BUFFERS: 0 (0x0000)
09:23:01     INFO -    SURFACE_TYPE: 5 (0x0005)
09:23:01     INFO -    TRANSPARENT_TYPE: 12344 (0x3038)
09:23:01     INFO -    TRANSPARENT_RED_VALUE: 0 (0x0000)
09:23:01     INFO -    TRANSPARENT_GREEN_VALUE: 0 (0x0000)
09:23:01     INFO -    TRANSPARENT_BLUE_VALUE: 0 (0x0000)
09:23:01     INFO -    BIND_TO_TEXTURE_RGB: 0 (0x0000)
09:23:01     INFO -    BIND_TO_TEXTURE_RGBA: 0 (0x0000)
09:23:01     INFO -    MIN_SWAP_INTERVAL: 1 (0x0001)
09:23:01     INFO -    MAX_SWAP_INTERVAL: 10 (0x000a)
09:23:01     INFO -    LUMINANCE_SIZE: 0 (0x0000)
09:23:01     INFO -    ALPHA_MASK_SIZE: 0 (0x0000)
09:23:01     INFO -    COLOR_BUFFER_TYPE: 0 (0x0000)
09:23:01     INFO -    RENDERABLE_TYPE: 5 (0x0005)
09:23:01     INFO -    CONFORMANT: 5 (0x0005)
09:23:01     INFO -  OpenGL version detected: 200
09:23:01     INFO -  XXX GLContext 0x44cf3800 InitExtensions GL_OES_EGL_image GL_OES_depth24 GL_OES_depth32 GL_OES_element_index_uint GL_OES_texture_float GL_OES_texture_float_linear GL_OES_compressed_paletted_texture GL_OES_compressed_ETC1_RGB8_texture GL_OES_depth_texture GL_OES_texture_half_float GL_OES_texture_half_float_linear GL_OES_packed_depth_stencil GL_OES_vertex_half_float GL_EXT_texture_format_BGRA8888
09:23:01     INFO -  XXX texImage2D width=800 height=1000 internalformat=0x1908 format=0x1908 type=0x8033
09:23:01     INFO -  XXX glContext = 0x44cf3800
09:23:01     INFO -  XXX CreateGrGLInterfaceFromGLContext with GLContext 0x44cf3800
09:23:01     INFO -  --- GL-Specific ---
09:23:01     INFO -  Stencil Format 0, stencil bits: 08, total bits: 08
09:23:01     INFO -  Stencil Format 1, stencil bits: 08, total bits: 32
09:23:01     INFO -  MSAA Type: None
09:23:01     INFO -  Max FS Uniform Vectors: 16
09:23:01     INFO -  Max Vertex Attributes: 16
09:23:01     INFO -  Support RGBA8 Render Buffer: NO
09:23:01     INFO -  BGRA support: YES
09:23:01     INFO -  BGRA is an internal format: YES
09:23:01     INFO -  Support texture swizzle: NO
09:23:01     INFO -  Unpack Row length support: NO
09:28:32     INFO -  Unpack Flip Y support: NO
09:28:32     INFO -  Pack Row length support: NO
09:28:32     INFO -  Pack Flip Y support: NO
09:28:32     INFO -  Texture Usage support: NO
09:28:32     INFO -  Texture Storage support: NO
09:28:32     INFO -  GL_R support: NO
09:28:32     INFO -  GL_ARB_imaging support: NO
09:28:32     INFO -  Two Format Limit: YES
09:28:32     INFO -  Fragment coord conventions support: NO
09:28:32     INFO -  Vertex array object support: NO
09:28:32     INFO -  Use non-VBO for dynamic data: NO
09:28:32     INFO -  Core Profile: NO
09:28:32     INFO -  XXX LEAVE Canvas2D 0x44c20400 EnsureTarget
09:28:32     INFO -  XXX readPixels GLContext 0x46cb1800 x=0 y=0 width=800 height=1000 format=0x1908 type=0x1401
09:28:32     INFO -  XXX RAW readPixels GLContext 0x46cb1800 x=0 y=0 width=800 height=1000 format=0x1908 type=0x1401
09:28:32     INFO -  XXX texImage2D width=1024 height=1024 internalformat=0x80e1 format=0x80e1 type=0x1401
09:28:32     INFO -  XXX texImage2D width=800 height=1000 internalformat=0x80e1 format=0x80e1 type=0x1401
09:28:32     INFO -  GL ERROR: void mozilla::gl::GLContext::raw_fDrawElements(GLenum, GLsizei, GLenum, const GLvoid*) generated GL error GL_INVALID_FRAMEBUFFER_OPERATION(0x0506)
Looks like a driver bug on the host...

From the same log:

09:28:35     INFO -  09-04 16:21:52.239   658   679 E GeckoConsole: Version: OpenGL ES 2.0 (3.2.0 NVIDIA 190.42)

This is the very, very old NVIDIA driver from 2009 that I thought we weren't using anymore... and it doesn't seem to support what Skia/GL is doing here.

Working on figuring the exact sequence of GL calls leading to this GL error to confirm that it is a driver bug.
Rail, or Joel, do you know if all the B2G emulator test slaves use this NVIDIA 190.42 driver, and if yes, do you know if it would be feasible to update it in a few days (I'm running for a B2G 1.2 deadline end of this week) ?
Flags: needinfo?(rail)
Flags: needinfo?(jmaher)
Flags: needinfo?(bhearsum)
(In reply to Benoit Jacob [:bjacob] from comment #12)
> Rail, or Joel, do you know if all the B2G emulator test slaves use this
> NVIDIA 190.42 driver, and if yes, do you know if it would be feasible to
> update it in a few days (I'm running for a B2G 1.2 deadline end of this
> week) ?

I don't think we'll be doing any driver updates on these slaves (talos-r3-fed). AFAICT, we run these tests on Ubuntu already, maybe we can just shut them off on Fedora? Any idea about that, Joel?
Flags: needinfo?(bhearsum)
my understanding is all of the b2g emulator tests are on ec2 linux slaves, not real physical machine fedora slaves.  As for updating the driver on there, we would need to test this out with all desktop tests and b2g tests before updating the driver.  There is a chance that we could cause desktop tests to fail in the process.

:ahal, you know a lot more about b2g tests, any additional information to add?
Flags: needinfo?(jmaher) → needinfo?(ahalberstadt)
(In reply to Ben Hearsum [:bhearsum] from comment #13)
> (In reply to Benoit Jacob [:bjacob] from comment #12)
> > Rail, or Joel, do you know if all the B2G emulator test slaves use this
> > NVIDIA 190.42 driver, and if yes, do you know if it would be feasible to
> > update it in a few days (I'm running for a B2G 1.2 deadline end of this
> > week) ?
> 
> I don't think we'll be doing any driver updates on these slaves
> (talos-r3-fed). AFAICT, we run these tests on Ubuntu already, maybe we can
> just shut them off on Fedora? Any idea about that, Joel?

In fact, I was just told by Aki that we've already considered that and that it should happen in the next reconfig. That should be happening a little bit later today. That's being tracked in bug 911953.
We still run reftests on the old Fedora slaves for B2G; they need real hardware, we've been told, in order to test the most relevant code paths.
Flags: needinfo?(ahalberstadt)
(In reply to Jonathan Griffin (:jgriffin) from comment #16)
> We still run reftests on the old Fedora slaves for B2G; they need real
> hardware, we've been told, in order to test the most relevant code paths.

But...we have Ubuntu hardware machines too:
[cltbld@talos-linux64-ix-001.test.releng.scl3.mozilla.com ~]$ cat /etc/issue
Ubuntu 12.04 LTS \n \l

Kickstart Date: Mon Jun 17 09:02:00 PDT 2013
Kickstart OS: Ubuntu 12.04
System Installed: Mon Jun 17 09:32:31 PDT 2013
We do, but we never got the emulator tests running correctly on them.  The emulator wouldn't start for some reason and we never were able to figure out why.
(In reply to Jonathan Griffin (:jgriffin) from comment #16)
> We still run reftests on the old Fedora slaves for B2G; they need real
> hardware, we've been told, in order to test the most relevant code paths.

The B2G emulator *is* the "hardware" that B2G reftests are run on; the underlying host doesn't matter as long as it's bug-free. Another way to put it is that real desktop hardware is very different from B2G hardware anyway.
This try push has some logging that will hopefully allow to prove that it's indeed a driver bug (logs GL calls relevant to getting an INVALID_FRAMEBUFFER_OPERATION) and maybe give us some clue into whether we could work around it.

https://tbpl.mozilla.org/?tree=Try&rev=6cf357a14b14
I think jgriffin and ahal provided all needed info. We still have to run b2g emulator tests on the old hardware (mac mini rev 3) and Fedora 12. There is some info in bug 859867.
Flags: needinfo?(rail)
Thanks everyone for the answers.

Back to debugging / proving it's actually a driver bug / trying to find a work-around:

This try push contains more instrumentation than the one in comment 20, hopefully proving it's a driver bug:

https://tbpl.mozilla.org/?tree=Try&rev=b6ba8d62f0cf

This try push is the same, except that it also changes BGRA to RGBA in TexImage2D and TexSubImage2D, in order to test the theory that this might be an driverissue specificly with BGRA:

https://tbpl.mozilla.org/?tree=Try&rev=97aa32297688
These try pushes
 - show that BGRA vs RGBA makes no difference at all;
 - show that the problem is with the primary framebuffer for this off-screen context, not with the other FBO that wants to use BGRA, anyway;
 - suggest that the problem with the primary framebuffer is rather that the color attachment being RGBA4 (as we do on GLES2) displeases the desktop NVIDIA chip. In fact, while RGBA4 is a mandatory format on GLES2, it is not mandatory to support it on desktop GPUs.

So that may not be a driver bug, but rather, a great example of how desktop vs ES (embedded) OpenGL significantly differ, making real desktop OpenGL hardware a poor place to run these tests, re: comment 19.
...that being said, while RGBA4 is not mandatory in desktop OpenGL 2, it is a very common feature, e.g. my desktop OpenGL linux Mesa 8.0 Intel driver supports it, and this NVIDIA document claims that it's always been supported in their drivers:

http://http.download.nvidia.com/developer/OpenGL_Texture_Formats/nv_ogl_texture_formats.pdf

It could be that this particular driver version has a bug that makes it refuse to render into a RGBA4 texture...
This try push hopefully confirms the theory that the failure is caused by the driver refusing to render into a RGBA4 texture used as the primary framebuffer for this GLContext:

https://tbpl.mozilla.org/?tree=Try&rev=f80507115899
And here is a try push that makes that behave as if running on desktop OpenGL, using a 32bpp texture instead of that 16bpp texture... let's see if the NVIDIA card likes that better.

https://tbpl.mozilla.org/?tree=Try&rev=fade753b458a
Summary: Do not use Skia/GL for canvases smaller than a certain size → Turn on Skia/GL on B2G for large enough canvases
For future reference.

This shows that the NVIDIA host doesn't agree to rendering to a FBO backed by a RGBA4 texture (type=0x8033).
Great news: the tryserver run in comment 26, using RGBA8 instead of RGBA4, succeeds. So that's our work-around.

Furthermore, from talking with :jrmuizel and :vladv, it seems that we should never have been using RGBA4 for Canvas 2D anyway. For example, getImageData returning RGBA8 arguably implies that canvas 2D users expect their canvases to be backed by RGBA8.

We also agreed to stop using RGBA4 to back off-screen buffers altogether, even for WebGL, until we can figure a way to let content opt in for it.
So, we are actually not using RGBA4 by default.

We are only using RGBA4 on ES when the OES_rgb8_rgba8 extension is unsupported.

Here is a try push that ignores the extension string and acts as if OES_rgb8_rgba8 were supported:
https://tbpl.mozilla.org/?tree=Try&rev=2b3168df5c20

I wonder though. OES_rgb8_rgba8 indicates whether rgba8 *renderbuffers* are supported. Indeed, we do use a renderbuffer as color attachment. Why is that? Is that for multisampling?
Yup, the emulator's GL implementation's glRenderbufferStorage will just forward RGBA/UNSIGNED to the host GL,

https://github.com/mozilla-b2g/android-sdk/blob/emu-fix/emulator/opengl/host/libs/Translator/GLES_V2/GLESv2Imp.cpp#L1625

and since it only runs on desktop GL anyway, RGBA8 renderbuffers should Just Work i.e. the omission of OES_rgb8_rgba8 in the extensions string seems to really be just that, an omission.

(Also, in the neighbor directory GLES_CM, this extension is exposed in the extension string).
Near-success, just some UNEXPECTED-PASS in WebGL reftests, let's adjust the reftest.list:

https://tbpl.mozilla.org/?tree=Try&rev=876653b7fb47
Comment on attachment 798035 [details] [diff] [review]
Introduce gfx.canvas.min-size-for-skia-gl preference

Review of attachment 798035 [details] [diff] [review]:
-----------------------------------------------------------------

Review time!
Attachment #798035 - Flags: review?(gwright)
Attachment #798036 - Flags: review?(gwright)
That's the trick that did it. See above investigation. The Android Emulator doesn't list it in its extensions string, but it tautologically supports it, because it only runs on desktop OpenGL hosts where this is always available, and looking at its implementation of glRenderbufferStorage (see link above) it just forwards it to the host.
Attachment #800429 - Flags: review?(gwright)
Just removing a fails-if(B2G) that's no longer needed.
Attachment #800431 - Flags: review?(gwright)
Attachment #798035 - Flags: review?(gwright) → review+
Attachment #798036 - Attachment is obsolete: true
Attachment #798036 - Flags: review?(gwright)
Attachment #800740 - Flags: review?(nical.bugzilla)
Attachment #800739 - Flags: review?(nical.bugzilla) → review+
Attachment #800740 - Flags: review?(nical.bugzilla) → review+
Attachment #800429 - Flags: review?(gwright) → review+
Attachment #800431 - Flags: review?(gwright) → review+
Depends on: 956432
Depends on: 957276
Duplicate of this bug: 758845
You need to log in before you can comment on or make changes to this bug.