Last Comment Bug 721489 - Older Adreno 200 drivers intermittently crash when uploading RGB565 textures with glTexImage2D
: Older Adreno 200 drivers intermittently crash when uploading RGB565 textures ...
Status: RESOLVED FIXED
mwc-demo
:
Product: Core
Classification: Components
Component: Graphics (show other bugs)
: unspecified
: ARM Android
: -- critical (vote)
: ---
Assigned To: Benoit Girard (:BenWa)
:
Mentors:
: 623501 724953 731174 (view as bug list)
Depends on: 750590
Blocks: 728249 619615 698673
  Show dependency treegraph
 
Reported: 2012-01-26 12:35 PST by George Wright (:gw280) (:gwright)
Modified: 2012-04-30 19:31 PDT (History)
13 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Part 1: Allocate a PoT Shmem for the Adreno (14.44 KB, patch)
2012-02-09 14:41 PST, Benoit Girard (:BenWa)
no flags Details | Diff | Review
Make gfxSharedImageSurface divisible by 64 on mobile (2.89 KB, patch)
2012-02-10 12:11 PST, Benoit Girard (:BenWa)
cjones.bugs: review-
Details | Diff | Review
Part 1: support stride in gfxSharedImageSurface (5.95 KB, patch)
2012-02-13 12:04 PST, Benoit Girard (:BenWa)
cjones.bugs: review-
Details | Diff | Review
Part 2: Request stride from gfxSharedImageSurface (14.27 KB, patch)
2012-02-13 13:49 PST, Benoit Girard (:BenWa)
no flags Details | Diff | Review
Support stride in gfxSharedImageSurface (6.81 KB, patch)
2012-02-14 08:56 PST, Benoit Girard (:BenWa)
cjones.bugs: review+
Details | Diff | Review
Add a codepath to only allow for power-of-two textures to be allocated on certain GPUs to fix stability issues (15.68 KB, patch)
2012-02-15 16:56 PST, George Wright (:gw280) (:gwright)
no flags Details | Diff | Review
Bug 721489 - Add a codepath to only allow for power-of-two textures to be allocated on certain GPUs to fix stability issues. Patch by Benoit Girard and George Wright. r=? (18.51 KB, patch)
2012-02-17 14:43 PST, George Wright (:gw280) (:gwright)
jmuizelaar: superreview-
Details | Diff | Review
Patch fixed to be off mozilla-central rather than maple (18.61 KB, patch)
2012-02-18 11:41 PST, George Wright (:gw280) (:gwright)
no flags Details | Diff | Review
Update patch yet again to handle canvas layers properly... (22.11 KB, patch)
2012-02-18 16:20 PST, George Wright (:gw280) (:gwright)
jmuizelaar: review+
Details | Diff | Review
Finally final patch. Add support to pref POT textures (23.34 KB, patch)
2012-02-18 17:57 PST, George Wright (:gw280) (:gwright)
jacob.benoit.1: review+
Details | Diff | Review

Description George Wright (:gw280) (:gwright) 2012-01-26 12:35:14 PST
This may be an issue with older drivers and using NPOT textures. Basically it looks like the driver is reading beyond the bounds of the pixel buffer we pass to it via glTexImage2D.

This is not an issue on later Adreno 200 drivers, but whatever is on the Toronto Nexus One is definitely affected.

We need to find an acceptable workaround for this if we want to support GL layers on Adreno 200 devices in the wild, as we have no control or guarantee the drivers are up to date at all.

We have no guarantee that it is to do with NPOT but all signs are pointing to that. Therefore switching to use POT textures may fix it. A quick workaround that is known to stop it from crashing is to allocate a buffer of size 4 bytes * width * height and memcpying the pixel data to the first half of that buffer.
Comment 1 Benoit Girard (:BenWa) 2012-02-09 14:41:17 PST
Created attachment 595882 [details] [diff] [review]
Part 1: Allocate a PoT Shmem for the Adreno

The Adreno driver likes to read outside the bounds of what we give it. By using Power of Two Shmem the Adreno driver wont trigger a segfault when trying to read outside the dimensions it should be reading. The alternative to this Part 1 would be to memcpy from the Shmem to a PoT buffer before the upload which also removes the crash but is more expensive. 

It still causes some visual artifacts when it does decide to read outside the given range to the next PoT. If we like this patch I'll fix up that piece as well in part 2. We can upload always in PoT for these drivers and leave our texture coord untouched thus never showing the extra texture region on screen.

By using smaller tiles on the Adreno such as 256x256 wen can minimize the wasted space due to round up.
Comment 2 Benoit Girard (:BenWa) 2012-02-09 16:51:39 PST
Comment on attachment 595882 [details] [diff] [review]
Part 1: Allocate a PoT Shmem for the Adreno

Will post a patch that uses 128 (or so alignment) that seems like a better fix per IRC convo. Needs a bit more analysis, should be posting it tomorrow.
Comment 3 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-02-09 17:07:33 PST
Let's try going down to 16 if possible.  That's one NEON register word IIRC.
Comment 4 Benoit Girard (:BenWa) 2012-02-10 12:11:19 PST
Created attachment 596139 [details] [diff] [review]
Make gfxSharedImageSurface divisible by 64 on mobile

32 and lower lead to crashes. I don't see any visual artifacts with this patch applied.
Comment 5 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-02-10 16:49:43 PST
Comment on attachment 596139 [details] [diff] [review]
Make gfxSharedImageSurface divisible by 64 on mobile

I really don't like sticking this in gfxSharedImageSurface.  It's also
not correct, because we can use GL layers directly on the adreno200,
for chrome code or video.  If we allocate an unfortunately-sized
surface there, we should also crash in theory.

So please put this code in layers.  We have a lot of flexibility in
the buffer sizes we use there.

One more thing: this is actually aligning the stride by 64 *pixels*,
not bytes.  So that means 128 bytes commonly, I guess, but 256 bytes
on 24-bit displays.  The maximum texture we might try to allocate is
in the neighborhood of 2000x2000, and if we hit an unaligned size,
we'll end up wasting up to ~1MB.  That's probably not acceptable.  So
I think we should probably gate on a check like in your "adreno200
quirks" patch.

We'll further want to gate this on the adreno200 version, but we can
do that as a followup.
Comment 6 Benoit Girard (:BenWa) 2012-02-13 12:04:28 PST
Created attachment 596741 [details] [diff] [review]
Part 1: support stride in gfxSharedImageSurface

Starting part 2. I'm going to take the GPU detection piece of my previous patch and use it to calculate a special stride for the adreno.
Comment 7 Benoit Girard (:BenWa) 2012-02-13 13:49:50 PST
Created attachment 596782 [details] [diff] [review]
Part 2: Request stride from gfxSharedImageSurface
Comment 8 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-02-13 16:06:44 PST
Comment on attachment 596741 [details] [diff] [review]
Part 1: support stride in gfxSharedImageSurface

>diff --git a/gfx/thebes/gfxSharedImageSurface.cpp b/gfx/thebes/gfxSharedImageSurface.cpp

> struct SharedImageInfo {
>     PRInt32 width;
>     PRInt32 height;
>+    long stride;

PRUint32.  This control block is shared across processes with
different architectures, so |long| is a no-no since it's machine
dependent.

>diff --git a/gfx/thebes/gfxSharedImageSurface.h b/gfx/thebes/gfxSharedImageSurface.h

>     template<class ShmemAllocator, bool Unsafe>
>     static already_AddRefed<gfxSharedImageSurface>
>     Create(ShmemAllocator* aAllocator,
>            const gfxIntSize& aSize,
>            gfxImageFormat aFormat,
>+           long aStride,

You need to document whether stride is in pixels or bytes.  I think
it's more natural to specify a stride in bytes, since that's what
hardware cares about.

I think |aStride| is not a very good interface here.  What happens if
a caller passes aSize=<w=1000> but stride=5?  Let's instead make it an
|aStrideAlignBytes| parameter.  For example, aStrideAlignBytes=16
would mean align the stride on a 16-byte boundary.  If
(aStrideAlignBytes % BytesPerPixel(aFormat)) != 0, this function
should fail loudly.

I also don't want to force stride overridding on all consumers of
gfxSharedImageSurface.  Let's make it an optional parameter that
defaults to 0 or something to mean "whatever stride this function
wants to choose".

r- for security bug above.
Comment 9 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-02-13 16:08:38 PST
Comment on attachment 596782 [details] [diff] [review]
Part 2: Request stride from gfxSharedImageSurface

The quirks stuff in here looks fine but I'd rather review after the changes to the previous patch.
Comment 10 Benoit Girard (:BenWa) 2012-02-13 16:23:28 PST
(In reply to Chris Jones [:cjones] [:warhammer] from comment #8)
> I also don't want to force stride overridding on all consumers of
> gfxSharedImageSurface.  Let's make it an optional parameter that
> defaults to 0 or something to mean "whatever stride this function
> wants to choose".
> 

It's not evident from this patch but the Create that is modified is private and only called by the public Create(Unsafe) which have the same signature. In my second patch I expose a new CreateUnsafe that takes a stride, it should of went into the first patch.
Comment 11 Benoit Girard (:BenWa) 2012-02-14 08:56:20 PST
Created attachment 597043 [details] [diff] [review]
Support stride in gfxSharedImageSurface

I'm no longer convinced that using a specific stride fixes the crash, I think we're just making it less likely to cause a segfault. This is also making us hit a slow path when 'width != stride' and we don't have the unpack extension.

We should land this patch anyways but not use it.
Comment 12 George Wright (:gw280) (:gwright) 2012-02-15 16:56:09 PST
Created attachment 597616 [details] [diff] [review]
Add a codepath to only allow for power-of-two textures to be allocated on certain GPUs to fix stability issues
Comment 13 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-02-15 22:42:31 PST
Comment on attachment 597043 [details] [diff] [review]
Support stride in gfxSharedImageSurface

>diff --git a/gfx/thebes/gfxSharedImageSurface.h b/gfx/thebes/gfxSharedImageSurface.h

>+             if ((aStrideAlignBytes % BytePerPixelFromFormat(aFormat)) != 0) {
>+                 NS_RUNTIMEABORT("Stride is not a multiple of bytes per pixel.");
>+                 return nsnull;
>+             }

Make this MOZ_ASSERT.

>+             if (aStrideAlignBytes < stride) {
>+                 NS_RUNTIMEABORT("Requesting stride smaller than image width.");
>+                 return nsnull;
>+             }
>+             stride = aStrideAlignBytes;

Oh ... my intention here was for aStrideAlignBytes to nudge up the
computed stride as needed to make (computedStride % aStrideAlignBytes
== 0).  So you'd pass |aStrideAlignBytes = 16| to get a stride that
falls on a 16-byte boundary.  To get that, remove the check for
|(aStrideAlignBytes < stride)| and then compute the stride as

  if (stride % aStrideAlignBytes) {
    stride = aStrideAlignBytes * (1 + stride / aStrideAlignBytes);
  }

Does that make sense?

r=me with those changes.
Comment 14 George Wright (:gw280) (:gwright) 2012-02-16 08:27:14 PST
Comment on attachment 597616 [details] [diff] [review]
Add a codepath to only allow for power-of-two textures to be allocated on certain GPUs to fix stability issues

Patch by both myself and BenWa
Comment 15 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-02-16 11:51:32 PST
Can we please only do this for older drivers?  A 4x memory overhead in the worst case is going to hurt on these devices, which tend to be lower end to begin with.
Comment 16 George Wright (:gw280) (:gwright) 2012-02-16 11:52:48 PST
There's no way to determine which driver versions are good and which are bad. However, one thing we do know is that the renderer string "Adreno 200" is old and the newer drivers use "Adreno (TM) 200", so this codepath won't get hit on newer drivers.
Comment 17 George Wright (:gw280) (:gwright) 2012-02-16 11:56:24 PST
Comment on attachment 597616 [details] [diff] [review]
Add a codepath to only allow for power-of-two textures to be allocated on certain GPUs to fix stability issues

Cancel review as I've since found issues with rendering image layers.
Comment 18 Benoit Girard (:BenWa) 2012-02-16 12:40:04 PST
*** Bug 724953 has been marked as a duplicate of this bug. ***
Comment 19 Naoki Hirata :nhirata (please use needinfo instead of cc) 2012-02-16 17:27:14 PST
Moving up to crit as it's a crashing bug
Comment 20 George Wright (:gw280) (:gwright) 2012-02-17 14:43:14 PST
Created attachment 598385 [details] [diff] [review]
Bug 721489 - Add a codepath to only allow for power-of-two textures to be allocated on certain GPUs to fix stability issues. Patch by Benoit Girard and George Wright. r=?
Comment 21 Mozilla RelEng Bot 2012-02-17 15:03:38 PST
Autoland Patchset:
	Patches: 598385
	Branch: mozilla-central => try
Error applying patch 598385 to mozilla-central.
patching file gfx/gl/GLContext.cpp
Hunk #1 FAILED at 595
1 out of 5 hunks FAILED -- saving rejects to file gfx/gl/GLContext.cpp.rej
patching file gfx/gl/GLContext.h
Hunk #1 FAILED at 708
1 out of 1 hunks FAILED -- saving rejects to file gfx/gl/GLContext.h.rej
abort: patch failed to apply

Could not apply and push patchset:
Comment 22 George Wright (:gw280) (:gwright) 2012-02-18 11:41:22 PST
Created attachment 598557 [details] [diff] [review]
Patch fixed to be off mozilla-central rather than maple

Bug 721489 - Add a codepath to only allow for power-of-two textures to be allocated on certain GPUs to fix stability issues. Patch by Benoit Girard and George Wright. r
Comment 23 Mozilla RelEng Bot 2012-02-18 11:45:57 PST
Autoland Patchset:
	Patches: 598557
	Branch: mozilla-central => try
	Destination: http://hg.mozilla.org/try/pushloghtml?changeset=add21f2f75ef
Try run started, revision add21f2f75ef. To cancel or monitor the job, see: https://tbpl.mozilla.org/?tree=Try&rev=add21f2f75ef
Comment 24 Jeff Muizelaar [:jrmuizel] 2012-02-18 11:54:29 PST
Comment on attachment 598385 [details] [diff] [review]
Bug 721489 - Add a codepath to only allow for power-of-two textures to be allocated on certain GPUs to fix stability issues. Patch by Benoit Girard and George Wright. r=?

Review of attachment 598385 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/gl/GLContext.cpp
@@ +600,5 @@
>      once = true;
>  #endif
>  }
>  
> +static unsigned char*

What does PackedPixels mean in this context? I think this method could use a better name or description.

@@ +605,5 @@
> +CopyToPackedPixelsBuffer(const GLvoid* rowSource,
> +                         GLsizei srcWidth, GLsizei srcHeight,
> +                         GLsizei dstWidth, GLsizei dstHeight,
> +                         GLsizei stride, GLint pixelsize)
> +{

I wonder if it would be better if the allocation was in the caller instead of the CopyToPackedPixelsBuffer()

@@ +610,5 @@
> +    unsigned char *newPixels = new unsigned char[dstWidth * dstHeight * pixelsize];
> +    unsigned char *rowDest = newPixels;
> +    const unsigned char *source = static_cast<const unsigned char*>(rowSource);
> +
> +    for (int h = 0; h < srcHeight; ++h) {

Perhaps these should be GLsizei instead of int.

@@ +617,5 @@
> +        source += stride;
> +    }
> +
> +    int padHeight = srcHeight;
> +

a comment like:
// pad out an extra row of pixels

@@ +622,5 @@
> +    if (dstHeight > srcHeight) {
> +        memcpy(rowDest, source - stride, srcWidth * pixelsize);
> +        padHeight++;
> +    }
> +

a comment would be good here.

@@ +652,5 @@
> +bool
> +GLContext::CanUploadNonPowerOfTwo()
> +{
> +    // Some GPUs driver crash when uploading non power of two 565 textures.
> +    return Renderer() != RendererAdreno200;

Is this fixed in a newer version of the driver? Can we only use the workaround then?

@@ +2204,5 @@
> +        // as we don't support/want non power of two texture uploads
> +        GLsizei paddedWidth = gfxUtils::NextPowerOfTwo(width);
> +        GLsizei paddedHeight = gfxUtils::NextPowerOfTwo(height);
> +
> +        unsigned char *paddedPixels = CopyToPackedPixelsBuffer(pixels, width, height,

Either here or above there should be a comment describing why we need to pad out.

@@ +2519,5 @@
>      // otherwise.
>      GLfloat xlen = (1.0f - tl[0]) + br[0];
>      GLfloat ylen = (1.0f - tl[1]) + br[1];
>  
> +

extra whitespace

::: gfx/thebes/gfxUtils.cpp
@@ +702,5 @@
> +gfxUtils::IsPowerOfTwo(int aNumber)
> +{
> +    return (aNumber & (aNumber - 1)) == 0;
> +}
> +

I don't see any reason these need to be members of the gfxUtils class. I think gfxUtils stuff might be a hold over from the pre-namespace time.
Comment 25 Mozilla RelEng Bot 2012-02-18 16:16:27 PST
Try run for add21f2f75ef is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=add21f2f75ef
Results (out of 211 total builds):
    success: 175
    warnings: 21
    failure: 15
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/autolanduser@mozilla.com-add21f2f75ef
Comment 26 George Wright (:gw280) (:gwright) 2012-02-18 16:20:41 PST
Created attachment 598595 [details] [diff] [review]
Update patch yet again to handle canvas layers properly...

Bug 721489 - Add a codepath to only allow for power-of-two textures to be allocated on certain GPUs to fix stability issues. Patch by Benoit Girard and George Wright. r=jrmuizel
Comment 27 Jeff Muizelaar [:jrmuizel] 2012-02-18 16:34:55 PST
Comment on attachment 598595 [details] [diff] [review]
Update patch yet again to handle canvas layers properly...

Review of attachment 598595 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/gl/GLContext.cpp
@@ +2219,5 @@
> +        GLsizei paddedWidth = NextPowerOfTwo(width);
> +        GLsizei paddedHeight = NextPowerOfTwo(height);
> +
> +        GLvoid* paddedPixels = (GLvoid*)new unsigned char[paddedWidth * paddedHeight * pixelsize];
> +

I don't think you need the cast here.

::: gfx/layers/opengl/CanvasLayerOGL.cpp
@@ +432,5 @@
> +      // We can't use BindAndDrawQuad because that always uploads the whole texture from 0.0f -> 1.0f
> +      // in x and y. We use BindAndDrawQuadWithTextureRect to actually draw a subrect of the texture
> +      // We need to reset the origin to 0,0 from the tile rect because the tile originates at 0,0 in the
> +      // actual texture, even though its origin in the composed (tiled) texture is not 0,0
> +      // FIXME: we need to handle mNeedsYFlip

add a bug number for this.
Comment 28 George Wright (:gw280) (:gwright) 2012-02-18 17:57:10 PST
Created attachment 598604 [details] [diff] [review]
Finally final patch. Add support to pref POT textures

Bug 721489 - Add a codepath to only allow for power-of-two textures to be allocated on certain GPUs to fix stability issues. Patch by Benoit Girard and George Wright. r=jrmuizel
Comment 29 Benoit Jacob [:bjacob] (mostly away) 2012-02-18 18:08:12 PST
Comment on attachment 598604 [details] [diff] [review]
Finally final patch. Add support to pref POT textures

George said on IRC that r+ could reasonably be carried over from an earlier version of the patch.
Comment 30 Benoit Jacob [:bjacob] (mostly away) 2012-02-18 18:26:05 PST
http://hg.mozilla.org/projects/maple/rev/0a5dbb35bf45
Comment 31 Naoki Hirata :nhirata (please use needinfo instead of cc) 2012-02-19 23:50:13 PST
https://crash-stats.mozilla.com/report/index/42a8801e-24d6-4f0f-8713-babb52120220

Can we get this pushed to the m-c please?
Comment 32 Naoki Hirata :nhirata (please use needinfo instead of cc) 2012-02-19 23:51:38 PST
Reproduce steps : 
1. go to http://paulrouget.com/mwc-demos/thumbnails/battery.png
2. Take a picture with the webcam API
3. rotate from portrait to landscape
Comment 33 Joe Drew (not getting mail) 2012-02-20 07:54:26 PST
(In reply to Naoki Hirata :nhirata from comment #31)
> https://crash-stats.mozilla.com/report/index/42a8801e-24d6-4f0f-8713-
> babb52120220
> 
> Can we get this pushed to the m-c please?

This won't fix anything on m-c, though it is caused by the same bug in the Adreno drivers.

We only use OpenGL from the Java compositor on m-c. This patch is to Gecko, though. It won't affect anything on the Java side.
Comment 34 Naoki Hirata :nhirata (please use needinfo instead of cc) 2012-02-20 09:41:35 PST
I see.  thanks for the response.  What should we do with the Nightly crash then?  Should we reopen the bug 724953?  or wait until maple merges with m-c?
Comment 35 Joe Drew (not getting mail) 2012-02-20 09:52:07 PST
Waiting until Maple merges is probably the best strategy. Spending developer time on mozilla-central at this point is probably counterproductive.
Comment 36 Benoit Girard (:BenWa) 2012-02-21 08:19:29 PST
*** Bug 623501 has been marked as a duplicate of this bug. ***
Comment 37 Naoki Hirata :nhirata (please use needinfo instead of cc) 2012-02-21 08:57:47 PST
Sounds good.  Thanks again for the response and clarification.
Comment 38 Naoki Hirata :nhirata (please use needinfo instead of cc) 2012-02-28 08:26:52 PST
*** Bug 731174 has been marked as a duplicate of this bug. ***
Comment 39 Cristian Nicolae (:xti) 2012-03-30 07:15:53 PDT
I got this crash on the latest Nightly: https://crash-stats.mozilla.com/report/index/bp-05f4b903-bd63-4e74-97b5-520fe2120330

I went to http://goo.gl/6BBjr and I panned and zoomed the page for several times.
--
Firefox 14.0a1 (2012-03-30)
Device: HTC Desire Z
OS: Android 2.3.3
Comment 40 Martijn Wargers [:mwargers] (not working for Mozilla) 2012-03-30 07:23:51 PDT
Related to bug 694964?

Note You need to log in before you can comment on or make changes to this bug.