Closed Bug 721489 Opened 12 years ago Closed 12 years ago

Older Adreno 200 drivers intermittently crash when uploading RGB565 textures with glTexImage2D

Categories

(Core :: Graphics, defect)

ARM
Android
defect
Not set
critical

Tracking

()

RESOLVED FIXED

People

(Reporter: gw280, Assigned: BenWa)

References

(Blocks 1 open bug)

Details

(Whiteboard: mwc-demo)

Attachments

(3 files, 7 obsolete files)

6.81 KB, patch
cjones
: review+
Details | Diff | Splinter Review
15.68 KB, patch
Details | Diff | Splinter Review
23.34 KB, patch
bjacob
: review+
Details | Diff | Splinter Review
This may be an issue with older drivers and using NPOT textures. Basically it looks like the driver is reading beyond the bounds of the pixel buffer we pass to it via glTexImage2D.

This is not an issue on later Adreno 200 drivers, but whatever is on the Toronto Nexus One is definitely affected.

We need to find an acceptable workaround for this if we want to support GL layers on Adreno 200 devices in the wild, as we have no control or guarantee the drivers are up to date at all.

We have no guarantee that it is to do with NPOT but all signs are pointing to that. Therefore switching to use POT textures may fix it. A quick workaround that is known to stop it from crashing is to allocate a buffer of size 4 bytes * width * height and memcpying the pixel data to the first half of that buffer.
Blocks: 619615, 698673
The Adreno driver likes to read outside the bounds of what we give it. By using Power of Two Shmem the Adreno driver wont trigger a segfault when trying to read outside the dimensions it should be reading. The alternative to this Part 1 would be to memcpy from the Shmem to a PoT buffer before the upload which also removes the crash but is more expensive. 

It still causes some visual artifacts when it does decide to read outside the given range to the next PoT. If we like this patch I'll fix up that piece as well in part 2. We can upload always in PoT for these drivers and leave our texture coord untouched thus never showing the extra texture region on screen.

By using smaller tiles on the Adreno such as 256x256 wen can minimize the wasted space due to round up.
Assignee: nobody → bgirard
Status: NEW → ASSIGNED
Attachment #595882 - Flags: review?(jones.chris.g)
Comment on attachment 595882 [details] [diff] [review]
Part 1: Allocate a PoT Shmem for the Adreno

Will post a patch that uses 128 (or so alignment) that seems like a better fix per IRC convo. Needs a bit more analysis, should be posting it tomorrow.
Attachment #595882 - Attachment is patch: false
Attachment #595882 - Attachment is obsolete: true
Attachment #595882 - Attachment is patch: true
Attachment #595882 - Flags: review?(jones.chris.g)
Let's try going down to 16 if possible.  That's one NEON register word IIRC.
32 and lower lead to crashes. I don't see any visual artifacts with this patch applied.
Attachment #596139 - Flags: review?(jones.chris.g)
Comment on attachment 596139 [details] [diff] [review]
Make gfxSharedImageSurface divisible by 64 on mobile

I really don't like sticking this in gfxSharedImageSurface.  It's also
not correct, because we can use GL layers directly on the adreno200,
for chrome code or video.  If we allocate an unfortunately-sized
surface there, we should also crash in theory.

So please put this code in layers.  We have a lot of flexibility in
the buffer sizes we use there.

One more thing: this is actually aligning the stride by 64 *pixels*,
not bytes.  So that means 128 bytes commonly, I guess, but 256 bytes
on 24-bit displays.  The maximum texture we might try to allocate is
in the neighborhood of 2000x2000, and if we hit an unaligned size,
we'll end up wasting up to ~1MB.  That's probably not acceptable.  So
I think we should probably gate on a check like in your "adreno200
quirks" patch.

We'll further want to gate this on the adreno200 version, but we can
do that as a followup.
Attachment #596139 - Flags: review?(jones.chris.g) → review-
Starting part 2. I'm going to take the GPU detection piece of my previous patch and use it to calculate a special stride for the adreno.
Attachment #596139 - Attachment is obsolete: true
Attachment #596741 - Flags: review?
Attachment #596741 - Flags: review? → review?(jones.chris.g)
Attachment #596782 - Flags: review?(jones.chris.g)
Comment on attachment 596741 [details] [diff] [review]
Part 1: support stride in gfxSharedImageSurface

>diff --git a/gfx/thebes/gfxSharedImageSurface.cpp b/gfx/thebes/gfxSharedImageSurface.cpp

> struct SharedImageInfo {
>     PRInt32 width;
>     PRInt32 height;
>+    long stride;

PRUint32.  This control block is shared across processes with
different architectures, so |long| is a no-no since it's machine
dependent.

>diff --git a/gfx/thebes/gfxSharedImageSurface.h b/gfx/thebes/gfxSharedImageSurface.h

>     template<class ShmemAllocator, bool Unsafe>
>     static already_AddRefed<gfxSharedImageSurface>
>     Create(ShmemAllocator* aAllocator,
>            const gfxIntSize& aSize,
>            gfxImageFormat aFormat,
>+           long aStride,

You need to document whether stride is in pixels or bytes.  I think
it's more natural to specify a stride in bytes, since that's what
hardware cares about.

I think |aStride| is not a very good interface here.  What happens if
a caller passes aSize=<w=1000> but stride=5?  Let's instead make it an
|aStrideAlignBytes| parameter.  For example, aStrideAlignBytes=16
would mean align the stride on a 16-byte boundary.  If
(aStrideAlignBytes % BytesPerPixel(aFormat)) != 0, this function
should fail loudly.

I also don't want to force stride overridding on all consumers of
gfxSharedImageSurface.  Let's make it an optional parameter that
defaults to 0 or something to mean "whatever stride this function
wants to choose".

r- for security bug above.
Attachment #596741 - Flags: review?(jones.chris.g) → review-
Comment on attachment 596782 [details] [diff] [review]
Part 2: Request stride from gfxSharedImageSurface

The quirks stuff in here looks fine but I'd rather review after the changes to the previous patch.
Attachment #596782 - Flags: review?(jones.chris.g)
(In reply to Chris Jones [:cjones] [:warhammer] from comment #8)
> I also don't want to force stride overridding on all consumers of
> gfxSharedImageSurface.  Let's make it an optional parameter that
> defaults to 0 or something to mean "whatever stride this function
> wants to choose".
> 

It's not evident from this patch but the Create that is modified is private and only called by the public Create(Unsafe) which have the same signature. In my second patch I expose a new CreateUnsafe that takes a stride, it should of went into the first patch.
I'm no longer convinced that using a specific stride fixes the crash, I think we're just making it less likely to cause a segfault. This is also making us hit a slow path when 'width != stride' and we don't have the unpack extension.

We should land this patch anyways but not use it.
Attachment #596741 - Attachment is obsolete: true
Attachment #596782 - Attachment is obsolete: true
Attachment #597043 - Flags: review?(jones.chris.g)
Comment on attachment 597043 [details] [diff] [review]
Support stride in gfxSharedImageSurface

>diff --git a/gfx/thebes/gfxSharedImageSurface.h b/gfx/thebes/gfxSharedImageSurface.h

>+             if ((aStrideAlignBytes % BytePerPixelFromFormat(aFormat)) != 0) {
>+                 NS_RUNTIMEABORT("Stride is not a multiple of bytes per pixel.");
>+                 return nsnull;
>+             }

Make this MOZ_ASSERT.

>+             if (aStrideAlignBytes < stride) {
>+                 NS_RUNTIMEABORT("Requesting stride smaller than image width.");
>+                 return nsnull;
>+             }
>+             stride = aStrideAlignBytes;

Oh ... my intention here was for aStrideAlignBytes to nudge up the
computed stride as needed to make (computedStride % aStrideAlignBytes
== 0).  So you'd pass |aStrideAlignBytes = 16| to get a stride that
falls on a 16-byte boundary.  To get that, remove the check for
|(aStrideAlignBytes < stride)| and then compute the stride as

  if (stride % aStrideAlignBytes) {
    stride = aStrideAlignBytes * (1 + stride / aStrideAlignBytes);
  }

Does that make sense?

r=me with those changes.
Attachment #597043 - Flags: review?(jones.chris.g) → review+
Comment on attachment 597616 [details] [diff] [review]
Add a codepath to only allow for power-of-two textures to be allocated on certain GPUs to fix stability issues

Patch by both myself and BenWa
Attachment #597616 - Flags: review?(joe)
Can we please only do this for older drivers?  A 4x memory overhead in the worst case is going to hurt on these devices, which tend to be lower end to begin with.
There's no way to determine which driver versions are good and which are bad. However, one thing we do know is that the renderer string "Adreno 200" is old and the newer drivers use "Adreno (TM) 200", so this codepath won't get hit on newer drivers.
Comment on attachment 597616 [details] [diff] [review]
Add a codepath to only allow for power-of-two textures to be allocated on certain GPUs to fix stability issues

Cancel review as I've since found issues with rendering image layers.
Attachment #597616 - Flags: review?(joe)
Whiteboard: mwc-demo
Moving up to crit as it's a crashing bug
Severity: normal → critical
Blocks: 728249
Attachment #598385 - Flags: superreview?(jmuizelaar)
Attachment #598385 - Flags: review?(joe)
Whiteboard: mwc-demo → [autoland-try:598385] mwc-demo
Whiteboard: [autoland-try:598385] mwc-demo → [autoland-in-queue] mwc-demo
Autoland Patchset:
	Patches: 598385
	Branch: mozilla-central => try
Error applying patch 598385 to mozilla-central.
patching file gfx/gl/GLContext.cpp
Hunk #1 FAILED at 595
1 out of 5 hunks FAILED -- saving rejects to file gfx/gl/GLContext.cpp.rej
patching file gfx/gl/GLContext.h
Hunk #1 FAILED at 708
1 out of 1 hunks FAILED -- saving rejects to file gfx/gl/GLContext.h.rej
abort: patch failed to apply

Could not apply and push patchset:
Whiteboard: [autoland-in-queue] mwc-demo → mwc-demo
Bug 721489 - Add a codepath to only allow for power-of-two textures to be allocated on certain GPUs to fix stability issues. Patch by Benoit Girard and George Wright. r
Whiteboard: mwc-demo → [autoland-try:598557] mwc-demo
Whiteboard: [autoland-try:598557] mwc-demo → [autoland-in-queue] mwc-demo
Autoland Patchset:
	Patches: 598557
	Branch: mozilla-central => try
	Destination: http://hg.mozilla.org/try/pushloghtml?changeset=add21f2f75ef
Try run started, revision add21f2f75ef. To cancel or monitor the job, see: https://tbpl.mozilla.org/?tree=Try&rev=add21f2f75ef
Comment on attachment 598385 [details] [diff] [review]
Bug 721489 - Add a codepath to only allow for power-of-two textures to be allocated on certain GPUs to fix stability issues. Patch by Benoit Girard and George Wright. r=?

Review of attachment 598385 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/gl/GLContext.cpp
@@ +600,5 @@
>      once = true;
>  #endif
>  }
>  
> +static unsigned char*

What does PackedPixels mean in this context? I think this method could use a better name or description.

@@ +605,5 @@
> +CopyToPackedPixelsBuffer(const GLvoid* rowSource,
> +                         GLsizei srcWidth, GLsizei srcHeight,
> +                         GLsizei dstWidth, GLsizei dstHeight,
> +                         GLsizei stride, GLint pixelsize)
> +{

I wonder if it would be better if the allocation was in the caller instead of the CopyToPackedPixelsBuffer()

@@ +610,5 @@
> +    unsigned char *newPixels = new unsigned char[dstWidth * dstHeight * pixelsize];
> +    unsigned char *rowDest = newPixels;
> +    const unsigned char *source = static_cast<const unsigned char*>(rowSource);
> +
> +    for (int h = 0; h < srcHeight; ++h) {

Perhaps these should be GLsizei instead of int.

@@ +617,5 @@
> +        source += stride;
> +    }
> +
> +    int padHeight = srcHeight;
> +

a comment like:
// pad out an extra row of pixels

@@ +622,5 @@
> +    if (dstHeight > srcHeight) {
> +        memcpy(rowDest, source - stride, srcWidth * pixelsize);
> +        padHeight++;
> +    }
> +

a comment would be good here.

@@ +652,5 @@
> +bool
> +GLContext::CanUploadNonPowerOfTwo()
> +{
> +    // Some GPUs driver crash when uploading non power of two 565 textures.
> +    return Renderer() != RendererAdreno200;

Is this fixed in a newer version of the driver? Can we only use the workaround then?

@@ +2204,5 @@
> +        // as we don't support/want non power of two texture uploads
> +        GLsizei paddedWidth = gfxUtils::NextPowerOfTwo(width);
> +        GLsizei paddedHeight = gfxUtils::NextPowerOfTwo(height);
> +
> +        unsigned char *paddedPixels = CopyToPackedPixelsBuffer(pixels, width, height,

Either here or above there should be a comment describing why we need to pad out.

@@ +2519,5 @@
>      // otherwise.
>      GLfloat xlen = (1.0f - tl[0]) + br[0];
>      GLfloat ylen = (1.0f - tl[1]) + br[1];
>  
> +

extra whitespace

::: gfx/thebes/gfxUtils.cpp
@@ +702,5 @@
> +gfxUtils::IsPowerOfTwo(int aNumber)
> +{
> +    return (aNumber & (aNumber - 1)) == 0;
> +}
> +

I don't see any reason these need to be members of the gfxUtils class. I think gfxUtils stuff might be a hold over from the pre-namespace time.
Attachment #598385 - Flags: superreview?(jmuizelaar) → superreview-
Try run for add21f2f75ef is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=add21f2f75ef
Results (out of 211 total builds):
    success: 175
    warnings: 21
    failure: 15
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/autolanduser@mozilla.com-add21f2f75ef
Whiteboard: [autoland-in-queue] mwc-demo → mwc-demo
Bug 721489 - Add a codepath to only allow for power-of-two textures to be allocated on certain GPUs to fix stability issues. Patch by Benoit Girard and George Wright. r=jrmuizel
Attachment #598385 - Attachment is obsolete: true
Attachment #598385 - Flags: review?(joe)
Attachment #598557 - Attachment is obsolete: true
Attachment #598595 - Flags: review?(joe)
Attachment #598595 - Flags: review?(jmuizelaar)
Comment on attachment 598595 [details] [diff] [review]
Update patch yet again to handle canvas layers properly...

Review of attachment 598595 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/gl/GLContext.cpp
@@ +2219,5 @@
> +        GLsizei paddedWidth = NextPowerOfTwo(width);
> +        GLsizei paddedHeight = NextPowerOfTwo(height);
> +
> +        GLvoid* paddedPixels = (GLvoid*)new unsigned char[paddedWidth * paddedHeight * pixelsize];
> +

I don't think you need the cast here.

::: gfx/layers/opengl/CanvasLayerOGL.cpp
@@ +432,5 @@
> +      // We can't use BindAndDrawQuad because that always uploads the whole texture from 0.0f -> 1.0f
> +      // in x and y. We use BindAndDrawQuadWithTextureRect to actually draw a subrect of the texture
> +      // We need to reset the origin to 0,0 from the tile rect because the tile originates at 0,0 in the
> +      // actual texture, even though its origin in the composed (tiled) texture is not 0,0
> +      // FIXME: we need to handle mNeedsYFlip

add a bug number for this.
Attachment #598595 - Flags: review?(jmuizelaar) → review+
Bug 721489 - Add a codepath to only allow for power-of-two textures to be allocated on certain GPUs to fix stability issues. Patch by Benoit Girard and George Wright. r=jrmuizel
Attachment #598595 - Attachment is obsolete: true
Attachment #598595 - Flags: review?(joe)
Attachment #598604 - Flags: review?(jmuizelaar)
Comment on attachment 598604 [details] [diff] [review]
Finally final patch. Add support to pref POT textures

George said on IRC that r+ could reasonably be carried over from an earlier version of the patch.
Attachment #598604 - Flags: review?(jmuizelaar) → review+
http://hg.mozilla.org/projects/maple/rev/0a5dbb35bf45
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Reproduce steps : 
1. go to http://paulrouget.com/mwc-demos/thumbnails/battery.png
2. Take a picture with the webcam API
3. rotate from portrait to landscape
(In reply to Naoki Hirata :nhirata from comment #31)
> https://crash-stats.mozilla.com/report/index/42a8801e-24d6-4f0f-8713-
> babb52120220
> 
> Can we get this pushed to the m-c please?

This won't fix anything on m-c, though it is caused by the same bug in the Adreno drivers.

We only use OpenGL from the Java compositor on m-c. This patch is to Gecko, though. It won't affect anything on the Java side.
I see.  thanks for the response.  What should we do with the Nightly crash then?  Should we reopen the bug 724953?  or wait until maple merges with m-c?
Waiting until Maple merges is probably the best strategy. Spending developer time on mozilla-central at this point is probably counterproductive.
Sounds good.  Thanks again for the response and clarification.
I got this crash on the latest Nightly: https://crash-stats.mozilla.com/report/index/bp-05f4b903-bd63-4e74-97b5-520fe2120330

I went to http://goo.gl/6BBjr and I panned and zoomed the page for several times.
--
Firefox 14.0a1 (2012-03-30)
Device: HTC Desire Z
OS: Android 2.3.3
Related to bug 694964?
Depends on: 750590
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: