Closed Bug 792966 Opened 12 years ago Closed 11 years ago

<video> and <canvas> with width <64 pixels (128 or 256 bytes) are not rendered by GPU

Categories

(Core :: Graphics: Layers, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla21
blocking-b2g -
Tracking Status
firefox19 --- wontfix
firefox20 --- wontfix
firefox21 --- fixed
b2g18 + fixed
b2g18-v1.0.0 --- fixed

People

(Reporter: cjones, Assigned: bjacob)

References

Details

(Whiteboard: [caf:help])

Attachments

(2 files)

See bug 788411.  Not clear if this affects WebGL textures as well.

This bug doesn't affect hwcompositor, only the GPU.  Landing hwcompositor is an acceptable workaround for v1.
I'm not convinced that this is actually a GPU problem.   After backing out bug 788411, I added this cheesy block in gralloc_unlock() right before the cache flush:

        if (hnd->width == 32 && hnd->height == 32) {
          memset((void *) hnd->base, 0xFF, hnd->size);
        }

and the lock screen arrows then turn into white blocks.  GPU is not involved with populating the buffer right?  If this was a GPU issue, I would expect the white blocks to also not appear on the framebuffer after composition.  

Although I don't have a theory for why we didn't see this problem with the hwcompositor yet...
(In reply to Michael Vines [:m1] from comment #1)
> I'm not convinced that this is actually a GPU problem.   After backing out
> bug 788411, I added this cheesy block in gralloc_unlock() right before the
> cache flush:
> 
>         if (hnd->width == 32 && hnd->height == 32) {
>           memset((void *) hnd->base, 0xFF, hnd->size);
>         }
> 
> and the lock screen arrows then turn into white blocks.  GPU is not involved
> with populating the buffer right?  If this was a GPU issue, I would expect
> the white blocks to also not appear on the framebuffer after composition.  
> 

Correct.  That's very interesting.  /me scratches head
I hit this bug running B2G on the Otoro and created a test page on which I can reproduce it easily:

http://people.mozilla.com/~gsvelto/canvas_bug.html

The page has two blue square divs, 32x32 and 16x16 respectively, below them are two canvas elements of the same sizes, painted in red and green. On the right there's an additional tall canvas element (16x160) painted orange.

The problems I see are the following:

- The 16x16 and 16x160 canvas elements are not shown in the Otoro's browser, increasing any dimension from 16 to 17 pixels makes them visible again. So it looks like all elements with either width or height <= 16 pixels is affected.

- When zoomed out the 32x32 canvas appears smaller than the 32x32 div above it, I'll post a snapshot that shows the problem.

- Finally if one pays attention when reloading the page the 32x32 canvas appears slightly larger first (from what I can see, the same size of the div element) and is then immediately repainted smaller. This happens very fast so I'm not sure if I can capture it on video but if you have a device it should be easy to reproduce by reloading the page multiple times while looking at the square.
Sounds similar to bug 791432 (which is a security bug, but is basically about the Adreno skipping drawing on surfaces smaller than a certain size).
blocking-b2g: --- → tef+
Assignee: nobody → bjacob
Benoit, can you figure out if this is also a driver issue?
Flags: needinfo?(bjacob)
Yes, we previously established that this was a driver bug.

I assume this was tef+'d based on comment 5?  If so, this isn't the same kind of bug because *nothing* renders in the buggy case here.  The gralloc buffers are cleared when created so that's not an issue either.

As much as it pains me to leave a bug of this nature in our platform, we just haven't seemed to have seen major issues from this in the field.  (The issue Gabriele hit was with an awkward hack to work around other gecko brokenness ;).)  So I wouldn't hold the release for this as things stand now.

I can't imagine that having very small videos or canvases in the DOM (as opposed to being used for temporary buffers) is a common case in games, but if someone knows how to get some more in-depth QA there that could change my mind.
blocking-b2g: tef+ → -
Flags: needinfo?(bjacob)
OK. I remain concerned about the other bug, bug 791432, which may or may not be the same underlying driver bug, and is a security issue.
I share your concern about bug 791432 and b2g phones will almost certainly be susceptible as well.  However, that's an entirely different beast than the bug here.  Happy to follow up in private if you'd like.
I just verified: bug 791432 doesn't affect Otoro or Unagi devices. So at this point it's not a concern for B2G.
We can work around the driver bug pretty easily, can't we? Have a min texture size, and add clamped pixels to the texture when we upload to it.
We can't easily do exactly that (extend by clamping) due to the way that texture sampling works in GLSL. We'd need to keep track of the original texture size and pass it to any shader sampling it, so that it would multiply every texture coordinate by (original_size / tweaked_size). So we'd have to rewrite all shaders for that.

We can however upsize the textures, which will result in over-sampled rendering into the larger texture, which, when used by the compositor, would then be downscaled. For video, this would just work and make no visible difference. For Canvas 2D we'd have to be a little careful with texture filters to avoid blurring out certain 2d primitives.
(In reply to Benoit Jacob [:bjacob] from comment #12)
> We can however upsize the textures, which will result in over-sampled
> rendering into the larger texture, which, when used by the compositor, would
> then be downscaled. For video, this would just work and make no visible
> difference. For Canvas 2D we'd have to be a little careful with texture
> filters to avoid blurring out certain 2d primitives.

That sounds pretty good actually. I think we could live with a workaround that blurred small canvases. That's a lot better than not appearing at all.
Another problem now that I think about it is all readPixels-like operations exposed by Canvas APIs. Canvas 2D getImageData, WebGL readPixels. We'd have to add a tweak there (downsizing the texture before reading pixels, or downsizing the resulting bitmap).

Also, small canvases actually matter a lot... for conformance tests. At least WebGL conformance tests do use a lot of small canvases; I expect that 2D Canvas tests would do the same. So this bug seems to mean that we can't pass canvas (2d and webgl) conformance tests on these devices? That's actually a big problem... testing at the moment.
(In reply to Michael Vines [:m1] from comment #1)
> Although I don't have a theory for why we didn't see this problem with the
> hwcompositor yet...

As I recall the addendum to this was that we *do* see the same problem with hwc.

One thing I just remembered is that I concluded this was a driver bug after testing on nexus s.  But nexus s has a 24bpp display, so we can take some different paths through cairo.  So I'll retest on otoro with 24bpp forced on.

It strikes me though that the simplest workaround of all is to fall back on shmem+upload (avoid gralloc) for too-small surfaces.
Hm, WebGL canvases seem to be unaffected --- lots of WebGL conformance tests that use small canvases, succeed on my Otoro phone which does reproduce the bug.

So it seems that small OpenGL textures are fine --- so this has indeed nothing to do with bug 791432.

Do I understand correctly that this bug is only about gralloc buffers and no other kind of graphics surface? Sorry I'm a bit slow...
Testing with http://people.mozilla.com/~cjones/small-buffers.html , I don't see the rightmost (<canvas>) buffer on my adreno phone.

If I break on the gralloc buffer we allocate for the canvas in the content process and dump the image bits, they're as expected (blue).  If I break on the buffer in the compositor and dump the bits, they're also as expected (blue).  That doesn't rule out a GPU cache coherency issue, but it means gecko seems to be doing the right thing.

I tried examining the buffer as seen by the GPU using Dr. Vlad's Cure-All LayerScope Tonic Tool (bug 830881), but unfortunately we fail to read it back.  Maybe or maybe not related.
Attached patch HackSplinter Review
Does the trick.  Ain't pretty, like all the code I write these days.

m1, let me know if you guys have any ideas for a real fix here.  Looking down from gecko at the GraphicBuffer.h level API, I don't see anything we're doing wrong.
Attachment #703149 - Flags: feedback?(mvines)
Attachment #703149 - Flags: feedback?(jhicks)
(In reply to Chris Jones [:cjones] [:warhammer] from comment #17)
> Testing with http://people.mozilla.com/~cjones/small-buffers.html , I don't
> see the rightmost (<canvas>) buffer on my adreno phone.

I'm not seeing the bug on my device. I get both blue canvases. Your workaround in comment 18 sounds reasonable to me for now until I can get accomplish some more investigation.
I've tested attachment 703149 [details] [diff] [review] on my Otoro and it fixes the issue for me when looking at http://people.mozilla.com/~gsvelto/canvas_bug.html though when zooming out the canvas elements still have a different size from the divs. That's likely to be a different issue then, but at least the small canvas elements are properly visible now.
Comment on attachment 703149 [details] [diff] [review]
Hack

This hack seems fair, and maybe even a good idea in general.  Like the patch mentions for sizes < 64 we're probably within a single page or two so not like gralloc buys us much beyond the risk of fragmenting that heap.
Attachment #703149 - Flags: feedback?(mvines)
Attachment #703149 - Flags: feedback?(jhicks)
Attachment #703149 - Flags: feedback+
Attachment #703149 - Flags: review?(kchen) → review+
Comment on attachment 703149 [details] [diff] [review]
Hack

I thought the fix for this would be invasive and risky, but it turned out to be quite straightforward and non-risky.

We haven't run into major issues caused by this problem beyond a hack that gsvelto was implementing, but this is a nasty hole in the platform.  Parts of pages will just not work with no indication why.  I think the cost/benefit is about right for an uplift now.
Attachment #703149 - Flags: approval-mozilla-b2g18?
https://hg.mozilla.org/mozilla-central/rev/126481a3ca24
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla21
(In reply to Chris Jones [:cjones] [:warhammer] from comment #23)
> Comment on attachment 703149 [details] [diff] [review]
> Hack
> 
> I thought the fix for this would be invasive and risky, but it turned out to
> be quite straightforward and non-risky.
> 
> We haven't run into major issues caused by this problem beyond a hack that
> gsvelto was implementing, but this is a nasty hole in the platform.  Parts
> of pages will just not work with no indication why.  I think the
> cost/benefit is about right for an uplift now.

Does something bad happen if the magic 64 in two different files become two different numbers?  Is it worth filing a bug to #define/const/whatever somewhere so that we guarantee that the two places that use it are using the same value?
Nothing particularly bad, but if either accidentally got smaller this bug would come back.  Makes sense to file a followup on giving them a symbolic constant.
Comment on attachment 703149 [details] [diff] [review]
Hack

Approving for 1.0.0 (mozilla-b2g18_v_1_0_0 branch tip) uplift.
Attachment #703149 - Flags: approval-mozilla-b2g18? → approval-mozilla-b2g18+
See Also: → 978176
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: