Closed Bug 670573 Opened 13 years ago Closed 13 years ago

Corrupted (skewed) video when using OpenGL layers; crashes on OS X in libGLImage@0xaddress

Categories

(Core :: Audio/Video, defect)

x86
All
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla8
Tracking Status
firefox6 --- unaffected
firefox7 + fixed

People

(Reporter: Dolske, Assigned: mattwoodrow)

References

Details

(Keywords: regression, Whiteboard: [inbound])

Crash Data

Attachments

(4 files, 2 obsolete files)

I'm in the YouTube HTML5 beta, this is first corrupted video I can remember seeing. It plays fine in Chrome, so I assume it's a Firefox bug.

Video: http://www.youtube.com/watch?v=HtKUZ_SNK60

Oddly, it's apparently a re-upload of another video on YouTube... http://www.youtube.com/watch?v=2ex6dHzcgOE The original is mostly fine, but there's still a slight corruption, as if only part of one of the color channels has the problem.

I'm on the 7-9-2011 OS X Nightly.
Attached image Screenshot
WFM on [Mozilla/5.0 (Macintosh; Intel Mac OS X 10.5; rv:8.0a1) Gecko/20110710 Firefox/8.0a1], it works fine with or without hardware acceleration enabled. WFM on the same revision on Win7 as well. 

dolske: can you try toggling HW acceleration, and seeing if that affects things?
Summary: Corrupted video from YouTube → Corrupted (skewed) video from YouTube
I can reproduce this on Linux with OpenGL layers accelerated force-enabled.
OS: Mac OS X → All
Last good revision is 6e26fba24914, first bad revision is 05e0e6423227, which makes this a regression from bug 656185.
Blocks: 656185
Keywords: regression
Summary: Corrupted (skewed) video from YouTube → Corrupted (skewed) video when using OpenGL layers
Assignee: nobody → matt.woodrow+bugzilla
Crash Signature: [@ libGLImage.dylib@0x1cf1 ][@ libGLImage.dylib@0x4caf ]
Summary: Corrupted (skewed) video when using OpenGL layers → Corrupted (skewed) video when using OpenGL layers; crashes on OS X in libGLImage@0xaddress
FWIW, here's the real stack:

#0  0x0000000128599e62 in ?? ()
#1  0x00007fff86f4ccf2 in glgProcessPixelsWithProcessor ()
#2  0x000000011cf15465 in gleTextureImagePut ()
#3  0x000000011cf12c1e in glTexImage2D_Exec ()
#4  0x00007fff86174370 in glTexImage2D ()
#5  0x00000001017f1953 in mozilla::gl::GLContext::fTexImage2D (this=0x107d33a00, target=3553, level=0, internalformat=6408, width=273, height=180, border=0, format=6409, type=5121, pixels=0x15cbb6fc4) at GLContext.h:1733
#6  0x0000000102833a86 in mozilla::gl::GLContext::UploadSurfaceToTexture (this=0x107d33a00, aSurface=0x15db592c0, aDstRegion=@0x7fff5fbf97f0, aTexture=@0x7fff5fbf992c, aOverwrite=true, aSrcPoint=@0x7fff5fbf98f0, aPixelBuffer=false) at /Volumes/SSD/src/mozilla-central/gfx/thebes/GLContext.cpp:1778
#7  0x0000000102868b1b in mozilla::layers::UploadYUVToTexture (gl=0x107d33a00, aData=@0x15e3372e8, aYTexture=0x15e3372b8, aUTexture=0x15e3372c8, aVTexture=0x15e3372d8) at /Volumes/SSD/src/mozilla-central/gfx/layers/opengl/ImageLayerOGL.cpp:729
#8  0x0000000102868f2b in mozilla::layers::PlanarYCbCrImageOGL::UpdateTextures (this=0x15e337280, gl=0x107d33a00) at /Volumes/SSD/src/mozilla-central/gfx/layers/opengl/ImageLayerOGL.cpp:738
Comments in the crash reports with [@ glgProcessPixelsWithProcessor ]indicate this happens when streaming Netflix: https://crash-stats.mozilla.com/report/list?signature=glgProcessPixelsWithProcessor
This comment is very much un-true with WebM videos. I'm not actually sure why this fixes the problem, I can't see anything wrong with adjusting the stride.
Attachment #546010 - Flags: review?(tterribe)
Comment on attachment 546010 [details] [diff] [review]
Don't force the stride

This probably has to do with the GL_UNPACK_ALIGNMENT stuff your original patch removed. As discussed on IRC, I'll let you do a new patch to handle alignment in UploadSurfaceToTexture.
Attachment #546010 - Flags: review?(tterribe) → review-
Attached patch Correctly handle row alignment (obsolete) — Splinter Review
The problem was the GL assumes each pixel row starts 4 byte aligned, this changes the upload code to handle cases where this isn't the case.

Incorrect comment still removed.
Attachment #546010 - Attachment is obsolete: true
Attachment #546037 - Flags: review?(tterribe)
Comment on attachment 546037 [details] [diff] [review]
Correctly handle row alignment

You're only checking to see if the stride is not a multiple of 8, and not also checking the pointer to see if it starts off aligned.
Attachment #546037 - Flags: review?(tterribe) → review-
Good point, my assumption was that the base pointer was always the direct result of an allocation but this breaks down when we allocate all three planes together.

Another issue with this patch is we can upload multiple rectangles within the surface, and we aren't checking alignment of each of these.

Rather than complicating the UploadSurfaceToTexture logic further, I think we should create (relatively) thin wrappers around glTexImage2D/SubImage2D that handle stride/alignment and OpenGL ES specific problems.

This will result in us calling glPixelStorei more often, but we can cache this locally if it ever shows up as a performance issue.

I've got a WIP patch for this, but I'm worried about the risk of creating new regressions if we land it on aurora.

Tim: Can we guarantee that the strides returned from the decoder will always be 4 byte aligned? If so, I'd prefer to take the initial patch on aurora and land the full fix on nightlies.
(In reply to comment #14)
> Tim: Can we guarantee that the strides returned from the decoder will always
> be 4 byte aligned? If so, I'd prefer to take the initial patch on aurora and
> land the full fix on nightlies.

Yes, they'll actually be multiples of at least 16 for Y', and 8 for Cb and Cr. So your plan sounds reasonable to me.
Comment on attachment 546010 [details] [diff] [review]
Don't force the stride

r+ as a reasonably safe work-around for aurora, if not the best overall solution.
Attachment #546010 - Flags: review-
Attachment #546010 - Flags: review+
Attachment #546010 - Flags: approval-mozilla-aurora?
Attachment #546010 - Attachment is obsolete: false
This also fixes both broken testcases, and passes ogg-video reftests.
Attachment #546037 - Attachment is obsolete: true
Attachment #546479 - Flags: review?(tterribe)
Crash Signature: [@ libGLImage.dylib@0x1cf1 ][@ libGLImage.dylib@0x4caf ] → [@ libGLImage.dylib@0x1cf1 ] [@ libGLImage.dylib@0x4caf ]
Comment on attachment 546479 [details] [diff] [review]
Add wrappers to glTex(Sub)Image2D that handle alignment/stride

Review of attachment 546479 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with a couple of suggestions.

::: gfx/thebes/GLContext.cpp
@@ +1738,2 @@
>          } else {
> +            TexImage2D(LOCAL_GL_TEXTURE_2D,

Could I get an NS_ASSERT or an NS_ABORT_IF_FALSE that iterRect->x and iterRect->y are both zero in this case? Checking that the width and height cover the whole texture would be nice, too. joe assures me these things are true because of how the update region is initialized in BasicTextureImage::BeginUpdate(), but that seems somewhat fragile.

@@ +1773,5 @@
> +                      GLint pixelsize, GLint border, GLenum format, 
> +                      GLenum type, const GLvoid *pixels)
> +{
> +    fPixelStorei(LOCAL_GL_UNPACK_ALIGNMENT, 
> +                 std::min(GetAddressAlignment((ptrdiff_t)pixels),

All the other code in gfx/thebes appears to use NS_MIN instead of std::min. Perhaps we should be consistent?

@@ +1837,5 @@
> +                         GLint pixelsize, GLenum format, 
> +                         GLenum type, const GLvoid* pixels)
> +{
> +    fPixelStorei(LOCAL_GL_UNPACK_ALIGNMENT, 
> +                 std::min(GetAddressAlignment((ptrdiff_t)pixels),

See above.
Attachment #546479 - Flags: review?(tterribe) → review+
Attachment #546010 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Fixed review comments, carrying forward r=derf
Attachment #546479 - Attachment is obsolete: true
Attachment #547240 - Flags: review+
Comment on attachment 546010 [details] [diff] [review]
Don't force the stride

Is this a change that we could take for beta as well?
Attachment #546010 - Flags: approval-mozilla-beta?
I thought bug 656185 landed in Fx7, so beta should be unaffected?
And the proper fix on inbound: http://hg.mozilla.org/integration/mozilla-inbound/rev/39fef2383893
Whiteboard: [inbound]
And backed out for breaking mobile builds: http://hg.mozilla.org/integration/mozilla-inbound/rev/26d3e82605eb
http://hg.mozilla.org/mozilla-central/rev/26ac09ecc7dd
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla8
Talos is noticing a 1.78% Tp5 regression on Win XP on this changeset, is that expected or possible? If not should probably be backed out. Check m.d.tree-management thread.
Can we get an answer to comment 21? Is this a potential regression when we ship Firefox 6, or is it truly unaffected (as the status flag implies)?
Christian: The other bug only landed in time for fx7, so fx6 will be unaffected.

Marco: This code doesn't run on WinXP, so I can't see how this is possible
Comment on attachment 546010 [details] [diff] [review]
Don't force the stride

Approved to land for 6 beta per todays drivers meeting.
Attachment #546010 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
(In reply to comment #29)
> Christian: The other bug only landed in time for fx7, so fx6 will be
> unaffected.

Ok, thanks! We are revoking beta (currently Fx6) approval...don't land there as it isn't needed :-D
Attachment #546010 - Flags: approval-mozilla-beta+
Matt, can you please land this on aurora. Time is running short.
Asa, this landed on aurora as http://hg.mozilla.org/releases/mozilla-aurora/rev/dc22634e3b99.

Is there anything else I need to do?
I fiddled the status flag, we're all good...thanks!
Verified as fixed on:
WinXP -> Mozilla/5.0 (Windows NT 5.1; rv:7.0) Gecko/20100101 Firefox/7.0
Win7 -> Mozilla/5.0 (Windows NT 6.1; rv:7.0) Gecko/20100101 Firefox/7.0
Ubuntu 11.04 -> Mozilla/5.0 (X11; Linux x86_64; rv:7.0) Gecko/20100101 Firefox/7.0
MacOS X 10.6 -> Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:7.0) Gecko/20100101 Firefox/7.0

Video: http://www.youtube.com/watch?v=HtKUZ_SNK60 metioned in bug description plays properly, no image corruption observed.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: