Some frames not displayed in short video

VERIFIED FIXED in Firefox 36

Status

()

defect
VERIFIED FIXED
5 years ago
3 years ago

People

(Reporter: snorp, Assigned: snorp)

Tracking

unspecified
Firefox 36
All
Android
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite ?

Firefox Tracking Flags

(firefox34 wontfix, firefox35 affected, firefox36 verified, firefox37 verified, fennec35+)

Details

()

Attachments

(6 attachments, 3 obsolete attachments)

5.70 KB, patch
cpearce
: review+
Details | Diff | Splinter Review
7.82 KB, patch
jgilbert
: review+
Details | Diff | Splinter Review
12.17 KB, patch
jgilbert
: review+
Details | Diff | Splinter Review
13.92 KB, patch
jgilbert
: review+
Details | Diff | Splinter Review
2.25 KB, patch
cpearce
: review+
Details | Diff | Splinter Review
718 bytes, patch
Details | Diff | Splinter Review
If a video has a very short number of frames (say, two) we don't display them all. Sometimes none of them.
The video in the URL has two frames, each 10s long. We don't display any of them.
We treated Flush() and Drain() the same, which is obviously wrong. This makes Drain() behave correctly.

With this patch we only see the red frame (the second one) due to a problem with how the decoder outputs stuff into the Surface.
Assignee: nobody → snorp
Attachment #8520746 - Flags: review?(cpearce)
Attachment #8520746 - Flags: review?(cpearce) → review+
tracking-fennec: ? → 35+
I don't really like this patch, but I don't see another way around it. Also the timing stuff might be a little dumb, but seems to work well enough for me.
Attachment #8522425 - Flags: review?(cpearce)
Duplicate of this bug: 1092288
Comment on attachment 8522425 [details] [diff] [review]
Don't dequeue an output frame until we are ready to display it

Review of attachment 8522425 [details] [diff] [review]:
-----------------------------------------------------------------

This seems like a very bad idea. We have logic inside the code that calls these MediaDataDecoders that manages A/V sync, and it's probably going to end up dropping a lot of frames if we mess with this.

Can you copy the video frame to another surface and create a VideoData on that using VideoData::CreateFromImage instead?
Attachment #8522425 - Flags: review?(cpearce) → review-
(In reply to Chris Pearce (:cpearce) from comment #5)
> Comment on attachment 8522425 [details] [diff] [review]
> Don't dequeue an output frame until we are ready to display it
> 
> Review of attachment 8522425 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This seems like a very bad idea. We have logic inside the code that calls
> these MediaDataDecoders that manages A/V sync, and it's probably going to
> end up dropping a lot of frames if we mess with this.
> 
> Can you copy the video frame to another surface and create a VideoData on
> that using VideoData::CreateFromImage instead?

We can, but it's very slow. It uses glReadPixels(), which in my experience is never faster than 100ms. That's not going to cut it.

You bring up an interesting point wrt dropping frames. With this output method (SurfaceTexture), we would need some additional (horrific) machinery to drop frames. The only way to get SurfaceTexture to let go of a produced frame is to call updateTexImage() on it, which is what we do when we composite it. Dropping the VideoData() is fruitless, because the created layer already has that same SurfaceTexture, and will composite the next frame regardless.

I'm not sure what else we can do here. We could go back to not using SurfaceTexture, but that opens us back up to all the crazy YUV conversion mess. I think we're going to have to live with a less-than-perfect implementation either way.
Now that I think about it some more, it may be possible to do a GPU copy of the SurfaceTexture into a EGLImage. I could see that maybe being good enough. I'll give it a shot.
This does a GPU copy of the SurfaceTexture after output has been put into it.

I like this *much* better than the previous patch. The copy does have some performance impact, however. On a couple devices I tested on here, the copy takes between 10 and 20ms. That's not great, but good enough to play non-60fps videos.
Attachment #8522425 - Attachment is obsolete: true
Attachment #8523075 - Flags: review?(jgilbert)
Attachment #8523075 - Flags: review?(cpearce)
Comment on attachment 8523075 [details] [diff] [review]
Copy the decoded SurfaceTexture into an EGLImage to allow accurate presentation

Review of attachment 8523075 [details] [diff] [review]:
-----------------------------------------------------------------

I don't know any variant of OpenGL, so I'll let jgilbert review this, and say f+ to the principal of doing this copy. Our media stack is designed to require several frames decoded in advance of the playback position, so copying is the best we can do without a lot more work...
Attachment #8523075 - Flags: review?(cpearce) → feedback+
Comment on attachment 8523072 [details] [diff] [review]
Add fencing and better lifetime management for EGLImage Images

Review of attachment 8523072 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/GLImages.cpp
@@ +17,5 @@
>  namespace layers {
>  
>  static nsRefPtr<GLContext> sSnapshotContext;
>  
> +EGLImageImage::~EGLImageImage() {

{ on next line for function definitions in the base scope.
Attachment #8523072 - Flags: review?(jgilbert) → review+
Comment on attachment 8523075 [details] [diff] [review]
Copy the decoded SurfaceTexture into an EGLImage to allow accurate presentation

Review of attachment 8523075 [details] [diff] [review]:
-----------------------------------------------------------------

Memory leak, so another round of review.

::: dom/media/fmp4/android/AndroidDecoderModule.cpp
@@ +8,5 @@
> +#include "GLContext.h"
> +#include "GLBlitHelper.h"
> +#include "GLLibraryEGL.h"
> +#include "GLContextEGL.h"
> +#include "GLContextProvider.h"

Try to keep the includes alphabetical. (Except the parent header file, which is at the top)

@@ +76,5 @@
> +      return nullptr;
> +    }
> +
> +    nsRefPtr<layers::Image> img = mImageContainer->CreateImage(ImageFormat::SURFACE_TEXTURE);
> +    layers::SurfaceTextureImage::Data data = { 0 };

This is really bad. Give ::Data a .Clear, or make it memset on init. Don't clear it like this.

Since you're setting most of its members, just initialize it directly:
data = {mSurfaceTexture.get(), size, true, ...}

@@ +102,5 @@
> +    EGLImage eglImage = sEGLLibrary.fCreateImage(EGL_DISPLAY(), static_cast<GLContextEGL*>(mGLContext.get())->GetEGLContext(),
> +                                                 LOCAL_EGL_GL_TEXTURE_2D_KHR,
> +                                                 (EGLClientBuffer)tex, attribs);
> +    if (!eglImage) {
> +      return nullptr;

You need to delete `tex` here.

@@ +123,5 @@
> +    EGLSync eglSync = nullptr;
> +    if (sEGLLibrary.IsExtensionSupported(GLLibraryEGL::KHR_fence_sync) &&
> +        mGLContext->IsExtensionSupported(GLContext::OES_EGL_sync))
> +    {
> +        eglSync = sEGLLibrary.fCreateSync(EGL_DISPLAY(),

Assert that mGLContext IsCurrent here. It's guaranteed based on the current code, but let's add more safety.

@@ +162,5 @@
> +      return true;
> +    }
> +
> +    SurfaceCaps caps = SurfaceCaps::ForRGBA();
> +    mGLContext = GLContextProvider::CreateOffscreen(gfxIntSize(16, 16), caps);

Use CreateHeadless.
Attachment #8523075 - Flags: review?(jgilbert) → review-
This changes the EGLImageTextureClientOGL to just hold a reference to the Image instead of doing the wonky ownership transfer. I think it's better and means the Image can still be used to do readback (which is coming in a later patch).
Attachment #8523072 - Attachment is obsolete: true
This adds support to GLBlitHelper for EGLImageImage, and also fixes up some of the 'yFlip' logic there.

One concern I have about this is whether or not it's legal to consume the EGLImage from two threads (and thus two separate contexts) at the same time. It seems like that would be bad.
Attachment #8524965 - Flags: review?(jgilbert)
https://hg.mozilla.org/mozilla-central/rev/74b269bec3f2
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 36
Should've set the dontclose flag or something
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attachment #8524962 - Flags: review?(jgilbert)
Comment on attachment 8524958 [details] [diff] [review]
Copy the decoded SurfaceTexture into an EGLImage to allow accurate presentation

Review of attachment 8524958 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/fmp4/android/AndroidDecoderModule.cpp
@@ +99,5 @@
> +      LOCAL_EGL_IMAGE_PRESERVED_KHR, LOCAL_EGL_TRUE,
> +      LOCAL_EGL_NONE, LOCAL_EGL_NONE
> +    };
> +
> +    EGLImage eglImage = sEGLLibrary.fCreateImage(EGL_DISPLAY(), static_cast<GLContextEGL*>(mGLContext.get())->GetEGLContext(),

Pull this long calculation of the EGLContext out into a local var.

@@ +118,5 @@
> +    }
> +
> +    EGLSync eglSync = nullptr;
> +    if (sEGLLibrary.IsExtensionSupported(GLLibraryEGL::KHR_fence_sync) &&
> +        mGLContext->IsExtensionSupported(GLContext::OES_EGL_sync))

Warn if we don't have these? If we don't have these, we could get rendering artifacts.
Attachment #8524958 - Flags: review?(jgilbert) → review+
Attachment #8524962 - Flags: review?(jgilbert) → review+
Comment on attachment 8524965 [details] [diff] [review]
Fix readback for EGLImageImage

Review of attachment 8524965 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/gl/GLBlitHelper.cpp
@@ +727,4 @@
>  {
>      AndroidSurfaceTexture* surfaceTexture = stImage->GetData()->mSurfTex;
> +    if (stImage->GetData()->mInverted) {
> +        yFlip = !yFlip;

You're asking for trouble with this, but I have a patch to fix it.

@@ +774,5 @@
> +        }
> +    }
> +
> +    ScopedBindTextureUnit boundTU(mGL, LOCAL_GL_TEXTURE0);
> +    mGL->fClear(LOCAL_GL_COLOR_BUFFER_BIT);

Why do you clear it? You don't set the clear color, so you're functionally going to get something random. Theoretically, the scissor test or color masks might cause this to do nothing.

It looks like maybe you're trying to get the effect of InvalidateFramebuffer here. If so, please comment saying so! (Also we should have a helper polyfill for InvalidateFramebuffer)

@@ +781,5 @@
> +    mGL->fGetIntegerv(LOCAL_GL_TEXTURE_BINDING_2D, &oldBinding);
> +
> +    BindAndUploadEGLImage(eglImage, LOCAL_GL_TEXTURE_2D);
> +
> +    mGL->fUniform1f(mYFlipLoc, yFlip ? (float)1.0f : (float)0.0f);

`N.Nf` literals are already floats.
Attachment #8524965 - Flags: review?(jgilbert) → review+
(In reply to Jeff Gilbert [:jgilbert] from comment #20)
> Comment on attachment 8524965 [details] [diff] [review]
> Fix readback for EGLImageImage
> 
> Review of attachment 8524965 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: gfx/gl/GLBlitHelper.cpp
> @@ +727,4 @@
> >  {
> >      AndroidSurfaceTexture* surfaceTexture = stImage->GetData()->mSurfTex;
> > +    if (stImage->GetData()->mInverted) {
> > +        yFlip = !yFlip;
> 
> You're asking for trouble with this, but I have a patch to fix it.
> 
> @@ +774,5 @@
> > +        }
> > +    }
> > +
> > +    ScopedBindTextureUnit boundTU(mGL, LOCAL_GL_TEXTURE0);
> > +    mGL->fClear(LOCAL_GL_COLOR_BUFFER_BIT);
> 
> Why do you clear it? You don't set the clear color, so you're functionally
> going to get something random. Theoretically, the scissor test or color
> masks might cause this to do nothing.

I think I copy/pasted that from the SurfaceTexture blitting function, which was copy/pasted from the gralloc one. It is probably not necessary for the first two, not sure about gralloc.

> 
> It looks like maybe you're trying to get the effect of InvalidateFramebuffer
> here. If so, please comment saying so! (Also we should have a helper
> polyfill for InvalidateFramebuffer)

Maybe? Do I need that for this?

> 
> @@ +781,5 @@
> > +    mGL->fGetIntegerv(LOCAL_GL_TEXTURE_BINDING_2D, &oldBinding);
> > +
> > +    BindAndUploadEGLImage(eglImage, LOCAL_GL_TEXTURE_2D);
> > +
> > +    mGL->fUniform1f(mYFlipLoc, yFlip ? (float)1.0f : (float)0.0f);
> 
> `N.Nf` literals are already floats.

Yup. Oops.
Backed out for build failures on desktop platforms
I've fixed the build failures, but have now noticed very bad performance on Mediatek devices (which use PowerVR). There is a thread here[0] that talks about the first render to a texture being very slow. That's terrible for us, because this patch only ever renders into a texture once. Ugh.

[0] http://forum.imgtec.com/discussion/3201/first-time-drawing-into-texture-is-slow
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #24)
> I've fixed the build failures, but have now noticed very bad performance on
> Mediatek devices (which use PowerVR). There is a thread here[0] that talks
> about the first render to a texture being very slow. That's terrible for us,
> because this patch only ever renders into a texture once. Ugh.
> 
> [0]
> http://forum.imgtec.com/discussion/3201/first-time-drawing-into-texture-is-
> slow

Ouch. Maybe that's why the decoder goes to such lengths to reuse mis-sized textures. We might have to do something similar. I'll drop this info in the GFX meeting to try to infect others with this knowledge.
Attachment #8528483 - Flags: review?(cpearce) → review+
Can I pick up this bug for QA? If so, can someone update QA test-steps for the same.

Thanks,
Sudhir
(In reply to Sudhir from comment #31)
> Can I pick up this bug for QA? If so, can someone update QA test-steps for
> the same.
> 
> Thanks,
> Sudhir

This will require testing on multiple devices using the test-case URL. I'm already looking at this.
Attempting to play http://people.mozilla.org/~mwargers/tests/videos/lime10s_red10s.mp4 in my Nexus 5 (5.0) does nothing still; Nightly (12/01). The video is unable to be played (starts and stops immediately).

Playing http://people.mozilla.org/~atrain/mobile/tests/wat.mp4 doesn't display the end-frame on first play. On second and further play attempts it does.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I can confirm this looks broken again (still?) on Nightly. It definitely worked locally, so I wonder if something regressed it. (TESTS!)
Welp
Flags: in-testsuite?
Maybe the patch in bug 1106547 will fix it? I'd bet the change to make MP4Reader properly async would have regressed this (bug 1032633).
(In reply to Chris Pearce (:cpearce) from comment #36)
> Maybe the patch in bug 1106547 will fix it? I'd bet the change to make
> MP4Reader properly async would have regressed this (bug 1032633).

That looks very likely. I'm building with the patch now to test.
The patch on 1106547 does fix it.
Depends on: 1032633
See Also: → 1106547
Working again now that bug 1106547 landed.
Status: REOPENED → RESOLVED
Closed: 5 years ago5 years ago
Resolution: --- → FIXED
Requesting backport of the fixup landed in comment #30 to keep media code in sync for MSE.

Approval Request Comment
[Feature/regressing bug #]: 109711
[User impact if declined]: Harder to backport MSE patches.
[Describe test coverage new/current, TBPL]: Landed on m-c.
[Risks and why]: Minimal. Just removes a debug message.
[String/UUID change made/needed]: none.
Attachment #8540445 - Flags: approval-mozilla-aurora?
Attachment #8540445 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Status: RESOLVED → VERIFIED
The video starts to play, but only the 1st frame is displayed.
Tested on Firefox 36 Beta 1 using Samsung Galaxy S4 (Andorid.4.4.2).
Is this expected considering that on Firefox 35 none of the frames are displayed?
This targeted 36, use http://people.mozilla.org/~atrain/mobile/tests/wat.mp4 where you can tell what the end-frame is.
Verified as fixed in Firefox for Android 36 Beta 2;
Devices: 
Asus Transformer Pad TF300T (Android 4.2.1);
Samsung Galaxy R (Android 2.3.4).
You need to log in before you can comment on or make changes to this bug.