Closed Bug 1060896 Opened 5 years ago Closed 5 years ago

Crash [@ mozilla::layers::ImageContainer::CreateImage(mozilla::ImageFormat) ]

Categories

(Core :: Audio/Video, defect, critical)

x86
All
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla35
Iteration:
34.1
Tracking Status
firefox32 --- unaffected
firefox33 --- unaffected
firefox34 + fixed
firefox35 --- fixed

People

(Reporter: gladjonatan, Assigned: theinternetftw)

References

(Blocks 1 open bug, )

Details

(Keywords: crash, crashreportid)

Crash Data

Attachments

(2 files, 1 obsolete file)

This bug was filed from the Socorro interface and is
00e318d8-d045-41da-816d-1f4402140828
=============================================================
#15 at top crashers.(1)

(1). https://crash-stats.mozilla.com/topcrasher/products/Firefox/versions/34.0a1?days=7
Crash Signature: [@ mozilla::layers::ImageContainer::CreateImage(mozilla::ImageFormat) ]
Iteration: --- → 34.1
Component: General → Untriaged
QA Whiteboard: [bugday-20140901]
Component: Untriaged → Graphics
Keywords: crash, crashreportid
OS: Windows 7 → All
Product: Firefox → Core
Severity: major → critical
This is the top 8th crash signature among sessions attempting D3D11 layers.
Blocks: 1061693
We are crashing here because we are calling CreateImage on a null ImageContainer. I have been reading through a lot of media code looking for why the ImageContainer here is null, but it's really not obvious. I expect a debug build would have crashed in WMFVideoMFTManager's constructor since it's the only place I have seen that checks (asserts) that the ImageContainer is not null. Then I got lost in more code I didn't know, but I stumbled on AbstractMediaDecoder::GetImageContainer that will return null in BufferDecoder's implementation (not sure if this code path may be taken in this particular configuration but I haven't seen anything explicitly proving that it doesn't).

At this point I suggest that someone who knows this code takes a look. Enabling OMTC has a big enough impact in the video pipeline that it most probably has made more frequent whatever scenario where a ImageContainer may be returned null somewhere (which may or not may not be a bug, but we tend to poorly document whwther things are expected to be non-null and forget to handle the null cases until they blow up). So it may be something we didn't have to take into account and have to now.
Component: Graphics → Video/Audio
Flags: needinfo?(cpearce)
Here's some more information on this bug (or at least the manifestation of it that I'm seeing).

I have a testcase here: http://tiftw.com/moz/mp4bug.html

It's a video mp4 in an audio element.  Things work fine when a video tag is used.  I've tried cases with AAC, MP3, and no audio track: the audio side doesn't seem to have anything to do with it.  The problem seems to happen with any mp4.

I bisected the error back to this patch[1] which does a lot of work moving to a new mp4 demuxer.  Note that before the patch for bug 1057879 you have to set media.fragmented-mp4.exposed to true in about:config to trigger the bug.

During the bisection I saw this error output:

I/SampleTable( 7992): There are reordered frames present
E/SampleIterator( 7992): findSampleTime return error
E/MPEG4Extractor( 7992): Unexpected sample table problem

[1] https://hg.mozilla.org/mozilla-central/rev/f75676ac4f7d
This is the #7 topcrash for Firefox 34.0a2 with 88/3985 crashes in the last 7 days. 

Ben, thank you for the test case!
Status: UNCONFIRMED → NEW
Ever confirmed: true
[Tracking Requested - why for this release]:

While https://crash-stats.mozilla.com/report/list?product=Firefox&signature=mozilla%3A%3Alayers%3A%3AImageContainer%3A%3ACreateImage%28mozilla%3A%3AImageFormat%29 says we have one crash with this signature in 33.0b8, that train is practically unaffected, but 34 and 35 are. As it's in the top 10 on Aurora, we might want to track there.
Scratch that bisection.  I looked at it again and discovered that I was unlucky enough have it crash on that page in a similar way but for another reason entirely once I got that far back (bug 1016150).

Here's the mozregression range after another take: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=e6614d8d85f9&tochange=7f81be7db528

I'm bisecting now in the background but I've already run into some in that range that are failing to build for me even after clobber, so the above may be as close as I get.
Further bisection failed.  The point at which the crash stopped appearing was point at which another bug (bug 1046549) kept the mp4 from playing in the first place.  If I get a wild hair about this I'll see if I can do a bisect that back-patches that change to let the mp4 play so the bug can trigger. (though of course who knows if this is a regression anyway, I may just end up where the feature was added again)

Want to make sure it's clear that resetting media.fragmented-mp4.exposed to false would fix the problem (or at least, it fixes my test case) for windows users by disabling the new demuxer.

Only fixes it for windows users, though.  I also get crashes on linux.
Sorry for 3 comments in a row, but I think I found the problem:

http://mxr.mozilla.org/mozilla-central/source/content/media/MediaDecoder.cpp#1432

I'm pretty sure this is where the nullptr is coming from.  mVideoFrameContainer is also null because it's retrieved at

http://mxr.mozilla.org/mozilla-central/source/content/media/fmp4/MP4Reader.cpp#402

...by calling the function at...

http://mxr.mozilla.org/mozilla-central/source/content/html/content/src/HTMLMediaElement.cpp#3288

...where we return nullptr if it's not a video element (remember, the testcase is an audio element w/ video mp4).

This all starts rolling at 

http://mxr.mozilla.org/mozilla-central/source/content/media/fmp4/MP4Reader.cpp#402

which creates an h264 decoder because HasVideo() only checks if the mp4 has a video track, not if it's a video element.  However, GetImageContainer() returns null thanks to the above video element check in GetVideoFrameContainer.  This passes the null container down the line...

http://mxr.mozilla.org/mozilla-central/source/content/media/fmp4/wmf/WMFDecoderModule.cpp#67
http://mxr.mozilla.org/mozilla-central/source/content/media/fmp4/wmf/WMFVideoMFTManager.cpp#35
http://mxr.mozilla.org/mozilla-central/source/content/media/fmp4/wmf/WMFVideoMFTManager.cpp#314
http://mxr.mozilla.org/mozilla-central/source/content/media/wmf/DXVA2Manager.cpp#139
http://mxr.mozilla.org/mozilla-central/source/content/media/wmf/DXVA2Manager.cpp#155

...until it crashes.
This sounds like an issue that we had with the older WMFReader (which is what you get if you set the media.fragmented-mp4.exposed pref to false again), and I fixed it by adding checks such as the following during the initialization of the video decoder:

http://dxr.mozilla.org/mozilla-central/source/content/media/wmf/WMFReader.cpp#333
Flags: needinfo?(cpearce)
Here's a fix.

I've only been able to test this on Windows.  I won't be able to test on Linux until I get a chance to spin up a Linux build environment.  It's hard for me to tell if the MP4Reader demuxer is being used there, especially since when I see the crash on my Linux VM it shows a bunch of gstreamer calls. I only see ffmpeg mentioned as a decoder in the MP4Reader code.

In any case, I thought I'd put this patch up so it can be discussed, even if I go back and add an additional one for linux.
Attachment #8501319 - Flags: review?(cpearce)
Comment on attachment 8501319 [details] [diff] [review]
bug1060896_avoidimgcontainernull.diff

Review of attachment 8501319 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the patch!

r=cpearce with the below change.

We don't use MP4Reader on Linux, we're still using the GStreamerReader backend there.

::: content/media/fmp4/MP4Reader.cpp
@@ +309,5 @@
>        MonitorAutoLock mon(mIndexMonitor);
>        mIndexReady = true;
>      }
>  
>      mInfo.mVideo.mHasVideo = mVideo.mActive = mDemuxer->HasValidVideo();

We can make this more compact:

mInfo.mVideo.mHasVideo = mVideo.mActive = mDemuxer->HasValidVideo() &&
                                          mDecoder->GetImageContainer();
Attachment #8501319 - Flags: review?(cpearce) → review+
Cool.  Here's the patch with those changes.

Thanks for the Linux info.  That crash clearly doesn't have anything to do with this then (though it may be due to the same assumptions). I'll look into it once I set up a decent Linux build environment.
Assignee: nobody → bcc
Attachment #8501319 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8501457 - Flags: review+
Keywords: checkin-needed
Hi Ben, do you have a try link (to make sure it builds and everything works) for this change? Thanks!
Keywords: checkin-needed
Oh, so sorry!  I was following the steps at codefirefox.com and completely misunderstood.  It sounded to me like going through try was part of the process that happens after adding the checkin-needed keyword (based on the video there).

I don't have try access, can someone run this through for me?
I landed your fix:
https://hg.mozilla.org/integration/mozilla-inbound/rev/02b1fce67dad

We'll need it on Aurora as well. Marking ni? myself so I don't forget to request...
Flags: needinfo?(cpearce)
https://hg.mozilla.org/mozilla-central/rev/02b1fce67dad
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
Comment on attachment 8501457 [details] [diff] [review]
patch2 - Check for ImageContainer null (w/ changes)

Approval Request Comment
[Feature/regressing bug #]: bug 1057879
[User impact if declined]: Opening MP4 in an <audio> tag will crash on Windows and MacOSX. Null pointer deref, so not exploitable.
[Describe test coverage new/current, TBPL]: We have extensive media tests
[Risks and why]: Low
[String/UUID change made/needed]: None
Attachment #8501457 - Flags: approval-mozilla-aurora?
Flags: needinfo?(cpearce)
Comment on attachment 8501457 [details] [diff] [review]
patch2 - Check for ImageContainer null (w/ changes)

>[Describe test coverage new/current, TBPL]: We have extensive media tests

Was this failure caught with the existing test suite? If not, can you add a new test to cover this case?

I'm approving for Aurora but would like a follow-up about the test.
Attachment #8501457 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Flags: needinfo?(cpearce)
Patch: Adds testcase that checks that loading a video inside an <audio> element can play. This test causes a crash in MP4Reader on Windows (and presumably MacOSX) if the earlier patch is not applied.
Attachment #8503805 - Flags: review?(jwwang)
Flags: needinfo?(cpearce)
Comment on attachment 8503805 [details] [diff] [review]
Patch: Add testcase

Review of attachment 8503805 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me.

::: content/media/test/test_video_in_audio_element.html
@@ +15,5 @@
> +   * Test for Bug 1060896; tests that loading a video inside an audio element works.
> +   **/
> +
> +  var manager = new MediaTestManager;
> +   

space

@@ +21,5 @@
> +    var a = event.target;
> +    removeNodeAndSource(a);
> +    manager.finished(a.token);
> +  }
> +   

space

@@ +27,5 @@
> +    var a = document.createElement('audio');
> +    a.token = token;
> +    manager.started(token);
> +    a.autoplay = true;
> +    

space

@@ +34,5 @@
> +    a.src = test.name;
> +  }
> +
> +  var videos = gSmallTests.filter(function(x){return /^video/.test(x.type);});
> +  

space
Attachment #8503805 - Flags: review?(jwwang) → review+
Landed test:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1fa7422822db

Sorry, I missed removing the spaces because the review-granted bugmail no longer includes the review comments so I didn't see them. Very annoying change!
Depends on: 1096750
You need to log in before you can comment on or make changes to this bug.