Closed Bug 1062596 Opened 6 years ago Closed 6 years ago

Enable hardware acceleration in AppleVTDecoder

Categories

(Core :: Audio/Video, defect)

x86_64
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla35
Tracking Status
firefox32 --- unaffected
firefox33 --- unaffected
firefox34 --- fixed
firefox35 --- fixed

People

(Reporter: rillian, Assigned: jya)

References

()

Details

Attachments

(3 files, 2 obsolete files)

In Bug 1055694 we disabled hardware accelerated video decoding, or at least stopped asking for it explicitly, because of crashes inside VTDecompressionSessionCreate() called from AppleVTDecoder::InitializeSession() when multiple decoders were initialized in quick succession.

The problem goes away if we don't add kVTVideoDecoderSpecification_EnableHardwareAcceleratedVideoDecoder to the decoder selection dictionary, and also if we disable jemalloc. The problem only happens on MacOS X 10.9.4 and later.

This bug is about figuring out the issue so we can enable it again.
See bug 1055694 for stack traces. Visiting https://www.jottacloud.com/ reliably triggers the bug. This page loads 19 <video> elements.

This is not high priority. Hardware acceleration isn't crucial on desktop, and the work-around is fine. It would be nice to understand the jemalloc interaction, at least.
FWIW, I've had the crash occurring with just one video, and during the initialisation of the first video decoder.

So I now don't believe that it's related to how many decoders have been initialised, just that having many multiply the chances for the crash to occur ; but not a cause per say
See Also: → 1062654
Just discovered something interesting about the crashes at bug 1055694, at least on OS X 10.9.4.

There all the crashes happen in AppleVA::VP31IsAvailable(), or in bzero() called from it.  I looked at VP31IsAvailable() in a disassembler (Hopper Disassembler, http://www.hopperapp.com/) and discovered that this method sets aside a *huge* amount of space on the stack for local variables:

_VP3IsAvailable:
0000000000041b28 55                              push       rbp
0000000000041b29 4889E5                          mov        rbp, rsp
0000000000041b2c 4156                            push       r14
0000000000041b2e 53                              push       rbx
0000000000041b2f 4881EC802B0300                  sub        rsp, 0x32b80
...

And that seems to be the problem.  The crashes happen either here, where the code places a stack check value at the "end" of the stack to make sure the method doesn't overflow it:

0000000000041b36 4C8B35CBC40000                  mov        r14, qword [ds:imp___got____stack_chk_guard]
0000000000041b3d 498B06                          mov        rax, qword [ds:r14]
0000000000041b40 488945E8                        mov        qword [ss:rbp-0x32b90+var_207736], rax

Or here, in the call to bzero():

0000000000041b70 488D9D78D4FCFF                  lea        rbx, qword [ss:rbp-0x32b90+var_8]
0000000000041b77 4889DF                          mov        rdi, rbx
0000000000041b7a BE702B0300                      mov        esi, 0x32b70
0000000000041b7f E8224D0000                      call       imp___stubs____bzero

In other words, the bug is that the VP31IsAvailable() is called with too little space on the stack, when we're using jemalloc.
> AppleVA::VP31IsAvailable()

Oops, this is really AppleVA::VP3IsAvailable().
*Very* strangely, AppleVA::VP3IsAvailable() isn't called when you don't use jemalloc!  I currently have no idea why.
(Following up comment #5)

Actually, it seems that bug 1055964's crashes are no (suddenly) longer reproducible with the jottacloud site.
(In reply to Steven Michaud from comment #6)
> (Following up comment #5)
> 
> Actually, it seems that bug 1055964's crashes are no (suddenly) longer
> reproducible with the jottacloud site.

Are you sure you mean 1055964 ?
> Are you sure you mean 1055964 ?

No, I meant bug 1055694.  Sorry.
(In reply to Steven Michaud from comment #8)
> > Are you sure you mean 1055964 ?
> 
> No, I meant bug 1055694.  Sorry.

Have you re-enabled the HW acceleration though?
And now I see the crashes again, testing with the 2014-09-03 m-c nightly (previously I was testing with the 2014-08-29 m-c nightly, which crashed previously).

I won't comment again until I've figured this out :-(
You may want to try with the patches from bug 1059066. Re-enable hardware acceleration in VTDecompressionSessionCreate, and see how you go.

For me the issue occurs much more regularly with those (though I've also changed machine in the past few days, which could also explain things)
I'm giving up for today, I'm much too tired after all that weirdness.

What I say at comment #3 still stands, though.
If it's a stack size issue.
Try bumping the value defined:
http://mxr.mozilla.org/mozilla-central/source/content/media/VideoUtils.h#173

to 256kiB
When I see the crashes there are *lots* of (about 10) calls to VP3IsAvailable() pending at the same time (on different threads), each consuming 0x32b80 (== 207744) bytes of stack for local variables.

And for the time being at least, I'm seeing these crashes (with the 2014-09-03 m-c nightly) with and without jemalloc.
(In reply to comment #13)

I'll try that tomorrow.
Well, had about 30 go.. Can't get a crash after bumping the stack size from 128kB to 256kB
Bump media thread stack size on OSX
Attachment #8484707 - Flags: review?(kinetik)
Assignee: nobody → jyavenard
Status: NEW → ASSIGNED
Re-enable VT hardware acceleration
Attachment #8484708 - Flags: review?(giles)
Attachment #8484707 - Flags: review?(kinetik) → review+
Blocks: 1059066
Attachment #8484708 - Flags: review?(giles) → review+
I want to change the way we set the flags. I don't think using a string is the proper way to do it, and instead should use the official CFStringRef entry.

In the next patch
Attached patch Use official 10.9 constant (obsolete) — Splinter Review
Use 10.9 proper constant
Attachment #8484813 - Flags: review?(giles)
Attached patch Use official 10.9 constant (obsolete) — Splinter Review
Had stray changes to VideoUtils.h
Attachment #8484816 - Flags: review?(giles)
Attachment #8484813 - Attachment is obsolete: true
Attachment #8484813 - Flags: review?(giles)
Comment on attachment 8484816 [details] [diff] [review]
Use official 10.9 constant

Review of attachment 8484816 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with comments addressed.

Just to be clear, this shouldn't be necessary, right? If the dictionary key is accessed/compared by value hash, rather than address, it shouldn't matter whether we use our own CFStringRef (like gstreamer) or the Framework constant. I still like this better in that it's less error prone. I was paranoid I'd make a silent typo of the string value every time I rewrote the code. We can still do that with the constant name, but it's easier to cut/paste from the Header.

Please include a brief summary of _why_ the change is being made in the body of your commit message.

::: content/media/fmp4/apple/AppleUtils.h
@@ +68,3 @@
>  private:
> +  // Copy operator isn't supported and is not implemented.
> +  AutoCFRelease<T>& operator=(const AutoCFRelease<T>&);

Do you need this? The ctor should handle initializtion at allocation time, and I don't see you assigning to a reference anywhere.

::: content/media/fmp4/apple/AppleVTDecoder.h
@@ +55,5 @@
>    nsresult WaitForAsynchronousFrames();
>    void DrainReorderedFrames();
>    void ClearReorderedFrames();
> +
> +  CFDictionaryRef GetDecoderSpecification();

This should be CreateDecoderSpecification() to match CF memory ownership naming rules.

::: content/media/fmp4/apple/AppleVTLinker.cpp
@@ +61,5 @@
>  
> +  // Will only resolve in 10.9 and later.
> +  skPropHWAccel =
> +    GetIOConst("kVTVideoDecoderSpecification_EnableHardwareAcceleratedVideoDecoder");
> +

Please null skPropHWAccel again in Unlink() since the value will no longer be valid after that.
Attachment #8484816 - Flags: review?(giles) → review+
(In reply to Ralph Giles (:rillian) from comment #22)
> Comment on attachment 8484816 [details] [diff] [review]
> Use official 10.9 constant
> 
> Review of attachment 8484816 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r=me with comments addressed.
> 
> Just to be clear, this shouldn't be necessary, right? If the dictionary key
> is accessed/compared by value hash, rather than address, it shouldn't matter
> whether we use our own CFStringRef (like gstreamer) or the Framework
> constant. I still like this better in that it's less error prone. I was
> paranoid I'd make a silent typo of the string value every time I rewrote the
> code. We can still do that with the constant name, but it's easier to
> cut/paste from the Header.

That's right, it's not absolutely necessary. But I, like you was paranoid about using the proper string. Also, I had different crashes occurring when I was using the proper symbol (turned out to be for something else, see below)

If you have a better idea on how we could keep using the official constant name, without having to rely on a getter.

> 
> Please include a brief summary of _why_ the change is being made in the body
> of your commit message.
> 
> ::: content/media/fmp4/apple/AppleUtils.h
> @@ +68,3 @@
> >  private:
> > +  // Copy operator isn't supported and is not implemented.
> > +  AutoCFRelease<T>& operator=(const AutoCFRelease<T>&);
> 
> Do you need this? The ctor should handle initializtion at allocation time,
> and I don't see you assigning to a reference anywhere.

Well, this was actually the reason for the crash I had. I hadn't looked on how AutoCFRelease was implemented and my first code was something like:
AutoCFRelease<T> spec = null
if (GetHWConst()) {
  spec = CFDictionaryCreat
}

this actually made CFRelease to be called twice on spec due to how the compiler used the default constructor:
spec = AutoCFRelease<T>(CFDictionaryCreate(...))
e.g.
AutoCFRelease<T> spec(AutoCFRelease<T>(CFDictionaryCreate(...)))

Then I also read this :)
https://developer.mozilla.org/en-US/docs/Mozilla/Cpp_portability_guide#Always_declare_a_copy_constructor_and_assignment_operator


> This should be CreateDecoderSpecification() to match CF memory ownership
> naming rules.
will do.

> 
> ::: content/media/fmp4/apple/AppleVTLinker.cpp
> @@ +61,5 @@
> >  
> > +  // Will only resolve in 10.9 and later.
> > +  skPropHWAccel =
> > +    GetIOConst("kVTVideoDecoderSpecification_EnableHardwareAcceleratedVideoDecoder");
> > +
> 
> Please null skPropHWAccel again in Unlink() since the value will no longer
> be valid after that.

ok
Carrying r+
Attachment #8484816 - Attachment is obsolete: true
Blocks: 1062654
See Also: 1062654
Blocks: 1070703
Comment on attachment 8484707 [details] [diff] [review]
Bump thread stack size on OS X to avoid crashes

Approval Request Comment
[Feature/regressing bug #]: 941296
[User impact if declined]: MP4/H264 decoding can't be enabled in 10.6 and 10.7 (bug 1070703)
[Describe test coverage new/current, TBPL]: This is a backport of a change already in m-c. Passes all mochitests
[Risks and why]: We believe risk is low. We are actually considering bumping the stack size on all platforms and not just mac. It may causes a slight increase in memory usage, though this hasn't been confirmed.
[String/UUID change made/needed]: none
Attachment #8484707 - Flags: approval-mozilla-aurora?
Comment on attachment 8484707 [details] [diff] [review]
Bump thread stack size on OS X to avoid crashes

Aurora+
Attachment #8484707 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.