Closed Bug 1353313 Opened 7 years ago Closed 7 years ago

Intermittent PROCESS-CRASH | dom/media/test/test_streams_element_capture_createObjectURL.html | application crashed [@ mozilla::RemoteDataDecoder::Error]

Categories

(Core :: Audio/Video: Playback, defect, P1)

Unspecified
Android
defect

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox-esr45 --- unaffected
firefox-esr52 --- unaffected
firefox53 --- unaffected
firefox54 + fixed
firefox55 + fixed

People

(Reporter: cbook, Assigned: mchiang)

References

()

Details

(4 keywords)

Attachments

(2 files, 6 obsolete files)

https://treeherder.mozilla.org/logviewer.html#?job_id=88432119&repo=mozilla-central

filing as sec-bug just in case for the Crash address: 0xe5e5e65d adress 

PROCESS-CRASH | dom/media/test/test_streams_element_capture_createObjectURL.html | application crashed [@ mozilla::RemoteDataDecoder::Error]
Crash dump filename: /tmp/tmpHR_n_R/15e693df-9e1c-42af-04f1-b363e8aa2529.dmp
Operating system: Android
                  0.0.0 Linux 3.10.40-gc16a3c6 #1 SMP PREEMPT Tue Jul 28 17:58:25 UTC 2015 armv7l
CPU: arm
     ARMv1 Qualcomm Krait features: swp,half,thumb,fastmult,vfpv2,edsp,neon,vfpv3,tls,vfpv4,idiva,idivt
     4 CPUs

GPU: UNKNOWN

Crash reason:  SIGSEGV
Crash address: 0xe5e5e65d
Process uptime: not available

Thread 21 (crashed)
 0  libxul.so!mozilla::RemoteDataDecoder::Error [RefPtr.h:b5d8b27a7537 : 284 + 0x0]
     r0 = 0xe5e5e5e5    r1 = 0x949df868    r2 = 0x00000009    r3 = 0x9e3c5ea8
     r4 = 0x83cfcb10    r5 = 0xe5e5e5e5    r6 = 0x9cdc34ad    r7 = 0x00000001
     r8 = 0x75c99628    r9 = 0xb4a20000   r10 = 0x00000000   r12 = 0x80000000
     fp = 0x00000001    sp = 0x949df840    lr = 0x9cdc189d    pc = 0x9cdc32a6
    Found by: given as instruction pointer in context
 1  libxul.so!mozilla::jni::NativeStub<mozilla::java::CodecProxy::NativeCallbacks::OnError_t, mozilla::JavaCallbacksSupport, mozilla::jni::Args<bool> >::Wrap<&mozilla::JavaCallbacksSupport::OnError> [JavaCallbacksSupport.h:b5d8b27a7537 : 61 + 0x5]
     r4 = 0x83cfcb10    r5 = 0x949df868    r6 = 0x9cdc34ad    r7 = 0x00000001
     r8 = 0x75c99628    r9 = 0xb4a20000   r10 = 0x00000000    fp = 0x00000001
     sp = 0x949df868    pc = 0x9cdc189d
    Found by: call frame info
 2  data@app@org.mozilla.fennec-1@base.apk@classes.dex + 0xd78063
     r4 = 0x00000055    r5 = 0x13139720    r6 = 0x130602e0    r7 = 0x00000001
     r8 = 0x75c99628    r9 = 0xb4a20000   r10 = 0x00000000    fp = 0x00000001
     sp = 0x949df890    pc = 0xa2bc7065
    Found by: call frame info
 3  dalvik-non moving space (deleted) + 0x154e
     sp = 0x949df894    pc = 0x75bf4550
    Found by: stack scanning
 4  dalvik-main space (deleted) + 0x53971e
     sp = 0x949df8a0    pc = 0x13139720
    Found by: stack scanning
 5  dalvik-main space (deleted) + 0x53971e
     sp = 0x949df8a8    pc = 0x13139720
    Found by: stack scanning
 6  dalvik-main space (deleted) + 0x4602de
     sp = 0x949df8ac    pc = 0x130602e0
    Found by: stack scanning
 7  dalvik-non moving space (deleted) + 0xa6626
     sp = 0x949df8b4    pc = 0x75c99628
    Found by: stack scanning
 8  data@app@org.mozilla.fennec-1@base.apk@classes.dex + 0xee6fc7
     sp = 0x949df8c0    pc = 0xa2d35fc9
    Found by: stack scanning
 9  dalvik-non moving space (deleted) + 0xa6626
     sp = 0x949df8c4    pc = 0x75c99628
    Found by: stack scanning
10  dalvik-main space (deleted) + 0x53971e
     sp = 0x949df8c8    pc = 0x13139720
    Found by: stack scanning
11  dalvik-main space (deleted) + 0x5d3fe
     sp = 0x949df8d0    pc = 0x12c5d400
    Found by: stack scanning
12  system@framework@boot.art + 0x1e63c6
     sp = 0x949df8d4    pc = 0x70d013c8
    Found by: stack scanning
13  dalvik-main space (deleted) + 0x4602de
     sp = 0x949df8e0    pc = 0x130602e0
    Found by: stack scanning
14  dalvik-non moving space (deleted) + 0xa6696
     sp = 0x949df8e8    pc = 0x75c99698
    Found by: stack scanning
15  dalvik-main space (deleted) + 0x4602de
     sp = 0x949df8ec    pc = 0x130602e0
    Found by: stack scanning
16  data@app@org.mozilla.fennec-1@base.apk@classes.dex + 0xee702b
     sp = 0x949df8f0    pc = 0xa2d3602d
    Found by: stack scanning
17  dalvik-non moving space (deleted) + 0xa6696
     sp = 0x949df8f4    pc = 0x75c99698
    Found by: stack scanning
18  dalvik-zygote space (deleted) + 0x160f3ce
     sp = 0x949df8f8    pc = 0x75a883d0
    Found by: stack scanning
19  dalvik-non moving space (deleted) + 0xa681e
     sp = 0x949df8fc    pc = 0x75c99820
    Found by: stack scanning
20  dalvik-non moving space (deleted) + 0xa681e
     sp = 0x949df900    pc = 0x75c99820
    Found by: stack scanning
21  dalvik-non moving space (deleted) + 0xa681e
     sp = 0x949df904    pc = 0x75c99820
    Found by: stack scanning
22  dalvik-main space (deleted) + 0x5d3fe
     sp = 0x949df90c    pc = 0x12c5d400
    Found by: stack scanning
23  data@app@org.mozilla.fennec-1@base.apk@classes.dex + 0xee6d89
     sp = 0x949df910    pc = 0xa2d35d8b
Attached file more complete stack
dom/media/test/ is playback, not webrtc
gcp: any ideas what's going on here?  there's very little usable stack to work off of
Component: WebRTC: Audio/Video → Audio/Video: Playback
Flags: needinfo?(gpascutto)
OS: Unspecified → Android
John, this is your area of expertise
Flags: needinfo?(jolin)
Look for an owner for:

mozilla::java::CodecProxy
mozilla::RemoteDataDecoder

because that's all we have to go on.
Flags: needinfo?(gpascutto)
Munro?  You've been touching RemoteDataDecoder.cpp and CodecProxy
Flags: needinfo?(mchiang)
Group: core-security → media-core-security
Assignee: nobody → mchiang
Flags: needinfo?(mchiang)
Priority: -- → P1
Assignee: mchiang → nobody
Assignee: nobody → mchiang
See Also: → 1354435
Comment on attachment 8856790 [details] [diff] [review]
Bug1353313-Use-RefPtr-to-hold-RemoteVideoDecoder.patch

Review of attachment 8856790 [details] [diff] [review]:
-----------------------------------------------------------------

Using RefPtr here doesn't seem helpful to me. This extra ref count will be decreased when |mJavaCallbacks| goes away in |ProcessShutdown()| and won't extend the life of |RemoteDataDecoder|.

It looks like the issue is a race condition like the following:
- OnError() checks |mCancel|
- ProcessShutdown() preempts and resolves promise
- RemoteDataDecoder is destructed
- OnError() resumes
- UAF!
Attachment #8856790 - Flags: review?(jolin) → review-
Comment on attachment 8856791 [details] [diff] [review]
Bug1353313-Add-synchronization-mechanism-to-JavaCallbacksSupport.patch

Review of attachment 8856791 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/media/platforms/android/JavaCallbacksSupport.h
@@ +19,5 @@
>    using Base::AttachNative;
>    using Base::DisposeNative;
>  
> +  JavaCallbacksSupport() :
> +    mCanceled(false),

Coding style: use ':' and ',' to start a new line for initialization list.

@@ +77,5 @@
>      mCanceled = true;
>    }
>  
>  private:
>    Atomic<bool> mCanceled;

With |mMutex|, |Atomic<>| is no longer needed.
Attachment #8856791 - Flags: review?(jolin) → review+
Flags: needinfo?(jolin)
Comment on attachment 8856791 [details] [diff] [review]
Bug1353313-Add-synchronization-mechanism-to-JavaCallbacksSupport.patch

Review of attachment 8856791 [details] [diff] [review]:
-----------------------------------------------------------------

There's a call to "Cancel" to ensure that the JavaCallBackSupport object is no longer called by the JNI.

Adding a mutex to ensure that the OnBlah function may help as it adds some delay, but it doesn't fundamentally resolve the problem that the JNI may call the callback object once it's been destructed.

There needs to be another mechanism to ensure that the callback object isn't used *at all*. Not that the functions of the callback do nothing

::: dom/media/platforms/android/JavaCallbacksSupport.h
@@ +25,4 @@
>  
> +  virtual ~JavaCallbacksSupport() {
> +    MutexAutoLock lock(mMutex);
> +    mCanceled = true;

this is either unnecessary or very wrong. After the destructor has been called, this is freed. and access to those members will lead to a UAF.

If you rely on the destructor holding on a mutex to make sure the remaining of the code access this safely that can't be, and this is an undefined behaviour.
Attachment #8856791 - Flags: review+
(In reply to Jean-Yves Avenard [:jya] from comment #12)
> Comment on attachment 8856791 [details] [diff] [review]
> Bug1353313-Add-synchronization-mechanism-to-JavaCallbacksSupport.patch
> 
> Review of attachment 8856791 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> There's a call to "Cancel" to ensure that the JavaCallBackSupport object is
> no longer called by the JNI.
> 
> Adding a mutex to ensure that the OnBlah function may help as it adds some
> delay, but it doesn't fundamentally resolve the problem that the JNI may
> call the callback object once it's been destructed.
> 
> There needs to be another mechanism to ensure that the callback object isn't
> used *at all*. Not that the functions of the callback do nothing
> 
> ::: dom/media/platforms/android/JavaCallbacksSupport.h
> @@ +25,4 @@
> >  
> > +  virtual ~JavaCallbacksSupport() {
> > +    MutexAutoLock lock(mMutex);
> > +    mCanceled = true;
> 
> this is either unnecessary or very wrong. After the destructor has been
> called, this is freed. and access to those members will lead to a UAF.
> 
> If you rely on the destructor holding on a mutex to make sure the remaining
> of the code access this safely that can't be, and this is an undefined
> behaviour.

For the onBlah function call after CallBackSupport dtor, this would cause a UAF indeed. I will find out another way to ensure the callback object isn't used at all.

However, for the onBlah function call before dtor, I think this mutex is still needed to hold the CallBackSupport object before the onBlah function finishes.

Do you agree?
There are two issues to solve:

1- Ensuring that the callback object isn't used after it's been destructed
2- Ensuring that all functions currently in use (in another thread) in the callback object have completed.

The mutex change allows for 2. And calling Cancel() is sufficient.

But 1 needs to be resolved too
Comment on attachment 8858185 [details] [diff] [review]
Bug1353313-Add-synchronization-mechanism-to-CodecProxy.patch

Review of attachment 8858185 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/geckoview/src/main/java/org/mozilla/gecko/media/CodecProxy.java
@@ +101,5 @@
>              reportError(fatal);
>          }
>  
> +        private synchronized void reportError(boolean fatal) {
> +            if (!mCodecProxyReleased) {

shouldn't the same test be done for all callbacks?
(In reply to Jean-Yves Avenard [:jya] from comment #16)
> Comment on attachment 8858185 [details] [diff] [review]
> Bug1353313-Add-synchronization-mechanism-to-CodecProxy.patch
> 
> Review of attachment 8858185 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> :::
> mobile/android/geckoview/src/main/java/org/mozilla/gecko/media/CodecProxy.
> java
> @@ +101,5 @@
> > reportError(fatal);
> > }
> > 
> > + private synchronized void reportError(boolean fatal) {
> > + if (!mCodecProxyReleased) {
> 
> shouldn't the same test be done for all callbacks?

There are quite a few regression bugs if we add the test to all callbacks:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a43dd2aafcbc66380f8ba8bbe93e5bfca9ac8c88
https://treeherder.mozilla.org/#/jobs?repo=try&revision=55cb1e7df136fc0ad91b2beeec74eb332fb9e2a3

I will address these in a following bug.
Attachment #8858252 - Flags: review?(jolin) → review+
Comment on attachment 8859416 [details] [diff] [review]
Bug1353313-Add-synchronization-mechanism-to-CodecProxy.patch

Review of attachment 8859416 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/geckoview/src/main/java/org/mozilla/gecko/media/CodecProxy.java
@@ +85,4 @@
>          }
>  
>          @Override
> +        public synchronized void onOutput(Sample sample) throws RemoteException {

Please dispose sample if proxy is dead.
Attachment #8859416 - Flags: review?(jolin) → review+
Comment on attachment 8859470 [details] [diff] [review]
Bug1353313-Add-synchronization-mechanism-to-CodecProxy.patch

[Security approval request comment]
How easily could an exploit be constructed based on the patch?
Pretty hard

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
No, just as the change we made, we made some function synchronized.

Which older supported branches are affected by this flaw?
back to 54

If not all supported branches, which bug introduced the flaw?
bug 1257777

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
Will backport to beta

How likely is this patch to cause regressions; how much testing does it need?
Low risk to cause regression. We just solve a UAF bug.
Attachment #8859470 - Flags: sec-approval?
sec-approval+. Please request approval on patches for backporting as well.
Attachment #8859470 - Flags: sec-approval? → sec-approval+
https://hg.mozilla.org/mozilla-central/rev/ab2aad34a967
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Comment on attachment 8859470 [details] [diff] [review]
Bug1353313-Add-synchronization-mechanism-to-CodecProxy.patch

Approval Request Comment
[Feature/Bug causing the regression]: bug 1257777
[User impact if declined]: intermittent crash
[Is this code covered by automated tests?]: yes
[Has the fix been verified in Nightly?]: yes
[Needs manual test from QE? If yes, steps to reproduce]: no
[List of other uplifts needed for the feature/fix]: none
[Is the change risky?]: no
[Why is the change risky/not risky?]: we just solved an UAF bug.
[String changes made/needed]: none
Attachment #8859470 - Flags: approval-mozilla-beta?
Comment on attachment 8859470 [details] [diff] [review]
Bug1353313-Add-synchronization-mechanism-to-CodecProxy.patch

Fix a sec-high. Beta54+. Should be in 54 beta 2.
Attachment #8859470 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Group: media-core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: