Intermittent GECKO(1834) | SUMMARY: AddressSanitizer: heap-use-after-free dom/media/gmp/ChromiumCDMParent.cpp:464:36 in mozilla::gmp::ChromiumCDMParent::RecvOnSessionKeysChange(nsCString const&, nsTArray<mozilla::gmp::CDMKeyInformation>&&)

RESOLVED FIXED

Status

()

defect
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: intermittent-bug-filer, Assigned: gerald)

Tracking

({csectype-uaf, intermittent-failure, sec-high})

unspecified
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox-esr45 unaffected, firefox-esr52 unaffected, firefox53 unaffected, firefox54 unaffected, firefox55 fixed)

Details

Attachments

(1 attachment)

Group: core-security → media-core-security
Can you look at this, Chris? Thanks.
Flags: needinfo?(cpearce)
Maybe you could look at this Gerald while Chris is away? Thanks.
Flags: needinfo?(cpearce) → needinfo?(gsquelart)
Looks like a message is received by ChromiumCDMParent after the document was closed, which destroyed MediaKeys, which had the last reference to that ChromiumCDMParent.

I'll see if I can safely fix it shortly -- Otherwise Chris will be back next week.

Also, I think bug 1357072 is related or the same. (I only saw its original summary in an email, the crash happens at the same line.)
Andrew, could you please give me access to it?
Flags: needinfo?(continuation)
Thank you Kartikaya for marking bug 1357072 as duplicate of this one.
Flags: needinfo?(continuation)
Assignee: nobody → gsquelart
Flags: needinfo?(gsquelart)
Posted patch 1355506.patchSplinter Review
"Bug 1355506 - Ensure the ChromiumCDMProxy stays alive until the parent has completed its Shutdown() - r?cpearce"

I believe the bad scenario was:
Main thread:
1 - MediaKeys::Shutdown()
2 . - ChromiumCDMProxy->Shutdown()
3 . . - Dispatch ChromiumCDMParent->Shutdown()
. . . . while keeping a RefPtr to ChromiumCDMParent only.
4 . - RefPtr<ChromiumCDMProxy> = nullptr
. . . Last RefPtr -> the proxy is destroyed now!
GMP thread:
5 - ChromiumCDMParent::RecvOnSessionKeysChange()
. . Just uses the stored ChromiumCDMProxy* **BOOM**

With this patch, it becomes:
1 - MediaKeys::Shutdown()
2 . - ChromiumCDMProxy->Shutdown()
3 . . - Dispatch ChromiumCDMParent->Shutdown()
. . . . while keeping RefPtrs to ChromiumCDMParent and ChromiumCDMProxy.
4 . - RefPtr<ChromiumCDMProxy> = nullptr
. . . Not the last RefPtr, so it stays alive.
GMP thread:
5 - ChromiumCDMParent::RecvOnSessionKeysChange()
. . Just uses the stored ChromiumCDMProxy*, and it's fine now.
6 - ChromiumCDMParent::Shutdown()
. . Resets the weak pointer to ChromiumCDMProxy, so we won't use it anymore.
. . After the task has run, the last RefPtr to ChromiumCDMProxy is destroyed.
7 - ChromiumCDMParent::ActorDestroy()
. . Weak pointer to ChromiumCDMProxy is already null, nothing bad happens


I'd really like to get cpearce to review this, as he's most familiar with all things CDM.
Please let me know if we cannot wait until 26 April.


As for testing: I've never been able to run ASAN locally, and I don't think I should run tests publicly on Try.
I've run some local tests with debugging printfs, and it seems to do the right things.

So Kartikaya, would you be able to apply this patch, and test it with ASAN?
Flags: needinfo?(bugmail)
Attachment #8859874 - Flags: review?(cpearce)
I think it's fine to do try pushes with the patch. Maybe strip the commit message or comments so it's not super obvious that's it's a sec bug. There's no way other than try pushes to makes sure this passes tests and doesn't break stuff. Alternatively, you could get a loaner from releng and test it there. You can probably even use the one-click loaner feature on an existing ASAN job which would make things easier.

FWIW I've also had mixed results with local ASAN builds so I'm not sure if any results I get locally would be representative of anything.
Flags: needinfo?(bugmail)
Yeah, just don't include the bug number or any comments and it shouldn't be too bad. If you want to be more paranoid, you could wait a week or two so we're farther out from release. This isn't going to be able to land for a while anyways.
Alright, I've just started a Try with a different bug number, a vague commit description, and no comments in the source:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=4c89606e0dc8f36eaf8116b4749bacfd91d17cac

Choosing the bug number lead me to find a few related intermittent bugs, which I believe this fix here would solve:
Bug 1354313, bug 1354425, bug 1354864, bug 1356026.
It looks like the same stack traces, but not running in ASAN, so it's less obvious that there is a UAF.

So we could easily attach the fix to one of these public bugs, review&land it quickly there (and get rid of annoying intermittents anyway), and this sec bug here will automagically get fixed in the process; we can mark it solved later, after the official release.
Is that acceptable?
Comment on attachment 8859874 [details] [diff] [review]
1355506.patch

Review of attachment 8859874 [details] [diff] [review]:
-----------------------------------------------------------------

That looks reasonable. Thanks for taking care of this Gerald!
Attachment #8859874 - Flags: review?(cpearce) → review+
Comment on attachment 8859874 [details] [diff] [review]
1355506.patch

Al, assuming sec-approval+, what do you think about my suggestion in comment 9, to sneak this fix into a more innocent-looking (but related) public bug?
It would help with removing at least 4 intermittent bugs quicker.


[Security approval request comment]
How easily could an exploit be constructed based on the patch?
I would hope "not easily", as we're calling a method on a freed object.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
Code comments explain the UAF scenario, but I would think a competent hacker would guess by looking at the changes.
I could remove these comments if preferred.

Which older supported branches are affected by this flaw?
None

If not all supported branches, which bug introduced the flaw?
N/A

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
N/A

How likely is this patch to cause regressions; how much testing does it need?
Very unlikely, as all it does is extend the lifetime of an object.
Attachment #8859874 - Flags: sec-approval?
And I'm told that since it's only affecting Nightly, we may not need to enforce the whole security process? Al?
(In reply to Gerald Squelart [:gerald] from comment #11)
> Comment on attachment 8859874 [details] [diff] [review]
> 1355506.patch
> 
> Al, assuming sec-approval+, what do you think about my suggestion in comment
> 9, to sneak this fix into a more innocent-looking (but related) public bug?
> It would help with removing at least 4 intermittent bugs quicker.

I think it is a great idea. If you can do this under a cover bug, I'm happy to give sec-approval+ right now.
(In reply to Gerald Squelart [:gerald] from comment #12)
> And I'm told that since it's only affecting Nightly, we may not need to
> enforce the whole security process? Al?

Ah, just saw this. That's correct. If it is nightly only, just check in a fix. I'll clear the flag.
Attachment #8859874 - Flags: sec-approval?
Keywords: checkin-needed
I believe this will also fix intermittent bugs that crash in RecvOnSessionKeysChange: Bug 1354313, bug 1354425, bug 1354864.
I'll close them after this one is resolved&fixed.
https://hg.mozilla.org/mozilla-central/rev/10b258d99791
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Group: media-core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.