macOS Crash in [@ safeShift10BitBy6] with netflix.com
Categories
(Core :: Audio/Video: GMP, defect)
Tracking
()
People
(Reporter: aryx, Assigned: aosmond)
References
Details
(4 keywords, Whiteboard: [adv-main131+r][adv-esr128.3+r][adv-esr115.16+r])
Crash Data
Attachments
(2 files)
|
48 bytes,
text/x-phabricator-request
|
diannaS
:
approval-mozilla-beta+
diannaS
:
approval-mozilla-esr128+
tjr
:
sec-approval+
|
Details | Review |
|
48 bytes,
text/x-phabricator-request
|
phab-bot
:
approval-mozilla-esr115+
|
Details | Review |
[Tracking Requested - why for this release]:
First crash report from 2024-07-19, ~25 crashes/day on macOS over the last week. If an url is shared, it's a https://www.netflix.com page.
Crash report: https://crash-stats.mozilla.org/report/index/132d64f1-5c54-47dd-b623-153790240802
Reason: EXC_BAD_ACCESS / KERN_PROTECTION_FAILURE
Top 10 frames:
0 XUL safeShift10BitBy6(unsigned short const&) gfx/layers/MacIOSurfaceImage.cpp:58
0 XUL mozilla::layers::MacIOSurfaceImage::SetData(mozilla::layers::ImageContainer*,... gfx/layers/MacIOSurfaceImage.cpp:200
1 XUL mozilla::VideoData::CreateAndCopyData(mozilla::VideoInfo const&, mozilla::lay... dom/media/MediaData.cpp:360
2 XUL mozilla::gmp::ChromiumCDMParent::CreateVideoFrame(mozilla::gmp::CDMVideoFrame... dom/media/gmp/ChromiumCDMParent.cpp:1020
3 XUL mozilla::gmp::ChromiumCDMParent::RecvDecodedData(mozilla::gmp::CDMVideoFrame ... dom/media/gmp/ChromiumCDMParent.cpp:841
4 XUL mozilla::gmp::PChromiumCDMParent::OnMessageReceived(IPC::Message const&) ipc/ipdl/PChromiumCDMParent.cpp:1492
5 XUL mozilla::gmp::PGMPContentParent::OnMessageReceived(IPC::Message const&) ipc/ipdl/PGMPContentParent.cpp:408
6 XUL mozilla::ipc::MessageChannel::DispatchAsyncMessage(mozilla::ipc::ActorLifecyc... ipc/glue/MessageChannel.cpp:1820
6 XUL mozilla::ipc::MessageChannel::DispatchMessage(mozilla::ipc::ActorLifecyclePro... ipc/glue/MessageChannel.cpp:1739
6 XUL mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::ActorLifecycleProxy*, ... ipc/glue/MessageChannel.cpp:1530
Updated•1 year ago
|
Comment 1•1 year ago
|
||
:jimm could this be triaged and prioritized for investigation?
Comment 2•1 year ago
|
||
The bug is marked as tracked for firefox129 (release), tracked for firefox130 (beta) and tracked for firefox131 (nightly). However, the bug still isn't assigned.
:jimm, could you please find an assignee for this tracked bug? If you disagree with the tracking decision, please talk with the release managers.
For more information, please visit BugBot documentation.
Comment 3•1 year ago
•
|
||
Looks like something they may have shipped and then possibly backed out? Andrew any thoughts?
Strange that this only impacted 128 and 129b4 but not 129.0 or 129.0.1.
Updated•1 year ago
|
Comment 4•1 year ago
|
||
This may be works for me. The crash hasn't happened in the last 10 days, regardless of release. It stated July 19th, and went away Aug 7th. From the crash it appears format related.
Comment 5•1 year ago
|
||
This is a reminder regarding comment #2!
The bug is marked as tracked for firefox129 (release), tracked for firefox130 (beta) and tracked for firefox131 (nightly). We have limited time to fix this, the soft freeze is in 14 days. However, the bug still isn't assigned.
Comment 6•1 year ago
|
||
This is a reminder regarding comment #2!
The bug is marked as tracked for firefox129 (release), tracked for firefox130 (beta) and tracked for firefox131 (nightly). We have limited time to fix this, the soft freeze is in 8 days. However, the bug still isn't assigned.
Comment 7•1 year ago
|
||
Crashes are still happening AFAICT, but the volume is pretty low.
| Assignee | ||
Updated•1 year ago
|
| Assignee | ||
Comment 8•1 year ago
|
||
I think this is actually a security bug. We verify the bounds of any buffers we receive from the GMP process if it is a GMP plugin (i.e. OpenH264) before passing it into the GMPVideoDecoder implementation:
https://searchfox.org/mozilla-central/rev/e968519d806b140c402c3b3932cd5f6cd7cc42ac/dom/media/gmp/GMPVideoDecoderParent.cpp#301
However we do not appear to have any similar checks for the GMP process if it is a Chromium-style plugin (i.e. Widevine):
https://searchfox.org/mozilla-central/rev/e968519d806b140c402c3b3932cd5f6cd7cc42ac/dom/media/gmp/ChromiumCDMParent.cpp#991
This is called directly by ChromiumCDMParent::RecvDecodedData and ChromiumCDMParent::RecvDecodedShmem. Given this is where the process boundary is, we need to do similar bounds checks to ensure that plane offset + plane stride * plane width does not exceed the bounds of the provided data buffer. My best guess is that we are actually getting a buffer that violates that, hence the BAD_ACCESS crash on the pointer.
I suspect this is because of a bug in Widevine that Netflix is sometimes hitting. We should just drop the buffer in this case.
| Assignee | ||
Updated•1 year ago
|
| Assignee | ||
Updated•1 year ago
|
| Assignee | ||
Comment 9•1 year ago
|
||
Possibly related is bug 1906527 which substantially alters the OSX pathway and there do not appear any crashes on nightly with this signature.
| Assignee | ||
Comment 10•1 year ago
|
||
There do not appear to be any signature morphs for nightly, but the volume is probably too low to be certain that bug 1906527 fixed/changed things.
| Assignee | ||
Comment 11•1 year ago
|
||
Comment 12•1 year ago
|
||
The bug is linked to a topcrash signature, which matches the following criteria:
- Top 20 desktop browser crashes on release
- Top 10 content process crashes on release
- Top 5 desktop browser crashes on Mac on release
For more information, please visit BugBot documentation.
| Assignee | ||
Comment 13•1 year ago
|
||
Original Revision: https://phabricator.services.mozilla.com/D219919
Updated•1 year ago
|
| Assignee | ||
Comment 14•1 year ago
|
||
Comment on attachment 9420419 [details]
Bug 1911317.
Security Approval Request
- How easily could an exploit be constructed based on the patch?: I'm not sure. They need to get the GMP process / Widevine library to provide a badly structured buffer, but the Widevine library is a blackbox. It is immediately obvious from the patch we are worried about a buffer overflow. Removing the comments/log entries won't make a difference.
- Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?: Yes
- Which branches (beta, release, and/or ESR) are affected by this flaw, and do the release status flags reflect this affected/unaffected state correctly?: All
- If not all supported branches, which bug introduced the flaw?: None
- Do you have backports for the affected branches?: Yes
- If not, how different, hard to create, and risky will they be?: Only esr115 required an updated patch.
- How likely is this patch to cause regressions; how much testing does it need?: Unlikely, it just adds bounds checking for something that should never happen.
- Is the patch ready to land after security approval is given?: Yes
- Is Android affected?: No
| Assignee | ||
Updated•1 year ago
|
Updated•1 year ago
|
| Assignee | ||
Comment 15•1 year ago
|
||
We are still getting crash reports for users with the updated Widevine plugin, so it does not appear that will fix this issue.
| Assignee | ||
Comment 16•1 year ago
|
||
e.g. https://crash-stats.mozilla.org/report/index/dbbc703b-9193-4829-8a22-011550240903 has gmp-widevinecdm":{"version":"4.10.2830.0" in the crash report telemetry.
Comment 17•1 year ago
|
||
Comment on attachment 9420419 [details]
Bug 1911317.
Approved to land and uplift
Comment 18•1 year ago
|
||
| Reporter | ||
Comment 19•1 year ago
|
||
Updated•1 year ago
|
Comment 20•1 year ago
|
||
The patch landed in nightly and beta is affected.
:aosmond, is this bug important enough to require an uplift?
- If yes, please nominate the patch for beta approval.
- If no, please set
status-firefox131towontfix.
For more information, please visit BugBot documentation.
| Assignee | ||
Comment 21•1 year ago
|
||
Comment on attachment 9420419 [details]
Bug 1911317.
Beta/Release Uplift Approval Request
- User impact if declined: Sec issue, related crashes in the wild
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Just adds additional bounds checking.
- String changes made/needed:
- Is Android affected?: No
| Assignee | ||
Updated•1 year ago
|
| Assignee | ||
Updated•1 year ago
|
Comment 22•1 year ago
|
||
Comment on attachment 9420419 [details]
Bug 1911317.
Approved for 131.0b3
Comment 23•1 year ago
|
||
| uplift | ||
Updated•1 year ago
|
Updated•1 year ago
|
Comment 24•1 year ago
|
||
Comment on attachment 9420419 [details]
Bug 1911317.
Approved for 128.3esr
Comment 25•1 year ago
|
||
| uplift | ||
Updated•1 year ago
|
Comment 26•1 year ago
|
||
:aosmond can you take a look at the crashes in nightly/beta that occurred post change?
Updated•1 year ago
|
Comment 27•1 year ago
|
||
Reopening since the crashes are still happening.
| Assignee | ||
Comment 28•1 year ago
|
||
I am hoping bug 1918778 will resolve it, should have the fix in hand EOD/this weekend, as there are several problems with how we deal with the surface recycling that can explain the crash.
Comment 29•1 year ago
|
||
:aosmond in will both patches (this and bug 1918778) need to be uplifted to esr115?
I have requested nomination to esr128 on the other patch just to be proactive, but I will wait until b8 has a little more crash-free traffic
| Assignee | ||
Comment 30•1 year ago
|
||
(In reply to Dianna Smith [:diannaS] from comment #29)
:aosmond in will both patches (this and bug 1918778) need to be uplifted to esr115?
I have requested nomination to esr128 on the other patch just to be proactive, but I will wait until b8 has a little more crash-free traffic
This bug did not resolve the topcrash, but we should still uplift it for the security implications, since we did not verify the sanity of the output from the plugin in the content process.
As for bug 1918778, in order to uplift, we will need to uplift bug 1906527 as well, or I will need to see if I can write a minimalist version of the patch suitable for uplift. Let me see what I can do.
Comment 31•1 year ago
|
||
Let's re-close this bug then for the sake of tracking. Bug 1918778 is looking good for having cleaned up the remaining crashes under this signature.
Updated•1 year ago
|
Comment 32•1 year ago
|
||
esr115 Uplift Approval Request
- User impact if declined: Sec issue, related crashes in the wild
- Code covered by automated testing: yes
- Fix verified in Nightly: yes
- Needs manual QE test: no
- Steps to reproduce for manual QE testing: No
- Risk associated with taking this patch: Low
- Explanation of risk level: Just adds additional bounds checking.
- String changes made/needed: None
- Is Android affected?: no
Comment 33•1 year ago
|
||
| uplift | ||
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Description
•