Don't recursively crash due to "exception-context annotations" in GMP processes

RESOLVED FIXED in Firefox 49

Status

()

Core
Security: Process Sandboxing
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: jld, Assigned: jld)

Tracking

(Blocks: 1 bug)

46 Branch
mozilla51
Unspecified
Linux
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox49 fixed, firefox50 fixed, firefox51 fixed)

Details

Attachments

(1 attachment)

(Assignee)

Description

2 years ago
Bug 1236108 introduced a feature where child processes that are crashing can add annotations at that time (instead of needing to send them over IPDL when in a non-crashing state, I guess?).  Unfortunately, this is accomplished by creating and writing to a temporary file from the child process.

GeckoMediaPlugin processes on Linux have no access to the filesystem, because they don't need it — there's one open() of the plugin module itself, from libc's dlopen() implementation, and that's faked with seccomp-bpf to use a previously opened file descriptor instead.  Correspondingly, not only does the seccomp-bpf policy prevent filesystem access, but the process also runs chroot()ed into a deleted directory if kernel features permit.

So that crash-time annotation feature isn't going to work for GMP on Linux without significant changes.

But a bigger problem is that the offending open() call is treated like any other unexpected syscall and… invokes the crash reporter.  Which does the same open(), and raises SIGSYS, and this causes unbounded recursion of crashing.  Eventually the process runs out of stack and dies without having actually requested a crash dump.

That part at least is fixable: just return an error (perhaps EPERM) from the open and the PlatformWriter will quietly discard all writes.  The extra metadata won't be attached, but it's not clear that that's as valuable for GMP processes as it is for content, and the rest of the crash report will still work.
Just making sure I understand this, when you say we don't get the extra metadata, do you mean the stuff in the "metadata" tab of the crash report? We can survive without that (though it would be nice). We do however need everything in the "detais" tab of the crash report. Is that going to remain?
Blocks: 1265235
(Assignee)

Comment 2

2 years ago
It looks like this feature is used for OOMAllocationSize (if OOM) and MozCrashReason (if an intentional crash/assertion): http://searchfox.org/mozilla-central/rev/480c0269eedd567906bfa098f18f51a9e7fa2e38/toolkit/crashreporter/nsExceptionHandler.cpp#1286
(Assignee)

Comment 3

2 years ago
Created attachment 8776734 [details] [diff] [review]
Patch: soft-fail unexpected open() in GMP

Not sure if this is the best way to handle this, but it will at least make crash reporting work again.
Attachment #8776734 - Flags: review?(gpascutto)
Comment on attachment 8776734 [details] [diff] [review]
Patch: soft-fail unexpected open() in GMP

Review of attachment 8776734 [details] [diff] [review]:
-----------------------------------------------------------------

I'm mentally filing this in the "it would be nice to get non-fatal reports of sandbox exceptions" because the risk here is if that more of those spurious opens appear, we won't know about them.

Alternatively, directory-based file brokering should allow removing this workaround (by whitelisting the relevant tmpdir)? If so can we hang the bugs off of each other?
Attachment #8776734 - Flags: review?(gpascutto) → review+
(In reply to Jed Davis [:jld] [⏰PDT; UTC-7] from comment #2)
> It looks like this feature is used for OOMAllocationSize (if OOM) and
> MozCrashReason (if an intentional crash/assertion):
> http://searchfox.org/mozilla-central/rev/
> 480c0269eedd567906bfa098f18f51a9e7fa2e38/toolkit/crashreporter/
> nsExceptionHandler.cpp#1286

Warning: Bug 1278717 disposes of PCrashReporter completely and in favor of this mechanism. You might want to coordinate with billm.
(Assignee)

Comment 6

2 years ago
(In reply to Aaron Klotz [:aklotz] from comment #5)
> Warning: Bug 1278717 disposes of PCrashReporter completely and in favor of
> this mechanism. You might want to coordinate with billm.

Sigh.  Thanks for the warning (and I tend to agree with not using IPDL IPC for crash metadata).  But: without this patch you get no crash report at all; with this patch you get a minidump but no extras.

(In reply to Gian-Carlo Pascutto [:gcp] from comment #4)
> I'm mentally filing this in the "it would be nice to get non-fatal reports
> of sandbox exceptions" because the risk here is if that more of those
> spurious opens appear, we won't know about them.

I left in the stderr logging, at least, but of course that doesn't help problems that show up in non-(developer-or-automation) usage.

> Alternatively, directory-based file brokering should allow removing this
> workaround (by whitelisting the relevant tmpdir)? If so can we hang the bugs
> off of each other?

I was trying to avoid this, because there's no other reason why the broker's attack surface should be exposed to these processes.  But it might be the easiest way to get some kind of workaround shipped.

Also, it seems kind of… counterintuitive that we're writing the metadata out-of-band to the filesystem when we already have a connection directly to the parent process that we're already using to send it a block of data about the crash.  From bug 1278717 comment #3 it looks like I'm not the only one interested in something like that.
(Assignee)

Comment 7

2 years ago
Try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=8c40dd581e79

I'll file a followup for improving on this.
Keywords: checkin-needed
(Assignee)

Updated

2 years ago
Blocks: 1291502

Comment 8

2 years ago
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/fa84c3fbfbad
Soft-fail unexpected open() in GMP processes to avoid recursive crash. r=gcp
Keywords: checkin-needed

Comment 9

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/fa84c3fbfbad
Status: NEW → RESOLVED
Last Resolved: 2 years ago
status-firefox51: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
Comment on attachment 8776734 [details] [diff] [review]
Patch: soft-fail unexpected open() in GMP

Approval Request Comment
[Feature/regressing bug #]: Widevine EME on Linux
[User impact if declined]: We won't get crash reports for GMPs on Linux at all. This includes both crash reports for the Widevine CDM and OpenH264.
[Describe test coverage new/current, TreeHerder]: We have plenty of EME mochitests, which run on Linux.
[Risks and why]: Low; this is tweaking sandbox rules
[String/UUID change made/needed]: None
Attachment #8776734 - Flags: approval-mozilla-beta?
Attachment #8776734 - Flags: approval-mozilla-aurora?

Updated

2 years ago
status-firefox50: --- → affected

Updated

2 years ago
status-firefox49: --- → affected
Comment on attachment 8776734 [details] [diff] [review]
Patch: soft-fail unexpected open() in GMP

Widevine EME on Linux, we need to get usable crash reports, Aurora50+
Attachment #8776734 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8776734 [details] [diff] [review]
Patch: soft-fail unexpected open() in GMP

Should land for beta 3 later this week.
Attachment #8776734 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Comment 13

2 years ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-beta/rev/74d54db8df58
status-firefox49: affected → fixed
You need to log in before you can comment on or make changes to this bug.