Closed Bug 1146873 Opened 5 years ago Closed 4 years ago

Windows process sandbox should not start process if critical policy set-up fails.

Categories

(Core :: Security: Process Sandboxing, defect)

All
Windows 7
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla50
Tracking Status
firefox50 --- fixed

People

(Reporter: bobowen, Assigned: gkrizsanits)

References

Details

(Whiteboard: sbwc1)

Attachments

(1 file, 2 obsolete files)

Most of the Chromium sandbox policy set-up functions are simple and unlikely to fail, but as they do report a status, we should decide if the policy setting is critical and not start the process if it fails.
Flags: needinfo?(bobowen.code)
Whiteboard: sbwc1
Flags: needinfo?(bobowen.code)
Duplicate of this bug: 1011658
We'll want to get SetAlternateDesktop working soon, but even that shouldn't really fail at this point.

I was going to see whether Chromium treats any of them as non-fatal, at least as far as starting the child without them is concerned.
If we decide they are all fatal we could just return when they fail, instead of what it does now.

The question of what we do when we decide not to start the child is probably the important bit.
We wouldn't want to release assert for GMP and NPAPI.
For content, it seems unlikely that retrying is going to be worthwhile, as nothing will have changed, if crashing is the accepted way of handling these things then I guess that is what we do.
I suppose trying to do a more graceful shutdown, would mean we wouldn't get the information we needed to diagnose any issues.
Flags: needinfo?(bobowen.code)
Assignee: nobody → gkrizsanits
So the only meaningful failures I see from these calls under SandboxBroker::SetSecurityLevelForContentProcess are from mPolicy->SetAlternateDesktop(true);

According to my understanding that is something we do not ship yet for content processes.

For the rest of the functions all we need is an assert as they can only return a bunch of bad params (and we use static params). I'm tempted to just do an assert / release assert at https://hg.mozilla.org/mozilla-central/annotate/46fe2115d46a5bb40523b8466341d8f9a26e1bdf/ipc/glue/GeckoChildProcessHost.cpp#l991

Bob, did you have anything specific in mind here you wanted to handle differently?
Flags: needinfo?(bobowen.code)
Attached patch asserting sandboxsetup failures (obsolete) — Splinter Review
Sorry for the lag, but I really had to take my time to set up my win dev env finally... I think for functions that with these static arguments cannot fail, an assert is the right thing to do. SetAlternateDesktop call is pretty much the only call that actually does some system calls that might fail theoretically. So for GMP I don't do release assert because of it. SetJobLevel could also fail (see related assert message) if memory limit is set, but we don't use memory limit so I just use assert there too. And even if we will, I think only thing to do will be to do a smarter assert.
Attachment #8757844 - Flags: feedback?(bobowen.code)
Comment on attachment 8757844 [details] [diff] [review]
asserting sandboxsetup failures

Review of attachment 8757844 [details] [diff] [review]:
-----------------------------------------------------------------

It was a national holiday here in the UK yesterday, although I did think about this while I was pulling weeds out of our garden. :-)

I think we need to ignore our prior knowledge about whether these functions can fail and treat them as if they can and make sure we behave as we would want.
I also think that behaviour should be different for the content process to the others.

If we're going to fail to start the content process, we're basically sunk.
To my mind in an ideal world we would fail gracefully (perhaps after some retries) with some sort of friendly (in the circumstances) error report to the user and if allowed a report to us.
Given that we don't quite have that, crashing via MOZ_RELEASE_ASSERT or MOZ_CRASH seems like the only sensible option.
(In fact it should probably be up to the content process code whether it crashes or not, but given the current state of this code, having the SandboxBroker do it is fine I think. Although this is really useful for things I need to think about when I come to writing up my plans for refactoring this policy setting/process launch code.)

For other processes I don't think we should crash the main process (in opt builds), we should just fail to start the child and it should be up to the code that was trying to start that process to handle the failure gracefully.

Sorry, I'd not thought about this more clearly before.
So, in light of that some comments ...

::: ipc/glue/GeckoChildProcessHost.cpp
@@ +996,5 @@
>  #if defined(MOZ_CONTENT_SANDBOX)
>        if (mSandboxLevel > 0 &&
>            !PR_GetEnv("MOZ_DISABLE_CONTENT_SANDBOX")) {
> +        bool ok = mSandboxBroker.SetSecurityLevelForContentProcess(mSandboxLevel);
> +        MOZ_RELEASE_ASSERT(ok, "Setting security level for content process failed.");

For the moment I think we could make this void and release assert in SetSecurityLevelForContentProcess.
That way if it could happen in the future we would pinpoint the problem.
We should probably have a comment here explaining that.

@@ +1007,5 @@
>      case GeckoProcessType_Plugin:
>        if (mSandboxLevel > 0 &&
>            !PR_GetEnv("MOZ_DISABLE_NPAPI_SANDBOX")) {
> +        bool ok = mSandboxBroker.SetSecurityLevelForPluginProcess(mSandboxLevel);
> +        MOZ_RELEASE_ASSERT(ok, "SetSecurityLevelForGMPlugin has failed.");

I think this should not crash and return false the same as for GMP.

@@ +1015,5 @@
>        break;
>      case GeckoProcessType_IPDLUnitTest:
>        // XXX: We don't sandbox this process type yet
>        // mSandboxBroker.SetSecurityLevelForIPDLUnitTestProcess();
>        // cmdLine.AppendLooseValue(UTF8ToWide("-sandbox"));

Let's leave the comment, but remove these commented out lines and SetSecurityLevelForIPDLUnitTestProcess.

@@ +1029,5 @@
>            [](const std::string arg) { return arg.find("gmp-widevinecdm") != std::string::npos; });
>          auto level = isWidevine ? SandboxBroker::Restricted : SandboxBroker::LockDown;
> +        bool ok = mSandboxBroker.SetSecurityLevelForGMPlugin(level);
> +        if (!ok) {
> +            // Seems like SetAlternateDesktop has failed, let's not start

I don't think we should be specific about what might have failed just that we failed to set the policy.

::: security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp
@@ +133,4 @@
>  
>    result = mPolicy->SetTokenLevel(sandbox::USER_RESTRICTED_SAME_ACCESS,
>                                    accessTokenLevel);
> +  MOZ_ASSERT(sandbox::SBOX_ALL_OK == result,

I think this and the others should be release asserts.

@@ +145,4 @@
>  
>    if (aSandboxLevel > 2) {
>      result = mPolicy->SetAlternateDesktop(true);
> +    if (sandbox::SBOX_ALL_OK != result) {

Release assert like the others.

@@ +235,5 @@
>    }
>  
>    sandbox::ResultCode result = mPolicy->SetJobLevel(jobLevel,
>                                                      0 /* ui_exceptions */);
> +  MOZ_RELEASE_ASSERT(sandbox::SBOX_ALL_OK == result,

Just a normal assert and we should return false as well, otherwise if because of some weird future change this only failed in opt builds we would ignore it.

@@ +241,4 @@
>  
>    result = mPolicy->SetTokenLevel(sandbox::USER_RESTRICTED_SAME_ACCESS,
>                                    accessTokenLevel);
> +  MOZ_ASSERT(sandbox::SBOX_ALL_OK == result,

For this and the others here and for GMP we need to return false as well.
Perhaps we need a local macro to remove the duplication.
Attachment #8757844 - Flags: feedback?(bobowen.code)
(In reply to Bob Owen (:bobowen) from comment #5)
> I think we need to ignore our prior knowledge about whether these functions
> can fail and treat them as if they can and make sure we behave as we would
> want.

Alright, let's do the future proof version.

> Sorry, I'd not thought about this more clearly before.
> So, in light of that some comments ...

Your previous comment was pretty clear that this is the approach you prefer to be honest. Since I already had the patch by then mostly and I thought the asserting is the way to go, I just wanted you to have a look at it and see if that changes your mind. Anyway, it can't cause any harm to be extra careful and guard against future changes so I made all the requested changes.
Attachment #8757844 - Attachment is obsolete: true
Attachment #8758694 - Flags: review?(bobowen.code)
Comment on attachment 8758694 [details] [diff] [review]
handling sandbox policy setup failures. v2

Review of attachment 8758694 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for doing this and cleaning things up.

Out of interest, do you know what it looks like if we do return false at these points for NPAPI and GMP?
Even though, I think we're all agreed it shouldn't happen with the existing code.

::: ipc/glue/GeckoChildProcessHost.cpp
@@ +995,5 @@
>      case GeckoProcessType_Content:
>  #if defined(MOZ_CONTENT_SANDBOX)
>        if (mSandboxLevel > 0 &&
>            !PR_GetEnv("MOZ_DISABLE_CONTENT_SANDBOX")) {
> +        // For now we threat every failure fatal in SetSecurityLevelForContentProcess

nit: "threat every failure fatal" should be "treat every failure as fatal"

::: security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp
@@ +305,4 @@
>  }
>  
>  bool
>  SandboxBroker::SetSecurityLevelForIPDLUnitTestProcess()

Can you remove this as well, please.
Attachment #8758694 - Flags: review?(bobowen.code) → review+
I'm setting the review flag again as we talked per IRC yesterday where should I do the error state setting and the lock.Notify(); bits in case of error. This seems like the cleanest approach.
Attachment #8758694 - Attachment is obsolete: true
Attachment #8759605 - Flags: review?(bobowen.code)
Comment on attachment 8759605 [details] [diff] [review]
handling sandbox policy setup failures. v3

Review of attachment 8759605 [details] [diff] [review]:
-----------------------------------------------------------------

::: ipc/glue/GeckoChildProcessHost.cpp
@@ +572,5 @@
> +  if (!ok) {
> +    // WaitUntilConnected might be waiting for us to signal.
> +    // If something failed let's set the error state and notify.
> +    MonitorAutoLock lock(mMonitor);
> +    mProcessState = PROCESS_ERROR;

I was thinking that we could even do the PROCESS_CREATED (currently at the end of PerformAsyncLaunchInternal and in GeckoExistingProcessHost::PerformAsyncLaunch) signal here.
So we would always signal, with the state based on ok.

But maybe that should be in a follow-up, with an IPC peer reviewing not me.

::: security/sandbox/win/src/sandboxbroker/sandboxBroker.h
@@ +36,3 @@
>  #endif
>    bool SetSecurityLevelForPluginProcess(int32_t aSandboxLevel);
>    bool SetSecurityLevelForIPDLUnitTestProcess();

nit: this can go as well.
Attachment #8759605 - Flags: review?(bobowen.code) → review+
https://hg.mozilla.org/mozilla-central/rev/15d68ab35c70
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
You need to log in before you can comment on or make changes to this bug.