Closed Bug 1392570 Opened 2 years ago Closed 2 years ago

Firefox fails to launch on Windows 7 when already running in a job.

Categories

(Core :: Security: Process Sandboxing, defect, P1)

All
Windows
defect

Tracking

()

VERIFIED FIXED
mozilla57
Tracking Status
firefox56 --- verified
firefox57 --- verified

People

(Reporter: bobowen, Assigned: bobowen)

References

(Blocks 1 open bug)

Details

(Whiteboard: sb+)

Attachments

(3 files)

Chromium has a command line flag to allow no job to be used in the sandbox policy.
This was originally supposed to only be used for remote sessions on Win7 or before.

My suspicion is that this is contributing to the failed launches (SANDBOX_FAILED_LAUNCH) with error code 18.
(Note: if this is the case on release, these would be for GMP and NPAPI 64-bit processes only as we are not using a job for content until Fx56.)

If you use runas to start Firefox then this also uses a separate job and we would fail to start on Win7.
An unknown number of people use (or used to use) runas with Firefox for some security benefits (see bug 1228880).
This is now a counterproductive technique, but I'm going to allow both the remote and local case for now and add telemetry to see if we can restrict this to just remote.
This patch also adds telemetry for when this occurs, breaking it down for local and remote sessions.
Attachment #8900380 - Flags: review?(jmathies)
Comment on attachment 8900380 [details] [diff] [review]
On Windows 7 don't attempt to use a job object for the sandbox when it will fail

Review of attachment 8900380 [details] [diff] [review]:
-----------------------------------------------------------------

::: security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp
@@ +274,5 @@
> +    return true;
> +  }
> +
> +  JOBOBJECT_EXTENDED_LIMIT_INFORMATION job_info = {};
> +  // If we are can't get the flags then again assume we can use a job.

nit - "If we are in a job but can't get.."

@@ +297,5 @@
> +                           ? u"remote" : u"local");
> +  Telemetry::ScalarSet(Telemetry::ScalarID::SANDBOX_NO_JOB, localRemote, true);
> +
> +  // Allow running without the sandbox in this case. This slightly reduces the
> +  // ability of the sandbox to protect its children from spawning new processes

nit - "running without the sandbox" "slightly reduces the ability of the sandbox"..

did you mean job object here?

@@ +299,5 @@
> +
> +  // Allow running without the sandbox in this case. This slightly reduces the
> +  // ability of the sandbox to protect its children from spawning new processes
> +  // or preventing them from shutting down Windows or accessing the clipboard.
> +  return false;

Do we need this info in about:support for debugging?

::: toolkit/components/telemetry/Scalars.yaml
@@ +387,5 @@
> +    kind: boolean
> +    keyed: true
> +    notification_emails:
> +      - bowen@mozilla.com
> +    release_channel_collection: opt-out

You'll need a data review here. I believe Rebecca Weiss is handling these now.
Attachment #8900380 - Flags: review?(jmathies) → review+
(In reply to Jim Mathies [:jimm] from comment #3)
> Comment on attachment 8900380 [details] [diff] [review]
...
> > +  // If we are can't get the flags then again assume we can use a job.
> 
> nit - "If we are in a job but can't get.."

Fixed locally thanks.
 
> > +  // Allow running without the sandbox in this case. This slightly reduces the
> > +  // ability of the sandbox to protect its children from spawning new processes
> 
> nit - "running without the sandbox" "slightly reduces the ability of the
> sandbox"..
> 
> did you mean job object here?

Hmm, yeah I think this came over from the chromium code, but you're right it doesn't make sense.
Fixed locally, thanks.

> @@ +299,5 @@
> > +
> > +  // Allow running without the sandbox in this case. This slightly reduces the
> > +  // ability of the sandbox to protect its children from spawning new processes
> > +  // or preventing them from shutting down Windows or accessing the clipboard.
> > +  return false;
> 
> Do we need this info in about:support for debugging?

I don't think so as not running using the job is not likely to cause regressions, so it probably wouldn't be that useful.

However, if we decide we want to not allow this for local sessions then if the telemetry indicates a number of people do this, we'll probably want to warn first or at least fail in a way that makes it obvious to the user what is happening.
Comment on attachment 8900380 [details] [diff] [review]
On Windows 7 don't attempt to use a job object for the sandbox when it will fail

Hi Rebecca,

Would you be able to do the approval for the collection of this new technical information, please?
Attachment #8900380 - Flags: feedback?(rweiss)
:bobowen, can you clone this form (https://docs.google.com/document/d/1SSn5w8DfCSkHWJS8DNTd7ya82diWRxaDUFM5aL4UDDo/edit) and respond to the questions accordingly?  I suspect you'll be able to copy/paste a lot from this thread there.  I'll perform the review given the responses to those questions.
Flags: needinfo?(bobowencode)
(In reply to Rebecca Weiss from comment #6)
> :bobowen, can you clone this form
> (https://docs.google.com/document/d/
> 1SSn5w8DfCSkHWJS8DNTd7ya82diWRxaDUFM5aL4UDDo/edit) and respond to the
> questions accordingly?  I suspect you'll be able to copy/paste a lot from
> this thread there.  I'll perform the review given the responses to those
> questions.

Form completed and sent thanks.
Flags: needinfo?(bobowencode)
Comment on attachment 8900380 [details] [diff] [review]
On Windows 7 don't attempt to use a job object for the sandbox when it will fail

Review of attachment 8900380 [details] [diff] [review]:
-----------------------------------------------------------------

Can you amend this bug with your form as a text attachment?

1) Is there documentation that describes the schema for the ultimate data set available publicly, complete and accurate?
Yes, following standard documentation for Telemetry probes and under bug #1392570

2) Is there a control mechanism that allows the user to turn the data collection on and off? (Note, for data collection not needed for security purposes, Mozilla provides such a control mechanism)  
Yes, standard Telemetry opt out mechanism.

3) If the request is for permanent data collection, is there someone who will monitor the data over time?
Time scoped to Firefox 62.

4) Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?  
These are type 1 collections.

5) Is the data collection default-on or default-off? 
Default on, but scoped to Win7 users who meet a specific condition.

6) Does the instrumentation include the addition of any new identifiers (whether anonymous or otherwise; e.g., username, random IDs, etc.  See the appendix for more details)? 
No.

7) Is the data collection covered by the existing Firefox privacy notice? If unsure: escalate to legal.
Yes

8) Does there need to be a check-in in the future to determine whether to renew the data? (Yes/No) (If yes, set a todo reminder or file a bug if appropriate)
No
Attachment #8900380 - Flags: feedback?(rweiss) → feedback+
(In reply to Rebecca Weiss from comment #8)
> Comment on attachment 8900380 [details] [diff] [review]
> On Windows 7 don't attempt to use a job object for the sandbox when it will
> fail
> 
> Review of attachment 8900380 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Can you amend this bug with your form as a text attachment?

Thanks Rebecca, I've done it as a PDF, because as plain text it was difficult to interpret some of the answers.
Pushed by bobowencode@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d45344aac145
On Windows 7 don't attempt to use a job object for the sandbox when it will fail. r=jimm, data-r=rweiss
https://hg.mozilla.org/mozilla-central/rev/d45344aac145
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Comment on attachment 8900380 [details] [diff] [review]
On Windows 7 don't attempt to use a job object for the sandbox when it will fail

Approval Request Comment
[Feature/Bug causing the regression]:
Using a Job object as part of the content process sandbox policy on Windows, which landed in Fx56.

[User impact if declined]:
Users running on a remote desktop session or using runas to run Firefox as a different user, will not be able to launch content processes and therefore not have a functioning browser.

[Is this code covered by automated tests?]:
We don't have any tests that run Firefox this way and they would be quite complicated to set up I think.
As for regressions, all e10s tests will test the normal code path.

[Has the fix been verified in Nightly?]:
Yes, by me.

[Needs manual test from QE? If yes, steps to reproduce]:
Yes should be easy to test.
STR: you can run Firefox using runas and your current user ID, which still runs it in a separate job which causes the issue, with something like the following from a command prompt:

runas /user:<user_id> "c:\Program Files (x86)\Nightly\firefox.exe -no-remote -P <profile_name>"

It will probably then prompt for your password and give you a browser with a tab that can't load any web content.

[List of other uplifts needed for the feature/fix]:
None.

[Is the change risky?]:
No.

[Why is the change risky/not risky?]:
Simple change that only reduces the sandbox policy strength in this particular case, so shouldn't cause any regressions in itself. Chrome uses a very similar piece of code to detect running in a job.

[String changes made/needed]:
None.
Attachment #8900380 - Flags: approval-mozilla-beta?
Hi Florin, could you help find someone to verify if this issue was fixed as expected on the latest Nightly build? Thanks!
Flags: qe-verify+
Flags: needinfo?(florin.mezei)
Cosmin, please verify this on the latest Nightly build.
Flags: needinfo?(florin.mezei)
QA Contact: cosmin.muntean
Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:57.0) Gecko/20100101 Firefox/57.
Version: 57.0a1, Build ID: 20170904100131

I have tested this issue on Windows 7 x64 with latest Nightly 57.0a1 (Build ID: 20170904100131) and is no longer reproducible. Tested with Nightly build on x32 and x64 versions.

I can confirm that the issue was reproducible on older Nightly builds (eg: Nightly 57.0a1 from 2017-08-22, Build ID: 20170822142709)
Status: RESOLVED → VERIFIED
Comment on attachment 8900380 [details] [diff] [review]
On Windows 7 don't attempt to use a job object for the sandbox when it will fail

Fix an issue that users can't start firefox on a remote desktop session or use runas as a different user. Beta56+.
Attachment #8900380 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
This patch also adds telemetry for when this occurs, breaking it down for local and remote sessions.

This patch needed a trivial rebase for Beta for the toolkit/components/telemetry/Scalars.yaml file.
Having trouble building Beta for other reasons at the moment to test the compliation.
I have managed to reproduce this bug using the STR from comment 12, on an affected Nightly build (2017-08-22).

Verified fixed on 56 Beta 12 (20170914024831) under Windows 7 x64/x86.
Flags: qe-verify+
Blocks: 1400826
You need to log in before you can comment on or make changes to this bug.