Open Bug 1476101 Opened 7 years ago Updated 3 years ago

Optimize Windows sandbox policy construction

Tracking

()

Status:

NEW

Project Flags:

Performance Impact

medium

People

(Reporter: jld, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: perf:responsiveness, perf:startup)

Attachments

(2 files, 1 obsolete file)

[WIP] Bug 1476101 - Optimize Windows sandbox policy construction 6 years ago hafsa 47 bytes, text/x-phabricator-request		Details \| Review
Bug 1476101, allow SandboxBroker to hold a cached process available for reuse by GeckoChildProcessHost, which will improve performance 6 years ago Emma Malysz 47 bytes, text/x-phabricator-request		Details \| Review
Bug 1476101, cache the setup done by sandboxBroker for reuse by GeckoChildProcessHost. 6 years ago Emma Malysz 47 bytes, text/x-phabricator-request		Details \| Review

Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧

Reporter

Description

•

7 years ago

I profiled content process launch on Windows, and found that we spend about half the total ContentParent::LaunchSubprocess time constructing the sandbox policy: https://perfht.ml/2zM88PS Specifically, we spend a lot of time adding the filesystem rules, and more specifically in sandbox::IsReparsePoint and sandbox::ConvertToLongPath. But the policy is mostly (or entirely?) the same for every process, so ideally we could just create the common policy once and copy it per-process like we did on B2G, or maybe cache that work some other way. Data-driven warning: We don't have telemetry data on what portion of launch time this is for the median user. The CONTENT_PROCESS_LAUNCH_TIME_MS metric has some bugs (it's collecting times from two different points in the process), but I'm seeing a median around 8ms on the system where I profiled that vs. 34ms on Beta, so that profile isn't exactly representative. Bug 1474991 would help with this.

Jim Mathies [:jimm]

Updated

•

7 years ago

Priority: -- → P2

Whiteboard: [qf]

Jean Gong :jgong

Updated

•

7 years ago

Whiteboard: [qf] → [qf:p1:f67]

Mike Conley (:mconley) (:⚙️)

Comment 1

•

6 years ago

Hey bobowen, I hear you work on the Windows sandbox! :) What are the chances this can be prioritized sometime this quarter?

Flags: needinfo?(bobowencode)

Whiteboard: [qf:p1:f67] → [qf:p2:responsiveness][fxperf]

Bob Owen (:bobowen)

Comment 2

•

6 years ago

After reviewing the sandbox policy code, I don't see an obvious reason why we can't share them (when the policy settings are the same of course). In fact the policy code itself appears to have been created with that in mind.
Some of the other chromium sandbox code that uses it pretty much assumes a single process per policy, but it doesn't look like it will require big changes.

I'll try and take a stab at a patch soon.
Leaving the NI to nag me.

Mike Conley (:mconley) (:⚙️)

Comment 3

•

6 years ago

(In reply to Bob Owen (:bobowen) from comment #2)

I'll try and take a stab at a patch soon.
Leaving the NI to nag me.

Any progress to report here, bobowen? Can we assign this one to you?

Whiteboard: [qf:p2:responsiveness][fxperf] → [qf:p2:responsiveness][fxperf:p2]

Bob Owen (:bobowen)

Comment 4

•

6 years ago

I'm not actively working on this, but I will try to when I get a chance.

Things aren't quite so simple as I first thought in comment 2, as there is some information in the policy that changes per process, but it doesn't look too bad.

hafsa

Comment 5

•

6 years ago

Hello, I'm an Outreachy applicant and I will start working on this bug.

hafsa

Comment 6

•

6 years ago

Attached file [WIP] Bug 1476101 - Optimize Windows sandbox policy construction — Details

Jed Davis [:jld] ⟨⏰|UTC-7⟩ ⟦he/him⟧

Reporter

Updated

•

6 years ago

Updated

•

6 years ago

Assignee: nobody → emalysz

Emma Malysz

Comment 7

•

6 years ago

Proposed plan:

Change policy by splitting out dynamic parts and passing those by another means to SpawnTarget
Check use of policy object and avoid any sharing problems.
Let SandboxBroker hold 1 common process

Emma Malysz

Comment 8

•

6 years ago

Attached file Bug 1476101, allow SandboxBroker to hold a cached process available for reuse by GeckoChildProcessHost, which will improve performance (obsolete) — Details

Alex Thayer [:alexical] (she/her)

Comment 9

•

6 years ago

To be clear, this ought to have no impact on startup time, yes? As we'll still have to do the work for the first content process, we'll just be avoiding it for subsequent launches? I imagine there's no way we have in mind for reducing this time for the first content process launch of a session, correct?

Do we know how often UI changes are blocked on content process creation, vs when we are able to complete process creation in the background?

Emma Malysz

Comment 10

•

6 years ago

Attached file Bug 1476101, cache the setup done by sandboxBroker for reuse by GeckoChildProcessHost. — Details

Phabricator Automation

Updated

•

6 years ago

Attachment #9091594 - Attachment is obsolete: true

:Gijs (he/him)

Comment 11

•

5 years ago

(In reply to Doug Thayer [:dthayer] from comment #9)

To be clear, this ought to have no impact on startup time, yes? As we'll still have to do the work for the first content process, we'll just be avoiding it for subsequent launches? I imagine there's no way we have in mind for reducing this time for the first content process launch of a session, correct?

Hm, but if we can share some of the policy between normal content processes, the privileged content process, the gpu process and the networking process, it might benefit startup...

Bob Owen (:bobowen)

Comment 12

•

5 years ago

(In reply to :Gijs (he/him) from comment #11)

(In reply to Doug Thayer [:dthayer] from comment #9)

To be clear, this ought to have no impact on startup time, yes? As we'll still have to do the work for the first content process, we'll just be avoiding it for subsequent launches? I imagine there's no way we have in mind for reducing this time for the first content process launch of a session, correct?

Hm, but if we can share some of the policy between normal content processes, the privileged content process, the gpu process and the networking process, it might benefit startup...

The only sharing that can be done here (at least with the planned changes) is between the normal content processes and the privileged content process (even this might cause problems, in the future).
Given that some of these are changes to the chromium sandbox code, I've asked if we can quantify that benefit, once we have a working prototype.
(Ideally we would get changes upstreamed, but that goes for quite a lot of the changes we already carry. As always it's a matter of priorities.)

The GPU process isn't sandbox yet, but hopefully will be soon, but the policy will be different.
The networking process is not sandboxed yet, but again the policy will be different.

Flags: needinfo?(bobowencode)

Bob Owen (:bobowen)

Comment 13

•

5 years ago

emalysz - before we go through another round of reviews, I'm assuming you have a working version now. Does this seem to be actually giving us the performance benefits we were hoping for?

Flags: needinfo?(emalysz)

Emma Malysz

Comment 14

•

5 years ago

We're currently unable to get a symbolicated profile due to Bug 1594097, so we cannot demonstrate performance benefits (if any) until that is resolved

Flags: needinfo?(emalysz)

Emma Malysz

Comment 15

•

5 years ago

•

Edited

bobowen- I'll comment profiles once I figure out how to track the sandbox code. But for right now, I added markers within ContentParent.cpp LaunchSubprocessInternal() and saw that the second and third time the marker is hit it is reduced from ~14 ms to ~7ms.
Without patch: https://perfht.ml/33QQtRM
With patch: https://perfht.ml/2QjCaRF

I'm having trouble profiling the "IPCLaunchThread" (something about naming), but here are preliminary results.

Flags: needinfo?(bobowencode)

Bob Owen (:bobowen)

Comment 16

•

5 years ago

(In reply to Emma Malysz from comment #15)

bobowen- I'll comment profiles once I figure out how to track the sandbox code. But for right now, I added markers within ContentParent.cpp LaunchSubprocessInternal() and saw that the second and third time the marker is hit it is reduced from ~14 ms to ~7ms.

Great work, I'm on PTO for some of Monday, so I'll try to get to the patch on Tuesday.
As has been mentioned this might not improve initial start-up, because that might rely more on the first process start.
It may well help with fission though, particularly when restoring multiple tabs.

However, I've been thinking about initial start-up and as the IPC Launch is single threaded on Windows, we could possibly post a task to that thread after the GPU and socket threads have started, to create the policy ready for the first content process launch.
Something for a follow-up though.

Without patch: https://perfht.ml/33QQtRM
With patch: https://perfht.ml/2QjCaRF

I'm having trouble profiling the "IPCLaunchThread" (something about naming), but here are preliminary results.

Looks like the name of that thread is "IPC Launch" [1].
I think it is the name that the profiler needs in the settings.

[1] https://searchfox.org/mozilla-central/rev/cac340f10816707e91c46db6d285f80259420f07/ipc/glue/GeckoChildProcessHost.cpp#856

Flags: needinfo?(bobowencode)

Emma Malysz

Comment 17

•

5 years ago

Here are two more profiles, this time gathered with the IPC Launch Thread:
with patch: http://bit.ly/37laVvZ
without patch: http://bit.ly/2r2zrkY

In both cases, I loaded two tabs. With the patch, it has reduced the time spent from 17 ms to 4 ms in SetSecurityLevelForContentProcess.
The marker set in this method showed a small decrease (7.2 to 6.9) on the first instance, but the subsequent look ups were significantly reduced: 7.5-10ms to 0.

Emma Malysz

Comment 18

•

5 years ago

The sandbox team has a few concerns with the current implementation and wants to investigate policy interactions more to see if this is still a viable option. :bobowen also suggested that we could cache the reparse point lookup, as this is what causes the performance issue. This eliminates the risk of the policy and use of it not being thread-safe.

There also can be performance wins when the policy is first constructed if file paths are similar in the rules.

Emma Malysz

Updated

•

4 years ago

Assignee: emalysz → nobody

Bob Owen (:bobowen)

Comment 19

•

4 years ago

Just to bring things up to speed in this bug.
After discussions with the chromium sandbox team, it looks like we can probably remove the checks that cause a fair amount of the I/O during sandbox setup. (See https://phabricator.services.mozilla.com/D112285#3844519)
There are some complications to getting that working for symlinks (bug 1695556), although for some of the possible performance benefits, we might just be able to remove the check.

However after very quick testing of this, the performance benefits of this alone are possibly somewhat masked by other mitigations we have in place, like the pre-allocated process. It looks like it might be a bit more noticeable for fission.

I hope to be looking at getting these changes made this half.

Dave Hunt [:davehunt] [he/him] ⌚BST

Updated

•

3 years ago

Performance Impact: --- → P2

Keywords: perf:responsiveness

Whiteboard: [qf:p2:responsiveness][fxperf:p2] → [fxperf:p2]

Mike Conley (:mconley) (:⚙️)

Updated

•

3 years ago

Keywords: perf:startup

Whiteboard: [fxperf:p2]

BMO Automation

Updated

•

3 years ago

Severity: normal → S3

You need to log in before you can comment on or make changes to this bug.