Closed Bug 1966716 Opened 1 year ago Closed 9 months ago

Use USER_LIMITED for the Windows GPU process sandbox on Nightly

Categories

(Core :: Security: Process Sandboxing, enhancement, P1)

All
Windows
enhancement

Tracking

()

RESOLVED FIXED
144 Branch
Tracking Status
firefox144 --- fixed

People

(Reporter: bobowen, Assigned: bobowen)

References

Details

(Keywords: perf-alert)

Attachments

(1 file)

Landing USER_LIMITED for the Windows GPU process sandbox on Nightly only, to see if we still have the user reported issues we had previously.

Pushed by bobowencode@gmail.com: https://hg.mozilla.org/integration/autoland/rev/3d504e5a2f9a Use USER_LIMITED for the Windows GPU process sandbox on Nightly. r=handyman
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 140 Branch

Lead to huge regressions on many Talos startup-y tests only on Windows hw-wr. Sw-wr is unaffected. So the sandbox is delaying/hampering stuff at startup.

122% on ts_paint

90% on startup_about_home_paint

104% on sessionrestore hw-wr but sw-wr is unaffected

(In reply to Mayank Bansal from comment #4)
Mayank: I really appreciate that you're helping us track performance issues. However, please stop CC'ing me on everything you find; I'm not on the performance team and I don't know about everything in Firefox (case in point: I know approximately nothing about our graphics code and sandboxing, i.e. this bug). Usually if you're asking questions or think action needs to be taken, pinging the assignee of the bug or triage owner (if there isn't an assignee) would be more appropriate.

Lead to huge regressions on many Talos startup-y tests only on Windows hw-wr. Sw-wr is unaffected. So the sandbox is delaying/hampering stuff at startup.

Is there a separate perf alert bug on file? That is the typical way to deal with this, right? If not, let's get one on file and needinfo the assignee of the regressing bug (Bob) there. Even better if you have profiles of a before/after build to show what is causing the slowdown.

Flags: needinfo?(mayankleoboy1)
Pushed by smolnar@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/f3a91d82cca8 Revert "Bug 1966716: Use USER_LIMITED for the Windows GPU process sandbox on Nightly. r=handyman" for causing sandboxing crashes related to font loading (bug 1967071)

Backed out for causing sandboxing crashes related to font loading

Backout link: https://hg.mozilla.org/integration/autoland/rev/f3a91d82cca84a81609f5c5e9bf827198b8c9ebc

Status: RESOLVED → REOPENED
Flags: needinfo?(bobowencode)
Resolution: FIXED → ---
Target Milestone: 140 Branch → ---
Pushed by archaeopteryx@coole-files.de: https://hg.mozilla.org/mozilla-central/rev/f927eb67efc3 Revert "Bug 1966716: Use USER_LIMITED for the Windows GPU process sandbox on Nightly. r=handyman" for causing sandboxing crashes related to font loading (bug 1967071)

Gijs: I had cc'd you based on the test owner mentioned here for startup_about_home_paint

Flags: needinfo?(mayankleoboy1)
Regressions: 1967485
Depends on: 1967485
Flags: needinfo?(bobowencode)
Depends on: 1967046

(In reply to Sandor Molnar[:smolnar] from comment #7)

Backed out for causing sandboxing crashes related to font loading

Backout link: https://hg.mozilla.org/integration/autoland/rev/f3a91d82cca84a81609f5c5e9bf827198b8c9ebc

Perfherder has detected a talos performance change from push f3a91d82cca84a81609f5c5e9bf827198b8c9ebc.

If you have any questions, please reach out to a performance sheriff. Alternatively, you can find help on Slack by joining #perf-help, and on Matrix you can find help by joining #perftest.

Improvements:

Ratio Test Platform Options Absolute values (old vs new)
56% ts_paint windows11-64-24h2-shippable e10s fission stylo webrender 725.50 -> 320.92
53% sessionrestore_no_auto_restore windows11-64-24h2-shippable e10s fission stylo webrender 762.50 -> 358.08
52% sessionrestore windows11-64-24h2-shippable e10s fission stylo webrender 740.33 -> 354.67
47% startup_about_home_paint_cached windows11-64-24h2-shippable e10s fission stylo webrender 869.25 -> 458.83
47% startup_about_home_paint windows11-64-24h2-shippable e10s fission stylo webrender 870.67 -> 463.17
46% startup_about_home_paint_realworld_webextensions windows11-64-24h2-shippable e10s fission stylo webrender 891.00 -> 483.67

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a performance sheriff to do that for you.

You can run all of these tests on try with ./mach try perf --alert 45210

The following documentation link provides more information about this command.

Keywords: perf-alert
Status: REOPENED → RESOLVED
Closed: 1 year ago1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 141 Branch
No longer depends on: 1967046
Depends on: 1967046
Pushed by pstanciu@mozilla.com: https://github.com/mozilla-firefox/firefox/commit/0739bb7ed078 https://hg.mozilla.org/integration/autoland/rev/584aadc43139 Revert "Bug 1966716: Use USER_LIMITED for the Windows GPU process sandbox on Nightly. r=handyman" as requested
Status: RESOLVED → REOPENED
Flags: needinfo?(bobowencode)
Resolution: FIXED → ---
Target Milestone: 141 Branch → ---

It looks like this is down to people having user only fonts in places other than the standard directory.
This could be manually done or more likely to be third party font managers.
This would not normally cause issues as the content process can't access these either (issues other than not being able to use those fonts).

However, dropdown/combo boxes still have their layout done in the parent, so it can access these fonts.
So, if they are used the GPU process can't find them causing either garbled fonts or crashes.
Even if it is possible moving the layout back to the content process, doesn't seem to be trivial.

The simplest solution is to add policy rules to allow read access to either the fonts or their dirs.
While not ideal, only the user, SYSTEM and Administrators have write access to the registry key, so to add malicious entry an attacker would already have user or higher access to the system.

Flags: needinfo?(bobowencode)
Depends on: 1977201
No longer regressions: 1967046
No longer regressions: 1967485
Pushed by csabou@mozilla.com: https://github.com/mozilla-firefox/firefox/commit/8f85432344cf https://hg.mozilla.org/integration/autoland/rev/97ed289b2fc7 Revert "Bug 1966716: Use USER_LIMITED for the Windows GPU process sandbox on Nightly. r=handyman" for causing marionetter failures on test_refresh_firefox.py

Backed out for causing marionette failures on test_refresh_firefox.py

Push with failures

Failure log

Backout link

Flags: needinfo?(bobowencode)
Depends on: 1986646
Flags: needinfo?(bobowencode)
Status: REOPENED → RESOLVED
Closed: 1 year ago9 months ago
Resolution: --- → FIXED
Target Milestone: --- → 144 Branch

:bobowen, anything you want to mention here in a release note? (Process info)
We could include it in the nightly only release notes if you wanted some extra visibility.

Flags: needinfo?(bobowencode)

(In reply to Donal Meehan [:dmeehan] from comment #25)

:bobowen, anything you want to mention here in a release note? (Process info)
We could include it in the nightly only release notes if you wanted some extra visibility.

Possibly, but I'll wait a couple of days to make sure it sticks and the existing issues on previous attempts seem to be fixed.
There isn't anything specific people can do testing-wise, apart from look out for GPU process crashes and possibly rendering quirks, so we'll have to explain that, but I guess people using Nightly are more used to looking out for such things.

Flags: needinfo?(bobowencode)
Flags: needinfo?(bobowencode)
Regressions: 1988709
QA Whiteboard: [qa-triage-done-c145/b144]
No longer regressions: 1988709

I'm going to wait until I'm back to think about this again.

Flags: needinfo?(bobowencode)
Blocks: 1998122
Regressions: 2001403
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: