Closed Bug 1426100 Opened 3 years ago Closed 2 years ago
Stop processing native events in the content process on Mac
46 bytes, text/x-phabricator-request
|Details | Review|
I'm still trying to track down the remaining two regressions for bug 1423628. However those regressions are Windows only, so we should be fine turning off native event loop processing on Mac and Linux.
It looks like there is a fairly big performance hit on Linux for quite a few tests . So, I think we need to at least try and work out what's going on before we turn it off there. Mac looks much better though.  https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=a4f7888d7b39&newProject=try&newRevision=f64c8da4f5e21b99ed99663d93b2c4fd07bfeabb&framework=1&showOnlyConfident=1
Summary: Stop processing native events in the content process on Mac and Linux → Stop processing native events in the content process on Mac
Looks like I spoke too soon over Mac, I was slightly concerned about the content-process-startup results, so I did some more retriggers and it definitely seems to have regressed quite significantly.
Assignee: bobowencode → nobody
Status: ASSIGNED → NEW
Priority: P1 → P2
I sat down to do some content startup profiling, while kicking off a fresh talos run. Drumroll please... https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&newProject=try&newRevision=c25d3fc57897c77d25d9a2541ea40c97ab55d0d8&framework=1&showOnlyComparable=1&selectedTimeRange=172800 Something apparently changed in the last few months, because there's no significant regressions. Most of the metrics are very low confidence. Should I be concerned that all of these are low confidence, or is that just what happens when there's no significance to a change? Since we have a long time until the next central->beta merge, I'm going to put a patch to just change the pref, and not make it conditional on NIGHTLY_OR_EARLY_BETA; we'll have plenty of time to flip it back if we need to back it out.
One thing of note, this does not appear to close _all_ WindowServer connections; so while I'd like to get this landed, I also want to expend a bit more energy understanding what's going on. For comparison: Firefox Nightly: ~/p/mozilla-central ❯❯❯ sudo lsmp -p 63721 | grep -i -E "(windowserver|ipc-)" name ipc-object rights flags boost reqs recv send sonce oref qlimit msgcount context identifier type 0x00003603 0x621fb72d send -------- --- 1 -> 5 0 0x0000000000000000 0x00001803 (197) WindowServer + send -------- D-- 1 <- 0x00039463 (197) WindowServer 0x00003903 0x81fc2255 send -------- --- 2 -> 1024 0 0x0000000000000000 0x0006a203 (197) WindowServer 0x00006d07 0x621fb145 send -------- D-- 1 -> 5 0 0x0000000000000000 0x00002403 (197) WindowServer 0x00007507 0x7c044795 send -------- --- 1 -> 5 0 0x0000000000000000 0x00056fab (197) WindowServer + send -------- D-- 1 <- 0x0004c31f (197) WindowServer Local build with native event processing disabled: ~/p/mozilla-central ❯❯❯ sudo lsmp -p 79653 | grep -i -E "(windowserver|ipc-)" name ipc-object rights flags boost reqs recv send sonce oref qlimit msgcount context identifier type 0x00003507 0x621fb72d send -------- --- 1 -> 5 0 0x0000000000000000 0x00001803 (197) WindowServer 0x00006403 0x8126da35 send -------- --- 2 -> 1024 0 0x0000000000000000 0x0005ea97 (197) WindowServer + send -------- D-- 1 <- 0x00047bfb (197) WindowServer Chrome Canary: ~/p/mozilla-central-perf ❯❯❯ sudo lsmp -p 76050 | grep -i -E "(windowserver|ipc-)" name ipc-object rights flags boost reqs recv send sonce oref qlimit msgcount context identifier type 0x00006d9f 0x76147ee5 send -------- D-- 1 -> 5 0 0x0000000000000000 0x00053237 (197) WindowServer 0x00009a0f 0x5d27383d send -------- --- 1 -> 6 0 0x0000000000000000 0x00014103 (197) WindowServer
This enables us to remove access to the windowserver mach service from content processes. The windowserver is considered a significant risk for sandbox escapes because it has a large attack surface (>600 IPC methods) and runs as uid=root.
Comment on attachment 8994285 [details] Bug 1426100 - disable native event process in content process on macOS; r?haik Haik Aftandilian [:haik] has approved the revision. https://phabricator.services.mozilla.com/D2302
Attachment #8994285 - Flags: review+
We originally thought that this would enable us to disconnect from the windowserver local service (which is a significant sandbox escape risk), however investigations revealed that that requires changes to WebGL and thus will be handled separately. This also corrects an incorrect usage of the (undocumented) APIs for closing windowserver connections. If CGSSetDenyWindowServerConnections is called while there are open connections it is a no-op, so it must be called after disconnecting any open connections.
Lost track of the try result, but tests are green and perfherder shows no regressions.
Comment on attachment 8995630 [details] Bug 1426100 - disable native event processing in content processes on macOS; r?haik Haik Aftandilian [:haik] has approved the revision. https://phabricator.services.mozilla.com/D2478
Attachment #8995630 - Flags: review+
Pushed by email@example.com: https://hg.mozilla.org/integration/autoland/rev/edfca01b39ab disable native event processing in content processes on macOS; r=haik
Noticeable perf wins thanks to this bug: == Change summary for alert #14762 (as of Fri, 03 Aug 2018 12:27:24 GMT) == Improvements: 5% tp5o_webext osx-10-10 opt e10s stylo 384.06 -> 363.76 5% tscrollx osx-10-10 opt e10s stylo 2.90 -> 2.76 For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=14762
You need to log in before you can comment on or make changes to this bug.