Closed Bug 1426100 Opened 3 years ago Closed 2 years ago

Stop processing native events in the content process on Mac

Categories

(Core :: DOM: Content Processes, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
mozilla63
Tracking Status
firefox63 --- fixed

People

(Reporter: bobowen, Assigned: Alex_Gaynor)

References

(Blocks 2 open bugs)

Details

Attachments

(1 file, 1 obsolete file)

I'm still trying to track down the remaining two regressions for bug 1423628.
However those regressions are Windows only, so we should be fine turning off native event loop processing on Mac and Linux.
It looks like there is a fairly big performance hit on Linux for quite a few tests [1].
So, I think we need to at least try and work out what's going on before we turn it off there.
Mac looks much better though.

[1] https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=a4f7888d7b39&newProject=try&newRevision=f64c8da4f5e21b99ed99663d93b2c4fd07bfeabb&framework=1&showOnlyConfident=1
Summary: Stop processing native events in the content process on Mac and Linux → Stop processing native events in the content process on Mac
Looks like I spoke too soon over Mac, I was slightly concerned about the content-process-startup results, so I did some more retriggers and it definitely seems to have regressed quite significantly.
Assignee: bobowencode → nobody
Status: ASSIGNED → NEW
Priority: P1 → P2
Blocks: 1381022, 1350432
No longer blocks: 1423628
Blocks: 1467758
Assignee: nobody → agaynor
I sat down to do some content startup profiling, while kicking off a fresh talos run. Drumroll please...

https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&newProject=try&newRevision=c25d3fc57897c77d25d9a2541ea40c97ab55d0d8&framework=1&showOnlyComparable=1&selectedTimeRange=172800

Something apparently changed in the last few months, because there's no significant regressions. Most of the metrics are very low confidence. Should I be concerned that all of these are low confidence, or is that just what happens when there's no significance to a change?

Since we have a long time until the next central->beta merge, I'm going to put a patch to just change the pref, and not make it conditional on NIGHTLY_OR_EARLY_BETA; we'll have plenty of time to flip it back if we need to back it out.
One thing of note, this does not appear to close _all_ WindowServer connections; so while I'd like to get this landed, I also want to expend a bit more energy understanding what's going on.

For comparison:


Firefox Nightly:
~/p/mozilla-central ❯❯❯ sudo lsmp -p 63721 | grep -i -E "(windowserver|ipc-)"
  name      ipc-object    rights     flags   boost  reqs  recv  send sonce oref  qlimit  msgcount  context            identifier  type
0x00003603  0x621fb72d  send        --------        ---            1        ->        5         0  0x0000000000000000 0x00001803  (197) WindowServer
                  +     send        --------        D--            1         <-                                       0x00039463  (197) WindowServer
0x00003903  0x81fc2255  send        --------        ---            2        ->     1024         0  0x0000000000000000 0x0006a203  (197) WindowServer
0x00006d07  0x621fb145  send        --------        D--            1        ->        5         0  0x0000000000000000 0x00002403  (197) WindowServer
0x00007507  0x7c044795  send        --------        ---            1        ->        5         0  0x0000000000000000 0x00056fab  (197) WindowServer
                  +     send        --------        D--            1         <-                                       0x0004c31f  (197) WindowServer


Local build with native event processing disabled:
~/p/mozilla-central ❯❯❯ sudo lsmp -p 79653 | grep -i -E "(windowserver|ipc-)"
  name      ipc-object    rights     flags   boost  reqs  recv  send sonce oref  qlimit  msgcount  context            identifier  type
0x00003507  0x621fb72d  send        --------        ---            1        ->        5         0  0x0000000000000000 0x00001803  (197) WindowServer
0x00006403  0x8126da35  send        --------        ---            2        ->     1024         0  0x0000000000000000 0x0005ea97  (197) WindowServer
                  +     send        --------        D--            1         <-                                       0x00047bfb  (197) WindowServer


Chrome Canary:
~/p/mozilla-central-perf ❯❯❯ sudo lsmp -p 76050 | grep -i -E "(windowserver|ipc-)"
  name      ipc-object    rights     flags   boost  reqs  recv  send sonce oref  qlimit  msgcount  context            identifier  type
0x00006d9f  0x76147ee5  send        --------        D--            1        ->        5         0  0x0000000000000000 0x00053237  (197) WindowServer
0x00009a0f  0x5d27383d  send        --------        ---            1        ->        6         0  0x0000000000000000 0x00014103  (197) WindowServer
This enables us to remove access to the windowserver mach service from content
processes. The windowserver is considered a significant risk for sandbox
escapes because it has a large attack surface (>600 IPC methods) and runs as
uid=root.
Comment on attachment 8994285 [details]
Bug 1426100 - disable native event process in content process on macOS; r?haik

Haik Aftandilian [:haik] has approved the revision.

https://phabricator.services.mozilla.com/D2302
Attachment #8994285 - Flags: review+
Priority: P2 → P1
Attachment #8994285 - Attachment is obsolete: true
We originally thought that this would enable us to disconnect from the
windowserver local service (which is a significant sandbox escape risk),
however investigations revealed that that requires changes to WebGL and thus
will be handled separately.

This also corrects an incorrect usage of the (undocumented) APIs for closing
windowserver connections. If CGSSetDenyWindowServerConnections is called while
there are open connections it is a no-op, so it must be called after
disconnecting any open connections.
Lost track of the try result, but tests are green and perfherder shows no regressions.
Comment on attachment 8995630 [details]
Bug 1426100 - disable native event processing in content processes on macOS; r?haik

Haik Aftandilian [:haik] has approved the revision.

https://phabricator.services.mozilla.com/D2478
Attachment #8995630 - Flags: review+
Keywords: checkin-needed
Pushed by archaeopteryx@coole-files.de:
https://hg.mozilla.org/integration/autoland/rev/edfca01b39ab
disable native event processing in content processes on macOS; r=haik
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/edfca01b39ab
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
Depends on: 1480992
Noticeable perf wins thanks to this bug:

== Change summary for alert #14762 (as of Fri, 03 Aug 2018 12:27:24 GMT) ==

Improvements:

  5%  tp5o_webext osx-10-10 opt e10s stylo     384.06 -> 363.76
  5%  tscrollx osx-10-10 opt e10s stylo        2.90 -> 2.76

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=14762
Depends on: 1481304
You need to log in before you can comment on or make changes to this bug.