Closed Bug 1416359 Opened 2 years ago Closed 2 years ago

0.87 - 3.76% sessionrestore_many_windows / sessionrestore_no_auto_restore (linux64, windows7-32) regression on push 75e7f32c336501a698e618667ab180abc9ff6e84 (Thu Nov 9 2017)

Categories

(Core :: Security: Process Sandboxing, defect)

53 Branch
defect
Not set

Tracking

()

RESOLVED WONTFIX

People

(Reporter: jmaher, Unassigned)

References

Details

(Keywords: perf, regression, talos-regression)

Talos has detected a Firefox performance regression from push:

https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=75e7f32c336501a698e618667ab180abc9ff6e84

As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

  4%  sessionrestore_no_auto_restore linux64 pgo e10s     479.67 -> 497.69
  2%  sessionrestore_many_windows linux64 pgo e10s        1,533.96 -> 1,565.58


You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=10460

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests

For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
:jfkthame, I see you authored the patches in bug 1412090, they appear to have caused a small slowdown in some of our sessionrestore perf tests, specifically on linux.

could you take a look at this and determine if there is a fix?
Component: Untriaged → Security: Process Sandboxing
Flags: needinfo?(jfkthame)
Product: Firefox → Core
Bug 1412090 affects content-process startup, in that instead of just asking the fontconfig library for the set of available font patterns (which fontconfig most likely has in cache, and thus can return pretty quickly), we get the list via IPC from the chrome process. We have to do this because sandboxing means that we can't trust fontconfig's answer, we need the chrome process to filter the list in view of sandbox restrictions so that we don't end up with inaccessible fonts in the list that content thinks it should be using.

It makes sense that building the font list this way in content is a bit slower, because we're pushing each of the (typically several hundred) available font patterns, each of which is an object with a couple dozen properties -- mostly integers, bools, and strings, though some like charset have substantial internal structure of their own -- through a serialization and deserialization process in order to end up with the FcPatterns we need in the content process. That's bound to be slower than just asking fontconfig for a copy of what it already has in cache.

An alternative implementation would be to test the accessibility of the font files directly within the content process, but this is likely to be worse, because that file access would itself be expensive (bug 1412090 comment 17).

Bottom line is that I think this is a price we have to pay for the content-process sandboxing, which means we can't just let the content process directly rely on a service (fontconfig) that doesn't know about the sandbox restrictions that are in play.

So IMO this is WONTFIX, but let's see if :jimm or others have additional thoughts...
Flags: needinfo?(jfkthame) → needinfo?(jmathies)
2 notes:

1) We discussed extending the sandbox to whitelist the fonts on enumeration, but this doesn't work because fontconfig in content picks up fonts dynamically, after we have launched and can no longer modify the sandbox - and remember the original bug was people downloading fonts, so very much that use case.

2) If we're getting rid of all IO in content then obviously the fonts themselves will have to go over IPC at some point in the future.
We're willing to accept this as a acceptable regression related to font loading on linux and security sandboxing.
Status: NEW → RESOLVED
Closed: 2 years ago
Flags: needinfo?(jmathies)
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.