Open Bug 1796059 Opened 2 years ago Updated 1 year ago

4.75 - 3.63% Heap Unclassified / Heap Unclassified (Windows) regression on Fri October 14 2022

Categories

(WebExtensions :: General, defect, P3)

defect

Tracking

(firefox-esr102 unaffected, firefox106 unaffected, firefox107 wontfix, firefox108 wontfix)

REOPENED
Tracking Status
firefox-esr102 --- unaffected
firefox106 --- unaffected
firefox107 --- wontfix
firefox108 --- wontfix

People

(Reporter: bacasandrei, Unassigned)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression)

Perfherder has detected a awsy performance regression from push 00ee36d3c47b8bfdc555af9954df12935d792f8a. As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio Test Platform Options Absolute values (old vs new)
5% Heap Unclassified windows10-64-2004-shippable-qr fission tp6 90,938,739.42 -> 95,255,419.15
4% Heap Unclassified windows10-64-2004-shippable-qr fission tp6 91,325,173.74 -> 94,640,847.06

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) may be backed out in accordance with our regression policy.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(nika)

Set release status flags based on info from the regressing bug 1793995

:mixedpuppy could this be triaged for impact on 107?

Flags: needinfo?(mixedpuppy)

I chatted with :kmag a bit about this, and he seems to be of the opinion that we should accept the extra potential memory usage overhead of rust regexes. Redirecting the ni? to him for input and/or next steps here.

Flags: needinfo?(nika) → needinfo?(kmaglione+bmo)

Yeah, sorry, I forgot to comment last week.

We looked into this, and determined that the overhead is only in the main process, not per-process, so it's not as big of an issue as it could be. I think, given the performance improvement this should give us, along with the added flexibility and performance improvements of being able to access principals off-main-thread, it's worth the extra overhead.

That said, we can probably improve the situation somewhat. Most of the patterns we're compiling when we run in automation are for request matching for the webcompat add-on. There are a lot of patterns that we can probably optimize there.

The rust regex engine supports regex sets, which we could fairly easily use for MatchGlobSet, which would hopefully reduce memory usage somewhat. We could also optimize MatchPatternSet for the cases where we have multiple patterns with the same path glob but different hostnames, or the same hostname but different path globs, to coalesce things where possible. In the former case, we'd share the same compiled glob across different patterns or coalesce the host matchers into a single MatchPattern. In the latter, we'd compile the multiple glob patterns into a single MatchGlobSet in a single MatchPattern. We would probably improve both memory and performance at the same time with either of those changes.

Long term, I'd also ideally like to move static content script matching to the parent side of DocumentChannel (with the exception of scripts that run on about:blank and probably data: URIs), so we don't need to send the patterns to content processes at all. It would also let us predict and pre-compile scripts earlier in the request cycle (which we did more effectively pre-Fission), and send the compiled stencil data from the parent process to the content process. That would also ideally allow us to store that stencil data in shared memory, so it could be shared across all content processes which used a script (which, again, we handled better prior to Fission).

Shorter term, we could probably save a bit on the content script matching front by only compiling patterns which could match the origin of a given origin isolated content process.

That said, none of that would help the regression we're seeing in automation, only regressions we'd see in the real world.

Flags: needinfo?(kmaglione+bmo)

Sounds like this is a wontfix based on the most recent comments. Feel free to reopen if we're intending to investigate further still, though.

Status: NEW → RESOLVED
Closed: 1 year ago
Flags: needinfo?(mixedpuppy)
Resolution: --- → WONTFIX

I'd still like to investigate the improvements suggested in comment 4. I'd be willing to accept the regression for the sake of the wins it gives us if it can't be fixed, but I think there are probably several ways to significantly improve the situation without too much effort

Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Severity: -- → N/A
Type: defect → task
Priority: -- → P3

This bug has the keyword regression, so its type should be defect.

Type: task → defect
Severity: N/A → S3
You need to log in before you can comment on or make changes to this bug.