Open Bug 1410716 Opened 2 years ago Updated 1 year ago

Inefficient handing of node with 1,000s of applied rules

Categories

(DevTools :: Inspector: Rules, defect, P2, major)

58 Branch
x86_64
All
defect

Tracking

(firefox-esr52 affected)

Tracking Status
firefox-esr52 --- affected

People

(Reporter: gwarser, Unassigned)

References

(Blocks 2 open bugs, )

Details

(Keywords: memory-footprint)

Attachments

(3 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:58.0) Gecko/20100101 Firefox/58.0
Build ID: 20171021100029

Steps to reproduce:

- go to https://browserleaks.com/css
- right click on table cell with your desktop horizontal or vertical size to the right of "min-width" or "min-height"
- select "Inspect Element"



Actual results:

My cpu goes up to 100% and memory usage starts rising 10MB/s. This goes on for about 3 minutes, then for about 2m memory stays leveled, and after that memory and cpu goes back to normal levels. 
See attached profiling and memory reports.


Expected results:

I expect low cpu and memory usage.
This was reproduced on clear ESR and Nightly. It is reproducible on stable and beta.
Reproduced with FF57b10 on Win10 x64 1709 Fall Creator's Update.

Captured the issue as it was happening too. https://streamable.com/8ubwc
Component: Untriaged → Developer Tools: Inspector
Severity: normal → major
Status: UNCONFIRMED → NEW
Has Regression Range: --- → yes
Has STR: --- → yes
Ever confirmed: true
Keywords: footprint, mlk, topmlk
OS: Unspecified → All
Hardware: Unspecified → x86_64
Summary: Inefficient resources usage when inspecting table cell on browserleaks.com/css → Massive memory leak when inspecting table cell on browserleaks.com/css
This specific page uses an unusual CSS file:

https://browserleaks.com/css/media-queries.css

which includes 100s (maybe 1000s) of media queries, sometimes separate queries at 1px increments, to detect various features, including min-width.
I tried to profile this, but for some reason it seems these steps also affect the profiler's ability to gather data, as it never displayed anything (it did show a timeout message).

With DevTools logging enabled, I can see that inspecting the min-width node causes ~2,400 domstylerule actors to be created and serialized over to client in a very giant message from the pagestyle actor, most likely in response to `getApplied` for this node.

The doesn't seem like a memory _leak_.  The original reporter states that memory levels return to normal after some time.

We just appear to process _lots_ of data and create _lots_ of objects.  So, ideally we'd like to reduce the amount of data involved and / or process it more quickly.
Keywords: mlk, topmlk
Summary: Massive memory leak when inspecting table cell on browserleaks.com/css → Inefficient handing of node with 1,000s of applied rules
This may also be related to bug 1244661 and bug 1412319, which are 2 other bugs about the rule-view being slow when a lot of data is being retrieved and displayed.
Component: Developer Tools: Inspector → Developer Tools: CSS Rules Inspector
Priority: -- → P2
I have no idea if that's the only source of issues (probably not), but here's one thing I discovered with slightly reduced media-queries.css (reduced enough to not kill the system, while still producing observable slowdowns):

One of the observable effects is a sudden RAM usage spike that drops fairly quickly.
It seems to originate here:
https://searchfox.org/mozilla-central/source/devtools/server/actors/styles.js#1584

When I return just before creating the CSSLexer:

> return {offset: 1234, text: 'rule'};
> let lexer = InspectorUtils.getCSSLexer(text);

The RAM spike disappears.

`text` isn't a big string (up to the CSS file size - in the case of media-queries.css, up to 1MB), but the function is called for every single rule - so I'm guessing it's caused by lexers not being freed fast enough?
Product: Firefox → DevTools
After getCSSLexer was converted to JS in bug 1418874, the performance appears to have improved a lot - I don't see any RAM usage spikes, and the lag doesn't exceed a couple seconds.

It would probably be good to monitor behavior on this page, to make sure it won't accidentally regress when CSS lexer is changed again in bug 1410184.
(In reply to Adrian Wielgosik [:adrian17] from comment #10)
> After getCSSLexer was converted to JS in bug 1418874

Are you sure it is because of this patch?

https://browserleaks.com/css was already quite usable on yesterday's nightly.
Also bug 1418874 regress our performance test. And I just looked at them and they are stressing this "getRuleText" method from styles.js.
So I'm confused here. Is browserleaks.com still your main test case? Are you sure it wasn't fixed before?
It's not impossible that moving to JS version of getCSSLexer fixes the memory issue, because when we invoke the CSS Lexer in C++, and the source is stored as single-byte string in JS engine, the binding code would need to copy the string into a two-byte (UTF-16) string for C++ code to consume, which means duplicate a 1MB string into a 2MB string.

Further more, if the lexer is not automatically released immediately after getRuleText returns (maybe due to GC not being aware of the large memory, since it's managed by C++ not JS engine), you can quickly pile up the 2MB strings for unreleased lexers, until the lexers eventually get released by the GC.

However, when the lexer is implemented in JS, there is no need for the string conversion anymore, and thus no extra allocation. The lexer can enjoy all the string optimizations JS engine deploys.

Having said that, it is indeed possible that converting to WASM regresses it again, but maybe in rarer cases, because Rust processes UTF-8 string. If the input string is fully ASCII (like most of the cases a CSS text should be), there is no conversion needed (although a check may be necessary).
Further thought on the WASM one, it may regress in terms of performance, because we may need to duplicate string in some cases, but it's probably fine on memory if JS engine is aware of memory allocated for WASM (ideally).

The real fix for this problem is probably on the devtools side that it should probably avoid repeatedly creating CSS lexer if it's crossing language boundary.
(In reply to Alexandre Poirot [:ochameau] from comment #11)
> (In reply to Adrian Wielgosik [:adrian17] from comment #10)
> > After getCSSLexer was converted to JS in bug 1418874
> 
> Are you sure it is because of this patch?

Admittedly, I do now know whether past patches improved this case already - that may be the case.
The runtime improvement I observed yesterday also isn't big (judging by eye, 10%).

However, I have verified that commit e10c1e086d80 from bug 1418874 removed a big RAM usage spike I observed before. So that may have improved this case a lot for machines with limited RAM.
> I do now know whether
"I do not know".

Also, I profiled this case with newer FF build: https://perf-html.io/public/ae49a211a5e043d3cf74eb9bdd92482cf45a46a7/calltree/?hiddenThreads=2-3&implementation=js&thread=4&threadOrder=0-2-3-4-1&v=3

It appears that there are still slow spots on the content process side, like `_computeRuleIndex`.
You need to log in before you can comment on or make changes to this bug.