Open Bug 1927919 Opened 4 days ago Updated 3 days ago

Firefox spends more CPU time resolving style during speedometer 3 TodoMVC-*-Complex-DOM "prepare" steps

Status:

NEW

People

(Reporter: mstange, Unassigned, NeedInfo)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [sp3])

Markus Stange [:mstange]

Reporter

Description

•

4 days ago

•

Edited

Bug 1926423 improved this a lot but there is still a very large difference in CPU time spent on resolving style on these tests.

During the "prepare" step of some of the TodoMVC Complex-DOM tests on sp3, we spend much more CPU time on style resolution than Chrome does.

This is still the biggest contributor to using more CPU time overall (bug 1925359).

You can run the Svelte test here: https://www.browserbench.org/Speedometer3.0/?suite=TodoMVC-Svelte-Complex-DOM&iterationCount=5

Profile of style::parallel::style_trees across all of sp3: https://share.firefox.dev/3C1dPdf

This profile shows around 60% in various sources of overhead:

- self.style_source.clone() during to_applicable_declaration_block
- Allocating boxed rule nodes in ensure_child
- Running the Arc<T> drop implementation in ensure_child, which calls is_static and does the refcount decrement
- StyleSource::read() (maybe in is_read_only_lock(), but unclear) during update_for_node and insert_ordered_rules_with_important

and maybe 30% in what looks like useful work.
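
As background for the clone and drop items: every clone of a reference-counted pointer is an atomic increment on a shared counter, and every drop is an atomic decrement. The sketch below uses `std::sync::Arc` as a stand-in (Servo's own `Arc` additionally does the is_static check mentioned above), and `DeclarationBlock` and `clone_and_count` are hypothetical names for illustration, not code from the style system.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a declaration block shared between rule nodes.
#[allow(dead_code)]
struct DeclarationBlock {
    declarations: Vec<&'static str>,
}

/// Clones the shared pointer `n` times and returns the strong count observed
/// while all clones are alive. Each clone is an atomic increment on the
/// shared counter; dropping the clones performs `n` atomic decrements.
fn clone_and_count(source: &Arc<DeclarationBlock>, n: usize) -> usize {
    let clones: Vec<Arc<DeclarationBlock>> =
        (0..n).map(|_| Arc::clone(source)).collect();
    let count = Arc::strong_count(source);
    drop(clones); // n atomic decrements happen here
    count
}
```

The counter lives next to the shared data, so the first clone or drop on a given block also has to pull that cache line in, which may be part of what the profile attributes to these functions.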

Jira Integration Bot

Updated

•

4 days ago

See Also: → https://mozilla-hub.atlassian.net/browse/SP3-837

Emilio Cobos Álvarez (:emilio)

Comment 1

•

4 days ago

Hmm, I'm a bit confused about the profiler, but:

> self.style_source.clone() during to_applicable_declaration_block

This is not particularly avoidable, but also, how can this be? This code doesn't seem particularly crazy, and there's no way that's more expensive than the actual work we need to do for styling (if we hit that code-path, we're already doing a full selector-match...).

Is there any chance the profiler is somehow coalescing calls to Arc::clone or something?

> Allocating boxed rule nodes in ensure_child

Hmm, there's no easy way those can be unboxed afaict. We could try to recycle them somehow, I guess (not trivial)...
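
One way to picture "recycling" is a free list that hands back previously allocated boxes instead of going to the allocator again. This is only a single-threaded sketch with hypothetical names (`Node`, `NodePool`); the real rule tree is shared across threads, which is presumably a large part of why this is not trivial.

```rust
/// Hypothetical, heavily simplified rule node. The real node carries a style
/// source, a parent pointer, a refcount, and more.
struct Node {
    level: u8,
}

/// Single-threaded free list of boxed nodes (illustration only).
struct NodePool {
    free: Vec<Box<Node>>,
    fresh_allocations: usize,
}

impl NodePool {
    fn new() -> Self {
        NodePool { free: Vec::new(), fresh_allocations: 0 }
    }

    /// Returns a boxed node, reusing a recycled allocation when one exists.
    fn alloc(&mut self, level: u8) -> Box<Node> {
        match self.free.pop() {
            Some(mut node) => {
                node.level = level; // reinitialize the recycled node
                node
            }
            None => {
                self.fresh_allocations += 1;
                Box::new(Node { level })
            }
        }
    }

    /// Hands a node back to the pool instead of freeing its allocation.
    fn recycle(&mut self, node: Box<Node>) {
        self.free.push(node);
    }
}
```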

> StyleSource::read() (maybe in is_read_only_lock() but unclear) during update_for_node and insert_ordered_rules_with_important

So, this one is trivial to test out, by commenting out this release assert or turning it into a debug_assert!:

https://searchfox.org/mozilla-central/rev/1235dca15fb62be4357e9e6d78a0d8751ea173b6/servo/components/style/shared_lock.rs#220
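
For reference, the difference being tested: `assert!` is evaluated in every build, while `debug_assert!` is only evaluated when debug assertions are enabled, which is off by default in optimized builds. A minimal sketch with hypothetical types (`LockId`, `Locked`), not the actual shared_lock.rs code:

```rust
/// Hypothetical stand-in for the identity of a shared lock.
#[derive(PartialEq)]
struct LockId(u32);

/// Hypothetical data guarded by a lock identity.
struct Locked<T> {
    owner: LockId,
    data: T,
}

impl<T> Locked<T> {
    /// Checks the guard in every build, including release builds.
    fn read_checked(&self, guard: &LockId) -> &T {
        assert!(self.owner == *guard, "guard from an unrelated lock");
        &self.data
    }

    /// Checks the guard only when debug assertions are enabled, so the
    /// comparison (and its memory access) disappears from optimized builds.
    fn read_debug_checked(&self, guard: &LockId) -> &T {
        debug_assert!(self.owner == *guard, "guard from an unrelated lock");
        &self.data
    }
}
```

Swapping the first form for the second in a local build and re-profiling would show whether the check itself accounts for the time, or whether the cost just moves to the next access of the same data.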

This codepath is indeed very hot, but I'm a bit skeptical that this is really the culprit, in the sense that the profiles seem to hint at places where we're memory bound (the first time that we touch the relevant ApplicableDeclarationBlock / DeclarationBlock).

I suspect that removing that memory access would just move the CPU time elsewhere? Markus, do you know how exactly this is being measured?

Flags: needinfo?(mstange.moz)

Timothy Nikkel (:tnikkel)

Comment 2

•

3 days ago

Bug 1925335 is another where we see refcount-related things taking what seems like an unusually high amount of time on this same machine. Added to See Also.

Categories

(Core :: CSS Parsing and Computation, defect)

Security

(public)
