mozilla::dom::LargestContentfulPaint::MaybeProcessImageForElementTiming shows up in SP3 profiles
Categories
(Core :: DOM: Performance APIs, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox130 | --- | fixed |
People
(Reporter: jrmuizel, Assigned: sefeng)
References
(Blocks 1 open bug)
Details
(Keywords: perf-alert, Whiteboard: [sp3])
Attachments
(1 file)
We spend about 0.13% of SP3 running MaybeProcessImageForElementTiming.
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
Reporter | ||
Comment 1•1 year ago
|
||
Sean, can you give an overview of what this function is doing? Is it expected that this would be showing up during the Speedometer3 measured time?
Assignee | ||
Comment 2•1 year ago
|
||
Basically whenever an image is completely loaded
, we call this function for the image to do a record. And these records are used to figure out which images have been loaded and the load time of these images.
I knew they'd be called often, so it's not surprising for me to see them in the profile. However I'd love to avoid having them showing in the measurement. I need to think and play the code a bit to see if we can avoid them. I keep my NI for this.
Comment 3•1 year ago
|
||
LCPImageEntryKey calls do_GetWeakReference
https://searchfox.org/mozilla-central/rev/47b65cbe249613b9af936cd4660789bb642a8e30/dom/performance/LargestContentfulPaint.h#28
on construction. Since we construct these items as temporaries to do look up and inserts this is one big contributor in the profile above.
I wrote a patch to make these raw ptrs that we never deference, this means that we need to add the weak ptrs to ImagePendingRendering (because we do need to dereference them), slightly increasing memory use. That patch makes us go from ~120 samples in stacks matching Largest*, to ~75 samples on ~95000 total samples (10 runs of sp3). Which is approximately a 0.05% win. I can post the patch for review if we think this is a good trade off.
Comment 4•1 year ago
|
||
Hmm, actually that idea would cause collisions if an object was destroyed and then another object was created at the same address. So I guess that design wouldn't work. There must be a good pattern to avoid do_GetWeakReference for every hashtable lookup somehow.
Updated•1 year ago
|
Assignee | ||
Comment 5•1 year ago
|
||
Two changes are introduced:
-
LCP has a hashtable called
ContentIdentifiersForLCP
to store
(element, image) pairs to avoid processing the same pair multiple
times. Instead of using weak pointers for elements, this patch changes
it to use raw pointers for better performance. Also, makes the
hashtable to use element alone as the key, so that the entry
can be quickly removed when nsINode::LastRelease is called. -
Another change to make imgRequestProxy stores the
timestamps for LCP, so we can create LCP entries when the size
is available, instead of creating one temporarily and updating it
later. This allows us to eliminate a hashtable called
mImageLCPEntryMap
for better performance.
Updated•1 year ago
|
Comment 7•11 months ago
|
||
bugherder |
Assignee | ||
Comment 8•11 months ago
|
||
The landed patch probably won't trigger any perf wins alert, however it should make these https://treeherder.mozilla.org/perfherder/comparesubtest?originalProject=try&newProject=try&newRevision=8e504a82842ce4b879db246a7844df8bea8ffa16&originalSignature=5131097&newSignature=5131097&framework=13&application=firefox&originalRevision=0bfdbd96a77d94ec50d2c407703b23d1b895a6c8&page=1 subtests improvement.
Comment 9•11 months ago
|
||
(In reply to Pulsebot from comment #6)
Pushed by sefeng@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/942169a00ae4
Simplify and reduce the number of hashtable lookups for LCP implementation
r=emilio
Perfherder has detected a browsertime performance change from push 942169a00ae469a2ffe466c8ef6fb92663b138dc.
Improvements:
Ratio | Test | Platform | Options | Absolute values (old vs new) | Performance Profiles |
---|---|---|---|---|---|
64% | nytimes largestContentfulPaint | linux1804-64-shippable-qr | cold fission webrender | 1,814.38 -> 647.81 | Before/After |
64% | nytimes largestContentfulPaint | macosx1015-64-shippable-qr | cold fission webrender | 1,225.16 -> 446.47 | Before/After |
53% | nytimes largestContentfulPaint | windows11-64-shippable-qr | cold fission webrender | 941.39 -> 447.00 | Before/After |
18% | speedometer3 TodoMVC-WebComponents/DeletingAllItems/Async | linux1804-64-shippable-qr | fission webrender | 0.94 -> 0.78 | Before/After |
15% | speedometer3 TodoMVC-WebComponents/DeletingAllItems/Async | linux1804-64-nightlyasrelease-qr | fission webrender | 0.95 -> 0.81 | Before/After |
... | ... | ... | ... | ... | ... |
2% | speedometer3 TodoMVC-WebComponents/CompletingAllItems/Async | windows11-64-shippable-qr | fission webrender | 2.63 -> 2.57 | Before/After |
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.
If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.
You can run these tests on try with ./mach try perf --alert 1509
For more information on performance sheriffing please see our FAQ.
Description
•