Closed Bug 1874756 Opened 1 year ago Closed 11 months ago

mozilla::dom::LargestContentfulPaint::MaybeProcessImageForElementTiming shows up in SP3 profiles

Categories

(Core :: DOM: Performance APIs, defect)

defect

Tracking

()

RESOLVED FIXED
130 Branch
Tracking Status
firefox130 --- fixed

People

(Reporter: jrmuizel, Assigned: sefeng)

References

(Blocks 1 open bug)

Details

(Keywords: perf-alert, Whiteboard: [sp3])

Attachments

(1 file)

We spend about 0.13% of SP3 running MaybeProcessImageForElementTiming.

https://share.firefox.dev/41XDYCu

Blocks: speedometer3
Whiteboard: [sp3]
Component: Performance → DOM: Performance
See Also: → 1874765

Sean, can you give an overview of what this function is doing? Is it expected that this would be showing up during the Speedometer3 measured time?

Flags: needinfo?(sefeng)

Basically whenever an image is completely loaded, we call this function for the image to do a record. And these records are used to figure out which images have been loaded and the load time of these images.

I knew they'd be called often, so it's not surprising for me to see them in the profile. However I'd love to avoid having them showing in the measurement. I need to think and play the code a bit to see if we can avoid them. I keep my NI for this.

Depends on: 1875427

LCPImageEntryKey calls do_GetWeakReference
https://searchfox.org/mozilla-central/rev/47b65cbe249613b9af936cd4660789bb642a8e30/dom/performance/LargestContentfulPaint.h#28
on construction. Since we construct these items as temporaries to do look up and inserts this is one big contributor in the profile above.

I wrote a patch to make these raw ptrs that we never deference, this means that we need to add the weak ptrs to ImagePendingRendering (because we do need to dereference them), slightly increasing memory use. That patch makes us go from ~120 samples in stacks matching Largest*, to ~75 samples on ~95000 total samples (10 runs of sp3). Which is approximately a 0.05% win. I can post the patch for review if we think this is a good trade off.

Flags: needinfo?(jmuizelaar)

Hmm, actually that idea would cause collisions if an object was destroyed and then another object was created at the same address. So I guess that design wouldn't work. There must be a good pattern to avoid do_GetWeakReference for every hashtable lookup somehow.

Flags: needinfo?(jmuizelaar)
Depends on: 1877787

Two changes are introduced:

  • LCP has a hashtable called ContentIdentifiersForLCP to store
    (element, image) pairs to avoid processing the same pair multiple
    times. Instead of using weak pointers for elements, this patch changes
    it to use raw pointers for better performance. Also, makes the
    hashtable to use element alone as the key, so that the entry
    can be quickly removed when nsINode::LastRelease is called.

  • Another change to make imgRequestProxy stores the
    timestamps for LCP, so we can create LCP entries when the size
    is available, instead of creating one temporarily and updating it
    later. This allows us to eliminate a hashtable called
    mImageLCPEntryMap for better performance.

Assignee: nobody → sefeng
Status: NEW → ASSIGNED
Pushed by sefeng@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/942169a00ae4 Simplify and reduce the number of hashtable lookups for LCP implementation r=emilio
Status: ASSIGNED → RESOLVED
Closed: 11 months ago
Resolution: --- → FIXED
Target Milestone: --- → 130 Branch

(In reply to Pulsebot from comment #6)

Pushed by sefeng@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/942169a00ae4
Simplify and reduce the number of hashtable lookups for LCP implementation
r=emilio

Perfherder has detected a browsertime performance change from push 942169a00ae469a2ffe466c8ef6fb92663b138dc.

Improvements:

Ratio Test Platform Options Absolute values (old vs new) Performance Profiles
64% nytimes largestContentfulPaint linux1804-64-shippable-qr cold fission webrender 1,814.38 -> 647.81 Before/After
64% nytimes largestContentfulPaint macosx1015-64-shippable-qr cold fission webrender 1,225.16 -> 446.47 Before/After
53% nytimes largestContentfulPaint windows11-64-shippable-qr cold fission webrender 941.39 -> 447.00 Before/After
18% speedometer3 TodoMVC-WebComponents/DeletingAllItems/Async linux1804-64-shippable-qr fission webrender 0.94 -> 0.78 Before/After
15% speedometer3 TodoMVC-WebComponents/DeletingAllItems/Async linux1804-64-nightlyasrelease-qr fission webrender 0.95 -> 0.81 Before/After
... ... ... ... ... ...
2% speedometer3 TodoMVC-WebComponents/CompletingAllItems/Async windows11-64-shippable-qr fission webrender 2.63 -> 2.57 Before/After

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run these tests on try with ./mach try perf --alert 1509

For more information on performance sheriffing please see our FAQ.

Keywords: perf-alert
Blocks: 1914992
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: