Open Bug 1925335 Opened 16 days ago Updated 10 hours ago

Some sp3 tests (especially NewsSite) spend an appreciable amount of time in RasterImage::NotifyDecodeComplete ("notify images")

Categories

(Core :: Graphics: ImageLib, defect)

defect

Tracking

()

People

(Reporter: mstange, Unassigned, NeedInfo)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Whiteboard: [sp3])

NewsSite-Next spends 4% of its time in RasterImage::NotifyDecodeComplete, NewsSite-Nuxt spends 1.1% in it.
I can't find the equivalent work in the Chrome profiles anywhere.

NewsSite-Next: https://share.firefox.dev/3UcDiqv (4753 samples)
NewsSite-Nuxt: https://share.firefox.dev/3Ytgw06 (1051 samples)

Furthermore, some time is spent in AsyncNotifyRunnable::Run.

NewsSite-Next: https://share.firefox.dev/40b4hGy (345 samples)
NewsSite-Nuxt: https://share.firefox.dev/40azYju (309 samples)

If, somehow, we were able to eliminate all of this time completely, we would improve the overall sp3 score by 0.5%.

Nice find.

Wow, basically 3 seconds in addref is most of the first profile. Maybe our observers are more well behaved and we can avoid addref'ing now?

Depends on: 1925423

These profiles have some surprising results. Filtering stacks on "addref" gives us about 17% of all samples (seems reasonable). Of these samples 87% are AddRef'ing when sending image notifications. Seems too high for image notifications to be responsible for 87% of all addref time.

I tried to replicate this myself. I used the built in profiler on a Windows and a mac machine. I did 100 iterations of NewsSite-Next and NewsSite-Nuxt. In both cases addref'ing for image notifications take much much less of the time.

https://share.firefox.dev/4f8tgP4

https://share.firefox.dev/4eSmTQc

So something is different about the profiles that I'm doing and the ones posted in the first comment here. Maybe it's a difference of the hardware being used? Perhaps its a difference between the built in profiler and whatever profiler was used for the profiles in comment 0?

Markus, do you have any thoughts about that difference?

Flags: needinfo?(mstange.moz)

In fact, just addref'ing to send image notification is fully 15% (72k of 483k samples) of the entire profile from the first profile in comment 0 (after expanding to show all samples). If that figure is accurate then is there something that makes these specific virtual addref calls more expensive then other virtual addref calls? Or is that the price we pay for all virtual addref calls?

That is very curious and I don't have an answer at the moment. I'll try to see if it's a machine issue by comparing a samply profile with a Gecko profiler profile.

What is the hardware that that profile was done on?

I will also try using samply to see if I get different results.

I tried on a very old under powered machine (with the theory that slower memory access, smaller cpu caches, and less intelligent pre-fetching might cause slower virtual function calls) and as a percent of total time the image notifications work is actually less there.

I tried samply on macos, it's results are in line with what I see from the built in profiler.

Severity: -- → S2
Depends on: 1863046
See Also: → 1927919

I tried the built in profiler on what should be an identical machine to what was used for the profile in comment 0

https://share.firefox.dev/3AkUB1R

The sum of time spent in stacks that include IProgressObserver or imgINotificationObserver (which double counts a lot) is about 3%. This is inline with what I've been seeing locally.

You need to log in before you can comment on or make changes to this bug.