Some sp3 tests (especially NewsSite) spend an appreciable amount of time in RasterImage::NotifyDecodeComplete ("notify images")
Categories
(Core :: Graphics: ImageLib, defect)
People
(Reporter: mstange, Unassigned, NeedInfo)
References
(Depends on 1 open bug, Blocks 1 open bug)
Details
(Whiteboard: [sp3])
NewsSite-Next spends 4% of its time in RasterImage::NotifyDecodeComplete; NewsSite-Nuxt spends 1.1% in it.
I can't find the equivalent work in the Chrome profiles anywhere.
NewsSite-Next: https://share.firefox.dev/3UcDiqv (4753 samples)
NewsSite-Nuxt: https://share.firefox.dev/3Ytgw06 (1051 samples)
Furthermore, some time is spent in AsyncNotifyRunnable::Run.
NewsSite-Next: https://share.firefox.dev/40b4hGy (345 samples)
NewsSite-Nuxt: https://share.firefox.dev/40azYju (309 samples)
If, somehow, we were able to eliminate all of this time completely, we would improve the overall sp3 score by 0.5%.
Comment 1•16 days ago
Nice find.
Wow, basically 3 seconds in addref is most of the first profile. Maybe our observers are more well behaved and we can avoid addref'ing now?
Comment 2•13 days ago
These profiles have some surprising results. Filtering stacks on "addref" gives us about 17% of all samples (seems reasonable). Of these samples, 87% are AddRef'ing when sending image notifications. That seems too high: it is hard to believe that image notifications are responsible for 87% of all addref time.
I tried to replicate this myself. I used the built-in profiler on a Windows machine and a Mac. I did 100 iterations of NewsSite-Next and NewsSite-Nuxt. In both cases addref'ing for image notifications takes much, much less of the time.
https://share.firefox.dev/4f8tgP4
https://share.firefox.dev/4eSmTQc
So something is different between the profiles that I'm capturing and the ones posted in the first comment here. Maybe it's a difference in the hardware being used? Perhaps it's a difference between the built-in profiler and whatever profiler was used for the profiles in comment 0?
Markus, do you have any thoughts about that difference?
Comment 3•12 days ago
In fact, just addref'ing to send image notifications is fully 15% (72k of 483k samples) of the entire first profile in comment 0 (after expanding to show all samples). If that figure is accurate, is there something that makes these specific virtual addref calls more expensive than other virtual addref calls? Or is that the price we pay for all virtual addref calls?
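As a quick cross-check, the 15% figure here and the 17% and 87% figures from comment 2 describe the same share of the profile. A minimal sketch of the arithmetic, using only the rounded numbers quoted in those two comments (the variable names are just for illustration):

```python
# Rounded figures as quoted in comments 2 and 3; nothing here is re-measured.
addref_share_of_all = 0.17      # comment 2: share of samples matching "addref"
notify_share_of_addref = 0.87   # comment 2: share of those that are image notifications
notify_samples = 72_000         # comment 3: "72k"
total_samples = 483_000         # comment 3: "483k"

# Share of the whole profile, computed two independent ways.
from_counts = notify_samples / total_samples
from_percentages = addref_share_of_all * notify_share_of_addref

print(f"from sample counts: {from_counts:.1%}")
print(f"from percentages:   {from_percentages:.1%}")
```

Both routes land just under 15%, so the two comments agree with each other. The open question is why that share is so much larger than in the locally captured profiles.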
Comment 4•11 days ago (Reporter)
That is very curious and I don't have an answer at the moment. I'll try to see if it's a machine issue by comparing a samply profile with a Gecko profiler profile.
Comment 5•11 days ago
What hardware was that profile captured on?
I will also try using samply to see if I get different results.
Comment 6•10 days ago
I tried on a very old, underpowered machine (with the theory that slower memory access, smaller CPU caches, and less intelligent prefetching might cause slower virtual function calls), and as a percentage of total time the image notification work is actually lower there.
Comment 7•10 days ago
I tried samply on macOS; its results are in line with what I see from the built-in profiler.
Comment 8•10 hours ago
I tried the built-in profiler on what should be an identical machine to the one used for the profile in comment 0:
https://share.firefox.dev/3AkUB1R
The sum of time spent in stacks that include IProgressObserver or imgINotificationObserver (which double counts a lot) is about 3%. This is in line with what I've been seeing locally.
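To illustrate why adding the two filtered totals double counts: a sample whose stack contains both an IProgressObserver frame and an imgINotificationObserver frame is counted once per filter. A minimal sketch with made-up stacks (the frame names and sample data are hypothetical, not taken from the profile), comparing the summed count with the count of stacks matching either name:

```python
def count_matching(stacks, needle):
    """Number of sampled stacks with at least one frame containing `needle`."""
    return sum(1 for stack in stacks if any(needle in frame for frame in stack))

def count_matching_any(stacks, needles):
    """Number of sampled stacks matching any needle; each stack counts once."""
    return sum(
        1
        for stack in stacks
        if any(needle in frame for frame in stack for needle in needles)
    )

# Hypothetical sampled stacks, outermost frame first (illustrative only).
stacks = [
    ["main", "IProgressObserver::Notify", "imgINotificationObserver::Notify"],
    ["main", "IProgressObserver::Notify"],
    ["main", "imgINotificationObserver::Notify"],
    ["main", "js::RunScript"],
]

needles = ["IProgressObserver", "imgINotificationObserver"]
summed = sum(count_matching(stacks, needle) for needle in needles)
union = count_matching_any(stacks, needles)

print(f"summed per-filter counts: {summed} of {len(stacks)}")
print(f"stacks matching either:   {union} of {len(stacks)}")
```

The summed figure is an upper bound on the true share, so the roughly 3% above overstates the time actually spent in image notifications.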