Open Bug 1654087 Opened 1 year ago Updated 2 months ago

reddit.com: slow page performance after scrolling through a lot of pages

Categories

(Core :: General, defect, P3)

defect

Tracking

()

Future

People

(Reporter: miketaylr, Unassigned, NeedInfo)

References

(Blocks 1 open bug)

Details

(Whiteboard: [qf:p1:responsiveness])

I can reproduce this after scrolling about 30 pages. Afterwards, each page scrolled in firefox takes forever to load.

A couple profiles:
https://share.firefox.dev/3fi5rVE
https://share.firefox.dev/3hLShSk

A lot of time spent in js::AtomizeString but main problem may be spending a lot of time loading many mp4's.
It's not super smooth in Chrome, but the delays are much much shorter around a few seconds. In Firefox, the delay can be up to 10s+ for the next page to show up.

Lots of timeupdate events. Are we possibly dispatching them more often than Chrome?

Flags: needinfo?(drno)
Whiteboard: [qf] → [qf:p1:responsiveness]

Unfortunately I cannot attach the chrome profile because it is 55MB, but in Chrome many of the functions like experimentEligibilitySelector are only about 20ms, while they are 200-300ms in Firefox. This seems to be mostly because the self time for experimentEligibilitySelector is spent in js::AtomizeString.

Ted, do you know if it's normal to see so many atomize calls and them taking this long to execute?

Flags: needinfo?(tcampbell)
Component: Untriaged → JavaScript Engine
Product: Firefox → Core
Target Milestone: --- → Future

(In reply to Denis Palmeiro [:denispal] from comment #3)

Ted, do you know if it's normal to see so many atomize calls and them taking this long to execute?

I investigated this a bit. I'm assuming I was looking at the same thing...

I saw a lot of strings with length 1897 being atomized there (some kind of IDs), but often the same JSString*. Atomizing that many characters is pretty slow and I think the JS code here has an array of them that it's filter()ing, so the problem gets worse the more posts it's loading.

It's probably worth adding a JSString* => JSAtom* cache, at least for long strings, to make atomizing the same string repeatedly an O(1) operation instead of O(n). I want to do some more measurements tomorrow to see how often a cache like that would hit on various workloads.

Flags: needinfo?(tcampbell) → needinfo?(jdemooij)
Depends on: 1657559

Fixing the AtomizeString issue in bug 1657559.

Flags: needinfo?(jdemooij)

Denis, would be great if you could confirm bug 1657559 fixes the AtomizeString issue on reddit.

Flags: needinfo?(dpalmeiro)

Thanks Jan. This certainly feels better now, I am no longer seeing long 10s+ jank. That being said, it still feels quite a bit slower compared to Chrome. A new profile can be found here: https://share.firefox.dev/31YMmDn. It seems to be possibly just a long GC that is causing the delays, which is understandable considering the amount of garbage scrolling must generate.

Flags: needinfo?(dpalmeiro)

I don't see any long GCs in that profile. If I filter all markers to just the GCSlice markers, the worst is only 52ms. That one had a budget of 52ms, which presumably is because the browser believed itself to be idle. The budgets may not be right (in fact, I believe in some cases they aren't). But almost nothing goes over budget.

I only see 2 major GCs in that profile, both a bit over 2 seconds long. But that's just the time elapsed between the start and end; the browser is still working for almost all of that time. (The max pauses are 36ms and 52ms.)

The most significant GC-related thing is that there are lots of minor GCs. They're 3-4ms each, which is a fair amount of time. At first glance at the memory graph, it looks like it may be a result of lots of steady allocation, but I don't know for sure that there isn't something fixable there.

Severity: -- → S3
See Also: → 1557537
Flags: needinfo?(drno)

(In reply to Olli Pettay [:smaug] from comment #2)

Lots of timeupdate events. Are we possibly dispatching them more often than Chrome?

Nils, you might have missed the needinfo. Mind still checking that?

Flags: needinfo?(drno)
See Also: → 1685848
See Also: 1685848

(In reply to Olli Pettay [:smaug] from comment #2)

Lots of timeupdate events. Are we possibly dispatching them more often than Chrome?

I don't know how often Chrome dispatches them. Firefox appears to dispatch them every 250ms, which is the minimum as per spec if I'm not mistaken: https://searchfox.org/mozilla-central/source/dom/html/HTMLMediaElement.cpp#6783
We don't appear to take current load into consideration as the spec suggest. But since we do the minimum I don't see how we could reduce the load here without breaking the spec.

Flags: needinfo?(drno)

Looks like timeupdate may get dispatched more often than every 250ms. The task to queue them happens at most every 250ms, but if main thread is busy, there are possibly multiple queued nsAsyncEventRunner objects in the main thread.
So if main thread is busy because of other things, there may be burst of timeupdate events every now an then, and that causes the main thread to be yet more busy - unless I'm missing something.

The spec has text like "if the user agent has not fired a timeupdate event at the element in the past 15 to 250ms and is not still running event handlers for such an event, then the user agent must queue a media element task given the media element to fire an event named timeupdate at the element."
The issue we have is that the time check happens when we queue async event, not when the event actually fires.

Flags: needinfo?(drno)
Depends on: 1686696

Filed bug 1686696 for the timeupdate issue.

Flags: needinfo?(drno)

There's not anything left here that JS folks are looking at, so redirecting this to AV.

Component: JavaScript Engine → Audio/Video
Priority: -- → P3

We landed some improvement in bug1686696, which is about reducing the frequency of dispatching certain events. Would you mind to help me check if the situation gets improved while browsing Reddit?
Thank you.

Flags: needinfo?(dpalmeiro)

I would say it certainly has improved, especially since I first reproduced this 7 months ago. However, we seem to still be quite janky when comparing to chrome while infinitely scrolling on reddit.

Here is a new profile: https://share.firefox.dev/3ucChQA

Flags: needinfo?(dpalmeiro)

Thank you. From the profiled result in comment16, the amount of timeupdate events was indeed significantly less than before, and I noticed that Reddits spent 300~400ms in its event handler, which were mostly called from redditstatic.com. Those operations were done by themselves, which are not related with media, so I'm not sure what we (media team) could do on the improvement.

Maybe we need to move this bug to another component to investigate those non-media-related JS calls to see if we can speed up them or not? Smaug, do you have any ideas?

Thank you.

Flags: needinfo?(bugs)

Yeah, I think this bug is rather generic. This does block bug 1689392
We should keep investigating this and others Reddit perf issues.

Component: Audio/Video → General
Flags: needinfo?(bugs)

Do you still see this being worse than in Chrome. (Which isn't much, since reddit is pretty bad there too. Checkerboarding all the time a lot. On Linux and Windows.)

Flags: needinfo?(dpalmeiro)
You need to log in before you can comment on or make changes to this bug.