Closed Bug 1696171 Opened 5 years ago Closed 4 months ago

Navigation on YouTube and YouTube Music gets CPU to 100%

Categories

(Core :: Performance: Responsiveness, defect, P3)

Firefox 86
defect

Tracking

()

RESOLVED INCOMPLETE
Performance Impact low

People

(Reporter: testfirefox101, Unassigned)

References

Details

(Keywords: perf:responsiveness)

Attachments

(3 files, 1 obsolete file)

Attached image CPU usage in Task Manager (obsolete) —

User Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:86.0) Gecko/20100101 Firefox/86.0

Steps to reproduce:

Firefox Profiler recording for Case 1: https://share.firefox.dev/3rhLzc4

Reporting 3 cases together because they all started manifesting at the same time about a week ago, and are related to YouTube products. In as much as I've tested, this CPU usage bug has only triggered on my main PC, only in Firefox, and only on YouTube and YouTube Music. I have tested the below cases with both a new Firefox profile and a fresh Firefox 86 install on the computer that triggers the bugs. The cases below are the msot stripped down ways of triggering the bug consistently 100% of the time.

Case 1:

  1. Open Firefox
  2. Access YouTube at https://www.youtube.com/
  3. Click on any video to access it
  4. While the video is running, click on the channel page name below the video (if the video is paused, the bug doesn't trigger)

Case 2:

  1. Open Firefox
  2. Access YouTube at https://www.youtube.com/
  3. Click on any playlist to start playing the first video in the playlist
  4. While the video is playing, click on the "Next" button to the right of the "Play" button (if the video is paused, the bug doesn't trigger)

Case 3:

  1. Open Firefox
  2. Access YouTube Music at https://music.youtube.com/
  3. Click on any playlist
  4. While the song is playing, click on the "Next" button to the right of the "Play" button (if the song is paused, the bug doesn't trigger)

Actual results:

  1. One Firefox process in the Task Manager climbs to over 90% CPU usage, causing the Task Manager to go from a regular 10% CPU usage to 99%-100% usage.
  2. Firefox freezes and most PC activities stall for up to 15 seconds, after which Firefox shows a yellow top bar claiming "A web page is slowing down your browser".
  3. Pressing "Stop It" will, after a few seconds, either:
  • redirect the browser to the intended YouTube page, but only partially loaded (for YouTube Music nothing loads, the page remains unresponsive, but the browser is no longer frozen)
  • make the bar disappear then reappear after a few seconds, and pressing "Stop It" a second time then leads to the webpage behaviour above

Expected results:

Load the correct YouTube page or YouTube Music song.

Attachment #9206635 - Attachment is obsolete: true

Hey testfirefox101,
I tried reproducing this issue on the latest versions of Firefox Nightly 88.0a1 (2021-03-03), beta 87.0b5 and release 86.0 and although some minor spikes are encountered (making the cpu go from 10% to 30-40%) it goes back down immediately and I don't get any notifications either.

Can you test the issue while in Safe Mode? You can find helpful info here : https://support.mozilla.org/en-US/kb/troubleshoot-firefox-issues-using-safe-mode .
Also a fresh new profile could help. You can find more about creating a new profile here : https://support.mozilla.org/en-US/kb/troubleshoot-and-diagnose-firefox-problems#w_6-create-a-new-firefox-profile .
If possible, you can test this issue on the nightly build as well. Download the build from : https://www.mozilla.org/en-US/firefox/nightly/all/ .

Flags: needinfo?(testfirefox101)

Hello Andrei, thanks for looking into this. I've just tried all three of your suggestions, all three triggered the same CPU spike to 100%.

Could this be about an incompatibility between a recent Firefox update and a system component? Just to mention it, the system's hardware is from 2015 when I built it, there have been no software changes for the past year excluding PC games, and no driver changes for at least 2 years.

Flags: needinfo?(testfirefox101)

Setting a component for this issue in order to get the dev team involved.
If you feel it's an incorrect one please feel free to change it to a more appropriate one.

Andrew do you think you can help us out a bit here?

Component: Untriaged → Performance
Flags: needinfo?(continuation)
Product: Firefox → Core

There is actually a DOM Worker thread that is running on full capacity on the parent process, and causes 5s long event processing delays. There is some script the worker runs based on a setTimeout handler, which consumes all the time. But also the YouTube WebContent process shows a lot of Javascript activity.

Here the selection from the original profile: https://share.firefox.dev/3rIdOAO

Olli, might the fix from bug 1684139 also help here? I cannot see in which intervals the timeout handler actually gets fired.

Flags: needinfo?(continuation) → needinfo?(bugs)

Reporter could you maybe check with a recent Firefox Nightly build if the problem is also existent there? Thanks.

Flags: needinfo?(testfirefox101)

Not enough data to say anything about that.

Child process uses oddly lots of time in executing that JS. Is this perhaps a low end machine?
But it is the parent process worker which is the most worrisome part here.
Can anyone reproduce this on Nightly and share also the URLs?

I've tested again with Nightly to record a profile for it, but there seems to be an odd behaviour.

  • Nightly by itself still triggers the CPU spike, causing a 10 second hang
  • if I start recording a profile first and then try to trigger the bug, the bug either doesn't happen or there is only a very small spike in CPU
  • if I first trigger the bug and then start recording during the CPU spike, the CPU spike immediately stops when recoring starts

This behaviour is not 100% consistent, and in case it's useful I've managed to capture part of a CPU spike by starting the Nightly recording after I trigger the bug: https://share.firefox.dev/2PMhLq3

The systems runs on an i5-6400 @2.70, 8GB RAM, GTX 960 8GB, H170 mb, with a 64bit Windows 8.1 Pro.

Flags: needinfo?(testfirefox101)

I have a similar (or same?) issue. Navigation triggers a serious CPU spike and it stays that way. Windows 10 19043.899 here with core i5/intel graphics 5500

https://share.firefox.dev/3eSDFCr

It might be the same. So given by that last profile there is also a DOMWorker that spends nearly 100% of the time in poll(), which gets called from loop():

https://searchfox.org/mozilla-central/rev/f07a609a76136ef779c65185165ff5ac513cc172/toolkit/modules/subprocess/subprocess_worker_win.js#746-778

I wonder if the usage of setTimeout with 0ms could cause this problem. Why do we have to poll that often for the subprocess status? I wonder which kind of subprocess we actually polling here.

It would be great to have markers somewhere when we start a subprocess, so that we might get some more information. Florian, what do you think?

Ilgaz, can you please also try again with a Nightly build? And maybe start the recording with the profiler before you load Youtube Music. It might give us some more information when the DOM Worker actually starts to handle that subprocess. Thanks!

Flags: needinfo?(ilgaz)
Flags: needinfo?(florian)

Hello. This is the profile of nightly, started recording and went directly to music.youtube.com
https://share.firefox.dev/3qXBrUX

Flags: needinfo?(ilgaz)

This profile doesn't have the DOMWorker threads included. Maybe you have to run the profiler with custom settings. Therefore click the little dropdown next to the icon, select custom, and click Edit Settings. Is the checkbox for DOM Worker selected?

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #12)

This profile doesn't have the DOMWorker threads included. Maybe you have to run the profiler with custom settings. Therefore click the little dropdown next to the icon, select custom, and click Edit Settings. Is the checkbox for DOM Worker selected?

Apologies, here is the one with all selected. https://share.firefox.dev/3eQPsRP

Interesting is that no DOM Worker shows a high load anymore. Do you see that the CPU load dropped on your side? Feel free to also install my PerfChaser web extension to directly observe the CPU load within the sidebar. Keep an eye on the main process and the DOMWorker related threads in the lower Threads panel.

As mentioned earlier I wonder if the patch on bug 1684139 helped here. Would you mind creating two new profiles for the following two Nightly builds?

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #10)

It might be the same. So given by that last profile there is also a DOMWorker that spends nearly 100% of the time in poll(), which gets called from loop()

That DOM worker is blocked on poll, it doesn't spend CPU time there. That subprocess_worker also confuses about:performance for the same reason.

In most of the profiles from this bug that I opened, the busy content process has very few profiler samples, but the parent process gets sampled regularly. This seems to me like a process priority issue.

Flags: needinfo?(florian)

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #10)

It would be great to have markers somewhere when we start a subprocess, so that we might get some more information. Florian, what do you think?

The ChromeUtils.addProfilerMarker API is available in chrome workers, so feel free to add markers that you think would be informative.

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #14)

Interesting is that no DOM Worker shows a high load anymore. Do you see that the CPU load dropped on your side? Feel free to also install my PerfChaser web extension to directly observe the CPU load within the sidebar. Keep an eye on the main process and the DOMWorker related threads in the lower Threads panel.

As mentioned earlier I wonder if the patch on bug 1684139 helped here. Would you mind creating two new profiles for the following two Nightly builds?

There is a major improvement in current nightly. While it doesn't go down to 3-4% CPU like chromium (I tested MS Edge), it currently uses 14-16 % CPU compared to 40-50% of current stable.

Here are the profiler results. I tried to be consistent browsing.

I also have a commercial AV here etc so I will test on Linux (openSUSE TW) for more "pure" results.

https://share.firefox.dev/30X2PYK --> without patch
https://share.firefox.dev/3lrd6FX --> with patch
https://share.firefox.dev/3rZmdA7 --> current (just updated) firefox nightly

Status: UNCONFIRMED → RESOLVED
Closed: 5 years ago
Resolution: --- → DUPLICATE

Thanks for the profiles. Strangely the one for without the patch does only contain 3 DOMWorker threads instead of 4. So it's hard to say right now if it is really fixed by just bug 1684139. As such I would reopen this bug for now, and allowing to investigate the profiles further.

It looks like that we spend still a fair amount of time for rendering, and the web content process about 20% in nsRefreshDriver::Tick.

ilgaz, mind trying to create another profile from Youtube Music where no video is played but just a static image is shown?

Status: RESOLVED → REOPENED
Depends on: 1684139
Ever confirmed: true
Flags: needinfo?(ilgaz)
Resolution: DUPLICATE → ---

Just to confirm, is my instance of the bug a lost cause?

Just to be thorough, I tried the patched Nightly build linked by Henry, and I have the same behaviour I mentioned in my last post: CPU goes from 15% to 100% and stalls (for around 20-30 seconds this time), but with the profile recorder running it only spikes for a few moments.

In case the data can be used, I've recorded several navigation steps that trigger the spike on YouTube Music, with songs without video (excluding ads): https://share.firefox.dev/3tBRaL3

My apoligies Henrik for misreading your name.

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #19)

ilgaz, mind trying to create another profile from Youtube Music where no video is played but just a static image is shown?

Hello,

I have subscription so when I sign in no ads etc. shown. These are the profiles which nightly plays songs (no videos)

https://share.firefox.dev/3tBWqhP --> without patch
https://share.firefox.dev/3cLAUQZ --> with patch

Flags: needinfo?(ilgaz)

(In reply to testfirefox101 from comment #20)

Just to confirm, is my instance of the bug a lost cause?

No, it's not. Sorry I didn't specifically reply to your comment. But it looked like it is harder to reproduce and a bit inconsistent.

Just to be thorough, I tried the patched Nightly build linked by Henry, and I have the same behaviour I mentioned in my last post: CPU goes from 15% to 100% and stalls (for around 20-30 seconds this time), but with the profile recorder running it only spikes for a few moments.

And that makes it hard for us to check what's wrong. Would you maybe mind to install my PerfChaser extension (https://github.com/whimboo/perfchaser/) and checking which processes and specifically threads are consuming most of the CPU while the CPU is at 100%? This extension has way lesser impact than the profiler so that you should still see the spike for a longer time.

In case the data can be used, I've recorded several navigation steps that trigger the spike on YouTube Music, with songs without video (excluding ads): https://share.firefox.dev/3tBRaL3

As it looks like during the navigations there is a 60% of the time spent in handing click events that actually cause a layout flush. I'm not sure if that is expected. Emilio, could you maybe have a look?

Flags: needinfo?(emilio)

4 second styling times look terrible. Is there any chance you could re-take the profile with either the "sequential styling" checkbox enabled, or the "StyleThread" checkbox enabled, so that we can see what is going on in those threads in your machine? Thanks so much.

Flags: needinfo?(emilio) → needinfo?(testfirefox101)

@Henrik, PerfChaser shows the "web content (shared)" process spiking to 392% CPU. The threads have no names, only IDs, and during a 30 second spike they are generally below 1%, and none go above 3.9% CPU. At any one point there are between 6 or >15 threads above 0.0%, and they largely have one of 2 states

  • in one state, half of the threads are at 0.2% and the other half at 0.5 or 0.6%
  • in the other state, there are a few seemingly random threads that will take up more than 1%, with the only pattern being thread 5940 which comes up most of the times and will almost always be at the top, with between 1.1% - 3.9% CPU

@Emilio
https://share.firefox.dev/3faZfTa - StyleThread enabled
https://share.firefox.dev/31f4wkk - Sequential Styling enabled

Flags: needinfo?(testfirefox101)

(In reply to testfirefox101 from comment #25)

@Henrik, PerfChaser shows the "web content (shared)" process spiking to 392% CPU.

Interesting. Would be good to know which threads these are. Also it looks like you are running Nightly with Fission disabled. I would kinda be interested how the page behaves with Fission (site-isolation) actually be enabled. To enable it open the preferences and search for Fission.

The threads have no names, only IDs

That should not happen. Do all threads have no names or just those that cause a higher CPU load? Do you see the same with Fission enabled?

In current nightly, just displaying static Music page while playing (bring tab to front) causes 50-60% CPU usage. Here is the current sample: https://share.firefox.dev/3rzobX2

Attached image PerfChaser.PNG

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #26)

Interesting. Would be good to know which threads these are. Also it looks like you are running Nightly with Fission disabled. I would kinda be interested how the page behaves with Fission (site-isolation) actually be enabled. To enable it open the preferences and search for Fission.

It's the exact same bug bevahiour with Fission enabled.

That should not happen. Do all threads have no names or just those that cause a higher CPU load? Do you see the same with Fission enabled?

I've attached a screenshot with Fission disabled. It's the same with Fission enabled, but instead of "web content (shared)" processes there are "web content" processes.

And a new recording with Fission enabled: https://share.firefox.dev/3u453Sm

(In reply to testfirefox101 from comment #30)

And a new recording with Fission enabled: https://share.firefox.dev/3u453Sm

There is lots of JS activity going on. Olli, could you help here? Looks like some long running tasks are run within the registered click event handlers.

(In reply to testfirefox101 from comment #0)

  1. Firefox freezes and most PC activities stall for up to 15 seconds, after which Firefox shows a yellow top bar claiming "A web page is slowing down your browser".

Is the long hang still happening? In the last profile I see various freezes for up to 3.6 seconds, but that's very different from a never-ending hang.

Flags: needinfo?(testfirefox101)

(In reply to Markus Stange [:mstange] from comment #32)

Is the long hang still happening? In the last profile I see various freezes for up to 3.6 seconds, but that's very different from a never-ending hang.

Yes, it's still happening regularly. I've mentioned before that for some reason, when the profile recorder is running the hangs are comparatively short. Which is bad luck for me, since I have no other way to record the actual hangs I'm experiencing.

Flags: needinfo?(testfirefox101)

Ah I see, that is indeed bad luck. I'm not sure what could be causing that.

Maybe the Windows Performance Recorder might work here? Maybe worth a try.

(In reply to Markus Stange [:mstange] from comment #34)

Ah I see, that is indeed bad luck. I'm not sure what could be causing that.

The explanations in bug 1703410 may also explain this: "on Windows, the timer resolution is adjusted from its system default (~16ms) down to the requested interval (if less than 10ms). This is done so that the sampler loop has a chance to actually reach the expected rate.
But an important side effect is that it affects all timers in the application, which can sometimes cause strange things".

If that's the reason, going to about:profiling and configuring the sampling interval to 20ms may help capture a profile of the bug.

(In reply to Florian Quèze [:florian] from comment #36)

If that's the reason, going to about:profiling and configuring the sampling interval to 20ms may help capture a profile of the bug.

Increasing it to 20 ms doesn't make a noticeable difference, https://share.firefox.dev/3tmLSn3
Intreasing it all the way to 100 ms does make the bug's hang last up to 5 s, https://share.firefox.dev/2QsTYf6

(In reply to Henrik Skupin (:whimboo) [⌚️UTC+1] from comment #35)

Maybe the Windows Performance Recorder might work here? Maybe worth a try.

I'll look into this.

testfirefox101, could you perhaps upload the information from about:support here?

Flags: needinfo?(bugs) → needinfo?(testfirefox101)
Whiteboard: [qf:p1:responsiveness]

If this can be repeated on Linux, we could take a perf profile:
perf record -g -F 999 ./firefox
<quickly reproduce the issue>
<quit or kill firefox>
perf script -F +pid,-cpu >/tmp/problem.perf

Then load problem.perf into profiler.firefox.com

The 100ms profile shows most of the time spent in MutationCallbacks. The other profiles also all show that the janks (which are shorter) are always MutationCallbacks, plus an event that likely triggers it (normally a click event created from a script). The marker chart shows simultaneous tap (shorter) and click (longer) events. See https://share.firefox.dev/2QLR4SY and https://share.firefox.dev/3enTnDN

Attached file about:support
Flags: needinfo?(testfirefox101)

I tried to reproduce this again (on Windows), no luck.

You seems to have Avast Antivirus installed. Could you try disabling that and see if the issue still happens?

Flags: needinfo?(testfirefox101)
Whiteboard: [qf:p1:responsiveness] → [qf:p3:responsiveness]

(In reply to Olli Pettay [:smaug] (away May 1 - 8) from comment #43)

I tried to reproduce this again (on Windows), no luck.

You seems to have Avast Antivirus installed. Could you try disabling that and see if the issue still happens?

Just tried it, Firefox actually stalled for two minutes this time, a definite record.

Could the bug have to do with something in the OS? In case current Firefox versions are less optimized for Win 8.1.

Flags: needinfo?(testfirefox101)

It shouldn't be about Win8.1. Most of the users sure are on Win10 and Win7, but still.
But it does sounds like something on your system (which is why I suggested trying to disable Avast, which is known to cause performance issues.)
Do you have up-to-date graphics drivers?

Based on the profiles when the jank happens, nothing really runs. Even profiler doesn't seem to be able to get samples.

I have just manually updated my graphics drivers just to be sure, unfortunately no change in bug behaviour.

Performance Impact: --- → P3
Whiteboard: [qf:p3:responsiveness]
Priority: -- → P3
Component: Performance: General → Performance: Responsiveness

The severity field is not set for this bug.
:mstange, could you have a look please?

For more information, please visit BugBot documentation.

Flags: needinfo?(mstange.moz)

Is this still happening? We now have better tools to diagnose these issues, specifically the samply profiler now has Windows support.

Severity: -- → S3
Flags: needinfo?(mstange.moz) → needinfo?(testfirefox101)

(In reply to Markus Stange [:mstange] from comment #48)

Is this still happening? We now have better tools to diagnose these issues, specifically the samply profiler now has Windows support.

I am not the one you asked, but I constantly have performance problems with YouTube and YT Music too. Here is the sample of YouTube showing live content along with chat window open: https://share.firefox.dev/3GqL37Z . This one is from openSUSE tw on Intel i5 7th gen. I can also provide a Windows 11 one. https://share.firefox.dev/3GqL37Z

Unfortunately https://share.firefox.dev/3GqL37Z doesn't have symbols so it isn't too useful performance profile.
(WebExtensions process is doing some unusual CPU heavy stuff)

Redirect a needinfo that is pending on an inactive user to the triage owner.
:mstange, since the bug has recent activity, could you please find another way to get the information or close the bug as INCOMPLETE if it is not actionable?

For more information, please visit BugBot documentation.

Flags: needinfo?(testfirefox101) → needinfo?(mstange.moz)

(In reply to Ilgaz Öcal from comment #49)

I am not the one you asked, but I constantly have performance problems with YouTube and YT Music too. Here is the sample of YouTube showing live content along with chat window open: https://share.firefox.dev/3GqL37Z . This one is from openSUSE tw on Intel i5 7th gen. I can also provide a Windows 11 one. https://share.firefox.dev/3GqL37Z

Given the high activity in the WebExtensions process, it's likely that some add-on is involved in causing extra CPU load. Could you file a new bug about what you're seeing? It seems like a different issue from what testfirefox101 was seeing on Windows.

Status: REOPENED → RESOLVED
Closed: 5 years ago4 months ago
Resolution: --- → INCOMPLETE

testfirefox101, I'm closing this bug for now - please feel free to reopen if this is still happening for you.

These days we actually have a better tool to investigate these types of problems on Windows: You can install samply and then record a system-wide profile with samply record -a.

Flags: needinfo?(mstange.moz)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: