Closed Bug 1661311 Opened 4 years ago Closed 4 years ago

30 second browser hang on Google Docs

Categories

(Core :: Graphics, defect)

Unspecified
Windows
defect

Tracking

RESOLVED FIXED
84 Branch
Performance Impact medium
Tracking Status
firefox82 --- wontfix
firefox83 --- wontfix
firefox84 --- fixed

People

(Reporter: cpeterson, Assigned: emilio)

Details

(Keywords: perf:responsiveness)

Nightly, especially on Google Docs, has been lagging or hanging for me a lot more than usual over the past couple days. I just captured a profile of a ~30 second hang of my whole browser, including chrome painting in all my windows. I have Fission enabled, if that matters.

Here is my profile:

https://share.firefox.dev/3huXC0U

Looks like a lot of time is spent in JS and the compositor.

There's certainly a fair bit of JS, but gfxUtils::DrawPixelSnapped takes 11s to execute. Will try with graphics first.

Component: Performance → Graphics
Whiteboard: [qf:p2:responsiveness]

(In reply to Denis Palmeiro [:denispal] from comment #1)

There's certainly a fair bit of JS, but gfxUtils::DrawPixelSnapped takes 11s to execute. Will try with graphics first.

I reported another Google Docs hang four months ago (bug 1635884). Some profile investigation pointed to graphics or layout problems in that bug, too:

It looks like it's hanging while trying to draw SVGs

almost all of the time is spent in skia. It's really odd, it looks like it's redrawing this svg over and over again.

there seem to be an absolute ton of MozScrolledAreaChanged events being fired.

my woefully uninformed theory is that you're probably on a hidpi laptop, and zoom is trying to record in lowdpi, so we're constantly changing dpi or something like that.

However, that Google Docs hang bug was closed as a duplicate of an HTTP/3 bug 1636479.

See Also: → 1635884

There's a bunch of weird stuff in this profile.

On the parent process main thread 1.3 s in accessibility things. Do you expect accessibility to be enabled?
Another 1 s on the parent process main thread in drag and drop code. Were you dragging and dropping during the profile?

On the gpu process compositor most of the time is under CompositorScreenshotGrabber. The profile doesn't seem to have screenshots (which is what we need for profiling graphics as it introduces overhead), so I wonder why we are doing this.

The gfxUtils::DrawPixelSnapped call in the content process is re-rasterizing a large svg sprite sheet. It's coming from a ThemeChanged notification. Would you expect that to be happening during your profile? This can come from pref changes and theme changes and things from the OS like that.

(In reply to Timothy Nikkel (:tnikkel) from comment #3)

On the parent process main thread 1.3 s in accessibility things. Do you expect accessibility to be enabled?

No. Perhaps Firefox is detecting Windows' Emoji input method (Windows+. keyboard shortcut) as an accessibility input device? I see the "Show a touch keyboard when necessary" option is enabled in about:preferences#general. The ui.osk.enabled pref is true ("OSK" is "On Screen Keyboard") and ui.osk.debug.keyboardDisplayReason pref value is IKPOS: Keyboard presence confirmed.

Another 1 s on the parent process main thread in drag and drop code. Were you dragging and dropping during the profile?

I don't think so. My whole browser, content and chrome, was hanging. I was probably trying to click focus to different tabs to see if any of them were responding.

On the gpu process compositor most of the time is under CompositorScreenshotGrabber. The profile doesn't seem to have screenshots (which is what we need for profiling graphics as it introduces overhead), so I wonder why we are doing this.

I omitted screenshots from the profile because the Google Doc I was editing contained some private information. I can try to reproduce the problem on a non-private doc and include screenshots in the profile, if that will help?

The gfxUtils::DrawPixelSnapped call in the content process is re-rasterizing a large svg sprite sheet. It's coming from a ThemeChanged notification. Would you expect that to be happening during your profile? This can come from pref changes and theme changes and things from the OS like that.

Does a ThemeChanged notification indicate a Windows OS theme change or a Firefox theme change? I was not changing my Windows OS theme, but I have a dynamic Firefox theme installed (the Gradientus extension, which changes the window color depending on the time of day). The extension schedules a timer event for every five minutes to check the time of day, but only changes the theme color four times per day (dawn, morning, afternoon, evening).
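To make the described timer behavior concrete, here is a minimal sketch of a five-minute check that only applies a theme when the time-of-day period actually changes. This is not the Gradientus source: `periodForHour`, `ThemeScheduler`, `onTimer`, and the hour boundaries are all hypothetical names and values chosen for illustration.

```typescript
// Hypothetical sketch; not taken from the Gradientus extension.
type Period = "dawn" | "morning" | "afternoon" | "evening";

// Assumed hour boundaries; the real extension may use different ones.
function periodForHour(hour: number): Period {
  if (hour >= 5 && hour < 8) return "dawn";
  if (hour >= 8 && hour < 12) return "morning";
  if (hour >= 12 && hour < 18) return "afternoon";
  return "evening";
}

class ThemeScheduler {
  private current: Period | null = null;
  private readonly applyTheme: (period: Period) => void;

  constructor(applyTheme: (period: Period) => void) {
    this.applyTheme = applyTheme;
  }

  // Meant to be called from a timer that fires every five minutes.
  // Returns true only when a theme update was actually issued.
  onTimer(hour: number): boolean {
    const next = periodForHour(hour);
    if (next === this.current) {
      return false; // same period as last check: no theme update
    }
    this.current = next;
    this.applyTheme(next);
    return true;
  }
}
```

If an extension follows this pattern, most timer ticks are no-ops and a theme update is issued only at period boundaries. If the "has the period changed" check were missing or broken, every tick would push a theme update instead.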

(In reply to Chris Peterson [:cpeterson] from comment #4)

(In reply to Timothy Nikkel (:tnikkel) from comment #3)

On the parent process main thread 1.3 s in accessibility things. Do you expect accessibility to be enabled?

No. Perhaps Firefox is detecting Windows' Emoji input method (Windows+. keyboard shortcut) as an accessibility input device? I see the "Show a touch keyboard when necessary" option is enabled in about:preferences#general. The ui.osk.enabled pref is true ("OSK" is "On Screen Keyboard") and ui.osk.debug.keyboardDisplayReason pref value is IKPOS: Keyboard presence confirmed.

about:support will tell you if accessibility is enabled or not. I'm not sure, but I don't think the OSK should enable accessibility.

Another 1 s on the parent process main thread in drag and drop code. Were you dragging and dropping during the profile?

I don't think so. My whole browser, content and chrome, was hanging. I was probably trying to click focus to different tabs to see if any of them were responding.

It's possible that a click could be interpreted as a drag under load.

On the gpu process compositor most of the time is under CompositorScreenshotGrabber. The profile doesn't seem to have screenshots (which is what we need for profiling graphics as it introduces overhead), so I wonder why we are doing this.

I omitted screenshots from the profile because the Google Doc I was editing contained some private information. I can try to reproduce the problem on a non-private doc and include screenshots in the profile, if that will help?

Okay, that explains why we are spending time in CompositorScreenshotGrabber. Recording screenshots introduces a lot of overhead in the graphics code that is otherwise not there, so not grabbing screenshots at all is the preferred way to take a profile if gfx is involved.

The gfxUtils::DrawPixelSnapped call in the content process is re-rasterizing a large svg sprite sheet. It's coming from a ThemeChanged notification. Would you expect that to be happening during your profile? This can come from pref changes and theme changes and things from the OS like that.

Does a ThemeChanged notification indicate a Windows OS theme change or a Firefox theme change? I was not changing my Windows OS theme, but I have a dynamic Firefox theme installed (the Gradientus extension, which changes the window color depending on the time of day). The extension schedules a timer event for every five minutes to check the time of day, but only changes the theme color four times per day (dawn, morning, afternoon, evening).

Hmm, the extension could be involved. If the problem is easy enough to observe then the easiest way to check would be to disable it and try to reproduce again.

Definitely restart and test with the extension still on.
Then if it's still acting slow, try disabling the extension.

Looking at the extension, I can see a couple of places where things might go off the rails if anything goes wrong. Does Fission run the extension's background script in all processes? Because that would do it!

Flags: needinfo?(cpeterson)

We should consider throttling ThemeChanged to something coarser than ASAP. :)
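As a rough illustration of what "coarser than ASAP" could mean, here is a minimal coalescing throttle. It is a sketch only and does not reflect how Gecko handles ThemeChanged: `ThrottledNotifier`, `notify`, `flush`, and the injected `now` clock are hypothetical names, and the minimum interval is an arbitrary parameter.

```typescript
// Hypothetical sketch; none of these names come from Gecko.
class ThrottledNotifier {
  private lastRun = -Infinity;
  private pending = false;
  private readonly handler: () => void;
  private readonly minIntervalMs: number;
  private readonly now: () => number;

  constructor(
    handler: () => void,
    minIntervalMs: number,
    now: () => number = () => Date.now(),
  ) {
    this.handler = handler;
    this.minIntervalMs = minIntervalMs;
    this.now = now;
  }

  // Called for every incoming notification.
  notify(): void {
    if (this.now() - this.lastRun >= this.minIntervalMs) {
      this.run();
    } else {
      // Too soon after the last run: record that work is owed,
      // but do not repeat the expensive handler yet.
      this.pending = true;
    }
  }

  // Called periodically (for example from a timer) to deliver one
  // coalesced notification once the interval has elapsed.
  flush(): void {
    if (this.pending && this.now() - this.lastRun >= this.minIntervalMs) {
      this.run();
    }
  }

  private run(): void {
    this.pending = false;
    this.lastRun = this.now();
    this.handler();
  }
}
```

With this shape, a burst of notifications costs at most one expensive handler call per interval (such as the sprite sheet re-rasterization seen in the profile), and the final state is still picked up by the next `flush`.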

Severity: -- → S3

about:support will tell you if accessibility is enabled or not. I'm not sure, but I don't think the OSK should enable accessibility.

My about:support says:

Accessibility
Activated: true
Prevent Accessibility: 0
Accessible Handler Used: true
Accessibility Instantiator: UNKNOWN|

Looking at the extension, I can see a couple of places where things might go off the rails if anything goes wrong. Does Fission run the extension's background script in all processes? Because that would do it!

I asked a Fission engineer. Extension content scripts run in each content process, but extension background scripts run in a dedicated extension process, so IIUC the Gradientus alarms should not fire in every content process.

I'll disable the extension for a few days to see if that makes a difference.

Flags: needinfo?(cpeterson)
See Also: → 1668875

Is this still happening, now that bug 1668875 is fixed?

Flags: needinfo?(cpeterson)

(In reply to Markus Stange [:mstange] from comment #9)

Is this still happening, now that bug 1668875 is fixed?

I bet this is fixed. I think the fix for bug 1668875 fixed Google Sheets hang bug 1646222, which is presumably related to this Google Docs hang. I think we can close this bug as fixed (by bug 1668875).

Depends on: 1668875
Flags: needinfo?(cpeterson)
See Also: 1668875

Great.

Assignee: nobody → emilio
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 84 Branch
Performance Impact: --- → P2
Whiteboard: [qf:p2:responsiveness]