Closed Bug 1648433 Opened 4 years ago Closed 2 years ago

Ghost windows on Reddit

Categories

(Core :: DOM: Core & HTML, defect)

77 Branch
defect

RESOLVED INCOMPLETE
Performance Impact none

People

(Reporter: pedro, Unassigned)

References

(Blocks 2 open bugs)

Details

(Keywords: perf:resource-use)

Attachments

(2 files)

User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0

Steps to reproduce:

Run the browser for a day or two with some pinned tabs that do things in the background. I've reduced it to Gmail and WhatsApp Web to check whether any of the other sites I usually run could be the problem.

Actual results:

After a while the browser has clear latency spikes in interaction. It's most easily noticeable when typing in WhatsApp Web but also happens in other interactions. It's easy to end up typing for two or three seconds without any response and then see everything show up at once.

I tried capturing a profile while that was happening:

https://profiler.firefox.com/public/041d2898388724f85d0cbcd010e5408aaf3d06e1/calltree/?globalTrackOrder=0-1-2-3-4-5-6-7-8-9-10-11&hiddenGlobalTracks=1-2-3-4-5-6-7-8-9&localTrackOrderByPid=10035-1-0~21182-0~10125-0~10140-0~&thread=11&v=4

I've reproduced this with Firefox running under XWayland in sway rather than my normal setup with MOZ_ENABLE_WAYLAND=1, because I was told that configuration is not supported on Firefox master and that I should not file bug reports with it.

Expected results:

Normal latencies irrespective of how long the browser has been running.

I've noticed this for a few versions of Firefox now but don't know exactly when it started. It's also possible something changed in the Ubuntu kernel scheduler settings in the meantime.

Component: Untriaged → Performance
Product: Firefox → Core

Pedro, do you think you could use about:memory to collect a verbose memory report and share it?

Flags: needinfo?(pedro)

about:memory has quite a few options, what specific report would you like? And do you need it to be captured while the problem is happening or do you just want to know what kind of memory is being used in this workload?

I've noticed that the Outlook web versions of mail and calendar running in the background make it particularly easy to create these latency spikes. What memory traces should I gather?

Flags: needinfo?(pedro)

Trying it back to back, either it's much easier to trigger with MOZ_ENABLE_WAYLAND=1, or that's the actual cause of the issue and the trace I recorded under XWayland was a more ordinary latency spike (e.g., high CPU usage at the same time from something else).

Yeah, the profile shows some pretty long hangs while we build the CC graph. Something - either us, or one of the loaded sites - is creating a lot of objects that aren't getting destroyed.

Hey Pedro,

In about:memory, can you please click "Measure and Save", and attach the saved report to this bug?

Flags: needinfo?(pedro)

Here's a memory report I just got. I started with a freshly rebooted system under no memory pressure and then opened my usual workload over time. At the start, with all the tabs already open, Firefox was working fine. Now, out of a total of 20GB, the system still has 3GB of free memory and 8.5GB of available memory (free plus buffers the kernel can drop), and yet Firefox has started to show quite a bit of latency. Typing is still keeping up, but things like Ctrl-L to go to the address bar or following a link with middle-click have become quite slow. Some tabs also now display a loading graphic when switching to them, which I think means they've been somehow evicted from RAM.

Flags: needinfo?(pedro)

The memory usage isn't very high, but you do have some ghost windows (and some detached windows, which might be on their way to being ghost windows) which can cause the CC to take a long time.

Here's another log. Keyboard input is very visibly lagging in case that helps diagnose stuff. I only have the single Firefox window open, but if "ghost" windows are the problem it may very well be that MOZ_ENABLE_WAYLAND=1 is part of the reason this is happening.

This is about the "windows" that are used to represent web pages, not browser UI windows, so Wayland shouldn't be relevant.

What pages are showing up as ghost windows in your memory report? The usual next step here is to have somebody try to reproduce the issue with invasive logging tools to try to figure out why the windows are leaking.

What pages are showing up as ghost windows in your memory report?

How would I determine that?

(In reply to Pedro Côrte-Real from comment #10)

What pages are showing up as ghost windows in your memory report?

How would I determine that?

If you do "measure" in about:memory, there will be some entries like this:

2 (100.0%) -- ghost-windows
├──1 (50.00%) ── <anonymized-2147483820>
└──1 (50.00%) ── <anonymized-2147483962>

Except instead of anonymized, they should have URLs. There may be ghost-window entries from multiple processes.
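To compare several of these blocks at once, the text copied from about:memory can be tallied per site. What follows is only a minimal sketch: it assumes the ghost-windows section has been pasted as plain text in the layout shown above, and the names `LEAF` and `ghost_windows_by_host` are made up for this illustration rather than part of any Firefox tooling.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Matches a leaf line such as "├──1 (50.00%) ── https://www.reddit.com/ [2]".
# The layout is assumed from the about:memory text quoted in this bug.
LEAF = re.compile(r"^[├└]──\s*(\d+) \([\d.]+%\) ── (\S+)(?: \[\d+\])?\s*$")


def ghost_windows_by_host(text):
    """Sum ghost-window counts per host across all pasted ghost-windows blocks."""
    counts = Counter()
    in_section = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith("-- ghost-windows"):
            # Header line, e.g. "2 (100.0%) -- ghost-windows".
            in_section = True
            continue
        match = LEAF.match(line) if in_section else None
        if match is None:
            # A blank or unrelated line ends the current block.
            in_section = False
            continue
        count, url = int(match.group(1)), match.group(2)
        # Entries without a host (about:blank, anonymized ids) are kept verbatim.
        host = urlsplit(url).hostname or url
        counts[host] += count
    return counts


if __name__ == "__main__":
    sample = (
        "2 (100.0%) -- ghost-windows\n"
        "├──1 (50.00%) ── <anonymized-2147483820>\n"
        "└──1 (50.00%) ── <anonymized-2147483962>\n"
    )
    for host, count in ghost_windows_by_host(sample).items():
        print(count, host)
```

Because ghost-window entries can come from multiple processes, pasting every block into one string and summing per host gives a quick view of which sites leak most often.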

reddit.com and worten.pt are the only two websites that come up with that pattern. Here are all the cases I found:

2 (100.0%) -- ghost-windows
├──1 (50.00%) ── https://www.reddit.com/
└──1 (50.00%) ── https://www.reddit.com/r/electricvehicles/comments/hndxtr/daimler_chairman_ola_k%C3%A4llenius_on_the_upcoming/

4 (100.0%) -- ghost-windows
├──1 (25.00%) ── about:blank
├──1 (25.00%) ── https://www.reddit.com/
├──1 (25.00%) ── https://www.reddit.com/r/woodworking/comments/hn4rv4/i_scroll_sawed_wonder_woman/
└──1 (25.00%) ── https://www.worten.pt/checkout/success

2 (100.0%) -- ghost-windows
├──1 (50.00%) ── https://www.worten.pt/informatica-e-acessorios/acessorios-pc/teclados-para-pc/teclado-hp-classic-usb-idioma-portugues-teclado-numerico-5153517
└──1 (50.00%) ── https://www.worten.pt/search?query=displayport%20mini&sortBy=priceAsc&hitsPerPage=24&page=1&facetFilters=facets.m_vendido-por%3AWorten&latestFacet=facets.m_vend

Seems like reddit causes this semi-reliably. After restarting the browser all ghosts were gone even after reopening all tabs. Once I browse into reddit a few times I get this:

2 (100.0%) -- ghost-windows
└──2 (100.0%) ── https://www.reddit.com/ [2]

So I found a pattern that may reproduce this on reddit.com. When I open the page and simply scroll, no ghosts are generated, even after the page loads more content on scroll. But if I click one of the articles on the front page, it opens in a new reddit feature where the URL changes and the article content loads, but the front page is still behind it, and clicking on the front page sends you back to it. If I close the tab at that point and check about:memory, I now have a ghost window for the front page:

1 (100.0%) -- ghost-windows
└──1 (100.0%) ── https://www.reddit.com/

So maybe something in the way that background frontpage behind articles is implemented causes this?

Strange. We've had a few reports of similar issues with Reddit (bug 1626132, bug 1628664), but I haven't been able to reproduce the leaks yet.

Summary: Huge latency spikes after browser has been running for a while → Ghost windows on Reddit

Hit a ghost in youtube as well:

1 (100.0%) -- ghost-windows
└──1 (100.0%) ── https://www.youtube.com/

Latency is now very poor on this running session.

Changing reddit to the old interface seems to at least reduce the problem very significantly. After a bit of use it hasn't happened with reddit again. I did see it with another new site:

1 (100.0%) -- ghost-windows
└──1 (100.0%) ── https://www.motorsport.com/f1/news/gloves-off-mercedes-red-bull/4826986/?ic_source=home-page-widget&ic_medium=widget&ic_campaign=widget-1

It seems like something that happens in a lot of sites and the new reddit interface for some reason triggers it very often.

Weirdly the first two ghost windows, the two motorsport.tv links, are actually links I've never visited. I don't remember those pages at all and Firefox shows the links with the not visited color.

Reddit still has no ghost windows using the old interface. That has made the browser much more usable.

Can you reproduce this in safe mode? Some issue with an addon seems like the most likely cause.

I've reproduced it in safe mode:

1 (100.0%) -- ghost-windows
└──1 (100.0%) ── https://www.reddit.com/r/motogp/comments/ho5jqw/ducati_right_now/

It also reminded me how poor the performance of sites is without uBlock Origin...

Hi Andrew,

The reporter has reproduced this in safe mode.

Do you have an idea about how we should move this bug forward?

Flags: needinfo?(continuation)
Whiteboard: [qf-]

Well, I or somebody else needs to also reproduce it, and then look at cycle collector logs to see what is holding the window alive.

Flags: needinfo?(continuation)

Have you still seen ghost windows? I've tried to reproduce, no luck.

Flags: needinfo?(pedro)

Switching to the old reddit design mostly fixed the issue for me but I just checked in my current session and I still see this:

1 (100.0%) -- ghost-windows
└──1 (100.0%) ── https://www.reddit.com/

This is on 88.0 running with MOZ_ENABLE_WAYLAND=1 as I usually do.

Flags: needinfo?(pedro)
Performance Impact: --- → -
Whiteboard: [qf-]
Component: Performance → DOM: Core & HTML

:hsinyi, could you take a look at this to see if this belongs here or in another component?

Flags: needinfo?(htsai)

DOM: Core is a reasonable place for a window leak like this. The investigation got stuck before because we couldn't reproduce the issue.

Flags: needinfo?(htsai)
Status: UNCONFIRMED → RESOLVED
Closed: 2 years ago
Resolution: --- → INCOMPLETE