Closed Bug 1662297 Opened 4 years ago Closed 3 years ago

Zillow using massive amounts of memory

Categories

(Core :: Graphics: WebRender, defect, P3)

RESOLVED FIXED
Tracking Status
firefox83 --- fixed

People

(Reporter: mconca, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(4 files)

The latest version of Nightly is using a lot of memory on zillow.com at certain zoom levels. Seems fine until you zoom in one click past where the satellite view kicks in. Link below starts at that zoom level.

STR:

  1. Create new profile in Nightly
  2. Go to this URL
    https://www.zillow.com/homes/for_sale/Castle-Rock,-CO_rb/?searchQueryState=%7B%22pagination%22%3A%7B%7D%2C%22usersSearchTerm%22%3A%22Castle%20Rock%2C%20CO%22%2C%22mapBounds%22%3A%7B%22west%22%3A-104.89113437569046%2C%22east%22%3A-104.88703327810668%2C%22south%22%3A39.396847440697016%2C%22north%22%3A39.39931394177977%7D%2C%22regionSelection%22%3A%5B%7B%22regionId%22%3A23984%2C%22regionType%22%3A6%7D%5D%2C%22isMapVisible%22%3Atrue%2C%22filterState%22%3A%7B%22sort%22%3A%7B%22value%22%3A%22globalrelevanceex%22%7D%2C%22ah%22%3A%7B%22value%22%3Atrue%7D%7D%2C%22isListVisible%22%3Atrue%2C%22mapZoom%22%3A19%7D

On Windows 10, Task Manager shows Firefox's memory usage slowly climbing to somewhere between 8 and 10 GB. The page and browser become almost unusable.

Latest Nightly (82.0a1 2020-08-31) on Windows 10. New profile.

Hi,

I am not able to replicate the issue on my end. I'm using Windows 10 Pro with Firefox Nightly 82.0a1 (2020-09-04) (64-bit), Beta 81.0b4, and Release 80.0.

Please test if the issue occurs to you in safe mode (add-ons disabled). Here is a link that can help you do that:
https://support.mozilla.org/en-US/kb/troubleshoot-firefox-issues-using-safe-mode
Also please attach your about:support information.

I will move this over to a component so developers can take a look at it. If it is not the correct component, please feel free to change it to an appropriate one.

Thanks for the report.

Best regards, Clara.

Component: Untriaged → Performance
Product: Firefox → Core

Looks like others may be hitting this on a different site.

https://bugzilla.mozilla.org/show_bug.cgi?id=1663073

See Also: → 1663073
Attached file memory report.txt —

I was able to take a memory profile (attached).

Attached file about:support —

Here is a copy of about:support. Note that I have WebRender enabled and am running Fission. I'll try disabling each of those to see if there is a difference.

Attached file memory report no-wr.txt —

This is definitely connected to WebRender. When I disable WR, it uses 1/10th the memory and is much, much faster to load. Memory report with WR disabled attached.

Component: Performance → Graphics: WebRender

I can reproduce this on OS X 10.15.6 on a 16" MacBook Pro (AMD Radeon Pro 5300M 4 GB). Firefox went to ~17 GB memory usage within seconds and became unusable.

Unfortunately the memory report isn't very useful, as the high memory usage is unreported: 8,161.31 MB (98.85%) ── heap-unclassified

Bug 1625590 is a previous report we've had with high memory usage with WebRender. That turned out to be font related, and it was a very slow accumulation, so it might not be related.

Marking this as S1 because clicking the link in description made Firefox use 35 GB memory and made my Mac completely unusable.

Severity: -- → S1

This seems like an older bug; I was able to reproduce it with Firefox 70.

Dzmitry, would you mind taking a look?

Flags: needinfo?(dmalyshau)
Severity: S1 → S2
Priority: -- → P2

Didn't realize it was S1. Going to look at it ASAP.

Assignee: nobody → dmalyshau
Status: NEW → ASSIGNED
Flags: needinfo?(dmalyshau)

It looks like huge textures are getting rasterized with Skia (on the CPU side) there. That's one source of slowdowns. However, once a texture is decoded, it gets uploaded to WebRender, and I'm seeing 5500 RGBA8 tiles in the cache (which corresponds to 5.5 GB of VRAM). Uploading this many textures takes a lot of time, and then they may get freed if the texture cache sees that they are not used (e.g. because we picture cached everything). Finally, as the final nail in the coffin, when zooming to the level of the roads, we get an insane number of draw calls (1000 - 1000000, many zeroes). This could be a result of us having so much texture data, in which case each consecutive draw call ends up using a different texture cache slice...
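As a rough check on the 5500-tile figure above, here is a back-of-the-envelope sketch. The 512x512 tile size is an assumption (the comment only gives the tile count and the approximate total); RGBA8 is taken to mean 4 bytes per pixel.

```python
# Assumption: each cached tile is 512x512 pixels. The bug comment does not
# state the tile size, only the tile count and the approximate total.
TILE_SIDE_PX = 512
BYTES_PER_PIXEL = 4  # RGBA8: one byte each for R, G, B and A


def texture_cache_bytes(tile_count, tile_side_px=TILE_SIDE_PX):
    """Bytes needed to hold tile_count square RGBA8 tiles."""
    return tile_count * tile_side_px * tile_side_px * BYTES_PER_PIXEL


total = texture_cache_bytes(5500)
print(f"{total / 2**30:.2f} GiB")  # prints "5.37 GiB"
```

Under that assumption each tile is exactly 1 MiB, so 5500 tiles land in the same ballpark as the 5.5 GB quoted above.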

Previously, we discussed some ideas on making sure we never hit a situation with this many draw calls. For example, we could copy all the texture data in a single texture used for the frame, and keep it hot (i.e. only uploading/updating parts that are needed), at the cost of increased VRAM usage and the cost of copy operations. It's not clear how much extra VRAM that path would take. It's not an easy fix, and we need to know more about the problem before diving into it.

Perhaps, we can get someone closer to the Skia and WR blobs to see what's going on with this side of the problem? I'm currently trying to take a WR capture without blowing my HDD space, will probably follow-up with more info here.

Flags: needinfo?(nical.bugzilla)
Flags: needinfo?(lsalzman)

I finally managed to get a capture of the workload there (feel free to ping me if you want that 11 GB bomb). I see 8K draw calls, caused by the fact that we have 5K textures. These textures mostly correspond to the blob images of the map. It looks like different layers are being rasterized, and each of the layers is mostly empty. The idea that comes to mind is that instead of having the 5K layers mixed by the WR renderer, we could mix them all together on the CPU side before passing them to WR. That would be done on the Gecko side, if I understand correctly.
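To illustrate why thousands of textures lead to thousands of draw calls, here is a deliberately simplified model. It is not WebRender's actual batching logic. It only assumes that consecutive primitives can share a draw call when they sample the same texture; the function name and the texture labels are hypothetical.

```python
from itertools import groupby


def count_draw_calls(texture_per_primitive):
    """Count batches, assuming consecutive primitives can be merged into one
    draw call only when they sample the same texture (simplified model)."""
    return sum(1 for _ in groupby(texture_per_primitive))


# Hypothetical workloads with 8 primitives each.
shared = ["atlas"] * 8                       # everything lives in one texture
scattered = [f"slice{i}" for i in range(8)]  # every primitive has its own slice

print(count_draw_calls(shared), count_draw_calls(scattered))  # prints "1 8"
```

In this model the draw call count grows with the number of distinct textures touched in sequence, which is consistent with seeing draw calls on the same order as the texture count.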

Assignee: dmalyshau → nobody
Status: ASSIGNED → NEW
See Also: → 1650436

We need to implement some heuristics in the blob recording to avoid layerization going nuts like this. We should figure this out as soon as jrmuizel is back.

Blocks: wr-blob-perf

As far as I can tell there's no SVG splitting happening here. There's a large number of layers because there's a large number of SVGs.

Adding svg { background: rgba(255, 0, 0, 0.05) } shows the extent of the sadness.

So as far as I can see there are no easy solutions here. The basic problem is that we have lots of SVG elements and many of them are overlapping. We end up using memory proportional to ~SUM(BOUNDS(item)) over all items, whereas a traditional approach uses ~BOUNDS(UNION(items)). Therefore to solve this we need to share the memory somehow.
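A minimal sketch of the two cost models, using axis-aligned rectangles as stand-ins for item bounds. The `Rect` type, the function names, and the example sizes are all hypothetical and only meant to show how the two quantities diverge when items overlap.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x0: int
    y0: int
    x1: int
    y1: int

    def area(self) -> int:
        return max(0, self.x1 - self.x0) * max(0, self.y1 - self.y0)


def union_bounds(rects):
    """Smallest rectangle that contains every rectangle in rects."""
    return Rect(
        min(r.x0 for r in rects),
        min(r.y0 for r in rects),
        max(r.x1 for r in rects),
        max(r.y1 for r in rects),
    )


def per_item_pixels(rects):
    """~SUM(BOUNDS(item)): every item gets its own surface."""
    return sum(r.area() for r in rects)


def shared_surface_pixels(rects):
    """~BOUNDS(UNION(items)): all items draw into one shared surface."""
    return union_bounds(rects).area()


# Hypothetical scene: 50 SVGs that each cover the same 4096x4096 region.
items = [Rect(0, 0, 4096, 4096)] * 50
print(per_item_pixels(items) // shared_surface_pixels(items))  # prints 50
```

With fully overlapping items the per-item model costs N times the shared-surface model, which is the gap any memory-sharing approach would have to close.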

Possible ideas:

  • group the blobs on the gecko side
  • expose the path items to WebRender so that they can be rasterized directly into the destination
  • group the blobs on the WebRender side
  • coarsely rasterize the blobs to tiles so that we don't need to store the fully transparent ones.
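The last idea in the list can be sketched as follows. This is an illustration only, not how WebRender or Gecko store blobs: the 256-pixel tile size, the 4096x4096 layer, and the one-pixel diagonal "road" are assumptions chosen to show how many tiles would be fully transparent.

```python
TILE = 256  # assumed coarse tile size in pixels


def occupied_tiles(points, tile=TILE):
    """Return the set of (col, row) tiles containing at least one painted pixel."""
    return {(x // tile, y // tile) for x, y in points}


# Hypothetical layer: a one-pixel-wide diagonal road across 4096x4096 pixels.
side = 4096
diagonal = [(i, i) for i in range(side)]

kept = occupied_tiles(diagonal)
total_tiles = (side // TILE) ** 2
print(len(kept), total_tiles)  # prints "16 256"
```

In this toy case only 16 of 256 tiles contain any pixels, so skipping the transparent ones would store about 6% of the layer.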

I've looked at the page more closely and my guess is that this is what is happening:

  • The Zillow map has 256x256 tiles that it presumably loads and unloads
  • Each tile contains an <svg> with a single <path> of the whole path for every path that intersects its bounding box.
  • That means that in this scene every tile contains an <svg> with a <path> for the entire diagonal road
  • The <svg>s all have overflow:visible which means that we end up with a layer the size of the map for every tile that contains the diagonal road.
  • This is what shows up as the huge amount of overdraw when you add path { fill: rgba(0, 255, 0, 0.05); }

Switching all the svgs to overflow:hidden fixes this problem and drastically improves performance.
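A rough model of why the overflow change matters so much, under stated assumptions: 40 tiles contain the road, the road's bounds span 4096x4096 pixels, and each layer is stored as RGBA8. None of these numbers come from the bug, and the function is a hypothetical simplification rather than Gecko's actual layer sizing.

```python
BYTES_PER_PIXEL = 4  # RGBA8


def scene_bytes(tiles_with_road, tile_px, path_bounds_px, overflow):
    """Estimate layer memory for map tiles that all contain the same long path.

    overflow="visible": each tile's layer grows to the full bounds of the path.
    overflow="hidden":  each tile's layer is clipped to the tile itself.
    """
    path_w, path_h = path_bounds_px
    if overflow == "visible":
        width, height = path_w, path_h
    elif overflow == "hidden":
        width, height = min(tile_px, path_w), min(tile_px, path_h)
    else:
        raise ValueError(f"unsupported overflow value: {overflow!r}")
    return tiles_with_road * width * height * BYTES_PER_PIXEL


# Hypothetical numbers: 40 tiles of 256x256 px, road bounds of 4096x4096 px.
visible = scene_bytes(40, 256, (4096, 4096), "visible")
hidden = scene_bytes(40, 256, (4096, 4096), "hidden")
print(visible / 2**20, hidden / 2**20)  # prints "2560.0 10.0" (MiB)
```

Under these assumptions clipping each layer to its tile cuts the estimate by a factor of 256, which matches the direction (though not necessarily the magnitude) of the improvement described above.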

I suggest we reach out to Zillow and see if we can get them to make that fix.

ni? myself for outreach and building a site patch!

Flags: needinfo?(dschubert)
See Also: → 1666771

I have reached out to them. Because this is easy to work around without having them fix their site, we will ship a temporary intervention in bug 1666771.

I'll leave this bug in the Graphics component for now as I don't know if y'all want to further investigate potential rendering improvements, but if this turns out to be a WONTFIX from your point of view, feel free to move this to Web Compatibility::Desktop.

Flags: needinfo?(dschubert)
Whiteboard: [sitewait]
Depends on: 1666771
Flags: needinfo?(nical.bugzilla)
Flags: needinfo?(lsalzman)
Blocks: gfx-83
No longer blocks: gfx-82
Severity: S2 → S4
Priority: P2 → P3
No longer blocks: gfx-83

I'm not able to reproduce the issue on my side, regardless of whether the intervention is enabled.

Tested with:
Browser / Version: Firefox Nightly 89.0a1 (2021-03-31)
Operating System: Windows 10 Pro

Mike Conca, can you still reproduce the issue on your side with the zillow.com intervention disabled from about:compat?

Flags: needinfo?(mconca)

(In reply to Oana Arbuzov [:oanaarbuzov] from comment #23)

Mike Conca can you still reproduce the issue on your side with zillow.com intervention disabled from about:compat

Yes, I can still easily reproduce this on my Windows 10 machine (Nightly 89.0a1 (2021-04-01) (64-bit)). Without the intervention, memory grows rapidly and consumes 10+ GB within a few seconds. This does not happen with the intervention enabled.

Flags: needinfo?(mconca)

Testing this issue on my side, I do get an increase in memory usage when accessing the page with interventions disabled, but not as much as previously reported. The difference between interventions enabled and interventions disabled is about 1-1.5 GB of memory usage.

Tested with
Browser / Version: Firefox Nightly 91.0a1 (2021-06-23) (64-bit)
Operating System: Windows 10 Pro

Whiteboard: [sitewait] → [webcompat:sitepatch-applied]

Using the latest build of Firefox Nightly, when accessing: https://www.zillow.com/castle-rock-co/?searchQueryState=%7B%22pagination%22%3A%7B%7D%2C%22usersSearchTerm%22%3A%22Castle%20Rock%2C%20CO%22%2C%22mapBounds%22%3A%7B%22west%22%3A-104.89747914111518%2C%22east%22%3A-104.88054903781318%2C%22south%22%3A39.392561325065884%2C%22north%22%3A39.4011473867797%7D%2C%22mapZoom%22%3A16%2C%22regionSelection%22%3A%5B%7B%22regionId%22%3A23984%2C%22regionType%22%3A6%7D%5D%2C%22isMapVisible%22%3Atrue%2C%22filterState%22%3A%7B%22ah%22%3A%7B%22value%22%3Atrue%7D%2C%22sort%22%3A%7B%22value%22%3A%22globalrelevanceex%22%7D%7D%2C%22isListVisible%22%3Atrue%7D and navigating on the map using the mouse (scrolling with the mouse wheel, clicking on items displayed, etc) there are no notable differences between Firefox with Interventions enabled or disabled. An increase in CPU usage has been observed using other browsers as well when doing the said action on the page.

Mike, is the originally reported issue still reproducible on your side, or do you have the same results as I did?

Tested with:

Browser / Version: Firefox Nightly 99.0a1 (2022-02-16) (64-bit)
Operating System: Windows 10 Pro x64

Flags: needinfo?(mconca)

I can no longer reproduce this. BUT... since I originally reported this issue, I have upgraded my laptop and now run Windows 11, so I cannot recreate the exact original conditions.

Flags: needinfo?(mconca)

Thank you for the reply. Since the issue is not reproducible on our side using Windows 10, nor on your side using Windows 11, we will be closing this as FIXED. Please feel free to comment here if the issue reappears, or file a new report if there are any new findings regarding this issue.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
See Also: → 1760043
Whiteboard: [webcompat:sitepatch-applied]