Closed Bug 1642072 Opened 4 years ago Closed 4 years ago

Memory usage continues to increase, reaching 100% of physical memory

Categories

(Core :: Graphics: WebRender, defect)

78 Branch
defect

Tracking

()

VERIFIED FIXED
mozilla79
Tracking Status
firefox-esr68 --- unaffected
firefox76 --- unaffected
firefox77 --- unaffected
firefox78 --- verified
firefox79 --- verified

People

(Reporter: alice0775, Assigned: cbrewster)

References

(Blocks 1 open bug, Regression, )

Details

(5 keywords)

Attachments

(4 files)

Memory usage continues to increase, reaching 100% of physical memory.
Disabled WebRender will fix the issue.

Steps to Reproduce:

  1. Open https://keithclark.co.uk/labs/css-fps/nojs/
  2. Playback for about 20 sec
  3. Observe Windows Task Manager

Actual results:
Memory usage continues to increase, reaching 100% of physical memory.
Switch other tab, then memory usage will go down.

Expected Results:
Memory usage goes back and forth between 500MB and 2GB

Regression window:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=dd35edffc6dffb7ae6a4050b955c6e7855cdffcd&tochange=2383139c85c0a60504a871530d099f96eff9e59d

Attached file memory-report.json.gz

KDE, X11, Debian Testing, Macbook Pro
Nightly crashed without any report on this page. I did not see an increase in memory usage.

Flags: needinfo?(connorbrewster)

Only my main profile crashes. Launching Nightly with mozregression causes my desktop to lock up when memory usage is about 4 of 16 GB. With "last good", this page just stutters as usual, but is otherwise fine.

Blocks: wr-stability
Keywords: crash, hang
OS: Windows 10 → All
Hardware: Desktop → All

This test case doesn't work well with our current render target pool approach. We end up creating a very large number of render targets because the requested render target size changes frequently. With my patch, these render targets end up being even larger due to the additional scaling. Most of the scale factors are between 1 and ~1.7. This scaling factor likely causes many of the render targets in the pool to be larger than before and we start hitting some issues with memory usage. Additionally, we don't evict render targets from the pool unless they have been unused for 60 frames. For me the page becomes extremely slow and eventually locks up before we even hit 60 frames.

I think we should look at ways to improve the render target pool allocation strategy for cases like this. Right now a texture can only be reused from the pool if it matches the exact size and format that is requested. In this test case, the requested size is frequently changing due to the perspective transforms and animation.
One possible solution would be to reuse smaller regions of larger targets, this is mentioned in a comment and we would need to make sure overly large targets don't live indefinitely.
Additionally, we may want to adjust the render target GC strategy to better handle cases like this. I would be interested to find out where the 60 frame threshold came from and if there are any major negatives to reducing that to a smaller number.

I did some experimentation with relaxing the render target reuse check to allow the reuse of render targets that are larger than the requested size.
Unfortunately, this optimization only helps when requested render target sizes are successively getting smaller from frame to frame. This particular test case has sections during its animation where requested render target sizes continually increase, which means this optimization is not helpful here.

See Also: → 1642549

I was able to fix this by adding a memory "red line" to the render target GC logic. The current GC strategy removes render targets until the pool is under a specified memory threshold; however, any render targets that have been used within the past 60 frames are kept in the pool. The red line provides another memory threshold, but it enforces that the render target pool size is smaller than the red line after the GC runs regardless of whether or not a target has been recently used. Its not 100% clear what the red line value should be at, I set it at 10x the normal memory threshold and that fixed the memory issue here.

test patch: https://hg.mozilla.org/try/rev/f0fd5cdc19afbf5c09adb7516d11315a872e040e

Glenn, it looks like you worked on the render target GC most recently, does this make sense?

Flags: needinfo?(connorbrewster) → needinfo?(gwatson)

For reference, here is what the test case looks like before the regression with render target debug enabled. Usually only 4-6 of the render targets are used per frame for this test case, so there is no reason for the pool to be this large.

We can probably do something a bit more advanced in future based on your suggestions above, but your test patch sounds like a good change for now.

It ensures we don't end up in memory exhaustion situations which is a better place to be, for sure!

Flags: needinfo?(gwatson)
Severity: -- → S2
Flags: needinfo?(jbonisteel)
Assignee: nobody → connorbrewster
Status: NEW → ASSIGNED
Flags: needinfo?(jbonisteel)
Pushed by dluca@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6fbc33061f0d
Add a red line threshold to the render target GC to prevent memory exhaustion r=gw
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla79

Verified fixed on Linux and Win10. Thanks!

Blocks: wr-79
No longer blocks: gfx-triage
Status: RESOLVED → VERIFIED

Can this be uplifted to Beta?

Flags: needinfo?(connorbrewster)

Comment on attachment 9153851 [details]
Bug 1642072: Add a red line threshold to the render target GC to prevent memory exhaustion r=gw

Beta/Release Uplift Approval Request

  • User impact if declined: Memory exhaustion on https://keithclark.co.uk/labs/css-fps/nojs/ which can cause Firefox to crash or cause the computer to lock up.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce: Go to https://keithclark.co.uk/labs/css-fps/nojs/, let the animation run for 20-30 seconds and ensure that Firefox does not have excessive memory usage (user reported 100% of physical memory used).
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): This is a small patch which adds an excessive memory check to ensure we don't hit memory exhaustion by holding on to too many render targets. Easy to back out if needed.
  • String changes made/needed: none
Flags: needinfo?(connorbrewster)
Attachment #9153851 - Flags: approval-mozilla-beta?
Flags: qe-verify+

Comment on attachment 9153851 [details]
Bug 1642072: Add a red line threshold to the render target GC to prevent memory exhaustion r=gw

approved for 78.0b5

Attachment #9153851 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
QA Whiteboard: [qa-triaged]

This issue is verified as fixed in our latest Beta Build 78.0b5 on windows 10.

QA Whiteboard: [qa-triaged]
Flags: qe-verify+
Has Regression Range: --- → yes
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: