Closed Bug 1683323 Opened 5 years ago Closed 5 years ago

2.2 - 9.55% Explicit Memory / Heap Unclassified / Resident Memory (linux1804-64-shippable-qr) regression on push 5589923a52c80ed09c8ce383afb0385cabc0ec18 (Tue December 15 2020)

Tracking

()

Status:

RESOLVED WONTFIX

Tracking Flags:

Tracking

Status

firefox-esr78

---

unaffected

firefox83

---

unaffected

firefox84

---

unaffected

firefox85

---

unaffected

firefox86

---

affected

People

(Reporter: aesanu, Unassigned)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression)

Andra Esanu (needinfo me)

Reporter

Description

•

5 years ago

Perfherder has detected a awsy performance regression from push 5589923a52c80ed09c8ce383afb0385cabc0ec18. As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio	Suite	Platform	Absolute values (old vs new)
10%	Heap Unclassified	linux1804-64-shippable-qr	257,021,375.60 -> 281,573,267.14
5%	Explicit Memory	linux1804-64-shippable-qr	517,304,866.81 -> 543,739,422.35
3%	Resident Memory	linux1804-64-shippable-qr	975,430,047.02 -> 1,004,053,983.23
2%	Explicit Memory	linux1804-64-shippable-qr	522,660,870.67 -> 534,168,924.52

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) will be backed out in accordance with our regression policy.

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(gwatson)

Andra Esanu (needinfo me)

Reporter

Comment 1

•

5 years ago

This has been backed out in https://treeherder.mozilla.org/perfherder/alerts?id=28138

Glenn Watson [:gw]|PTO until Sept 15

Comment 2

•

5 years ago

If I understand correctly, the patch that caused this regression (pt 2 in that series) was landed, then backed out, and has re-landed on m-c now. So is there another perf alert that shows the heap usage regressing again? (asking because pt 3 also landed yesterday but was backed out, and I want to make sure I've investigating the right part of the patch).

Flags: needinfo?(gwatson) → needinfo?(aesanu)

Glenn Watson [:gw]|PTO until Sept 15

Comment 3

•

5 years ago

OK, I was able to repro locally and can see what is going on now.

Previously, what we try and do is to allocate the smallest possible set of render targets (in terms of size) that would allow webrender to draw the render task graph. Then, we'd try to keep these in a memory pool until a threshold is exceeded and garbage collect them.

We try to reuse render targets from the pool next frame, because GPU driver allocations of render targets can be expensive and cause frame spikes. The problem with this scheme is that the nature of web content means that the required sizes of render targets often change from frame to frame, and so we spend a lot of time freeing old render targets, and allocating new targets from the GPU driver. So, even though we're using a render target memory pool, because the sizes often don't match, we still end up seeing a lot of frame spikes due to driver memory allocations.

Now, whenever we create a shared render target, we create it at a fixed size (2048 x 2048), and that is used for allocating render tasks. Once that 2k target is fully allocated, we then allocate another 2k x 2k target, and so on.

This has a couple of effects:

In some cases, especially on simple pages, the allocated memory can be higher as we've allocated targets (size 2k) larger than strictly needed.

We get much better reuse of render targets, so we reduce frame performance spikes because we don't need to alloc/free targets from the GPU driver.
Although the reported memory use in some cases is higher, from some testing it seems to be much more stable and as good or better on complex pages (since we don't have targets of unusual sizes retained in the memory pool until they are garbage collected).

Given those findings, I think it's fine to close this as a WONTFIX. Does that sound reasonable to you?

Ryan VanderMeulen [:RyanVM]

Comment 4

•

5 years ago

10% jumps in heap-unclassified always make me nervous that there's something we should be measuring that we currently aren't. Is there a memory probe we need to add here?

status-firefox83: --- → unaffected

status-firefox84: --- → unaffected

status-firefox85: --- → unaffected

status-firefox86: --- → affected

status-firefox-esr78: --- → unaffected

Flags: needinfo?(gwatson)

Glenn Watson [:gw]|PTO until Sept 15

Comment 5

•

5 years ago

Yes - I'm not sure exactly what's involved in doing that in WR, but it does sound like a good idea to ensure those allocations are included.

We don't have access to the allocation or pointer itself (they are allocated by the GPU driver and we have an opaque handle), which might complicate things?

We can calculate a rough estimate of what the GPU driver is going to allocate, maybe that's enough?

Flags: needinfo?(gwatson) → needinfo?(ryanvm)

Ryan VanderMeulen [:RyanVM]

Comment 6

•

5 years ago

I don't think I'm in a position to weigh in on that, sorry. I'll defer to the judgement of the gfx team :)

Flags: needinfo?(ryanvm)

Andra Esanu (needinfo me)

Reporter

Comment 7

•

5 years ago

Removing needinfo since the patch that needs investigation was detected

Flags: needinfo?(aesanu)

Glenn Watson [:gw]|PTO until Sept 15

Comment 8

•

5 years ago

jrmuizel, given the details in [1] and the memory improvements in other areas from this patch series [2], are you OK with closing this as WONTFIX?

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1683323#c3,
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1682365#c29

Flags: needinfo?(jmuizelaar)

Jeff Muizelaar [:jrmuizel]

Comment 9

•

5 years ago

Yep.

Status: NEW → RESOLVED

Closed: 5 years ago

Flags: needinfo?(jmuizelaar)

Resolution: --- → WONTFIX

BMO Automation

Updated

•

5 years ago

Has Regression Range: --- → yes

You need to log in before you can comment on or make changes to this bug.

Bugzilla

2.2 - 9.55% Explicit Memory / Heap Unclassified / Resident Memory (linux1804-64-shippable-qr) regression on push 5589923a52c80ed09c8ce383afb0385cabc0ec18 (Tue December 15 2020)

Categories

(Core :: Graphics: WebRender, defect)

Tracking

()

People

(Reporter: aesanu, Unassigned)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression)

Crash Data

Security

(public)

User Story

Description

Regressions:

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Updated