2.2 - 9.55% Explicit Memory / Heap Unclassified / Resident Memory (linux1804-64-shippable-qr) regression on push 5589923a52c80ed09c8ce383afb0385cabc0ec18 (Tue December 15 2020)
Categories
(Core :: Graphics: WebRender, defect)
Tracking
Release | Tracking | Status
---|---|---
firefox-esr78 | --- | unaffected |
firefox83 | --- | unaffected |
firefox84 | --- | unaffected |
firefox85 | --- | unaffected |
firefox86 | --- | affected |
People
(Reporter: aesanu, Unassigned)
References
(Regression)
Details
(Keywords: perf, perf-alert, regression)
Perfherder has detected an awsy performance regression from push 5589923a52c80ed09c8ce383afb0385cabc0ec18. As the author of one of the patches included in that push, we need your help to address this regression.
Regressions:
Ratio | Suite | Platform | Absolute values (old vs new)
---|---|---|---
10% | Heap Unclassified | linux1804-64-shippable-qr | 257,021,375.60 -> 281,573,267.14
5% | Explicit Memory | linux1804-64-shippable-qr | 517,304,866.81 -> 543,739,422.35
3% | Resident Memory | linux1804-64-shippable-qr | 975,430,047.02 -> 1,004,053,983.23
2% | Explicit Memory | linux1804-64-shippable-qr | 522,660,870.67 -> 534,168,924.52
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) will be backed out in accordance with our regression policy.
For more information on performance sheriffing please see our FAQ.
Reporter
Comment 1•5 years ago
This has been backed out in https://treeherder.mozilla.org/perfherder/alerts?id=28138
Comment 2•5 years ago
If I understand correctly, the patch that caused this regression (pt 2 in that series) was landed, then backed out, and has now re-landed on m-c. So is there another perf alert that shows the heap usage regressing again? (Asking because pt 3 also landed yesterday but was backed out, and I want to make sure I'm investigating the right part of the patch.)
Comment 3•5 years ago
OK, I was able to repro locally and can see what is going on now.
Previously, we tried to allocate the smallest possible set of render targets (in terms of size) that would allow WebRender to draw the render task graph. We would then keep these in a memory pool, and garbage collect them once a size threshold was exceeded.
We try to reuse render targets from the pool on the next frame, because GPU driver allocations of render targets can be expensive and cause frame spikes. The problem with this scheme is that, given the nature of web content, the required render target sizes often change from frame to frame, so we spend a lot of time freeing old targets and allocating new ones from the GPU driver. Even though we're using a render target memory pool, the sizes often don't match, so we still end up seeing a lot of frame spikes due to driver memory allocations.
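To make that concrete, here is a much-simplified hypothetical sketch of the old scheme (not the actual WebRender code; the types and accounting are illustrative): targets are pooled by exact size, so any frame that needs a size not already in the pool pays for a fresh driver allocation.

```rust
use std::collections::HashMap;

/// Opaque handle returned by the GPU driver for a texture allocation.
#[derive(Clone, Copy)]
struct TextureHandle(u32);

fn driver_alloc_texture(w: u32, h: u32) -> TextureHandle {
    let _ = (w, h); // stand-in for an expensive GPU driver allocation
    TextureHandle(0)
}

struct RenderTargetPool {
    /// Free targets, keyed by their exact (width, height).
    free: HashMap<(u32, u32), Vec<TextureHandle>>,
    /// Approximate bytes held by targets sitting in the pool.
    pool_bytes: usize,
    /// Once this threshold is exceeded, unused targets are released.
    max_pool_bytes: usize,
}

impl RenderTargetPool {
    fn acquire(&mut self, w: u32, h: u32) -> TextureHandle {
        if let Some(handle) = self.free.get_mut(&(w, h)).and_then(|v| v.pop()) {
            self.pool_bytes -= (w as usize) * (h as usize) * 4;
            return handle; // exact-size hit: cheap reuse
        }
        // Miss: on real content the required sizes change frame to frame,
        // so this expensive driver allocation happens often.
        driver_alloc_texture(w, h)
    }

    fn release(&mut self, w: u32, h: u32, handle: TextureHandle) {
        self.free.entry((w, h)).or_default().push(handle);
        self.pool_bytes += (w as usize) * (h as usize) * 4; // assume RGBA8
        if self.pool_bytes > self.max_pool_bytes {
            self.garbage_collect();
        }
    }

    fn garbage_collect(&mut self) {
        // Free pooled targets back to the driver until under the threshold.
    }
}
```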
Now, whenever we create a shared render target, we create it at a fixed size (2048 x 2048), and render tasks are allocated within it. Once that 2048 x 2048 target is fully allocated, we allocate another one, and so on (a rough sketch of this scheme follows the list below).
This has a couple of effects:
- In some cases, especially on simple pages, the allocated memory can be higher as we've allocated targets (size 2k) larger than strictly needed.
- We get much better reuse of render targets, so we reduce frame performance spikes because we don't need to alloc/free targets from the GPU driver.
- Although the reported memory use is higher in some cases, testing suggests it is much more stable, and as good or better on complex pages (since we no longer retain targets of unusual sizes in the memory pool until they are garbage collected).
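For illustration, a minimal hypothetical sketch of the new scheme (again, not the real WebRender allocator; the shelf packer here is deliberately naive): every shared target is a fixed 2048 x 2048 texture, task rects are packed into the existing targets, and a new target is allocated only when all existing ones are full. Because every target has the same size, last frame's textures can always be reused this frame.

```rust
/// Opaque driver handle, as in the previous sketch.
#[derive(Clone, Copy)]
struct TextureHandle(u32);

fn driver_alloc_texture(w: u32, h: u32) -> TextureHandle {
    let _ = (w, h); // stand-in for an expensive GPU driver allocation
    TextureHandle(0)
}

const TARGET_SIZE: u32 = 2048;

struct SharedTarget {
    handle: TextureHandle,
    // Naive shelf-packing state: current shelf origin and height.
    shelf_y: u32,
    cursor_x: u32,
    shelf_h: u32,
}

impl SharedTarget {
    /// Try to place a w x h task rect in this target; returns its origin.
    fn try_alloc(&mut self, w: u32, h: u32) -> Option<(u32, u32)> {
        if self.cursor_x + w > TARGET_SIZE {
            // Current shelf is full horizontally; start a new shelf below.
            self.shelf_y += self.shelf_h;
            self.cursor_x = 0;
            self.shelf_h = 0;
        }
        if w > TARGET_SIZE || self.shelf_y + h > TARGET_SIZE {
            return None; // this 2k target is full (or the rect is too big)
        }
        let origin = (self.cursor_x, self.shelf_y);
        self.cursor_x += w;
        self.shelf_h = self.shelf_h.max(h);
        Some(origin)
    }
}

struct TargetAllocator {
    targets: Vec<SharedTarget>,
}

impl TargetAllocator {
    /// Place a task rect, assuming w, h <= TARGET_SIZE.
    /// Returns (target index, origin within that target).
    fn alloc_task(&mut self, w: u32, h: u32) -> (usize, (u32, u32)) {
        for (i, target) in self.targets.iter_mut().enumerate() {
            if let Some(origin) = target.try_alloc(w, h) {
                return (i, origin);
            }
        }
        // All existing 2k targets are full: allocate one more.
        self.targets.push(SharedTarget {
            handle: driver_alloc_texture(TARGET_SIZE, TARGET_SIZE),
            shelf_y: 0,
            cursor_x: 0,
            shelf_h: 0,
        });
        let last = self.targets.len() - 1;
        let origin = self.targets[last].try_alloc(w, h).expect("fits empty target");
        (last, origin)
    }
}
```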
Given those findings, I think it's fine to close this as a WONTFIX. Does that sound reasonable to you?
Comment 4•5 years ago
10% jumps in heap-unclassified always make me nervous that there's something we should be measuring that we currently aren't. Is there a memory probe we need to add here?
Comment 5•5 years ago
Yes - I'm not sure exactly what's involved in doing that in WR, but it does sound like a good idea to ensure those allocations are included.
We don't have access to the allocation or pointer itself (they are allocated by the GPU driver and we have an opaque handle), which might complicate things?
We could calculate a rough estimate of what the GPU driver is going to allocate; maybe that's enough?
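For example, a memory probe along these lines might work. This is a rough hypothetical sketch (the types and function names are illustrative, not a real WebRender API), assuming we record the dimensions and format we requested for each target:

```rust
/// Formats we might request for a render target (illustrative subset).
enum TextureFormat {
    Rgba8,
    Rgba16F,
}

fn bytes_per_pixel(format: &TextureFormat) -> usize {
    match format {
        TextureFormat::Rgba8 => 4,
        TextureFormat::Rgba16F => 8,
    }
}

/// What we know about each target: we hold only an opaque driver handle,
/// but we did choose the dimensions and format at creation time.
struct TargetRecord {
    width: u32,
    height: u32,
    format: TextureFormat,
}

/// Lower-bound estimate of the driver's allocations for our targets.
/// The true figure may be higher (row padding, mipmaps, driver
/// bookkeeping), so this suits a memory probe, not an exact measurement.
fn estimate_gpu_bytes(targets: &[TargetRecord]) -> usize {
    targets
        .iter()
        .map(|t| t.width as usize * t.height as usize * bytes_per_pixel(&t.format))
        .sum()
}
```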
Comment 6•5 years ago
I don't think I'm in a position to weigh in on that, sorry. I'll defer to the judgement of the gfx team :)
Reporter
Comment 7•5 years ago
Removing the needinfo, since the patch that needed investigation has been identified.
Comment 8•5 years ago
jrmuizel, given the details in [1] and the memory improvements in other areas from this patch series [2], are you OK with closing this as WONTFIX?
[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1683323#c3
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1682365#c29
Comment 9•5 years ago
Yep.