Closed Bug 1431813 Opened 2 years ago Closed 2 years ago

Crash in OOM | large | mozalloc_abort | mozalloc_handle_oom | moz_xmalloc | std::_Allocate | std::vector<T>::_Resize<T>

Categories

(Core :: Graphics: Layers, defect, P3, critical)

Unspecified
Windows 10
defect

Tracking

()

RESOLVED FIXED
mozilla60
Tracking Status
firefox-esr52 --- unaffected
firefox58 --- wontfix
firefox59 --- fixed
firefox60 --- fixed

People

(Reporter: marcia, Assigned: rhunt)

References

Details

(Keywords: crash, regression, Whiteboard: [gfx-noted])

Crash Data

Attachments

(3 files, 2 obsolete files)

This bug was filed from the Socorro interface and is
report bp-bc4351d7-ee9e-4bfc-a143-3b0b10180118.
=============================================================

Seen while looking at nightly crash stats, and also present in 58 up to the RC: http://bit.ly/2DOeOeB. The crash reason for all is EXCEPTION_BREAKPOINT.

Bug 1395478 touched code in this area. ni on :dvander for any ideas.

Top 10 frames of crashing thread:

0 mozglue.dll mozalloc_abort memory/mozalloc/mozalloc_abort.cpp:33
1 mozglue.dll mozalloc_handle_oom memory/mozalloc/mozalloc_oom.cpp:54
2 mozglue.dll moz_xmalloc memory/mozalloc/mozalloc.cpp:72
3 xul.dll std::_Allocate vs2017_15.4.2/VC/include/xmemory0:78
4 xul.dll std::vector<unsigned char, std::allocator<unsigned char> >::_Resize<<lambda_2b51424039c320f102fd798e073c89b2> > vs2017_15.4.2/VC/include/vector:1491
5 xul.dll mozilla::gfx::CaptureCommandList::Append<mozilla::gfx::PushClipCommand> gfx/2d/CaptureCommandList.h:36
6 xul.dll mozilla::gfx::DrawTargetCaptureImpl::SetPermitSubpixelAA gfx/2d/DrawTargetCapture.cpp:122
7 xul.dll gfxFont::Draw gfx/thebes/gfxFont.cpp:2269
8 xul.dll gfxTextRun::DrawGlyphs gfx/thebes/gfxTextRun.cpp:438
9 xul.dll gfxTextRun::Draw gfx/thebes/gfxTextRun.cpp:702

=============================================================
Flags: needinfo?(dvander)
Flags: needinfo?(dvander) → needinfo?(rhunt)
Priority: -- → P3
Whiteboard: [gfx-noted]
Interesting, I'm seeing some quite high requests for memory from draw target captures (287.57 MiB [1], 21.77 MiB, 358 KiB). This seems unexpected to me as each DrawCommand should be small and we shouldn't be queuing thousands of them.

Bug 1433527 got me thinking that we might have some painting code (like CSS gradients) that uses SetTransform poorly. As a first approach at trying to reduce the memory used here I wrote a patch that eliminates redundant SetTransform and SetPermitSubpixelAA calls and logged CaptureList capacities at destruction to compare.

As a test I opened arstechnica.com and theverge.com and scrolled down and up on both of them slowly.

The amount of redundant SetTransform calls over the run was 500130, and redundant SetPermitSubpixelAA calls were about half that. Average capture size dropped from 10940.51 bytes to 6699.89 bytes. Glancing at histograms I see a drop in some of the outliers as well.

There might be more work to do with this, but this seems like a good start.

[1] https://crash-stats.mozilla.com/report/index/64b10cf3-203a-4f2f-b2c0-5134a0180126
[2] https://crash-stats.mozilla.com/report/index/110fb6c0-f01c-42cc-9b1c-e3a1b0180126
[3] https://crash-stats.mozilla.com/report/index/177824e9-2f5f-4ee3-8d4a-f08cc0180125
Assignee: nobody → rhunt
Flags: needinfo?(rhunt)
Attached patch capture-memory.patch (obsolete) — Splinter Review
Attachment #8945890 - Flags: review?(bas)
Attachment #8945890 - Flags: review?(bas) → review+
Pushed by rhunt@eqrion.net:
https://hg.mozilla.org/integration/mozilla-inbound/rev/bdd8a8fc3912
Try and reduce the size of draw target captures. (bug 1431813, r=bas)
The problem is that we can't remove SetTransform calls when the DrawTarget has had a PushLayer, PopLayer, PushClip, PopClip since the last SetTransform as they (on Skia) do a mCanvas->Save() and mCanvas->Restore(). 

I believe D2D is the same way, and tried to hit all the places we need to block in a new patch.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=5d0a5b4b5bc277f1cd5381f070c5fba2cc6aec26
Flags: needinfo?(rhunt)
Attached patch capture-memory.patch (obsolete) — Splinter Review
Attachment #8945890 - Attachment is obsolete: true
Attachment #8947174 - Flags: review?(bas)
Comment on attachment 8947174 [details] [diff] [review]
capture-memory.patch

Review of attachment 8947174 [details] [diff] [review]:
-----------------------------------------------------------------

Ewl.
Attachment #8947174 - Flags: review?(bas) → review+
SkCanvas::restore will remove any changes to it's matrix since the previous save, but we don't want that to happen when we PopClip or PopLayer.
Attachment #8947886 - Flags: review?(lsalzman)
Attachment #8947174 - Attachment is obsolete: true
Attachment #8947887 - Flags: review?(bas)
From running `./mach awsy-test --entities=10 --iterations=1 --per-tab-pause=1 --settle-wait-time=1` once:

Average CaptureCommandList vector capacity at destruction:
Before: 14084.1487
After: 6171.91133
%Change: 56.17% smaller
Attachment #8947886 - Flags: review?(lsalzman) → review+
Attachment #8947887 - Flags: review?(bas) → review+
Comment on attachment 8947889 [details] [diff] [review]
Reuse existing DrawCommand when pushing sequential SetTransform and SetPermitSubpixelAA

Review of attachment 8947889 [details] [diff] [review]:
-----------------------------------------------------------------

Wow, that copy made for a weird looking diff. I think I understand it now though, looks good.
Attachment #8947889 - Flags: review?(bas) → review+
Pushed by rhunt@eqrion.net:
https://hg.mozilla.org/integration/mozilla-inbound/rev/4ef7429cf9fd
Be sure to preserve the current transform when restoring an SkCanvas (bug 1431813, r=lsalzman)
https://hg.mozilla.org/integration/mozilla-inbound/rev/0003691542e9
Deduplicate unneeded SetTransform and SetPermitSubpixelAA calls from DrawTargetCapture (bug 1431813, r=bas)
https://hg.mozilla.org/integration/mozilla-inbound/rev/1856dfdba622
Reuse existing DrawingCommand for SetTransform and SetPermitSubpixelAA (bug 1431813, r=bas)
Is this something we should consider backporting to Beta for 59 or can it ride the 60 train?
Comment on attachment 8947886 [details] [diff] [review]
Be sure to preserve the current transform when restoring an SkCanvas

Yes we think this is worth backporting to beta.

Approval Request Comment
[Feature/Bug causing the regression]: OMTP capture lists can become quite large and cause OOM's
[User impact if declined]: OOM's could be more likely on certain paint heavy pages
[Is this code covered by automated tests?]: No
[Has the fix been verified in Nightly?]: Kind of. There have been no crashes with this signature since it landed in nightly, but these patches don't completely eliminate the possibility of an OOM, they just significantly reduce memory needed for OMTP capture lists reducing the likelyhood of an OOM.
[Needs manual test from QE? If yes, steps to reproduce]: No
[List of other uplifts needed for the feature/fix]: The other patches in this bug
[Is the change risky?]: Not really
[Why is the change risky/not risky?]: The patches just deduplicate some commands that are large and commonly used
[String changes made/needed]: None
Flags: needinfo?(rhunt)
Attachment #8947886 - Flags: approval-mozilla-beta?
Comment on attachment 8947886 [details] [diff] [review]
Be sure to preserve the current transform when restoring an SkCanvas

Low-risk fixes to avoid OOM situations when OMTP is enabled. Taking for 59b8.
Attachment #8947886 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment on attachment 8947887 [details] [diff] [review]
Deduplicate identical state changes in DrawTargetCapture

Low-risk fixes to avoid OOM situations when OMTP is enabled. Taking for 59b8.
Attachment #8947887 - Flags: approval-mozilla-beta+
Attachment #8947889 - Flags: approval-mozilla-beta+
See Also: → 1455930
You need to log in before you can comment on or make changes to this bug.