Closed Bug 1695016 Opened 3 years ago Closed 3 years ago

Crash with a testcase. Memory and CPU bloat with very simiar testcase involving 2^16 divs

Categories

(Core :: Graphics: WebRender, defect)

defect

Tracking

()

RESOLVED FIXED
91 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox88 --- wontfix
firefox89 --- wontfix
firefox90 --- wontfix
firefox91 --- fixed

People

(Reporter: mayankleoboy1, Assigned: nical)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: regression)

Crash Data

Attachments

(4 files)

STR1 (Crash)

  1. Use hw-wr
  2. Open the crash.html
  3. Crash

STR2 (Bloat)

  1. Use hw-wr
  2. Open the bloat.html
  3. memory bloat, continuous CPU use

Difference between the 2 cases is this:
var divCount = 65536; // causes crash
var divCount = 65534; // causes bloat

Attached file crash.html
Attached file bloat.html
Attached file about:support

Bug 1692250 - Begin refactoring the how pictures refer to their textured content. r=gw

This patch only erases some of the differences between how pictures and other primitves resolve their render tasks. There is a lot more to do there but I quite haven't figured out the incremental next step towards decoupling the picture primitive its content. After this patch we may be close to a good place to start extracting composite modes out into their own primitives.

Differential Revision: https://phabricator.services.mozilla.com/D106142

2021-02-25T23:45:04.194000: DEBUG : Did not find a branch, checking all integration branches
2021-02-25T23:45:04.194000: INFO : The bisection is done.
2021-02-25T23:45:04.194000: INFO : Stopped

Crash Signature: [@ webrender::frame_builder::FrameBuilder::build ]
Regressed by: 1692250
Has Regression Range: --- → yes
Summary: Crash with a testcase → Crash with a testcase. Memory and CPU bloat with very simiar testcase involving 2^16 divs

Idea of this testcase came from the testcase at : https://bug1480964.bmoattachments.org/attachment.cgi?id=9017292
Feel free to INVALID

Keywords: regression
Flags: needinfo?(gwatson)

Nical, looks like this might be a regression from your recent patches?

Flags: needinfo?(gwatson) → needinfo?(nical.bugzilla)

The crash can be avoided by moving back to u32 for for render task ids, but this type of page is kind of malicious so I would rather set some limits so that we don't crash but don't attempt to correctly render 2^16 visible primitives either.

Crash Signature: [@ webrender::frame_builder::FrameBuilder::build ] → [@ webrender::frame_builder::FrameBuilder::build ] [@ webrender::render_task_graph::RenderTaskGraphBuilder::end_frame ]
Severity: -- → S3

I tried adding a concept of dummy render task which is returned when task allocation "fails" (in this case when we detect that we are requesting too many render tasks.), that task points to the first pixel of the dummy texture. In principle it is a nice solution because it transparently returns something that doesn't allocate any GPU memory and should render without crashing, although it won't render properly. However I run into a few issues:

  • The render task graph building code relies on all render tasks being part of the graph. This is a problem if one successfully creates render task A, but the creation of render task B that depends on A fails. In this case A becomes disconnected from the graph and this trips a number of assertions in the code. That's not too bad because It wouldn't be hard for the render task graph to properly cull out disconnected nodes, and I think it should.
  • Because this depends on the primitive traveral order, content requesting too many render tasks can cause render task allocation failures to happen in the chrome. I don't think that this is an acceptable outcome. Unfortunately we can't differentiate during frame building what is web content from what is part of the ui.
See Also: → 1700248
See Also: → 1695814

This is to work around fuzzers repeatedly finding test cases with 2^16 clip masks or other type of render tasks. Not something we would see in real pages nor something that we would handle well at all even with 32 bits ids but at least it won't get in the way of fuzzing.

Assignee: nobody → nical.bugzilla
Status: NEW → ASSIGNED
Pushed by nsilva@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/254bc19a58d7
Go back to 32 bits rendder task ids. r=gw
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 91 Branch
Flags: needinfo?(nical.bugzilla)

The patch landed in nightly and beta is affected.
:nical, is this bug important enough to require an uplift?
If not please set status_beta to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(nical.bugzilla)
Flags: needinfo?(nical.bugzilla)
You need to log in before you can comment on or make changes to this bug.