Closed Bug 1691475 Opened 3 years ago Closed 3 years ago

Shared surfaces created in the compositor process can be freed too early

Categories

(Core :: Graphics: WebRender, defect, P3)

defect

Tracking

()

RESOLVED FIXED
87 Branch
Tracking Status
firefox-esr78 --- wontfix
firefox85 --- wontfix
firefox86 --- fixed
firefox87 --- fixed

People

(Reporter: aosmond, Assigned: aosmond)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

When a shared surface is created and lives in the compositor process, we actually free it differently from shared surfaces created in the content process:

https://searchfox.org/mozilla-central/rev/7067bbd8194f4346ec59d77c33cd88f06763e090/gfx/layers/ipc/SharedSurfacesChild.cpp#426

Since we don't post the release to the compositor thread, we don't have any ordering guarantees with the resource updates that we get when we use IPDL to send the message.

We add a shared surface on the main thread to the shared surfaces table
when in the compositor process because it uses a different wrapper
object if the shared surface is in a different process. This structure
was mirrored for the removal of a shared surface created in the
compositor process, however that created a race between the main thread
freeing the surface before the display list creation bound an image key
to the shared surface's external image ID. As such, this patch makes
removal always post to the compositor thread, whether in the compositor
process or not.

This variant of the crash can only happen if you don't have a GPU process.

Comment on attachment 9201832 [details]
Bug 1691475 - Remove shared surfaces on the compositor thread.

Beta/Release Uplift Approval Request

  • User impact if declined: May crash if using WebRender
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Verifying in nightly is difficult since the crash rate is too low. However we see this crash as an intermittent in CI. I confirmed via logs this was the problem, and cannot reproduce the intermittent with the fix. The fix is pretty small and clearly solves solves a data race.
  • String changes made/needed:
Attachment #9201832 - Flags: approval-mozilla-beta?
Pushed by aosmond@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/64a26b4a51ae
Remove shared surfaces on the compositor thread. r=jrmuizel
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 87 Branch

Comment on attachment 9201832 [details]
Bug 1691475 - Remove shared surfaces on the compositor thread.

We don't have a lot of crashes on desktop but we have more crash volume for this signature on Fenix. Approved for our next 86 beta, thanks.

Attachment #9201832 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Has Regression Range: --- → yes
You need to log in before you can comment on or make changes to this bug.