Closed Bug 1263200 Opened 8 years ago Closed 8 years ago

crash in mozilla::layers::CompositorBridgeParent::RootLayerTreeId

Categories

(Core :: Panning and Zooming, defect)

48 Branch
Unspecified
macOS
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla48
Tracking Status
firefox47 --- unaffected
firefox48 --- fixed

People

(Reporter: kats, Assigned: kats)

Details

(Keywords: crash, regression, Whiteboard: [gfx-noted])

Attachments

(2 files)

This bug was filed from the Socorro interface and is report bp-ee65187d-687c-4afc-97d1-f2fa42160401.
=============================================================

Saw this while looking around in the topcrash list in 48 crash-stats. jwatt was seeing this crash locally yesterday as well, so it seems like a pre-existing bug that we should fix. Either it's a race condition during startup or (as jwatt was seeing) there was a compositor startup failure and it resulted in this crash.
The earliest one appears to be from 20160324030447, and all the crashes are on OS X. Interestingly I don't see any matching crashes using the CompositorParent signature, only CompositorBridgeParent, and that rename happened very recently. So either this regressed with that change, or shortly after.
This is the regression range for the March 24 nightly:

https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=efe7d026ac641759838dd3897c37892e37e5b244&tochange=6202ade0e6d688ffb67932398e56cfc6fa04ceb3

I see bug 1068674 in that range which might be responsible for compositor initialization failures, Markus, do you think that's a possible cause? For reference the crash that jwatt was seeing was that the compositor pointer in the widget was null, and so when an input event came in and went through the ProcessUntransformedAPZEvent function it crashed with a null deref when trying to call RootLayerTreeId().
Attached file jwatt's pastebin
This is the pastebin jwatt sent me yesterday when he was running into this crash (it includes some local logging of his, but there's some "Failed to initialise Compositor" stuff in there).
Keywords: regression
Whiteboard: [gfx-noted]
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #2)
> I see bug 1068674 in that range which might be responsible for compositor
> initialization failures, Markus, do you think that's a possible cause?

Hmm, it might be, though I don't see how. But nothing else in that range looks like a likely candidate.

Jonathan, can you step through GLContextProviderCGL::CreateForWindow and see where it fails?
Flags: needinfo?(jwatt)
I can't currently. I spent a chunk of time this morning trying to bisect between the last aurora branch point and tip. Every changeset I built worked, including, finally, tip. After that I've not been able to reproduce at all. :/

I also uninstalled/disabled various add-ons, but I was originally seeing this crash while running:

  ./mach mochitest dom/smil/test/test_smilCSSFromTo.xhtml

which runs with a clean profile, so I don't see how that would relate.
Flags: needinfo?(jwatt)
:(

Maybe it's related to what GPU is in use? Or maybe some OS-internal state that affects what kinds of GL contexts it can create for us?
Does it happen when canvas acceleration is off?
There have been 22 occurrences from 9 installations in the past 7 days on Nightly 48.
I think the crash itself can be fixed pretty easily, because we're leaving a dangling mAPZC pointer in the base widget when the compositor creation fails and the compositor is destroyed. As to why the compositor creation fails, I don't know.
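For anyone following along, here is a rough standalone sketch of the failure mode and the fix described above. It is not the Gecko code: Widget, Compositor, ApzController, CreateCompositor, WouldDerefNullCompositor and StaleApzAfterFailedInit are all hypothetical stand-ins, and the real widget setup is more involved. It only illustrates the idea that if compositor creation fails and the compositor is torn down, the APZ pointer has to be cleared too, otherwise a later input event still takes the APZ path and reaches for a compositor that no longer exists.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-ins for illustration; not the real Gecko classes.
struct Compositor {
  std::uint64_t RootLayerTreeId() const { return 1; }
};

struct ApzController {};

class Widget {
 public:
  // Simulates compositor startup. When initSucceeds is false the compositor
  // is destroyed again, as in the failure seen in this bug.
  void CreateCompositor(bool initSucceeds, bool resetApzOnFailure) {
    mCompositor.reset(new Compositor());
    mApz.reset(new ApzController());
    if (!initSucceeds) {
      mCompositor.reset();
      if (resetApzOnFailure) {
        // The fix: do not leave the APZ pointer behind once the
        // compositor is gone.
        mApz.reset();
      }
    }
  }

  // True when an input event would take the APZ path and then call
  // RootLayerTreeId() through a null compositor pointer.
  bool WouldDerefNullCompositor() const {
    return mApz != nullptr && mCompositor == nullptr;
  }

 private:
  std::unique_ptr<Compositor> mCompositor;
  std::unique_ptr<ApzController> mApz;
};

// Fails compositor creation on a fresh widget and reports whether the
// widget is left in the crash-prone state.
bool StaleApzAfterFailedInit(bool resetApzOnFailure) {
  Widget widget;
  widget.CreateCompositor(/* initSucceeds = */ false, resetApzOnFailure);
  return widget.WouldDerefNullCompositor();
}
```

In this sketch, StaleApzAfterFailedInit(false) corresponds to the state before the fix (APZ pointer kept, compositor gone), and StaleApzAfterFailedInit(true) to the state after it.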
Assignee: nobody → bugmail.mozilla
I'm going to assume it's a regression from bug 1068674 since that's the most likely explanation.
Blocks: 1068674
Oh actually it looks like bug 1068674 was uplifted to aurora a few days ago, but there are still no crashes like this on aurora. So maybe it's not caused by that.
No longer blocks: 1068674
Comment on attachment 8741770 [details]
MozReview Request: Bug 1263200 - Reset the APZ pointer in the base widget to null if the compositor creation fails. r?mstange

https://reviewboard.mozilla.org/r/46741/#review43701
Attachment #8741770 - Flags: review?(mstange) → review+
https://hg.mozilla.org/mozilla-central/rev/4a6678c781c9
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
I don't see any more of these in crash-stats after this patch landed.
... all of which seem to be coming from AllocPWebRenderBridgeParent. i.e. the new manifestation of this crash is webrender-specific. We can track it separately, probably in one of the other bugs you filed (I haven't looked at them all yet).
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #18)
> ... all of which seem to be coming from AllocPWebRenderBridgeParent. i.e.
> the new manifestation of this crash is webrender-specific. We can track it
> separately, probably in one of the other bugs you filed (I haven't looked at
> them all yet).

I got this crash signature in the context of bug 1365009 comment 0.
For those following along at home, bug 1365009 comment 7 has my explanation of why this crash signature shows up with webrender sometimes.