Crash in mozilla::layers::ClientLayerManager::DidComposite
Categories
(Core :: Graphics: Layers, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr60 | --- | unaffected |
firefox63 | --- | wontfix |
firefox64 | --- | fixed |
firefox65 | --- | fixed |
People
(Reporter: marcia, Assigned: mstange)
References
Details
(Keywords: crash, regression)
Crash Data
Attachments
(1 file)
47 bytes,
text/x-phabricator-request
|
jcristau
:
approval-mozilla-release+
|
Details | Review |
This bug was filed from the Socorro interface and is report bp-7b1f595b-406e-4433-874a-16c9c0181126. ============================================================= Mac crash at the top of Mac specific 63.0.3 crashes: https://bit.ly/2zqbupe. This crash seems to have started in 63, but wasn't seen in the 63 betas. Comments mention: *Crashed in the middle of banking app Peoples.com *Was attempting to purchase something on Ancestry web site and clicked button for PayPal to checkout. The link opened a page which instantly disappeared as a pop-up presented in the foreground. The field of the pop-up was blank and the indicator suggested that it was loading. Suddenly, the browser crashed. I need to make that purchase before the deadline. Some of the other URLs seem to be people trying to pay bills. Top 10 frames of crashing thread: 0 XUL mozilla::layers::ClientLayerManager::DidComposite gfx/layers/client/ClientLayerManager.cpp:507 1 XUL mozilla::layers::CompositorBridgeChild::RecvDidComposite gfx/layers/ipc/CompositorBridgeChild.cpp:539 2 XUL NS_IsMainThread xpcom/threads/nsThreadManager.cpp:47 3 @0x11eddf447 4 XUL mozilla::layers::PCompositorBridgeChild::OnMessageReceived ipc/ipdl/PCompositorBridgeChild.cpp:1436 5 XUL mozilla::ipc::MessageChannel::DispatchAsyncMessage ipc/glue/MessageChannel.cpp:2248 6 XUL mozilla::ipc::MessageChannel::DispatchMessage ipc/glue/MessageChannel.cpp:2175 7 XUL mozilla::ipc::MessageChannel::MessageTask::Run ipc/glue/MessageChannel.cpp:2045 8 XUL nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1161 9 libsystem_pthread.dylib libsystem_pthread.dylib@0x1c6f =============================================================
Updated•6 years ago
|
Comment 1•6 years ago
|
||
Jesse can you help find someone to investigate this crash? It seems to be particularly bothersome to Mac users.
Comment 2•6 years ago
|
||
Since this seems to have been introduced in 63.0.3 was bug 1503424 seems like the most likely cause.
Updated•6 years ago
|
Updated•6 years ago
|
Assignee | ||
Comment 3•6 years ago
|
||
I can't make heads or tails of this. Bug 1503424 added a null check for something that we thought was only null on macOS 10.10 prerelease versions. So if there had been any regressions from it, I would have expected those regressions to only happen on machines with those 10.10 prereleases, and only for users that would otherwise have crashed on startup. But that's clearly not the case - these new crashes happen on seemingly all versions of macOS. Here's the version breakdown: OS X 10.9 93 OS X 10.10 176 OS X 10.11 67 OS X 10.12 137 OS X 10.13 71 OS X 10.14 69 There's a disproportionately large number of crashes on 10.10. So there clearly is *some* correlation with 10.10 but I don't understand what it could be. As for the actual crash in this bug, it's here: > if (aTransactionId.IsValid()) { > nsIWidgetListener *listener = mWidget->GetWidgetListener(); > if (listener) { > listener->DidCompositeWindow(aTransactionId, aCompositeStart, aCompositeEnd); > } > listener = mWidget->GetAttachedWidgetListener(); mWidget seems to be null. But it seems to not have been null during the call to mWidget->GetWidgetListener() a few lines further up, otherwise we should have crashed there. So maybe the call to listener->DidCompositeWindow is causing mWidget to be nulled out? But if that's the case, how is that related to the null check in the initialization method of ChildView that bug 1503424 added?
Assignee | ||
Comment 4•6 years ago
|
||
The code quoted above got compiled to the following (I'm showing Hopper's decompiled code for it):
> if (r14 != 0x0) {
> rdi = *(r13 + 0xb8);
> rax = *rdi;
> rax = (*(rax + 0x58))(rdi);
> if (rax != 0x0) {
> (*(*rax + 0x90))(rax, r14, r12, r15);
> }
> rdi = *(r13 + 0xb8);
> rax = *rdi; // <-- crash happens here
> ...
r15 is the transaction ID, r13 is the this pointer to the ClientLayerManager instance, rdi is the pointer to the widget, rax is the pointer to the widget's vtable (and briefly the pointer to the widget listener). So this confirms the suspicion that the call to listener->DidCompositeWindow nulls out mWidget; the second time we try to get the widget's vtable, we dereference a null widget pointer.
Assignee | ||
Comment 5•6 years ago
|
||
listener->DidCompositeWindow calls nsView::DidCompositeWindow, which puts a script blocker on the stack and then calls rootContext->NotifyDidPaintForSubtree, which puts DelayedFireDOMPaintEvents into the script blocker. So then those can execute JS when the script blocker is popped off the stack at the end of nsView::DidCompositeWindow. And if the JS handlers close windows, I assume this would destroy the widget and null it out on the ClientLayerManager. I still can't see a relation to 1503424 though, and I don't see anything in the 63.0.1 -> 63.0.3 diff that indicates that it might change when windows are closed or how MozAfterPaint events are processed.
Reporter | ||
Comment 6•6 years ago
|
||
During triage we discussed the fact that there were a set of 52 crashes in 62.0.3: https://bit.ly/2E4gUdr. However, those crashes don't seem to be the same kind of URLs that are in the ones in 63, and they are not skewed to 10.10.
Assignee | ||
Comment 7•6 years ago
|
||
Pushed by mstange@themasta.com: https://hg.mozilla.org/integration/autoland/rev/daf1ffb72989 Work around mWidget becoming null during DidComposite, and add a diagnostic assert that will hopefully root out the culprit. r=jrmuizel
Comment 9•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/daf1ffb72989
Comment 10•5 years ago
|
||
Crash volume for 63.0.3 looks high enough that we may want to keep this on the radar for possible Fx64 dot release ride-along status.
Assignee | ||
Comment 11•5 years ago
|
||
Comment on attachment 9028453 [details] Bug 1510058 - Work around mWidget becoming null during DidComposite, and add a diagnostic assert that will hopefully root out the culprit. r?jrmuizel [Beta/Release Uplift Approval Request] Feature/Bug causing the regression: None User impact if declined: Crashes on macOS, possibly when popup windows close. No steps to reproduce unfortunately, and no clearly identified regressor. Is this code covered by automated tests?: No Has the fix been verified in Nightly?: No Needs manual test from QE?: No If yes, steps to reproduce: List of other uplifts needed: None Risk to taking this patch: Low Why is the change risky/not risky? (and alternatives if risky): The patch just adds an additional null check. String changes made/needed:
Comment 12•5 years ago
|
||
Comment on attachment 9028453 [details] Bug 1510058 - Work around mWidget becoming null during DidComposite, and add a diagnostic assert that will hopefully root out the culprit. r?jrmuizel This is already on Beta (65).
Comment 13•5 years ago
|
||
Comment on attachment 9028453 [details]
Bug 1510058 - Work around mWidget becoming null during DidComposite, and add a diagnostic assert that will hopefully root out the culprit. r?jrmuizel
avoiding a null ptr crash on macos; approved for 64.0.2
Comment 14•5 years ago
|
||
bugherder uplift |
Comment 15•5 years ago
|
||
After 64.0.2 first days crash problem resolved but today affected again with MacOs Mojave 10.14.2
Assignee | ||
Comment 16•5 years ago
|
||
(In reply to YavuzGm from comment #15)
After 64.0.2 first days crash problem resolved but today affected again with MacOs Mojave 10.14.2
What's the crash ID of your crash? I think it might have been bug 1515183 and not this bug.
Looking at the remaining crashes in 64.0.2, there's only been 4 in ClientLayerManager::DidComposite, and they look very different and are likely a separate bug. So I think this fix was successful!
Comment 17•5 years ago
|
||
(In reply to Markus Stange [:mstange] from comment #16)
(In reply to YavuzGm from comment #15)
After 64.0.2 first days crash problem resolved but today affected again with MacOs Mojave 10.14.2
What's the crash ID of your crash? I think it might have been bug 1515183 and not this bug.
Looking at the remaining crashes in 64.0.2, there's only been 4 in ClientLayerManager::DidComposite, and they look very different and are likely a separate bug. So I think this fix was successful!
Hi, thanks for the answer.I add my crash report link below.I hope this will help you.
https://crash-stats.mozilla.com/report/index/229d024e-c9de-4c11-bf5f-98c5e0190125
Description
•