Crash in mozilla::layers::APZCTreeManagerParent::RecvUpdateZoomConstraints
Categories
(Core :: Panning and Zooming, defect, P3)
Tracking
()
Tracking | Status | |
---|---|---|
thunderbird_esr102 | --- | fixed |
firefox-esr60 | --- | wontfix |
firefox-esr102 | 110+ | fixed |
firefox62 | --- | wontfix |
firefox63 | --- | wontfix |
firefox64 | --- | wontfix |
firefox65 | --- | wontfix |
firefox66 | --- | wontfix |
firefox68 | --- | wontfix |
firefox69 | --- | wontfix |
firefox70 | --- | wontfix |
firefox108 | --- | wontfix |
firefox109 | --- | wontfix |
firefox110 | --- | fixed |
People
(Reporter: marcia, Assigned: dlrobertson)
References
Details
(Keywords: crash, regression, Whiteboard: [tbird crash])
Crash Data
Attachments
(2 files)
48 bytes,
text/x-phabricator-request
|
RyanVM
:
approval-mozilla-esr102+
|
Details | Review |
48 bytes,
text/x-phabricator-request
|
RyanVM
:
approval-mozilla-esr102+
|
Details | Review |
This bug was filed from the Socorro interface and is report bp-7314e0a6-2bd8-4eb4-bab0-ad0190180904. ============================================================= Seen while looking at nightly crash stats: https://bit.ly/2MTAkqk. Cross Platform crash which is more visible in 63 - and a few crashes in Linux in the latest 64 build. Around 22 crashes in the last 7 days (63/64). No comments and - the 2 URLs that come up are a Slack message and moz-extension://65a29e7a-8dda-4a4c-a11e-2d4dd86e7d79/data/slider/slider.html Top 10 frames of crashing thread: 0 xul.dll mozilla::layers::APZCTreeManagerParent::RecvUpdateZoomConstraints gfx/layers/ipc/APZCTreeManagerParent.cpp:128 1 xul.dll mozilla::layers::PAPZCTreeManagerParent::OnMessageReceived ipc/ipdl/PAPZCTreeManagerParent.cpp:326 2 xul.dll void mozilla::ipc::MessageChannel::DispatchMessageW ipc/glue/MessageChannel.cpp:2166 3 xul.dll mozilla::ipc::MessageChannel::MessageTask::Run ipc/glue/MessageChannel.cpp:2045 4 xul.dll bool MessageLoop::DeferOrRunPendingTask ipc/chromium/src/base/message_loop.cc:459 5 xul.dll MessageLoop::DoWork ipc/chromium/src/base/message_loop.cc:534 6 xul.dll base::MessagePumpForUI::DoRunLoop ipc/chromium/src/base/message_pump_win.cc:210 7 xul.dll base::MessagePumpWin::Run ipc/chromium/src/base/message_pump_win.h:80 8 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:318 9 xul.dll base::Thread::ThreadMain ipc/chromium/src/base/thread.cc:181 =============================================================
Updated•6 years ago
|
Comment 1•6 years ago
|
||
The crash reports suggest that mTreeManager is null in APZCTreeManagerParent::RecvUpdateZoomConstraints(). Based on code inspection, some possible ways this could occur are: 1) APZCTreeManagerParent was given a null APZCTreeManager in the constructor or in ChildAdopted(). This would cause any subsequent RecvXXX() call on the APZCTreeManagerParent to crash, so RecvUpdateZoomConstraints() would have to be the first such call, which is possible given that this is a startup crash. 2) The APZCTreeManagerParent object itself has been destroyed and its fields overwritten (as far as I can tell, the destructor itself doesn't clear mTreeManager) at the time of the RecvUpdateZoomConstraints() call (i.e., a use-after-free). Neither of these should happen, but I haven't been able to rule out either by inspection. #1 seems more likely than #2 (in particular, if it were #2, I think we'd see some crashes at addresses other than 0x0). We could be defensive and add a check for a null mApzcTreeManager in release builds (we already assert in the places where the field is set in debug builds), but we could be covering up a deeper problem. Ryan, are you aware of any IPC-related changes in the past couple of months that may have caused a spike in this crash?
Comment 2•6 years ago
|
||
Unfortunately I can't think of any general changes. I also don't think specifically the PAPZCTreeManager code has been touched in a while.
Reporter | ||
Comment 3•6 years ago
|
||
17 crashes/20 installs across all platforms in the last 7 days. No real useful comments or URLs. One URL appears to be a slack message and the other is a link to amazon music album.
Updated•6 years ago
|
Comment 4•6 years ago
|
||
Marking fix-optional for 64. We could still take a patch for 65, and if it's verified and doesn't seem risky, could still take fixes for 64 as well.
Updated•6 years ago
|
Comment 5•5 years ago
|
||
I started Nightly (on Linux) this morning, checked for updates, an update was found and downloaded, I closed the browser and started it again. It crashed after applying the update when the UI was starting to show up, the crash ID is 462236b9-5f74-4fd6-8fba-c57740181223. My comment in the crash report might say something different about when it crashed, just ignore that. I looked into the crash report and found this bug as related. Note that I have "privacy.resistFingerprinting=true", so that the window is resized during startup. Start page is just about:blank. If you have further questions (e.g. regarding my setup), I would be happy to answer them.
Comment 6•5 years ago
|
||
Marking fix-optional for 65 and 66 so that these already triaged issues don't show up repeatedly in weekly regression triage. Happy to take a patch in nightly.
Updated•5 years ago
|
Comment hidden (Intermittent Failures Robot) |
Updated•2 years ago
|
Comment 8•2 years ago
|
||
It looks like Florian figured out a way to reproduce this crash reliably. See bug 1742797 comment 85.
Comment 9•2 years ago
|
||
Those steps to reproduce worked, but as soon as another browser chrome test was added (less than a day after that bug landed) they no longer reproduced. The new test must have changed the chunking or order or something about the tests. Still possible to debug by pushing to try on a month+ old tip though.
Updated•2 years ago
|
Comment 10•2 years ago
|
||
FWIW the release of Thunderbird 102 seems to have resulted in an increase in the frequency of this crash.
Assignee | ||
Comment 11•1 year ago
|
||
Did some debugging and found that we're getting a null mApzcTreeManager
in the call to CompositorBridgeParent::AllocateAPZCTreeManagerParent in ContentCompositorBridgeParent::AllocPAPZCTreeManagerParent. I haven't verified this, but I'm guessing this occurs when CompositorBridgeParent::StopAndClearResources sets the mApzcTreeManager
to null here. Perhaps we should enter this block if mPaused is set or mCanSend
has been set by CompositorBridgeParent::ActorDestroy or CompositorBridgeParent::RecvWillClose.
Assignee | ||
Comment 12•1 year ago
|
||
(In reply to Dan Robertson (:dlrobertson) from comment #11)
Did some debugging and found that we're getting a null
mApzcTreeManager
in the call to CompositorBridgeParent::AllocateAPZCTreeManagerParent in ContentCompositorBridgeParent::AllocPAPZCTreeManagerParent. I haven't verified this, but I'm guessing this occurs when CompositorBridgeParent::StopAndClearResources sets themApzcTreeManager
to null here. Perhaps we should enter this block if mPaused is set ormCanSend
has been set by CompositorBridgeParent::ActorDestroy or CompositorBridgeParent::RecvWillClose.
This theory appears to be incorrect, as I'd expect the assert to be different in this try run
Assignee | ||
Comment 13•1 year ago
|
||
Fix static-analysis warnings in ContentCompositorBridgeParent.
Assignee | ||
Comment 14•1 year ago
|
||
Depends on D164682
Comment 15•1 year ago
|
||
Pushed by drobertson@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/7fc8e79f2ff0 Fix static-analysis warnings. r=botond https://hg.mozilla.org/integration/autoland/rev/8cdb673361ab AllocPAPZCTreeManager should gracefully fail if APZ is not enabled. r=botond
Comment 16•1 year ago
|
||
Backed out for causing build bustages on ContentCompositorBridgeParent.cpp
- Backout link
- Push with failures
- Failure Log
- Failure line: builds/worker/checkouts/gecko/gfx/layers/ipc/ContentCompositorBridgeParent.cpp:213:22: error: unused variable 'id' [-Werror=unused-variable]
Comment 17•1 year ago
|
||
I guess we need to do this sort of thing.
Assignee | ||
Comment 18•1 year ago
|
||
(In reply to Botond Ballo [:botond] from comment #17)
I guess we need to do this sort of thing.
That would work, but I may just use a standard iterator and use it.second
. Is there another reason to unwrap the iterator like that?
Comment 19•1 year ago
|
||
(In reply to Dan Robertson (:dlrobertson) from comment #18)
(In reply to Botond Ballo [:botond] from comment #17)
I guess we need to do this sort of thing.
That would work, but I may just use a standard iterator and use
it.second
. Is there another reason to unwrap the iterator like that?
That works too. (One tiny nit, I would call it something like entry
, since an "iterator" is something that dereferences to an element of the range being iterated, while the range-for gives you the result of dereferencing a (hidden) iterator.)
Comment 20•1 year ago
|
||
Pushed by drobertson@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/c220d03cfb7f Fix static-analysis warnings. r=botond https://hg.mozilla.org/integration/autoland/rev/00fb6658b130 AllocPAPZCTreeManager should gracefully fail if APZ is not enabled. r=botond
Comment 21•1 year ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/c220d03cfb7f
https://hg.mozilla.org/mozilla-central/rev/00fb6658b130
Comment 22•1 year ago
|
||
Not sure this is really worth backporting to Beta since Firefox doesn't appear to be crashing at all and TB is only crashing on 102, but I'm open to a nomination if y'all think it'd be worth it. It does graft cleanly to Beta & ESR.
Assignee | ||
Comment 23•1 year ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #22)
Not sure this is really worth backporting to Beta since Firefox doesn't appear to be crashing at all and TB is only crashing on 102, but I'm open to a nomination if y'all think it'd be worth it. It does graft cleanly to Beta & ESR.
Thunderbird is based on ESR right? If so it might be worthwhile to backport to ESR, since the next ESR that would contain this isn't for some time. I would like to let this sit in nightly for a bit before a potential backport to ESR. IMO this is a low-risk patch, but I'd still like to have it tested in nightly a bit first.
Comment 24•1 year ago
|
||
Yeah, let's revisit next cycle.
Comment 25•1 year ago
•
|
||
(In reply to Jonathan Watt [:jwatt] from comment #10)
FWIW the release of Thunderbird 102 seems to have resulted in an increase in the frequency of this crash.
Yes. Thanks for catching that.
And an interesting, odd peak at 102.3.x, then dropped off weeks later. https://crash-stats.mozilla.org/signature/?signature=mozilla%3A%3Alayers%3A%3AAPZCTreeManagerParent%3A%3ARecvUpdateZoomConstraints&date=%3E%3D2022-06-30T15%3A14%3A00.000Z&date=%3C2022-12-31T15%3A14%3A00.000Z#graphs Which is curious, because were were still massively migrating users from version 91, even in November.
Assignee | ||
Comment 26•1 year ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #24)
Yeah, let's revisit next cycle.
The patch didn't seem to cause any regressions in nightly, so it might be worth a uplift to esr to check if it fixes the issue for thunderbird.
Comment 27•1 year ago
|
||
Frequency does seem lower for TB these days, but feel free to nominate.
Comment 28•1 year ago
|
||
Also disproportionately affects linux users of Thunderbird.
Assignee | ||
Comment 29•1 year ago
|
||
Comment on attachment 9308274 [details]
Bug 1488886 - AllocPAPZCTreeManager should gracefully fail if APZ is not enabled. r=botond
ESR Uplift Approval Request
- If this is not a sec:{high,crit} bug, please state case for ESR consideration: Thunderbird is based on ESR and this bug largely impacts thunderbird. The next ESR release that would include this doesn't occur for quite some time.
- User impact if declined: Crashes may occur from time to time on thunderbird.
- Fix Landed on Version: 110
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): Added a check to
AllocPAPZCTreeManager
to gracefully fail if APZ is not enabled.
Assignee | ||
Updated•1 year ago
|
Comment 30•1 year ago
|
||
Comment on attachment 9308273 [details]
Bug 1488886 - Fix static-analysis warnings. r=botond
Approved for 102.8esr.
Updated•1 year ago
|
Comment 31•1 year ago
|
||
bugherder uplift |
Updated•1 year ago
|
Description
•