Crash in [@ nsWindow::RecvScreenPixels]
Categories
(Core Graveyard :: Widget: Android, defect, P1)
Tracking
(firefox-esr60 unaffected, firefox-esr68 unaffected, firefox67 wontfix, firefox67.0.1 wontfix, firefox68 wontfix, firefox69 fixed, firefox70 fixed)
Tracking | Status | |
---|---|---|
firefox-esr60 | --- | unaffected |
firefox-esr68 | --- | unaffected |
firefox67 | --- | wontfix |
firefox67.0.1 | --- | wontfix |
firefox68 | --- | wontfix |
firefox69 | --- | fixed |
firefox70 | --- | fixed |
People
(Reporter: davidb, Assigned: fluffyemily)
References
Details
(Keywords: crash, regression, Whiteboard: [geckoview:fenix:m7])
Crash Data
Attachments
(1 file)
47 bytes, text/x-phabricator-request | RyanVM: approval-mozilla-beta+
This bug is for crash report bp-10b7d77e-9d56-431f-bb8e-e376d0190521.
Top 10 frames of crashing thread:
0 libxul.so nsWindow::RecvScreenPixels widget/android/nsWindow.cpp:2248
1 libxul.so mozilla::layers::UiCompositorControllerChild::RecvScreenPixels gfx/layers/ipc/UiCompositorControllerChild.cpp:262
2 libxul.so mozilla::layers::PUiCompositorControllerChild::OnMessageReceived ipc/ipdl/PUiCompositorControllerChild.cpp:506
3 libxul.so mozilla::ipc::MessageChannel::DispatchMessage ipc/glue/MessageChannel.cpp:2151
4 libxul.so mozilla::ipc::MessageChannel::MessageTask::Run ipc/glue/MessageChannel.cpp:1968
5 libxul.so long long mozilla::jni::NativeStub<mozilla::java::GeckoThread::RunUiThreadCallback_t, GeckoThreadSupport, mozilla::jni::Args<> >::Wrap<&GeckoThreadSupport::RunUiThreadCallback> widget/android/jni/Natives.h:689
6 base.odex base.odex@0x1209a7
7 dalvik-LinearAlloc (deleted) dalvik-LinearAlloc @0xe36a
8 dalvik-main space (region space) (deleted) dalvik-main space @0x1141c9e
9 libart.so libart.so@0x3c06f5
More here: https://crash-stats.mozilla.com/signature/?product=Fenix&signature=nsWindow%3A%3ARecvScreenPixels
Comment 1•5 years ago
Looks like a null-deref of `mLayerViewSupport` in `nsWindow`. This is probably a widget issue, moving components.
Comment 2•5 years ago
21 crashes/6 installs in Fenix 68.0b0. Fairly small set of affected devices running APIs 26-28. This is one of the top crashes if you factor out all the system@framework@boot-framework.art and liblog.so signatures.
Reporter
Comment 3•5 years ago
Randall, I think this is your code, any thoughts?
Comment 4•5 years ago
I'm not sure what is happening. It shouldn't be possible for the `lvs` to be `null` since it is locked before invocation. If it is `null` then there is a bug in our `WindowPtr<>`. Otherwise the `aMem` could be `null`? But I'm not really sure how either could be `null`.
Assignee
Comment 6•5 years ago
I suspect it is more likely that `aMem` is null. I wonder if this could occur if the compositor is brought down during the screen pixel capture?
Comment 7•5 years ago
The priority flag is not set for this bug.
:snorp, could you have a look please?
For more information, please visit auto_nag documentation.
Reporter
Comment 8•5 years ago
NI+ Randall for comment 6 thoughts, and Chris to get this into our prioritization queue.
Comment 9•5 years ago
I've tagged this bug as post-Fenix MVP. As Marcia says, the crash volume is pretty low.
Comment 10•5 years ago
If it is `aMem`, it would probably be good to figure out what part is null and check it first.
Comment 11•5 years ago
(In reply to Chris Peterson [:cpeterson] from comment #9)
I've tagged this bug as post-Fenix MVP. As Marcia says, the crash volume is pretty low.
Now that Fenix MVP has been released, it appears this is in the top 5 in Socorro (although not a high volume crash).
Assignee
Comment 12•5 years ago
So, I'm pretty sure I know what is going on now, and it's because bug 1560641 was raised. This seems to be a race condition caused if `detach` is called while a screenshot is being taken. The solution to this is twofold:
- Lock the list of waiting screenshot `GeckoResult`s such that we can't try and use the same one to both notify for `detach` and notify for a completed screenshot.
- Check to ensure that `mLayerViewSupport` is not null before trying to return a screenshot.
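The first point above can be modelled roughly as follows. This is only a sketch of the idea, not the landed patch: `PendingScreenshots`, `Add`, `CompleteOne` and `FailAll` are hypothetical names, and the real list holds `GeckoResult` objects rather than C++ callbacks. What it shows is that each pending entry is taken out of the list while the lock is held, so the detach path and the completed-screenshot path can never both notify the same entry.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <utility>

// Hypothetical model of the list of waiting screenshot results.
class PendingScreenshots {
 public:
  using Callback = std::function<void(bool ok)>;

  // Register a caller waiting for a screenshot.
  void Add(Callback cb) {
    std::lock_guard<std::mutex> lock(mMutex);
    mPending.push(std::move(cb));
  }

  // A screenshot arrived: notify the oldest waiter with success.
  // Returns false if nobody is waiting (e.g. detach already drained the list).
  bool CompleteOne() {
    Callback cb;
    {
      std::lock_guard<std::mutex> lock(mMutex);
      if (mPending.empty()) return false;
      cb = std::move(mPending.front());
      mPending.pop();
    }
    cb(true);  // Invoked outside the lock; the entry is already removed.
    return true;
  }

  // Detach: take every waiter out under the lock, then notify failure.
  // Returns how many waiters were notified.
  std::size_t FailAll() {
    std::queue<Callback> drained;
    {
      std::lock_guard<std::mutex> lock(mMutex);
      drained.swap(mPending);
    }
    const std::size_t count = drained.size();
    while (!drained.empty()) {
      drained.front()(false);
      drained.pop();
    }
    return count;
  }

 private:
  std::mutex mMutex;
  std::queue<Callback> mPending;
};
```

Because removal always happens under `mMutex`, whichever path gets the lock first owns the entry, and the other path sees an empty list.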
Assignee
Comment 13•5 years ago
This is caused by a race condition when the compositor is detached. Because the actual detachment happens in a new thread, `detach` can complete and release the lock on `mLayerViewSupport`, and `RecvScreenPixels` can obtain the lock, before `mLayerViewSupport` is properly cleaned up. We therefore check to ensure that `lvs` is not null before calling a method on it.
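A minimal sketch of the null check described above, using standard C++ types instead of Gecko's own. `WindowSketch` and `LayerViewSupportSketch` are hypothetical stand-ins, not the real `nsWindow` or `LayerViewSupport`, and the actual patch uses Gecko's locking helpers rather than `std::mutex`.

```cpp
#include <cstdint>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical stand-in for LayerViewSupport; not the real Gecko class.
struct LayerViewSupportSketch {
  int received = 0;
  void RecvScreenPixels(const std::vector<std::uint8_t>& pixels) {
    if (!pixels.empty()) ++received;
  }
};

// Hypothetical stand-in for the window that owns the support object.
class WindowSketch {
 public:
  WindowSketch() : mSupport(std::make_shared<LayerViewSupportSketch>()) {}

  // Models detach: the support pointer is cleared while the lock is held.
  void Detach() {
    std::lock_guard<std::mutex> lock(mMutex);
    mSupport.reset();
  }

  // Models the guarded receive path: take the lock, then check for null
  // before forwarding. Returns false if the pixels were dropped because
  // the window was already detached.
  bool RecvScreenPixels(const std::vector<std::uint8_t>& pixels) {
    std::lock_guard<std::mutex> lock(mMutex);
    if (!mSupport) {
      return false;  // Detached: nothing to deliver to, so do not dereference.
    }
    mSupport->RecvScreenPixels(pixels);
    return true;
  }

 private:
  std::mutex mMutex;
  std::shared_ptr<LayerViewSupportSketch> mSupport;
};
```

Holding the lock alone is not enough if the pointer can already be null when the lock is acquired, which is why the check sits inside the locked region.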
Updated•5 years ago
Comment 14•5 years ago
Pushed by etoop@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/920cd58d56ff Lock `LayerViewSupport` during detach so that other methods cannot be called during this time. r=geckoview-reviewers,snorp
Comment 15•5 years ago
bugherder
Comment 16•5 years ago
Is this something we should consider backporting to Beta for GV69 or can this ride with 70 to release?
Assignee
Comment 17•5 years ago
This is something we should consider for GV69.
Assignee
Comment 18•5 years ago
Comment on attachment 9075988 [details]
Bug 1553135 - Lock `LayerViewSupport` during detach so that other methods cannot be called during this time. r=rbarker
Beta/Release Uplift Approval Request
- User impact if declined: In rare cases Fenix will crash rather than closing down cleanly.
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: Yes
- If yes, steps to reproduce:
- List of other uplifts needed: Bug 1560641
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): This crash was not visible to the user as it only happens when the app is being backgrounded.
- String changes made/needed:
Comment 19•5 years ago
Comment on attachment 9075988 [details]
Bug 1553135 - Lock `LayerViewSupport` during detach so that other methods cannot be called during this time. r=rbarker
Fixes a Fenix crash. Approved for GV69.
Comment 20•5 years ago
bugherder uplift
Comment 21•5 years ago
[geckoview:fenix:m7] bugs should be priority P1.
I'm editing a bunch of GeckoView bugs. If you'd like to filter all this bugmail, search and destroy emails containing this UUID:
e88a5094-0fc0-4b7c-b7c5-aef00a11dbc9
Comment 22•5 years ago
Bugbug thinks this bug is a regression, but please revert this change in case of error.
Updated•3 years ago