Closed
Bug 1321275
Opened 8 years ago
Closed 7 years ago
[webvr] Crash in mozalloc_abort | NS_DebugBreak | mozilla::ipc::LogicError | mozilla::gfx::PVRLayer::Transition
Categories
(Core :: WebVR, defect, P3)
Tracking
()
RESOLVED
FIXED
mozilla56
Tracking | Status | |
---|---|---|
firefox-esr52 | --- | unaffected |
firefox53 | --- | disabled |
firefox54 | --- | disabled |
firefox55 | --- | fixed |
firefox56 | --- | fixed |
People
(Reporter: marcia, Assigned: kip)
References
Details
(Keywords: crash, Whiteboard: [gfx-noted])
Crash Data
Attachments
(2 files)
This bug was filed from the Socorro interface and is report bp-e1611277-1097-466e-83d8-7d9472161130. ============================================================= Seen while looking at crash stats: http://bit.ly/2gFAxfv. Started with Build 20161129171714. ni on kgilbert for ideas on a possible cause.
Flags: needinfo?(kgilbert)
Updated•8 years ago
|
Priority: -- → P3
Whiteboard: [gfx-noted]
Assignee | ||
Comment 1•8 years ago
|
||
To hit this call, the user must have been visiting a WebVR site. The hardware is also typical of a high end VR computer, with 32GB of ram, an NVidia 1080, and an Oculus VR display. The crash reports are showing that the GPU process has terminated abnormally, so this may be a driver or Oculus Compositor bug. We have seen more stability in the 64-bit builds of Firefox which link against the 64-bit Oculus runtimes and recommend WebVR users to download the 64-bit version of Firefox. In this instance, it appears to be a 32-bit instance of Firefox.
Flags: needinfo?(kgilbert)
Assignee | ||
Comment 2•8 years ago
|
||
The crash report shows that there were 36 instances, all on 32 bit Firefox, and only one 64 bit instance with the same crash.
Assignee | ||
Comment 3•8 years ago
|
||
Another common trend is that the crash appears to happen within the VRDisplay.exitPresent WebVR call. Likely this is seen by users as a crash while exiting VR presentation.
Assignee | ||
Updated•8 years ago
|
Summary: Crash in mozalloc_abort | NS_DebugBreak | mozilla::ipc::LogicError | mozilla::gfx::PVRLayer::Transition → [webvr] Crash in mozalloc_abort | NS_DebugBreak | mozilla::ipc::LogicError | mozilla::gfx::PVRLayer::Transition
Assignee | ||
Updated•7 years ago
|
Component: Graphics → WebVR
Updated•7 years ago
|
Crash Signature: [@ mozalloc_abort | NS_DebugBreak | mozilla::ipc::LogicError | mozilla::gfx::PVRLayer::Transition] → [@ mozalloc_abort | NS_DebugBreak | mozilla::ipc::LogicError | mozilla::gfx::PVRLayer::Transition]
[@ Abort | corrupted actor state | mozalloc_abort | NS_DebugBreak | mozilla::ipc::LogicError | mozilla::gfx::PVRLayer::Transition ]
Updated•7 years ago
|
status-firefox55:
--- → affected
Comment 4•7 years ago
|
||
There are 32 crashes across 21 different installations in the last 3 days.
Comment 5•7 years ago
|
||
I am thinking if the actor is destroyed at the parent side so we should override ActorDestroy() or Recv__delete__() to do some check.
Comment hidden (mozreview-request) |
Assignee | ||
Comment 7•7 years ago
|
||
mozreview-review |
Comment on attachment 8864065 [details] Bug 1321275 - Confirm VRLayerChild is not destroyed before sending destroy message to the parent side; https://reviewboard.mozilla.org/r/135786/#review138910 LGTM, Thanks!
Attachment #8864065 -
Flags: review?(kgilbert) → review+
Comment 8•7 years ago
|
||
I have confirmed that the actor of VRLayerChild will be destroyed when we exit the VR present mode. So, I think it should be a correct fix.
Updated•7 years ago
|
Assignee: nobody → dmu
Pushed by dmu@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/6f76ba14922c Confirm VRLayerChild is not destroyed before sending destroy message to the parent side; r=kip
Comment 10•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/6f76ba14922c
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Updated•7 years ago
|
status-firefox54:
--- → disabled
status-firefox-esr52:
--- → unaffected
Comment 11•7 years ago
|
||
This seems to not resolve the crash issue completely, we probably have to consider add some operation at PVRLayerChild::Recv__delete__() instead of PVRLayerChild::ActorDestroy().
Comment 12•7 years ago
|
||
This is still crashing. #13 topcrash in the Windows nightly of 20170511063838.
Flags: needinfo?(dmu)
Updated•7 years ago
|
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 13•7 years ago
|
||
(In reply to Julian Seward [:jseward] from comment #12) > This is still crashing. > #13 topcrash in the Windows nightly of 20170511063838. Let's try to fix it at Bug 1364856
Comment 14•7 years ago
|
||
The signature spiked the 2017-06-01 (59 crashes) and there are 54 crashes today. Maybe it's related to the patch for bug 1362213. :kip, an idea ?
Flags: needinfo?(kgilbert)
Comment 15•7 years ago
|
||
79 occurrences in Nightly 20170602030204, which makes it the #4 Windows topcrash.
Comment 16•7 years ago
|
||
str |
I am reproduce it by follow the below steps: 1. Viewing a WebVR site like, https://webvr.info/samples/XX-vr-controllers.html, in the present mode. 2. Closing the SteamVR runtime. We can notice the WebVR page is suspended. 3. After press the exit VR button, the tab will crash, and the crash report is very similar as we have. We still need to do some confirmation for Oculus Rift, but it is a reproducible case at least.
Comment 17•7 years ago
|
||
I can't reproduce this crash on Oculus Rift, but following the steps from Comment 16 is very easy for reproducing in HTC Vive. Let's fix the crash on Vive first.
Assignee | ||
Comment 18•7 years ago
|
||
I may have found the cause of this issue. In VRManagerChild::DeallocPVRLayerChild, we delete the PVRLayerChild, rather than doing proper reference counting on the VRLayerChild object which has been deallocated but still referenced by VRDisplayPresentation::mLayers. This happens when the GPU process is terminated. When the user tries to stop the VR presentation or change the URL, PVRLayerChild::SendDestroy is called by VRDisplayPresentation::DestroyLayers. This results in the assertion about "invalid actor state". We can fix this by following the same pattern used by TextureClient: ------------- layers::PTextureChild* VRManagerChild::AllocPTextureChild(const SurfaceDescriptor&, const LayersBackend&, const TextureFlags&, const uint64_t&) { return TextureClient::CreateIPDLActor(); } bool VRManagerChild::DeallocPTextureChild(PTextureChild* actor) { return TextureClient::DestroyIPDLActor(actor); } PTextureChild* TextureClient::CreateIPDLActor() { TextureChild* c = new TextureChild(); c->AddIPDLReference(); return c; } // static bool TextureClient::DestroyIPDLActor(PTextureChild* actor) { static_cast<TextureChild*>(actor)->ReleaseIPDLReference(); return true; } // AddIPDLReference and ReleaseIPDLReference are only to be called by CreateIPDLActor // and DestroyIPDLActor, respectively. We intentionally make them private to prevent misuse. // The purpose of these methods is to be aware of when the IPC system around this // actor goes down: mIPCOpen is then set to false. void AddIPDLReference() { MOZ_ASSERT(mIPCOpen == false); mIPCOpen = true; AddRef(); } void ReleaseIPDLReference() { MOZ_ASSERT(mIPCOpen == false); Release(); } ------------- I'll take this bug and implement the fix.
Flags: needinfo?(kgilbert)
Assignee | ||
Updated•7 years ago
|
Assignee: dmu → kgilbert
Comment hidden (mozreview-request) |
Assignee | ||
Updated•7 years ago
|
Attachment #8885938 -
Flags: review?(dmu)
Assignee | ||
Comment 20•7 years ago
|
||
I would recommend this patch for uplift to Beta, once it has been proven stable in Nightly.
Comment 21•7 years ago
|
||
mozreview-review |
Comment on attachment 8885938 [details] Bug 1321275 - Fix reference counting of PVRLayerChild when GPU process has been terminated https://reviewboard.mozilla.org/r/156724/#review161888 LGTM. Good job! kip.
Attachment #8885938 -
Flags: review?(dmu) → review+
Comment hidden (mozreview-request) |
Assignee | ||
Comment 23•7 years ago
|
||
Unfortunately, the try run shows that there were some warnings and an assertion to fix before landing. I've updated the patch and will re-run try to confirm the fix.
Comment hidden (mozreview-request) |
Assignee | ||
Comment 25•7 years ago
|
||
The try push looks clean now. I have removed some unneeded variables causing warnings, corrected the initialized value of VRLayerChild::mIPCOpen to correct an assertion, and added a MOZ_COUNT_CTOR(VRLayerChild) to balance the MOZ_COUNT_DTOR(VRLayerChild). This seems to have fixed the warnings and cleaned up the try build. Could you please review the interdiff (2 revs) before I land?
Flags: needinfo?(dmu)
Comment 26•7 years ago
|
||
mozreview-review |
Comment on attachment 8885938 [details] Bug 1321275 - Fix reference counting of PVRLayerChild when GPU process has been terminated https://reviewboard.mozilla.org/r/156724/#review162694 ::: gfx/vr/ipc/VRLayerChild.cpp:24 (Diff revisions 2 - 3) > : mCanvasElement(nullptr) > , mShSurfClient(nullptr) > , mFront(nullptr) > , mIPCOpen(false) > { > + MOZ_COUNT_CTOR(VRLayerChild); It was removed at Bug 1321275 because the memory leak problem that is detected by our mochitests. If it doesn't have new leaks, it would be good to me.
Comment 27•7 years ago
|
||
Please take a look at what I mentioned at Comment 26. Thanks
Flags: needinfo?(dmu)
Comment 28•7 years ago
|
||
Pushed by kgilbert@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/d97dc1dbc289 Fix reference counting of PVRLayerChild when GPU process has been terminated r=daoshengmu
Comment 29•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/d97dc1dbc289
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
status-firefox56:
--- → fixed
Resolution: --- → FIXED
Updated•7 years ago
|
Target Milestone: mozilla55 → mozilla56
Comment 30•7 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/7bb0d5ad88b3
You need to log in
before you can comment on or make changes to this bug.
Description
•