Crash in [@ mozilla::ipc::IToplevelProtocol::OtherPid]
Categories
(Core :: WebVR, defect, P2)
Tracking
()
People
(Reporter: worcester12345, Assigned: daoshengmu)
References
Details
(Keywords: crash)
Crash Data
Attachments
(1 file)
47 bytes,
text/x-phabricator-request
|
pascalc
:
approval-mozilla-beta+
RyanVM
:
approval-mozilla-esr68+
|
Details | Review |
This bug is for crash report bp-6717a756-1682-4568-8095-f6a340190708.
Top 10 frames of crashing thread:
0 xul.dll mozilla::ipc::IToplevelProtocol::OtherPid ipc/glue/ProtocolUtils.cpp:586
1 xul.dll void mozilla::gfx::VRProcessParent::InitAfterConnect gfx/vr/ipc/VRProcessParent.cpp:156
2 xul.dll nsresult mozilla::ipc::TaskFactory<mozilla::RDDProcessHost>::TaskWrapper<mozilla::ipc::TaskFactory<mozilla::RDDProcessHost>::RunnableMethod<void ipc/glue/TaskFactory.h:37
3 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1175
4 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:486
5 xul.dll nsThread::Shutdown xpcom/threads/nsThread.cpp:881
6 xul.dll nsresult mozilla::LazyIdleThread::ShutdownThread xpcom/threads/LazyIdleThread.cpp:276
7 xul.dll nsresult mozilla::LazyIdleThread::Notify xpcom/threads/LazyIdleThread.cpp:516
8 xul.dll nsresult nsTimerEvent::Run xpcom/threads/TimerThread.cpp:260
9 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1175
Reporter | ||
Comment 1•6 years ago
|
||
First Firefox crash on this computer since December!
Happened right after updating to:
Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:68.0) Gecko/20100101 Firefox/68.0 ID:20190705220548
Tabs in this window:133
Tabs in all windows:186
Updated•6 years ago
|
Assignee | ||
Comment 2•6 years ago
•
|
||
IIRC, in Win 7, we don't have GPU process. So, we should not launch VR process too. It looks like the gpuChild is a nullptr [1] when running in Win 7. That makes sense to me because it has no GPU process there. However, our VR process only can be asked to launch by GPU process, so it should not happen. If we look into the callstack:
0 xul.dll mozilla::ipc::IToplevelProtocol::OtherPid ipc/glue/ProtocolUtils.cpp:586
1 xul.dll void mozilla::gfx::VRProcessParent::InitAfterConnect gfx/vr/ipc/VRProcessParent.cpp:156
2 xul.dll nsresult mozilla::ipc::TaskFactory<mozilla::RDDProcessHost>::TaskWrapper<mozilla::ipc::TaskFactory<mozilla::RDDProcessHost
>::RunnableMethod<void ipc/glue/TaskFactory.h:37
It said the task in RDDProcessHost
was asking to init VR process parent. That should not happen. I am wondering if it is our crash report gives us wrong information or there is something wrong in RDDProcessHost
.
Comment 3•6 years ago
|
||
Updating affected versions based on crash-stats.
Assignee | ||
Comment 4•6 years ago
•
|
||
According to the version no. 7061 that is Win 7 SP1. It seems like GPU process is available there [1], so the questions will be why it only happens on Win 7, is it possible we just needs to check if gpuChild is a nullptr. Also, and why RDDProcessHost calls our VR process's callback [2].
[1] https://dxr.mozilla.org/mozilla-central/rev/5b91c8869f4275d2a2857114cf12fa7b7bf4190e/gfx/thebes/gfxWindowsPlatform.cpp#1647-1660
[2] https://searchfox.org/mozilla-central/rev/da3f3eaaacb6fb344fd21ac29ace2da0e33f12d3/gfx/vr/ipc/VRProcessParent.cpp#205
Updated•6 years ago
|
Assignee | ||
Comment 5•6 years ago
|
||
According to the report after FF 69 [1], the RDDProcessHost doesn't show up more. I would think we need to handle the situation when GPU process is shutdown but VR process is still trying to access GPU channels.
[1] https://crash-stats.mozilla.org/report/index/9fd7da2e-b7f2-4c00-a167-327b20190710
Assignee | ||
Comment 6•6 years ago
|
||
For the most reports are showing [0][GFX1-]: Killing GPU process due to IPC reply timeout
, I doubt GPU is killed by someone but we were still trying to create a VR process. If so, we need to fallback to non-GPU process mode and create a VR service.
Comment 7•6 years ago
•
|
||
These sound related, I don't know:
https://www.reddit.com/r/firefox/comments/ccno2j/cant_terminate_firefox_process_restart_required/
https://www.reddit.com/r/firefox/comments/ccpi38/firefox_broke_yesterday_and_only_showed_a_blank/
(Could also be related to bug 1565882 / bug 1514869.)
Reporter | ||
Comment 8•6 years ago
|
||
(In reply to Jan Andre Ikenmeyer [:darkspirit] and Jed Davis [:jld] ⟨⏰|UTC-6⟩ ⟦he/him⟧
See Also: → 1564976
See Also: 1564976 →
1564976 is linux. This one is Windows 7.
Assignee | ||
Comment 9•6 years ago
|
||
I can't reproduce this crash in my local. I need more helpful information. :jld, may I know when the IToplevelProtocol::OtherPid() will be possible to be ACCESS_VIOLATION_READ? I saw we have the same situation at GPUChild[1] and VRProcessParent[2].
[1] https://crash-stats.mozilla.org/report/index/9fd7da2e-b7f2-4c00-a167-327b20190710
[2] https://crash-stats.mozilla.org/report/index/1be28cf7-4b17-4041-913e-65ce10190915
Comment 10•6 years ago
|
||
Note the address in both crashes: 0x38
, which is in roughly the right range to be a struct field offset, and in fact is the offset of mOtherPid
.
This means that IToplevelProtocol::OtherPid
is being called on a null this
pointer:
- bp-9fd7da2e-b7f2-4c00-a167-327b20190710 appears to be a null
gpuChild
inVRProcessParent::InitAfterConnect
, because it's from a build before bug 1565000 strengthened the assertion that it's not null - bp-1be28cf7-4b17-4041-913e-65ce10190915 appears to be because
mVRChild
is null inVRProcessParent::OtherPid
Assignee | ||
Comment 11•6 years ago
|
||
Assignee | ||
Updated•6 years ago
|
Assignee | ||
Comment 12•6 years ago
|
||
I have checked
0x20 is address of &mVRChild->mOtherPid and &gpuChild->mOtherPid at x86 when the mChild is a 0x0.
0x38 is address of &mVRChild->mOtherPid and &gpuChild->mOtherPid at amd64 when the mChild is a 0x0.
Comment 14•5 years ago
|
||
Assignee | ||
Comment 15•5 years ago
|
||
(In reply to Daosheng Mu[:daoshengmu] from comment #13)
:jld. Please help review. Thanks.
We wanna land it shortly. please ni? me if I need to change my mind. Thanks.
Comment 16•5 years ago
|
||
bugherder |
Comment 17•5 years ago
|
||
Please nominate this for Beta approval when you get a chance. Looks like we probably won't get much data on whether this fix helps with the crash until it hits release.
Assignee | ||
Comment 18•5 years ago
|
||
Comment on attachment 9094061 [details]
Bug 1564127 - Check if GPU/VR process has connected with the parent process.
Beta/Release Uplift Approval Request
- User impact if declined: The Crash in [@ mozilla::ipc::IToplevelProtocol::OtherPid] crash will still happen when it is failed to get VR child IPC.
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): It tries to avoid crashes and will not happen other regressions.
- String changes made/needed:
Comment 19•5 years ago
|
||
Comment on attachment 9094061 [details]
Bug 1564127 - Check if GPU/VR process has connected with the parent process.
Crash fix, uplift approved for 71 beta 6, thanks.
![]() |
||
Comment 20•5 years ago
|
||
bugherder uplift |
Comment 21•5 years ago
|
||
I see >100 crash reports for this signature on ESR68 - is this something we should consider fixing there too? It'll need a rebased patch if yes.
Assignee | ||
Comment 22•5 years ago
|
||
Comment on attachment 9094061 [details]
Bug 1564127 - Check if GPU/VR process has connected with the parent process.
ESR Uplift Approval Request
- If this is not a sec:{high,crit} bug, please state case for ESR consideration:
- User impact if declined: The Crash in [@ mozilla::ipc::IToplevelProtocol::OtherPid] crash will still happen when it is failed to get VR child IPC.
- Fix Landed on Version: ESR 68
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): It tries to avoid crashes and will not happen other regressions.
- String or UUID changes made by this patch:
Comment 23•5 years ago
|
||
Comment on attachment 9094061 [details]
Bug 1564127 - Check if GPU/VR process has connected with the parent process.
Simple crash fix, approved for 68.3esr.
Comment 24•5 years ago
|
||
bugherder uplift |
Description
•