Closed
Bug 1373540
Opened 7 years ago
Closed 7 years ago
Crash in mozilla::layers::CompositorBridgeParent::InitSameProcess when terminating GPU process from about:support
Categories
(Core :: Graphics: Layers, defect, P3)
Tracking
()
RESOLVED
FIXED
mozilla56
Tracking | Status | |
---|---|---|
firefox-esr52 | --- | unaffected |
firefox54 | --- | unaffected |
firefox55 | --- | unaffected |
firefox56 | --- | fixed |
People
(Reporter: m_kato, Assigned: aosmond)
References
Details
(Keywords: crash, Whiteboard: gfx-noted)
Crash Data
Attachments
(1 file, 2 obsolete files)
This bug was filed from the Socorro interface and is report bp-95a6e1c6-6561-4b5c-a004-f965f0170616. ============================================================= - Step 1. Browse about:support 2. Click [Terminate GPU process] - Result Chrome process crashes. - Expected Result Don't crash chrome process even if GPU process is terminated.
Reporter | ||
Comment 1•7 years ago
|
||
Reproduced on Windows 10 + Nighty 20170615030208
Comment 2•7 years ago
|
||
It seems like a regression of Bug 1365927. CompositorManagerParent::CreateSameProcessWidgetCompositorBridge() returned nullptr. And then de-reference the nullptr.
Comment 3•7 years ago
|
||
:aosmond, Bug 1365927 might cause this bug, can you take a look?
Flags: needinfo?(aosmond)
Assignee | ||
Comment 4•7 years ago
|
||
I expect this is mine. I tested Terminate GPU process in about:support on Linux (when I use layers.gpu-process.force-enabled, that is), and just assumed it would be similar on Windows. That doesn't seem to be the case though :(.
Assignee: nobody → aosmond
Flags: needinfo?(aosmond)
Assignee | ||
Comment 5•7 years ago
|
||
Never mind, this reproduces on Linux.
Assignee | ||
Comment 6•7 years ago
|
||
try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=839c9eacebfbd713de1f66a0a41ad29efd449f0b
Assignee | ||
Updated•7 years ago
|
Attachment #8878462 -
Flags: review?(dvander)
Assignee | ||
Updated•7 years ago
|
Status: NEW → ASSIGNED
Has STR: --- → yes
OS: Windows 10 → All
Priority: -- → P3
Whiteboard: gfx-noted
Comment on attachment 8878462 [details] [diff] [review] Fix switchover from the GPU process compositor to the same process compositor., v1 Review of attachment 8878462 [details] [diff] [review]: ----------------------------------------------------------------- ::: gfx/layers/ipc/CompositorManagerChild.cpp @@ +65,5 @@ > + > + // GPUChild was torn down first when the GPU process was lost. We can't wait > + // for CompositorManagerChild::ActorDestroy because we may recreate the > + // CompositorManager pair before that is executed. > + sInstance->mCanSend = false; Do you need to set mCanSend here? It should be okay to wait for ActorDestroy for that part. ::: gfx/layers/ipc/CompositorManagerParent.cpp @@ +63,5 @@ > CompositorThreadHolder::Loop()->PostTask(runnable.forget()); > } > > +/* static */ void > +CompositorManagerParent::OnGPUProcessShutdown() Do you need this function? If this is for a GPU process crash, the parent doesn't exist in the UI process yet. (But even if it did, it should go away automatically.)
Assignee | ||
Comment 8•7 years ago
|
||
(In reply to David Anderson [:dvander] from comment #7) > Comment on attachment 8878462 [details] [diff] [review] > Fix switchover from the GPU process compositor to the same process > compositor., v1 > > Review of attachment 8878462 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: gfx/layers/ipc/CompositorManagerChild.cpp > @@ +65,5 @@ > > + > > + // GPUChild was torn down first when the GPU process was lost. We can't wait > > + // for CompositorManagerChild::ActorDestroy because we may recreate the > > + // CompositorManager pair before that is executed. > > + sInstance->mCanSend = false; > > Do you need to set mCanSend here? It should be okay to wait for ActorDestroy > for that part. > Hm, you are right, I was just being paranoid. > ::: gfx/layers/ipc/CompositorManagerParent.cpp > @@ +63,5 @@ > > CompositorThreadHolder::Loop()->PostTask(runnable.forget()); > > } > > > > +/* static */ void > > +CompositorManagerParent::OnGPUProcessShutdown() > > Do you need this function? If this is for a GPU process crash, the parent > doesn't exist in the UI process yet. (But even if it did, it should go away > automatically.) This makes sense yes. When I debugged it the first time, I was in CompositorManagerParent, but if I fix CompositorManagerChild first, that error never happens.
Attachment #8878462 -
Attachment is obsolete: true
Attachment #8878462 -
Flags: review?(dvander)
Attachment #8878520 -
Flags: review?(dvander)
Comment on attachment 8878520 [details] [diff] [review] Fix switchover from the GPU process compositor to the same process compositor., v2 Review of attachment 8878520 [details] [diff] [review]: ----------------------------------------------------------------- ::: gfx/layers/ipc/CompositorManagerChild.cpp @@ +63,5 @@ > + > + // GPUChild was torn down first when the GPU process was lost. We can't wait > + // for CompositorManagerChild::ActorDestroy because we may recreate the > + // CompositorManager pair before that is executed. > + sInstance = nullptr; Maybe worth asserting that sInstance is null or |this|
Attachment #8878520 -
Flags: review?(dvander) → review+
Assignee | ||
Comment 10•7 years ago
|
||
(In reply to David Anderson [:dvander] from comment #9) > Comment on attachment 8878520 [details] [diff] [review] > Fix switchover from the GPU process compositor to the same process > compositor., v2 > > Review of attachment 8878520 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: gfx/layers/ipc/CompositorManagerChild.cpp > @@ +63,5 @@ > > + > > + // GPUChild was torn down first when the GPU process was lost. We can't wait > > + // for CompositorManagerChild::ActorDestroy because we may recreate the > > + // CompositorManager pair before that is executed. > > + sInstance = nullptr; > > Maybe worth asserting that sInstance is null or |this| Hm, it is a static method, and I think it *could* be null in some circumstances, like the GPU process went down before we even got to setup the CompositorManagerChild...
Comment 11•7 years ago
|
||
Pushed by aosmond@gmail.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/866d59780f6d Fix switchover from the GPU process compositor to the same process compositor. r=dvander
Comment 12•7 years ago
|
||
Backed out for mass test failures on Windows debug (tests terminated with RunWatchdog): https://hg.mozilla.org/integration/mozilla-inbound/rev/554c39a4445a5e3e2219f99c527e79e3d50eb6d7 Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=866d59780f6d2d2978aa1f5671eab0d6324db4fc&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable Failure log example: https://treeherder.mozilla.org/logviewer.html#?job_id=107944249&repo=mozilla-inbound 22:41:33 INFO - ###!!! [Child][MessageChannel] Error: (msgtype=0x74001B,name=PGPU::Msg_AccumulateChildKeyedHistograms) Closed channel: cannot send/recv 22:41:33 INFO - [GPU 388] WARNING: '!ipcActor->SendAccumulateChildKeyedHistograms(keyedAccumulationsToSend)', file z:/build/build/src/toolkit/components/telemetry/ipc/TelemetryIPCAccumulator.cpp, line 268 22:41:35 INFO - ###!!! [Child][MessageChannel] Error: (msgtype=0x74001A,name=PGPU::Msg_AccumulateChildHistograms) Closed channel: cannot send/recv 22:41:35 INFO - [GPU 388] WARNING: '!ipcActor->SendAccumulateChildHistograms(accumulationsToSend)', file z:/build/build/src/toolkit/components/telemetry/ipc/TelemetryIPCAccumulator.cpp, line 264 22:41:35 INFO - ###!!! [Child][MessageChannel] Error: (msgtype=0x74001B,name=PGPU::Msg_AccumulateChildKeyedHistograms) Closed channel: cannot send/recv 22:41:35 INFO - [GPU 388] WARNING: '!ipcActor->SendAccumulateChildKeyedHistograms(keyedAccumulationsToSend)', file z:/build/build/src/toolkit/components/telemetry/ipc/TelemetryIPCAccumulator.cpp, line 268 22:41:36 INFO - Hit MOZ_CRASH(Shutdown too long, probably frozen, causing a crash.) at z:/build/build/src/toolkit/components/terminator/nsTerminator.cpp:160 22:41:37 INFO - #01: _PR_NativeRunThread [nsprpub/pr/src/threads/combined/pruthr.c:397] 22:41:37 INFO - #02: pr_root [nsprpub/pr/src/md/windows/w95thred.c:95] 22:41:37 INFO - #03: ucrtbase.DLL + 0x3d5ef 22:41:37 INFO - #04: kernel32.dll + 0x53c45 22:41:37 INFO - #05: patched_BaseThreadInitThunk [mozglue/build/WindowsDllBlocklist.cpp:849] 22:41:37 INFO - #06: ntdll.dll + 0x637f5 22:41:37 INFO - #07: ntdll.dll + 0x637c8 22:41:37 INFO - [GPU 388] WARNING: '!compMgr', file z:/build/build/src/xpcom/components/nsComponentManagerUtils.cpp, line 63 22:41:37 INFO - nsStringStats 22:41:37 INFO - => mAllocCount: 4877 22:41:37 INFO - => mReallocCount: 33 22:41:37 INFO - => mFreeCount: 4877 22:41:37 INFO - => mShareCount: 5746 22:41:37 INFO - => mAdoptCount: 0 22:41:37 INFO - => mAdoptFreeCount: 0 22:41:37 INFO - => Process ID: 388, Thread ID: 384 22:41:37 WARNING - TEST-UNEXPECTED-FAIL | file:///Z:/task_1497738271/build/tests/reftest/tests/layout/reftests/w3c-css/submitted/variables/variable-declaration-40.html | application terminated with exit code 1
Flags: needinfo?(aosmond)
Assignee | ||
Comment 13•7 years ago
|
||
Not sure if I follow the exact cause of the failure, the stack traces don't offer many clues, but let's try keeping around the CompositorManagerChild until we actually replace it and see if that helps. try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=edf15979384fed4e304c79c893097314e6a76b95
Attachment #8878520 -
Attachment is obsolete: true
Flags: needinfo?(aosmond)
Comment 14•7 years ago
|
||
Pushed by aosmond@gmail.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/54f3c792db02 Fix switchover from the GPU process compositor to the same process compositor. r=me,dvander
Comment 15•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/54f3c792db02
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
status-firefox56:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
Updated•7 years ago
|
status-firefox54:
--- → unaffected
status-firefox55:
--- → unaffected
status-firefox-esr52:
--- → unaffected
Comment 16•7 years ago
|
||
Hello Andrew: I still see crashes in 20170621102301 - is it possible there is more than one crash in this signature? The comments I see mention A-Frame and Web VR.
Flags: needinfo?(aosmond)
Assignee | ||
Comment 17•7 years ago
|
||
I will fix this in bug 1374278. It is the same race as in that bug, but happening in different states (reinit GPU process, or move into UI process).
Flags: needinfo?(aosmond)
Comment 18•7 years ago
|
||
Hey aosmond, Do you think this crash report[1], which crashes within mozilla::layers::CompositorBridgeParent::InitSameProcess, but also has the fix for this bug, indicates that there are other crashing cases that need to be addressed? Or is [1] another manifestation of the crasher you fixed in bug 1374278? [1]: https://crash-stats.mozilla.com/report/index/523b88f1-08b6-45be-b352-947c00170625
Assignee | ||
Comment 19•7 years ago
|
||
(In reply to Mike Conley (:mconley) from comment #18) > Hey aosmond, > > Do you think this crash report[1], which crashes within > mozilla::layers::CompositorBridgeParent::InitSameProcess, but also has the > fix for this bug, indicates that there are other crashing cases that need to > be addressed? Or is [1] another manifestation of the crasher you fixed in > bug 1374278? > > [1]: > https://crash-stats.mozilla.com/report/index/523b88f1-08b6-45be-b352- > 947c00170625 Thanks -- turns out to be the same signature, but a different problem. Bug 1376590 tracks this. Hopefully the last one.
See Also: → 1376590
You need to log in
before you can comment on or make changes to this bug.
Description
•