Closed Bug 1383386 Opened 7 years ago Closed 7 years ago

Crash in nsLayoutUtils::UpdateDisplayPortMarginsFromPendingMessages

Categories

(Core :: Layout, defect)

56 Branch
All
Windows
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox-esr52 --- wontfix
firefox54 --- wontfix
firefox55 --- wontfix
firefox56 --- fixed
firefox57 --- fixed

People

(Reporter: philipp, Assigned: kats)

References

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is 
report bp-996a1f81-22da-4a44-adf3-1bf9b0170716.
=============================================================
Crashing Thread (0)
Frame 	Module 	Signature 	Source
0 	xul.dll 	nsLayoutUtils::UpdateDisplayPortMarginsFromPendingMessages() 	layout/base/nsLayoutUtils.cpp:9284
1 	xul.dll 	nsRefreshDriver::Tick(__int64, mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp:1835
2 	xul.dll 	mozilla::RefreshDriverTimer::TickDriver(nsRefreshDriver*, __int64, mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp:328
3 	xul.dll 	mozilla::RefreshDriverTimer::TickRefreshDrivers(__int64, mozilla::TimeStamp, nsTArray<RefPtr<nsRefreshDriver> >&) 	layout/base/nsRefreshDriver.cpp:298
4 	xul.dll 	mozilla::RefreshDriverTimer::Tick(__int64, mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp:320
5 	xul.dll 	mozilla::VsyncRefreshDriverTimer::RunRefreshDrivers(mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp:761
6 	xul.dll 	mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::TickRefreshDriver(mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp:674
7 	xul.dll 	mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::NotifyVsync(mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp:575
8 	xul.dll 	mozilla::layout::VsyncChild::RecvNotify(mozilla::TimeStamp const&) 	layout/ipc/VsyncChild.cpp:67
9 	xul.dll 	mozilla::layout::PVsyncChild::OnMessageReceived(IPC::Message const&) 	obj-firefox/ipc/ipdl/PVsyncChild.cpp:155
10 	xul.dll 	mozilla::ipc::PBackgroundChild::OnMessageReceived(IPC::Message const&) 	obj-firefox/ipc/ipdl/PBackgroundChild.cpp:1608
11 	xul.dll 	mozilla::ipc::MessageChannel::DispatchAsyncMessage(IPC::Message const&) 	ipc/glue/MessageChannel.cpp:2093
12 	xul.dll 	mozilla::ipc::MessageChannel::DispatchMessageW(IPC::Message&&) 	ipc/glue/MessageChannel.cpp:2019
13 	xul.dll 	mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::MessageChannel::MessageTask&) 	ipc/glue/MessageChannel.cpp:1888
14 	xul.dll 	mozilla::ipc::MessageChannel::MessageTask::Run() 	ipc/glue/MessageChannel.cpp:1921
15 	xul.dll 	nsThread::ProcessNextEvent(bool, bool*) 	xpcom/threads/nsThread.cpp:1437
16 	xul.dll 	mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp:97
17 	xul.dll 	mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp:302
18 	xul.dll 	MessageLoop::RunHandler() 	ipc/chromium/src/base/message_loop.cc:313
19 	xul.dll 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc:293
20 	xul.dll 	nsBaseAppShell::Run() 	widget/nsBaseAppShell.cpp:156
21 	xul.dll 	nsAppShell::Run() 	widget/windows/nsAppShell.cpp:271
22 	xul.dll 	XRE_RunAppShell() 	toolkit/xre/nsEmbedFunctions.cpp:895
23 	xul.dll 	mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp:270
24 	xul.dll 	MessageLoop::RunHandler() 	ipc/chromium/src/base/message_loop.cc:313
25 	xul.dll 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc:293
26 	xul.dll 	XRE_InitChildProcess(int, char** const, XREChildData const*) 	toolkit/xre/nsEmbedFunctions.cpp:711
27 	firefox.exe 	content_process_main(mozilla::Bootstrap*, int, char** const) 	ipc/contentproc/plugin-container.cpp:64
28 	firefox.exe 	NS_internal_main(int, char**, char**) 	browser/app/nsBrowserApp.cpp:285
29 	firefox.exe 	wmain 	toolkit/xre/nsWindowsWMain.cpp:115
30 	firefox.exe 	__scrt_common_main_seh 	f:/dd/vctools/crt/vcstartup/src/startup/exe_common.inl:253
31 	kernel32.dll 	BaseThreadInitThunk 	
32 	firefox.exe 	firefox.exe@0x6edf 	
33 	firefox.exe 	firefox.exe@0x6edf 	
34 	ntdll.dll 	RtlUserThreadStart

a handful of crashes with this signature are already present in earlier versions but their volume is spiking up on 56.0a1 after build 20170630030203. reports are from windows users, mostly in the content process and always contain GFX_ERROR "Receive IPC close with reason=AbnormalShutdown" = true (100.0% in signature vs 02.12% overall).
Jet, can your team take a look at this recent regression?
Flags: needinfo?(bugs)
In this case, CompositorBridgeChild::Get() appears to be null here:
http://searchfox.org/mozilla-central/source/layout/base/nsLayoutUtils.cpp#9292

I think we need to update the null-checks just above it.
Flags: needinfo?(bugs) → needinfo?(wmccloskey)
Yeah, I guess we could add a null check here. It seems like the parent process has already crashed (and hopefully that crash has been reported separately). In the mean time, the content process crashes here. From the user's standpoint it doesn't really matter, but fixing this would make our stats look a little better :-).
Flags: needinfo?(wmccloskey)
Kats: can you help get this one assigned? I had thought, perhaps naively, that we just need to fix the null-checks. Per Bill's comment 3, he thinks we're already hosed by the time we crash here. If he's correct, then null-checks won't really help. If we expect the Compositor process to crash and recover, then we should add them. I'm not sure why this is a recent regression though--can you take a closer look? Thx!
Flags: needinfo?(bugmail)
I think the null check does need updating - it looks like in bug 1305165 the call was changed from ContentChild to CompositorBridgeChild, but the guard on the condition was not. And avoiding the content process crash here would be good, because in theory we're supposed to survive gpu process crashes fine now. This crash was reported on Windows, so the GPU process should be enabled and this content process crash results in a worse user experience than we're aiming for.
Assignee: nobody → bugmail
Flags: needinfo?(bugmail)
Comment on attachment 8892912 [details]
Bug 1383386 - Update null checks to check the correct IPC actor.

https://reviewboard.mozilla.org/r/163938/#review169372
Attachment #8892912 - Flags: review?(wmccloskey) → review+
Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ecbeb9354c24
Update null checks to check the correct IPC actor. r=billm
Backed out for mass-asserting at gfx/layers/ipc/CompositorBridgeChild.cpp:264:

https://hg.mozilla.org/integration/autoland/rev/975e4a8c3e3e83af2ac54fa3691861f79186b48f

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=ecbeb9354c24bac41419cfbd71ce107f86ea4619&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=runnable
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=120518407&repo=autoland

[task 2017-08-02T21:02:21.426351Z] 21:02:21     INFO - runtests.py | Running with e10s: False
[task 2017-08-02T21:02:21.429739Z] 21:02:21     INFO - runtests.py | Running tests: start.
[task 2017-08-02T21:02:21.431541Z] 21:02:21     INFO - 
[task 2017-08-02T21:02:21.457434Z] 21:02:21     INFO - Application command: /home/worker/workspace/build/application/firefox/firefox -marionette -foreground -profile /tmp/tmpBCemzn.mozrunner
[task 2017-08-02T21:02:21.514597Z] 21:02:21     INFO - runtests.py | Application pid: 1096
[task 2017-08-02T21:02:21.517548Z] 21:02:21     INFO - TEST-INFO | started process GECKO(1096)
[task 2017-08-02T21:02:21.667406Z] 21:02:21     INFO - GECKO(1096) | ### XPCOM_MEM_BLOAT_LOG defined -- logging bloat/leaks to /tmp/tmpBCemzn.mozrunner/runtests_leaks.log
[task 2017-08-02T21:02:27.637894Z] 21:02:27     INFO - GECKO(1096) | 1501707747630	Marionette	INFO	Enabled via --marionette
[task 2017-08-02T21:02:27.734863Z] 21:02:27     INFO - GECKO(1096) | ++DOCSHELL 0x7f7b68988800 == 1 [pid = 1096] [id = {fb4331d2-39e5-44f3-8963-1c5a659b6d0b}]
[task 2017-08-02T21:02:27.737094Z] 21:02:27     INFO - GECKO(1096) | ++DOMWINDOW == 1 (0x7f7b68989000) [pid = 1096] [serial = 1] [outer = (nil)]
[task 2017-08-02T21:02:27.775353Z] 21:02:27     INFO - GECKO(1096) | ++DOMWINDOW == 2 (0x7f7b68998800) [pid = 1096] [serial = 2] [outer = 0x7f7b68989000]
[task 2017-08-02T21:02:28.073968Z] 21:02:28     INFO - GECKO(1096) | ++DOCSHELL 0x7f7b6625e800 == 2 [pid = 1096] [id = {08a430a8-26e0-4311-966f-fcf3e3f6df45}]
[task 2017-08-02T21:02:28.077788Z] 21:02:28     INFO - GECKO(1096) | ++DOMWINDOW == 3 (0x7f7b6625f000) [pid = 1096] [serial = 3] [outer = (nil)]
[task 2017-08-02T21:02:28.081656Z] 21:02:28     INFO - GECKO(1096) | ++DOMWINDOW == 4 (0x7f7b66260000) [pid = 1096] [serial = 4] [outer = 0x7f7b6625f000]
[task 2017-08-02T21:02:28.380607Z] 21:02:28     INFO - GECKO(1096) | LoadPlugin() /tmp/tmpBCemzn.mozrunner/plugins/libnpswftest.so returned 7f7b6cec0c70
[task 2017-08-02T21:02:28.416606Z] 21:02:28     INFO - GECKO(1096) | LoadPlugin() /tmp/tmpBCemzn.mozrunner/plugins/libnptest.so returned 7f7b6cec0e20
[task 2017-08-02T21:02:28.457427Z] 21:02:28     INFO - GECKO(1096) | LoadPlugin() /tmp/tmpBCemzn.mozrunner/plugins/libnpthirdtest.so returned 7f7b6cec0f10
[task 2017-08-02T21:02:28.466146Z] 21:02:28     INFO - GECKO(1096) | LoadPlugin() /tmp/tmpBCemzn.mozrunner/plugins/libnpsecondtest.so returned 7f7b6cec0f40
[task 2017-08-02T21:02:28.635148Z] 21:02:28     INFO - GECKO(1096) | ++DOMWINDOW == 5 (0x7f7b7061d000) [pid = 1096] [serial = 5] [outer = 0x7f7b68989000]
[task 2017-08-02T21:02:28.833704Z] 21:02:28     INFO - GECKO(1096) | Assertion failure: !XRE_IsParentProcess(), at /home/worker/workspace/build/src/gfx/layers/ipc/CompositorBridgeChild.cpp:264
[task 2017-08-02T21:03:08.092894Z] 21:03:08     INFO - GECKO(1096) | #01: nsLayoutUtils::UpdateDisplayPortMarginsFromPendingMessages [layout/base/nsLayoutUtils.cpp:9297]
[task 2017-08-02T21:03:08.093394Z] 21:03:08     INFO - 
[task 2017-08-02T21:03:08.099345Z] 21:03:08     INFO - GECKO(1096) | #02: nsRefreshDriver::Tick [layout/base/nsRefreshDriver.cpp:1840]
[task 2017-08-02T21:03:08.099620Z] 21:03:08     INFO - 
[task 2017-08-02T21:03:08.101185Z] 21:03:08     INFO - GECKO(1096) | #03: mozilla::RefreshDriverTimer::TickRefreshDrivers [layout/base/nsRefreshDriver.cpp:302]
[task 2017-08-02T21:03:08.101440Z] 21:03:08     INFO - 
[task 2017-08-02T21:03:08.104218Z] 21:03:08     INFO - GECKO(1096) | #04: mozilla::RefreshDriverTimer::Tick [layout/base/nsRefreshDriver.cpp:324]
[task 2017-08-02T21:03:08.105928Z] 21:03:08     INFO - 
[task 2017-08-02T21:03:08.109113Z] 21:03:08     INFO - GECKO(1096) | #05: mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::TickRefreshDriver [layout/base/nsRefreshDriver.cpp:679]
[task 2017-08-02T21:03:08.110841Z] 21:03:08     INFO - 
[task 2017-08-02T21:03:08.114233Z] 21:03:08     INFO - GECKO(1096) | #06: mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::ParentProcessVsyncNotifier::Run [layout/base/nsRefreshDriver.cpp:524]
[task 2017-08-02T21:03:08.115988Z] 21:03:08     INFO - 
[task 2017-08-02T21:03:08.119635Z] 21:03:08     INFO - GECKO(1096) | #07: nsThread::ProcessNextEvent [mfbt/ScopeExit.h:111]
[task 2017-08-02T21:03:08.121374Z] 21:03:08     INFO - 
[task 2017-08-02T21:03:08.124080Z] 21:03:08     INFO - GECKO(1096) | #08: NS_ProcessNextEvent [xpcom/threads/nsThreadUtils.cpp:480]
[task 2017-08-02T21:03:08.127368Z] 21:03:08     INFO - 
[task 2017-08-02T21:03:08.129898Z] 21:03:08     INFO - GECKO(1096) | #09: mozilla::ipc::MessagePump::Run [ipc/glue/MessagePump.cpp:98]
[task 2017-08-02T21:03:08.135310Z] 21:03:08     INFO - 
[task 2017-08-02T21:03:08.137280Z] 21:03:08     INFO - GECKO(1096) | #10: MessageLoop::RunInternal [ipc/chromium/src/base/message_loop.cc:327]
[task 2017-08-02T21:03:08.139239Z] 21:03:08     INFO - 
[task 2017-08-02T21:03:08.141452Z] 21:03:08     INFO - GECKO(1096) | #11: MessageLoop::Run [ipc/chromium/src/base/message_loop.cc:299]
[task 2017-08-02T21:03:08.143148Z] 21:03:08     INFO - 
[task 2017-08-02T21:03:08.145190Z] 21:03:08     INFO - GECKO(1096) | #12: nsBaseAppShell::Run [widget/nsBaseAppShell.cpp:158]
[task 2017-08-02T21:03:08.146891Z] 21:03:08     INFO - 
[task 2017-08-02T21:03:08.148850Z] 21:03:08     INFO - GECKO(1096) | #13: nsAppStartup::Run [toolkit/components/startup/nsAppStartup.cpp:288]
[task 2017-08-02T21:03:08.150534Z] 21:03:08     INFO - 
[task 2017-08-02T21:03:08.156527Z] 21:03:08     INFO - GECKO(1096) | #14: XREMain::XRE_mainRun [toolkit/xre/nsAppRunner.cpp:4609]
[task 2017-08-02T21:03:08.158240Z] 21:03:08     INFO - 
[task 2017-08-02T21:03:08.163241Z] 21:03:08     INFO - GECKO(1096) | #15: XREMain::XRE_main [toolkit/xre/nsAppRunner.cpp:4772]
[task 2017-08-02T21:03:08.164981Z] 21:03:08     INFO - 
[task 2017-08-02T21:03:08.166927Z] 21:03:08     INFO - GECKO(1096) | #16: XRE_main [toolkit/xre/nsAppRunner.cpp:4867]
[task 2017-08-02T21:03:08.168621Z] 21:03:08     INFO - 
[task 2017-08-02T21:03:08.212617Z] 21:03:08     INFO - GECKO(1096) | #17: do_main [browser/app/nsBrowserApp.cpp:236]
[task 2017-08-02T21:03:08.212897Z] 21:03:08     INFO - 
[task 2017-08-02T21:03:08.214360Z] 21:03:08     INFO - GECKO(1096) | #18: main [browser/app/nsBrowserApp.cpp:311]
[task 2017-08-02T21:03:08.214625Z] 21:03:08     INFO - 
[task 2017-08-02T21:03:08.215787Z] 21:03:08     INFO - GECKO(1096) | #19: libc.so.6 + 0x20830
[task 2017-08-02T21:03:08.216219Z] 21:03:08     INFO - 
[task 2017-08-02T21:03:08.218950Z] 21:03:08     INFO - GECKO(1096) | #20: _start
[task 2017-08-02T21:03:08.220626Z] 21:03:08     INFO - 
[task 2017-08-02T21:03:08.224064Z] 21:03:08     INFO - GECKO(1096) | ExceptionHandler::GenerateDump cloned child 1139
[task 2017-08-02T21:03:08.226103Z] 21:03:08     INFO - GECKO(1096) | ExceptionHandler::SendContinueSignalToChild sent continue signal to child
[task 2017-08-02T21:03:08.229143Z] 21:03:08     INFO - GECKO(1096) | ExceptionHandler::WaitForContinueSignal waiting for continue signal...
[task 2017-08-02T21:05:21.621028Z] 21:05:21     INFO -  Traceback (most recent call last):
[task 2017-08-02T21:05:21.624271Z] 21:05:21     INFO -    File "/home/worker/workspace/build/tests/mochitest/runtests.py", line 2540, in doTests
[task 2017-08-02T21:05:21.626258Z] 21:05:21     INFO -      marionette_args=marionette_args,
[task 2017-08-02T21:05:21.628480Z] 21:05:21     INFO -    File "/home/worker/workspace/build/tests/mochitest/runtests.py", line 2141, in runApp
[task 2017-08-02T21:05:21.635032Z] 21:05:21     INFO -      self.marionette.start_session(timeout=port_timeout)
[task 2017-08-02T21:05:21.637115Z] 21:05:21     INFO -    File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/marionette_driver/decorators.py", line 28, in _
[task 2017-08-02T21:05:21.639003Z] 21:05:21     INFO -      m._handle_socket_failure()
[task 2017-08-02T21:05:21.641075Z] 21:05:21     INFO -    File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/marionette_driver/decorators.py", line 23, in _
[task 2017-08-02T21:05:21.643014Z] 21:05:21     INFO -      return func(*args, **kwargs)
[task 2017-08-02T21:05:21.645361Z] 21:05:21     INFO -    File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/marionette_driver/marionette.py", line 1208, in start_session
[task 2017-08-02T21:05:21.647271Z] 21:05:21     INFO -      self.protocol, _ = self.client.connect()
[task 2017-08-02T21:05:21.649353Z] 21:05:21     INFO -    File "/home/worker/workspace/build/venv/local/lib/python2.7/site-packages/marionette_driver/transport.py", line 223, in connect
[task 2017-08-02T21:05:21.651216Z] 21:05:21     INFO -      self.sock.connect((self.addr, self.port))
[task 2017-08-02T21:05:21.653152Z] 21:05:21     INFO -    File "/usr/lib/python2.7/socket.py", line 228, in meth
[task 2017-08-02T21:05:21.654992Z] 21:05:21     INFO -      return getattr(self._sock,name)(*args)
[task 2017-08-02T21:05:21.656860Z] 21:05:21     INFO -  error: [Errno 111] Connection refused
Flags: needinfo?(bugmail)
Whoops. I guess we should only be calling this in the content process.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4cb2e1b1da149b5985d9075a84e983b806c143fc
Flags: needinfo?(bugmail)
That's looking better. I'll reland with the fix.
Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7b9d0ff207eb
Update null checks to check the correct IPC actor. r=billm
https://hg.mozilla.org/mozilla-central/rev/7b9d0ff207eb
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Comment on attachment 8892912 [details]
Bug 1383386 - Update null checks to check the correct IPC actor.

Approval Request Comment
[Feature/Bug causing the regression]: bug 1305165
[User impact if declined]: low-volume crash
[Is this code covered by automated tests?]: yeah
[Has the fix been verified in Nightly?]: no STR so need to wait for confirmation via crash-stats
[Needs manual test from QE? If yes, steps to reproduce]: no
[List of other uplifts needed for the feature/fix]: none
[Is the change risky?]: low risk
[Why is the change risky/not risky?]: correcting a fairly obvious null check error.
[String changes made/needed]: none
Attachment #8892912 - Flags: approval-mozilla-beta?
Comment on attachment 8892912 [details]
Bug 1383386 - Update null checks to check the correct IPC actor.

Crash fix, let's uplift for 56 beta 2.
Attachment #8892912 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #15)
> [Is this code covered by automated tests?]: yeah
> [Has the fix been verified in Nightly?]: no STR so need to wait for
> confirmation via crash-stats
> [Needs manual test from QE? If yes, steps to reproduce]: no

Setting qe-verify- based on Kartikaya's assessment on manual testing needs and the fact that this fix has automated coverage.
Flags: qe-verify-
You need to log in before you can comment on or make changes to this bug.