Closed Bug 1128527 Opened 9 years ago Closed 9 years ago

Crash [@ mozilla::layers::ChromeProcessController::InitializeRoot() ] with layers.async-pan-zoom.enabled=true

Categories

(Core :: Panning and Zooming, defect)

x86_64
Windows 7
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla38
Tracking Status
firefox35 --- unaffected
firefox36 --- unaffected
firefox37 --- fixed
firefox38 --- fixed
b2g-v2.2 --- fixed
b2g-master --- fixed

People

(Reporter: johnp, Assigned: botond)

References

Details

(Keywords: crash, crashreportid)

Crash Data

Attachments

(4 files, 2 obsolete files)

First: I'm not sure if thats the right component, but as it happened in mozilla::layers... but please feel free to move ;)
So, Firefox was open background and VS debugger attached (trying to catch bug 1114079).
Suddenly the debugger popped up showing:

Unhandled exception at 0x00007FFA16E11D4F (xul.dll) in firefox.exe: 0xC0000005: Access violation reading location 0x0000000000000008.

The address resolved to jsexn.cpp ChromeProcessController::InitializeRoot() line
'nsIpresShell* presShell = view->GetPresShell();' in the code window. Note that this
happened right when the Nightly update notification came up and froze it while half-transparent.

I'll leave the debugger open for an hour or two so I can provide some more info if needed (e.g. locals of other points from the stack).
I'd also like to submit a crash-report, but I'm not sure if just detaching the debugger is enough to get the crash-reporter running... I'd appreciate any help in that if possible.


Locals:

-		this	0x0000000000000000 <NULL>	mozilla::layers::ChromeProcessController *
-		mozilla::layers::GeckoContentController	<struct at NULL>	mozilla::layers::GeckoContentController
		__vfptr	<Unable to read memory>	void * *
-		mRefCnt	{mValue={...} }	mozilla::ThreadSafeAutoRefCnt
-		mValue	{...}	mozilla::Atomic<unsigned __int64,2,void>
-		mozilla::detail::AtomicBaseIncDec<unsigned __int64,2>	{...}	mozilla::detail::AtomicBaseIncDec<unsigned __int64,2>
-		mozilla::detail::AtomicBase<unsigned __int64,2>	{mValue={...} }	mozilla::detail::AtomicBase<unsigned __int64,2>
-		mValue	{...}	std::atomic<unsigned __int64>
-		std::atomic_ullong	{_My_val=??? }	std::atomic_ullong
		_My_val	<Unable to read memory>	
-		mWidget	{...}	nsCOMPtr<nsIWidget>
-		nsCOMPtr_base	{mRawPtr=??? }	nsCOMPtr_base
		mRawPtr	<Unable to read memory>	
		mUILoop	<Unable to read memory>	
		presShellId	0	unsigned int
		viewId	0	unsigned __int64



Call Stack:

	xul.dll!mozilla::layers::ChromeProcessController::InitializeRoot() Line 45	C++
 	xul.dll!RunnableMethod<mozilla::ipc::MessageChannel,void (__cdecl mozilla::ipc::MessageChannel::*)(void) __ptr64,Tuple0>::Run() Line 311	C++
 	xul.dll!MessageLoop::DoWork() Line 447	C++
 	xul.dll!mozilla::ipc::DoWorkRunnable::Run() Line 234	C++
 	xul.dll!nsThread::ProcessNextEvent(bool aMayWait=false, bool * aResult=0x000000c00446f510) Line 861	C++
 	xul.dll!mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate * aDelegate=0x000000c00644e140) Line 99	C++
 	xul.dll!MessageLoop::RunHandler() Line 227	C++
 	xul.dll!MessageLoop::Run() Line 201	C++
 	xul.dll!nsBaseAppShell::Run() Line 166	C++
 	xul.dll!nsAppShell::Run() Line 180	C++
 	xul.dll!nsAppStartup::Run() Line 282	C++
 	xul.dll!XREMain::XRE_mainRun() Line 4160	C++
 	xul.dll!XREMain::XRE_main(int argc=-1263453952, char * * argv=0x0000000000000001, const nsXREAppData * aAppData=0x000000c006433000) Line 4236	C++
 	xul.dll!XRE_main(int argc=105050352, char * * argv=0x000000c004528760, const nsXREAppData * aAppData=0x000000c004528760, unsigned int aFlags=1) Line 4457	C++
 	firefox.exe!do_main(int argc=1533975489, char * * argv=0x0000000000000000, nsIFile * xreDirectory=0x000000c006431160) Line 294	C++
 	firefox.exe!NS_internal_main(int argc=0, char * * argv=0x00007ff7b44bd000) Line 669	C++
 	firefox.exe!wmain(int argc=0, wchar_t * * argv=0x0000000000000000) Line 124	C++
Crash Signature: [@ ChromeProcessController::InitializeRoot() ]
Keywords: crash
Crash Signature: [@ ChromeProcessController::InitializeRoot() ] → [@ mozilla::layers::ChromeProcessController::InitializeRoot() ]
Here's an example report from a post on mozillazine:

bp-09495cae-f63b-491f-8efe-f0d5f2150202
Blocking the bug that added this code 2 days ago.
:botond: Does this crash signature ring a bell?
Blocks: 1124452
Flags: needinfo?(botond)
Presumably you have the layers.async-pan-zoom.enabled pref set to true, is that correct?
In bug 1128613 Semtex mentioned STR for this crash:


    Steps to reproduce:

    This problem touch extensions like Gmail Notifier like also Gmail own notification about new mail:

    STR
    1. Instal https://addons.mozilla.org/pl/firefox/addon/gmail-notifier-restartless/?src=cb-dl-mostpopular
    Or enable Gmail notification in Gmail options
    2.Send new mail on Gmail from other account
    3. Wait for notification


    Actual results:

    4 Nightly will crash with this sig: https://crash-stats.mozilla.com/report/index/09495cae-f63b-491f-8efe-f0d5f2150202
    If You try resume Nightly will again crash.
Yes, I have layers.async-pan-zoom.enabled set to true. Is that an unsupported configuration on desktop?
Well it's not enabled by default, so it's unsupported. That being said we do want to make it work with that, so we should fix this. For now it might be wise to disable that pref.
Component: Graphics: Layers → Panning and Zooming
I guess we shouldn't be assuming that GetViewFor() always returns a non-null view?
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #6)
Yeah, that was an ill-posed question... I meant like an 'only mobile/fennec' pref or so, but it seems I mixed sth up there (checked the meta-bug and informed myself about apz, now everything's clear).

(In reply to Botond Ballo [:botond] from comment #7)
> nsView* nsView::GetViewFor(nsIWidget* aWidget)
> {
>  [...]
>  return listener ? listener->GetView() : nullptr;
> }

So just convert the MOZ_ASSERT to a proper null-check?
(In reply to Botond Ballo [:botond] from comment #7)
> I guess we shouldn't be assuming that GetViewFor() always returns a non-null
> view?

I think it should be a safe assumption /if/ the widget we're passing in is the right widget. Based on the scenario described in comment 0 I think this might be one of those weird edge cases on Win where we create a new widget for a dialog which has a compositor but we don't really want it to have APZC. We probably want to catch these cases in the nsBaseWidget code that tries to create the ChromeProcessController.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: crashreportid
OS: Windows 8.1 → Windows 7
Summary: Crash [@ChromeProcessController::InitializeRoot()] → Crash [@ mozilla::layers::ChromeProcessController::InitializeRoot() ]
Summary: Crash [@ mozilla::layers::ChromeProcessController::InitializeRoot() ] → Crash [@ mozilla::layers::ChromeProcessController::InitializeRoot() ] with layers.async-pan-zoom.enabled=true
Talked to :Bas and :nical on IRC, who said WindowType() is the way to find out what kind of widget we're dealing with. Looking at the list of possible window types, I *think* 'toplevel' is the only one we want APZ for.

:nical also suggested checking if OMTC is enabled, so I added that check as well.
Assignee: nobody → botond
Flags: needinfo?(botond)
Attachment #8559249 - Flags: review?(bugmail.mozilla)
Attachment #8559249 - Flags: review?(bas)
Comment on attachment 8559249 [details] [diff] [review]
Only enable APZ for toplevel windows with OMTC enabled

Johannes, if you're able to build with a custom patch, could you try with this one and see if it fixes the crash? If not, we can land it and test with the next nightly build.
Attachment #8559249 - Flags: feedback?(johnp)
Comment on attachment 8559249 [details] [diff] [review]
Only enable APZ for toplevel windows with OMTC enabled

Review of attachment 8559249 [details] [diff] [review]:
-----------------------------------------------------------------

::: widget/nsBaseWidget.cpp
@@ +1048,5 @@
>    mCompositorChild->Open(parentChannel, childMessageLoop, ipc::ChildSide);
>  
> +  if (gfxPrefs::AsyncPanZoomEnabled() &&
> +      WindowType() == eWindowType_toplevel &&
> +      ShouldUseOffMainThreadCompositing()) {

I don't think we should ever enter this function if ShouldUseOffMainThreadCompositing() is false, so that might be redundant. Still it doesn't matter to me if you leave it in.
Attachment #8559249 - Flags: review?(bugmail.mozilla) → review+
Attachment #8559249 - Flags: review?(bas) → review+
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #13)
> I don't think we should ever enter this function if
> ShouldUseOffMainThreadCompositing() is false, so that might be redundant.
> Still it doesn't matter to me if you leave it in.

You're right, all call sites to CreateCompositor() are already guarded by ShouldUseOffMainThreadCompositing().

Updated patch to remove this condition. Carrying r+.
Attachment #8559249 - Attachment is obsolete: true
Attachment #8559249 - Flags: feedback?(johnp)
Attachment #8559261 - Flags: review+
Comment on attachment 8559261 [details] [diff] [review]
Only enable APZ for toplevel windows

(See comment 12.)
Attachment #8559261 - Flags: feedback?(johnp)
Comment on attachment 8559261 [details] [diff] [review]
Only enable APZ for toplevel windows

Gigantic fail! This patch appears to disable APZ on b2g.
Attachment #8559261 - Flags: review-
Attachment #8559261 - Flags: review+
Attachment #8559261 - Flags: feedback?(johnp)
Looks like the window type for the root process' widget on b2g is eWindowType_child. Not sure why I thought that didn't need APZ...
Let's try that again.
Attachment #8559261 - Attachment is obsolete: true
Attachment #8559287 - Flags: review?(bugmail.mozilla)
Attachment #8559287 - Flags: review?(bas)
Comment on attachment 8559287 [details] [diff] [review]
Only enable APZ for toplevel and child windows

Review of attachment 8559287 [details] [diff] [review]:
-----------------------------------------------------------------

Considering the first patch fail I'm not sure I'm qualified to review this. I don't really know what the different window types are and when they get used. Redirecting to Jim who might know this better.
Attachment #8559287 - Flags: review?(jmathies)
Attachment #8559287 - Flags: review?(bugmail.mozilla)
Attachment #8559287 - Flags: feedback+
Attachment #8559287 - Flags: review?(bas) → review+
I can confirm the win32 try-build doesn't crash using the Notification API[1], while the current win64 Nightly crashes, both with the same profile with APZ enabled.

[1] http://jsbin.com/ziwod/1/edit?html,js,output
(In reply to Johannes Pfrang [:johnp] from comment #22)
> I can confirm the win32 try-build doesn't crash using the Notification
> API[1], while the current win64 Nightly crashes, both with the same profile
> with APZ enabled.

Great, thanks!
(In reply to Botond Ballo [:botond] from comment #20)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=cebfa1840527

Other than Windows 8 builds being busted (in general, unrelated to this bug), looks clean!
So, I'm still running the win32 try-build and I'm experiencing really weird graphic glitches on switching tabs and scrolling... wow... now it's blinking black and white omg... never saw sth like that on my normal win64 nightly. Could be bug 1067470, but I'm not sure... and it's getting worse oO
Attached image glitch1.png
Example screenshot. I'm gonna attach two more.
Attached image glitch2.png
Attached image glitch3.png
Judging from the pictures this is not bug 1067470.
Attachment #8559287 - Flags: review?(jmathies) → review+
https://hg.mozilla.org/mozilla-central/rev/35b05b3a542c
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla38
Comment on attachment 8559287 [details] [diff] [review]
Only enable APZ for toplevel and child windows

Approval Request Comment
[Feature/regressing bug #]: combination of bug 1110540 and bug 1124452. The root cause was bug 1110540 and then bug 1124452 came along and exposed a crash.
[User impact if declined]: With APZ enabled on desktop (which users have been doing by setting prefs), Firefox could crash upon showing dialogs boxes and other types of windows.
[Describe test coverage new/current, TreeHerder]: verified by reporter
[Risks and why]: pretty low risk; APZ is disabled by default on desktop so it should have no impact in the common case. We can skip this uplift if you don't think it's worth it, but I think it would be nice to robustify against users fiddling with prefs and accidentally crashing as a result.
[String/UUID change made/needed]: none
Attachment #8559287 - Flags: approval-mozilla-aurora?
Comment on attachment 8559287 [details] [diff] [review]
Only enable APZ for toplevel and child windows

Small change that although not absolutely necessary for 37 will benefit some early testers. I also like Botond's thought about not polluting crash data. As we still have some time on Aurora, let's take this fix. Aurora+
Attachment #8559287 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
(In reply to Lawrence Mandel [:lmandel] (use needinfo) from comment #32)
> I also like Botond's thought about not polluting crash data.

(I had a thought about not polluting crash data? News to me! :))
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: