Last Comment Bug 691443 - crash [@ nsAccessibleWrap::GetHWNDFor]
: crash [@ nsAccessibleWrap::GetHWNDFor]
Status: RESOLVED FIXED
: crash
Product: Core
Classification: Components
Component: Disability Access APIs (show other bugs)
: Trunk
: x86 Windows 7
: -- critical (vote)
: mozilla10
Assigned To: alexander :surkov
:
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2011-10-03 11:58 PDT by Naoki Hirata :nhirata (please use needinfo instead of cc)
Modified: 2011-10-11 02:38 PDT (History)
6 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
patch (2.07 KB, patch)
2011-10-06 01:25 PDT, alexander :surkov
dbolter: review+
Details | Diff | Review

Description Naoki Hirata :nhirata (please use needinfo instead of cc) 2011-10-03 11:58:52 PDT
This bug was filed from the Socorro interface and is 
report bp-972f2e40-c739-40d6-8e87-926442111001 .
============================================================= 
0 	xul.dll 	nsAccessibleWrap::GetHWNDFor 	accessible/src/msaa/nsAccessibleWrap.cpp:1621
1 	xul.dll 	nsAccessibleWrap::FirePlatformEvent 	accessible/src/msaa/nsAccessibleWrap.cpp:1572
2 	xul.dll 	nsAccessibleWrap::HandleAccEvent 	accessible/src/msaa/nsAccessibleWrap.cpp:1528
3 	xul.dll 	nsHyperTextAccessibleWrap::HandleAccEvent 	accessible/src/msaa/nsHyperTextAccessibleWrap.cpp:77
4 	xul.dll 	nsEventShell::FireEvent 	accessible/src/base/nsEventShell.cpp:63
5 	xul.dll 	nsDocAccessible::ProcessLoad 	accessible/src/base/nsDocAccessible.cpp:1551
6 	xul.dll 	NotificationController::WillRefresh 	accessible/src/base/NotificationController.cpp:286
7 	xul.dll 	nsRefreshDriver::Notify 	layout/base/nsRefreshDriver.cpp:326
8 	xul.dll 	nsTimerImpl::Fire 	xpcom/threads/nsTimerImpl.cpp:427
9 	xul.dll 	nsTimerEvent::Run 	xpcom/threads/nsTimerImpl.cpp:520
10 	xul.dll 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:631
11 	xul.dll 	NS_ProcessNextEvent_P 	obj-firefox/xpcom/build/nsThreadUtils.cpp:245
12 	xul.dll 	mozilla::ipc::MessagePump::Run 	ipc/glue/MessagePump.cpp:134
13 	xul.dll 	MessageLoop::RunInternal 	ipc/chromium/src/base/message_loop.cc:208
14 	xul.dll 	MessageLoop::RunHandler 	ipc/chromium/src/base/message_loop.cc:201
15 	xul.dll 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:175
16 	xul.dll 	nsBaseAppShell::Run 	widget/src/xpwidgets/nsBaseAppShell.cpp:189
17 	xul.dll 	nsAppShell::Run 	widget/src/windows/nsAppShell.cpp:261
18 	xul.dll 	XRE_RunAppShell 	toolkit/xre/nsEmbedFunctions.cpp:677
19 	xul.dll 	mozilla::ipc::MessagePumpForChildProcess::Run 	ipc/glue/MessagePump.cpp:215
20 	xul.dll 	MessageLoop::RunInternal 	ipc/chromium/src/base/message_loop.cc:208
21 	xul.dll 	MessageLoop::RunHandler 	ipc/chromium/src/base/message_loop.cc:201
22 	xul.dll 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:175
23 	xul.dll 	XRE_InitChildProcess 	toolkit/xre/nsEmbedFunctions.cpp:516
24 	plugin-container.exe 	NS_internal_main 	ipc/app/MozillaRuntimeMain.cpp:81
25 	plugin-container.exe 	wmain 	toolkit/xre/nsWindowsWMain.cpp:107
26 	plugin-container.exe 	__tmainCRTStartup 	crtexe.c:594
27 	kernel32.dll 	BaseThreadInitThunk 	
28 	ntdll.dll 	__RtlUserThreadStart 	
29 	ntdll.dll 	_RtlUserThreadStart

More crashes : https://crash-stats.mozilla.com/report/list?range_value=7&range_unit=days&date=2011-10-03%2009%3A00%3A00&signature=nsAccessibleWrap%3A%3AGetHWNDFor%28nsAccessible*%29&version=Fennec%3A10.0a1

#5 on top 10 crashes in Fennec Nightly.
Windows Only.
Comment 1 alexander :surkov 2011-10-03 20:12:22 PDT
Robert, when nsIFrame::GetNearestWidget can return null? Presumably we get nearest widget on root frame of tab document (nsIPresShell::GetRootFrame).
Comment 2 alexander :surkov 2011-10-03 20:12:38 PDT
Can we get regression range please?
Comment 3 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2011-10-04 15:07:46 PDT
(In reply to alexander surkov from comment #1)
> Robert, when nsIFrame::GetNearestWidget can return null? Presumably we get
> nearest widget on root frame of tab document (nsIPresShell::GetRootFrame).

That can sometimes happen when the presentation has been detached from the window while something is being torn down. It can also happen during printing.
Comment 4 Naoki Hirata :nhirata (please use needinfo instead of cc) 2011-10-04 16:51:08 PDT
It does not occur on Win XP; It does occur on Win 7

Found a repro steps : 
1. turn on an Accessibility application such as Narrator on Windows 7
2. go to http://people.mozilla.com/~nhirata/html_tp/formsninput.html
3. type a couple of lines of nonsense in the text area 
4. hit the submit button

Expected: no content crash
Actual: content crash

Note: 
1. after a crash happens the Accessibility application becomes unstable as well.

Regression range : 9/28 build works, 9/29 build crashes occasionally, 9/30 build crashes reliably.
Comment 5 Marco Zehe (:MarcoZ) 2011-10-05 01:20:09 PDT
I cannot reproduce this with NVDA and the current nightly of 2011-10-04. When I hit submit after typing in a few lines in the textarea, I get a "Redirecting back to forms", and the doorhanger saying that Firefox prevented this page from automatically redirecting... I then hit "allow", and am returned to the page. No crash.
Comment 6 Marco Zehe (:MarcoZ) 2011-10-05 01:20:40 PDT
Addition:L This is on Win7 x64 with the 32 bit version of Nightly.
Comment 7 alexander :surkov 2011-10-05 01:24:07 PDT
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #3)
> (In reply to alexander surkov from comment #1)
> > Robert, when nsIFrame::GetNearestWidget can return null? Presumably we get
> > nearest widget on root frame of tab document (nsIPresShell::GetRootFrame).
> 
> That can sometimes happen when the presentation has been detached from the
> window while something is being torn down. It can also happen during
> printing.

We fire accessibility events when refresh driver notifies us (mPresShell->AddRefreshObserver(this, Flush_Display) and this point we assume that is nsIFrame::GetNearestWidget is never null, otherwise we're not able to fire events. 

At what time we are ok to request GetNearestWidget and don't get a null? Is there any hint? Because otherwise we are basically faced to either crash or missed events.
Comment 8 Naoki Hirata :nhirata (please use needinfo instead of cc) 2011-10-05 09:30:56 PDT
This is on the nightly of Fennec windows desktop; which probably should have a lower priority.

http://ftp.mozilla.org/pub/mozilla.org/mobile/nightly/latest-mozilla-central-win32/
Comment 9 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2011-10-05 16:33:46 PDT
(In reply to alexander surkov from comment #7)
> We fire accessibility events when refresh driver notifies us
> (mPresShell->AddRefreshObserver(this, Flush_Display) and this point we
> assume that is nsIFrame::GetNearestWidget is never null, otherwise we're not
> able to fire events.

Hmm.

> At what time we are ok to request GetNearestWidget and don't get a null? Is
> there any hint? Because otherwise we are basically faced to either crash or
> missed events.

I think you can probably just miss the event.

Hmm, this is Fennec? Maybe you can get null GetNearestWidget more often there, I don't know exactly how widget setup works there.
Comment 10 Naoki Hirata :nhirata (please use needinfo instead of cc) 2011-10-05 17:34:39 PDT
Fennec Windows Desktop to be specific.  It allows for windows users that don't have android devices to test on Fennec, however it's way low percentage of the population that uses this method for testing Fennec.  

I do know Martijn sometimes does use this as a testing platform.  I'm unsure if he does use any of the accessibility though.  Looping in mw22.
Comment 11 alexander :surkov 2011-10-05 20:28:56 PDT
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #9)

> > At what time we are ok to request GetNearestWidget and don't get a null? Is
> > there any hint? Because otherwise we are basically faced to either crash or
> > missed events.
> 
> I think you can probably just miss the event.

That's not an option because it must break screen reader user experience. We could do not emit events while GetNearestWidget is null collecting them. But this should be a problem if GetNearestWidget stays null for a long time.

> Hmm, this is Fennec? Maybe you can get null GetNearestWidget more often
> there, I don't know exactly how widget setup works there.

Can we cc somebody who knows please?
Comment 12 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2011-10-05 20:30:55 PDT
(In reply to alexander surkov from comment #11)
> (In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #9)
> 
> > > At what time we are ok to request GetNearestWidget and don't get a null? Is
> > > there any hint? Because otherwise we are basically faced to either crash or
> > > missed events.
> > 
> > I think you can probably just miss the event.
> 
> That's not an option because it must break screen reader user experience. We
> could do not emit events while GetNearestWidget is null collecting them. But
> this should be a problem if GetNearestWidget stays null for a long time.

If it's null, it's unlikely to ever become non-null. The document is either going away or it's never going to have a widget, which probably means it's being printed.
Comment 13 alexander :surkov 2011-10-05 20:58:33 PDT
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #12)

> If it's null, it's unlikely to ever become non-null. The document is either
> going away or it's never going to have a widget, which probably means it's
> being printed.

Is it print preview documents that never have a widget?

Anyway this crash is not about printing. It's likely we crash for tab document. Maybe it is temporary document that was loaded and about to destroy. Don't you know whether presshell lives longer than widget associated with document?
Comment 14 alexander :surkov 2011-10-05 21:08:21 PDT
(In reply to Naoki Hirata :nhirata from comment #8)
> This is on the nightly of Fennec windows desktop; which probably should have
> a lower priority.
> 
> http://ftp.mozilla.org/pub/mozilla.org/mobile/nightly/latest-mozilla-central-
> win32/

it doesn't crash for me (with NVDA running) with steps to reproduce from comment #4.
Comment 15 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2011-10-05 21:29:42 PDT
(In reply to alexander surkov from comment #13)
> Anyway this crash is not about printing. It's likely we crash for tab
> document. Maybe it is temporary document that was loaded and about to
> destroy. Don't you know whether presshell lives longer than widget
> associated with document?

On Fennec, I don't know.

If this is only on a configuration hardly anyone uses, we shouldn't worry about it. I'd just null-check, drop the event, and maybe emit a warning in debug builds.
Comment 16 alexander :surkov 2011-10-05 21:40:40 PDT
what about desktop? We could shutdown document accessible earlier and don't do excess processing.
Comment 17 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2011-10-05 21:43:52 PDT
Yes, i think you could do that, although it probably doesn't matter.
Comment 18 alexander :surkov 2011-10-06 01:25:59 PDT
Created attachment 565160 [details] [diff] [review]
patch

null check, no warnings since the code is popups support and is not a case of the crash.
Comment 19 alexander :surkov 2011-10-06 01:26:41 PDT
(In reply to alexander surkov from comment #18)
> Created attachment 565160 [details] [diff] [review] [diff] [details] [review]
> patch
> 
> null check, no warnings since the code is popups support and is not a case
> of the crash.

anyway we'll warn when empty GetHWND returns.
Comment 20 Martijn Wargers [:mwargers] (not working for Mozilla) 2011-10-06 03:24:12 PDT
(In reply to Naoki Hirata :nhirata from comment #10)
> I do know Martijn sometimes does use this as a testing platform.  I'm unsure
> if he does use any of the accessibility though.  Looping in mw22.

I'm not doing any (or hardly) fuzz testing anymore.
Comment 21 David Bolter [:davidb] 2011-10-06 07:03:56 PDT
Comment on attachment 565160 [details] [diff] [review]
patch

ok r=me
Comment 22 Naoki Hirata :nhirata (please use needinfo instead of cc) 2011-10-06 09:52:25 PDT
Would you like me to test on a debug build before you land the patch?
I'm willing to do that.
Comment 23 alexander :surkov 2011-10-06 10:03:27 PDT
(In reply to Naoki Hirata :nhirata from comment #22)
> Would you like me to test on a debug build before you land the patch?
> I'm willing to do that.

please, just in case try build is http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/surkov.alexander@gmail.com-ee16bfda13f9/
Comment 24 Naoki Hirata :nhirata (please use needinfo instead of cc) 2011-10-10 13:12:25 PDT
I couldn't reproduce the issue with the try build.  

Just looking at the patch, is the patch only doing a null check on widget?
I'd like to understand so that I might be able to create fixes in the future.
Comment 25 alexander :surkov 2011-10-10 17:08:56 PDT
(In reply to Naoki Hirata :nhirata from comment #24)
> I couldn't reproduce the issue with the try build.  
> 
> Just looking at the patch, is the patch only doing a null check on widget?

yes, it's null check

> I'd like to understand so that I might be able to create fixes in the future.

it might be more complicated than this case but thanks.
Comment 26 alexander :surkov 2011-10-10 17:09:45 PDT
inbound - https://hg.mozilla.org/integration/mozilla-inbound/rev/f4fa6df431be
Comment 27 Marco Bonardo [::mak] 2011-10-11 02:38:58 PDT
https://hg.mozilla.org/mozilla-central/rev/f4fa6df431be

Note You need to log in before you can comment on or make changes to this bug.