Last Comment Bug 765252 - crash in nsCaretAccessible::RemoveDocSelectionListener
: crash in nsCaretAccessible::RemoveDocSelectionListener
Status: RESOLVED FIXED
[native-crash]
: crash
Product: Core
Classification: Components
Component: Disability Access APIs (show other bugs)
: Trunk
: All Linux
: -- critical (vote)
: mozilla17
Assigned To: Trevor Saunders (:tbsaunde)
:
: alexander :surkov
Mentors:
Depends on: 778666
Blocks:
  Show dependency treegraph
 
Reported: 2012-06-15 08:42 PDT by Scoobidiver (away)
Modified: 2012-08-30 02:19 PDT (History)
8 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
affected
fixed


Attachments
firefox_2012-07-20-133858_cpeterson-07560.crash (78.58 KB, text/plain)
2012-07-20 14:05 PDT, Chris Peterson [:cpeterson]
no flags Details
patch (20.72 KB, patch)
2012-07-21 04:51 PDT, Trevor Saunders (:tbsaunde)
surkov.alexander: review+
bzbarsky: review+
dbolter: review+
Details | Diff | Splinter Review

Description Scoobidiver (away) 2012-06-15 08:42:23 PDT
There are a few crashes in the trunk.

Signature 	@0x0 | nsCOMPtr_base::assign_from_qi | nsCaretAccessible::RemoveDocSelectionListener More Reports Search
UUID	301a2ddc-924b-4186-aab2-29a4a2120606
Date Processed	2012-06-06 22:21:41
Uptime	5190
Last Crash	1.1 weeks before submission
Install Age	1.1 days since version was first installed.
Install Time	2012-06-05 19:48:58
Product	Firefox
Version	16.0a1
Build ID	20120605030522
Release Channel	nightly
OS	Linux
OS Version	0.0.0 Linux 3.3.0-0.12.el7.x86_64 #1 SMP Mon May 7 13:22:14 EDT 2012 x86_64
Build Architecture	amd64
Build Architecture Info	family 6 model 42 stepping 7
Crash Reason	SIGSEGV
Crash Address	0x0
App Notes 	
OpenGL: Tungsten Graphics, Inc -- Mesa DRI Intel(R) Sandybridge Mobile  -- 2.1 Mesa 8.0.3 -- texture_from_pixmap
EMCheckCompatibility	False

Frame 	Module 	Signature 	Source
0 		@0x0 	
1 	libxul.so 	nsCOMPtr_base::assign_from_qi 	obj-firefox/xpcom/build/nsCOMPtr.cpp:14
2 	libxul.so 	nsCaretAccessible::RemoveDocSelectionListener 	accessible/src/base/nsCaretAccessible.cpp:154
3 	libxul.so 	DocAccessible::RemoveEventListeners 	accessible/src/generic/DocAccessible.cpp:820
4 	libxul.so 	DocAccessible::Shutdown 	accessible/src/generic/DocAccessible.cpp:644
5 	libxul.so 	nsAccDocManager::HandleEvent 	accessible/src/base/nsAccDocManager.cpp:283
6 	libxul.so 	nsEventListenerManager::HandleEventSubType 	content/events/src/nsEventListenerManager.cpp:809
7 	libxul.so 	nsEventListenerManager::HandleEventInternal 	content/events/src/nsEventListenerManager.cpp:866
8 	libxul.so 	nsEventTargetChainItem::HandleEventTargetChain 	content/events/src/nsEventListenerManager.h:137
9 	libxul.so 	nsEventDispatcher::Dispatch 	content/events/src/nsEventDispatcher.cpp:643
10 	libxul.so 	nsEventDispatcher::DispatchDOMEvent 	content/events/src/nsEventDispatcher.cpp:706
11 	libxul.so 	nsDocument::DispatchPageTransition 	content/base/src/nsDocument.cpp:7267
12 	libxul.so 	nsDocument::OnPageHide 	content/base/src/nsDocument.cpp:7378
13 	libxul.so 	DocumentViewerImpl::PageHide 	layout/base/nsDocumentViewer.cpp:1253
14 	libxul.so 	nsDocShell::FirePageHideNotification 	docshell/base/nsDocShell.cpp:1599
15 	libxul.so 	nsDocShell::Destroy 	docshell/base/nsDocShell.cpp:4644
16 	libxul.so 	nsFrameLoader::Finalize 	content/base/src/nsFrameLoader.cpp:542
17 	libxul.so 	nsDocument::MaybeInitializeFinalizeFrameLoaders 	content/base/src/nsDocument.cpp:5468
18 	libxul.so 	nsDocument::EndUpdate 	content/base/src/nsDocument.cpp:4002
19 	libxul.so 	nsXULDocument::EndUpdate 	content/xul/document/src/nsXULDocument.cpp:3303
20 	libxul.so 	nsINode::doRemoveChildAt 	content/base/src/mozAutoDocUpdate.h:35
21 	libxul.so 	nsGenericElement::RemoveChildAt 	content/base/src/nsGenericElement.cpp:3803
22 	libxul.so 	nsXULElement::RemoveChildAt 	content/xul/content/src/nsXULElement.cpp:941
23 	libxul.so 	nsINode::RemoveChild 	content/base/src/nsGenericElement.cpp:504
24 	libxul.so 	nsIDOMNode_RemoveChild 	obj-firefox/js/xpconnect/src/dom_quickstubs.cpp:5469
25 	libxul.so 	js::InvokeKernel 	js/src/jscntxtinlines.h:395
...

More reports at:
https://crash-stats.mozilla.com/report/list?signature=%400x0+|+nsCOMPtr_base%3A%3Aassign_from_qi+|+nsCaretAccessible%3A%3ARemoveDocSelectionListener
Comment 1 David Bolter [:davidb] 2012-07-09 06:39:47 PDT
This was a native fennec crash. Eitan can you assess this one?
Comment 2 Chris Peterson [:cpeterson] 2012-07-20 14:05:55 PDT
Created attachment 644466 [details]
firefox_2012-07-20-133858_cpeterson-07560.crash

I have hit this same crash with Mac Nightly 17 (builds 7-18, 7-19, and 7-20) about four times in the past two days. I have Mac accessibility enabled indirectly because I have multiple language input sources enabled for testing.

The crashes were caught by Apple's Problem Reporter, not Firefox's Breakpad. I've attached one of my Apple Problem Reports. I have more if that would be useful.

I also hit this same crash a couple times in June with Mac Nightly 16 builds 6-21 and 6-29.
Comment 3 Trevor Saunders (:tbsaunde) 2012-07-20 14:44:41 PDT
(In reply to Chris Peterson (:cpeterson) from comment #2)
> Created attachment 644466 [details]
> firefox_2012-07-20-133858_cpeterson-07560.crash
> 
> I have hit this same crash with Mac Nightly 17 (builds 7-18, 7-19, and 7-20)
> about four times in the past two days. I have Mac accessibility enabled
> indirectly because I have multiple language input sources enabled for
> testing.
> 
> The crashes were caught by Apple's Problem Reporter, not Firefox's Breakpad.
> I've attached one of my Apple Problem Reports. I have more if that would be
> useful.
> 
> I also hit this same crash a couple times in June with Mac Nightly 16 builds
> 6-21 and 6-29.

I assume you don't have any idea how to reproduce right?
Comment 4 Trevor Saunders (:tbsaunde) 2012-07-20 14:48:28 PDT
So, at the beginning of DocAccessible::Shutdown() we bail if mPresShell is null.  The only way afaik for mPresShell to become null is for DocAccessible::Shutdown() to get pased RemoveEventListeners() which calls nsCaretAccessible::RemoveDomSelectionListener() where we crash.  So, I'm not really sure what *is* happening here, but off hand it doesn't make much sense.
Comment 5 Chris Peterson [:cpeterson] 2012-07-20 15:08:22 PDT
Sorry, I don't know how to reproduce this problem.

Note that the Mac stack trace points to a bogus pure virtual function call (from a destructor invoked by assign_from_qi?). Perhaps the Linux crash is algo a bogus pure virtual function call that manifests itself as a null pointer crash?

Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0   libsystem_kernel.dylib        	0x00007fff8a568ce2 __pthread_kill + 10
1   libsystem_c.dylib             	0x00007fff8b2a77d2 pthread_kill + 95
2   libsystem_c.dylib             	0x00007fff8b298a7a abort + 143
3   libc++abi.dylib               	0x00007fff8f1587bc abort_message + 214
4   libc++abi.dylib               	0x00007fff8f15609c __cxa_pure_virtual + 18
5   XUL                           	0x0000000101ee57b0 nsCOMPtr_base::assign_from_qi(nsQueryInterface, nsID const&) + 32
6   XUL                           	0x0000000101d40028 nsCaretAccessible::RemoveDocSelectionListener(nsIPresShell*) + 40
...
Comment 6 Chris Peterson [:cpeterson] 2012-07-20 17:42:53 PDT
I hit this crash again. I don't know if it is related, but Firefox crashed about 0.5 seconds after I closed a tab where I had finished watching a Flash video on youtube.
Comment 7 David Bolter [:davidb] 2012-07-20 17:50:26 PDT
cc+ Hub
Comment 8 Trevor Saunders (:tbsaunde) 2012-07-20 18:11:34 PDT
(In reply to Chris Peterson (:cpeterson) from comment #5)
> Sorry, I don't know how to reproduce this problem.
> 
> Note that the Mac stack trace points to a bogus pure virtual function call
> (from a destructor invoked by assign_from_qi?). Perhaps the Linux crash is
> algo a bogus pure virtual function call that manifests itself as a null
> pointer crash?

(In reply to Chris Peterson (:cpeterson) from comment #6)
> I hit this crash again. I don't know if it is related, but Firefox crashed
> about 0.5 seconds after I closed a tab where I had finished watching a Flash
> video on youtube.

so, I think what's happening is that DocAccessibles hold a weak pointer to their presShell, and I suspect the pres shell is going away and the weak ref isn't being dropped.

it looks like it must be true that before a pres shell is deleted its destroy() method is called, which calls nsAccessibilityService::PresShellDestroyed() which gets the document for the pres shell, then gets the accessible document for that dom doc, and shuts it down dropping the ref.  The only bit of this that appears to be falable is getting the dom doc from the pres shell, can that ever actually fail?

One fix would be to make the doc accessible hold a ref to the pres shell, but I suspect that would mean we leak the pres shell atleast for a while if not until shut down.

I think better fixes would be to look into having nsAccDocManager map from pres shells to doc accessibles (which we've thought about before iirc), or we can may be store a pointer to the doc accessible in the pres shell, and remove the indirection through nsAccessibilityService.
Comment 9 alexander :surkov 2012-07-20 18:23:49 PDT
CC'ing Boris for comment #8 questions

(In reply to Trevor Saunders (:tbsaunde) from comment #8)

> I think better fixes would be to look into having nsAccDocManager map from
> pres shells to doc accessibles (which we've thought about before iirc)

right, that's the way to go in either case
Comment 10 Trevor Saunders (:tbsaunde) 2012-07-20 18:38:32 PDT
> (In reply to Trevor Saunders (:tbsaunde) from comment #8)
> 
> > I think better fixes would be to look into having nsAccDocManager map from
> > pres shells to doc accessibles (which we've thought about before iirc)
> 
> right, that's the way to go in either case

well, if Boris is ok with making pres shells a word bigger I'd rather stash a pointer to the doc accessible there, rather than have a hash table since we can do the conversion in bits, and it should end up being faster.
Comment 11 Boris Zbarsky [:bz] (still a bit busy) 2012-07-20 18:45:33 PDT
> The only bit of this that appears to be falable is getting the dom doc from the pres
> shell, can that ever actually fail?

Not before Destroy().

Adding a word to presshells is not a problem.
Comment 12 Trevor Saunders (:tbsaunde) 2012-07-20 19:07:04 PDT
(In reply to Boris Zbarsky (:bz) from comment #11)
> > The only bit of this that appears to be falable is getting the dom doc from the pres
> > shell, can that ever actually fail?
> 
> Not before Destroy().

interesting, I'm not sure if its worth trying to figure out what's happening or if we should just stick a pointer to the doc accessible on the pres shell and I'm fairly confident just avoid the problem.

> Adding a word to presshells is not a problem.

cool.
Comment 13 Trevor Saunders (:tbsaunde) 2012-07-21 04:51:10 PDT
Created attachment 644622 [details] [diff] [review]
patch

It'd be nice if we didn't have to export quiet as much, but I don't have any good ideas how to get around that

surkov for all the a11y reorg
bz for the pres shell stuff
Comment 14 Boris Zbarsky [:bz] (still a bit busy) 2012-07-21 07:14:58 PDT
Comment on attachment 644622 [details] [diff] [review]
patch

> +  { mAccDocument = aAccDocument; }

Please do it like so:

{
  mAccDocument = aAccDocument;
}

and r=me.
Comment 15 David Bolter [:davidb] 2012-07-23 13:10:08 PDT
Comment on attachment 644622 [details] [diff] [review]
patch

Review of attachment 644622 [details] [diff] [review]:
-----------------------------------------------------------------

Nice, I can r=me this, but I'll look around again later for potential follow up.

::: layout/base/nsPresShell.cpp
@@ +915,5 @@
> +    if (a11y::logging::IsEnabled(a11y::logging::eDocDestroy))
> +      a11y::logging::DocDestroy("presshell destroyed", mDocument);
> +#endif
> +
> +    mAccDocument->Shutdown();

Just curious, why don't you do the logging inside mAccDocument->shutdown()?
Comment 16 Trevor Saunders (:tbsaunde) 2012-07-23 17:51:50 PDT
(In reply to David Bolter [:davidb] from comment #15)
> Comment on attachment 644622 [details] [diff] [review]
> patch
> 
> Review of attachment 644622 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Nice, I can r=me this, but I'll look around again later for potential follow
> up.
> 
> ::: layout/base/nsPresShell.cpp
> @@ +915,5 @@
> > +    if (a11y::logging::IsEnabled(a11y::logging::eDocDestroy))
> > +      a11y::logging::DocDestroy("presshell destroyed", mDocument);
> > +#endif
> > +
> > +    mAccDocument->Shutdown();
> 
> Just curious, why don't you do the logging inside mAccDocument->shutdown()?

well, we already log there, so I'm not really clear on why both are needed, but otherwise just to keep things the same general way they were, no particularly good reason.
Comment 17 Trevor Saunders (:tbsaunde) 2012-07-25 16:49:46 PDT
landed https://hg.mozilla.org/integration/mozilla-inbound/rev/30fbd796d3bb
Comment 18 alexander :surkov 2012-07-25 17:50:26 PDT
Comment on attachment 644622 [details] [diff] [review]
patch

Review of attachment 644622 [details] [diff] [review]:
-----------------------------------------------------------------

my belated r+ :)

please file follow up to add getter method and get rid the cache on a11y side

::: accessible/src/atk/Makefile.in
@@ +44,5 @@
> +
> +EXPORTS_mozilla/a11y = \
> +	AccessibleWrap.h \
> +	HyperTextAccessibleWrap.h \
> +	$(null)

do spaces instead tabs

::: accessible/src/base/Makefile.in
@@ +47,5 @@
>  endif
>  
>  EXPORTS = \
>    a11yGeneric.h \
> +	AccEvent.h \

here's too

@@ +64,5 @@
>  
> +ifdef MOZ_DEBUG
> +EXPORTS_mozilla/a11y += \
> +	Logging.h \
> +	$(NULL)

same here

::: accessible/src/generic/DocAccessible-inl.h
@@ +38,5 @@
>    if (mNotificationController && HasLoadState(eTreeConstructed))
>      mNotificationController->ScheduleTextUpdate(aTextNode);
>  }
>  
> +  inline void

space before 'inline'

::: accessible/src/generic/Makefile.in
@@ +32,5 @@
>  
>  EXPORTS_mozilla/a11y = \
>    Accessible.h \
> +	DocAccessible.h \
> +	HyperTextAccessible.h \

same problem and below

::: layout/base/nsIPresShell.h
@@ +1329,5 @@
>    // GetRootFrame() can be inlined:
>    nsFrameManagerBase*       mFrameManager;
>    nsWeakPtr                 mForwardingContainer;
> +#ifdef ACCESSIBILITY
> +  DocAccessible* mAccDocument;

you could keep Accessible* pointer and you wouldn't need to export all these wrap classes (note, msaa versions includes msaa specific headers what may give unwanted effect).
Comment 19 alexander :surkov 2012-07-25 18:19:56 PDT
I wonder whether we need to listen pagehide DOM event to shutdown doc accessible if we shutdown it when preshsell goes away.
Comment 20 Trevor Saunders (:tbsaunde) 2012-07-25 18:32:55 PDT
(In reply to alexander :surkov from comment #19)
> I wonder whether we need to listen pagehide DOM event to shutdown doc
> accessible if we shutdown it when preshsell goes away.

I'm not really sure if we need, or for that matter the one in OuterDocAccessible::Shutdown() maybe we could just replace that with an assert there is no kid left?
Comment 21 Trevor Saunders (:tbsaunde) 2012-07-25 18:37:02 PDT
> please file follow up to add getter method and get rid the cache on a11y side

I already have  some stuff in my queue going down this road.

> ::: accessible/src/atk/Makefile.in
> @@ +44,5 @@
> > +
> > +EXPORTS_mozilla/a11y = \
> > +	AccessibleWrap.h \
> > +	HyperTextAccessibleWrap.h \
> > +	$(null)
> 
> do spaces instead tabs

I wonder how that didn't break anything in a Makefile :/
I'll push a follow up for these, and the inline nit unless you mind.

> ::: layout/base/nsIPresShell.h
> @@ +1329,5 @@
> >    // GetRootFrame() can be inlined:
> >    nsFrameManagerBase*       mFrameManager;
> >    nsWeakPtr                 mForwardingContainer;
> > +#ifdef ACCESSIBILITY
> > +  DocAccessible* mAccDocument;
> 
> you could keep Accessible* pointer and you wouldn't need to export all these
> wrap classes (note, msaa versions includes msaa specific headers what may
> give unwanted effect).

that's a reasonable idea, but I think I want to move some of the nsAccessibilityService methods into layout/ to avoid the overhead of the call, and make it a bit clearer what is actually happening, which would need a DocAccessible*.
Comment 22 Trevor Saunders (:tbsaunde) 2012-07-25 18:46:41 PDT
and https://hg.mozilla.org/integration/mozilla-inbound/rev/50e2e602bab0
Comment 23 alexander :surkov 2012-07-25 19:04:55 PDT
(In reply to Trevor Saunders (:tbsaunde) from comment #21)
> > please file follow up to add getter method and get rid the cache on a11y side
> 
> I already have  some stuff in my queue going down this road.

cool

> > do spaces instead tabs
> 
> I wonder how that didn't break anything in a Makefile :/
> I'll push a follow up for these, and the inline nit unless you mind.

thanks

> > you could keep Accessible* pointer and you wouldn't need to export all these
> > wrap classes (note, msaa versions includes msaa specific headers what may
> > give unwanted effect).
> 
> that's a reasonable idea, but I think I want to move some of the
> nsAccessibilityService methods into layout/ to avoid the overhead of the
> call, and make it a bit clearer what is actually happening, which would need
> a DocAccessible*.

ok, in either case we were going to separate wrappers from internal objects, just it should take a while and somebody can hit naming problems because of those headers
Comment 24 alexander :surkov 2012-07-25 19:06:22 PDT
(In reply to Trevor Saunders (:tbsaunde) from comment #20)
> (In reply to alexander :surkov from comment #19)
> > I wonder whether we need to listen pagehide DOM event to shutdown doc
> > accessible if we shutdown it when preshsell goes away.
> 
> I'm not really sure if we need, or for that matter the one in
> OuterDocAccessible::Shutdown() maybe we could just replace that with an
> assert there is no kid left?

we need to file a bug and investigate it, iirc that outerdoc trick had a cause.
Comment 25 Trevor Saunders (:tbsaunde) 2012-07-25 19:13:55 PDT
filed bug 777603 and bug 777606
Comment 26 Matt Brubeck (:mbrubeck) 2012-07-25 20:13:25 PDT
Sorry, I backed this out because it failed to build on Windows:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1c98db400e99
Comment 27 Boris Zbarsky [:bz] (still a bit busy) 2012-07-25 20:15:11 PDT
The presshell doesn't necessarily go away on pagehide.
Comment 28 Trevor Saunders (:tbsaunde) 2012-07-27 21:47:53 PDT
landed https://hg.mozilla.org/integration/mozilla-inbound/rev/30646643e581
Comment 29 Ryan VanderMeulen [:RyanVM] 2012-07-28 18:34:16 PDT
https://hg.mozilla.org/mozilla-central/rev/30646643e581

Note You need to log in before you can comment on or make changes to this bug.