Closed Bug 1263188 Opened 4 years ago Closed 4 years ago

crash in PLDHashTable::Search | mozilla::a11y::DocAccessible::UnbindFromDocument

Categories

(Core :: Disability Access APIs, defect, critical)

Unspecified
Windows NT
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla48
Tracking Status
firefox48 + fixed

People

(Reporter: MarcoZ, Assigned: surkov)

References

Details

(Keywords: crash, regression, topcrash)

Crash Data

Attachments

(5 files)

This bug was filed from the Socorro interface and is 
report bp-571c42ab-8b06-421d-ae90-aec012160408.
=============================================================
I encountered this while alt-tabbing back into today's Nightly. I had an IRCCloud chat tab open with Davidb at the time. 

1. I had alt-tabbed to the desktop, 
2. launched another app, 
3. and when I alt-tabbed back into Nightly, it immediately crashed.

In the meantime, David had sent in a few private messages to me. So the page I was about to come back to wasn't the same I had left earlier.

I suspect this is fall-out from bug 1261425, which landed between the 2016-04-07 and 2016-04-08 builds, and I'd never seen this crash before.
[Tracking Requested - why for this release]:

I'm experiencing this every time I search in the gmail interface.
We have 998 crash reports in Nightly in the past three days. That's a huge number. They started in Nightly 20160408030212.

There are also several other signatures that started at the same time and look like they are related:

> PLDHashTable::Search | nsBaseHashtable<T>::Get
> PLDHashTable::Remove | mozilla::a11y::DocAccessible::ShutdownChildrenInSubtree 
> PLDHashTable::Search | nsInterfaceHashtable<T>::Get
> PLDHashTable::Search | nsClassHashtable<T>::Get

They range from a handful to 100+.

The crash looks like it might be a null deref of mNodeToAccessibleMap?

Surkov, bug 1261425 landed on April 7. Could it be the cause? The crash rate is extreme, and if it's at all possible that bug is the cause then its patch should be backed out. Thank you.
Flags: needinfo?(surkov.alexander)
(In reply to Nicholas Nethercote [:njn] from comment #2)
> We have 998 crash reports in Nightly in the past three days. That's a huge
> number. They started in Nightly 20160408030212.
> 
> There are also several other signatures that started at the same time and
> look like they are related:
> 
> > PLDHashTable::Search | nsBaseHashtable<T>::Get
> > PLDHashTable::Remove | mozilla::a11y::DocAccessible::ShutdownChildrenInSubtree 
> > PLDHashTable::Search | nsInterfaceHashtable<T>::Get
> > PLDHashTable::Search | nsClassHashtable<T>::Get
> 
> They range from a handful to 100+.
> 
> The crash looks like it might be a null deref of mNodeToAccessibleMap?
> 
> Surkov, bug 1261425 landed on April 7. Could it be the cause? The crash rate
> is extreme, and if it's at all possible that bug is the cause then its patch
> should be backed out. Thank you.

It is indeed bug 1261425, I was able to reproduce in gmail like Benjamin, I hope to have a fix tomorrow or so.
Flags: needinfo?(surkov.alexander)
Attached patch stop bleedingSplinter Review
Attachment #8739828 - Flags: review?(yzenevich)
Keywords: leave-open
Comment on attachment 8739828 [details] [diff] [review]
stop bleeding

>diff --git a/accessible/base/EventTree.cpp b/accessible/base/EventTree.cpp
>--- a/accessible/base/EventTree.cpp
>+++ b/accessible/base/EventTree.cpp
>@@ -214,17 +214,17 @@ EventTree::Process()
>         // handling.
>         if (mtEvent->mAccessible->ARIARole() == roles::MENUPOPUP) {
>           nsEventShell::FireEvent(nsIAccessibleEvent::EVENT_MENUPOPUP_END,
>                                   mtEvent->mAccessible);
>         }
> 
>         AccHideEvent* hideEvent = downcast_accEvent(mtEvent);
>         if (hideEvent->NeedsShutdown()) {
>-          mContainer->Document()->ShutdownChildrenInSubtree(hideEvent->mAccessible);
>+          mtEvent->GetDocAccessible()->ShutdownChildrenInSubtree(mtEvent->mAccessible);
>         }
>       }
>     }
> 
>     // Fire reorder event at last.
>     if (mFireReorder) {
>       nsEventShell::FireEvent(nsIAccessibleEvent::EVENT_REORDER, mContainer);
>     }
Attachment #8739828 - Attachment description: patch → stop bleeding
Crash Signature: [@ PLDHashTable::Search | mozilla::a11y::DocAccessible::UnbindFromDocument] → [@ PLDHashTable::Search | mozilla::a11y::DocAccessible::UnbindFromDocument] [@ PLDHashTable::Search | nsBaseHashtable<T>::Get] [@ PLDHashTable::Remove | mozilla::a11y::DocAccessible::ShutdownChildrenInSubtree] [@ PLDHashTable::Search | nsInterfaceHasht…
I started getting this on Nightly(04/10 build) when playing youtube: dismissing advertisements, back, fw
Ubuntu 14.04
Attachment #8739828 - Flags: review?(yzenevich) → review+
Assignee: nobody → surkov.alexander
Note:

* 2016-01-13 was when this sig started (a11y active[1] for these crashes):
** PLDHashTable::Search | nsClassHashtable<T>::Get

* 2016-04-08 was when this sig started showing up:
** PLDHashTable::Search | mozilla::a11y::DocAccessible::UnbindFromDocument

Alex do you expect the current patch to fix both?

1. https://crash-stats.mozilla.com/signature/?product=Firefox&version=48.0a1&date=%3C2016-04-11T19%3A02%3A31&date=%3E%3D2016-04-04T19%3A02%3A31&signature=PLDHashTable%3A%3ASearch+%7C+nsClassHashtable%3CT%3E%3A%3AGet&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_columns=accessibility&page=1
Flags: needinfo?(surkov.alexander)
Flags: needinfo?(surkov.alexander)
Attachment #8740135 - Flags: review?(yzenevich)
Comment on attachment 8740135 [details] [diff] [review]
part1: fix event tree cutting

Review of attachment 8740135 [details] [diff] [review]:
-----------------------------------------------------------------

thanks, just one question.

::: accessible/base/EventTree.cpp
@@ +479,5 @@
>    // discard those subtree mutations as we are no longer interested in them.
> +  nsAutoPtr<EventTree>* node = &mFirst;
> +  while (*node) {
> +    if ((*node)->mContainer == aEv->mAccessible) {
> +      *node = Move((*node)->mNext);

Are we not clearing subtree mutations?
Attachment #8740135 - Flags: review?(yzenevich) → review+
(In reply to Yura Zenevich [:yzen] from comment #12)

> >    // discard those subtree mutations as we are no longer interested in them.
> > +  nsAutoPtr<EventTree>* node = &mFirst;
> > +  while (*node) {
> > +    if ((*node)->mContainer == aEv->mAccessible) {
> > +      *node = Move((*node)->mNext);
> 
> Are we not clearing subtree mutations?

we do, when we change *node, then it's previous object is destroyed, since it is nsAutoPtr, desctructor is called, which calls Clear().
Attachment #8740537 - Flags: review?(yzenevich)
(In reply to alexander :surkov from comment #13)
> (In reply to Yura Zenevich [:yzen] from comment #12)
> 
> > >    // discard those subtree mutations as we are no longer interested in them.
> > > +  nsAutoPtr<EventTree>* node = &mFirst;
> > > +  while (*node) {
> > > +    if ((*node)->mContainer == aEv->mAccessible) {
> > > +      *node = Move((*node)->mNext);
> > 
> > Are we not clearing subtree mutations?
> 
> we do, when we change *node, then it's previous object is destroyed, since
> it is nsAutoPtr, desctructor is called, which calls Clear().

gotcha
Attachment #8740671 - Flags: review?(yzenevich)
Attachment #8740673 - Flags: review?(yzenevich)
Comment on attachment 8740537 [details] [diff] [review]
part2: fix event tree building

Review of attachment 8740537 [details] [diff] [review]:
-----------------------------------------------------------------

couple of comments

::: accessible/base/EventTree.cpp
@@ +327,5 @@
>      prevNode = node;
>    } while ((node = node->mNext));
>  
>    MOZ_ASSERT(prevNode, "Nowhere to insert");
> +  MOZ_ASSERT(!prevNode->mNext, "A bed is taken");

Not sure I get this message.

@@ +332,5 @@
> +
> +  // If 'this' node contains the given container accessible, then
> +  //   do not emit a reorder event for the container
> +  //   if a dependent show event target contains the given container then do not
> +  //      emit show / hide events (to be done)

nit: indentation.
Attachment #8740537 - Flags: review?(yzenevich) → review+
Comment on attachment 8740671 [details] [diff] [review]
part3: more assertions

Review of attachment 8740671 [details] [diff] [review]:
-----------------------------------------------------------------

::: accessible/base/EventTree.cpp
@@ +230,4 @@
>  
> +  // Fire reorder event at last.
> +  if (mFireReorder) {
> +    MOZ_ASSERT(mContainer);

nit: message?
Attachment #8740671 - Flags: review?(yzenevich) → review+
Comment on attachment 8740673 [details] [diff] [review]
part4: fix event coalescense

Review of attachment 8740673 [details] [diff] [review]:
-----------------------------------------------------------------

looks good
Attachment #8740673 - Flags: review?(yzenevich) → review+
(In reply to Yura Zenevich [:yzen] from comment #20)

> >    MOZ_ASSERT(prevNode, "Nowhere to insert");
> > +  MOZ_ASSERT(!prevNode->mNext, "A bed is taken");
> 
> Not sure I get this message.

It was an allusion to Masha and the bear folk tale. I meant that mNext must be a null. Should I change it to that message or remove it or replace on something else?
(In reply to alexander :surkov from comment #23)
> (In reply to Yura Zenevich [:yzen] from comment #20)
> 
> > >    MOZ_ASSERT(prevNode, "Nowhere to insert");
> > > +  MOZ_ASSERT(!prevNode->mNext, "A bed is taken");
> > 
> > Not sure I get this message.
> 
> It was an allusion to Masha and the bear folk tale. I meant that mNext must
> be a null. Should I change it to that message or remove it or replace on
> something else?

We like contributors, even ones that don't know the Masha story.
(In reply to Yura Zenevich [:yzen] from comment #21)
> Comment on attachment 8740671 [details] [diff] [review]
> part3: more assertions
> 
> Review of attachment 8740671 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: accessible/base/EventTree.cpp
> @@ +230,4 @@
> >  
> > +  // Fire reorder event at last.
> > +  if (mFireReorder) {
> > +    MOZ_ASSERT(mContainer);
> 
> nit: message?

I'm not sure what a good message might be here. Having "container shouldn't be null" repeats the assert's expression itself, also the assertion is right before a line where we attempt to use it.
Keywords: leave-open
https://hg.mozilla.org/mozilla-central/rev/97c95a625a6f
https://hg.mozilla.org/mozilla-central/rev/6500ef918103
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Recent regression, tracking in case it reopens
You need to log in before you can comment on or make changes to this bug.