Closed Bug 1206107 Opened 6 years ago Closed 6 years ago

crash beginning in nightly 0916 in mozilla::a11y::Accessible::HasGenericType(mozilla::a11y::AccGenericType)

Categories

(Core :: Disability Access APIs, defect)

41 Branch
Unspecified
Windows NT
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla44
Tracking Status
firefox42 --- unaffected
firefox43 + disabled
firefox44 + fixed

People

(Reporter: away, Assigned: surkov)

References

Details

(Keywords: crash)

Crash Data

Attachments

(2 files, 1 obsolete file)

This bug was filed from the Socorro interface and is 
report bp-ed067ea1-340b-4fc3-9fd3-709fe2150916.
=============================================================

(Continued from bug 1186536 comment #27)
> (In reply to Trevor Saunders (:tbsaunde) from comment #25)
> > (In reply to David Major [:dmajor] from comment #24)
> > > This HasGenericType signature greatly spiked in volume in the 0916 build.
> > > I'm not entirely sure if 0917 is better, or if the crashes merely haven't
> > > come in yet. Do you know what happened?
> > > 
> > > Example: bp-ed067ea1-340b-4fc3-9fd3-709fe2150916
> > 
> > no :( I wouldn't expect anything I've checked in to cause that and would be
> > somewhat suprised if they did.  My first guess would be bug 1133213 because
> > it changes the accessible tree and events, which are both involved in this
> > crash.
> 
> Bug 1133213 would fit the regression range. :surkov, any ideas?

0 	xul.dll 	mozilla::a11y::Accessible::HasGenericType(mozilla::a11y::AccGenericType) 	accessible/generic/Accessible-inl.h
1 	xul.dll 	mozilla::a11y::Accessible::HandleAccEvent(mozilla::a11y::AccEvent*) 	accessible/generic/Accessible.cpp
2 	xul.dll 	mozilla::a11y::AccessibleWrap::HandleAccEvent(mozilla::a11y::AccEvent*) 	accessible/windows/msaa/AccessibleWrap.cpp
3 	xul.dll 	nsEventShell::FireEvent(mozilla::a11y::AccEvent*) 	accessible/base/nsEventShell.cpp
4 	xul.dll 	mozilla::a11y::EventQueue::ProcessEventQueue() 	accessible/base/EventQueue.cpp
5 	xul.dll 	mozilla::a11y::NotificationController::WillRefresh(mozilla::TimeStamp) 	accessible/base/NotificationController.cpp
6 	xul.dll 	nsRefreshDriver::Tick(__int64, mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp
7 	xul.dll 	mozilla::RefreshDriverTimer::TickDriver(nsRefreshDriver*, __int64, mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp
8 	xul.dll 	mozilla::InactiveRefreshDriverTimer::TickOne() 	layout/base/nsRefreshDriver.cpp
9 	xul.dll 	nsTimerImpl::Fire() 	xpcom/threads/nsTimerImpl.cpp
Is bug 1205476 related to this? Seems like both signatures were rising at the same time.
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #1)
> Is bug 1205476 related to this? Seems like both signatures were rising at
> the same time.

it might be, however we crashed before we sent any events.

Trevor, where HasGenericType comes from in HandleAccEvent? I dont' see it around.
[Tracking Requested - why for this release]:
The crashes in this bug and bug 1205476 together are more than all other crashes for Nightly 43 together in recent days. We cannot ship Dev Edition updates with this kind of crash issue.
Attached patch patch (obsolete) — Splinter Review
Attachment #8663712 - Flags: review?(tbsaunde+mozbugs)
Comment on attachment 8663712 [details] [diff] [review]
patch

>diff --git a/accessible/base/EventQueue.cpp b/accessible/base/EventQueue.cpp
>--- a/accessible/base/EventQueue.cpp
>+++ b/accessible/base/EventQueue.cpp
>@@ -101,25 +101,34 @@ EventQueue::CoalesceEvents()
>         AccEvent* thisEvent = mEvents[index];
>         if (thisEvent->mEventRule != tailEvent->mEventRule)
>           continue;
> 
>         // We don't currently coalesce text change events from show/hide events.
>         if (thisEvent->mEventType != tailEvent->mEventType)
>           continue;
> 
>+        AccMutationEvent* tailMutEvent = downcast_accEvent(tailEvent);
>+        AccMutationEvent* thisMutEvent = downcast_accEvent(thisEvent);
>+
>         // Show events may be duped because of reinsertion (removal is ignored
>         // because initial insertion is not processed). Ignore initial
>         // insertion.

its preexisting, but if that's all this is for then  there should really be an assert they are both show events.

>-        if (thisEvent->mAccessible == tailEvent->mAccessible)
>+        if (thisEvent->mAccessible == tailEvent->mAccessible) {
>+          // If the accessible was reinserted then ignore the former event.

that doesn't really make sense or add to understanding of what is going on.  what is the "former event"? I would tend to think the thisEvent, not tailEvent which means the comment is just false.

>+          if (tailMutEvent->mParent != thisMutEvent->mParent) {

why should getting reinserted under a different parent be different from under the same parent?

>+            // XXX: the tail event may have a coalesced text event, which should
>+            // be unwrapped.
>+            tailEvent->mEventRule = AccEvent::eDoNotEmit;

so, if it gets reinserted under a new parent you only fire an event for getting inserted under the old parent?

>+            continue;

that should be break since continueing will just spin at the top of the loop.
(In reply to Trevor Saunders (:tbsaunde) from comment #6)

> >         // Show events may be duped because of reinsertion (removal is ignored
> >         // because initial insertion is not processed). Ignore initial
> >         // insertion.
> 
> its preexisting, but if that's all this is for then  there should really be
> an assert they are both show events.

assertion for no hide events make sense

> 
> >-        if (thisEvent->mAccessible == tailEvent->mAccessible)
> >+        if (thisEvent->mAccessible == tailEvent->mAccessible) {
> >+          // If the accessible was reinserted then ignore the former event.
> 
> that doesn't really make sense or add to understanding of what is going on. 
> what is the "former event"? I would tend to think the thisEvent, not
> tailEvent which means the comment is just false.

it's supposed to be tail event.

> 
> >+          if (tailMutEvent->mParent != thisMutEvent->mParent) {
> 
> why should getting reinserted under a different parent be different from
> under the same parent?

I didn't catch the question, pls put it into different wording

> 
> >+            // XXX: the tail event may have a coalesced text event, which should
> >+            // be unwrapped.
> >+            tailEvent->mEventRule = AccEvent::eDoNotEmit;
> 
> so, if it gets reinserted under a new parent you only fire an event for
> getting inserted under the old parent?

under the new parent

> 
> >+            continue;
> 
> that should be break since continueing will just spin at the top of the loop.

by breaking it we won't find possible other events to coalesce
> > 
> > >-        if (thisEvent->mAccessible == tailEvent->mAccessible)
> > >+        if (thisEvent->mAccessible == tailEvent->mAccessible) {
> > >+          // If the accessible was reinserted then ignore the former event.
> > 
> > that doesn't really make sense or add to understanding of what is going on. 
> > what is the "former event"? I would tend to think the thisEvent, not
> > tailEvent which means the comment is just false.
> 
> it's supposed to be tail event.

ok, so its accurate, but it doesn't help understand why it makes sense to ignore the event we just fired in favor of the previous one.

> > 
> > >+          if (tailMutEvent->mParent != thisMutEvent->mParent) {
> > 
> > why should getting reinserted under a different parent be different from
> > under the same parent?
> 
> I didn't catch the question, pls put it into different wording

why does how you coalesce two show events for the same accessible depend on if it had the same parent for the two events or not.

> > 
> > >+            // XXX: the tail event may have a coalesced text event, which should
> > >+            // be unwrapped.
> > >+            tailEvent->mEventRule = AccEvent::eDoNotEmit;
> > 
> > so, if it gets reinserted under a new parent you only fire an event for
> > getting inserted under the old parent?
> 
> under the new parent

that would make more sense, but afaict it is not what this patch does.

> > 
> > >+            continue;
> > 
> > that should be break since continueing will just spin at the top of the loop.
> 
> by breaking it we won't find possible other events to coalesce

you just set tailEvent->mEventRule to eDoNotEmit, so you'll go to the top of the loop and ignore all events that aren't also eDoNotEmit.
(In reply to Trevor Saunders (:tbsaunde) from comment #8)
> > > 
> > > >-        if (thisEvent->mAccessible == tailEvent->mAccessible)
> > > >+        if (thisEvent->mAccessible == tailEvent->mAccessible) {
> > > >+          // If the accessible was reinserted then ignore the former event.
> > > 
> > > that doesn't really make sense or add to understanding of what is going on. 
> > > what is the "former event"? I would tend to think the thisEvent, not
> > > tailEvent which means the comment is just false.
> > 
> > it's supposed to be tail event.
> 
> ok, so its accurate, but it doesn't help understand why it makes sense to
> ignore the event we just fired in favor of the previous one.

say you show an element which is referred by other aria-owns, an accessible will be created and put a show event into the queue, then we reallocate it, fire hide event under previous location and show event under new location.

> > > >+          if (tailMutEvent->mParent != thisMutEvent->mParent) {
> > > 
> > > why should getting reinserted under a different parent be different from
> > > under the same parent?
> > 
> > I didn't catch the question, pls put it into different wording
> 
> why does how you coalesce two show events for the same accessible depend on
> if it had the same parent for the two events or not.

let me know if previous comment doesn't answer it

> 
> > > 
> > > >+            // XXX: the tail event may have a coalesced text event, which should
> > > >+            // be unwrapped.
> > > >+            tailEvent->mEventRule = AccEvent::eDoNotEmit;
> > > 
> > > so, if it gets reinserted under a new parent you only fire an event for
> > > getting inserted under the old parent?
> > 
> > under the new parent
> 
> that would make more sense, but afaict it is not what this patch does.

why? tailEvent is an older event, we coalesce it, thisEvent, a newer one, is not coalesced.

> 
> > > 
> > > >+            continue;
> > > 
> > > that should be break since continueing will just spin at the top of the loop.
> > 
> > by breaking it we won't find possible other events to coalesce
> 
> you just set tailEvent->mEventRule to eDoNotEmit, so you'll go to the top of
> the loop and ignore all events that aren't also eDoNotEmit.

ok, it seems it makes sense for eDoNotEmit case.
(In reply to alexander :surkov from comment #9)
> (In reply to Trevor Saunders (:tbsaunde) from comment #8)
> > > > 
> > > > >-        if (thisEvent->mAccessible == tailEvent->mAccessible)
> > > > >+        if (thisEvent->mAccessible == tailEvent->mAccessible) {
> > > > >+          // If the accessible was reinserted then ignore the former event.
> > > > 
> > > > that doesn't really make sense or add to understanding of what is going on. 
> > > > what is the "former event"? I would tend to think the thisEvent, not
> > > > tailEvent which means the comment is just false.
> > > 
> > > it's supposed to be tail event.
> > 
> > ok, so its accurate, but it doesn't help understand why it makes sense to
> > ignore the event we just fired in favor of the previous one.
> 
> say you show an element which is referred by other aria-owns, an accessible
> will be created and put a show event into the queue, then we reallocate it,
> fire hide event under previous location and show event under new location.

then put that in a comment ;)

> 
> > > > >+          if (tailMutEvent->mParent != thisMutEvent->mParent) {
> > > > 
> > > > why should getting reinserted under a different parent be different from
> > > > under the same parent?
> > > 
> > > I didn't catch the question, pls put it into different wording
> > 
> > why does how you coalesce two show events for the same accessible depend on
> > if it had the same parent for the two events or not.
> 
> let me know if previous comment doesn't answer it

it doesn't, consider what happens if aria-owns is used to reorder the children of an accessible, then you have two show events for the same accessible with the same parent, why should that be coalesced differently?

> 
> > 
> > > > 
> > > > >+            // XXX: the tail event may have a coalesced text event, which should
> > > > >+            // be unwrapped.
> > > > >+            tailEvent->mEventRule = AccEvent::eDoNotEmit;
> > > > 
> > > > so, if it gets reinserted under a new parent you only fire an event for
> > > > getting inserted under the old parent?
> > > 
> > > under the new parent
> > 
> > that would make more sense, but afaict it is not what this patch does.
> 
> why? tailEvent is an older event, we coalesce it, thisEvent, a newer one, is
> not coalesced.

no, tailEvent is the one that was just added to the queue, thisEvent is the older one.
(In reply to Trevor Saunders (:tbsaunde) from comment #10)

> no, tailEvent is the one that was just added to the queue, thisEvent is the
> older one.

on yeah? I've got confused with naming then. There's no bug in this part of code then.

I started from assumption that parent is null in DocAccessibleChild::ShowEvent, if so any ideas how could a show event get broken?
Attached patch patch2Splinter Review
Attachment #8663712 - Attachment is obsolete: true
Attachment #8663712 - Flags: review?(tbsaunde+mozbugs)
Attachment #8664383 - Flags: review?(tbsaunde+mozbugs)
Comment on attachment 8664383 [details] [diff] [review]
patch2

I'm not sure if this will fix things, but I guess it will make event coalescing work more correctly.
Attachment #8664383 - Flags: review?(tbsaunde+mozbugs) → review+
so am I :) let's see if it helps
https://hg.mozilla.org/mozilla-central/rev/b139161f3562
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla44
I'm still seeing some crashes in 43 and 44 for this signature; around 140 crashes (per version) in the last 3 days. So maybe this made it less bad, but it's still significant.
Alexander, is that good enough to count this fixed (since it did seem to stop the drastic spike in crashes)? Or is this still a high enough crash rate that you want to work on it more in this or another bug?
Flags: needinfo?(surkov.alexander)
Attached patch part2Splinter Review
Flags: needinfo?(surkov.alexander)
Attachment #8667939 - Flags: review?(tbsaunde+mozbugs)
I just hit this crash signature (once) with 64-bit Nightly 44.0a1 (build 2015-10-04) on Windows 8.1:

bp-4f23ecf8-5db8-4592-97be-4e3e72151005
Trevor, review ping?
(In reply to David Major [:dmajor] from comment #21)
> Trevor, review ping?

yeah, was just hard to get time to review non trivial patches during work week.
Comment on attachment 8667939 [details] [diff] [review]
part2

>-
>-      MaybeNotifyOfValueChange(owner);
>-      FireDelayedEvent(reorderEvent);
>+      isReinserted = owner->AppendChild(child);

So, afaict the only way for this to end up false is for owner to be a LeafAccessible*.  Which means somebody did something like <input type="checkbox" aria-owns="foobar"> which is possible, but pretty crazy.  I guess we should actually allow that? (yuk!)
(In reply to Trevor Saunders (:tbsaunde) from comment #23)

> >+      isReinserted = owner->AppendChild(child);
> 
> So, afaict the only way for this to end up false is for owner to be a
> LeafAccessible*.  Which means somebody did something like <input
> type="checkbox" aria-owns="foobar"> which is possible, but pretty crazy.  I
> guess we should actually allow that? (yuk!)

I don't have any valid example of it, in general AT don't expect to have children on those, and so we are. It's ok to ignore such aria-owns for now.
Sorry didn't see activity here when I filed bug 1211929. The stack I provided there is:
https://crash-stats.mozilla.com/report/index/681fa501-4e64-47e6-b9fd-fc5f52151006
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(In reply to alexander :surkov from comment #24)
> (In reply to Trevor Saunders (:tbsaunde) from comment #23)
> 
> > >+      isReinserted = owner->AppendChild(child);
> > 
> > So, afaict the only way for this to end up false is for owner to be a
> > LeafAccessible*.  Which means somebody did something like <input
> > type="checkbox" aria-owns="foobar"> which is possible, but pretty crazy.  I
> > guess we should actually allow that? (yuk!)
> 
> I don't have any valid example of it, in general AT don't expect to have
> children on those, and so we are. It's ok to ignore such aria-owns for now.

I'd say the example I gave is valid per the spec, of course if your point is that its a bug but you don't care about it that's fine.
Attachment #8667939 - Flags: review?(tbsaunde+mozbugs) → review+
(In reply to Trevor Saunders (:tbsaunde) from comment #27)
> (In reply to alexander :surkov from comment #24)
> > (In reply to Trevor Saunders (:tbsaunde) from comment #23)
> > 
> > > >+      isReinserted = owner->AppendChild(child);
> > > 
> > > So, afaict the only way for this to end up false is for owner to be a
> > > LeafAccessible*.  Which means somebody did something like <input
> > > type="checkbox" aria-owns="foobar"> which is possible, but pretty crazy.  I
> > > guess we should actually allow that? (yuk!)
> > 
> > I don't have any valid example of it, in general AT don't expect to have
> > children on those, and so we are. It's ok to ignore such aria-owns for now.
> 
> I'd say the example I gave is valid per the spec, of course if your point is
> that its a bug but you don't care about it that's fine.

definitely somebody has a bug, I'm not sure yet who.
https://hg.mozilla.org/mozilla-central/rev/0e3fa7bcb288
Status: REOPENED → RESOLVED
Closed: 6 years ago6 years ago
Resolution: --- → FIXED
Trevor, want to uplift this too? Looks like 43 is affected.
Flags: needinfo?(tbsaunde+mozbugs)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #31)
> Trevor, want to uplift this too? Looks like 43 is affected.

uh, I guess its ok, their surkov's patches so I'll let him decide
Flags: needinfo?(tbsaunde+mozbugs) → needinfo?(surkov.alexander)
Comment on attachment 8664383 [details] [diff] [review]
patch2

Approval Request Comment
[Feature/regressing bug #]: bug 1133213
[User impact if declined]: crashes
[Describe test coverage new/current, TreeHerder]: not covered
[Risks and why]: moderate, tweaking not trivial code
[String/UUID change made/needed]: no
Flags: needinfo?(surkov.alexander)
Attachment #8664383 - Flags: approval-mozilla-aurora?
Comment on attachment 8667939 [details] [diff] [review]
part2

Approval Request Comment
[Feature/regressing bug #]: bug 1133213
[User impact if declined]: crashes
[Describe test coverage new/current, TreeHerder]: not covered
[Risks and why]: moderate, tweaking not trivial code
[String/UUID change made/needed]: no
Attachment #8667939 - Flags: approval-mozilla-aurora?
There were a few topcrashes that I thought might be related. But looking back on this crash signature now, I don't see it appearing in the last few days -- and not since the 2015100803 build (nightly)  Then not at all on recent aurora builds. Hmm.
This issue may have been fixed in other uplifts.  Let's wait a few days and see if there is any evidence the crash is still happening. 

surkov what do you think?  is this something we can test and reproduce? I don't want to uplift something if we don't need it, when it also sounds risky to mess with.
Flags: needinfo?(surkov.alexander)
Wasn't bug 1133213 backed out of Aurora shortly after merge day? If so, this bug simply isn't relevant to the train beyond Nightly, and will just move along upon next merge day.
(In reply to Marco Zehe (:MarcoZ) from comment #37)
> Wasn't bug 1133213 backed out of Aurora shortly after merge day? If so, this
> bug simply isn't relevant to the train beyond Nightly, and will just move
> along upon next merge day.

it says the bug was fixed on 43. I guess e10s a11y might be disabled on aurora. In this case it still makes sense to port these patches since they make the behavior correct, but they are not so urgent because there's no crash.
Flags: needinfo?(surkov.alexander)
Marco is right. The target milestone was set to 43 when the patch first landed, then it was maybe overlooked that it should be set back to 44.  Thanks for clearing that up, I completely missed it!  

Let's let this work stay on 44 as well, then.
Comment on attachment 8664383 [details] [diff] [review]
patch2

Turns out we don't need to uplift this as the crashes should go away on beta when e10s is disabled, and the regressing bug only should be affecting 44 anyway.
Attachment #8664383 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora-
Comment on attachment 8667939 [details] [diff] [review]
part2

Turns out we don't need to uplift this after all.
Attachment #8667939 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora-
Setting 43 tracking flag to disabled since e10s is disabled there except for some limited testing experiments.
Assignee: nobody → surkov.alexander
You need to log in before you can comment on or make changes to this bug.