Crash in mozilla::a11y::DocAccessible::GetAccessible

RESOLVED FIXED in Firefox -esr52

Status

()

P1
critical
RESOLVED FIXED
a year ago
a year ago

People

(Reporter: calixte, Assigned: eeejay)

Tracking

({crash})

56 Branch
mozilla56
x86
Windows 7
crash
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox-esr52 fixed, firefox54 wontfix, firefox55 fixed, firefox56 verified)

Details

(crash signature, URL)

Attachments

(1 attachment)

(Reporter)

Description

a year ago
This bug was filed from the Socorro interface and is 
report bp-b2b3d531-86ea-41de-b214-4465b0170628.
=============================================================

There are 400 crashes in beta 55 and is #34 in top crashers.
:surkov, could you investigate please ?
Flags: needinfo?(surkov.alexander)
possibly Document() returns null in [1], but it's not clear how it may happen, aContainer [2], we grab a document from, is in document.

we might want to add diagnostic assertions since the crash is visible on beta.

Eitan, thoughts?

[1] https://hg.mozilla.org/releases/mozilla-beta/annotate/af94af58d1f3/accessible/generic/DocAccessible.cpp#l1802
[2] https://hg.mozilla.org/releases/mozilla-beta/annotate/af94af58d1f3/accessible/generic/DocAccessible.cpp#l1851
Flags: needinfo?(surkov.alexander) → needinfo?(eitan)

Updated

a year ago
See Also: → bug 1376754

Comment 2

a year ago
Due to my testing for bug 1376754 I came across this one. I'm unable to reproduce this bug with 54. So I've narrowed down the regression range with MozRegression.

Steps to reproduce:

1. visit http://www.voetbalzone.nl/social_item.asp?uid=9412
2. accept cookies
3. wait until the page is fully loaded (if no crash occurs, then hard reload the page until a crash happens. Less than five hard reloads should be sufficient.)

Actual result: browser crashes

Expected result: no crash

According to MozRegression this is caused by Bug 1342433 - onclick changes shouldn't recreate the tree, r=yzen

It's probably not e10s related. See https://bugzilla.mozilla.org/show_bug.cgi?id=1376754#c3 and further

I won't change the tracking flag of firefox54 for now.
Blocks: 1342433
Has Regression Range: --- → yes
Has STR: --- → yes
(In reply to Arjen de Jager from comment #2)

> Steps to reproduce:
> 
> 1. visit http://www.voetbalzone.nl/social_item.asp?uid=9412
> 2. accept cookies
> 3. wait until the page is fully loaded (if no crash occurs, then hard reload
> the page until a crash happens. Less than five hard reloads should be
> sufficient.)
> 
> Actual result: browser crashes

I've got two crash signatures so far [1] (no bug) and bug 1380199 ('Event coalescence killed the accessible' assertion), and no crash signatures for this bug.

[1] https://crash-stats.mozilla.com/report/index/f4a82bbf-58d4-4dba-b03d-c3f050170717
(Assignee)

Comment 4

a year ago
I managed to capture this in rr, I'll work on it tomorrow.
(Assignee)

Comment 5

a year ago
I suspect this is a regression from bug 1363027.

In short, we shut down the accessible without uncaching it.

What happens is:
1. A parent is moved, so a hide event is queued.
2. Then a child is removed, DocAccessible::ContentRemoved is called
  a. Its hide event is coalesced with its parent's event.
  b. DropMutationEvent shuts down the child (Accessible::Shutdown), it goes to defunct state with no doc.
  c. ContentRemoved returns early and does not uncache the child and its subtree.
3. Child node is re-inserted
  a. GetAccessible gets a defunct accessible with a null doc.
  b. crash
Assignee: nobody → eitan
Flags: needinfo?(eitan)
(In reply to Eitan Isaacson [:eeejay] from comment #5)
> I suspect this is a regression from bug 1363027.
> 
> In short, we shut down the accessible without uncaching it.
> 
> What happens is:
> 1. A parent is moved, so a hide event is queued.
> 2. Then a child is removed, DocAccessible::ContentRemoved is called
>   a. Its hide event is coalesced with its parent's event.
>   b. DropMutationEvent shuts down the child (Accessible::Shutdown), it goes
> to defunct state with no doc.
>   c. ContentRemoved returns early and does not uncache the child and its
> subtree.

ShutdownChildrenInSubtree is supposed to uncache it anyway. It's not called by DropMutationEvent?
(Assignee)

Comment 7

a year ago
I first thought that this could be remedied by checking if given parent is a show event[1] target before checking for it being a hide event[2] target. That would have indicated a move.

The problem is, that at least in this case, the parent's show event is dropped in favor of a coalesced grandparent show event, so swapping those blocks wouldn't always help.

1. https://dxr.mozilla.org/mozilla-central/rev/1b065ffd8a535a0ad4c39a912af18e948e6a42c1/accessible/base/NotificationController.cpp#391
2. https://dxr.mozilla.org/mozilla-central/rev/1b065ffd8a535a0ad4c39a912af18e948e6a42c1/accessible/base/NotificationController.cpp#387
(Assignee)

Comment 8

a year ago
(In reply to alexander :surkov from comment #6)
> (In reply to Eitan Isaacson [:eeejay] from comment #5)
> > I suspect this is a regression from bug 1363027.
> > 
> > In short, we shut down the accessible without uncaching it.
> > 
> > What happens is:
> > 1. A parent is moved, so a hide event is queued.
> > 2. Then a child is removed, DocAccessible::ContentRemoved is called
> >   a. Its hide event is coalesced with its parent's event.
> >   b. DropMutationEvent shuts down the child (Accessible::Shutdown), it goes
> > to defunct state with no doc.
> >   c. ContentRemoved returns early and does not uncache the child and its
> > subtree.
> 
> ShutdownChildrenInSubtree is supposed to uncache it anyway. It's not called
> by DropMutationEvent?

Yeah, you are right. The accessible isn't cached anymore, but eIsNotInDocument state is not set.

eIsNotInDocument is set in UncacheChildrenInSubtreem but not in UnbindFromDocument, this seems like a bug.

So, this crash happens both because we wrongly(?) coalesce the hide event, and subsequently, we remove the accessible from the doc with the buggy UnbindFromDocument, instead of the usual path of UncacheChildrenInSubtreem.
(In reply to Eitan Isaacson [:eeejay] from comment #8)

> eIsNotInDocument is set in UncacheChildrenInSubtreem but not in
> UnbindFromDocument, this seems like a bug.
> 
> So, this crash happens both because we wrongly(?) coalesce the hide event,
> and subsequently, we remove the accessible from the doc with the buggy
> UnbindFromDocument, instead of the usual path of UncacheChildrenInSubtreem.

Correct. There's indeed something wrong with the coalescence system, and it makes sense to update eIsNotInDocument flag in this case. The latter one seems to be a sufficient fix for this bug as well.
(Assignee)

Comment 10

a year ago
Created attachment 8888495 [details] [diff] [review]
Set eIsNotInDocument to accessible's state when unbinding from doc. r?surkov

I spent forever trying to reproduce this in a clean test with no luck. Figure my time is better spent elsewhere, and the fix looks straightforward.
Attachment #8888495 - Flags: review?(surkov.alexander)
Comment on attachment 8888495 [details] [diff] [review]
Set eIsNotInDocument to accessible's state when unbinding from doc. r?surkov

Review of attachment 8888495 [details] [diff] [review]:
-----------------------------------------------------------------

On a second glance it shouldn't fix the crash (this is a correct thing though I believe), because DocAccessible::GetAccessible doesn't rely on this flag.

I'm still not sure I have of parts of the puzzle. Is the problem here is we fail to remove the accessible from mNodeToAccessibleMap for some reason? If so, then do you know how it may happen?

::: accessible/generic/DocAccessible.cpp
@@ +1337,5 @@
>    if (aAccessible->IsNodeMapEntry() &&
>        mNodeToAccessibleMap.Get(aAccessible->GetNode()) == aAccessible)
>      mNodeToAccessibleMap.Remove(aAccessible->GetNode());
>  
> +  aAccessible->mStateFlags |= eIsNotInDocument;

Accessible::Shutdown probably would be a more appropriate place
Comment on attachment 8888495 [details] [diff] [review]
Set eIsNotInDocument to accessible's state when unbinding from doc. r?surkov

cancelling review (bumps in email queue), comment #11 needs to be addressed
Flags: needinfo?(eitan)
Attachment #8888495 - Flags: review?(surkov.alexander)
Priority: -- → P1
(Assignee)

Comment 13

a year ago
(In reply to alexander :surkov from comment #11)
> Comment on attachment 8888495 [details] [diff] [review]
> Set eIsNotInDocument to accessible's state when unbinding from doc. r?surkov
> 
> Review of attachment 8888495 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> On a second glance it shouldn't fix the crash (this is a correct thing
> though I believe), because DocAccessible::GetAccessible doesn't rely on this
> flag.
> 
> I'm still not sure I have of parts of the puzzle. Is the problem here is we
> fail to remove the accessible from mNodeToAccessibleMap for some reason? If
> so, then do you know how it may happen?
> 
> ::: accessible/generic/DocAccessible.cpp
> @@ +1337,5 @@
> >    if (aAccessible->IsNodeMapEntry() &&
> >        mNodeToAccessibleMap.Get(aAccessible->GetNode()) == aAccessible)
> >      mNodeToAccessibleMap.Remove(aAccessible->GetNode());
> >  
> > +  aAccessible->mStateFlags |= eIsNotInDocument;
> 
> Accessible::Shutdown probably would be a more appropriate place

ProcessContentInserted does. That is what catches us from processing a container that has been removed in the meantime.

The stack trace is misleading. The crash occurs because the document is null. Because the container returns a null document in ProcessContentInserted. Here stack trace with argument values:

#0  0x00007fffe6d4ae78 in mozilla::a11y::DocAccessible::GetAccessible(nsINode*) const (this=0x0, aNode=0x7fffcc5c0b00) at /home/eitan/Mozilla/gecko/accessible/generic/DocAccessible.h:237
#1  0x00007fffe6d8dbc0 in mozilla::a11y::DocAccessible::GetAccessibleOrContainer(nsINode*) const (this=0x0, aNode=0x7fffcc5c0b00) at /home/eitan/Mozilla/gecko/accessible/generic/DocAccessible.cpp:1253
#2  0x00007fffe6d6216d in mozilla::a11y::DocAccessible::AccessibleOrTrueContainer(nsINode*) const (this=0x0, aNode=0x7fffcc5c0b00) at /home/eitan/Mozilla/gecko/accessible/generic/DocAccessible-inl.h:30
#3  0x00007fffe6d9d8e2 in InsertIterator::Next() (this=0x7fffffffa340)
    at /home/eitan/Mozilla/gecko/accessible/generic/DocAccessible.cpp:1812
#4  0x00007fffe6d9db6b in mozilla::a11y::DocAccessible::ProcessContentInserted(mozilla::a11y::Accessible*, nsTArray<nsCOMPtr<nsIContent> > const*) (this=0x7fffa1708a60, aContainer=0x7fffac9665c0, aNodes=0x7fffccc4ce70) at /home/eitan/Mozilla/gecko/accessible/generic/DocAccessible.cpp:1871
#5  0x00007fffe6d5bbba in mozilla::a11y::NotificationController::WillRefresh(mozilla::TimeStamp) (this=0x7fffb6802040, aTime=...) at /home/eitan/Mozilla/gecko/accessible/base/NotificationController.cpp:727
#6  0x00007fffe4ddbb1d in nsRefreshDriver::Tick(long, mozilla::TimeStamp) (this=0x7fffacf0c400, aNowEpoch=1501005526235087, aNowTime=...) at /home/eitan/Mozilla/gecko/layout/base/nsRefreshDriver.cpp:1861
Flags: needinfo?(eitan)
(Assignee)

Updated

a year ago
Attachment #8888495 - Flags: review?(surkov.alexander)
(Assignee)

Updated

a year ago
Duplicate of this bug: 1380172
(Assignee)

Updated

a year ago
See Also: → bug 1380153
Comment on attachment 8888495 [details] [diff] [review]
Set eIsNotInDocument to accessible's state when unbinding from doc. r?surkov

Review of attachment 8888495 [details] [diff] [review]:
-----------------------------------------------------------------

r=me, thanks!

what do you think about moving this flag unsetting under Accessible::Shutdown() btw?
Attachment #8888495 - Flags: review?(surkov.alexander) → review+
(Assignee)

Comment 16

a year ago
(In reply to alexander :surkov from comment #15)
> Comment on attachment 8888495 [details] [diff] [review]
> Set eIsNotInDocument to accessible's state when unbinding from doc. r?surkov
> 
> Review of attachment 8888495 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r=me, thanks!
> 
> what do you think about moving this flag unsetting under
> Accessible::Shutdown() btw?

I don't have my head completely around all the shutdown stages, why would you want to unbind the top node and not uncache all the children? When do you do the former without calling shutdown?

If I were to change this and make it a bit more foolproof, I would do:

bool IsInDocument() const { return mDoc != nullptr; }
(In reply to Eitan Isaacson [:eeejay] from comment #16)

> > what do you think about moving this flag unsetting under
> > Accessible::Shutdown() btw?
> 
> I don't have my head completely around all the shutdown stages, why would
> you want to unbind the top node and not uncache all the children? When do
> you do the former without calling shutdown?

Not sure I follow the questions. My suggestion was to move eIsInDocument flag unsetting from DocAccessible::ShutdownChildrenInSubtree into Accessible::Shutdown. ShutdownChildrenInSubtree calls Shutdown for each accessible it traverses, so it's basically same, expect it feels a bit weird that Shutdown doesn't make sure that all properties are unset properly.

> If I were to change this and make it a bit more foolproof, I would do:
> 
> bool IsInDocument() const { return mDoc != nullptr; }

I think we don't unset mDoc during unbinding stage and use a flag for that, because we need mDoc later. So this approach shouldn't work.
(Assignee)

Updated

a year ago
Blocks: 1380153
See Also: bug 1380153
(Assignee)

Comment 18

a year ago
Since this is a crasher, and not a big change I will nominate this for beta once this settles in on central.
Keywords: checkin-needed

Comment 19

a year ago
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/63f14ebd2473
Set eIsNotInDocument to accessible's state when unbinding from doc. r=surkov
Keywords: checkin-needed

Comment 20

a year ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/63f14ebd2473
Status: NEW → RESOLVED
Last Resolved: a year ago
status-firefox56: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
FWIW, we built our final 55 beta today. The next build will be RC1 on Monday. Are you still aiming to get this uplifted into 55? If so, please nominate it ASAP :)
status-firefox54: affected → wontfix
Flags: needinfo?(eitan)
(Assignee)

Comment 22

a year ago
Comment on attachment 8888495 [details] [diff] [review]
Set eIsNotInDocument to accessible's state when unbinding from doc. r?surkov

Approval Request Comment
[Feature/Bug causing the regression]: 
[User impact if declined]: Crashes on certain content when accessibility is enabled.
[Is this code covered by automated tests?]: Not yet, I'm working on simplifying a case where this can happen
[Has the fix been verified in Nightly?]: Yes.
[Needs manual test from QE? If yes, steps to reproduce]: Same steps as above, with accessibility enabled:
1. visit http://www.voetbalzone.nl/social_item.asp?uid=9412
2. accept cookies
3. wait until the page is fully loaded (if no crash occurs, then hard reload the page until a crash happens. Less than five hard reloads should be sufficient.)

[List of other uplifts needed for the feature/fix]: None
[Is the change risky?]: I don't think so.
[Why is the change risky/not risky?]: It is one line that was overlooked before.
[String changes made/needed]: None.
Flags: needinfo?(eitan)
Attachment #8888495 - Flags: approval-mozilla-beta?

Comment 23

a year ago
Verified this bug is fixed on 56.0a1 20170729100254 with e10s turned off.
status-firefox56: fixed → verified

Comment 24

a year ago
(In reply to Eitan Isaacson [:eeejay] from comment #22)
> Comment on attachment 8888495 [details] [diff] [review]
> Set eIsNotInDocument to accessible's state when unbinding from doc. r?surkov
> 
> Approval Request Comment
> [Feature/Bug causing the regression]: 
> [User impact if declined]: Crashes on certain content when accessibility is
> enabled.
> [Is this code covered by automated tests?]: Not yet, I'm working on
> simplifying a case where this can happen
> [Has the fix been verified in Nightly?]: Yes.
> [Needs manual test from QE? If yes, steps to reproduce]: Same steps as
> above, with accessibility enabled:
> 1. visit http://www.voetbalzone.nl/social_item.asp?uid=9412
> 2. accept cookies
> 3. wait until the page is fully loaded (if no crash occurs, then hard reload
> the page until a crash happens. Less than five hard reloads should be
> sufficient.)
> 
> [List of other uplifts needed for the feature/fix]: None
> [Is the change risky?]: I don't think so.
> [Why is the change risky/not risky?]: It is one line that was overlooked
> before.
> [String changes made/needed]: None.

It turned out the URL was a temporary one. Now it forwards to a permanent URL, which also did crash. Therefore changing the URL accordingly to http://www.voetbalzone.nl/social.asp.

Updated

a year ago
See Also: bug 1376754
Comment on attachment 8888495 [details] [diff] [review]
Set eIsNotInDocument to accessible's state when unbinding from doc. r?surkov

a11y crash fix for 55.0 rc1
Attachment #8888495 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Comment 26

a year ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-beta/rev/c2b9b5bec46f
status-firefox55: affected → fixed
the patch is landed, but it'd be curious to get comment #17 commented anyways (Eitan, that one might slept off the radar).
(In reply to alexander :surkov from comment #27)
> the patch is landed, but it'd be curious to get comment #17 commented
> anyways (Eitan, that one might slept off the radar).

In a new bug, of course :)
crash-stats is showing this signature happening a lot on ESR52 still. Please nominate the patch for approval when you get a chance.
Flags: needinfo?(eitan)
(Assignee)

Comment 30

a year ago
Comment on attachment 8888495 [details] [diff] [review]
Set eIsNotInDocument to accessible's state when unbinding from doc. r?surkov

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined: Crashes
Fix Landed on Version: 55
Risk to taking this patch (and alternatives if risky): This should not be risky. It has been in central and beta for a while and has not introduced regressions.
String or UUID changes made by this patch: 

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.
Flags: needinfo?(eitan)
Attachment #8888495 - Flags: approval-mozilla-esr52?
Comment on attachment 8888495 [details] [diff] [review]
Set eIsNotInDocument to accessible's state when unbinding from doc. r?surkov

high volume crash fix for esr52.4
Attachment #8888495 - Flags: approval-mozilla-esr52? → approval-mozilla-esr52+

Comment 32

a year ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-esr52/rev/eeeec9cafc4e
status-firefox-esr52: affected → fixed
You need to log in before you can comment on or make changes to this bug.