Open Bug 1387308 Opened 3 years ago Updated 2 months ago

Tree cycle when aria-owns adopts ancestor which isn't yet created

Categories

(Core :: Disability Access APIs, defect, P3)

defect

Tracking

()

Tracking Status
firefox-esr52 --- wontfix
firefox56 --- wontfix
firefox57 --- wontfix
firefox58 --- fix-optional
firefox59 --- ?

People

(Reporter: tsmith, Assigned: eeejay)

References

(Blocks 2 open bugs)

Details

(Keywords: testcase)

Attachments

(2 files, 1 obsolete file)

Attached file test_case.html
This test case will hang the browser (or single process if e10s is enabled)

#0  0x00007fffe00071c7 in mozilla::a11y::Accessible::HasGenericType (this=<optimized out>, aType=mozilla::a11y::eDocument) at src/accessible/generic/Accessible-inl.h:87
#1  0x00007fffe003897c in mozilla::a11y::NotificationController::CoalesceMutationEvents (this=0x6120002050c0) at src/accessible/base/NotificationController.cpp:363
#2  0x00007fffe003adb8 in mozilla::a11y::NotificationController::WillRefresh (this=0x6120002050c0, aTime=...) at src/accessible/base/NotificationController.cpp:821
#3  0x00007fffdd94d08f in nsRefreshDriver::Tick (this=0x618000213880, aNowEpoch=140737488331040, aNowTime=...) at src/layout/base/nsRefreshDriver.cpp:1856
#4  0x00007fffdd95653f in mozilla::RefreshDriverTimer::TickRefreshDrivers (this=0x6080001646b0, aJsNow=1501813345275298, aNow=..., aDrivers=...) at src/layout/base/nsRefreshDriver.cpp:300
#5  0x00007fffdd95630e in mozilla::RefreshDriverTimer::Tick (this=0x6080001646a0, jsnow=1501813345275298, now=...) at src/layout/base/nsRefreshDriver.cpp:321
#6  0x00007fffdd959916 in mozilla::VsyncRefreshDriverTimer::RunRefreshDrivers (this=0x6080001646a0, aTimeStamp=...) at src/layout/base/nsRefreshDriver.cpp:763
#7  0x00007fffdd9589b7 in mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::TickRefreshDriver (this=0x6110005d5ec0, aVsyncTimestamp=...) at src/layout/base/nsRefreshDriver.cpp:676
#8  0x00007fffdd9548e8 in mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::ParentProcessVsyncNotifier::Run (this=<optimized out>) at src/layout/base/nsRefreshDriver.cpp:522
#9  0x00007fffd80fbdc1 in nsThread::ProcessNextEvent (this=<optimized out>, aMayWait=<error reading variable: access outside bounds of object referenced via synthetic pointer>, aResult=0x7fffffffb840) at src/xpcom/threads/nsThread.cpp:1446
#10 0x00007fffd8101a01 in NS_ProcessNextEvent (aThread=0x615000047700, aMayWait=<error reading variable: access outside bounds of object referenced via synthetic pointer>) at src/xpcom/threads/nsThreadUtils.cpp:480
#11 0x00007fffd8c67bf6 in mozilla::ipc::MessagePump::Run (this=0x6060000bce00, aDelegate=<optimized out>) at src/ipc/glue/MessagePump.cpp:97
#12 0x00007fffd8bb6e28 in MessageLoop::RunInternal (this=<optimized out>) at src/ipc/chromium/src/base/message_loop.cc:326
#13 0x00007fffd8bb6cba in MessageLoop::Run (this=0x614000001c40) at src/ipc/chromium/src/base/message_loop.cc:299
#14 0x00007fffdd44a71b in nsBaseAppShell::Run (this=<optimized out>) at src/widget/nsBaseAppShell.cpp:156
Depends on: 1368784
Priority: -- → P1
Assignee: nobody → eitan
I don't think this is related to event coalescing. The aria-owns attribute manages to create a cyclical relationship, and then we end up in an infinite loop in ApplyARIAState as the parent/child references are cyclical.
(In reply to Eitan Isaacson [:eeejay] from comment #1)
> The aria-owns attribute
> manages to create a cyclical relationship, and then we end up in an infinite
> loop in ApplyARIAState as the parent/child references are cyclical.

ouch, that's bad, I was sure we detect all possible loops which may be caused by aria-owns stuff
Depends on: 1388062
No longer depends on: 1368784
Attachment #8898010 - Flags: review?(surkov.alexander)
(In reply to Eitan Isaacson [:eeejay] from comment #3)
> Created attachment 8898010 [details] [diff] [review]
> Don't allow ancestors to be created as aria-owned children. r?surkov

could you please give a more detailed description what happens here?
I'll describe the attached test case:

The parent DOM node (id="a") is an inline element, so it initially doesn't get an accessible object. It's child (id="b") is a block node and gets an accessible. When |DoARIAOwnsRelocation| runs it constructs an accessible for "a" and inserts it into "b". That is how we end up with a circular reference.

The circular reference check that currently exists in |DoARIAOwnsRelocation| only works when the child accessible already exists.

My patch checks via DOM that "a" isn't the parent of "b" before constructing the accessible.
I'm curious how do we get into the stack crash? we create an accessible, put events for it into the queue, then process the queue in WillRefresh, then?

It's neat that we do not create an accessible at all, but I'm worried that if we created an accessible, then we put it as a child of aria owned unconditionally, i.e we skip a loop check for it, that we do for all other accessibles. It might be not necessary in most cases but since an accessible tree and a DOM tree may be fairly unsynchronized, then we could run into problems. I'm not sure if I have ideas on top of my head how to address it.

Having said, if no bright ideas, then I'm good to get the patch with XXX comment in it describing the issue. It definitely has to be a fix of this bug.
(In reply to alexander :surkov from comment #6)
> I'm curious how do we get into the stack crash? we create an accessible, put
> events for it into the queue, then process the queue in WillRefresh, then?
> 

I don't understand this question. When is ApplyARIAState called? I can check, but does it matter?

We already know that a child/parent cyclical relationship means we will get in trouble. This patch comes with a test that demonstrates it.

> It's neat that we do not create an accessible at all, but I'm worried that
> if we created an accessible, then we put it as a child of aria owned
> unconditionally, i.e we skip a loop check for it, that we do for all other
> accessibles. It might be not necessary in most cases but since an accessible
> tree and a DOM tree may be fairly unsynchronized, then we could run into
> problems. I'm not sure if I have ideas on top of my head how to address it.
> 

Yeah, this won't catch all cyclical issues.

For example:
<span id="a"><div id="b" aria-owns="c"></div></span>
<div id="c">

$("c").setAttribute("aria-owns", "a");



> Having said, if no bright ideas, then I'm good to get the patch with XXX
> comment in it describing the issue. It definitely has to be a fix of this
> bug.

I have a bright idea. I'll come up with a new patch. In the meantime:
https://www.youtube.com/watch?v=qu_Y1wQ923g
On a side note. I ran into the hang in ApplyARIAState. The signature of this bug makes it look like Tyson ran into it in the coalesce stage which does a lot of parent traversing too.
(In reply to Eitan Isaacson [:eeejay] from comment #7)
> (In reply to alexander :surkov from comment #6)
> > I'm curious how do we get into the stack crash? we create an accessible, put
> > events for it into the queue, then process the queue in WillRefresh, then?
> > 
> 
> I don't understand this question. When is ApplyARIAState called? I can
> check, but does it matter?

probably not, I just not sure why we crash at all. Did the event coalescence destroyed an accessible because of our crazy parent-chain relations or what?

> > Having said, if no bright ideas, then I'm good to get the patch with XXX
> > comment in it describing the issue. It definitely has to be a fix of this
> > bug.
> 
> I have a bright idea.

awesome! should I wait for another path then? :)

> I'll come up with a new patch. In the meantime:
> https://www.youtube.com/watch?v=qu_Y1wQ923g

love it! it totally describes the spirit of our code :)
(In reply to alexander :surkov from comment #9)
> (In reply to Eitan Isaacson [:eeejay] from comment #7)
> > (In reply to alexander :surkov from comment #6)
> > > I'm curious how do we get into the stack crash? we create an accessible, put
> > > events for it into the queue, then process the queue in WillRefresh, then?
> > > 
> > 
> > I don't understand this question. When is ApplyARIAState called? I can
> > check, but does it matter?
> 
> probably not, I just not sure why we crash at all. Did the event coalescence
> destroyed an accessible because of our crazy parent-chain relations or what?

There is no crash. Just an infinite loop: while (parent = child->Parent()
Eitan, the bright idea is on the way (the review requests keeps pinging me, and I'm not sure whether I should decline it)?
Flags: needinfo?(eitan)
I think maybe this is worth landing for now and opening another bug for the wider issue.
Flags: needinfo?(eitan)
Comment on attachment 8898010 [details] [diff] [review]
Don't allow ancestors to be created as aria-owned children. r?surkov

I'll followup with a patch with a comment.
Attachment #8898010 - Attachment is obsolete: true
Attachment #8898010 - Flags: review?(surkov.alexander)
(In reply to Eitan Isaacson [:eeejay] from comment #12)
> I think maybe this is worth landing for now and opening another bug for the
> wider issue.

Can you outline the idea though? I'd rather have something nicer if we can have it. I agree that's a fix, but the code looks unsafe overall.
I first thought I can use TreeWalker to see if aOwner is in childEl's subtree. But that isn't possible.

I then thought of retrieving all adopting parents and checking if any of their associated contents are descendants of childEl.

The problem with the latter approach is that it won't take into account a subtree of childEl that has been relocated (meaning, aOwner may be a child of childEl in layout, but in a11y it has been relocated elsewhere).

I think the best solution would be to modify TreeWalker so it can do that. I'm swamped now with other stuff, so I won't get to it soon.
(In reply to Eitan Isaacson [:eeejay] from comment #12)
> I think maybe this is worth landing for now and opening another bug for the
> wider issue.

It might be not difficult though, all we need is to replace DOM tree traversal on accessible tree traversal, something like

Accessible* child = aOwner;
while (child != aOwner->Document()) {
  if (child->Parent() == childContainer) {
    if (nsContentUtils::ContentIsFlattenedTreeDescendantOf(child->GetContent(), childEl)) {
      // oops, a loop
    }
  }
}

It should fix examples like:

<span id="top">
  <div aria-owns="div">
</span>
<div id="div">
  <div aria-owns="top"></div>
</div>
I think this resolves this specific case - cyclical relationships from yet-to-be-created accessibles. There are other cycle-related issues. I filed bug 1401374 and bug 1401381.
Attachment #8910014 - Flags: review?(surkov.alexander)
Comment on attachment 8910014 [details] [diff] [review]
Don't allow ancestors to be created as aria-owned children. r?surkov

Review of attachment 8910014 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for delay in review, aria-owns things require some time in a row for a review. I commented on the approach, but curious if comment #16 makes any sense to you; if I don't miss anything and if the approach works at all :), then it should be faster than the proposed one.

::: accessible/base/nsAccUtils.cpp
@@ +457,5 @@
> +
> +  if (!doc || doc != aAncestor->Document()) {
> +    // Either accessible is not bound to a doc, or it is not in the same doc as ancestor.
> +    return false;
> +  }

it's good to have a common util functions, but all those condition are presumably false in a document, so it'd be more reasoanble to have assertions for them if you want to than the real checks.

@@ +463,5 @@
> +  Accessible* accessible = aAccessible;
> +  do {
> +    if (accessible == aAncestor)
> +      return true;
> +  } while (accessible != doc && (accessible = accessible->Parent()));

I'm not sure if there are good benefits of moving few lines of code into a separate function, but if we do that, then it makes sense to make sure it's inlined

@@ +487,5 @@
> +  Accessible* accessible = aAccessible;
> +  do {
> +    if (accessible->HasOwnContent() && finder.Seek(accessible->GetContent()))
> +      return true;
> +  } while (accessible != doc && (accessible = accessible->Parent()));

that's almost o(n^2) loop? Seek -> walk from a document to a child for each child until the document
Attachment #8910014 - Flags: review?(surkov.alexander)
Hanging a NI for Eitan. If this proves to be super helpful for perf we will consider uplift.
Flags: needinfo?(eitan)
(In reply to David Bolter [:davidb] (NeedInfo me for attention) from comment #20)
> Hanging a NI for Eitan. If this proves to be super helpful for perf we will
> consider uplift.

while Eitan is on PTO, answering on his behalf, hope it's ok. This bug is not about perf at all, it's about loops, introduced by aria-owns. It'd be nice to upload a fix, if we can get in time to do that safely. We are not yet figured out the best way to fix the bug.
Flags: needinfo?(eitan)
Has Regression Range: --- → no
This issue opened a whole pandora's box of similar problems. We talked about at least 3 different ways to create cyclical parent/child relationships. It's still on my radar. I'll probably get to it next week.
Flags: needinfo?(eitan)
Whiteboard: a11y:crash-tree:aria-owns
I don't quite understand the edge cases discussed here, but I think I fixed the test case in comment 0 in bug 1485097. Certainly, it doesn't hang any more. Should we close this for now or is there unfinished business in this specific bug?
Flags: needinfo?(eitan)
I think these spinoffs relate to some of the edge cases:
+ Bug 1401381 - cyclical child/parent relationship is created when aria-owns is removed.
+ Bug 1401374 - dom-cyclical aria-owns does not resolve correctly

As for this specific bug, one edge case is when the adopted child (which is an ancestor) is not yet created. Consider this markup:

<span id="a">
    <div id="b" aria-owns="container"></div>
</span>
<div id="container"></div>

if aria-owns="a" is set on "container", a is created and the cycle check fails. I'm not in the code, but I can imagine why..
Flags: needinfo?(eitan)
I got that case from the test in my patch above.. looks like this was abandoned after the Toronto work week because I couldn't figure out how to rework our cycle detection to answer all these cases.

(In reply to Eitan Isaacson [:eeejay] from comment #26)

As for this specific bug, one edge case is when the adopted child (which is
an ancestor) is not yet created. Consider this markup:

<span id="a">
<div id="b" aria-owns="container"></div>
</span>
<div id="container"></div>

if aria-owns="a" is set on "container", a is created and the cycle check
fails. I'm not in the code, but I can imagine why..

Here's a snippet for that test case:

data:text/html,<span id="a"><div id="b" aria-owns="container"></div></span><div id="container"></div><script>setTimeout(function() { container.setAttribute("aria-owns", "a"); }, 2000);</script>

The cycle check does indeed fail, but this doesn't seem to cause a hang. Of course, once we're in this state, triggering a hang is quite possible. Even so, since it's not a current real world issue, I'm dropping the priority.

Keywords: hang
Priority: P1 → P3
Summary: Hang in [@ mozilla::a11y::NotificationController::CoalesceMutationEvents] → Tree cycle when aria-owns adopts ancestor which isn't yet created
Whiteboard: a11y:crash-tree:aria-owns
You need to log in before you can comment on or make changes to this bug.