Closed Bug 1040735 Opened 10 years ago Closed 10 years ago

crash in mozilla::a11y::Accessible::BindToParent(mozilla::a11y::Accessible*, unsigned int)

Categories

(Core :: Disability Access APIs, defect)

defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla35

People

(Reporter: martijn.martijn, Assigned: surkov)

References

Details

(Keywords: crash, testcase)

Crash Data

Attachments

(2 files, 2 obsolete files)

Attached file testcase
See testcase, which crash Mozilla within 100ms. 
You need to have specialpowers extension installed to see the crash:
http://people.mozilla.org/~mwargers/extensions/specialpowers/specialpowers_20140612.xpi
Or turn on accessibility for the document yourself.


This bug was filed from the Socorro interface and is 
report bp-18c2612f-5cdb-4be7-a4a4-d9db62140718.
=============================================================
0 	XUL 	mozilla::a11y::Accessible::BindToParent(mozilla::a11y::Accessible*, unsigned int) 	obj-firefox/x86_64/dist/include/nsTArray.h
1 	XUL 	mozilla::a11y::Accessible::InsertChildAt(unsigned int, mozilla::a11y::Accessible*) 	accessible/generic/Accessible.cpp
2 	XUL 	mozilla::a11y::AccessibleWrap::InsertChildAt(unsigned int, mozilla::a11y::Accessible*) 	accessible/mac/AccessibleWrap.mm
3 	XUL 	mozilla::a11y::DocAccessible::CacheChildren() 	accessible/generic/Accessible.h
4 	XUL 	mozilla::a11y::DocAccessible::CacheChildrenInSubtree(mozilla::a11y::Accessible*, mozilla::a11y::Accessible**) 	accessible/generic/Accessible.cpp
5 	XUL 	mozilla::a11y::DocAccessible::ProcessContentInserted(mozilla::a11y::Accessible*, nsTArray<nsCOMPtr<nsIContent> > const*) 	accessible/generic/DocAccessible.cpp
6 	XUL 	mozilla::a11y::NotificationController::ContentInsertion::Process() 	accessible/base/NotificationController.cpp
7 	XUL 	mozilla::a11y::NotificationController::WillRefresh(mozilla::TimeStamp) 	accessible/base/NotificationController.cpp
8 	XUL 	nsRefreshDriver::Tick(long long, mozilla::TimeStamp) 	layout/base/nsRefreshDriver.cpp
it crashes in Accessible::BindToParent during child adoption, regardless child adoption is illegal, the crash is because mParent->RemoveChild(this) nullify mParent what makes subsequent mParent->InvalidateChildrenGroupInfo() to crash. It may be sidefixed by bug 1041070.
Depends on: 1041070
(In reply to alexander :surkov from comment #1)
> it crashes in Accessible::BindToParent during child adoption, regardless
> child adoption is illegal, the crash is because mParent->RemoveChild(this)
> nullify mParent what makes subsequent mParent->InvalidateChildrenGroupInfo()
> to crash. It may be sidefixed by bug 1041070.

maybe, but I don't see why we need that InvalidateChildrenGroupInfo call in BindToParent at all, the one in UndinbFromParent should be enough I'd think.
(In reply to Trevor Saunders (:tbsaunde) from comment #2)
> (In reply to alexander :surkov from comment #1)

> maybe, but I don't see why we need that InvalidateChildrenGroupInfo call in
> BindToParent at all, the one in UndinbFromParent should be enough I'd think.

it happens during adoption only what means UnbindFromParent wasn't triggered.

we deal with two problems here that has to be fixed: 1) crash and 2) child adoption
(In reply to alexander :surkov from comment #3)
> (In reply to Trevor Saunders (:tbsaunde) from comment #2)
> > (In reply to alexander :surkov from comment #1)
> 
> > maybe, but I don't see why we need that InvalidateChildrenGroupInfo call in
> > BindToParent at all, the one in UndinbFromParent should be enough I'd think.
> 
> it happens during adoption only what means UnbindFromParent wasn't triggered.

but then mParent wasn't set to null either.

> we deal with two problems here that has to be fixed: 1) crash and 2) child
> adoption

well, fixing the second automatically fixes the first.
(In reply to Trevor Saunders (:tbsaunde) from comment #4)

> > it happens during adoption only what means UnbindFromParent wasn't triggered.
> 
> but then mParent wasn't set to null either.

oh, right, RemoveChild should trigger it, so all we need is to remove that piece

> > we deal with two problems here that has to be fixed: 1) crash and 2) child
> > adoption
> 
> well, fixing the second automatically fixes the first.

for this case only
so the adoption problem is because DocAccessible::RemoveContent gets unexpected container element. The tree looks like

<marque>
  <div> marque impl #1
    <div> marque impl #2
      <div> explicit, hidden, moved in document #3
        <div> explicit, visible #4

when we receive notification that div #3 is removed then we get marque as container and thus our tree update logic fails as we expects div #2 as container.

presumably nsCSSFrameConstructor::ContentRemoved(nsIContent* aContainer is called for wrong container in terms of a11y needs. Perhaps we should do aChild->GetFlattenParent() instead for DocAccessible::RemoveContent call.

Boris, does it sound correct?
ContentRemoved is called with aContainer set to the parentNode in the DOM.

If you want something else, then yes, you need to do that yourself.
Ok, do you know if GetFlattenParent() returns #2 div for explicit #3 div or do I need different method? I need something to traverse up a compound document (similar to discussion from bug 1026047).
GetFlattenedParent on #3 should return #2, yes.
Attached patch patch (obsolete) — Splinter Review
fix a11y tree update
Assignee: nobody → surkov.alexander
Status: NEW → ASSIGNED
Attachment #8460864 - Flags: review?(trev.saunders)
Attachment #8460864 - Flags: review?(bzbarsky)
OS: Mac OS X → All
Comment on attachment 8460864 [details] [diff] [review]
patch

> nsAccessibilityService::ContentRemoved(nsIPresShell* aPresShell,
>-                                       nsIContent* aContainer,
>                                        nsIContent* aChild)
> {
>+  nsIContent* container = aChild->GetFlattenedTreeParent();

why not do that in DocAccessible::ContentRemoved?

r=me if you add the testcase
Attachment #8460864 - Flags: review?(trev.saunders) → review+
(In reply to Trevor Saunders (:tbsaunde) from comment #11)

> >+  nsIContent* container = aChild->GetFlattenedTreeParent();
> 
> why not do that in DocAccessible::ContentRemoved?

container is logged here

> r=me if you add the testcase

test on assertion triggering?
(In reply to alexander :surkov from comment #12)
> (In reply to Trevor Saunders (:tbsaunde) from comment #11)
> 
> > >+  nsIContent* container = aChild->GetFlattenedTreeParent();
> > 
> > why not do that in DocAccessible::ContentRemoved?
> 
> container is logged here

bleh

> > r=me if you add the testcase
> 
> test on assertion triggering?

just checking ti doesn't assert or crash seems enough
Comment on attachment 8460864 [details] [diff] [review]
patch

There's a ton of irrelevant whitespace changes here, presumably from an over-zealous editor configuration.

Please either remove those changes or do them in a separate changeset from substantive changes.
Attachment #8460864 - Flags: review?(bzbarsky) → review-
Attached patch patch2 (obsolete) — Splinter Review
Attachment #8460864 - Attachment is obsolete: true
Attachment #8461604 - Flags: review?(bzbarsky)
Comment on attachment 8461604 [details] [diff] [review]
patch2

>\ No newline at end of file

Should have one.

r=me
Attachment #8461604 - Flags: review?(bzbarsky) → review+
(In reply to Boris Zbarsky [:bz] from comment #9)
> GetFlattenedParent on #3 should return #2, yes.

eventually it doesn't, could it be because #2 has visibility: hidden; style or is it specifics of marquee implementation?
I don't know offhand.
(In reply to Boris Zbarsky [:bz] from comment #18)
> I don't know offhand.

maybe you could recommend somebody who might know, I don't have good idea where to start looking.
Maybe try wchen?
Flags: needinfo?(wchen)
GetFlattenedTreeParent() on #3 should return #2, I've verified this for a basic marquee in gdb.

If this is not the case, I suspect the binding has not been installed on the marquee element by the time you call GetFlattenedTreeParent(), in which case it will just return the same thing as GetParent(), the marquee element. You can see if this is happening by checking if GetParent()->GetXBLBinding() is nullptr and GetParent()->HasFlag(NODE_MAY_BE_IN_BINDING_MNGR) is false. I've noticed that it may take a spin of the event loop for a binding to install, possibly more if XBL resources need to be loaded.

If adoption is involved here, then it may also be something weird such as uninstalling the binding on marquee and calling GetFlattenedTreeParent() before another binding is installed.
Flags: needinfo?(wchen)
It returns explicit parent when a11y receives removal notification from nsCSSFrameConstructor::ContentRemoved (https://hg.mozilla.org/try/annotate/ba8df173201e/layout/base/nsCSSFrameConstructor.cpp#l7738). Could it be that binding gets unattached at this point?
landing the patch https://hg.mozilla.org/integration/mozilla-inbound/rev/8589d9bdd501,

leaving the bug open to figure out what's wrong with flatterParent in this case
Keywords: leave-open
Reverted for B2G gaia-ui-test crashes:
https://tbpl.mozilla.org/php/getParsedLog.php?id=46034515&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=46035590&tree=Mozilla-Inbound

Stack:
07:36:12     INFO -  TEST-START | test_launch_l10n.py TestLaunchL10n.test_launch_by_localised_name
07:36:24     INFO -  mozcrash INFO | Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/b2g/tinderbox-builds/mozilla-inbound-macosx64_gecko/1408110373/en-US/b2g-34.0a1.en-US.mac64.crashreporter-symbols.zip
07:36:40    ERROR -  PROCESS-CRASH | runner.py | application crashed [@ mozilla::a11y::DocAccessible::UpdateTree(mozilla::a11y::Accessible*, nsIContent*, bool)]
07:36:40     INFO -  Crash dump filename: /var/folders/sx/862v_8y156bgd1w91grcm9m800000w/T/tmpBx7rEB/minidumps/F6C38351-DF2A-41A5-BCFC-98E58333FF98.dmp
07:36:40     INFO -  Operating system: Mac OS X
07:36:40     INFO -                    10.8.0 12A269
07:36:40     INFO -  CPU: amd64
07:36:40     INFO -       family 6 model 42 stepping 7
07:36:40     INFO -       8 CPUs
07:36:40     INFO -  Crash reason:  EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
07:36:40     INFO -  Crash address: 0x70
07:36:40     INFO -  Thread 0 (crashed)
07:36:40     INFO -   0  XUL!mozilla::a11y::DocAccessible::UpdateTree(mozilla::a11y::Accessible*, nsIContent*, bool) [Accessible-inl.h:8589d9bdd501 : 19 + 0x5]
07:36:40     INFO -      rbx = 0x0000000000000001   r12 = 0x000000010c71d040
07:36:40     INFO -      r13 = 0x000000010c7511c0   r14 = 0x00000001276fba80
07:36:40     INFO -      r15 = 0x000000010c7511c0   rip = 0x0000000102d46174
07:36:40     INFO -      rsp = 0x00007fff5fbfb050   rbp = 0x0000000121e72ca0
07:36:40     INFO -      Found by: given as instruction pointer in context
07:36:40     INFO -   1  XUL!nsCSSFrameConstructor::ContentRemoved(nsIContent*, nsIContent*, nsIContent*, nsCSSFrameConstructor::RemoveFlags, bool*) [nsCSSFrameConstructor.cpp:8589d9bdd501 : 7790 + 0xe]
07:36:40     INFO -      rbx = 0x0000000126888bd0   r12 = 0x000000010c71d040
07:36:40     INFO -      r13 = 0x00000001266c2600   r14 = 0x0000000126888c48
07:36:40     INFO -      r15 = 0x0000000126888bd0   rip = 0x00000001029a6b3c
07:36:40     INFO -      rsp = 0x00007fff5fbfb0c0   rbp = 0x00000001266c2600
07:36:40     INFO -      Found by: call frame info
07:36:40     INFO -   2  XUL!nsCSSFrameConstructor::RecreateFramesForContent(nsIContent*, bool) [nsCSSFrameConstructor.cpp:8589d9bdd501 : 8948 + 0x16]
07:36:40     INFO -      rbx = 0x00000001266d5350   r12 = 0x0000000126888c48
07:36:40     INFO -      r13 = 0x00000001266c2600   r14 = 0x0000000000000001
07:36:40     INFO -      r15 = 0x000000010c71d040   rip = 0x00000001029a1497
07:36:40     INFO -      rsp = 0x00007fff5fbfb140   rbp = 0x00000001266d5470
07:36:40     INFO -      Found by: call frame info
07:36:40     INFO -   3  XUL!nsCSSFrameConstructor::MaybeRecreateContainerForFrameRemoval(nsIFrame*, tag_nsresult*) [nsCSSFrameConstructor.cpp:8589d9bdd501 : 8775 + 0x4]
07:36:40     INFO -      rbx = 0x00007fff5fbfb210   r12 = 0x0000000129fb7ca8
07:36:40     INFO -      r13 = 0x0000000129fb7ca8   r14 = 0x0000000129fb7ca8
07:36:40     INFO -      r15 = 0x0000000129fb7c10   rip = 0x00000001029a7722
07:36:40     INFO -      rsp = 0x00007fff5fbfb180   rbp = 0x000000010c71d040
07:36:40     INFO -      Found by: call frame info
07:36:40     INFO -   4  XUL!nsCSSFrameConstructor::ContentRemoved(nsIContent*, nsIContent*, nsIContent*, nsCSSFrameConstructor::RemoveFlags, bool*) [nsCSSFrameConstructor.cpp:8589d9bdd501 : 7737 + 0xa]
07:36:40     INFO -      rbx = 0x00000001266c2600   r12 = 0x000000010c71d040
07:36:40     INFO -      r13 = 0x00000001266c2600   r14 = 0x0000000129fb7ca8
07:36:40     INFO -      r15 = 0x0000000000000000   rip = 0x00000001029a69b5
07:36:40     INFO -      rsp = 0x00007fff5fbfb1d0   rbp = 0x00000001273adc10
07:36:40     INFO -      Found by: call frame info

remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/5de91aef0d9c
(In reply to alexander :surkov from comment #22)
> It returns explicit parent when a11y receives removal notification from
> nsCSSFrameConstructor::ContentRemoved
> (https://hg.mozilla.org/try/annotate/ba8df173201e/layout/base/
> nsCSSFrameConstructor.cpp#l7738). Could it be that binding gets unattached
> at this point?

it looks so,

flattened parent is expected right before IMPL_MUTATION_NOTIFICATION call in nsNodeUtils::ContentRemoved (http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsNodeUtils.cpp#181) and it returns explicit parent when this macros calls PresShell::ContentRemoved.

Is there a chance to make layout to not break flatten hierarchy before nsCSSFrameConstructor::ContentRemoved is done or that's something has to be handled on a11y side?
Flags: needinfo?(wchen)
What is happening here is that the binding manager has a mutation observer on the marquee and when a child is removed, the insertion parent is cleared on the child causing GetFlattenedTreeParent() to return the explicit parent (http://dxr.mozilla.org/mozilla-central/source/dom/xbl/nsBindingManager.cpp#956). Due to the order in which mutation observers are called (from descendants towards the root), PresShell::ContentRemoved is among the last to be notified.

If we want to have the flattened hierarchy to stay the same until until after nsCSSFrameConstructor::ContentRemoved, we will need some other mechanism to clear the insertion parent after all the mutation observers are called. I'm not sure if that would be more trouble than trying to handle this on the a11y side.
Flags: needinfo?(wchen)
for this particular case we care about flat hierarchy on root only so if PresShell::ContentRemoved is called before the root is unattached then it should work for us. I can think of a11y workaround but it's rather ugly and not supposed to be really performant.
Attached patch patch3Splinter Review
workaround the issue on a11y side
Attachment #8461604 - Attachment is obsolete: true
Attachment #8492273 - Flags: review?(dbolter)
Comment on attachment 8492273 [details] [diff] [review]
patch3

Review of attachment 8492273 [details] [diff] [review]:
-----------------------------------------------------------------

r=me, this looks like it does what I think you want to do. It is worrisome we have to do this but I'm ok if you are okay with it. I wonder if we'll hit cases in the wild where we feel this performance-wise.

(Aside: wchen says 'PresShell::ContentRemoved is among the last to be notified' - I wonder if we should be triggered from somewhere else in the removal code.)

::: accessible/base/nsAccessibilityService.cpp
@@ +407,5 @@
> +    // Flatten hierarchy may be broken at this point so we cannot get a true
> +    // container by traversing up the DOM tree. Find a parent of first accessible
> +    // from the subtree of the given DOM node, that'll be a container. If no
> +    // accessibles in subtree then we don't care about the change.
> +    Accessible* child = document->GetAccessible(aChildNode);

Sort of want a GetAccessibleOrFirstInTree, or GetAccessibleOrFirstChild... but I guess doing that explicitly below makes sense. I wonder how often we'll hit that code. Would be interesting to know how often child->Parent() != aChildNode->GetFlattenedTreeParent().

::: accessible/tests/mochitest/treeupdate/test_bug1040735.html
@@ +14,5 @@
> +  <script type="application/javascript">
> +    function doTest()
> +    {
> +      document.body.appendChild(document.getElementById("mw_a"));
> +      setTimeout(function() { ok(true, "no crash and assertions"); SimpleTest.finish(); } , 0);

Regarding organization of tests, should we have a crash test area/directory? It seems a bit odd to mingle this in with regular tests.

@@ +34,5 @@
> +  </pre>
> +
> +  <marquee>
> +    <div id="mw_a" style="visibility: hidden;">
> +      <div style="visibility: visible;" id="mw_inside"></div>

Curious about 'mw_a' and 'mw_inside' naming. What do they stand for?

::: accessible/tests/mochitest/treeupdate/test_optgroup.html
@@ +79,5 @@
>          this.selectNode.removeChild(this.selectNode.firstChild);
>        }
>  
>        this.eventSeq = [
> +        new invokerChecker(EVENT_REORDER, this.selectList)

Interesting.
Attachment #8492273 - Flags: review?(dbolter) → review+
Comment on attachment 8492273 [details] [diff] [review]
patch3

>+  DocAccessible* document = GetDocAccessible(aPresShell);
>+  if (document) {
>+    // Flatten hierarchy may be broken at this point so we cannot get a true
>+    // container by traversing up the DOM tree. Find a parent of first accessible
>+    // from the subtree of the given DOM node, that'll be a container. If no
>+    // accessibles in subtree then we don't care about the change.
>+    Accessible* child = document->GetAccessible(aChildNode);
>+    if (!child) {
>+      a11y::TreeWalker walker(document->GetContainerAccessible(aChildNode),
>+                              aChildNode, a11y::TreeWalker::eWalkCache);
>+      child = walker.NextChild();
>+    }
>+
>+    if (child) {
>+      document->ContentRemoved(child->Parent(), aChildNode);

that seems broken if aChildNode is inaccessible, and has more than one accessible subtree within it.  For example if the content being removed is
<div role=presentation>
<div a role=button>foobar</div>
<div b role=button>another div</div>
</div>

I guess you'll remove a but not b.
(In reply to Trevor Saunders (:tbsaunde) from comment #30)

> that seems broken if aChildNode is inaccessible, and has more than one
> accessible subtree within it.  For example if the content being removed is
> <div role=presentation>
> <div a role=button>foobar</div>
> <div b role=button>another div</div>
> </div>
> 
> I guess you'll remove a but not b.

in this case container is accessible parent of presentation div (as a parent of accessible 'a' div), so in this particular case the logic is unchanged
(In reply to alexander :surkov from comment #31)
> (In reply to Trevor Saunders (:tbsaunde) from comment #30)
> 
> > that seems broken if aChildNode is inaccessible, and has more than one
> > accessible subtree within it.  For example if the content being removed is
> > <div role=presentation>
> > <div a role=button>foobar</div>
> > <div b role=button>another div</div>
> > </div>
> > 
> > I guess you'll remove a but not b.
> 
> in this case container is accessible parent of presentation div (as a parent
> of accessible 'a' div), so in this particular case the logic is unchanged

Oh, I see this works that's... tricky
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Keywords: leave-open
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
Verified fixed, using 2014-10-02 Nightly.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: