Closed Bug 881636 Opened 6 years ago Closed 6 years ago

crash in mozilla::a11y::DocAccessible::UpdateTree

Categories

(Core :: Disability Access APIs, defect, critical)

24 Branch
x86
Windows 7
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla27
Tracking Status
firefox23 --- unaffected
firefox24 --- wontfix
firefox25 + verified
firefox26 + verified
firefox27 --- verified
firefox-esr24 --- wontfix

People

(Reporter: scoobidiver, Assigned: tbsaunde)

References

(Blocks 1 open bug)

Details

(4 keywords)

Crash Data

Attachments

(2 files, 1 obsolete file)

It first showed up in 24.0a1/20130610. The regression range is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=efbe547a7972&tochange=252b1ac4d537
It's likely a regression from bug 852150.

Signature 	mozilla::a11y::DocAccessible::UpdateTree(mozilla::a11y::Accessible*, nsIContent*, bool) More Reports Search
UUID	0681933a-7be2-459d-aaa6-ca0582130611
Date Processed	2013-06-11 00:39:05
Uptime	66
Last Crash	1.2 minutes before submission
Install Age	17.7 minutes since version was first installed.
Install Time	2013-06-11 00:21:15
Product	Firefox
Version	24.0a1
Build ID	20130610031147
Release Channel	nightly
OS	Windows NT
OS Version	6.2.9200
Build Architecture	x86
Build Architecture Info	GenuineIntel family 6 model 58 stepping 9
Crash Reason	EXCEPTION_ACCESS_VIOLATION_READ
Crash Address	0x14
App Notes 	
AdapterVendorID: 0x8086, AdapterDeviceID: 0x0166, AdapterSubsysID: 108d1043, AdapterDriverVersion: 9.17.10.2875
D2D? D2D+ DWrite? DWrite+ D3D10 Layers? D3D10 Layers+ 
Processor Notes 	sp-processor04_phx1_mozilla_com_24874:2012
EMCheckCompatibility	True
Adapter Vendor ID	0x8086
Adapter Device ID	0x0166
Total Virtual Memory	4294836224
Available Virtual Memory	3628269568
System Memory Use Percentage	52
Available Page File	2475212800
Available Physical Memory	1978818560
Accessibility	Active

Frame 	Module 	Signature 	Source
0 	xul.dll 	mozilla::a11y::DocAccessible::UpdateTree 	accessible/src/generic/DocAccessible.cpp:1760
1 	xul.dll 	mozilla::a11y::DocAccessible::ContentRemoved 	accessible/src/generic/DocAccessible.cpp:1361
2 	xul.dll 	mozilla::a11y::DocAccessible::RecreateAccessible 	accessible/src/generic/DocAccessible.cpp:1380
3 	xul.dll 	mozilla::a11y::DocAccessible::UpdateAccessibleOnAttrChange 	accessible/src/generic/DocAccessible.cpp:1645
4 	xul.dll 	mozilla::a11y::DocAccessible::AttributeChanged 	accessible/src/generic/DocAccessible.cpp:885
5 	xul.dll 	nsNodeUtils::AttributeChanged 	content/base/src/nsNodeUtils.cpp:106
6 	xul.dll 	mozilla::dom::Element::SetAttrAndNotify 	content/base/src/Element.cpp:1831
7 	xul.dll 	nsGenericHTMLElement::SetAttr 	content/html/content/src/nsGenericHTMLElement.cpp:972
8 	xul.dll 	mozilla::dom::HTMLAnchorElement::SetAttr 	content/html/content/src/HTMLAnchorElement.cpp:366
9 	xul.dll 	mozilla::dom::Element::SetAttribute 	content/base/src/Element.cpp:750
10 	xul.dll 	mozilla::dom::ElementBinding::setAttribute 	obj-firefox/dom/bindings/ElementBinding.cpp:254
11 	xul.dll 	mozilla::dom::ElementBinding::genericMethod 	obj-firefox/dom/bindings/ElementBinding.cpp:1980
12 		@0xa6df64 	

More reports at:
https://crash-stats.mozilla.com/report/list?signature=mozilla%3A%3Aa11y%3A%3ADocAccessible%3A%3AUpdateTree%28mozilla%3A%3Aa11y%3A%3AAccessible*%2C+nsIContent*%2C+bool%29
Keywords: testcase-wanted
I think what has to be happening here is that we're looking to see which children of an accessible need to be removed and one of them returns null when we call GetContent() on it, but I can't see how that can happen and ignoring accessibles in that situation seems risky to me.  If we have to I guess we can add assert and hope fuzzing catches it.
It's a bit early days but this is currently #10 in Firefox 25.0b1.
https://crash-stats.mozilla.com/topcrasher/products/Firefox/versions/25.0b1
Keywords: topcrash
(In reply to Trevor Saunders (:tbsaunde) from comment #1)
> I think what has to be happening here is that we're looking to see which
> children of an accessible need to be removed and one of them returns null
> when we call GetContent() on it, but I can't see how that can happen and
> ignoring accessibles in that situation seems risky to me.  If we have to I
> guess we can add assert and hope fuzzing catches it.

That's all we can do I think. We can remove those accessible though. Trev, do you have a patch?
(In reply to David Bolter [:davidb] from comment #4)
> Top URLs:
> 346 http://www.tmz.com/
> 136 http://www.arcor.de/
> 27  http://www.tmz.com/page/2/
> 17  http://www.tokyo-gas.co.jp/

A few of the comments also mention TMZ.com, I'll try banging on that and see if anything shakes out.
Yep thanks Anthony, you have a way of instantiating our a11y engine right?
(In reply to David Bolter [:davidb] from comment #6)
> Yep thanks Anthony, you have a way of instantiating our a11y engine right?

Umm, no, how can I do that?
Probably simplest to run Windows Narrator.
(In reply to David Bolter [:davidb] from comment #8)
> Probably simplest to run Windows Narrator.

Thanks, I was able to crash with Firefox 24.0b10 on Windows 7 32-bit on TMZ.com with Windows Narrator running. I just kept tabbing around the page while it was loading and it crashed in a minute or so. I'll see if I can find a
Keywords: reproducible
I just crashed again without Windows Narrator running -- maybe this doesn't need a11y software to trigger?
(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #10)
> I just crashed again without Windows Narrator running -- maybe this doesn't
> need a11y software to trigger?

Nevermind this, I haven't been able to reproduce the crash without Narrator running.
Last Good: Firefox 24.0a1 2013-06-09 (efbe547a7972)
First Bad: Firefox 24.0a1 2013-06-10 (252b1ac4d537)
Pushlog: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=efbe547a7972&tochange=252b1ac4d537

Looking at the above push log, bug 852150 seems like a good candidate.

Note that on Nightly 24.0a1 the signature is different, for example:
> @ xul.dll@0x7abcd8 | xul.dll@0x7abd8d | xul.dll@0x7abde7 | xul.dll@0x7abe6e | xul.dll@0x7ac1c4 | xul.dll@0x2968c7 | xul.dll@0x22d8b2 | xul.dll@0x344ef8 | xul.dll@0x2fb1e8 | xul.dll@0x22c49c | xul.dll@0x5ca172 | xul.dll@0x5cc658 | mozjs.dll@0x192dc8 ]
> https://crash-stats.mozilla.com/report/index/68765323-b871-437b-8984-125d22130919

It still only crashes with Narrator running so I think it's likely the same crash.
Flags: needinfo?(trev.saunders)
(In reply to Anthony Hughes, Mozilla QA (:ashughes) from comment #12)
> Last Good: Firefox 24.0a1 2013-06-09 (efbe547a7972)
> First Bad: Firefox 24.0a1 2013-06-10 (252b1ac4d537)
> Pushlog:
> http://hg.mozilla.org/mozilla-central/
> pushloghtml?fromchange=efbe547a7972&tochange=252b1ac4d537
> 
> Looking at the above push log, bug 852150 seems like a good candidate.

yes, see comment 0

interestingly I don't seem to be able reproduce this on linxu, guess I'll look at this more tomorrow.
Flags: needinfo?(trev.saunders)
(In reply to Trevor Saunders (:tbsaunde) from comment #13)
> interestingly I don't seem to be able reproduce this on linxu, guess I'll
> look at this more tomorrow.

Looking at https://crash-stats.mozilla.com/report/list?signature=mozilla%3A%3Aa11y%3A%3ADocAccessible%3A%3AUpdateTree%28mozilla%3A%3Aa11y%3A%3AAccessible*%2C+nsIContent*%2C+bool%29 all of these crashes are Windows only and mostly with Windows 7 32-bit which is where I reproduced this and narrowed down the range. I hope that helps.
Attached the simple testcase.
Mmm, I could not specify the content type of attachment file to text/html...
Attachment #807570 - Attachment mime type: text/plain → text/html
Trevor please let us know if you can recreate this on windows.
Assignee: nobody → trev.saunders
(In reply to David Bolter [:davidb] from comment #18)
> Trevor please let us know if you can recreate this on windows.

actually, I don't need to, the test case is enough.

What happens here is href on the inaccessible <a> element changes, and we get notified of that.
We may need to recreate or create an accessible, so we call RecreateAccessible which calls ContentRemoved()
That calls GetAccessibleOrContainer() on  the <a> element find the nearest accessible is the HTMLWinObjectOwner thing.
So we end up at DocAccessible.cpp:1782
Then the HTMLWinObjectOwner thing has one child the ObjectAccessible which has null mContent.
We see if we need to remove that child for this update and so get null nsINode for the ObjectAccessible, and that's what we crash trying to deref.

I guess the easiest / safest fix is to special case ObjectAccessible / the atk equivelent thing I think at DocAccessible.cpp:1785 or so, but nicer ideas are welcome.
it seems we should have HasOwnContent() check there
(In reply to alexander :surkov from comment #20)
> it seems we should have HasOwnContent() check there

I'm sort of worried about things that have null mContent, but shouldn't but that's probably silly.
HasOwnContent() is probably not right since it should break HTML selects which has artificial listbox control accessible. So mContent check is what we need. That should be ok, hopefully we don't run into null mContent when it shouldn't be null.
(In reply to alexander :surkov from comment #22)
> HasOwnContent() is probably not right since it should break HTML selects
> which has artificial listbox control accessible. So mContent check is what
> we need. That should be ok, hopefully we don't run into null mContent when
> it shouldn't be null.

I think HTMLSelect might actually be ok, HTMLSelectList is artificial, but options / optgroup are not so updating sub tree under HTMLSelectListAccessible should work and afaik  HTMLSelectAccessible will always be the child we get if modification is outside of select.  HasOwnContent() feels a little more correct, but just null check should be slightly faster, and a bit safer so I guess we can do that.
ok, we dont' have a case when shared node accessible contains inaccessible aChild having accessible children, right? No way to put an inaccessible node between options and select?
(In reply to alexander :surkov from comment #24)
> ok, we dont' have a case when shared node accessible contains inaccessible
> aChild having accessible children, right? No way to put an inaccessible node
> between options and select?

HTMLComboboxAccessible will only cache the HTMLSelectListAccessible as a child so I think so.
As this is understood now, how usable is Firefox 24 with a11y solutions without this fix? If it's a huge problem, we may need to ponder if it qualifies for shipping a point-release to fix it.
Comment on attachment 807875 [details] [diff] [review]
bug 881636 - don't try and Update accessibles without there own content in DocAccessible::UpdateTree

Review of attachment 807875 [details] [diff] [review]:
-----------------------------------------------------------------

::: accessible/src/generic/DocAccessible.cpp
@@ +1783,5 @@
>        for (uint32_t idx = 0; idx < aContainer->ContentChildCount();) {
>          Accessible* child = aContainer->ContentChildAt(idx);
> +
> +        // If accessible doesn't have its own content then we assume parent
> +        // will handle its update.

I wish we had an assertion (no ideas which sorry). I afraid either crazy HTML or crazy XHTML break this rule. Taking into account importance of the bug r=me

::: accessible/tests/mochitest/elm/test_plugin.html
@@ +41,5 @@
> +                         { EMBEDDED_OBJECT: [ { NOTHING: [] } ] });
> +
> +      getNode("fallback1").setAttribute("href", "5");
> +      getNode("fallback2").setAttribute("href", "5");
> +      SimpleTest.executeSoon(function () { SimpleTest.finish(); });

it'd be great to have a comment what you test here, on a side sight it can look as piece of crazy code (setting DOM attributes and finish the test)
Attachment #807875 - Flags: review?(surkov.alexander) → review+
(In reply to alexander :surkov from comment #28)
> Comment on attachment 807875 [details] [diff] [review]
> bug 881636 - don't try and Update accessibles without there own content in
> DocAccessible::UpdateTree
> 
> Review of attachment 807875 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: accessible/src/generic/DocAccessible.cpp
> @@ +1783,5 @@
> >        for (uint32_t idx = 0; idx < aContainer->ContentChildCount();) {
> >          Accessible* child = aContainer->ContentChildAt(idx);
> > +
> > +        // If accessible doesn't have its own content then we assume parent
> > +        // will handle its update.
> 
> I wish we had an assertion (no ideas which sorry). I afraid either crazy
> HTML or crazy XHTML break this rule. Taking into account importance of the
> bug r=me

what would you want to assert?  Can you think of any alternatives?

> ::: accessible/tests/mochitest/elm/test_plugin.html
> @@ +41,5 @@
> > +                         { EMBEDDED_OBJECT: [ { NOTHING: [] } ] });
> > +
> > +      getNode("fallback1").setAttribute("href", "5");
> > +      getNode("fallback2").setAttribute("href", "5");
> > +      SimpleTest.executeSoon(function () { SimpleTest.finish(); });
> 
> it'd be great to have a comment what you test here, on a side sight it can
> look as piece of crazy code (setting DOM attributes and finish the test)

ok
(In reply to Trevor Saunders (:tbsaunde) from comment #29)
> > > +        // If accessible doesn't have its own content then we assume parent
> > > +        // will handle its update.
> > 
> > I wish we had an assertion (no ideas which sorry). I afraid either crazy
> > HTML or crazy XHTML break this rule. Taking into account importance of the
> > bug r=me
> 
> what would you want to assert?  Can you think of any alternatives?

a case when removed inaccessible element contain accessibles that are children of !HasOwnContent() accessible
(In reply to alexander :surkov from comment #30)
> (In reply to Trevor Saunders (:tbsaunde) from comment #29)
> > > > +        // If accessible doesn't have its own content then we assume parent
> > > > +        // will handle its update.
> > > 
> > > I wish we had an assertion (no ideas which sorry). I afraid either crazy
> > > HTML or crazy XHTML break this rule. Taking into account importance of the
> > > bug r=me
> > 
> > what would you want to assert?  Can you think of any alternatives?
> 
> a case when removed inaccessible element contain accessibles that are
> children of !HasOwnContent() accessible

Well, I'd think we could assert that if we wanted to, it would just be slow, well actually maybe not, I'm not sure there's ever a big sub tree under one of those elements.
(In reply to Trevor Saunders (:tbsaunde) from comment #31)
> (In reply to alexander :surkov from comment #30)
> > (In reply to Trevor Saunders (:tbsaunde) from comment #29)
> > > > > +        // If accessible doesn't have its own content then we assume parent
> > > > > +        // will handle its update.
> > > > 
> > > > I wish we had an assertion (no ideas which sorry). I afraid either crazy
> > > > HTML or crazy XHTML break this rule. Taking into account importance of the
> > > > bug r=me
> > > 
> > > what would you want to assert?  Can you think of any alternatives?
> > 
> > a case when removed inaccessible element contain accessibles that are
> > children of !HasOwnContent() accessible
> 
> Well, I'd think we could assert that if we wanted to, it would just be slow,
> well actually maybe not, I'm not sure there's ever a big sub tree under one
> of those elements.

getting some extra slow under debug is probably fine, I just worry we can hit this case and tree will fail to update, it'd be obviously edge case hard to catch in wild life
I pushed this so we can get it tested, we can sort out asserts later.
https://hg.mozilla.org/integration/mozilla-inbound/rev/1d1bf837e723
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #27)
> As this is understood now, how usable is Firefox 24 with a11y solutions
> without this fix? If it's a huge problem, we may need to ponder if it
> qualifies for shipping a point-release to fix it.

FWIW, this signature is currently #13 in Firefox 24.0 crashes (0.59%) and has dropped 7 positions in the last week.
https://crash-stats.mozilla.com/topcrasher/products/Firefox/versions/24.0?days=7
https://hg.mozilla.org/mozilla-central/rev/1d1bf837e723
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27
Blocks: 888531
I've been unable to trigger this crash with Firefox 27.0a2 2013-09-25 using my original steps to reproduce. However we should give this at least a few days of bake time so we can collect crash-stats before evaluating for uplift.
Status: RESOLVED → VERIFIED
There are no crashes for this on Firefox 27 crashstats anymore. Trevor, feel free to nominate this for uplift.
Trevor, do you plan on nominating this for uplift or will this ride the trains with Firefox 27?
Flags: needinfo?(trev.saunders)
I don't think its too scary lets try and back port it
Flags: needinfo?(trev.saunders)
Comment on attachment 807875 [details] [diff] [review]
bug 881636 - don't try and Update accessibles without there own content in DocAccessible::UpdateTree

[Approval Request Comment]
Bug caused by (feature/regressing bug #):  852150
User impact if declined:  crashes on sites that use plugins
Testing completed (on m-c, etc.):  its been on m-c for a while
Risk to taking this patch (and alternatives if risky):  low, its possible that some accessible content wouldn't get updated, but that  this almost only changes what happens when we would have crashed, and its been tested for a while on m-c and no regressions have been found.
String or IDL/UUID changes made by this patch: none
Attachment #807875 - Flags: approval-mozilla-beta?
Attachment #807875 - Flags: approval-mozilla-aurora?
(In reply to Trevor Saunders (:tbsaunde) from comment #40)
> Testing completed (on m-c, etc.):  its been on m-c for a while

Note to Release Management, QA has also tested and verified this is fixed on m-c.
Comment on attachment 807875 [details] [diff] [review]
bug 881636 - don't try and Update accessibles without there own content in DocAccessible::UpdateTree

Swapping a crash for something better in support of a topcrash is the right  thing to do.
Attachment #807875 - Flags: approval-mozilla-beta?
Attachment #807875 - Flags: approval-mozilla-beta+
Attachment #807875 - Flags: approval-mozilla-aurora?
Attachment #807875 - Flags: approval-mozilla-aurora+
(even though it's the second to last beta today)
Keywords: verifyme
The signature here is currently non-existent on Nightly, #72 on Aurora, and non-existent on 25.0b9. Calling this verified fixed based on this data.
You need to log in before you can comment on or make changes to this bug.