Closed Bug 1376754 Opened 7 years ago Closed 7 years ago

Intermittent crash in IPCError-browser | PDocAccessibleParent::AddChildDoc binding to nonexistant proxy!

Categories

(Core :: Disability Access APIs, defect, P1)

55 Branch
All
Windows
defect

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox-esr52 --- wontfix
firefox55 --- wontfix
firefox56 --- fixed
firefox57 --- verified

People

(Reporter: connect, Assigned: eeejay)

References

()

Details

(Keywords: crash, nightly-community, regression, Whiteboard: aes+)

Crash Data

Attachments

(1 file, 2 obsolete files)

This bug was filed from the Socorro interface and is 
report bp-867d880c-f871-4c91-a075-93ac10170628.
=============================================================

Steps to reproduce:

1. visit http://voetbalzone.nl/
2. accept cookies
3. wait until the page is fully loaded (if no crash occurs, then browse pages until a crash happens. Less than five pages should be sufficient.)

Current result: tab crashes

Expected result: no crash

According to mozregression this is caused by Bug 1342433 - onclick changes shouldn't recreate the tree, r=yzen
Has Regression Range: --- → yes
Has STR: --- → yes
Flags: needinfo?(surkov.alexander)
Keywords: regression
OS: Windows 10 → Windows
Hardware: Unspecified → All
(In reply to Arjen de Jager from comment #0)

> According to mozregression this is caused by Bug 1342433 - onclick changes
> shouldn't recreate the tree, r=yzen

I'm surprised to see bug 1342433 is causing crash in a11y mutation event processing, since that bug makes us to fire less events.

Who's keen to take a look at the stack, it's e10s related?
Flags: needinfo?(surkov.alexander)
(In reply to alexander :surkov from comment #1)
> Who's keen to take a look at the stack, it's e10s related?

Because I don't know how to take a look at the stack, I just turned e10s off through Options. Instead of a tab crash, a browser crash occurs intermittently. The corresponding crash reports refer to bug 1376825. I'm not sure bug 1376825 excludes this bug when e10s is turned off.
After more testing regarding bug 1376825 it seems to me this bug and bug 1376825 are somewhat related to each other. The page http://www.voetbalzone.nl/social_item.asp?uid=9412 crashes the tab intermittently with e10s turned on, this bug. The same happens with e10s turned off, except the browser crashes instead, likely bug 1376825. Also, if privacy.trackingprotection.enabled is set to true, no crashes occur at all. On the referenced page a video from Twitter is embedded and appears if privacy.trackingprotection.enabled is set to false (current default).
See Also: → 1376825
It's probably not e10s related. I've narrowed down the regression range of bug 1376825 with MozRegression and it resulted in the same range as this bug.
(In reply to Arjen de Jager from comment #2)
> (In reply to alexander :surkov from comment #1)
> > Who's keen to take a look at the stack, it's e10s related?
> 
> Because I don't know how to take a look at the stack, I just turned e10s off
> through Options. Instead of a tab crash, a browser crash occurs
> intermittently. 

do you have a crash stack at hands for e10s disabled case?
(In reply to alexander :surkov from comment #5)
> (In reply to Arjen de Jager from comment #2)
> > 
> > Because I don't know how to take a look at the stack, I just turned e10s off
> > through Options. Instead of a tab crash, a browser crash occurs
> > intermittently. 
> 
> do you have a crash stack at hands for e10s disabled case?

Where could I find that?
(In reply to Arjen de Jager from comment #6)
> (In reply to alexander :surkov from comment #5)
> > (In reply to Arjen de Jager from comment #2)
> > > 
> > > Because I don't know how to take a look at the stack, I just turned e10s off
> > > through Options. Instead of a tab crash, a browser crash occurs
> > > intermittently. 
> > 
> > do you have a crash stack at hands for e10s disabled case?
> 
> Where could I find that?

it should be listed in about:crashes
(In reply to alexander :surkov from comment #7)
> (In reply to Arjen de Jager from comment #6)
> > (In reply to alexander :surkov from comment #5)
> > > 
> > > do you have a crash stack at hands for e10s disabled case?
> > 
> > Where could I find that?
> 
> it should be listed in about:crashes

Actually I knew that already, my bad.

E10s disabled:

bp-1af7710a-a6a1-4ca2-98bf-b0fac0170717
bp-ab932282-c67f-49cb-ae99-f64270170717
bp-cc2da9f9-e180-4680-b9ca-839d30170717

E10s enabled:

bp-ca7e1de5-ec63-4a9c-aeef-e50210170717
bp-adb71263-2fd4-4cf2-8ec2-fcf760170717
bp-46e64006-c883-4b25-ae71-5f3630170717

Surprisingly the first stack with e10s enabled refers to this bug 1376754. The second and third refer to bug 1376825. Maybe I missed those mixed results earlier.
This bug 1376754 turned out to be unrelated to bug 1376825.

I can still reproduce bug 1376754 fairly easily:

bp-a237bf65-c821-44d4-9b94-7ba8f0170729
bp-139efcd0-dba0-4318-b7b8-511020170729
bp-4909f1d5-39be-4c3a-8cc9-ce9730170729
bp-901eba0a-496d-4d2d-98d4-7aa380170729
bp-44b67195-c48f-4acf-acb9-a9b310170729

It turned out the URL was a temporary one. Now it forwards to a permanent URL, which also does crash. Therefore changing the URL accordingly to http://www.voetbalzone.nl/social.asp.

Do these crash stacks (and those provided in comment #8) contain any useful information?
Flags: needinfo?(surkov.alexander)
See Also: 1376825
It crashes when we try to deliver TextChangeEvent to a parent process. I'm not sure what happens here. Aaron, are you?
Flags: needinfo?(surkov.alexander)
Priority: -- → P1
(In reply to Aaron Klotz [:aklotz] (a11y work receiving priority right now, please send interceptor reviews to dmajor or handyman) from comment #11)
> Not sure.

Could I do anything more to support fixing this crash?
Flags: needinfo?(aklotz)
(In reply to Aaron Klotz [:aklotz] (a11y work receiving priority right now, please send interceptor reviews to dmajor or handyman) from comment #11)
> Not sure.

is it something a11y related or ipc one?
(In reply to alexander :surkov from comment #13)
> is it something a11y related or ipc one?

This is definitely a11y. We'll have to assign someone to look at it.
Whiteboard: aes+
(In reply to Arjen de Jager from comment #12)
> Could I do anything more to support fixing this crash?

Once we have somebody assigned to this, they'll try your steps to reproduce. Hopefully those work for us too.
Flags: needinfo?(aklotz)
Eitan can you take a look at this one (try recreating and debugging)?
Flags: needinfo?(eitan)
Assignee: nobody → eitan
Flags: needinfo?(eitan)
Looks like the outer doc (google ads iframe) is being created in the same refresh tick when the remote inner doc is being constructed. The outer doc's show event is not dispatched yet, so the parent doesn't have a proxy for it yet in DocAccessibleParent::AddSubtree.
oops, I mean DocAccessibleParent::AddChildDoc
I take back my earlier prognosis. Looks like we intentionally process mutation events before adding child docs. This may be an event coalescence bug.
The iframe container's show event is erroneously dropped and not dispatched. This means that the iframe's accessible is never serialized and doesn't exist on the parent's end as I said above.

This happens because an ancestor accessible is reparented, and queues a show+hide event. The hide event is dropped in the coalesce stage, but the accessible is never unset as a hide event target. This causes any show event from descendants of that accessible to be dropped.
I guess we could unset the hide target flag regardless of whether its a needsShutDown event. But not doing so seems less disruptive.
Attachment #8897004 - Flags: review?(surkov.alexander)
Comment on attachment 8897004 [details] [diff] [review]
Remove hide event target flag from accessible when event is dropped. r?surkov

Review of attachment 8897004 [details] [diff] [review]:
-----------------------------------------------------------------

::: accessible/base/NotificationController.cpp
@@ +276,5 @@
>  
>      if (hideEvent->NeedsShutdown()) {
>        mDocument->ShutdownChildrenInSubtree(aEvent->GetAccessible());
> +    } else {
> +      aEvent->GetAccessible()->SetHideEventTarget(false);

I think you are right, and we should unset this bit unconditionally, so please change that. This code overall makes me scary though, r+ keeping fingers crossed.
Attachment #8897004 - Flags: review?(surkov.alexander) → review+
Attachment #8897004 - Attachment is obsolete: true
Attachment #8897644 - Attachment is obsolete: true
Pushed by eisaacson@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/6240e75545e6
Remove hide event target flag from accessible when event is dropped. r=surkov
https://hg.mozilla.org/mozilla-central/rev/6240e75545e6
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Looks like this could use a Beta approval request when you're comfortable doing so. Not sure about ESR52, will leave that up to your judgement.
Flags: needinfo?(eitan)
Verified this bug is fixed on 57.0a1 20170817100132.
Comment on attachment 8897862 [details] [diff] [review]
Remove hide event target flag from accessible when event is dropped.

Approval Request Comment
[Feature/Bug causing the regression]: e10s
[User impact if declined]: Tab crash with accessibilty+e10s
[Is this code covered by automated tests?]: No.
[Has the fix been verified in Nightly?]: Yes.
[Needs manual test from QE? If yes, steps to reproduce]: 
[List of other uplifts needed for the feature/fix]:
[Is the change risky?]: I don't think so.
[Why is the change risky/not risky?]: It's in Nightly and seems stable.
[String changes made/needed]:
Flags: needinfo?(eitan)
Attachment #8897862 - Flags: review+
Attachment #8897862 - Flags: approval-mozilla-beta?
Comment on attachment 8897862 [details] [diff] [review]
Remove hide event target flag from accessible when event is dropped.

Fix a crash and was verified. Beta56+.
Attachment #8897862 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Do we need this on ESR52 too or can it ride the 56 train?
Flags: needinfo?(eitan)
Comment on attachment 8897862 [details] [diff] [review]
Remove hide event target flag from accessible when event is dropped.

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined: Occasional content crash with accessibility enabled
Fix Landed on Version: 56
Risk to taking this patch (and alternatives if risky): This has had a while in 56 and 57 and no regressions have beein introduced.
String or UUID changes made by this patch: 

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.
Flags: needinfo?(eitan)
Attachment #8897862 - Flags: approval-mozilla-esr52?
This crash signature seems to have very low volume, 0 on esr in the last week, and you didn't answer the first question in the template.  Care to elaborate?
Flags: needinfo?(eitan)
I'm not surprised this is a low volume crasher. It would only be encountered rarely on accessibility-enabled Linux instances. This is a small cross section of users. But it is after all a crasher. If that doesn't justify an ESR uplift, I'm fine with it not landing there.
Flags: needinfo?(eitan)
Comment on attachment 8897862 [details] [diff] [review]
Remove hide event target flag from accessible when event is dropped.

low volume, let's skip esr on this one
Attachment #8897862 - Flags: approval-mozilla-esr52? → approval-mozilla-esr52-
You need to log in before you can comment on or make changes to this bug.