Closed Bug 1325258 Opened 5 years ago Closed 4 years ago

Crash in IPCError-browser | PBrowserParent::RecvPDocAccessibleConstructor

Categories

(Core :: Disability Access APIs, defect)

Unspecified
All
defect
Not set
critical

RESOLVED DUPLICATE of bug 1341731
mozilla53
Tracking Status
firefox53 + fixed

People

(Reporter: ting, Assigned: tbsaunde)

References

Details

(Keywords: crash, regression, topcrash, Whiteboard: aes+)

Crash Data

This bug was filed from the Socorro interface and is report bp-6467fa5e-5e97-416d-863f-d8d1f2161220.
=============================================================
Top #8 for Nightly 20161218030213 on Windows, 17 crashes from 7 installations. The first appearance was on 20161117030212.

Could this also be a fallout from bug 1314707?
Flags: needinfo?(aklotz)
These reports are weird. All of the ones that I've selected so far contain stacks that have nothing to do with a11y. Not sure if this is a misclassification or something else; I'll have to examine a few raw dumps.
Flags: needinfo?(aklotz)
Ah, looks like crash-stats is showing the stack from the wrong process...
Yeah, basically it's an IPC_FAIL_NO_REASON(this) from TabParent::RecvPDocAccessibleConstructor().
Whiteboard: aes+
#7 topcrash in Nightly 20161230030205. davidb, any idea who can look at this? It would be nice to get a11y crashes out of the Nightly top 10 list, but this bug and bug 1324863 are preventing that...
Flags: needinfo?(dbolter)
Aaron, do you know what's happening here?
Flags: needinfo?(dbolter) → needinfo?(aklotz)
TabParent::RecvPDocAccessibleConstructor has been called with a non-null aParentDoc but aParentID is 0.

That isn't supposed to happen, so we return failure.

How we got into that state is a whole other question.
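
The check described above can be sketched roughly as follows. This is a standalone model, not the Gecko source: ParentDocStub, CheckResult and CheckDocConstructorArgs are hypothetical names, and the handling of the top-level (null parent doc) case is an assumption made for the sketch.

```cpp
#include <cstdint>

// Hypothetical stand-in for the parent-process document actor
// (not the real Gecko class).
struct ParentDocStub {};

// Hypothetical result type; per this bug, the real handler fails
// with IPC_FAIL_NO_REASON(this).
enum class CheckResult { Ok, Fail };

// Models the argument check for a new document accessible actor.
CheckResult CheckDocConstructorArgs(const ParentDocStub* aParentDoc,
                                    std::uint64_t aParentID) {
  if (aParentDoc) {
    // An embedded document must name the accessible it is attached to.
    // An ID of 0 here is the inconsistent state reported in this bug.
    return aParentID != 0 ? CheckResult::Ok : CheckResult::Fail;
  }
  // Assumption for this sketch: a top-level document has no parent
  // accessible, so a non-zero ID is treated as inconsistent too.
  return aParentID == 0 ? CheckResult::Ok : CheckResult::Fail;
}
```

In this model the crashing case is CheckDocConstructorArgs(&doc, 0): a parent doc is supplied but the parent ID is 0, so the check fails.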
Flags: needinfo?(aklotz)
(In reply to Aaron Klotz [:aklotz] from comment #6)
> How we got into that state is a whole other question.

I think this can happen in OuterDocAccessible::Shutdown() when the code from bug 862863 executes.

The child doc's parent has been nulled out due to RemoveChild() but then we go and rebind the child.
See comment 7. IPC doesn't like us rebinding a child doc whose parent was already nulled out. Any ideas?
Flags: needinfo?(surkov.alexander)
Trev, since surkov is away would you mind commenting on this? See comment 7.
Flags: needinfo?(surkov.alexander) → needinfo?(tbsaunde+mozbugs)
(In reply to Aaron Klotz [:aklotz] from comment #7)
> (In reply to Aaron Klotz [:aklotz] from comment #6)
> > How we got into that state is a whole other question.
> 
> I think this can happen in OuterDocAccessible::Shutdown() when the code from
> bug 862863 executes.
> 
> The child doc's parent has been nulled out due to RemoveChild() but then we
> go and rebind the child.

So, that should cause us to send a BindChildDoc message from NotificationController.cpp:855; the id being 0 there would be very odd. I guess we can turn that assert into a diagnostic assert and see what happens, though.
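
For context on why that helps: a plain MOZ_ASSERT is only checked in debug builds, while MOZ_DIAGNOSTIC_ASSERT is also checked in pre-release builds such as Nightly, so it can surface the bad state in crash reports from real users. The sketch below only models that difference; AssertKind, BuildConfig and IsEnforced are hypothetical names for illustration, not part of MFBT.

```cpp
// Hypothetical model of which builds enforce each kind of assertion.
enum class AssertKind { DebugOnly, Diagnostic };

struct BuildConfig {
  bool isDebug;       // debug build
  bool isPrerelease;  // e.g. a Nightly build with diagnostic asserts enabled
};

// Returns true if a failing assertion of the given kind would be
// checked (and so crash) in the given build.
constexpr bool IsEnforced(AssertKind aKind, BuildConfig aBuild) {
  return aBuild.isDebug ||
         (aKind == AssertKind::Diagnostic && aBuild.isPrerelease);
}
```

Under this model a debug-only assert is silent in an optimized Nightly build, while the diagnostic variant is enforced there.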
Flags: needinfo?(tbsaunde+mozbugs)
Assignee: nobody → tbsaunde+mozbugs
Flags: needinfo?(jmathies)
Pushed by tsaunders@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/bb4089cd9b19
make some asserts diagnostic asserts to investigate crashes
https://hg.mozilla.org/mozilla-central/rev/bb4089cd9b19
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Reopening because the patch that landed just adds diagnostic assertions, it doesn't fix the problem.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Added the crash signature from diagnostic asserts.
Crash Signature: [@ IPCError-browser | PBrowserParent::RecvPDocAccessibleConstructor ] → [@ IPCError-browser | PBrowserParent::RecvPDocAccessibleConstructor ] [@ mozilla::a11y::NotificationController::WillRefresh ]
So the diagnostic asserts show that the immediate problem is that the parent document doesn't have an IPC actor. I'm not really sure why that is, but hopefully it is because a11y was started after the child process and the parent document were created. In that case we just need to handle that situation, which should be easy enough.
Duplicate of this bug: 1325244
Flags: needinfo?(jmathies)
This is the #2 topcrash in Nightly 20170113030227, with 186 crashes at time of writing.
(In reply to Nicholas Nethercote [:njn] from comment #17)
> This is the #2 topcrash in Nightly 20170113030227, with 186 crashes at time
> of writing.

To clarify: the "IPCError-browser | PBrowserParent::RecvPDocAccessibleConstructor" signature is #2. The "mozilla::a11y::NotificationController::WillRefresh" is #5.
(In reply to Trevor Saunders (:tbsaunde) from comment #15)
> So the diagnostic asserts show that the immediate problem is the parent
> document doesn't have an ipc actor.  I'm not really sure why that is, but
> hopefully it is because a11y was started after the child process, and the
> parent document were created.  In that case we just need to deal with that
> case which should be easy enough.

Yura, Trevor mentioned bug 1329977 is the hope for where this fix would happen. Can you make that a top priority? (Ask for help if needed)
Flags: needinfo?(yzenevich)
Flags: needinfo?(yzenevich)
^ that was supposed to be a thumbs up emoji
Tracking 53+ for this top crash.
Depends on: 1329977
I talked with aklotz about this yesterday or the day before, and we are now less convinced the issue in bug 1329977 is the problem here. I think the patch that landed in bug 1325834 may actually be more for this than that bug, but I don't know if we have data yet to say if that is true.
I get this crash (and the other similar a11y crashes that appeared around the same time) fairly often in Nightly, on my Surface Pro machine. It seems to most often happen shortly after I close a nytimes.com page, though I can't reproduce it consistently.
First signature no longer occurs. The WillRefresh signature is still showing up. It happens on Windows and Linux. Currently #27 in a Windows-only top crash list, so not high volume.
OS: Windows 10 → All
Whiteboard: aes+
(In reply to Jim Mathies [:jimm] from comment #25)
> First signature no longer occurs. The WillRefresh signature is still showing
> up. Happen on Windows and Linux. Currently #27 in a windows only top crash
> list, so not high volume.

WillRefresh signature is logged as bug 1330484 with a test case.
No longer depends on: 1329977
up to #9 top crash now. re-adding tracking.
Whiteboard: aes+
See Also: → 1330484
Trevor - top-crash regression - any idea when you'll get to this?
Flags: needinfo?(tbsaunde+mozbugs)
seems like it may be the same as 1341731?  I just wrote a patch for that.
Flags: needinfo?(tbsaunde+mozbugs)
Depends on: 1341731
(In reply to Trevor Saunders (:tbsaunde) from comment #29)
> seems like it may be the same as 1341731?  I just wrote a patch for that.

I'm going to mark this as a duplicate of bug 1341731 because both signatures are similar and have disappeared from the crash lists, which is good!

But the "mozilla::a11y::NotificationController::WillRefresh" signature is still showing up. We have bug 1330484 for that.
Status: REOPENED → RESOLVED
Closed: 5 years ago → 4 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1341731
Hi :tbsaunde,
I observed that the [@ IPCError-browser | PBrowserParent::RecvPDocAccessibleConstructor ] signature is shown up in 53 but the crash in 53 is unaffected in bug 1341731. Is it correct?
Flags: needinfo?(tbsaunde+mozbugs)
(In reply to Gerry Chang [:gchang] from comment #31)
> Hi :tbsaunde,
> I observed that the [@ IPCError-browser |
> PBrowserParent::RecvPDocAccessibleConstructor ] signature is shown up in 53
> but the crash in 53 is unaffected in bug 1341731. Is it correct?

I'm not sure what you are trying to ask.

e10s + a11y should be disabled on Windows there; if we're still seeing this crash there, I guess there are things we could try backporting to fix this.
Flags: needinfo?(tbsaunde+mozbugs)
If it's being seen on pre-55 releases, that is due to people force-enabling e10s.
Fixed in the duplicate bug in 54, and I don't see any crashes on 53 beta 1.