Closed Bug 1712210 Opened 4 years ago Closed 4 years ago

Crash in [@ mozilla::a11y::RemoteAccessibleBase<T>::Shutdown]

Categories

(Core :: Disability Access APIs, defect, P1)

Unspecified
All
defect

Tracking

RESOLVED FIXED
90 Branch
Fission Milestone M7a
Tracking Status
firefox-esr78 --- unaffected
firefox88 --- unaffected
firefox89 --- unaffected
firefox90 --- fixed

People

(Reporter: aryx, Assigned: Jamie)

Details

(Keywords: crash)

Attachments

(1 file)

10 crashes across 7+ installations for Nightly 90.0a1, oldest build ID 20210505215208. One crash is on Linux; the others are on Windows.

Crash report: https://crash-stats.mozilla.org/report/index/faf0d6db-7761-4c84-af9a-8224f0210520

MOZ_CRASH Reason: MOZ_DIAGNOSTIC_ASSERT(!IsDoc())

Top 10 frames of crashing thread:

0 xul.dll mozilla::a11y::RemoteAccessibleBase<mozilla::a11y::RemoteAccessible>::Shutdown accessible/ipc/RemoteAccessibleBase.cpp:25
1 xul.dll mozilla::a11y::RemoteAccessibleBase<mozilla::a11y::RemoteAccessible>::Shutdown accessible/ipc/RemoteAccessibleBase.cpp:36
2 xul.dll mozilla::a11y::RemoteAccessibleBase<mozilla::a11y::RemoteAccessible>::Shutdown accessible/ipc/RemoteAccessibleBase.cpp:36
3 xul.dll mozilla::a11y::RemoteAccessibleBase<mozilla::a11y::RemoteAccessible>::Shutdown accessible/ipc/RemoteAccessibleBase.cpp:36
4 xul.dll mozilla::a11y::RemoteAccessibleBase<mozilla::a11y::RemoteAccessible>::Shutdown accessible/ipc/RemoteAccessibleBase.cpp:36
5 xul.dll mozilla::a11y::DocAccessibleParent::RecvHideEvent accessible/ipc/DocAccessibleParent.cpp:216
6 xul.dll mozilla::a11y::PDocAccessibleParent::OnMessageReceived ipc/ipdl/PDocAccessibleParent.cpp:345
7 xul.dll mozilla::dom::PContentParent::OnMessageReceived ipc/ipdl/PContentParent.cpp:6597
8 xul.dll mozilla::ipc::MessageChannel::DispatchMessage ipc/glue/MessageChannel.cpp:2076
9 xul.dll mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal xpcom/threads/TaskController.cpp:766
Severity: -- → S2

As far as I can tell, all crashes have Fission enabled.

While removing a subtree, we end up calling RemoteAccessibleBase::Shutdown on a DocAccessibleParent. This is never supposed to happen; child documents should get handled by a separate code path chosen because the parent is an OuterDoc. So, a document became a child of something that wasn't an OuterDoc, which is very bad.

Looking at one of the crash dumps, I was able to glean this info:

  1. The parent of the document was an image.
  2. The document was an OOP iframe.

My guess is that the following happened:

  1. The OOP iframe DocAccessibleParent got created.
  2. DocAccessibleParent::AddChildDoc was called, but the parent id didn't exist yet because the embedder process hadn't sent the OuterDoc yet. Therefore, it got saved in mPendingChildDocs.
  3. An OuterDocAccessible with that id was somehow never sent to the parent process. I'm not quite sure how this could happen, though. Maybe the OuterDocAccessible was recreated before we could ever send a show event for it? Or maybe it was created and removed before the OOP iframe document was ever created?
  4. Later, another accessible (e.g. an image) reused the id.
  5. At this point, the entry in mPendingChildDocs was found and the OOP iframe document was added to the image created in step 4.
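To make the suspected sequence easier to follow, here is a minimal standalone sketch of it. This is not the Gecko code: `Role`, `ModelParentDoc`, `ShowAccessible`, `ReplayHypothesis`, the string doc name and the id value 42 are all made up for illustration, and only the names `AddChildDoc` and `pendingChildDocs` echo the real `DocAccessibleParent::AddChildDoc` and `mPendingChildDocs`. It only models the guess above, namely that a pending entry keyed by id gets consumed by whatever accessible shows up with that id later.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical, simplified model of the parent-process state (not Gecko code).
enum class Role { OuterDoc, Image };

struct ModelParentDoc {
  // Accessibles the parent process knows about, keyed by id.
  std::unordered_map<std::uint64_t, Role> accessibles;
  // Child docs waiting for their embedder id: embedder id -> child doc name.
  std::unordered_map<std::uint64_t, std::string> pendingChildDocs;
  // Attached child docs: child doc name -> role of the accessible they hang off.
  std::unordered_map<std::string, Role> attachedTo;

  // Attach immediately if the embedder id is known, otherwise defer.
  void AddChildDoc(const std::string& childDoc, std::uint64_t embedderId) {
    auto it = accessibles.find(embedderId);
    if (it == accessibles.end()) {
      pendingChildDocs[embedderId] = childDoc;
      return;
    }
    attachedTo[childDoc] = it->second;
  }

  // Models a show event that creates an accessible with the given id.
  void ShowAccessible(std::uint64_t id, Role role) {
    accessibles[id] = role;
    auto pending = pendingChildDocs.find(id);
    if (pending != pendingChildDocs.end()) {
      // No role check here: this is the suspected flaw.
      attachedTo[pending->second] = role;
      pendingChildDocs.erase(pending);
    }
  }
};

// Replays steps 1 to 5 and returns the role the iframe doc ended up under.
Role ReplayHypothesis() {
  ModelParentDoc doc;
  const std::uint64_t reusedId = 42;  // arbitrary illustrative id

  // Steps 1-2: the OOP iframe doc arrives before its OuterDoc is known.
  doc.AddChildDoc("oop-iframe", reusedId);
  assert(doc.pendingChildDocs.count(reusedId) == 1);

  // Step 3: the OuterDoc with reusedId is never sent, so nothing happens.

  // Steps 4-5: an image reuses the id and picks up the stale pending entry.
  doc.ShowAccessible(reusedId, Role::Image);
  return doc.attachedTo.at("oop-iframe");
}
```

In this model, `ReplayHypothesis()` returns `Role::Image`, which corresponds to the situation seen in the crash dump: a document whose parent is an image rather than an OuterDoc.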

To (hopefully) address this, I think we should:

  1. Add a diagnostic assertion ensuring that a document is never added to something which isn't an OuterDoc.
  2. When the embedder accessible for an OOP iframe is changed (BrowserBridgeParent::RecvSetEmbedderAccessible), remove any mPendingChildDocs entry for an associated child doc.
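As a rough illustration of these two proposals, here is a small standalone model. It is a sketch under assumptions, not the actual patch: `AccRole`, `PatchedParentDoc`, `AttachChildDoc`, `ShowAccessible`, `ForgetPendingChildDoc` and the two replay helpers are hypothetical names, and plain `assert` stands in for a diagnostic assertion. `ForgetPendingChildDoc` models what would be triggered from `BrowserBridgeParent::RecvSetEmbedderAccessible`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical, simplified model (not Gecko code).
enum class AccRole { OuterDoc, Image };

struct PatchedParentDoc {
  std::unordered_map<std::uint64_t, AccRole> accessibles;
  // Embedder id -> name of the child doc waiting for that id.
  std::unordered_map<std::uint64_t, std::string> pendingChildDocs;
  // Child doc name -> id of the accessible it is attached to.
  std::unordered_map<std::string, std::uint64_t> childDocEmbedder;

  void AttachChildDoc(const std::string& childDoc, std::uint64_t embedderId) {
    // Proposal 1: a document may only ever be added to an OuterDoc.
    assert(accessibles.at(embedderId) == AccRole::OuterDoc);
    childDocEmbedder[childDoc] = embedderId;
  }

  void AddChildDoc(const std::string& childDoc, std::uint64_t embedderId) {
    if (accessibles.count(embedderId) == 0) {
      pendingChildDocs[embedderId] = childDoc;  // embedder not known yet
      return;
    }
    AttachChildDoc(childDoc, embedderId);
  }

  void ShowAccessible(std::uint64_t id, AccRole role) {
    accessibles[id] = role;
    auto pending = pendingChildDocs.find(id);
    if (pending != pendingChildDocs.end()) {
      const std::string childDoc = pending->second;
      pendingChildDocs.erase(pending);
      AttachChildDoc(childDoc, id);
    }
  }

  // Proposal 2: when the embedder accessible changes, drop any pending
  // addition for the associated child doc so a reused id cannot claim it.
  void ForgetPendingChildDoc(const std::string& childDoc) {
    for (auto it = pendingChildDocs.begin(); it != pendingChildDocs.end();) {
      if (it->second == childDoc) {
        it = pendingChildDocs.erase(it);
      } else {
        ++it;
      }
    }
  }
};

// Id reuse after the embedder changed: the doc must stay unattached.
bool ReplayWithFix() {
  PatchedParentDoc doc;
  doc.AddChildDoc("oop-iframe", 42);
  doc.ForgetPendingChildDoc("oop-iframe");  // embedder accessible changed
  doc.ShowAccessible(42, AccRole::Image);   // id reused by an image
  return doc.childDocEmbedder.count("oop-iframe") == 0;
}

// Normal case: the OuterDoc arrives late and the pending doc attaches to it.
bool ReplayNormalCase() {
  PatchedParentDoc doc;
  doc.AddChildDoc("oop-iframe", 7);
  doc.ShowAccessible(7, AccRole::OuterDoc);
  return doc.childDocEmbedder.count("oop-iframe") == 1;
}
```

With the pending entry removed, the image that reuses the id has nothing to pick up, and the assertion in `AttachChildDoc` would catch any remaining path that tries to attach a document to a non-OuterDoc.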

https://gmconline.com.br/ seems to intermittently reproduce this for me after waiting for 20 seconds or so.

Assignee: nobody → jteh
Blocks: a11y-fission
Fission Milestone: --- → ?
Priority: -- → P1

Previously, if an OuterDoc was never sent to the parent process and its id was reused later, we ended up adding the document to that accessible, which usually wasn't even an OuterDoc.
Alongside the actual fix, add some assertions to make breakage in this area easier to debug in future.

As bad luck would have it, I now can't reproduce this (even in Nightly) with https://gmconline.com.br/ or any of the other sites in crash reports, so I can't verify the fix. :(

Pushed by jteh@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/01692079bf96 Remove a pending child doc addition (if any) when the embedder accessible for an OOP iframe changes. r=eeejay
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 90 Branch

That crash report shows the build is a week or so out of date. Could you please manually check for updates? Thanks.

Flags: needinfo?(lomombwlo)

(In reply to James Teh [:Jamie] from comment #9)

> That crash report shows the build is a week or so out of date. Could you please manually check for updates? Thanks.

That's impossible.
https://github.com/lomomcat/FirefoxNightCrashes/raw/main/Snipaste_2021-05-29_00-10-06.png

Flags: needinfo?(lomombwlo)

The crash report you linked in comment 7 was this one:
https://crash-stats.mozilla.org/report/index/8972eee1-e210-429a-b95d-dc6290210521
If you take a look at the build id listed in that crash report, it shows 20210520095745, a build from 20 May. Perhaps the wrong crash report link was accidentally provided?

Furthermore, I can't find any crash reports for this crash in builds after the patch landed. See the Crash-Stop data at the top of this bug.

Blocks: 1713680