Closed Bug 1570038 Opened 5 years ago Closed 5 years ago

Crash in [@ mozilla::WrapNotNull<T> | mozilla::a11y::DocAccessibleParent::SendParentCOMProxy]

Categories

(Core :: Disability Access APIs, defect, P2)

Unspecified
Windows 10
defect

RESOLVED FIXED
mozilla70
Tracking Status
firefox-esr60 --- unaffected
firefox-esr68 --- unaffected
firefox68 --- unaffected
firefox69 --- disabled
firefox70 --- fixed

People

(Reporter: marcia, Assigned: Jamie)

References

Details

(Keywords: crash)

Crash Data

Attachments

(1 file)

This bug is for crash report bp-3e18b7ca-7075-4042-9de4-cdfe20190724.

Low-volume Windows crash, first seen when 69 was in Nightly: https://bit.ly/2SPuoOL. A few of the crashes I spot-checked have Fission enabled.

All crashes have the same Moz crash reason: MOZ_RELEASE_ASSERT(aBasePtr).

Comment:

" This is the 39th time I crashed today and I'm using my Windows10 1903 laptop and I have fission using Firefox nightly 70 64 bit. And now the website crashed in private window."

Top 10 frames of crashing thread:

0 xul.dll static class mozilla::NotNull<RefPtr<IAccessible> > mozilla::WrapNotNull<RefPtr<IAccessible> > mfbt/NotNull.h:153
1 xul.dll mozilla::a11y::DocAccessibleParent::SendParentCOMProxy accessible/ipc/DocAccessibleParent.cpp:772
2 xul.dll mozilla::a11y::DocAccessibleParent::AddChildDoc accessible/ipc/DocAccessibleParent.cpp:566
3 xul.dll mozilla::dom::BrowserParent::RecvPDocAccessibleConstructor dom/ipc/BrowserParent.cpp:1155
4 xul.dll mozilla::dom::PBrowserParent::OnMessageReceived ipc/ipdl/PBrowserParent.cpp:2604
5 xul.dll mozilla::dom::PContentParent::OnMessageReceived ipc/ipdl/PContentParent.cpp:5565
6 xul.dll mozilla::ipc::MessageChannel::DispatchMessage ipc/glue/MessageChannel.cpp:2108
7 xul.dll mozilla::ipc::MessageChannel::MessageTask::Run ipc/glue/MessageChannel.cpp:1986
8 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1225
9 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:486

Jamie, sending this to you for investigation.

Flags: needinfo?(jteh)

We're seeing an intermittent crash with a similar signature in automated testing (bug 1557609). I was hoping this was fixed by other work, but apparently not.

Assignee: nobody → jteh
Flags: needinfo?(jteh)
Priority: -- → P2
  1. This is a Fission code path.
  2. The outer doc accessible has both a valid Gecko accessible id and a valid MSAA id.
  3. It has a null COM proxy, which is expected.
  4. Since it has a null COM proxy, ProxyAccessible::GetCOMInterface (called by GetNativeInterface) will call GetIAccessibleFor.
  5. That should call GetRemoteIAccessibleFor, which will walk through remote document accessibles in the same process as the given MSAA id, calling accChild on each with the MSAA id until success.
  6. For some reason, GetNativeInterface is returning null, which (if the above is correct) means GetRemoteIAccessibleFor is returning null. That suggests that accChild failed for this id on all relevant remote documents.
  7. The big question is why. My guess is that the outer doc accessible died, but in that case, it should have been removed from the ProxyAccessible cache. It's possible (likely even?) that we're dealing with a race between the two content processes: the embedder is going away, but the parent hasn't received that notification yet. Meanwhile, the parent receives this message from the embedded document first.
  8. That said, the comment from the user talks about a page crash; it doesn't mention quickly closing the document, quickly refreshing, etc. So, I don't see why the embedder would be going away in this particular case.
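
The lookup described in steps 4 to 6 can be sketched roughly as follows. This is a simplified model, not the Gecko code: RemoteDoc, AccChild and FindRemoteAccessible are hypothetical stand-ins for the remote document accessibles, IAccessible::accChild and GetRemoteIAccessibleFor, and plain integers stand in for COM pointers.

```cpp
#include <optional>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a remote document accessible (not Gecko's class).
struct RemoteDoc {
  int processId;                    // content process that owns this document
  std::unordered_set<int> liveIds;  // MSAA ids this document can still resolve

  // Models accChild: succeeds only if the id is still alive in this document.
  std::optional<int> AccChild(int msaaId) const {
    if (liveIds.count(msaaId)) {
      return msaaId;
    }
    return std::nullopt;
  }
};

// Models the walk attributed to GetRemoteIAccessibleFor: try every remote
// document in the same process as the id until one of them resolves it.
std::optional<int> FindRemoteAccessible(const std::vector<RemoteDoc>& docs,
                                        int processId, int msaaId) {
  for (const RemoteDoc& doc : docs) {
    if (doc.processId != processId) {
      continue;  // only documents in the id's own process are relevant
    }
    if (auto found = doc.AccChild(msaaId)) {
      return found;
    }
  }
  // Every relevant document failed; this models the null result in step 6.
  return std::nullopt;
}
```

In this model, an id that has already been dropped from its document (for example because the embedder is going away) makes every AccChild call fail, so the walk returns nothing even though the caller still holds an id that looks valid.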

I'm not really sure what circumstances trigger this, so it's hard to figure out the "correct" fix here. I can fix the crash by bailing out if the native accessible for the outer doc is null. However, that means the embedded document will end up with no parent. If it doesn't die quickly and the client ends up interacting with it, having a null parent is bad. It won't crash Firefox, but clients will be confused, and Firefox will also suspend events for the document until it gets a parent.

I guess it's possible that the outer doc accessible is being recreated (e.g. due to frame reconstruction). We don't currently handle that case. Is that even possible while an iframe is loading? Handling it would require a call to AddChildDoc when the new embedder accessible is sent up to the parent.

I think I'm just going to null check here to deal with the crash and figure out any weirdness when we see it.

Previously, we expected that we'd always be able to get the COM proxy for the parent (outer doc), so we crashed if it was null.
For an out-of-process iframe, this sometimes fails.
That is probably because the outer doc died in the embedder process, but the parent process hasn't received a message to remove it from the ProxyAccessible tree yet.
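
A minimal sketch of the early-return approach, under stated assumptions: ComProxy, OuterDoc, ChildDoc and SendParentProxy are hypothetical names used only for illustration, and the real patch operates on Gecko's COM types rather than std::shared_ptr.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-ins; none of these are Gecko's real types.
struct ComProxy {};

struct OuterDoc {
  // Null when the proxy can't be retrieved, e.g. the embedder already died.
  std::shared_ptr<ComProxy> nativeInterface;
};

struct ChildDoc {
  std::shared_ptr<ComProxy> parentProxy;
  bool eventsSuspended = true;  // events stay suspended until a parent is set
};

// Returns true if the parent proxy was handed to the child document,
// false if we returned early because no proxy could be retrieved.
bool SendParentProxy(const OuterDoc& outer, ChildDoc& child) {
  std::shared_ptr<ComProxy> proxy = outer.nativeInterface;
  if (!proxy) {
    // Previously a null proxy hit a release assertion and crashed.
    // Returning early leaves the child without a parent instead.
    return false;
  }
  child.parentProxy = std::move(proxy);
  child.eventsSuspended = false;
  return true;
}
```

The trade-off is the one discussed above: the early return avoids the crash, but a child document that never receives a parent keeps its events suspended in this model.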

Pushed by jteh@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7a9fe2316293
When sending the parent COM proxy for a remote document, return early if the parent COM proxy can't be retrieved. r=yzen
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla70