Closed Bug 1849230 Opened 1 year ago Closed 1 year ago

Intermittent application crashed [@ mozilla::detail::InvalidArrayIndex_CRASH] | single tracking bug

Categories

(Core :: Disability Access APIs, defect)

Tracking

RESOLVED FIXED
121 Branch
Tracking Status
firefox-esr115 --- wontfix
firefox119 --- wontfix
firefox120 --- wontfix
firefox121 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: eeejay)

References

Details

(Keywords: crash, intermittent-failure, intermittent-testcase)

Attachments

(1 file)

Filed by: ayeddi [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=426377360&repo=try
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/F-czDvO5RZ-6G0BtkGdFCg/runs/0/artifacts/public/logs/live_backing.log


Failed twice in a row on Try with the `--enable-a11y-checks` flag.

Both failures were in the Linux 18.04 x64 WebRender opt Mochitests with accessibility checks enabled (test-linux1804-64-qr/opt-mochitest-devtools-chrome-a11y-checks dt1) while testing the WIP patch https://phabricator.services.mozilla.com/D118126

PROCESS-CRASH | application crashed [@ mozilla::detail::InvalidArrayIndex_CRASH] | devtools/client/webconsole/test/browser/browser_webconsole_console_logging_workers_api.js
Log

Severity: -- → S3
Priority: -- → P3

New crashes on Try (devtools a11y-checks job):

  1. PROCESS-CRASH | application crashed [@ mozilla::detail::InvalidArrayIndex_CRASH] | devtools/client/inspector/rules/test/browser_rules_color_scheme_simulation.js
  2. PROCESS-CRASH | application crashed [@ mozilla::detail::InvalidArrayIndex_CRASH] | devtools/client/webconsole/test/browser/browser_webconsole_console_logging_workers_api.js
  3. PROCESS-CRASH | application crashed [@ mozilla::detail::InvalidArrayIndex_CRASH] | devtools/client/debugger/test/mochitest/browser_dbg-features-source-tree.js
  4. PROCESS-CRASH | application crashed [@ mozilla::detail::InvalidArrayIndex_CRASH] | devtools/client/framework/test/browser_toolbox_meatball.js

New crashes (Try):

CRASH	devtools/client/webconsole/test/browser/browser_webconsole_split_persist.js 
	application crashed [@ mozilla::detail::InvalidArrayIndex_CRASH]

And from the 10/6 run (Try):

CRASH	devtools/client/webconsole/test/browser/browser_webconsole_split_persist.js 
	application crashed [@ mozilla::detail::InvalidArrayIndex_CRASH]

This seems to happen outside of devtools as well, but I see accessibility related objects in the trace

PROCESS-CRASH | application crashed [@ mozilla::detail::InvalidArrayIndex_CRASH] | browser/components/urlbar/tests/browser/browser_searchMode_setURI.js

[task 2023-10-07T14:53:25.086Z] 14:53:25     INFO -  0  firefox-bin!MOZ_Crash(char const*, int, char const*) [Assertions.h:ef73c2e8cd831f2a46ff5704623db63232fd912d : 281]
[task 2023-10-07T14:53:25.086Z] 14:53:25     INFO -     Found by: inlining
[task 2023-10-07T14:53:25.087Z] 14:53:25     INFO -  1  firefox-bin!mozilla::detail::InvalidArrayIndex_CRASH(unsigned long, unsigned long) [Assertions.cpp:ef73c2e8cd831f2a46ff5704623db63232fd912d : 50 + 0x0]
[task 2023-10-07T14:53:25.087Z] 14:53:25     INFO -      rax = 0x00005637f583d830    rdx = 0x0000000000000001
[task 2023-10-07T14:53:25.088Z] 14:53:25     INFO -      rcx = 0x0000000000000022    rbx = 0x00007f7d47f94790
[task 2023-10-07T14:53:25.088Z] 14:53:25     INFO -      rsi = 0x00005637f577888d    rdi = 0x00007fff86269bf0
[task 2023-10-07T14:53:25.089Z] 14:53:25     INFO -      rbp = 0x00007fff8626a040    rsp = 0x00007fff8626a040
[task 2023-10-07T14:53:25.089Z] 14:53:25     INFO -       r8 = 0x00000000ffffffff     r9 = 0x0000000000000005
[task 2023-10-07T14:53:25.089Z] 14:53:25     INFO -      r10 = 0x0000000000000001    r11 = 0x0000000000000000
[task 2023-10-07T14:53:25.090Z] 14:53:25     INFO -      r12 = 0x0000000000000003    r13 = 0x0000000000000000
[task 2023-10-07T14:53:25.090Z] 14:53:25     INFO -      r14 = 0x00007f7d19518900    r15 = 0x00007f7d47f94ca0
[task 2023-10-07T14:53:25.090Z] 14:53:25     INFO -      rip = 0x00005637f5828a3a
[task 2023-10-07T14:53:25.091Z] 14:53:25     INFO -     Found by: given as instruction pointer in context
[task 2023-10-07T14:53:25.091Z] 14:53:25     INFO -  2  libxul.so!nsTArray_Impl<mozilla::a11y::LocalAccessible*, nsTArrayInfallibleAllocator>::ElementAt(unsigned long) const [nsTArray.h:ef73c2e8cd831f2a46ff5704623db63232fd912d : 1217]
[task 2023-10-07T14:53:25.092Z] 14:53:25     INFO -     Found by: inlining
[task 2023-10-07T14:53:25.092Z] 14:53:25     INFO -  3  libxul.so!mozilla::a11y::LocalAccessible::ContentChildAt(unsigned int) const [LocalAccessible.h:ef73c2e8cd831f2a46ff5704623db63232fd912d : 363]
[task 2023-10-07T14:53:25.092Z] 14:53:25     INFO -     Found by: inlining
[task 2023-10-07T14:53:25.093Z] 14:53:25     INFO -  4  libxul.so!mozilla::a11y::DocAccessible::ShutdownChildrenInSubtree(mozilla::a11y::LocalAccessible*) [DocAccessible.cpp:ef73c2e8cd831f2a46ff5704623db63232fd912d : 2633 + 0x4]
[task 2023-10-07T14:53:25.093Z] 14:53:25     INFO -      rbx = 0x00007f7d47f94790    rbp = 0x00007fff8626a080
[task 2023-10-07T14:53:25.094Z] 14:53:25     INFO -      rsp = 0x00007fff8626a050    r12 = 0x0000000000000003
[task 2023-10-07T14:53:25.094Z] 14:53:25     INFO -      r13 = 0x0000000000000000    r14 = 0x00007f7d19518900
[task 2023-10-07T14:53:25.094Z] 14:53:25     INFO -      r15 = 0x00007f7d47f94ca0    rip = 0x00007f7d81c7ce91
[task 2023-10-07T14:53:25.095Z] 14:53:25     INFO -     Found by: call frame info
[task 2023-10-07T14:53:25.095Z] 14:53:25     INFO -  5  libxul.so!mozilla::a11y::DocAccessible::ShutdownChildrenInSubtree(mozilla::a11y::LocalAccessible*) [DocAccessible.cpp:ef73c2e8cd831f2a46ff5704623db63232fd912d : 2641 + 0xe]
[task 2023-10-07T14:53:25.096Z] 14:53:25     INFO -      rbx = 0x00007f7d47e5faf0    rbp = 0x00007fff8626a0c0
[task 2023-10-07T14:53:25.096Z] 14:53:25     INFO -      rsp = 0x00007fff8626a090    r12 = 0x0000000000000003
[task 2023-10-07T14:53:25.097Z] 14:53:25     INFO -      r13 = 0x0000000000000000    r14 = 0x00007f7d19518900
[task 2023-10-07T14:53:25.097Z] 14:53:25     INFO -      r15 = 0x00007f7d47f94790    rip = 0x00007f7d81c7ce71
[task 2023-10-07T14:53:25.097Z] 14:53:25     INFO -     Found by: call frame info
[task 2023-10-07T14:53:25.098Z] 14:53:25     INFO -  6  libxul.so!mozilla::a11y::NotificationController::ProcessMutationEvents() [NotificationController.cpp:ef73c2e8cd831f2a46ff5704623db63232fd912d : 593 + 0x4]
[task 2023-10-07T14:53:25.098Z] 14:53:25     INFO -      rbx = 0x00007f7d47e5faf0    rbp = 0x00007fff8626a1b0
[task 2023-10-07T14:53:25.098Z] 14:53:25     INFO -      rsp = 0x00007fff8626a0d0    r12 = 0x0000000000000000
[task 2023-10-07T14:53:25.099Z] 14:53:25     INFO -      r13 = 0x0000000000000000    r14 = 0x00007fff8626a0f0
[task 2023-10-07T14:53:25.099Z] 14:53:25     INFO -      r15 = 0x00007f7d20e4b4a0    rip = 0x00007f7d81c48491
[task 2023-10-07T14:53:25.100Z] 14:53:25     INFO -     Found by: call frame info

Component: General → Disability Access APIs
Product: DevTools → Core
Severity: S3 → --
Priority: P3 → --

(In reply to Julian Descottes [:jdescottes] from comment #6)

This seems to happen outside of devtools as well, but I see accessibility related objects in the trace

PROCESS-CRASH | application crashed [@ mozilla::detail::InvalidArrayIndex_CRASH] | browser/components/urlbar/tests/browser/browser_searchMode_setURI.js

This crash has happened on this Autoland push while running test-linux1804-64-qr/opt-mochitest-browser-chrome-swr-a11y-checks-3 - log

New crash on Try shippable (log)

devtools/client/debugger/test/mochitest/browser_dbg-breakpoints-list.js 
	application crashed [@ mozilla::detail::InvalidArrayIndex_CRASH]

We try to get an invalid index here while shutting down a subtree after we handle its hide event. I've been trying to fathom how that's possible.

Most of the time, we (deliberately) repeatedly retrieve index 0; i.e. jdx doesn't change from 0. We rely on the fact that shutting down a child removes that child from its parent. In other words, we remove children from the first to the last. The only exception is that we skip a child if it's not bound to its parent, since the child can't remove itself. That's the only time we increment jdx. This really should never happen.
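To make the loop invariant concrete, here is a minimal sketch using toy types. `ToyAccessible`, `AppendUnbound` and `ShutdownChildren` are names made up for this illustration and are not the Gecko classes; the sketch only assumes what is described above: a bound child removes itself from its parent on shutdown, and `jdx` advances only past a child that is not bound to its parent.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for an accessible; not the real LocalAccessible.
struct ToyAccessible {
  ToyAccessible* parent = nullptr;  // null means "not bound to a parent"
  std::vector<ToyAccessible*> children;
  bool defunct = false;

  void AppendChild(ToyAccessible* child) {
    child->parent = this;
    children.push_back(child);
  }

  // Models the "should never happen" state: the parent lists the child,
  // but the child does not point back at the parent.
  void AppendUnbound(ToyAccessible* child) { children.push_back(child); }

  // A bound child removes itself from its parent's child list.
  void Shutdown() {
    defunct = true;
    if (parent) {
      auto& siblings = parent->children;
      siblings.erase(std::remove(siblings.begin(), siblings.end(), this),
                     siblings.end());
      parent = nullptr;
    }
  }
};

// Shuts down every child and returns how often jdx had to advance.
std::size_t ShutdownChildren(ToyAccessible& container) {
  std::size_t skipped = 0;
  const std::size_t count = container.children.size();
  for (std::size_t idx = 0, jdx = 0; idx < count; ++idx) {
    // The invariant the loop relies on: jdx is always a valid index.
    assert(jdx < container.children.size());
    ToyAccessible* child = container.children[jdx];
    if (!child->parent) {
      // An unbound child cannot remove itself, so step over it.
      ++jdx;
      ++skipped;
    }
    child->Shutdown();
  }
  return skipped;
}
```

With only bound children, `jdx` stays at 0 and the list ends up empty. The invariant breaks only if something other than the child being shut down shrinks the list during an iteration.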

The only thing I can think of is that some child is removing a sibling, but I can't think of any reason that could happen. aria-owns can mess with the sibling tree, but aria-owns gets cleaned up during ContentRemoved. Owned elements should be back in their correct places by the time we get here.

Eitan, can you think of any reason a child would remove a sibling? Or any other reason we might be crashing here that I'm missing?

One way to wallpaper this would be to check ContentChildCount before calling ContentChildAt in every iteration of the loop. That should avoid the crash, but it's wallpapering something I really don't understand.

Flags: needinfo?(eitan)

I reproduced this locally and I'm looking into it. At first glance, it looks like the accessible's doc is somehow shut down within ShutdownChildrenInSubtree, which causes things to go badly. I'm trying to figure out how that is possible.

Flags: needinfo?(eitan)

Hmm. This reminds me a bit of bug 1690456. I wonder if something we do during shutdown allows JS to run, which allows a document or even the accessibility service to shut down? See also bug 1849179 comment 16, where the accessibility service has mysteriously shut down with a document still alive.

Yeah, those can all be a similar issue. In this crash what is happening is:

  1. ShutdownChildrenInSubtree is called on a container, and it loops through the children
  2. A child of the container is shut down.
  3. The child is also the last accessible in an xpcom doc's cache.
  4. Since it is the last accessible, RemoveFromXPCDocumentCache is called on the accessible's xpc doc.
  5. It is also the last xpc doc, so the accessibility service is shut down.
  6. The DocAccessibles are shut down, and all their referenced accessibles are shut down.
  7. We get index 0 in the container's children, but the container is now defunct and mChildren is empty.
  8. Boom
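The sequence above can be replayed in a toy model. Everything here (`ToyService`, `ToyAccessible`, `Adopt`, `ReleaseWrapper`, `LoopReadsInvalidIndex`) is hypothetical and only mimics the described behaviour: shutting down the accessible that holds the last xpcom wrapper shuts down the whole service, which empties every child list. The sketch uses `std::vector::at` so the bad read surfaces as an exception instead of a crash.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

struct ToyService;

// Toy stand-in for an accessible; not the real Gecko class.
struct ToyAccessible {
  ToyAccessible* parent = nullptr;
  std::vector<ToyAccessible*> children;
  ToyService* service = nullptr;
  bool hasXpcWrapper = false;  // assumption: models "has an xpcom wrapper"
  bool defunct = false;

  void Shutdown();
};

// Toy stand-in for the accessibility service.
struct ToyService {
  std::vector<ToyAccessible*> all;  // every accessible the service knows
  int xpcWrappers = 0;
  bool isShutdown = false;

  void Adopt(ToyAccessible* acc, ToyAccessible* parent, bool wrapped) {
    acc->service = this;
    acc->parent = parent;
    acc->hasXpcWrapper = wrapped;
    if (parent) parent->children.push_back(acc);
    if (wrapped) ++xpcWrappers;
    all.push_back(acc);
  }

  // Steps 4-6: losing the last wrapper shuts everything down.
  void ReleaseWrapper() {
    assert(xpcWrappers > 0);
    if (--xpcWrappers == 0) ShutdownAll();
  }

  void ShutdownAll() {
    isShutdown = true;
    for (ToyAccessible* acc : all) {
      acc->defunct = true;
      acc->children.clear();
      acc->parent = nullptr;
    }
  }
};

void ToyAccessible::Shutdown() {
  defunct = true;
  if (parent) {
    auto& siblings = parent->children;
    siblings.erase(std::remove(siblings.begin(), siblings.end(), this),
                   siblings.end());
    parent = nullptr;
  }
  if (hasXpcWrapper) {
    hasXpcWrapper = false;
    service->ReleaseWrapper();
  }
}

// Steps 1, 2 and 7: returns true if the loop read an index that is gone.
bool LoopReadsInvalidIndex(ToyAccessible& container) {
  const std::size_t count = container.children.size();
  try {
    for (std::size_t idx = 0; idx < count; ++idx) {
      ToyAccessible* child = container.children.at(0);  // throws if empty
      child->Shutdown();
    }
  } catch (const std::out_of_range&) {
    return true;  // step 8, reported instead of crashing
  }
  return false;
}
```

In this model, a container with two children hits the invalid read on the second iteration when the first child holds the only wrapper, because the cached `count` is stale once the service has cleared the container's child list.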

At least in this case I don't see a need to keep a11y alive after we lose the last xpcom doc, so an IsDefunct check of the container in ShutdownChildrenInSubtree makes sense to me. As for the focus manager deref, I guess I would do the same, just with a good comment there.

I looked a bit closer at bug 1849179, and I think it is a similar problem. There, the a11y shutdown happens after the last/only child of a container is shut down. The recursion stack then pops to unbind the parent container, and since the focus manager is already removed, things go bad.

I think I have a fix that is more specific than just a defunct check and that will remedy both issues.

Unbinding an accessible from a document can cause accessibility to shut
down if the accessible is the last remaining one with an xpcom wrapper.

In that case we need to return early from ShutdownChildrenInSubtree.
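As an illustration of that early return (a sketch with toy types, not the landed patch), the recursion below stops as soon as unbinding has shut the service down. `ToyService`, `ToyAccessible`, `lastXpcWrapper` and `ToyShutdownChildrenInSubtree` are hypothetical names; the model assumes that unbinding the accessible with the last xpcom wrapper shuts down the service and empties every child list.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy stand-in for an accessible; not the real Gecko class.
struct ToyAccessible {
  ToyAccessible* parent = nullptr;
  std::vector<ToyAccessible*> children;
  bool lastXpcWrapper = false;  // assumption: unbinding this ends a11y

  void AppendChild(ToyAccessible* child) {
    child->parent = this;
    children.push_back(child);
  }
};

// Toy stand-in for the document plus the accessibility service.
struct ToyService {
  std::vector<ToyAccessible*> all;
  bool isShutdown = false;
  int unbindCalls = 0;

  void Unbind(ToyAccessible* acc) {
    ++unbindCalls;
    if (acc->parent) {
      auto& siblings = acc->parent->children;
      siblings.erase(std::remove(siblings.begin(), siblings.end(), acc),
                     siblings.end());
      acc->parent = nullptr;
    }
    if (acc->lastXpcWrapper) {
      // Losing the last wrapper shuts everything down.
      isShutdown = true;
      for (ToyAccessible* other : all) {
        other->children.clear();
        other->parent = nullptr;
      }
    }
  }
};

void ToyShutdownChildrenInSubtree(ToyService& service, ToyAccessible* acc) {
  const std::size_t count = acc->children.size();
  for (std::size_t idx = 0, jdx = 0; idx < count; ++idx) {
    ToyAccessible* child = acc->children[jdx];
    if (!child->parent) ++jdx;
    ToyShutdownChildrenInSubtree(service, child);
    if (service.isShutdown) {
      // The tree is already torn down: do not touch acc's children again
      // and do not unbind acc itself. The check repeats in every caller,
      // so the whole recursion unwinds without further work.
      return;
    }
  }
  service.Unbind(acc);
}
```

Because the check sits directly after the recursive call, it covers both situations discussed in this bug: the next loop iteration never reads a child from an emptied list, and the parent is not unbound after its last or only child triggered the shutdown.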

Assignee: nobody → eitan
Status: NEW → ASSIGNED
Pushed by eisaacson@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b67f6f17bf52
End ShutdownChildrenInSubtree if accessibility was shut down. r=Jamie
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 121 Branch