Closed Bug 1806064 Opened 2 years ago Closed 2 years ago

Crash in [@ nsQueryObject<T>::operator()] with string bundles

Categories

(Core :: DOM: Workers, defect)

Desktop
Windows
defect

Tracking

()

RESOLVED FIXED
110 Branch
Tracking Status
firefox-esr102 --- unaffected
firefox108 --- unaffected
firefox109 --- unaffected
firefox110 + fixed

People

(Reporter: pascalc, Assigned: yulia)

References

(Regression)

Details

(5 keywords, Whiteboard: [fixed by backout][post-critsmash-triage][adv-main110-])

Crash Data

Crash report: https://crash-stats.mozilla.org/report/index/7e9bbefb-973a-4503-8207-b62260221215

Reason: EXCEPTION_ACCESS_VIOLATION_READ

Top 10 frames of crashing thread:

0  xul.dll  nsQueryObject<nsIStringBundle>::operator const  xpcom/base/nsQueryObject.h:24
0  xul.dll  RefPtr<  xpcom/base/nsCOMPtr.h:1475
0  xul.dll  nsStringBundleService::RegisterContentBundle  intl/strres/nsStringBundle.cpp:835
1  xul.dll  mozilla::dom::ContentChild::RecvRegisterStringBundles  dom/ipc/ContentChild.cpp:2369
2  xul.dll  mozilla::dom::PContentChild::OnMessageReceived  ipc/ipdl/PContentChild.cpp:10988
3  xul.dll  mozilla::ipc::MessageChannel::DispatchAsyncMessage  ipc/glue/MessageChannel.cpp:1756
3  xul.dll  mozilla::ipc::MessageChannel::DispatchMessage  ipc/glue/MessageChannel.cpp:1681
3  xul.dll  mozilla::ipc::MessageChannel::RunMessage  ipc/glue/MessageChannel.cpp:1481
3  xul.dll  mozilla::ipc::MessageChannel::MessageTask::Run  ipc/glue/MessageChannel.cpp:1579
4  xul.dll  mozilla::RunnableTask::Run  xpcom/threads/TaskController.cpp:539
Group: core-security

use-after-free in the crash address: 0xe5e5e5e5e5e5e5e5

Flags: needinfo?(ystartsev)
Regressed by: 1247687

[@ mozilla::dom::WorkerRunnable::Run] variant: bp-48d4a7d3-ee36-4bcf-aecb-467250221216

Crash Signature: [@ nsQueryObject<T>::operator()] → [@ nsQueryObject<T>::operator()] [@ mozilla::dom::WorkerRunnable::Run]
Summary: Crash in [@ nsQueryObject<T>::operator()] → Crash in [@ nsQueryObject<T>::operator()], [@ mozilla::dom::WorkerRunnable::Run]

It looks like the issue might be that the StringBundle isn't initialized before an error occurs. But that only explains the second variant, not the first. I am not sure how the worker modules changes impacted the first, as that happens on the content process and not on the worker process or the main thread while doing worker work. In particular, it looks like an issue when opening PDF files -- which suggests that the pdf worker might be triggering it... but the error still comes from content. I'll ask around.

Set release status flags based on info from the regressing bug 1247687

Group: core-security → dom-core-security

Here's an example of the string bundle crash on a poison value: bp-bbd28d65-3982-480a-981e-927520221216

See Also: → 1806119

(In reply to Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout) from comment #3)

[@ mozilla::dom::WorkerRunnable::Run] variant: bp-48d4a7d3-ee36-4bcf-aecb-467250221216

Should there maybe be a new bug filed for this instead? It does seem like it probably has the same regressor, but the stack looks completely different.

Flags: needinfo?(aryx.bugmail)

Bug 1805387 in the regression range involves both PDFs and strings, which is suspicious, but the actual patch is a CSS tweak to avoid windows overlapping so I can't see how it could be related.

One comment says "Cannot open PDF files."

The part of the worker module code that is causing this issue is most likely https://phabricator.services.mozilla.com/D163239 -- I added this because we cannot use the string bundle cross thread otherwise, and the module loader crashes without it if it fails on the worker side. However, I have little insight to how this works, and I don't know if :kmag is still on until the end of the year. I'll cc him here and maybe we can take a look together. If this is severe, i have no qualms about pulling the module loader back out if necessary. I've been keeping this rebased for a half year anyway.

Kmag, you helped yoshi with the stringbundle stuff related to worklets, ccing you here in case you have any idea. To me it looks like I am calling it in the wrong place, either too early or too late.

Flags: needinfo?(ystartsev) → needinfo?(kmaglione+bmo)

(shouldn't have cleared the needinfo for me here, i might forget to come back to this otherwise)

Flags: needinfo?(ystartsev)

Via the signature spikes report, here's what looks like another string bundle issue. As with the other one, a decent chunk are UAFs. bp-50a47787-801c-47ab-ab06-17de60221216

Crash Signature: [@ nsQueryObject<T>::operator()] [@ mozilla::dom::WorkerRunnable::Run] → [@ nsQueryObject<T>::operator()] [@ mozilla::dom::WorkerRunnable::Run] [@ nsContentUtils::EnsureAndLoadStringBundle ]

..and another, [@ nsContentUtils::FormatLocalizedString ]: bp-5e6f3da4-b23d-4c24-96ed-580c30221216

Crash Signature: [@ nsQueryObject<T>::operator()] [@ mozilla::dom::WorkerRunnable::Run] [@ nsContentUtils::EnsureAndLoadStringBundle ] → [@ nsQueryObject<T>::operator()] [@ mozilla::dom::WorkerRunnable::Run] [@ nsContentUtils::EnsureAndLoadStringBundle ] [@ nsContentUtils::FormatLocalizedString ]
Severity: -- → S2
Crash Signature: [@ nsQueryObject<T>::operator()] [@ mozilla::dom::WorkerRunnable::Run] [@ nsContentUtils::EnsureAndLoadStringBundle ] [@ nsContentUtils::FormatLocalizedString ] → [@ nsQueryObject<T>::operator()] [@ mozilla::dom::WorkerRunnable::Run] [@ nsContentUtils::EnsureAndLoadStringBundle] [@ nsContentUtils::FormatLocalizedString] [@ nsContentUtils::GetLocalizedString]
Flags: needinfo?(ystartsev)

The bug is linked to topcrash signatures, which match the following criteria:

  • Top 10 desktop browser crashes on nightly
  • Top 10 AArch64 and ARM crashes on nightly

For more information, please visit auto_nag documentation.

Keywords: topcrash

The bug is marked as tracked for firefox110 (nightly). However, the bug still isn't assigned.

:jstutte, could you please find an assignee for this tracked bug? Given that it is a regression and we know the cause, we could also simply backout the regressor. If you disagree with the tracking decision, please talk with the release managers.

For more information, please visit auto_nag documentation.

Flags: needinfo?(jstutte)

(In reply to Andrew McCreight [:mccr8] from comment #7)

(In reply to Sebastian Hengst [:aryx] (needinfo me if it's about an intermittent or backout) from comment #3)

[@ mozilla::dom::WorkerRunnable::Run] variant: bp-48d4a7d3-ee36-4bcf-aecb-467250221216

Should there maybe be a new bug filed for this instead? It does seem like it probably has the same regressor, but the stack looks completely different.

Yeah, that seems to be a different issue. Cracking up https://crash-stats.mozilla.org/report/index/570e2142-c5cb-4712-9630-00c3e0221218 shows me that apparently we try to run an already freed WorkerRunnable while doing a sync loop and fail on reading its mWorkerPrivate member, which sounds weird.

Flags: needinfo?(jstutte)
See Also: → 1806395
Crash Signature: [@ nsQueryObject<T>::operator()] [@ mozilla::dom::WorkerRunnable::Run] [@ nsContentUtils::EnsureAndLoadStringBundle] [@ nsContentUtils::FormatLocalizedString] [@ nsContentUtils::GetLocalizedString] → [@ nsQueryObject<T>::operator()] [@ nsContentUtils::EnsureAndLoadStringBundle] [@ nsContentUtils::FormatLocalizedString] [@ nsContentUtils::GetLocalizedString]

For now this is mitigated by backout as of bug 1247687 comment 77.

Yes, and bug 1806390 should fix the stringbundle crashes.

Flags: needinfo?(kmaglione+bmo)

I confirmed that the crashes have gone away, so I'll mark this fixed.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Whiteboard: [fixed by backout]
Flags: needinfo?(aryx.bugmail)
Assignee: nobody → ystartsev
Summary: Crash in [@ nsQueryObject<T>::operator()], [@ mozilla::dom::WorkerRunnable::Run] → Crash in [@ nsQueryObject<T>::operator()] with string bundles
Group: dom-core-security → core-security-release
Target Milestone: --- → 110 Branch
Duplicate of this bug: 1806718

I will try to reland the changes now that emilios changes are in. I'll monitor the crashes over the next few days.

Duplicate of this bug: 1808213
Flags: qe-verify-
Whiteboard: [fixed by backout] → [fixed by backout][post-critsmash-triage]
Whiteboard: [fixed by backout][post-critsmash-triage] → [fixed by backout][post-critsmash-triage][adv-main110-]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.