Closed Bug 1550951 Opened 5 years ago Closed 3 years ago

startup Crash in [@ mozilla::dom::DocumentL10n::Init] related to omni.ja corruption

Categories

(Core :: Internationalization, defect, P2)

67 Branch
defect

RESOLVED FIXED
92 Branch
Tracking Status
firefox-esr60 --- unaffected
firefox-esr68 --- wontfix
firefox-esr78 --- wontfix
firefox-esr91 --- wontfix
firefox66 --- unaffected
firefox67 --- wontfix
firefox67.0.1 --- wontfix
firefox68 --- wontfix
firefox69 --- wontfix
firefox70 --- wontfix
firefox71 --- wontfix
firefox72 --- wontfix
firefox73 --- wontfix
firefox74 --- wontfix
firefox92 --- fixed

People

(Reporter: philipp, Unassigned)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: crash, regression, Whiteboard: [tbird crash])

Crash Data

This bug is for crash report bp-c462e041-f4fb-49b0-b1c9-5c3c10190510.

Top 10 frames of crashing thread:

0 xul.dll mozilla::dom::DocumentL10n::Init intl/l10n/DocumentL10n.cpp:98
1 xul.dll mozilla::dom::Document::InitializeLocalization dom/base/Document.cpp:3126
2 xul.dll mozilla::dom::Document::OnL10nResourceContainerParsed dom/base/Document.cpp:3202
3 xul.dll nsresult mozilla::dom::PrototypeDocumentContentSink::ResumeWalk dom/prototype/PrototypeDocumentContentSink.cpp:449
4 xul.dll nsXULPrototypeDocument::NotifyLoadDone dom/xul/nsXULPrototypeDocument.cpp:403
5 xul.dll nsresult XULContentSinkImpl::DidBuildModel dom/xul/nsXULContentSink.cpp:197
6 xul.dll nsresult nsParser::ResumeParse parser/htmlparser/nsParser.cpp:1008
7 xul.dll nsresult nsParser::OnStopRequest parser/htmlparser/nsParser.cpp:1354
8 xul.dll nsresult nsDocumentOpenInfo::OnStopRequest uriloader/base/nsURILoader.cpp:360
9 xul.dll nsresult nsJARChannel::OnStopRequest modules/libjar/nsJARChannel.cpp:1026

This is a low-volume crash signature that started showing up in Firefox 67, with
MOZ_RELEASE_ASSERT(jsm) or MOZ_RELEASE_ASSERT(mDOMLocalization) as the crash reason; both assertions were added in bug 1523194.

Priority: -- → P2

Low volume, wontfix for 67.

Still quite low volume. Marking fix-optional for 69 as well to get this out of regression triage.

Adding another signature seen during triage. These either have moz crash reason MOZ_RELEASE_ASSERT(jsm) or MOZ_RELEASE_ASSERT(mLocalization).

Crash Signature: [@ mozilla::dom::DocumentL10n::Init] → [@ mozilla::dom::DocumentL10n::Init] [@ mozilla::intl::Localization::Init ]

Marking fix optional per triage team.

Adding 71 as affected, although crashes there are fairly sparse. This is the #12 top content crash in 70.0rc2.

Makoto, could you help us find an owner for this bug please? Thanks

Flags: needinfo?(m_kato) → needinfo?(gandalf)

Kris, can you advise me here?

The crash is coming from:

nsCOMPtr<mozILocalizationJSM> jsm =
    do_ImportModule("resource://gre/modules/Localization.jsm");
MOZ_RELEASE_ASSERT(jsm);

and I don't know how to reproduce it or investigate it, or even how to handle it.
Localization.jsm has no "runtime" code and only registers its functions, so I doubt that it intermittently throws an exception just on loading.

I have a couple of hypotheses:

  1. We've observed cases where JS code on a user's machine is contaminated with a random character that breaks the JS (see bug 1594453 comment 17)
  2. Something happens that makes the loading of the module impossible (OOM?)

In both cases we are out of luck and need to abort, but I don't know how we should do that without crashing.
Is there a way to gather more information about why do_ImportModule didn't succeed?

Flags: needinfo?(gandalf) → needinfo?(kmaglione+bmo)

This is probably the same issue as bug 1403348, which we're pretty sure at this point is either omnijar or memory corruption. There really isn't any better option than to crash in that case. If omnijar is corrupt, we can't trust anything, and we're better off not running at all.

Flags: needinfo?(kmaglione+bmo)

Is the volume acceptable for marking this as WONTFIX then? I'm concerned that a high volume of omnijar or memory corruption somehow hits Localization.jsm in particular, so I'm wondering whether other do_ImportModule call sites register comparable volume.

I'd be inclined to just mark it as a dupe of bug 1403348. What we really need for both issues is a way to detect and report (to the user) or recover from (by reinstalling) omni jar corruption at runtime. Other than that, there really isn't anything we can do.
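
As an illustration of what runtime detection could look like, here is a minimal, self-contained sketch. It assumes the archive records a CRC-32 for each entry, as ZIP-format archives do, and that the entry's bytes are already in memory. `Crc32` and `EntryLooksIntact` are hypothetical names used only for this example, not existing Gecko APIs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Build the 256-entry lookup table for the reflected CRC-32 polynomial
// 0xEDB88320, which is the CRC variant used by the ZIP format.
static std::array<std::uint32_t, 256> MakeCrcTable() {
  std::array<std::uint32_t, 256> table{};
  for (std::uint32_t i = 0; i < 256; ++i) {
    std::uint32_t c = i;
    for (int bit = 0; bit < 8; ++bit) {
      c = (c & 1u) ? (0xEDB88320u ^ (c >> 1)) : (c >> 1);
    }
    table[i] = c;
  }
  return table;
}

// Compute the CRC-32 of a byte buffer.
std::uint32_t Crc32(const unsigned char* data, std::size_t size) {
  assert(data != nullptr || size == 0);
  static const std::array<std::uint32_t, 256> kTable = MakeCrcTable();
  std::uint32_t crc = 0xFFFFFFFFu;
  for (std::size_t i = 0; i < size; ++i) {
    crc = kTable[(crc ^ data[i]) & 0xFFu] ^ (crc >> 8);
  }
  return crc ^ 0xFFFFFFFFu;
}

// Hypothetical helper: returns true when the entry's bytes still match the
// checksum that was recorded for them when the archive was built.
bool EntryLooksIntact(const unsigned char* data, std::size_t size,
                      std::uint32_t expectedCrc) {
  return Crc32(data, size) == expectedCrc;
}
```

A check like this only shows that an entry's bytes differ from what was packaged; it cannot tell on-disk corruption apart from in-memory corruption.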

Or, well, I suppose we could special case the specific cases where we crash if we fail to load a module to also tell users their Firefox installation is probably corrupt... That might not be so bad as a short term solution, but omni jar corruption manifests in so many other ways that I'm not sure there's a point in playing whack-a-mole that way.
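
A rough sketch of that short-term idea, written as standalone C++ rather than against Gecko: the loader is injected as a callback, and `ModuleLoader`, `ImportOutcome` and `ImportOrExplain` are hypothetical names invented for this example. It only shows a failed import being turned into a user-facing hint; a real caller would presumably still abort afterwards.

```cpp
#include <functional>
#include <optional>
#include <string>

// Hypothetical loader type: returns the module source, or nothing on failure.
using ModuleLoader =
    std::function<std::optional<std::string>(const std::string&)>;

// Outcome of an import attempt, with a message meant for the user.
struct ImportOutcome {
  bool ok;
  std::string userMessage;  // empty when the import succeeded
};

// Try to load a module; on failure, build a message that points the user
// at a possibly corrupt installation instead of failing silently.
ImportOutcome ImportOrExplain(const ModuleLoader& loader,
                              const std::string& url) {
  if (loader(url).has_value()) {
    return {true, ""};
  }
  return {false, "Failed to load " + url +
                     ". The installation may be corrupt; reinstalling may help."};
}
```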

We have shipped our last beta for 71 and the bug is unassigned, so this is wontfix for 71.

Hey, this seems to be getting worse and may have regressed in September. Should we re-prioritize this?

Flags: needinfo?(gandalf)

I'm not the right person to triage it.

The crash is not really related to Intl logic, but to the do_ImportModule logic. As Kris stated in comment 10, there's likely not much we can salvage if that happens, so I don't know how to improve the situation.

In the long run we'll migrate Localization to Rust, but that will likely just move the crash to the next do_ImportModule call site.

I'll redirect the NI to Kris in case he sees anything actionable here.

Flags: needinfo?(gandalf) → needinfo?(kmaglione+bmo)

It has nothing to do with do_ImportModule either. It has to do with omni.ja corruption, and if we have a corrupt omni.ja, we don't have any option other than to crash.

Flags: needinfo?(kmaglione+bmo)
Blocks: 1616059
Crash Signature: [@ mozilla::dom::DocumentL10n::Init] [@ mozilla::intl::Localization::Init ] → [@ mozilla::dom::DocumentL10n::Init] [@ mozilla::intl::Localization::Init ] [@ mozilla::intl::Localization::Localization ]
Crash Signature: [@ mozilla::dom::DocumentL10n::Init] [@ mozilla::intl::Localization::Init ] [@ mozilla::intl::Localization::Localization ] → [@ mozilla::dom::DocumentL10n::Init] [@ mozilla::intl::Localization::Init ] [@ mozilla::intl::Localization::Localization ] [@ mozilla::intl::Localization::Activate ]
QA Whiteboard: qa-not-actionable
Summary: Crash in [@ mozilla::dom::DocumentL10n::Init] → startup Crash in [@ mozilla::dom::DocumentL10n::Init] related to omni.ja corruption
Whiteboard: [tbird crash]

maybe the latter, definitely not the former.

But it should go away after bug 1613705 lands.

This seems to be fixed starting in 93.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 92 Branch
Has Regression Range: --- → yes