Closed Bug 1547649 Opened 2 years ago Closed 7 months ago

OOM crash in [@ NS_InitXPCOM] needs annotation

Categories

(Core :: Javascript: WebAssembly, defect, P2)

Unspecified
Windows 10
defect

Tracking

()

RESOLVED FIXED
mozilla79
Tracking Status
firefox-esr68 --- wontfix
firefox-esr78 --- fixed
firefox77 --- wontfix
firefox78 --- wontfix
firefox79 --- fixed

People

(Reporter: gsvelto, Assigned: lth)

References

Details

(Keywords: crash)

Crash Data

Attachments

(1 file)

This bug is for crash report bp-875d6118-8036-47c5-bd35-bf9210190427.

Top 10 frames of crashing thread:

0 xul.dll NS_InitXPCOM xpcom/build/XPCOMInit.cpp:442
1 xul.dll XRE_InitEmbedding2 toolkit/xre/nsEmbedFunctions.cpp:196
2 xul.dll mozilla::ipc::ScopedXREEmbed::Start ipc/glue/ScopedXREEmbed.cpp
3 xul.dll mozilla::dom::ContentProcess::Init dom/ipc/ContentProcess.cpp:188
4 xul.dll XRE_InitChildProcess toolkit/xre/nsEmbedFunctions.cpp:727
5 firefox.exe static int content_process_main ipc/contentproc/plugin-container.cpp:56
6 firefox.exe static int NS_internal_main browser/app/nsBrowserApp.cpp:263
7 firefox.exe wmain toolkit/xre/nsWindowsWMain.cpp:131
8 firefox.exe static int __scrt_common_main_seh f:/dd/vctools/crt/vcstartup/src/startup/exe_common.inl:288
9 kernel32.dll BaseThreadInitThunk 

There might be two different crashes here (and a few random single-installation crashes intermingled with them too).

The first one is on Windows where we fail in JS_InitWithFailureDiagnostic() with the crash reason set to "js::wasm::Init() failed". The second one has the same stack and fails in the same place on Fennec with "u_init() failed".

The first instance of the Windows crash is on buildid 20190316094726 and the first instance of the second is on buildid 20190331141835.

Both crashes seem to be happening only on nightly and beta.

"js::wasm::Init() failed" is the most returned moz-crash reason (61% over all the crashes so far), however the only reason for failing would be js_new issue, which sounds unlikely, especially in a start-up crash, unless there is absolutely no memory available on the system for making a small allocation.

Component: JavaScript Engine: JIT → Javascript: WebAssembly

Yes, agreed. Seems weird, but if I'm reading the above crash data chart correctly, the crash rate is really low so maybe this is just some weird/rare OOM?

(In reply to Luke Wagner [:luke] from comment #2)

Yes, agreed. Seems weird, but if I'm reading the above crash data chart correctly, the crash rate is really low so maybe this is just some weird/rare OOM?

I just double-checked the reports: they all have ridiculously low amount of commit space available (check the "available page file" field in the reports). With less than 10MiB available these are likely OOM conditions. Is there any particular reason why this would show up only on nightly and beta but not release?

I don't know if you were asking someone else, but I can't think of anything. For reference, I just checked and sizeof(ProcessCodeSegmentMap) (the js_new that is failing) is just 168, so it's not like this is some big allocation that is failing either.

The crashes have so little memory that even a small allocation is likely to fail. In fact it's a miracle that Firefox was still running at that point. I doubt there's anything fixable here but it would be nice if we could add an OOM annotation in this case so that this type of crashes are flagged as such.

Severity: critical → normal
Priority: -- → P3
Summary: Crash in [@ NS_InitXPCOM] → OOM crash in [@ NS_InitXPCOM] needs annotation
See Also: → 1629217
See Also: → 1637864
Duplicate of this bug: 1637864
Duplicate of this bug: 1629217
Assignee: nobody → lhansen
Severity: normal → S3
Status: NEW → ASSIGNED
Priority: P3 → P2
Crash Signature: [@ NS_InitXPCOM] → [@ NS_InitXPCOM] [@ xul.dll + 0x403e47]
Crash Signature: [@ NS_InitXPCOM] [@ xul.dll + 0x403e47] → [@ NS_InitXPCOM] [@ xul.dll + 0x403e47] [@ NS_InitXPCOM + 0x1344] [@ NS_InitXPCOM + 0x1397]

Gabriele, what kind of form would the OOM annotation take?

Crash Signature: [@ NS_InitXPCOM] [@ xul.dll + 0x403e47] [@ NS_InitXPCOM + 0x1344] [@ NS_InitXPCOM + 0x1397] → [@ NS_InitXPCOM] [@ xul.dll + 0x403e47] [@ NS_InitXPCOM + 0x1344] [@ NS_InitXPCOM + 0x1397]
Flags: needinfo?(gsvelto)

This is normally done via AnnotateOOMAllocationSize(). So one would need to put a call to that function with the size of the object to be allocated here, and then call it again with the value 0 here. That being said why not just make the allocation infallible?

JS_InitWithFailureDiagnostic() is just passing the error string to MOZ_CRASH_UNSAFE() which is not great because we lose the stack where the issue actually happened. It would be better to just crash whenever we hit an error we know we can't recover from, then we get naturally grouped crash signatures that are easy to read and analyse.

Flags: needinfo?(gsvelto)

Thanks. I was also pondering whether making it infallible would not be best - it's such a small object, nothing's going to work if this fails to be allocated. Will investigate that instead.

If we fail this allocation during initialization we're going to crash very shortly
anyway, so make the allocation infallible - it's small, and crashing here will
preserve the call stack properly.

Pushed by lhansen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7922c8a99821
Make allocation in wasm::Init infallible.  r=rhunt
Status: ASSIGNED → RESOLVED
Closed: 7 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla79

Is this something we should consider uplifting to ESR78?

Flags: needinfo?(lhansen)

(In reply to Ryan VanderMeulen [:RyanVM] from comment #14)

Is this something we should consider uplifting to ESR78?

Yeah, I guess - it would have no impact on functionality but it might make for fewer spurious crash reports, which would probably be helpful.

Flags: needinfo?(lhansen)

Comment on attachment 9158138 [details]
Bug 1547649 - Make allocation in wasm::Init infallible. r?rhunt

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: It would hopefully tag a bunch of crash reports as OOM crashes, thus reducing volume of not-understood crashes.
  • User impact if declined: None? I'm not even sure if OOM crashes are surfaced as such to the user...
  • Fix Landed on Version: 79
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): It adds an OOM annotation and crashes in a controlled way sooner than it would crash otherwise - no attempt was made previously to recover from the OOM, we would just crash later with less information.
  • String or UUID changes made by this patch:
Attachment #9158138 - Flags: approval-mozilla-esr78?

Comment on attachment 9158138 [details]
Bug 1547649 - Make allocation in wasm::Init infallible. r?rhunt

Approved for 78.1esr.

Attachment #9158138 - Flags: approval-mozilla-esr78? → approval-mozilla-esr78+
You need to log in before you can comment on or make changes to this bug.