Open Bug 1473892 Opened Last year Updated 9 months ago

Crash in OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash | js::gc::StoreBuffer::WholeCellBuffer::allocateCellSet

Categories

(Core :: JavaScript: GC, defect, P3, critical)

x86
Windows 7
defect

Tracking

Tracking Status
firefox-esr52 --- unaffected
firefox-esr60 --- unaffected
firefox61 --- affected
firefox62 --- affected
firefox63 --- affected

People

(Reporter: marcia, Unassigned)

References

Details

(Keywords: crash)

Crash Data

This bug was filed from the Socorro interface and is
report bp-d6a573a5-8c73-4bc4-8143-db8680180706.
=============================================================

Seen while looking at crash stats: https://bit.ly/2KUtkEn. These crashes started in 61, and it appears as if some code was touched in Bug 1447385. 1318 crashes in 61.0. It is visible in the 62 betas but in fairly low volume.

Top 10 frames of crashing thread:

0 xul.dll js::AutoEnterOOMUnsafeRegion::crash js/src/vm/JSContext.cpp:1587
1 xul.dll js::gc::StoreBuffer::WholeCellBuffer::allocateCellSet js/src/gc/StoreBuffer.cpp:140
2 xul.dll js::gc::StoreBuffer::WholeCellBuffer::put js/src/gc/StoreBuffer-inl.h:75
3 xul.dll js::jit::PostWriteElementBarrier<0> js/src/jit/VMFunctions.cpp:736
4 xul.dll static js::jit::EnterJitStatus EnterJit js/src/jit/Jit.cpp:99
5 xul.dll js::jit::MaybeEnterJit js/src/jit/Jit.cpp:163
6 xul.dll static bool Interpret js/src/vm/Interpreter.cpp:3144
7 xul.dll js::RunScript js/src/vm/Interpreter.cpp:417
8 xul.dll js::InternalCallOrConstruct js/src/vm/Interpreter.cpp:489
9 xul.dll js::Call js/src/vm/Interpreter.cpp:535

=============================================================
Hey Jon, this is currently almost 3% of our content process crashes on Fx61. Any idea what might be going on here?
Flags: needinfo?(jcoppeard)
A quick scan of most of the reports shows that the predominant locale affected is fr. https://www.sfr.fr/ is mentioned in some of the comments.
Tentatively marking this as blocking bug 1447385, which would fit into the regression range.
(In reply to Ryan VanderMeulen [:RyanVM] from comment #1)
Bug 1447385 changed how we allocate buffers for the whole cell store buffer.  It doesn't actually allocate any more memory than before though, so my suspicion is that this has just shifted the OOMs around.

Can we tell if the overall OOM rate increased with this change?
Flags: needinfo?(jcoppeard)
You're right, the same issue apparently showed up as an [@ OOM | small] crash before this change, so it's not a real regression.

There is also a visible spike in [@ OOM | small] content crashes for French builds starting on July 3rd, so I suspect the SFR webmail site made some changes around then that introduced the instability/bug:
https://crash-stats.mozilla.com/signature/?product=Firefox&useragent_locale=fr&process_type=%3Dcontent&signature=OOM%20%7C%20small&date=%3E%3D2018-05-01#graphs

Maybe we could reach out to that third-party website?
Keywords: regression
Maybe Adam can help with reaching out to SFR webmail?
Flags: needinfo?(astevenson)
Reaching out to a contact we have from a webcompat issue & trying some new contacts on LinkedIn.
Flags: needinfo?(astevenson)
We now have a contact for this issue, and I've included Andrew on the email chain.

Hi Adam,

Did anything come out of the e-mail discussions?

Ideally we'd have something we can reproduce.

Flags: needinfo?(astevenson)
Priority: -- → P3

Hey Paul. They didn't notice complaints from users about it.

From my view we did not have any user complaint about this issue.
We haven’t make some changes around the July 3rd.

They did offer to help in any way they could though. I forwarded you the two email threads that came to us. Please let me know if you'd like me to bring you into the discussion.

Flags: needinfo?(astevenson)

Thanks Adam,

I don't think we need to follow up with SFR since they didn't have any clues at the time.

The frequency of crashes has also dropped.

I'm not sure what to say about the crashes themselves. It's a case where jemalloc fails to allocate memory needed for the StoreBuffer, but it's unclear why. AFAIK virtual and physical memory are both available.
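
For anyone reading along, here is a rough sketch of the failure mode, not the actual SpiderMonkey code. It assumes a buffer that allocates a small per-arena set on demand and, because the caller cannot recover from failure, crashes deliberately with a reason string when the allocator returns null. All names here (WholeCellBufferSketch, CellSet, AllocFn, CrashFn, abortOnOom) are hypothetical stand-ins, and the allocator hook is assumed to return malloc-compatible memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical allocator hook: returns nullptr on failure, like malloc.
using AllocFn = void* (*)(std::size_t);
// Hypothetical crash hook: receives the reason for the deliberate crash.
using CrashFn = void (*)(const char* reason);

// Default crash hook: abort, as a deliberate OOM crash would.
inline void abortOnOom(const char*) { std::abort(); }

// Stand-in for a per-arena set of cells, linked into a list.
struct CellSet {
  explicit CellSet(CellSet* n) : next(n) {}
  CellSet* next;
};

class WholeCellBufferSketch {
 public:
  explicit WholeCellBufferSketch(AllocFn alloc, CrashFn crash = abortOnOom)
      : alloc_(alloc), crash_(crash) {
    assert(alloc_ && crash_);
  }
  WholeCellBufferSketch(const WholeCellBufferSketch&) = delete;
  WholeCellBufferSketch& operator=(const WholeCellBufferSketch&) = delete;

  // Assumes the allocator hook hands out memory that std::free can release.
  ~WholeCellBufferSketch() {
    while (head_) {
      CellSet* next = head_->next;
      head_->~CellSet();
      std::free(head_);
      head_ = next;
    }
  }

  // Allocates a new set and links it at the head of the list. If the
  // allocator fails, the crash hook is called: there is no fallback path.
  CellSet* allocateCellSet() {
    void* mem = alloc_(sizeof(CellSet));
    if (!mem) {
      crash_("Failed to allocate CellSet");
      return nullptr;  // only reached if the crash hook returns
    }
    CellSet* set = new (mem) CellSet(head_);
    head_ = set;
    ++count_;
    return set;
  }

  std::size_t count() const { return count_; }

 private:
  AllocFn alloc_;
  CrashFn crash_;
  CellSet* head_ = nullptr;
  std::size_t count_ = 0;
};
```

The point of the sketch is that the crash signature comes from wherever the failed allocation is detected, so moving the allocation changes the signature without changing how much memory is requested.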

This crash is useful because it includes the memory report:

https://crash-stats.mozilla.com/report/index/78aac780-8d9f-4811-b885-09b700190114

Could be a symptom of Bug 1520366

Depends on: 1520366