Closed Bug 1215181 Opened 6 years ago Closed 6 years ago

Large spike in mostly OOM or GC crashes on 2015-10-14 across all desktop channels

Categories

(Firefox :: General, defect)

Not set
normal

Tracking

RESOLVED WORKSFORME
Tracking Status
firefox41 - wontfix
firefox42 + wontfix
firefox43 + wontfix

People

(Reporter: kairo, Unassigned)

Details

(Keywords: crash)

Attachments

(3 files)

On 2015-10-14, we experienced a spike across desktop channels in crashes that have mostly OOM and/or GC signatures.

See the bold rows in https://crash-analysis.mozilla.com/rkaiser/2015-10-14/2015-10-14.firefox.41.explosiveness.html and https://crash-analysis.mozilla.com/rkaiser/2015-10-14/2015-10-14.firefox.42.explosiveness.html

So far I have no idea what's going on. It sounds like it was triggered by a website change, though many of the signatures only show the usual spread of URLs, which are mostly Facebook.
Terrence, given that those are mostly GC OOM, any idea of what could be going on here?
Flags: needinfo?(terrence)
Adding a few of the signatures to this bug
Crash Signature: [@ OOM | unknown | js::CrashAtUnhandlableOOM(char const*) | js::gc::StoreBuffer::MonoTypeBuffer<T>::trace(js::gc::StoreBuffer*, js::TenuringTracer&)] [@ OOM | unknown | js::CrashAtUnhandlableOOM(char const*) | js::TenuringTracer::moveToTenured(JSObject*)…
[Tracking Requested - why for this release]: Based on channel meeting notes, this has caused a spike in crash rates across all channels.
Crash Signature: , JS::Handle<T>)] [@ OOM | small] [@ OOM | unknown | js::CrashAtUnhandlableOOM(char const*) | js::gc::StoreBuffer::MonoTypeBuffer<T>::trace] [@ OOM | unknown | js::CrashAtUnhandlableOOM(char const*) | js::TenuringTracer::moveToTenured] [@ OOM | unknow… → , JS::Handle<T>)] [@ js::gc::GCRuntime::decommitArenas(js::AutoLockGC&)] [@ GlobalMemoryStatusEx] [@ OOM | small] [@ OOM | unknown | js::CrashAtUnhandlableOOM(char const*) | js::gc::StoreBuffer::MonoTypeBuffer<T>::trace] [@ OOM | unknown | js::CrashA…
Comment on bp-78d6eb2b-cbb9-4562-9ed3-b85842151015

"has been happening alot over the last several days. It is only/always on FB and it's one of the static's ... which means it is probably some sort of advertisement. "

There are a number of comments pointing to Facebook but also others. Also, all those crashes only happen on desktop and not Android, and Facebook does have different versions for desktop and mobile. So there is definitely some suspicion that on 10/13 or 10/14 they pushed something to their desktop website that eats JS memory in huge amounts.
Looking at these reports, it looks like they all have plenty of available virtual memory. It looks like more of an allocator issue to me.
I don't think the underlying allocation functions that the GC uses have changed recently, and they're quite good at exhausting virtual memory (better than jemalloc). Could this be physical memory exhaustion? Jonco has been doing a lot of work on the OOM reporting infrastructure recently, but I wouldn't expect that to blow up unless there was an OOM *somewhere*.
Also, this has increased this Wednesday on *release* as well, where we did not do any code changes in the last week.
Tracking since this is a sharp spike in crashes across several releases.
(In reply to Aaron Klotz [:aklotz] (please use needinfo) from comment #5)
> Looking at these reports, it looks like they all have plenty of available
> virtual memory. It looks like more of an allocator issue to me.

What do you consider plenty? I'm mostly seeing <250M address space remaining, which may seem like a lot, but remember that we lose some space to fragmentation and alignment restrictions. Generally anything below 300M means an OOM is imminent. So I think these OOMs are real.

To understand the spike, it may be necessary to dig into about:memory reports. You can search on contains_memory_report=1 to find the crash reports that include an about:memory file.
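As a rough illustration of the rule of thumb above (less than about 300M of free address space means an OOM is imminent), here is a minimal sketch that buckets crash reports by remaining virtual memory. It assumes each report is a dict exposing the free address space in bytes under a key named `AvailableVirtualMemory`; that key name, the helper names, and the sample data are assumptions made for this example, not something confirmed in this bug.

```python
from typing import Iterable, Mapping, Optional

MIB = 1024 * 1024

# Rule of thumb from the comment above: with less than ~300M of address
# space left, fragmentation and alignment make an OOM likely.
OOM_IMMINENT_BYTES = 300 * MIB


def classify_address_space(available_virtual_bytes: Optional[int],
                           threshold: int = OOM_IMMINENT_BYTES) -> str:
    """Return a coarse bucket for the remaining virtual address space."""
    if available_virtual_bytes is None:
        return "unknown"
    if available_virtual_bytes < threshold:
        return "oom-imminent"
    return "headroom"


def summarize(reports: Iterable[Mapping[str, object]]) -> dict:
    """Count reports per bucket.

    The "AvailableVirtualMemory" key is an assumed field name; values may
    be ints or numeric strings, anything else is counted as unknown.
    """
    counts = {"oom-imminent": 0, "headroom": 0, "unknown": 0}
    for report in reports:
        raw = report.get("AvailableVirtualMemory")
        try:
            value = int(raw) if raw is not None else None
        except (TypeError, ValueError):
            value = None
        counts[classify_address_space(value)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample data, not taken from real crash reports.
    sample = [
        {"AvailableVirtualMemory": str(240 * MIB)},
        {"AvailableVirtualMemory": 2048 * MIB},
        {},
    ]
    print(summarize(sample))
```

If most reports for a signature land in the "oom-imminent" bucket, that supports reading them as genuine address space exhaustion rather than an allocator problem.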
The spikes are going down again, so it looks like whatever the underlying issue was, it has already been addressed by that third party.
Yes, sounds like it.
OK, I will stop tracking it in 42 then.
Looks good for 43 and 44 as well. Whew. Untracking.
This is not a top crash anymore based on previous comments. Given that, it is now a wontfix for 41.
Looks like this was a site issue: removing ni.
Flags: needinfo?(terrence)
Attached file during the crash!
To be precise, this happens not simply on Facebook, but only when scrolling (or viewing) FB photo albums.
Maybe related to https://bugzilla.mozilla.org/show_bug.cgi?id=1029671, which was resolved in FF43.

But please do not stop working on this crash, thanks.
> The spikes are going down again, so it looks like whatever the underlying issue was, it has already been
> addressed by that third party.

-> WFM then?
Status: NEW → RESOLVED
Closed: 6 years ago
Component: Untriaged → General
Resolution: --- → WORKSFORME