Open Bug 1955282 Opened 7 months ago Updated 4 months ago

Crash in [@ std::basic_filebuf<T>::underflow] on Intel Raptor Lake CPUs

Categories

(Toolkit :: Telemetry, defect, P4)

Tracking

()

People

(Reporter: gsvelto, Unassigned)

References

(Blocks 2 open bugs)

Details

(Keywords: crash)

Crash Data

Crash report: https://crash-stats.mozilla.org/report/index/0244e5fd-d4c8-44c0-b593-136bf0250320

Reason:

SIGSEGV / SEGV_ACCERR

Top 10 frames:

0  ?  @0x00007bf6dc9e0030
1  libstdc++.so.6  std::basic_filebuf<char, std::char_traits<char> >::underflow()  /build/gcc-12-ALHxjy/gcc-12-12.3.0/build/x86_64-linux-gnu/libstdc++-v3/include/bits/fstream.tcc:355
2  libstdc++.so.6  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>...  /build/gcc-12-ALHxjy/gcc-12-12.3.0/build/x86_64-linux-gnu/libstdc++-v3/include/bits/basic_string.h:1430
2  libstdc++.so.6  std::getline<char, std::char_traits<char>, std::allocator<char> >(std::basic_...  /build/gcc-12-ALHxjy/gcc-12-12.3.0/src/libstdc++-v3/src/c++98/istream-string.cc:161
3  libxul.so  mozilla::GetMemoryMappings(nsTArray<mozilla::MemoryMapping>&, int)  /build/firefox/parts/firefox/build/xpcom/base/MemoryMapping.cpp:134
4  libxul.so  GetProcSelfSmapsPrivate(long*, int)  /build/firefox/parts/firefox/build/xpcom/base/nsMemoryReporterManager.cpp:104
4  libxul.so  ResidentUniqueDistinguishedAmount(long*, int)  /build/firefox/parts/firefox/build/xpcom/base/nsMemoryReporterManager.cpp:131
5  libxul.so  mozilla::MemoryTelemetry::GatherReports(std::function<void ()> const&)::$_1::...  /build/firefox/parts/firefox/build/xpcom/base/MemoryTelemetry.cpp:421
5  libxul.so  mozilla::detail::RunnableFunction<mozilla::MemoryTelemetry::GatherReports(std...  /build/firefox/parts/firefox/build/xpcom/threads/nsThreadUtils.h:548
6  libxul.so  nsThreadPool::Run()  /build/firefox/parts/firefox/build/xpcom/threads/nsThreadPool.cpp:456

While this crash originates in telemetry code, it's highly unlikely we can do anything about it: only users with Intel Raptor Lake CPUs appear to be affected, so this is most likely caused by a CPU bug. All microcode versions appear affected, including the most recent version available (0x12c).

This is Linux-only, judging from GetProcSelfSmapsPrivate in the stack. Maybe we should just not report residentunique on Linux? We already don't on macOS because it's slow (bug 1779138).
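For readers unfamiliar with what this code path does: the stack shows GetMemoryMappings reading a file line by line with std::getline, which on Linux is the per-process smaps file. A minimal sketch of that kind of computation follows, assuming the resident-unique figure is approximated by summing the Private_Clean and Private_Dirty fields. SumPrivateBytes is a hypothetical helper written for illustration, not the Firefox implementation.

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper, not the Firefox code: sums the Private_Clean and
// Private_Dirty fields of smaps-formatted text and returns the total in bytes.
std::int64_t SumPrivateBytes(std::istream& aSmaps) {
  std::int64_t totalKb = 0;
  std::string line;
  while (std::getline(aSmaps, line)) {
    // Field lines look like "Private_Dirty:        12 kB"; skip all others,
    // including the mapping header lines.
    if (line.rfind("Private_Clean:", 0) != 0 &&
        line.rfind("Private_Dirty:", 0) != 0) {
      continue;
    }
    // Parse the number that follows the colon; the kernel reports it in kB.
    std::istringstream fields(line.substr(line.find(':') + 1));
    std::int64_t kb = 0;
    if (fields >> kb) {
      totalKb += kb;
    }
  }
  return totalKb * 1024;
}
```

Passing a std::ifstream opened on /proc/self/smaps would exercise the same std::basic_filebuf::underflow path that appears in frame 1 of the crash.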

ni?pbone based on his knowledgeable comments in bug 1779138: should we skip reporting residentunique on Linux to avoid this (alleged, but likely) CPU bug?

Flags: needinfo?(pbone)

It could be a CPU bug as gsvelto suggests, because it's faulting on a stack access (the first access on a new page, too). That looks like a small stack, though, albeit with one larger object on it (the array of mappings).

Assignee: nobody → pbone
Status: NEW → ASSIGNED
Flags: needinfo?(pbone)

Confirming that this is most likely a CPU bug, and it affects even the latest version of Intel microcode (which seemed to have fixed other problems such as bug 1950764).

So should we skip reporting residentunique on Linux?

Flags: needinfo?(pbone)

My workaround didn't work.

> So should we skip reporting residentunique on Linux?

Maybe? There are 148 crashes in the last 3 months; I don't feel like that's high. I also don't know how likely it is that Intel will fix their CPUs, which would be the ideal solution. I'll ask in #memshrink if anyone is using resident unique. If we remove it, I propose to keep it in about:memory.

Assignee: pbone → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(pbone)

They fixed bug 1950764 so maybe they're more proactive than in the past.

Severity: -- → S4
Priority: -- → P4