Crash in AllocInfo::Get<T>
Categories
(Core :: DOM: Core & HTML, defect, P3)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr52 | --- | unaffected |
firefox-esr60 | --- | wontfix |
firefox60 | --- | unaffected |
firefox61 | + | fixed |
firefox62 | - | wontfix |
firefox63 | - | verified disabled |
firefox64 | --- | wontfix |
firefox65 | --- | wontfix |
firefox66 | --- | wontfix |
People
(Reporter: philipp, Unassigned)
References
Details
(Keywords: crash, regression, topcrash, Whiteboard: [geckoview:p3])
Crash Data
Reporter | ||
Comment 1•6 years ago
|
||
Updated•6 years ago
|
Comment 3•6 years ago
|
||
Comment 4•6 years ago
|
||
Comment 5•6 years ago
|
||
Comment 6•6 years ago
|
||
Comment 9•6 years ago
|
||
Comment 10•6 years ago
|
||
Comment 11•6 years ago
|
||
Updated•6 years ago
|
Comment 12•6 years ago
|
||
Updated•6 years ago
|
Updated•6 years ago
|
Reporter | ||
Comment 13•6 years ago
|
||
Comment 14•6 years ago
|
||
Comment 15•6 years ago
|
||
Comment 16•6 years ago
|
||
Comment 17•6 years ago
|
||
Comment 21•6 years ago
|
||
Comment 22•6 years ago
|
||
Comment 23•6 years ago
|
||
Reporter | ||
Comment 24•6 years ago
|
||
Comment 26•6 years ago
|
||
Comment 28•6 years ago
|
||
Updated•6 years ago
|
Comment 30•6 years ago
|
||
Updated•6 years ago
|
Comment 31•6 years ago
|
||
Comment 33•6 years ago
|
||
Comment 34•6 years ago
|
||
Comment 35•6 years ago
|
||
Comment 36•6 years ago
|
||
Comment 37•6 years ago
|
||
Comment 38•6 years ago
|
||
Comment 39•6 years ago
|
||
Comment 40•6 years ago
|
||
Comment 41•6 years ago
|
||
Comment 42•6 years ago
|
||
Comment 44•6 years ago
|
||
Comment 45•6 years ago
|
||
Updated•6 years ago
|
Comment 46•6 years ago
|
||
Comment 47•6 years ago
|
||
Comment 48•6 years ago
|
||
Comment 49•6 years ago
|
||
Comment 50•6 years ago
|
||
Comment 51•6 years ago
|
||
Comment 52•6 years ago
|
||
Updated•6 years ago
|
Comment 55•6 years ago
|
||
Comment 56•6 years ago
|
||
Updated•6 years ago
|
Updated•6 years ago
|
Updated•6 years ago
|
Comment 57•6 years ago
|
||
Comment 58•6 years ago
|
||
Updated•6 years ago
|
Comment 59•6 years ago
|
||
There were 22398 reports in Fennec 61.0, 47 in Fennec 64.0, and only 1 in 66.0a1.
This crash signature is reported on all platforms, but 90% of the reports are from Android.
Assignee | ||
Updated•6 years ago
|
Comment 60•5 years ago
|
||
I stumbled over this bug by accident, and if I query https://crash-stats.mozilla.org/search/?signature=%3DAllocInfo%3A%3AGet%3CT%3E&date=%3E%3D2018-12-04T00%3A00%3A00.000Z&date=%3C2019-12-03T14%3A58%3A00.000Z&_facets=install_time&_facets=version&_facets=address&_facets=moz_crash_reason&_facets=reason&_facets=build_id&_facets=platform_pretty_version&_facets=signature&_facets=useragent_locale&_sort=-date&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=moz_crash_reason&_columns=build_id&_columns=platform_pretty_version#crash-reports, I see crashes in all our products on all platforms.
All of them (well, the ones I clicked on) reported to fail more or less here: https://searchfox.org/mozilla-central/rev/732a4acab44c9ed4199611c15be0ab63c08e7ca2/memory/build/mozjemalloc.cpp#3181
Thus it does not look like a DOM specific failure to me. And it would be helpful to know, what is the intent of the MOZ_DIAGNOSTIC_ASSERT(ret != 0);
: Does it check a potential error in the mozjemalloc implementation itself or is it indicating a misuse (like calling this with an already freed pointer or similar)?
In the latter case probably we should analyze all stack traces and find the source of the dangling pointer case by case. Phew.
Comment 61•5 years ago
|
||
I also had a look and the signature is too generic to be meaningfull; we should add it to Socorro's prefix list so that we can tell different stacks apart. Some crashes have common stacks but a bunch of others don't so this could just be coming from faulty hardware. It happens often with over-generic crash signatures such as this one.
Comment 62•5 years ago
|
||
(In reply to Gabriele Svelto [:gsvelto] from comment #61)
I also had a look and the signature is too generic to be meaningfull; we should add it to Socorro's prefix list so that we can tell different stacks apart. Some crashes have common stacks but a bunch of others don't so this could just be coming from faulty hardware. It happens often with over-generic crash signatures such as this one.
I would want to start from the assumption, that the MOZ_DIAGNOSTIC_ASSERT(ret != 0);
has been put on purpose and wants to tell us something. And if it's a misuse: As we frequently see other UAF issues, it would be helpful to have some (potentially 1000 as of the number of search results) hints on potential UAFs through this assert. But I am speculating.
Comment 63•5 years ago
|
||
(In reply to Jens Stutte [:jstutte] from comment #62)
I would want to start from the assumption, that the
MOZ_DIAGNOSTIC_ASSERT(ret != 0);
has been put on purpose and wants to tell us something. And if it's a misuse: As we frequently see other UAF issues, it would be helpful to have some (potentially 1000 as of the number of search results) hints on potential UAFs through this assert. But I am speculating.
Yeah, I'm not saying this isn't valid, it's possible that there are valid stacks hiding in there. In particular this stack comes up a few times:
0 mozglue.dll AllocInfo::Get<1>(void const*) memory/build/mozjemalloc.cpp:3047
1 mozglue.dll moz_malloc_size_of memory/mozalloc/mozalloc.cpp:128
2 xul.dll unsigned __int64 mozilla::webgl::ShaderValidatorResults::SizeOfIncludingThis(*) dom/canvas/WebGLShaderValidator.cpp:485
3 xul.dll mozilla::WebGLShader::SizeOfIncludingThis(unsigned __int64 (*)(void const*)) dom/canvas/WebGLShader.cpp:302
4 xul.dll static __int64 mozilla::WebGLMemoryTracker::GetShaderSize() dom/canvas/WebGLMemoryTracker.cpp:121
5 xul.dll nsresult mozilla::WebGLMemoryTracker::CollectReports(class nsIHandleReportCallback*, class nsISupports*, bool) dom/canvas/WebGLMemoryTracker.cpp:61
6 xul.dll nsresult mozilla::detail::RunnableFunction<`lambda at z:/task_1564955640/build/src/xpcom/base/nsMemoryReporterManager.cpp:1863:7'>::Run() xpcom/threads/nsThreadUtils.h:564
7 xul.dll nsThread::ProcessNextEvent(bool, bool*) xpcom/threads/nsThread.cpp:1224
8 xul.dll NS_ProcessNextEvent(nsIThread*, bool) xpcom/threads/nsThreadUtils.cpp:486
9 xul.dll mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) ipc/glue/MessagePump.cpp:88
10 xul.dll MessageLoop::RunHandler() ipc/chromium/src/base/message_loop.cc:308
This is also the only stack associated with a crash of a machine that has ECC memory and thus not susceptible to memory errors.
For other crashes with weird stacks and older versions it's possible given the low volume that these are bit-flips in memory being caught by the assertion since none of the reports come from machines with ECC memory. We're loading the mapbits
data from memory here; since the assertion is checking against 0 (after masking) it's possible for stuck bits or bit flips to trigger it.
Comment 64•5 years ago
|
||
I filed bug 1600951 to disambiguate the crashes; once it lands the crash volume here will go to zero as Socorro generates new more specific signatures. It should be sufficient to triage some crashes with a signature starting with AllocInfo::Get<T>
to find out a new more relevant signature.
Comment 65•5 years ago
|
||
Does it check a potential error in the mozjemalloc implementation itself or is it indicating a misuse (like calling this with an already freed pointer or similar)?
The latter.
Comment 66•5 years ago
|
||
Thanks :glandium, clearing needinfo to Eric then.
:gsvelto: Thanks for de mangling this! Can we easily apply this prefix filter also on already existing reports for the last 6 months?
Comment 67•5 years ago
|
||
(In reply to Jens Stutte [:jstutte] from comment #66)
:gsvelto: Thanks for de mangling this! Can we easily apply this prefix filter also on already existing reports for the last 6 months?
Yes, I think we can reprocess the crashes. I'll see if I can do it.
Comment 68•5 years ago
|
||
OK, I can reprocess all the crashes with this signature (and similar ones) but I'll have to wait until bug 1600951 reaches production.
Comment 69•5 years ago
|
||
Looks like we'll also need bug 1601223.
Comment 70•5 years ago
|
||
AllocInfo::Get<T>
is now in the prefix list, time to reprocess crashes.
Comment 71•5 years ago
|
||
... and it's done. I reprocessed all the crashes from the past 6 months so the actual results might take a while to show up on crash-stats, but tomorrow we can probably start looking for new signatures.
Comment 72•5 years ago
|
||
Reprocessing the crashes paid off handsomely: the vast majority of the resulting signatures affect Firefox & Thunderbird version 60 and older so we don't care about those. The stack I put in comment 63 now has his own signature and looks like a real problem affecting both Firefox 72 & 73. I'm closing this bug since this signature is no more and opening a new one for that.
Comment 73•5 years ago
|
||
Closing this, we can follow up with the new signature is in bug 1603041.
Updated•3 years ago
|
Description
•