Some uses of MOZ_CRASH_ANNOTATE get optimized out in release builds
Categories
(Toolkit :: Crash Reporting, defect)
Tracking
()
| Tracking | Status | |
|---|---|---|
| firefox137 | --- | fixed |
People
(Reporter: yannis, Assigned: sergesanspaille)
References
Details
(Keywords: perf-alert)
Attachments
(2 files, 2 obsolete files)
:padenot mentioned that the crash reason annotation was missing while working on crashes in mozilla::AudioChunk::ChannelDataForWrite<T> (example report). The crash goes through mozilla::detail::InvalidArrayIndex_CRASH:
MFBT_API MOZ_NORETURN MOZ_COLD void mozilla::detail::InvalidArrayIndex_CRASH(
size_t aIndex, size_t aLength) {
MOZ_CRASH_UNSAFE_PRINTF("ElementAt(aIndex = %zu, aLength = %zu)", aIndex,
aLength);
}
MOZ_CRASH_UNSAFE_PRINTF resolves to MOZ_Crash(__FILE__, __LINE__, MOZ_CrashPrintf("" format, __VA_ARGS__)); where:
static MOZ_ALWAYS_INLINE_EVEN_DEBUG MOZ_COLD MOZ_NORETURN void MOZ_Crash(
const char* aFilename, int aLine, const char* aReason) {
MOZ_FUZZING_HANDLE_CRASH_EVENT4("MOZ_CRASH", aFilename, aLine, aReason);
#if defined(DEBUG) || defined(FUZZING)
MOZ_ReportCrash(aReason, aFilename, aLine);
#endif
MOZ_CRASH_ANNOTATE(aReason);
MOZ_REALLY_CRASH(aLine);
}
Looking at the binaries for release build 134.0.2 (and possibly others) shows that the call to MOZ_Crash gets inlined and the inner call to MOZ_CRASH_ANNOTATE(aReason); gets optimized out (or is considered a no-op), both on Windows x64 and Android aarch64:
mozglue!mozilla::detail::InvalidArrayIndex_CRASH:
sub rsp, 28h
mov r8, aLength (rdx)
mov rdx, aIndex (rcx)
lea rcx, [mozglue!... (7ffae87cfa5d)]
call mozglue!MOZ_CrashPrintf (7ffae87c5f20)
// start of MOZ_REALLY_CRASH(0x33)
int 3
mov ecx, 33h
call mozglue!MOZ_NoReturn (7ffae878e797)
mozilla::detail::InvalidArrayIndex_CRASH
str x30, [sp, #-0x10]!
mov x2, x1
mov x1, x0
nop
adr x0, s_ElementAtString
bl MOZ_CrashPrintf
// start of MOZ_REALLY_CRASH(0x33)
mov x8, xzr
movz w9, #0x33
str w9, [x8]
bl abort
We probably want to annotate gMozCrashReason as volatile? I'm not sure if that will be enough.
| Assignee | ||
Updated•1 year ago
|
Comment 1•1 year ago
|
||
jstutte mentioned this bug in relation to bug 1945246. They seem similar but not the same, in that in the other bug we're crashing on the assignment to gMozCrashReason instead of skipping it entirely.
| Assignee | ||
Comment 2•1 year ago
•
|
||
The asm statement forces the load of the global, so that a store
happens and the reason appears as live to the compiler.
Comment 3•1 year ago
|
||
I filed a bug about the same basic problem in 2023, bug 1858670. Maybe that could be duped here or something.
Updated•1 year ago
|
Updated•1 year ago
|
Updated•1 year ago
|
| Assignee | ||
Comment 4•1 year ago
|
||
Updated•1 year ago
|
Comment 5•1 year ago
|
||
The difference it makes is that it makes m mfbt inherit the
LIBRARY_DEFINES from mozglue, which were intended to propagate
there.
Comment 6•1 year ago
|
||
The difference it makes is that it makes m mfbt inherit the
LIBRARY_DEFINES from mozglue, which were intended to propagate
there.
Updated•1 year ago
|
Updated•1 year ago
|
| Assignee | ||
Comment 8•1 year ago
|
||
Mike, I checked if at that point https://searchfox.org/mozilla-central/source/mfbt/Assertions.cpp#48 we had (defined(MOZ_HAS_MOZGLUE) || defined(MOZILLA_INTERNAL_API)) and it's still false, while it should be true. Can you see why your patch didn't do the trick?
Comment 9•1 year ago
|
||
| bugherder | ||
https://hg.mozilla.org/mozilla-central/rev/9a6f61aab352
https://hg.mozilla.org/mozilla-central/rev/4bf33527613e
| Assignee | ||
Comment 10•1 year ago
|
||
My bad, as Mike pointed out in a conversation we had, Assertions.cpp is compiled several times across the codebase, some callers don't need the extra annotation support. We seem to be interested by the one in Firefox and it's fine:
% objdump --disassemble=_ZN7mozilla6detail23InvalidArrayIndex_CRASHEmm obj-x86_64-pc-linux-gnu/dist/bin/firefox
obj-x86_64-pc-linux-gnu/dist/bin/firefox: file format elf64-x86-64
Disassembly of section .text:
0000000000051422 <_ZN7mozilla6detail23InvalidArrayIndex_CRASHEmm>:
51422: 55 push %rbp
51423: 48 89 e5 mov %rsp,%rbp
51426: 53 push %rbx
51427: 50 push %rax
51428: 48 89 f2 mov %rsi,%rdx
5142b: 48 89 fe mov %rdi,%rsi
5142e: 48 8d 3d 1b e3 fb ff lea -0x41ce5(%rip),%rdi # f750 <_ZZN7mozilla3dmd19ToIdStringConverter6Base32EjE6digits+0x220>
51435: 31 db xor %ebx,%ebx
51437: 31 c0 xor %eax,%eax
51439: e8 72 fe ff ff call 512b0 <MOZ_CrashPrintf>
5143e: 48 8d 05 fb 77 06 00 lea 0x677fb(%rip),%rax # b8c40 <_ZL18sPrintfCrashReason>
51445: 48 8d 0d e4 77 06 00 lea 0x677e4(%rip),%rcx # b8c30 <gMozCrashReason>
5144c: 48 89 01 mov %rax,(%rcx)
5144f: b8 33 00 00 00 mov $0x33,%eax
51454: 48 89 03 mov %rax,(%rbx)
51457: e8 74 91 ff ff call 4a5d0 <abort>
Comment 11•1 year ago
•
|
||
(In reply to Sandor Molnar[:smolnar] from comment #9)
https://hg.mozilla.org/mozilla-central/rev/9a6f61aab352
https://hg.mozilla.org/mozilla-central/rev/4bf33527613e
Perfherder has detected a browsertime performance change from push 4bf33527613eb08cac37f891c362ecba9e15c718.
Improvements:
| Ratio | Test | Platform | Options | Absolute values (old vs new) | Performance Profiles |
|---|---|---|---|---|---|
| 7% | speedometer3 TodoMVC-jQuery/CompletingAllItems/total | linux1804-64-nightlyasrelease-qr | fission webrender | 167.84 -> 156.54 | Before/After |
| 5% | speedometer3 TodoMVC-jQuery/CompletingAllItems/total | linux1804-64-nightlyasrelease-qr | fission webrender | 165.00 -> 156.50 | |
| 4% | speedometer3 TodoMVC-JavaScript-ES6-Webpack-Complex-DOM/Adding100Items/Sync | linux1804-64-nightlyasrelease-qr | fission webrender | 50.62 -> 48.37 | Before/After |
| 4% | speedometer jQuery-TodoMVC | linux1804-64-nightlyasrelease-qr | fission webrender | 351.24 -> 336.60 | |
| 4% | speedometer3 TodoMVC-JavaScript-ES6-Webpack-Complex-DOM/Adding100Items/total | linux1804-64-nightlyasrelease-qr | fission webrender | 57.05 -> 54.70 | Before/After |
| ... | ... | ... | ... | ... | ... |
| 3% | speedometer3 TodoMVC-JavaScript-ES5/total | linux1804-64-nightlyasrelease-qr | fission webrender | 104.94 -> 102.11 | Before/After |
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.
If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.
You can run these tests on try with ./mach try perf --alert 44075
For more information on performance sheriffing please see our FAQ.
Comment 12•1 year ago
•
|
||
How did these patches improve some subtests on SP3?
(https://treeherder.mozilla.org/perfherder/alerts?id=44075&hideDwnToInv=0 also looks to be becuase of this bug, but still being confirmed)
| Assignee | ||
Comment 13•1 year ago
|
||
I've observed that this patch changes the inlining decision for several functions: some functions are now inlined, some are no longer etc. As a result the final binary is actually smaller (but looks like its also slower)...
Comment 14•1 year ago
|
||
The alerts say these were improvements. And is it possible to deliberately do more of what this bug did accidentally/incidentally?
Description
•