Firefox crashes on a thread doing LZ4 compression in unoptimized debug build
Categories
(Toolkit :: Startup and Profile System, defect)
People
(Reporter: TYLin, Unassigned)
Details
I noticed that in my Linux environment, an unoptimized debug build of Firefox starts crashing after being open for about a minute. Removing the objdir and rebuilding doesn't fix the issue.
The rr call stack shows the following:
Thread 5 received signal SIGSEGV, Segmentation fault.
[Switching to Thread 4452.4588]
__memset_avx2_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:141
141 ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such file or directory.
(rr) bt
#0 0x00007fe21eb41f2d in __memset_avx2_erms () at ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:141
#1 0x000055812f386cae in LZ4_streamHC_t_alignment () at /home/aethanyc/Projects/gecko/mfbt/lz4/lz4hc.c:830
#2 0x000055812f3869d6 in LZ4_initStreamHC (buffer=0x7fe1ec52a000, size=262200) at /home/aethanyc/Projects/gecko/mfbt/lz4/lz4hc.c:917
#3 0x000055812f386c1d in LZ4_createStreamHC () at /home/aethanyc/Projects/gecko/mfbt/lz4/lz4hc.c:896
#4 0x000055812f38248d in LZ4F_compressBegin_usingCDict (cctxPtr=0x7fe1ece7e040, dstBuffer=0x7fe1ec47e000, dstCapacity=262156, cdict=0x0, preferencesPtr=0x7fe1f95f8878) at /home/aethanyc/Projects/gecko/mfbt/lz4/lz4frame.c:621
#5 0x000055812f3836fd in LZ4F_compressBegin (cctxPtr=0x7fe1ece7e040, dstBuffer=0x7fe1ec47e000, dstCapacity=262156, preferencesPtr=0x7fe1f95f8878) at /home/aethanyc/Projects/gecko/mfbt/lz4/lz4frame.c:715
#6 0x000055812f3aa66a in mozilla::Compression::LZ4FrameCompressionContext::BeginCompressing(mozilla::Span<char, 18446744073709551615ul>) (this=0x7fe1f95f8bc8, aWriteBuffer=...) at /home/aethanyc/Projects/gecko/mfbt/Compression.cpp:126
#7 0x00007fe20f7211c1 in mozilla::scache::StartupCache::WriteToDisk() (this=0x7fe21e756c40) at /home/aethanyc/Projects/gecko/startupcache/StartupCache.cpp:531
#8 0x00007fe20f723bb2 in mozilla::scache::StartupCache::ThreadedWrite(void*) (aClosure=0x7fe21e756c40) at /home/aethanyc/Projects/gecko/startupcache/StartupCache.cpp:654
#9 0x00007fe21fdd55ea in _pt_root (arg=0x7fe1ece25ee0) at /home/aethanyc/Projects/gecko/nsprpub/pr/src/pthreads/ptthread.c:201
#10 0x00007fe21f8ee6db in start_thread (arg=0x7fe1f95f9700) at pthread_create.c:463
#11 0x00007fe21ead488f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(rr) f 1
#1 0x000055812f386cae in LZ4_streamHC_t_alignment () at /home/aethanyc/Projects/gecko/mfbt/lz4/lz4hc.c:830
830 struct { char c; LZ4_streamHC_t t; } t_a;
My mozconfig contains the following (I guess --disable-optimize is the key; otherwise our CI should have detected it):
ac_add_options --enable-debug
ac_add_options --disable-optimize
I can reproduce the crash more quickly by changing the number 60000 to 1000 in https://searchfox.org/mozilla-central/rev/7cc0f0e89cb40e43bf5c96906f13d44705401042/startupcache/StartupCache.cpp#740,756
I disable optimization in my debug builds to get accurate information in rr and gdb, so this may affect the day-to-day work of others who do the same.
This may be related to bug 1550108. Doug, could you take a look?
Comment 1•5 years ago
That is a very strange stack. I'll try to reproduce on my end, but in the meantime could you print the disassembly of LZ4_streamHC_t_alignment?
Reporter
Comment 2•5 years ago
Sure.
(rr) f 1
#1 0x000055c89924bcae in LZ4_streamHC_t_alignment () at /home/tlin/Projects/gecko/mfbt/lz4/lz4hc.c:830
830 struct { char c; LZ4_streamHC_t t; } t_a;
(rr) disassemble
Dump of assembler code for function LZ4_streamHC_t_alignment:
0x000055c89924bc80 <+0>: push %rbp
0x000055c89924bc81 <+1>: mov %rsp,%rbp
0x000055c89924bc84 <+4>: sub $0x40050,%rsp
0x000055c89924bc8b <+11>: mov %fs:0x28,%rax
0x000055c89924bc94 <+20>: mov %rax,-0x8(%rbp)
0x000055c89924bc98 <+24>: lea -0x40048(%rbp),%rdi
0x000055c89924bc9f <+31>: mov $0xaa,%esi
0x000055c89924bca4 <+36>: mov $0x40040,%edx
0x000055c89924bca9 <+41>: callq 0x55c89928ba80 <memset@plt>
=> 0x000055c89924bcae <+46>: mov %fs:0x28,%rdx
0x000055c89924bcb7 <+55>: mov -0x8(%rbp),%rdi
0x000055c89924bcbb <+59>: cmp %rdi,%rdx
0x000055c89924bcbe <+62>: mov %rax,-0x40050(%rbp)
0x000055c89924bcc5 <+69>: jne 0x55c89924bcd9 <LZ4_streamHC_t_alignment+89>
0x000055c89924bccb <+75>: mov $0x8,%eax
0x000055c89924bcd0 <+80>: add $0x40050,%rsp
0x000055c89924bcd7 <+87>: pop %rbp
0x000055c89924bcd8 <+88>: retq
0x000055c89924bcd9 <+89>: callq 0x55c89928ba50 <__stack_chk_fail@plt>
End of assembler dump.
Comment 3•5 years ago
I don't know why it's calling memset on 0x40040 bytes of the stack just to compute the offset of a member of a struct - I assume for some kind of instrumented sanity check. But in any case, this looks like it's just a dupe of bug 1550108.
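For illustration, here is a minimal sketch of the alignment idiom visible in frame 1. It assumes a hypothetical stand-in type `big_stream_t` of 262200 bytes (the `size` passed to LZ4_initStreamHC in the backtrace) with pointer alignment; the real LZ4_streamHC_t layout is not reproduced here. In an unoptimized build the local `t_a` is actually materialized, so the function needs 8 + 262200 = 262208 = 0x40040 bytes of stack, which matches the memset length in the disassembly. The 0xaa fill resembles what Clang's `-ftrivial-auto-var-init=pattern` emits, but that is an assumption and is not confirmed anywhere in this bug. The `offsetof` variant is shown only as a way to get the same value without a large local.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for LZ4_streamHC_t: 262200 bytes, pointer-aligned.
   The size comes from the LZ4_initStreamHC frame; the layout is assumed. */
typedef union {
    char bytes[262200];
    void *align;
} big_stream_t;

/* Same shape as the local struct shown at lz4hc.c:830. */
typedef struct { char c; big_stream_t t; } probe_t;

/* Idiom from frame 1: an unoptimized build reserves sizeof(probe_t)
   bytes of stack for t_a even though only its size is used. */
size_t alignment_via_local(void)
{
    probe_t t_a;
    return sizeof(t_a) - sizeof(t_a.t);
}

/* Alternative that needs no local object at all. */
size_t alignment_via_offsetof(void)
{
    return offsetof(probe_t, t);
}

/* Ties the sketch to the numbers in the disassembly; holds on typical
   x86-64 Linux, where pointers are 8 bytes. */
void check_probe_layout(void)
{
    assert(sizeof(probe_t) == 0x40040);
    assert(alignment_via_local() == alignment_via_offsetof());
}
```

Both functions return 8 under these assumptions, the same value the disassembly loads into %eax before returning.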
Reporter
Comment 4•5 years ago
Do you mean to dupe this to bug 1587107? I just applied https://phabricator.services.mozilla.com/D48570, and it does help.
Comment 5•5 years ago
Woops! Yes.