Thunderbird built with rustc-1.68.0 and LLVM-16 segfaults on startup
Categories
(Thunderbird :: Untriaged, defect)
Tracking
(Not tracked)
People
(Reporter: renodr, Unassigned)
Details
(Whiteboard: [closeme 2023-10-01])
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Firefox/91.0
Steps to reproduce:
Build Thunderbird with rustc-1.68.0 and LLVM-16. The configuration used for Linux From Scratch can be found here: https://linuxfromscratch.org/blfs/view/systemd/xsoft/thunderbird.html
Note that building with LLVM-16 also requires a patch for the rust-bindgen crate, which can be grabbed from: https://linuxfromscratch.org/patches/blfs/svn/firefox-102.9.0-upstream_fixes-1.patch (it applies cleanly to the tree and fixes the build error there). The more "proper" way to handle this would be to update the rust-bindgen crate, though that isn't trivial.
Our ticket for this issue at BLFS can be found at https://wiki.linuxfromscratch.org/blfs/ticket/17794, where I've been documenting my various attempts to debug this issue.
Actual results:
When building Thunderbird-102.9.0 with rustc-1.68.0, the application immediately crashes upon startup. After checking 'ldd' on the rustc command, I've verified that it's using libLLVM-16.so. If I use rustc-1.67.1 (which uses LLVM-15 on a Linux From Scratch system), it builds and functions properly.
In terms of console output, I get:
[ImapModuleLoader] Using nsImapService.cpp
[NntpModuleLoader] Using NntpService.jsm
[Pop3ModuleLoader] Using Pop3Service.jsm
ATTENTION: default value of option mesa_glthread overridden by environment.
ATTENTION: default value of option mesa_glthread overridden by environment.
ATTENTION: default value of option mesa_glthread overridden by environment.
ATTENTION: default value of option mesa_glthread overridden by environment.
[calBackendLoader] Using Thunderbird's ical.js backend
Segmentation fault (core dumped)
And for a backtrace, I get:
(gdb) bt full
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=11, no_tid=no_tid@entry=0) at pthread_kill.c:44
tid = <optimized out>
ret = 0
pd = <optimized out>
old_mask = {__val = {0}}
ret = <optimized out>
#1 0x00007f7e8687c0ff in __pthread_kill_internal (signo=11, threadid=<optimized out>) at pthread_kill.c:78
#2 0x00007f7e8682e462 in __GI_raise (sig=11) at ../sysdeps/posix/raise.c:26
ret = <optimized out>
#3 0x00007f7e80b06b4c in () at /usr/lib/thunderbird/libxul.so
#4 0x00007f7e815ca9a7 in () at /usr/lib/thunderbird/libxul.so
#5 0x00007f7e8682e500 in <signal handler called> () at /usr/lib/libc.so.6
#6 0x00007f7e81d17fb6 in () at /usr/lib/thunderbird/libxul.so
#7 0x00007f7e81863586 in () at /usr/lib/thunderbird/libxul.so
#8 0x00007f7e81862e4c in () at /usr/lib/thunderbird/libxul.so
#9 0x0000000000000000 in ()
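A note on reading the backtrace above: in gdb output, the frame marked `<signal handler called>` (#5 here) separates the handler's own frames from the interrupted code, so frames #0 to #4 are the handler re-raising the signal and frame #6 is where the fault happened. The following is a minimal Python sketch of that reading. The helper name `faulting_frame` and the regular expression are illustrative assumptions, not part of gdb or of any Mozilla tooling.

```python
import re

# Matches gdb frame lines such as "#6 0x00007f7e81d17fb6 in () at /usr/lib/...".
# The address part is optional because frame #0 is printed without one.
FRAME_RE = re.compile(r"^#(\d+)\s+(?:(0x[0-9a-fA-F]+) in )?(.*)$")


def faulting_frame(backtrace: str):
    """Return (frame_number, address, rest) for the first frame that follows
    the '<signal handler called>' marker, or None if there is no marker."""
    seen_handler = False
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line.strip())
        if not match:
            continue  # skip locals such as "ret = <optimized out>"
        if seen_handler:
            return int(match.group(1)), match.group(2), match.group(3)
        if "<signal handler called>" in match.group(3):
            seen_handler = True
    return None


# Frames copied from the backtrace in this report.
SAMPLE = """
#4 0x00007f7e815ca9a7 in () at /usr/lib/thunderbird/libxul.so
#5 0x00007f7e8682e500 in <signal handler called> () at /usr/lib/libc.so.6
#6 0x00007f7e81d17fb6 in () at /usr/lib/thunderbird/libxul.so
#7 0x00007f7e81863586 in () at /usr/lib/thunderbird/libxul.so
"""

print(faulting_frame(SAMPLE))
```

The address of that frame, rather than the addresses of frames #0 to #4, is the one to look up in libxul.so once symbols are available.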
To get more information, I attempted a build with debugging symbols enabled. That build failed with the following output when it tried to compile gkrust:
27:10.54 error: Cannot represent a difference across sections
27:32.84 error: could not compile gkrust due to previous error
I then attempted a build with stripping turned off in the hope of getting more information, but that did not yield anything useful either.
My colleagues and I checked whether there were any differences between the Mozilla code in Firefox-102.9.0esr and what Thunderbird is using, and we weren't able to find any, so we think the problem lies in the Thunderbird-specific code. We've been able to do some rudimentary debugging, but haven't been able to go as in-depth as we would like because we cannot get a debug build to complete.
One of my colleagues, Xi Ruoyao, was able to narrow the crash down to style::custom_properties::CustomPropertiesBuilder::build::he418231f9106fe2e (or _ZN5style17custom_properties23CustomPropertiesBuilder5build17h), and the instruction sequence at the crash is:
0x00007ffff0445706: mov (%r14),%rax
0x00007ffff0445709: cmp $0xffffffffffffffff,%rax
0x00007ffff044570d: je 0x7ffff0445912
0x00007ffff0445713: lock incq (%r14)
0x00007ffff0445717: jg 0x7ffff0445912
... and r14 contains 0xe5e5e5e5e5e5e5e5, a pattern that, according to Mozilla's documentation, indicates a use-after-free.
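To make the register observation concrete, here is a small Python sketch that checks a value against the repeated 0xe5 byte pattern and tests whether it could be a valid x86-64 pointer at all. The function names are hypothetical, and the canonical-address check assumes the common 48-bit virtual address layout. It is an illustration of why `mov (%r14),%rax` faults, not a description of how Mozilla's allocator works.

```python
POISON_BYTE = 0xE5  # byte pattern the report associates with freed memory


def is_freed_poison(value: int, width_bytes: int = 8) -> bool:
    """True if every byte of a width_bytes-wide value equals POISON_BYTE."""
    expected = int.from_bytes(bytes([POISON_BYTE]) * width_bytes, "little")
    return value == expected


def is_canonical_x86_64(addr: int, va_bits: int = 48) -> bool:
    """True if bits 63 down to (va_bits - 1) are all equal, i.e. the address
    is sign-extended. Assumes 48-bit virtual addresses by default."""
    top = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return top == 0 or top == all_ones


r14 = 0xE5E5E5E5E5E5E5E5  # register value quoted in this report
print(is_freed_poison(r14))      # the value is pure poison pattern
print(is_canonical_x86_64(r14))  # and it is not a dereferenceable address
```

Under these assumptions the first instruction of the sequence is already the faulting one: it loads through r14, which holds the poison pattern instead of a pointer.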
Note that I've also tried a fresh profile as part of normal troubleshooting, and it made no difference.
Expected results:
Thunderbird should build and execute correctly.
Comment 2•2 years ago
I'm another developer of the linuxfromscratch books, and I'd say it is not a problem with the build instructions, since the same instructions work with llvm-15+rust-1.67.1, and even with llvm-15+rust-1.68.1. The failure comes from something new in llvm-16. Whether it is a bug in llvm-16, a bug in thunderbird revealed by the change in llvm, or even a bug in rust revealed by a change in llvm, I cannot tell, but there is something going on.
Comment 3•2 years ago
Interestingly, if ac_add_options --disable-release is added to the mozconfig (without changing anything else from what is on https://linuxfromscratch.org/blfs/view/systemd/xsoft/thunderbird.html), then the crash is gone.
This is now tracked under Bug 1831242 as well.
Comment 6•1 year ago
Does this also reproduce when using version 115 started in Help > Troubleshoot Mode?
If it does, and you have not already done so, please list complete steps to reproduce.