SEGV in tbird 60 on aarch64 with lightning add-on
Categories
(Thunderbird :: General, defect)
Tracking
(Not tracked)
People
(Reporter: richard.palo, Unassigned)
Details
(Keywords: crash)
Attachments
(3 files)
| Reporter | ||
Comment 18•6 years ago
|
||
This is an strace. The latest Thunderbird (60.5.1), running on
Linux odroid-001e06336dd6 4.20.10-1-ARCH #1 SMP Fri Feb 15 17:55:03 MST 2019 aarch64 GNU/Linux
is now impossible to launch, even in safe mode, without a near-immediate core dump.
For the record, this is with the home directory (and therefore TB profile) on an NFS mount
192.168.0.1:/home/richard on /home/richard type nfs4 (rw,nosuid,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.0.30,local_lock=none,addr=192.168.0.1)
The next attachment is from a run as root on a local filesystem, which also hangs (but doesn't core dump).
| Reporter | ||
Comment 19•6 years ago
|
||
running root on local file system with first-time execution of thunderbird
(which hangs)
Comment 20•6 years ago
|
||
How much memory do you have?
(In reply to Richard PALO from comment #18)
Created attachment 9045168 [details]
tb.strace
this is an strace as the latest (Thunderbird 60.5.1) running on
Linux odroid-001e06336dd6 4.20.10-1-ARCH #1 SMP Fri Feb 15 17:55:03 MST 2019 aarch64 GNU/Linux
is now impossible to launch, even in safe-mode, without a near immediate coredump.
For the record, this is with the home directory (and therefore TB profile) on an NFS mount
192.168.0.1:/home/richard on /home/richard type nfs4 (rw,nosuid,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.0.30,local_lock=none,addr=192.168.0.1)
Next attachment is via root on local filesystem which also hangs (but doesn't coredump)
In the trace, I see this near the end:
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, si_addr=0x87c881e544ee0} ---
This SEGV_MAPERR would suggest a dangling pointer or something similar.
If a nullptr dereference had been attempted, I think si_addr would have been all 0.
It could be that
- the memory allocation failed (maybe unlikely, hard to tell), or
- a pointer stored in memory was corrupted, and dereferencing that pointer caused the memory access error, etc.
si_addr=0x87c881e544ee0 <--- strange value.
This address looks suspicious. I searched for 87c8 in the file and this is the only occurrence.
I think there was some kind of buffer overflow or some other form of memory corruption to produce this bogus pointer.
I am not sure whether this is aarch64-specific or not.
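As a rough illustration of why the value looks odd: the sketch below compares si_addr with frame #0 of the stack trace, assuming a 48-bit user virtual address space. That width is an assumption (it is a common aarch64 Linux configuration, but it depends on the kernel build), and the helper name in_user_range is made up for this example.

```python
# Sketch only: checks whether an address fits in the low part of a
# 48-bit virtual address space. The 48-bit width is an ASSUMPTION about
# this kernel's configuration, not something stated in the strace.
USER_VA_BITS = 48


def in_user_range(addr: int, va_bits: int = USER_VA_BITS) -> bool:
    """Return True if addr lies in [0, 2**va_bits)."""
    return 0 <= addr < (1 << va_bits)


si_addr = 0x87C881E544EE0      # si_addr from the SIGSEGV line in the strace
frame0 = 0x0000FFFF93A93B0C    # frame #0 from the numeric stack trace

for label, addr in (("si_addr", si_addr), ("frame #0", frame0)):
    print(f"{label}: {addr:#018x} bits={addr.bit_length()} "
          f"in 48-bit user range: {in_user_range(addr)}")
```

Under that assumption the faulting address needs more bits than any user mapping could have, while the stack frame addresses fit. That is consistent with a corrupted pointer value, but it does not say where the corruption happened.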
I wonder if you can obtain a symbolic stack trace instead of a numeric stack trace like this one:
Stack trace of thread 1181:
#0 0x0000ffff93a93b0c n/a (libxul.so)
#1 0x0000ffff93a9c69c n/a (libxul.so)
#2 0x0000ffff93a9cc90 n/a (libxul.so)
#3 0x0000ffff93a62a94 n/a (libxul.so)
#4 0x0000ffff93a801d0 n/a (libxul.so)
#5 0x0000ffff93a80d3c n/a (libxul.so)
#6 0x0000ffff93a8102c n/a (libxul.so)
It would be insanely great if we could get the stack dump in the form of symbol + numeric offset...
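A minimal sketch of the first step toward that form: pulling the frame addresses out of a trace in the format shown above and turning each into an offset inside its module. The function names are hypothetical, and the load base used in the example is a made-up placeholder; the real base of libxul.so would have to come from the core file or from /proc/<pid>/maps of the crashing process.

```python
import re

# Matches lines such as "#0 0x0000ffff93a93b0c n/a (libxul.so)".
FRAME_RE = re.compile(r"#(\d+)\s+(0x[0-9a-fA-F]+)\s+(\S+)\s+\(([^)]+)\)")


def parse_frames(text):
    """Return a list of (frame_no, address, symbol, module) tuples."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.search(line)
        if m:
            frames.append(
                (int(m.group(1)), int(m.group(2), 16), m.group(3), m.group(4))
            )
    return frames


def module_offset(address, load_base):
    """Offset of a runtime address relative to the module's load base."""
    return address - load_base


trace = """\
#0 0x0000ffff93a93b0c n/a (libxul.so)
#1 0x0000ffff93a9c69c n/a (libxul.so)
#2 0x0000ffff93a9cc90 n/a (libxul.so)
"""

# HYPOTHETICAL load base, for illustration only; not taken from the report.
EXAMPLE_BASE = 0x0000FFFF92000000

frames = parse_frames(trace)
for no, addr, sym, mod in frames:
    print(f"#{no} {mod}+{module_offset(addr, EXAMPLE_BASE):#x}")
```

The module-relative offsets are what a symbol table lookup needs, since the addresses in the trace depend on where the library happened to be mapped in that run.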
| Reporter | ||
Comment 21•6 years ago
|
||
As for memory, SBC's are typically on the order of 2GB:
$ LANG=C free -m
              total        used        free      shared  buff/cache   available
Mem:           1968         273        1198          16         496        1656
Swap:         10239         212       10027
I doubt I can get much usable symbolic info from a stripped binary, unfortunately:
$ file /usr/lib/thunderbird/libxul.so
/usr/lib/thunderbird/libxul.so: ELF 64-bit LSB shared object, ARM aarch64, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=878ae1e829fd09c44b88f8b96d7796f5a74c8d95, stripped
Comment 22•6 years ago
|
||
(In reply to Richard PALO from comment #21)
As for memory, SBC's are typically on the order of 2GB:
$ LANG=C free -m
              total        used        free      shared  buff/cache   available
Mem:           1968         273        1198          16         496        1656
Swap:         10239         212       10027
I am not sure whether 2GB is sufficient for typical TB operation these days (this should depend on the number of messages you load into memory), but you have swap enabled, so it should be OK.
I doubt I can get much usable symbolic info from a stripped binary, unfortunately:
$ file /usr/lib/thunderbird/libxul.so
/usr/lib/thunderbird/libxul.so: ELF 64-bit LSB shared object, ARM aarch64, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=878ae1e829fd09c44b88f8b96d7796f5a74c8d95, stripped
I am afraid that you need to find someone who can create a non-stripped binary for debugging on the aarch64 target. Otherwise, it is really difficult to figure out where the error is occurring. (I have a suspicion that the mutex-related routines may have caused the segfault, but even then, the symbolic information would be very helpful to figure out which variable (field of a struct, etc.) holds the bogus value.)
Sorry, I only do x86_64 Linux binaries :-)
Wait.
/usr/lib/thunderbird/libxul.so: ELF 64-bit LSB shared object, ARM aarch64, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=878ae1e829fd09c44b88f8b96d7796f5a74c8d95, stripped
This refers to the .so version of the library.
How can an OS or dynamic linker link the executable with a function in libxul.so IF the function symbols are not there (!?)
I think "stripped" here refers to the DEBUGGING SYMBOL INFORMATION. The dynamically linked function names ought to still be in it.
But when I ran "nm my-version-of-thunderbird-libxul.so" on my Linux machine, it said "No symbol".
Now I have it figured out.
https://stackoverflow.com/questions/20288485/how-does-the-linker-locate-code-in-stripped-dynamic-libraries
Running "nm -D libxul.so" prints out the dynamic symbols (at least the function entry names). This would be a start toward making sense of the numeric stack trace.
I am not sure if this is relevant for an aarch64 binary, but in the TB/FF source tree there is a routine to convert the
numeric addresses of a run-time dump (from the ASSERT macro) to symbol+offset form.
(This is very sensitive to the particular format of the dump, and you need to have cooperating objdump utilities.)
The attached script is how I obtain the symbol+offset form of the ASSERT numeric dump during TB testing.
As I mentioned, the script seems to expect the dump in a certain format [printed by the assert macro], and you need Python and a host of objdump utility programs.
YMMV.
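For anyone who wants to try this without the attached script, here is a rough sketch of the idea, not the script itself: take the text printed by "nm -D libxul.so" and map a module-relative offset to the nearest preceding function symbol. The function names load_symbols and symbolize are made up, and the sample nm output below is entirely hypothetical (the real symbol names and addresses in libxul.so are not in this report).

```python
import bisect

# Symbol types treated as code: global/local text and weak symbols.
CODE_TYPES = {"T", "t", "W", "w"}


def load_symbols(nm_text):
    """Parse 'nm -D' style lines ('<hex addr> <type> <name>') into a
    sorted list of (address, name). Undefined symbols have no address
    column and are skipped."""
    syms = []
    for line in nm_text.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[1] in CODE_TYPES:
            syms.append((int(parts[0], 16), parts[2]))
    syms.sort()
    return syms


def symbolize(offset, syms):
    """Return 'name+0xNN' for the closest symbol at or below offset,
    or None if the offset lies before the first symbol."""
    addrs = [addr for addr, _ in syms]
    i = bisect.bisect_right(addrs, offset) - 1
    if i < 0:
        return None
    addr, name = syms[i]
    return f"{name}+0x{offset - addr:x}"


# HYPOTHETICAL nm -D output, for illustration only.
sample_nm = """\
0000000000001000 T hypothetical_func_a
0000000000001400 T hypothetical_func_b
                 U malloc
"""

symbols = load_symbols(sample_nm)
print(symbolize(0x1234, symbols))
```

One caveat: a stripped library only keeps its exported (dynamic) symbols, so the nearest exported name may belong to a different function than the one that actually crashed, with internal functions in between. A large offset from the reported symbol is a hint that this has happened.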
| Reporter | ||
Comment 23•6 years ago
|
||
Although I personally didn't have time to try these scripts, the good news seems to be
that today, under 5.0.11-1-ARCH #1 SMP Fri May 3 01:14:14 UTC 2019 aarch64 GNU/Linux and
thunderbird 60.6.1-2
thunderbird-i18n-fr 60.6.1-1
I've been able to use thunderbird so far without SEGV, even when opening attachments.
I'll report back if things take a change for the worse.
cheers,
Richard
Comment 24•6 years ago
|
||
Thanks for the update