Closed Bug 1412169 Opened 7 years ago Closed 6 years ago

Firefox 52.4.0 -- Segmentation Fault on Startup

Categories

Product/Component: Core :: General
Type: defect
Version: 52 Branch
Platform: ARM, Linux
Priority: Not set
Severity: critical

RESOLVED WONTFIX

People

(Reporter: wadecline, Unassigned)

Attachments

(2 files, 1 obsolete file)

Firefox segfaults on startup. This happens even when running something like 'firefox --help', which suggests that the error occurs very early in the startup process. The stack trace is:

(gdb) bt
#0  0xffff0fc0 in ?? ()
#1  0xb3b339b8 in google::protobuf::GoogleOnceInitImpl(int*, google::protobuf::Closure*) () from /usr/lib/firefox/libxul.so
#2  0xb2183174 in google::protobuf::internal::GetEmptyString[abi:cxx11]() () from /usr/lib/firefox/libxul.so
#3  0xb216b310 in mozilla::layers::layerscope::TexturePacket::SharedCtor() () from /usr/lib/firefox/libxul.so
#4  0xb216b3dc in mozilla::layers::layerscope::TexturePacket::TexturePacket() () from /usr/lib/firefox/libxul.so
#5  0xb216e21c in mozilla::layers::layerscope::protobuf_AddDesc_LayerScopePacket_2eproto() () from /usr/lib/firefox/libxul.so
#6  0xb6f7daa4 in ?? () from /lib/ld-linux-armhf.so.3
#7  0xb664b700 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

I also tried 'firefox -P' as per the documentation but still got a SIGSEGV.

Special notes:
The system runs a "hardened" Gentoo profile on armhf (hard float, VFP support). I also had to disable the startup cache via '--disable-startupcache', as it would error on installation (see the Gentoo bug).
Component: General → Graphics: Layers
Product: Firefox → Core
There are two crash reports in protobuf's initialization code that may be related: https://crash-stats.mozilla.com/search/?signature=~GoogleOnceInitImpl&product=Firefox&date=%3E%3D2017-09-03T14%3A18%3A00.000Z&date=%3C2017-11-06T14%3A18%3A48.000Z&_sort=-date&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#crash-reports

Protobuf appears to have a rather convoluted mechanism for lazy global initialization, which involves an externally declared symbol: https://searchfox.org/mozilla-central/rev/5a60492a53667fc61a62af1847d005a210b7a4f6/toolkit/components/protobuf/src/google/protobuf/message_lite.h#152. These things tend to be somewhat dangerous in C++ because build configurations and the order of linker arguments can affect whether, and how many times, the external symbols are actually instantiated.
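To make the pattern concrete, here is a minimal sketch of once-style lazy initialization triggered from a static initializer, which is roughly the shape of the call chain in the backtrace above. This is not protobuf's actual code: the names `OnceInit`, `Packet`, `default_packet`, and the `sketch` namespace are hypothetical, and the real implementation differs in detail.

```cpp
#include <atomic>
#include <cassert>
#include <string>

namespace sketch {

// Hypothetical once-state values; protobuf's real names and values may differ.
enum OnceState { kUninitialized = 0, kExecuting = 1, kDone = 2 };

// Shared state behind the lazily created empty string. If the build ended up
// with more than one copy of these globals (for example, one per shared
// object), each copy would run its own initialization.
std::atomic<int> empty_string_once{kUninitialized};
const std::string* empty_string = nullptr;
int init_calls = 0;  // counts how often the initializer actually ran

void InitEmptyString() {
    empty_string = new std::string();  // intentionally never freed
    ++init_calls;
}

// Runs init exactly once per once-flag. The first step is an atomic
// compare-and-swap on the flag.
void OnceInit(std::atomic<int>* once, void (*init)()) {
    int expected = kUninitialized;
    if (once->compare_exchange_strong(expected, kExecuting,
                                      std::memory_order_acquire)) {
        init();
        once->store(kDone, std::memory_order_release);
        return;
    }
    // Another caller got there first; wait until it has finished.
    while (once->load(std::memory_order_acquire) != kDone) {
    }
}

const std::string& GetEmptyString() {
    OnceInit(&empty_string_once, InitEmptyString);
    assert(empty_string != nullptr);
    return *empty_string;
}

// A message-like type whose constructor needs the shared empty string.
struct Packet {
    const std::string* name;
    Packet() : name(&GetEmptyString()) {}
};

// Constructed during static initialization, before main() runs, so the
// once-init path executes while the library is still being loaded.
Packet default_packet;

}  // namespace sketch
```

Because `default_packet` is a global, its constructor runs when the library is loaded, which would explain why even 'firefox --help' could reach this code. Later calls to `sketch::GetEmptyString()` return the same string without re-running the initializer.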

I don't think this is specific to layers or graphics. Moving to Core::General, since this is where some protobuf-related changes like bug 1385461 happened.
Component: Graphics: Layers → General
Attached file 52.5.2 backtrace (obsolete)
Also broken in 52.5.2
Comment on attachment 8939146
52.5.2 backtrace

Did not upload properly (thanks Lynx), please delete.
Attached file 52.5.2 backtrace
Attachment #8939146 - Attachment is obsolete: true
If this still happens with more recent versions of Firefox, it might be related to your hardened profile. With a 13.0 profile and stable toolchain, it compiles fine if you enable -O2 optimization and append -fno-sched-insns to the CXX flags, and it runs without any segmentation faults. If you still use the old gcc-5.4, there is no need for the optimization or CXX flag changes.
I just tried this with 52.7.3 and gcc 6.4 and am still segfaulting.  I'm going to try adding the '-fno-sched-insns' flag and see if that works; after that, I'll see about going through the 13.0 profile upgrade.
This is the patch I'm using; it is against the Portage tree, obviously.

The point of -O2 and -fno-schedule-insns is that there is a linking error with gcc-6 on a 13.0, non-hardened profile. schedule-insns in gcc is somehow broken on ARM. To sum it up, -O2 enables schedule-insns, and -fno-schedule-insns turns it back off, so you get full -O2 optimization with the exception of the broken schedule-insns.

Hardened profiles have produced all kinds of bugs in the past. If the issue still persists, you might want to open a bug on the Gentoo side. Link it here if you want to, so I can have a look at it.
I added the '-fno-sched-insns' flag to the custom Firefox CXX flags but it got stripped out by Portage; looking at the eclass, I first needed to enable the 'custom-cflags' USE option.  Looking at your patch, I think you're missing a '-' before the '-fno-sched-insns' flag.  I'm going to try building with just the custom env+flags first, since any changes to the Portage tree will get overridden on the next system upgrade.

There is (sort of) a bug for this on the Gentoo Bugzilla, but it stems from an old issue and could arguably be made into a new bug: https://bugs.gentoo.org/show_bug.cgi?id=631442
Yeah, it's a typo; it should be -fno-schedule-insns.
No luck, but maybe I need the 'custom-optimization' USE flag as well?  Going to try with this enabled in addition to 'custom-cflags'.
Same problem with '-fno-schedule-insns' added to CXXFLAGS and 'custom-cflags' and 'custom-optimization' as USE flags.  Going to try using your eclass modifications and turning off the extra CXX/USE flags next (as I don't trust the ebuild after seeing my custom CXXFLAGS scrubbed once), then probably a world upgrade (for newer glibc and paxutils), then finally see about the profile upgrade.
Indeed, the ebuild strips overly aggressive cflags, which is why you have to patch the eclass instead. The -fno-schedule-insns and -O2 patch for the ebuild fixes a linking error if you use gcc-6, and it will likely need to be extended to gcc-7 in the future. You should bring the rest of your system up to date before you try to solve this; plenty of bugs were recently fixed in binutils, gcc, and so on.
You shouldn't need to patch the eclass with 'custom-cflags' and 'custom-optimization' set as USE flags.  Anyway, none of those methods worked. However, someone pointed out https://github.com/sakaki-/novena-kernel-patches/blob/master/1101-novena_defconfig-enable-minimal-GRKERNSEC.patch to me, and when I ran an old 4.4.0 kernel I was able to start Firefox. This makes me think that my hardened kernel is doing something evil.  Sadly, I've been having weird LCD boot problems whenever I try to compile another kernel (even 4.4.0 with my saved configuration), so I can't properly test this hypothesis until I figure out how to fix my boot issues... :/

I'll update when I have more relevant information.
Closing because no crashes have been reported in 12 weeks.
Status: UNCONFIRMED → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX