Closed Bug 586784 Opened 14 years ago Closed 11 years ago

alpha: firefox-3.5.5+ illegal instruction crash nsComponentManagerImpl::GetService() on startup

Product/Component: Core :: XPCOM
Type: defect
Hardware: DEC
OS: Linux
Priority: Not set
Severity: critical
Status: RESOLVED WORKSFORME
Reporter: rct
Assignee: Unassigned

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.2.8) Gecko/20100722 Firefox/3.6.8 ( .NET CLR 3.5.30729; .NET4.0C)
Build Identifier: Mozilla/5.0 (X11; U; Linux alpha; en-US; rv.1.9.1.5) Gecko/20100812 Firefox/3.5.5

On the Alpha platform, in an up-to-date Debian unstable build environment, firefox builds from the standard mozilla release source tree crash on startup with an illegal instruction fault. This affects version 3.5.5 and later. An abbreviated stack backtrace is included below.

Reproducible: Always

Steps to Reproduce:
1. Build firefox on alpha with --disable-optimize, --disable-debug, and --disable-tests, using a 3.5.5 or later source tree with the standard Alpha workarounds (add "-Wl,--no-relax" to DSO_LDOPTS, and add "Linuxalphaev56" to the filter statement in the "Linux/Alpha" section of "xpcom/reflect/xptcall/src/md/unix/Makefile.in").
2. Invoke "firefox".

Actual Results:  
Firefox dies with illegal instruction fault before any screen appears.


Ran "firefox -g" with an unstripped libxul.so and got the following stack backtrace:

nsComponentManagerImpl::GetService()
CallGetService()
nsGetServiceByCIDWithError::operator()()
nsCOMPtr_base::assign_from_gs_cid_with_error()
nsCOMPtr<nsIProxyObjectManager>::nsCOMPtr()
NS_GetProxyForObject()
nsNativeModuleLoader::LoadModule()
(last call repeated for MANY pages of debugging output...)

Tried replacing the following two files with their 3.5.4 counterparts, and got a working 3.5.5 build (no crash):

xpcom/glue/nsThreadUtils.cpp
xpcom/glue/nsThreadUtils.h

Currently trying a 3.5.11 build using the same workaround.
Sounds like a TLS (thread-local storage) problem.
Hardware: Other → DEC
3.5.11 with the 3.5.4 versions of nsThreadUtils.[cpp,h] confirmed working.
3.6.8 with minimally upgraded (from 3.5.4) versions of nsThreadUtils.[cpp,h] works.  I assume the correct fix for this issue is considerably more complicated than the workaround of undoing the effects of the #ifdef statements added in 3.5.5, or it would have been done.
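
For readers unfamiliar with the pattern, here is a minimal sketch of what an #ifdef-guarded TLS fast path with a non-TLS fallback can look like. It is not taken from nsThreadUtils.cpp: the namespace, the SKETCH_USE_TLS switch, and every function name are hypothetical, and it only assumes (per the TLS guess earlier in this bug) that the 3.5.5 change was of this general shape. Building with -DSKETCH_USE_TLS=0 corresponds roughly to the "undo the #ifdef" workaround.

```cpp
#include <cassert>
#include <thread>

// Hypothetical compile-time switch; stands in for whatever macro the real
// tree uses. Not a Mozilla identifier.
#ifndef SKETCH_USE_TLS
#define SKETCH_USE_TLS 1
#endif

namespace sketch {

#if SKETCH_USE_TLS
// TLS path: each thread gets its own copy of the flag, default false.
// Reading it depends on compiler/linker TLS support for the target.
thread_local bool tlsIsMainThread = false;

void MarkMainThread() { tlsIsMainThread = true; }
bool IsMainThread() { return tlsIsMainThread; }
#else
// Fallback path: remember the main thread's id and compare on each call.
std::thread::id gMainThreadId;

void MarkMainThread() { gMainThreadId = std::this_thread::get_id(); }
bool IsMainThread() { return std::this_thread::get_id() == gMainThreadId; }
#endif

// Helper for trying the sketch: asks a freshly started worker thread
// whether it believes it is the main thread.
bool WorkerSeesMainThread() {
  bool result = true;
  std::thread worker([&result] { result = IsMainThread(); });
  worker.join();
  return result;
}

}  // namespace sketch
```

Calling sketch::MarkMainThread() once at startup and then sketch::IsMainThread() from any thread should give the same answers on both paths. Only the TLS path relies on toolchain support for thread-local variables, which is where a platform-specific compiler or linker problem would show up.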
Blocks: 521750
FWIW, the Debian alpha build machine doesn't encounter such problems (though there are several others, related to xptcall, afaict)
Debian builds/packages firefox (iceweasel) differently: wrapped xulrunner instead of monolithic (libxul.so) build.  I think that's part of the key to this issue based on comments I read in 521750.
(In reply to comment #5)
> Debian builds/packages firefox (iceweasel) differently: wrapped xulrunner
> instead of monolithic (libxul.so) build.  I think that's part of the key to
> this issue based on comments I read in 521750.

AFAIK, that makes no difference regarding tls. Bug 521750 is totally unrelated and is about the mips toolchain not being able to link huge libraries with tls stuff.
(In reply to comment #6)
> Bug 521750 is totally unrelated
> and is about the mips toolchain not being able to link huge libraries with tls
> stuff.

Gah I was hit by the "awesome bar" outsmarting me... and getting me to bug 528687.
Bob, can you still reproduce?
Crash Signature: [@ nsComponentManagerImpl::GetService()]
Flags: needinfo?(rct)
Summary: alpha: firefox-3.5.5+ illegal instruction crash on startup → alpha: firefox-3.5.5+ illegal instruction crash nsComponentManagerImpl::GetService() on startup
Whiteboard: [closeme 2013-02-18]
The last fully-functional monolithic firefox build I did was 9.0.1, back in April of 2012.  At that time, I still needed the "nsThreadUtils*" patches to avoid the crash on startup.

Experiments with 10.0.X builds uncovered other unrelated problems, but I *did* note that *without* the "nsThreadUtils*" patches, an optimized build succeeded and worked fine, but an unoptimized build segfaulted.  With the patches, both optimized and unoptimized builds succeeded and worked fine.

At least one other alpha user besides me and Michael Cree verified this behavior, but as I recall, we never narrowed it down beyond what was originally reported.

Not sure this is worth pursuing at this point.  It "smelled" like a compiler/linker issue at the time, and the Debian "xulrunner" firefox package started working again (for the first time in many moons on alpha), so I lost interest in doing my own alpha builds.  As modern firefox versions (8.0.1 and later) take just under two days to build on my machine, further troubleshooting / debugging on my part is impractical.  Feel free to close this bug.
Flags: needinfo?(rct)
Resolved per whiteboard and Comment 9
Status: UNCONFIRMED → RESOLVED
Closed: 11 years ago
Resolution: --- → WORKSFORME
Whiteboard: [closeme 2013-02-18]