Firefox crashes on startup in PR_SetLogFile() with NSPR_LOG_FILE env var set

RESOLVED FIXED in 4.8

Status

defect
--
blocker
RESOLVED FIXED
10 years ago
10 years ago

People

(Reporter: matt, Assigned: wtc)

Tracking

({crash, regression})

Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

()

Attachments

(3 attachments)

Posted file strace log, bzip2'd
The nightly 3.6a1pre builds crash on startup, even if it's something as simple as "firefox -?" or "firefox--help".  The last build which doesn't crash is Build ID 20090501053152, source stamp 2919f7dc85d0, and the first which crashes is Build ID 20090501090244, source stamp 85e6ca56e859 (the fix for bug 489231).  The URL for this bug links to the changeset that caused the problem.

Strace shows right before the crash (full trace output is attached):

12100 sched_get_priority_min(SCHED_OTHER) = 0
12100 sched_get_priority_max(SCHED_OTHER) = 0
12100 sched_getparam(12100, { 0 })      = 0
12100 sched_getscheduler(12100)         = 0 (SCHED_OTHER)
12100 sched_setscheduler(12100, SCHED_OTHER, { 0 }) = 0
12100 getuid32()                        = 500
12100 geteuid32()                       = 500
12100 getgid32()                        = 500
12100 getegid32()                       = 500

And gdb gives this stack trace:

#0  0x00000000 in ?? ()
#1  0xb70f1c34 in PR_OpenFile () from ./libnspr4.so
#2  0xb70f1d7b in PR_Open () from ./libnspr4.so
#3  0xb70dc9ee in PR_SetLogFile () from ./libnspr4.so
#4  0xb70dced0 in ?? () from ./libnspr4.so
#5  0xbfbf2d2e in ?? ()
#6  0xb70fab1b in ?? () from ./libnspr4.so
#7  0xbfbf17e4 in ?? ()
#8  0xbfbf17e4 in ?? ()
#9  0xbfbf17e8 in ?? ()
#10 0xbfbf17e4 in ?? ()
#11 0x00000060 in ?? ()
#12 0x08051a3f in calloc ()
#13 0xb70e7355 in ?? () from ./libnspr4.so
#14 0x00000000 in ?? ()

See bp-643daf7d-cdf2-4f5f-aba5-f05f92090504 details and modules for more info on my setup.
In pr/src/md/unix/unix.c, we see this change:

     5.7 -struct _MD_IOVector _md_iovector = { open };
     5.8 +struct _MD_IOVector _md_iovector;

It was determined that, on (some) other unix platforms, _md_iovector is 
explicitly initialized in the code, and so this static initializer was 
thought to be redundant (and a problem on Symbian).  

So, what I wonder is: on what platforms does the code NOT explicitly 
assign a pointer to _md_iovector, and why is it different on some unix
platforms than others?  

In the meantime, I'm guessing this might be fixed with a patch like so:

+#ifdef SYMBIAN
 struct _MD_IOVector _md_iovector;
+#else
+struct _MD_IOVector _md_iovector = { open };
+#endif

Matthew, would you care to try that?
bug 432430 comes to mind. and yes, i know what it's theoretically about. i don't care.
nelson: the problem is initialization order.

i think you'll find that PR_SetLogFile is called *long* before 
_PR_MD_FINAL_INIT();
Depends on: 432430
specifically, it's called from _PR_InitLog(void)

202 void _PR_InitLog(void)
261         ev = PR_GetEnv("NSPR_LOG_FILE");
263             if (!PR_SetLogFile(ev)) {

which is:

166 static void _PR_InitStuff(void)
236     _PR_InitLog();
246     _PR_MD_FINAL_INIT();
Keywords: regression
Summary: Firefox crashes on startup in PR_SetLogFile() → Firefox crashes on startup in PR_SetLogFile() with NSPR_LOG_FILE env var set
I can confirm that the crash goes away if NSPR_LOG_FILE is unset.
The patch I suggested in comment 1 will revert the troublesome change on 
all platforms except SYMBIAN, and will serve as a short-term fix until we
figure out the right long term fix.
bug 432430 comment 86 made the suggestion in case people don't like searching through long bugs.

nelson: i'd suggest that someone include a test in the suite to make sure NSPR_LOG_FILE works. And i'd also request that you revert this. NSPR_LOG_FILE is a promise nspr has made. This regression is ok on a branch, but not acceptable on a version that mozilla uses.
Severity: critical → blocker
Timeless, If you confirm that the patch suggested in comment 1 solves the 
problem on Linux, then I will commit it in the CVS repository.
ok, so for people testing you need to set NSPR_LOG_MODULES=something and NSPR_LOG_FILE=something.

With those set on my mac (os x), i crash. w/ the patch nelson offers, the crash doesn't happen.

r=timeless
Committed on CVS trunk.  
pr/src/md/unix/unix.c; new revision: 3.56; previous revision: 3.55
http://hg.mozilla.org/mozilla-central/rev/47e935bd7f3b
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
As timeless suggested in comment 7, I added a regression test for this crash.
Attachment #376537 - Flags: review?(nelson)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Wan-Teh, 

If I am not mistaken, the "fix" commited in comment 1 didn't fix this issue 
for Symbian.  It left Symbian broken.  If we don't really fix it for Symbian, 
then this test program will cause tests to immediately begin to fail on Symbian.  

Perhaps the purpose for for this test program is to determine that the 
problem is not fixed on Symbian?  Is that your intent?
Comment on attachment 376537 [details] [diff] [review]
Regression test

My intent is to document why _md_iovector._open64 needs to
be initialized with 'open' statically (I didn't know why,
or I forgot why), and to prevent regression.  The Symbian
port is work in progress, so it's fine for the Symbian port
to fail a test.  I know the test will crash on Symbian.

I'm just trying to follow a common policy of adding a
regression test with each bug fix.
Comment on attachment 376537 [details] [diff] [review]
Regression test

ok, r=nelson
Let the Symbian pain begin.
Attachment #376537 - Flags: review?(nelson) → review+
I checked in the patch (attachment 376537 [details] [diff] [review]) on the NSPR trunk (NSPR 4.8).

Checking in pr/src/md/unix/unix.c;
/cvsroot/mozilla/nsprpub/pr/src/md/unix/unix.c,v  <--  unix.c
new revision: 3.57; previous revision: 3.56
done
Checking in pr/tests/Makefile.in;
/cvsroot/mozilla/nsprpub/pr/tests/Makefile.in,v  <--  Makefile.in
new revision: 1.61; previous revision: 1.60
done
RCS file: /cvsroot/mozilla/nsprpub/pr/tests/logfile.c,v
done
Checking in pr/tests/logfile.c;
/cvsroot/mozilla/nsprpub/pr/tests/logfile.c,v  <--  logfile.c
initial revision: 1.1
done
Checking in pr/tests/runtests.pl;
/cvsroot/mozilla/nsprpub/pr/tests/runtests.pl,v  <--  runtests.pl
new revision: 1.6; previous revision: 1.5
done
Checking in pr/tests/runtests.sh;
/cvsroot/mozilla/nsprpub/pr/tests/runtests.sh,v  <--  runtests.sh
new revision: 1.9; previous revision: 1.8
done
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
OS: Linux → All
Hardware: x86 → All
Resolution: --- → FIXED
Target Milestone: --- → 4.8
wtc: thanks for writing the test, however is it really proper to write:
+ * Portions created by the Initial Developer are Copyright (C) 1998-2000

the test as far as i can tell to the extent it has any meaningful code is new and should be dated 2009.
OK, OK :-)

I intentionally copied the license header from another file
because I didn't want the copyright notice to imply that the
Initial Developer of NSPR is Google.  I guess I can't avoid
the ambiguity of whether the Original Code means the entire
library or just that source file.

I checked in this patch on the NSPR trunk (NSPR 4.8).

Checking in logfile.c;
/cvsroot/mozilla/nsprpub/pr/tests/logfile.c,v  <--  logfile.c
new revision: 1.2; previous revision: 1.1
done
Comment on attachment 376751 [details] [diff] [review]
Fix copyright notice in regression test

timeless: alteratively, I can use the old Netscape copyright
notice, and just add "Google Inc." to the Contributor(s) section.
That'll avoid the implication that the Initial Developer of NSPR
is Google.  But the copyright notice will still say 1998-2000.

Would that be acceptable?
you can change it to say "this test" instead of "nspr" if that's what you're worried about.

IANAL, MoCo does have council and should be able to answer questions about this.

http://mxr.mozilla.org/mozilla-central/search?string=The+Original+Code+is
Version: other → 4.8
You need to log in before you can comment on or make changes to this bug.