Crash on browser/installer exit on win9x

VERIFIED FIXED

Status

P1
critical
VERIFIED FIXED
19 years ago
19 years ago

People

(Reporter: doronr, Assigned: waterson)

Tracking

({crash, regression})

Trunk
x86
Windows 98
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [nsbeta3++][dogfood-] FIX IN HAND)

Attachments

(3 attachments)

(Reporter)

Description

19 years ago
Possibly related to bug 45842

Win98 2000092005 crashes on exit of browser, the last line in console is
Pref_Cleanup().

Can anyone else reproduce this?
*** Bug 53359 has been marked as a duplicate of this bug. ***
windows commercial build 2000-09-20-06-M18

closing browser with "X" or File | Close  or File | Exit crashes with error:

message "This program has performed an illegal operation and will be shut down"

and:

Runtime error!

Program: c:\ *

R6016
- not enough space for thread data

Comment 3

19 years ago
Talkback data? unable to reproduce on NT with 092005 build.

Comment 4

19 years ago
i'm also seeing this one
drwatson on 98 gives me this info:

Remote Procedure Call DLL performed an invalid memory access.

Module Name: RPCRT4.DLL
Description: Remote Procedure Call DLL
Version: 4.71.2900
Product: Microsoft(R) Windows NT(TM) Operating System
Manufacturer: Microsoft Corporation

Application Name: Mozilla.exe
and now without drwatson:

MOZILLA caused an invalid page fault in
module RPCRT4.DLL at 017f:7fb9181c.
Registers:
EAX=00000000 CS=017f EIP=7fb9181c EFLGS=00010246
EBX=81998794 SS=0187 ESP=0068fd8c EBP=0068fdc0
ECX=d82db5b0 DS=0187 ESI=7fb90000 FS=6f07
EDX=c003094c ES=0187 EDI=00000000 GS=0000
Bytes at CS:EIP:
ff 70 28 ff 15 78 d0 bd 7f c7 05 bc c0 bd 7f 01 
Stack dump:
00000000 00000000 7fb90000 81998794 00000000 16670246 bff741f7 0068fd90 0068fbbc 
0068ff78 7fb953e8 7fbd4a70 ffffffff 0068ff88 bff7ddd6 7fb90000 

talkback does not seem to kick in. i'm running mozilla with -console, and to 
close the console after the crash i'm forced to end the winold task using 
the ctrl-alt-del trick

i can attach a more detailed drwatson report if needed
(Reporter)

Comment 5

19 years ago
talkback is shutdown before the crash. annoying and makes it harder to locate
the culprit

Comment 6

19 years ago
over to XPCOM for investigation.
Component: Browser-General → XPCOM

Comment 7

19 years ago
reassigning because I forgot.  I'm not sure this is XPCOM but the only lxr ref
to that windows RPCRT4.DLL is in xpcom/tests and the microsoft literature talks
about it in idl and com docs.
Assignee: asa → rayw
QA Contact: doronr → rayw

Comment 8

19 years ago
Since the windows installer uses xpcom, it also crashes with the same result at 
the end of setup.exe.

Comment 9

19 years ago
adding dp and scc

Comment 10

19 years ago
Thus far, I have been unable to duplicate this.  I tried on several independent 
occasions before the bug disappeared and after it reappeared in my scope.  I do 
not have a Windows 98 platform to test on, but it does not appear on my WNT 4.0 
SP 6 build using a straight Mozilla build.  While the original bug wasn't logged 
against a commercial build, at least one reproduction was on a commercial build, 
so that is my next attempt.
(Reporter)

Comment 11

19 years ago
this is a win9x only bug!  Still in 2000092108.  Nominating for nsbeta3, as
win9x is a very widespread OS.  Updating summary to make it clearer where we crash.
Keywords: crash, nsbeta3
Summary: Crash on exit → Crash on browser/isntaller exit on win9x

Comment 12

19 years ago
adding myself to the CC: list.

Comment 13

19 years ago
Not sure if anyone else sees this, but the crash created by this bug creates a
persistent crash window that doesn't go away on my Win98 computer, forcing me to
reboot to get rid of it.

If I'm not the only one experiencing this, then this should give this bug a
little more priority.

Comment 14

19 years ago
Why isn't tinderbox catching this?

Comment 15

19 years ago
*** Bug 53666 has been marked as a duplicate of this bug. ***

Comment 16

19 years ago
RE versions: This problem is reported in Win95 and Win98 only, not NT or 2k.
http://bugscape.netscape.com/show_bug.cgi?id=2415

Worse, this isn't plussed yet.

Does someone need to go to PDT to explain this needs to be fixed? We can't ship 
anything that has an installer that crashes at the end, even if it installs 
correctly.

If this sounds like I'm volunteering, I volunteer.

Comment 17

19 years ago
Created attachment 15328 [details]
stack trace of assertions prior to crash in RPCT4.DLL under win98

Comment 18

19 years ago
I've attached a stack trace of the assertions that are thrown when exiting the 
browser.  I thought they might be useful.

I'm going to try to get a stack trace during the exit of the installer now.

Comment 19

19 years ago
I would plus it and accept it as assigned if I could dup it. I suggest that 
someone who can dup it look at it and figure out what is causing the problem.  
If required, I will set up Win98 and the surrounding development platform, 
but I doubt I will have the bug troubleshooted by Monday when I leave for 
Boston.  From all the assertions, it would appear that there is non-thread-safe 
stuff happening on timers.

While XPCOM as a model is the root of many of this type of problem, XPCOM 
registers no timers I am aware of and does not do RPC's by itself during 
shutdown.  I could be wrong, but that is my belief.
(Reporter)

Comment 20

19 years ago
I'm afraid I do not have the needed debug tools to test this.  However, 100% of
win9x people I have asked see this.

Comment 21

19 years ago
I have this problem on a laptop. I'm game to letting people work on it.

Comment 22

19 years ago
If anyone needs the proper debugging environment to debug this problem, let me 
know.  I have everything set up to build and debug this win98 bug in my cube.

My win98 system has VC6 with yesterday's debug build on it that reproduces this 
problem consistently.

Comment 23

19 years ago
per PDT, this is upgraded to Priority P1, and is now nsbeta3+
re-assigned to ssu
Assignee: rayw → ssu
Priority: P3 → P1
Whiteboard: [nsbeta3+]

Comment 24

19 years ago
I am not the right person to look at this bug.  I am not familiar enough with 
timers and xpcom to be looking at this.  I just have a win98 system that can be 
used to debug this problem.
Reassigning to dougt as possibly more appropriate to deal with Win9x threading 
problem. This is not an install issue.

CC'ing valeski because he's probably going to object :-)
Assignee: ssu → dougt
*** Bug 53817 has been marked as a duplicate of this bug. ***

Comment 27

19 years ago
*** Bug 53890 has been marked as a duplicate of this bug. ***

Comment 29

19 years ago
*** Bug 53962 has been marked as a duplicate of this bug. ***

Comment 30

19 years ago
I see this R6016 error with a daily on my Win98 machine.

I don't see a link to the MSDN search here that would give us clues. It is:
Go to http://search.microsoft.com/us/dev/ and type in R6016

somewhat useful:
http://msdn.microsoft.com/library/devprods/vs6/visualc/vccore/r6016.htm

suggested user workaround that did not work for me:
http://support.microsoft.com/support/kb/articles/Q193/9/03.ASP

I can see about getting a debug build on my win98 machine to see if I can get 
further clues. Otherwise I don't have any special insight.

Comment 31

19 years ago
This might be helpful.  I found an article in MSDN6 and MSDN Online, Article ID: 
Q126709:
   PRB: Error on Win32s: R6016 - not enough space for thread data
   http://support.microsoft.com/support/kb/articles/Q126/7/09.asp

Comment 32

19 years ago
Keep in mind, though, that article is concerning an old version of the Win32s
software for Windows 3.x

Comment 33

19 years ago
Yes. That article was one of the three found in the search I showed above. I 
think the other two are more interesting.

I see this behavior in win98 with the debug build too. I also see it with viewer 
and winEmbed. I *don't* see it with xpcshell, testxpc (which also init and exit 
xpcom).

The threadsafety asserts attached by ssu@netscape.com are telling. I 
see them too. taskinfo2000  - http://www.iarsn.com/download.html#TaskInfo - 
shows that at the time of these asserts, and of the crash, there is only one 
thread left running in the process and it is *not* the original main thread. 
This is very odd.

Comment 34

19 years ago
I'm seeing a variation on my win98 pc.

MOZILLA caused an invalid page fault in
module KERNEL32.DLL at 017f:bff9db61.
Registers:
EAX=c00309c4 CS=017f EIP=bff9db61 EFLGS=00010212
EBX=0068ff78 SS=0187 ESP=0058ff4c EBP=005901e8
ECX=00000000 DS=0187 ESI=00000000 FS=68c7
EDX=bff76855 ES=0187 EDI=bff79198 GS=0000
Bytes at CS:EIP:
53 8b 15 e4 9c fc bf 56 89 4d e4 57 89 4d dc 89 
Stack dump:

Also I'm getting the persistent crash window.

Bug is in bin\xpcom.dll. I copied this file from the 0919 build to the 0924 
build. 0924 shuts down normally with the 0919 xpcom.dll file.
(Reporter)

Comment 35

19 years ago
>>Bug is in bin\xpcom.dll. I copied this file from the 0919 build to the 0924
>>build. 0924 shuts down normally with the 0919 xpcom.dll file.

We need to find out who checked in XPCOM stuff after 0919 and before 0921.

Interestingly, if I have the console open, the crash keeps it from closing
and freezes my win98. Adding regression/dogfood keywords to hopefully get
more attention.
Keywords: dogfood, regression
Good idea, Doron. I see only three people checking into XPCOM in that time: 
warren, waterson, and jband.

warren only changed some chrome jar makefile stuff

jband touched xpt error checking, looks safe enough.

waterson *did* touch XPCOM shutdown, including a fix that says "Add memory 
flusher thread." Bingo, I think we have a winner.

Comment 37

19 years ago
I reproduced this also on my Win95 machine at home with 9/23 build

Comment 38

19 years ago
Yes, you are right.  I knew someone was putting in a memory flusher on a timer 
thread.  I just hadn't figured out who it was.

Comment 39

19 years ago
I just tried running debug with waterson's flusher thread disabled using 
 #undef NS_MEMORY_FLUSHER_THREAD
 
I still get all the threadsafe assertions, but not the R6016 error and crash.

I don't think that just disabling the flusher thread is the right thing to do. 
We need to understand why it is doing this cleanup work on some other thread. 
These asserts are warnings we should not ignore.
*** Bug 54045 has been marked as a duplicate of this bug. ***

Updated

19 years ago
Summary: Crash on browser/isntaller exit on win9x → Crash on browser/installer exit on win9x

Comment 41

19 years ago
MS Windows 95 4.00.950a french version.  
NN4 default browser, IE 5 5.50.4134.0600 implemented. 
Same R6016 error and crash on 2000092520 setup. 

Same crash on M18 exit with these details : 

MOZILLA caused an invalid page fault in
module RPCRT4.DLL at 0147:70101e19.
Registers:
EAX=00000000 CS=0147 EIP=70101e19 EFLGS=00010246
EBX=8165ec90 SS=014f ESP=0068fd80 EBP=0068fdb4
ECX=c757ecc4 DS=014f ESI=8165ecd4 FS=3c8f
EDX=c0020ed8 ES=014f EDI=70100000 GS=0000
Bytes at CS:EIP:
ff 70 28 ff 15 78 e0 14 70 c7 05 6c d0 14 70 01 
Stack dump:
00000000 70100000 8165ecd4 8165ec90 0068ff80 00000001 bff74277 0068fd84 0068fbac 
0068ff70 7010a6dc 70146488 ffffffff 0068ff80 bff7b9b5 70100000 

followed by R6016 error and : 

MOZILLA caused an invalid page fault in
module KERNEL32.DLL at 0147:bff9a08c.
Registers:
EAX=0068febc CS=0147 EIP=bff9a08c EFLGS=00000246
EBX=8165ec90 SS=014f ESP=0068feb8 EBP=0068ff0c
ECX=80002f48 DS=014f ESI=00000000 FS=3c8f
EDX=80005f80 ES=014f EDI=780025ff GS=0000
Bytes at CS:EIP:
5e 8b e5 5d c2 10 00 64 a1 00 00 00 00 55 8b ec 
Stack dump:
78037130 c0000005 00000000 00000000 bff9a08c 00000000 5c3a4520 474f5250 204d4152 
454c4946 45535c53 4e4f4d41 5c59454b 495a4f4d 2e414c4c 0a455845 

Comment 42

19 years ago
pulling off of dougt's plate.
Assignee: dougt → valeski

Comment 43

19 years ago
reassigning to waterson (per jband's comments) and raising to nsbeta3++, we need
this fixed on the branch.
Assignee: valeski → waterson
Whiteboard: [nsbeta3+] → [nsbeta3++]

Comment 44

19 years ago
Still seeing this bug on 2000092508, Win98 with IE5.5 installed, on all closes

Message:

Visual C++ Runtime Library

-R6016
Not enough space for thread data

(Assignee)

Updated

19 years ago
Status: NEW → ASSIGNED

Comment 45

19 years ago
Recently upgraded my work box (win98) from an earlier m18 nightly to the
25/09/00 vers: fine! wonderful! scrumdiddlydumptious!
Installed the Green & Black skin: no wukkers!
Installed the latest Aphrodite nightly (from
http://aphrodite.mozdev.org/installation.html): BOOM!
NOW I'm getting the same MS VisC++ RL "R6016 - not enough space for thread data"
Runtime Error! every time I close Mozilla.
Tried removing the obvious Mozilla related bits and doing a clean reinstall, but
it's still there so I must have missed something (either that or the VisC++
library has been screwed over).
Anybody else seeing an Aphrodite install as a trigger for this bug?

Comment 46

19 years ago
This bug doesn't exist in 2000091908, but appeared in 2000092008 and has
persisted through to current (2000092608). The bug occurs in all 3 windows
binary builds under the original 95 through to 98 SE.
Having neither a native compiler nor a cross-compiler for windows, however, I
haven't been able to test builds from source.
(Reporter)

Comment 47

19 years ago
there is no need to report "still seeing this in build x" and such, this is 100%
reproducible on win9x.

Comment 48

19 years ago
OK, so we won't say it still exists.  The question, however, is when is someone 
likely to FIX it?  This is a week of being broken . . . 

Beker@cnpr.org
(Assignee)

Comment 49

19 years ago
Created attachment 15697 [details] [diff] [review]
proposed fix
(Assignee)

Comment 50

19 years ago
The above patch fixes one problem, which is factoring out the "startup" and
"shutdown" of the memory service from its creation and destruction. Turns out
nsMemoryImpl is created well before XPCOM initialization, and is re-created
after XPCOM shutdown (doing memory management for nsString's, both times). This
was causing the memory flusher thread to be *re-created* after XPCOM shutdown;
certainly not something I expected to happen!

With this patch, XPCOM startup calls nsMemoryImpl::Startup(), which'll start the
memory flusher thread. As before XPCOM shutdown calls nsMemoryImpl::Shutdown()
to spin down the memory flusher thread. But, I made XPCOM shutdown call
nsMemoryImpl::Shutdown() *before* calling nsThread::Shutdown(). (Since
nsMemoryImpl's Shutdown is doing thread tinkering.)
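
To illustrate the split described above, here is a minimal C++ sketch that uses standard threads rather than NSPR. MemoryService, ThreadSubsystem, InitLibrary and ShutdownLibrary are hypothetical names standing in for nsMemoryImpl and the XPCOM init and shutdown paths; this is not the code in the attached patch, only the shape of the idea: construction never starts the flusher, and shutdown stops it before the thread subsystem goes away.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>

// Hypothetical stand-in for the memory service. Constructing it never starts
// a thread, so creating one before init or after shutdown is harmless.
class MemoryService {
public:
    MemoryService() = default;
    ~MemoryService() { Shutdown(); }  // safety net if Shutdown() was skipped

    // Assumed to be called only from the thread that initializes the library.
    void Startup() {
        if (mFlusher.joinable()) return;  // already started
        mStop.store(false);
        mFlusher = std::thread([this] {
            while (!mStop.load()) {
                // A real flusher would release cached memory here.
                std::this_thread::sleep_for(std::chrono::milliseconds(5));
            }
        });
    }

    // Assumed to be called from the same thread as Startup().
    void Shutdown() {
        if (!mFlusher.joinable()) return;  // never started, or already stopped
        mStop.store(true);
        mFlusher.join();  // the flusher is gone before we return
    }

    bool IsRunning() const { return mFlusher.joinable(); }

private:
    std::atomic<bool> mStop{false};
    std::thread mFlusher;
};

// Hypothetical placeholder for the thread subsystem torn down at shutdown.
struct ThreadSubsystem {
    bool alive = true;
};

void InitLibrary(MemoryService& memory) { memory.Startup(); }

// Stop the flusher first, while the thread subsystem it relies on still exists.
void ShutdownLibrary(MemoryService& memory, ThreadSubsystem& threads) {
    memory.Shutdown();
    assert(threads.alive);
    threads.alive = false;
}
```

With this shape, a MemoryService that gets constructed again after ShutdownLibrary() (as happened with nsMemoryImpl for string allocation) stays idle, because nothing calls Startup() on it.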

With this patch, I still see the assertions on exit. Here's what appears to be
happening with those: the last thread to exit the app appears to be the Winsock
thread (the code is from WS2_32.DLL), and apparently since it's the last thread,
it gets to run all the app's static dtors. The half a dozen or so static
nsCOMPtr's being clobbered on this thread (static nsCOMPtr's are a no-no,
remember?) are each asserting because the objects that they hold were created on
the main thread, not the Winsock(?) thread.

Why the existence of the memory flusher thread affects whether or not Winsock(?)
exits properly is beyond me. Maybe there is some funky startup ordering problem?
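
The assertions described above can be pictured with a small C++ sketch. OwnedObject and DestroyedOnOwningThread are hypothetical names; the real checks are XPCOM's thread-safety assertions, which this does not reproduce. The sketch only shows why an object created on one thread and released on another trips an owning-thread check, which is what happens when static smart pointers are destroyed on whichever thread ends up running the static destructors.

```cpp
#include <cassert>
#include <memory>
#include <thread>

// Hypothetical object that remembers which thread created it and, on
// destruction, records whether it is being destroyed on that same thread.
class OwnedObject {
public:
    explicit OwnedObject(bool* releasedOnOwner)
        : mOwner(std::this_thread::get_id()),
          mReleasedOnOwner(releasedOnOwner) {}

    ~OwnedObject() {
        // A debug build would assert here instead of recording a flag.
        *mReleasedOnOwner = (std::this_thread::get_id() == mOwner);
    }

private:
    std::thread::id mOwner;
    bool* mReleasedOnOwner;
};

// Creates an object on the calling thread, destroys it either on the same
// thread or on a second one, and reports what the destructor observed.
bool DestroyedOnOwningThread(bool destroyOnOtherThread) {
    bool releasedOnOwner = false;
    std::unique_ptr<OwnedObject> obj(new OwnedObject(&releasedOnOwner));
    if (destroyOnOtherThread) {
        std::thread other([&obj] { obj.reset(); });
        other.join();  // join makes the flag write visible to this thread
    } else {
        obj.reset();
    }
    return releasedOnOwner;
}
```

DestroyedOnOwningThread(false) reports true and DestroyedOnOwningThread(true) reports false; the second case corresponds to the assertions seen on exit.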
(Assignee)

Comment 51

19 years ago
cc'ing wtc & rpotts, who may have insight into what WS2_32.DLL is. Also cc'ing
warren for some r= on the proposed fix.
(Assignee)

Updated

19 years ago
Whiteboard: [nsbeta3++] → [nsbeta3++] FIX IN HAND

Comment 52

19 years ago
*** Bug 54429 has been marked as a duplicate of this bug. ***
(Reporter)

Comment 53

19 years ago
approval keyword.
Keywords: approval

Comment 54

19 years ago
*** Bug 54484 has been marked as a duplicate of this bug. ***
(Reporter)

Comment 55

19 years ago
*** Bug 54487 has been marked as a duplicate of this bug. ***

(Assignee)

Comment 57

19 years ago
Extremely informative email from wan-teh...recording for posterity's sake.

Chris Waterson wrote:

> Hey, if you get a chance, could you look at my last couple of comments
> for bug 53353? I think I've got a fix for the bug, but we're still
> assert-botching like crazy on exit. It appears that what's happening
> is that a bunch of static dtors are running on a thread other than the
> main thread. This is befuddling to me. First, why would that happen?
> Is the "last DLL to exit" the lucky winner that gets to run the static
> dtors?

I don't know why the static dtors are running on
a thread other than the main thread.  My experience
is that as soon as the main() function returns,
all the other threads get instant death.  At least
it appears to be that way.  However, if your main()
function calls _endthreadex() or ExitThread(), the
main thread terminates but the process does not
terminate until all the other threads terminate.

Does Mozilla's main() function call _endthreadex()
or ExitThread()?

Do you need the assumption that the static dtors
are running on the main thread?


> Second, why would the WS2_32.DLL's thread (Winsock?) be
> lingering after shutdown? Are we failing to clean up winsock
> correctly?

PR_Cleanup() calls WSACleanup(), but PR_Cleanup() is
a dangerous function to call in a program as complicated
as Mozilla.  (I will spare you the details.)  So very
likely Mozilla is not calling WSACleanup().

Wan-Teh

Comment 58

19 years ago
Well, as far as I can tell, the patch does exactly what you say it does in your 
comment on the 27th; and that sounds like the right thing to me.  Naturally, I'm 
concerned about these other threads continuing to live beyond the main thread.  
_That_ seems wrong.  Was this the case before any of the memory flusher 
modifications?  Is this something to worry about?  Have we filed a separate bug 
to get rid of the static |nsCOMPtr|s?

If these other problems warrant a separate bug, and/or are of significantly less 
importance than this bug --- and I believe they are --- then r=scc on this last 
patch (09/27/00 20:16 'proposed fix').

Comment 59

19 years ago
I recall hearing about this, and I think that the crash is ugly, but has no evil
side effects (i.e., the install works).  IF the crash blocks the install, then
this is a dogfood-plus.
Believing this is just an ugly crash (and noting we want it for beta3), I just
don't think it would stop internal folks from using the product.
marking dogfood-minus

IS this going to land today on the branch? I think we're now on our final respin
plan for Beta3 on friday AM.
Whiteboard: [nsbeta3++] FIX IN HAND → [nsbeta3++][dogfood-] FIX IN HAND
(Assignee)

Comment 60

19 years ago
Created attachment 15777 [details] [diff] [review]
patch, v2, after sr from warren
(Assignee)

Comment 61

19 years ago
Some minor cleanup per warren's suggestions: fix race condition with Stop() and
testing mRunning; use nsAutoLock to detect deadlocks. Re-testing on Win98 now...
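
The comment does not spell out the race, so the following is only a guess at its general shape, sketched in standard C++. Flusher, Start, Stop and mPasses are hypothetical names, and std::lock_guard plays the role that a scoped lock such as nsAutoLock would. The point shown is that the worker tests mRunning and goes to sleep under the same lock that Stop() takes to clear the flag, so a stop request cannot slip in between the test and the wait.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical flusher. Start() and Stop() are assumed to be called from one
// controlling thread; the lock protects state shared with the worker.
class Flusher {
public:
    ~Flusher() { Stop(); }

    void Start() {
        std::lock_guard<std::mutex> guard(mLock);  // scoped lock, released on return
        if (mRunning) return;
        mRunning = true;
        mThread = std::thread([this] { Run(); });
    }

    void Stop() {
        {
            std::lock_guard<std::mutex> guard(mLock);
            if (!mRunning) return;
            mRunning = false;  // cleared under the lock the worker waits on
        }
        mWake.notify_one();
        mThread.join();
    }

    bool IsRunning() {
        std::lock_guard<std::mutex> guard(mLock);
        return mRunning;
    }

private:
    void Run() {
        std::unique_lock<std::mutex> lock(mLock);
        while (mRunning) {
            ++mPasses;  // a real flusher would release memory here
            // The predicate is evaluated with the lock held, so a Stop() that
            // ran just before this call is seen immediately, not after 60 s.
            mWake.wait_for(lock, std::chrono::seconds(60),
                           [this] { return !mRunning; });
        }
    }

    std::mutex mLock;
    std::condition_variable mWake;
    bool mRunning = false;
    long mPasses = 0;
    std::thread mThread;
};
```

If the flag were tested without the lock, Stop() could clear it and signal after the worker's test but before its wait, leaving the worker asleep until the timeout and the joining thread blocked for just as long.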
(Assignee)

Comment 62

19 years ago
fix checked in, tip & branch.
Status: ASSIGNED → RESOLVED
Last Resolved: 19 years ago
Resolution: --- → FIXED
Adding myself, roberts, and tpringle to cc: list.
(Reporter)

Comment 64

19 years ago
*** Bug 54654 has been marked as a duplicate of this bug. ***
(Reporter)

Comment 65

19 years ago
*** Bug 54609 has been marked as a duplicate of this bug. ***

Comment 66

19 years ago
*** Bug 54653 has been marked as a duplicate of this bug. ***

Comment 67

19 years ago
Still crashes in build 92808, Win98SE
verified fixed on windows build 2000-09-29-08-M18
Status: RESOLVED → VERIFIED
(Reporter)

Comment 69

19 years ago
I still see this with branch build 2000092908 win98. However, other people who
have seen it say it is gone (win98se only though).  Is anyone still seeing this
other than me? Not going to reopen yet

Comment 70

19 years ago
WFM with branch build 092908 on Win 98SE
(Reporter)

Comment 71

19 years ago
looks like my computer was acting up, after a reboot, installing the new build
fixes this.  way to go warren!

Comment 72

19 years ago
*** Bug 7799 has been marked as a duplicate of this bug. ***

Comment 74

19 years ago
As the designated techno-laggard for install testing ... this does not crash on 
2000-09-29-08-MN6 with Win 95 Debut (the original 95).
Thanks, Chris.

Comment 75

19 years ago
I am still seeing this on 2000100108 Win98.
Status: VERIFIED → REOPENED
Resolution: FIXED → ---

Comment 76

19 years ago
After restarting this is no longer visible. Sorry for the spam.
Status: REOPENED → RESOLVED
Last Resolved: 19 years ago
Resolution: --- → FIXED

Comment 77

19 years ago
And spam #3.
Status: RESOLVED → VERIFIED