Closed
Bug 160602
Opened 22 years ago
Closed 22 years ago
Large integers, e.g. getTime(), causing crash at 0x39393929
Categories
(Core :: JavaScript Engine, defect)
VERIFIED
FIXED
People
(Reporter: mike.campbell, Assigned: khanson)
Details
(Keywords: crash, Whiteboard: [Windows-only] [Related: bug 140852? ] fixed1.3)
Attachments
(5 files, 2 obsolete files)
13.47 KB, image/jpeg
667 bytes, patch (khanson: review+, brendan: superreview+)
669 bytes, patch
54.48 KB, application/octet-stream
23.64 KB, application/x-zip-compressed
STEPS TO REPRODUCE
1. Load http://www.oracle.com
2. Crash!
Note that the crash does not occur every time.
Sometimes the page loads, other times the browser (and mail) crashes with a Windows exception violation error, and other times it crashes with no Windows error.
Reporter | ||
Comment 1•22 years ago
|
||
Note that this has failed with the Mozilla 1.1 beta as well as the nightly build labeled 2002073004
Comment 2•22 years ago
|
||
wfm with win2k build 20020730..
Reporter:
Which build and which Flash version do you use?
Can you please use a Talkback-enabled build and send a Talkback report?
After Talkback has sent the report, run mozilla/components/talkback.exe and add the
Talkback ID from the crash report to this bug.
Thanks
Severity: normal → critical
Keywords: crash,
stackwanted
Comment 3•22 years ago
|
||
Mike has sent me three Talkback IDs: TB8669796Y
TB8779296W
TB86957678W
The first two contained no stack traces! Looks like whatever
caused the crash also prevented the Talkback code from working.
The third one I could not locate in the Talkback database,
and it looks like there is a typo in the incident number,
as it is one digit longer than the others.
I, too, have been unable to duplicate this crash -
Updated•22 years ago
|
Reporter | ||
Comment 4•22 years ago
|
||
Note that this problem ONLY occurs if javascript is enabled in the browser. Disabling javascript prevents the crash but the page does not load everything.
Shockwave Flash 6.0 r40
Reporter | ||
Comment 5•22 years ago
|
||
Note that the 3rd Talkback session is TB8695678W
Comment 6•22 years ago
|
||
Reporter:
Can you please close Mozilla, temporarily remove "flashplayer.xpt" from your
plugins directory, and try again?
If you still crash, remove "npswf32.dll".
Reporter | ||
Comment 7•22 years ago
|
||
There was no flashplayer.xpt (nor any flashplayer.* files anywhere) that I could remove. I did remove the "npswf32.dll" file but the crash still occurred.
The current files in the plugins directory are:
NPSWF32.dll nprdx5.dll nprpverplug.dll
npnul32.dll nprjplug.dll pnmi3260.dll
Note that the "npswf32.dll" was restored after it did not resolve the problem.
Comment 8•22 years ago
|
||
Note: I looked up that 3rd talkback session (TB8695678W).
Unfortunately, once again there was no stack trace recorded!
Here are some facts from the report, in case it helps:
Processor:
  Vendor        Type     Speed    Features
  GenuineIntel  Pentium  598 MHz  MMX
Operating System: Windows NT 5.0 build 2195 Service Pack: Service Pack 2
Physical Memory: 384.0 MB
Memory Status:     Available   Total
  Physical Memory: 169.1 MB    384.0 MB
  Page File:       687.9 MB    921.9 MB
  Virtual Memory:  2028.9 MB   2047.9 MB
Mounted Drive Information:
  Drive  Type       Size        Free       File System
  A:     Removable  -           -          -
  C:     Fixed      19571.3 MB  8232.0 MB  NTFS
  D:     CD-ROM     -           -          CDFS
Network Card: 3Com 3C918 Integrated Fast Ethernet Controller (3C905B-TX Compatible)
Screen Configuration: 1024 x 768, 24 bits per pixel. 75 Hz. 8388608 Bytes
Comment 9•22 years ago
|
||
I still can't reproduce this with a recent build and 1.1b.
Do you see this crash when you open the main page, or when you click links on the page?
Have you installed Mozilla in a clean directory or over an older build?
I thought this was a Flash problem, since Talkback sometimes can't catch
crashes in a plugin (Java, Flash, ...).
Reporter | ||
Comment 10•22 years ago
|
||
A new note. I can also duplicate the crash if I visit the url http://vh1.com. I get the same browser crash.
A new talkback session has been uploaded with ID TB8993332Z with the capture of the crash from visiting vh1.com.
Note that the current version of mozilla that I am using is 2002072104.
Comment 11•22 years ago
|
||
Mike: thanks again! Unfortunately, this latest incident (TB8993332Z)
also comes up with your machine data but no stack trace!!!
Meanwhile, I'm looking at http://www.oracle.com and http://www.vh1.com
to see if there is any common principle involved between the two sites -
Comment 12•22 years ago
|
||
cc'ing Amar: are you able to crash on either http://www.oracle.com
or http://www.vh1.com with your Win2K box?
All you have to do is load each site, no other actions are required.
We were wondering if it's a plug-ins problem, but we aren't sure.
Thanks -
Comment 13•22 years ago
|
||
Neither http://www.oracle.com nor http://www.vh1.com crashes for
me on WIN2K, with either the 2002-08-01-08-1.0 (branch) or the 2002-08-01-08-trunk build.
Reporter | ||
Comment 14•22 years ago
|
||
Another site that seems to demonstrate the same behavior on my machine is www.cnet.com. It crashes the browser just about every time I visit the page.
Comment 15•22 years ago
|
||
Mike: thanks. I can't remember if we've tried a new Mozilla profile yet.
You can always bring up the Profile Manager (if it doesn't already come
up automatically) by launching Mozilla from a console:
[(path to Mozilla)] ./mozilla -profilemanager
I wonder if the crash goes away by running under a brand-new profile.
Sometimes old profiles get corrupted. It's worth a try -
Also: do you have access to any other Win2K machine? Does
Mozilla crash on that machine on these sites, too?
Finally: you mentioned that the problem seemed to go away for awhile.
Is it possible that there is some background application that is
running when the crashes occur? Like an audio program of some sort?
And when that app is not running in the background, no problem?
Reporter | ||
Comment 16•22 years ago
|
||
I do not have another windows machine to test this on but did try it from a linux box. The browser does NOT crash from linux.
I tried a new profile from the windows box but it still crashed so I don't think it is related to the profile.
I don't know of anything that is or is not running when the crash occurs that is different from other times. In general I run the same stuff all day.
From what I have noticed it seems the crash will occur immediately after starting up the browser. However, if I do some other work and then visit the problem sites (maybe 30-60 minutes later) the crash does not occur. Go figure.
Comment 17•22 years ago
|
||
cc'ing dbradley for advice. Note Mike's comment above:
> From what I have noticed it seems the crash will occur immediately after
> starting up the browser. However, if I do some other work and then visit the
> problem sites (maybe 30-60 minutes later) the crash does not occur. Go figure.
Have you ever heard of this type of behavior? Is it possible that
some XPCOM components get lazily registered, and this process is
not working correctly?
Note the crashes Mike has experienced are very hard: no stack trace
has been preserved in Talkback for any of them.
Reporter | ||
Comment 18•22 years ago
|
||
Some additional configuration info I'll add. I didn't mention it before as it should not affect anything but just to make sure:
1 - My internet connection goes through a Redcreek Ravlin II VPN hardware box and a linksys router.
2 - I also run the program Proxomitron on this win2k machine but have disabled its use and the browser still crashes.
Comment 19•22 years ago
|
||
This is an odd one. I bounced between VH1 and Oracle and had no problems, even
ran it under Purify. The only thing I saw in common was that both sites had
references to JS files. I didn't see anything else that appeared to be common
between the two sites.
The only question I can think to ask, is when it crashes, does the pop-up give
you a DLL name?
Maybe try clearing your cache.
Lastly a grasp at a straw, try deleting the xpti.dat file in the components
directory.
Reporter | ||
Comment 20•22 years ago
|
||
Reporter | ||
Comment 21•22 years ago
|
||
No luck with the other suggestions. I cleared the cache (both memory and disk) and deleted the xpti.dat file. After restarting mozilla the xpti.dat file was recreated but the crash still occurred when visiting oracle.com.
Note that in the uploaded error the instruction is always the same when the browser crashes. The 0x77f83cab may not mean anything but it is the instruction always listed.
Note also that I don't have to click ok to terminate the program. It has already died. I just press ok to get rid of the dialog.
Comment 22•22 years ago
|
||
The only interesting thing I see in the error message is that the address
0x39393929 represents a string "999)". No Mozilla string contains this, though
and I don't see it in the HTML of any of the sites in question.
Do you know if that number (the second one in the error dialog) changes?
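The observation that 0x39393929 reads as "999)" can be checked by decoding the address one byte at a time. The sketch below is illustrative only; `dwordToAscii` is a hypothetical helper, not anything from Mozilla. It reads the bytes most-significant first, which is how the hex value is written; in little-endian x86 memory the same bytes would sit in the reverse order.

```javascript
// Split a 32-bit value into four bytes (most significant first)
// and interpret each byte as an ASCII character.
function dwordToAscii(value) {
  const bytes = [24, 16, 8, 0].map((shift) => (value >>> shift) & 0xff);
  return String.fromCharCode(...bytes);
}

console.log(dwordToAscii(0x39393929)); // prints 999)
```

An "address" made entirely of printable digit characters is a common sign that text has been written over a pointer, which fits the suspicion that a string is involved.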
Reporter | ||
Comment 23•22 years ago
|
||
Looks like the address of memory is always 0x39393929. I tried it several times and it always showed the same address no matter what site I visit.
Comment 24•22 years ago
|
||
*** Bug 161574 has been marked as a duplicate of this bug. ***
Comment 25•22 years ago
|
||
From the duped bug, slashdot.org is having the same problem.
Comment 26•22 years ago
|
||
Confirming bug: thanks to Mike's patience, and to dbradley, we've finally
found a pattern! From the duplicate bug report:
> When trying to get to slashdot.org (and some other sites but slashdot is
> best example) Mozilla immediately closes (sometimes with "instruction at
> 0x77f83cab referenced memory at 0x39393929 <- this value is suspect)
> - Talkback doesn't get activated. It's not always - it's pretty random but
> frequent (like 30%).
> It started some time after 1.1 branch.
Note the other bug report was also made from a Windows 2000 box -
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 27•22 years ago
|
||
Hi, I reported slashdot.org as having the same problems. A few additional notes:
- I have the same problem on 2 win2k boxes (on different connections,
one is Duron/Soltek and the other is genuine Dell/Pentium).
- I downloaded the site (with wget -p) but couldn't reproduce the error locally (I
tried several times)
- It's always 0x39393929 - lots of such strings 999) are in the history file (and
others) - I cleaned the history (and all other files that could be regenerated)
but it didn't help.
Comment 28•22 years ago
|
||
Have you installed Mozilla in a clean directory or over an older build?
Do you get a stack trace (from drwatson) if you start c:\winnt\drwatson.exe?
I'll provide an optimized build with symbols if that works.
Comment 29•22 years ago
|
||
See http://www.computerhope.com/software/drwatson.htm for how Dr. Watson works and
how to find the log.
Comment 30•22 years ago
|
||
I removed the old build before installing the new one; now
I have 2002080614. I know how drwtsn works but
I guess it doesn't catch these crashes. Anyway
I tried to reproduce the problem for the last 45 minutes
without luck :) - I'll try again tomorrow (it's
2:30 am here).
Comment 31•22 years ago
|
||
Ok, my mozilla crashed on
http://www.mbank.com.pl
but it doesn't generate a crash dump or log.
I know there are two types of crash msgboxes on windows:
one has a yellow "warning" sign saying "...saving log"
(that's dr.watson's) and the other has a red "stop" sign
- and that's the one we see when mozilla crashes.
Comment 32•22 years ago
|
||
Can the people who are able to "reproduce" this crash report:
1. type of video card.
2. Windows version. Win2k sp?
3. Is QuickLaunch turned on?
4. Anything else you can think of ;-)
Just trying to figure out what's common.
Reporter | ||
Comment 33•22 years ago
|
||
My video card shows as an ATI 3D Rage Pro AGP 2x
Windows 2000 sp2 and now sp3 (crashes occur in both)
quicklaunch is not enabled
Comment 34•22 years ago
|
||
1. type of video card.
GeForce 2MX, and another one in the Dell system
2. Windows version. Win2k sp?
win2000 sp2 + some pre-sp3 hotfixes
win2000 sp3
3. Is QuickLaunch turned on?
both on and off
Comment 35•22 years ago
|
||
Other note:
I use the same profile (copied) on both boxes
BUT I checked with a new clean profile and
a new installation of mozilla and it still crashes.
Is there any other place that mozilla uses to keep data
(apart from the profile directory)?
registry? %WINDIR%? %USERPROFILE%?
Comment 36•22 years ago
|
||
cc'ing jrgm in case he knows the answer to that -
Comment 37•22 years ago
|
||
Mozilla stores data in:
c:\programs\mozilla.org
C:\Documents and Settings\%username%\Application Data\Mozilla\
and only some minor (!) things in the registry
(Windows integration = open HTML files with Mozilla).
Mozilla always searches for installed plugins on your system, but
there is no additional data stored by Mozilla.
Comment 38•22 years ago
|
||
Matti's answer is correct (with the slight addendum that the location of the
profile is a bit different on win95/98, but we're talking win2k here so
%USERPROFILE%\Application Data\Mozilla is the target).
The stuff in the win32 registry is relatively minor. It wouldn't be my first
concern in looking at this crash.
Comment 39•22 years ago
|
||
As Mike suggested:
> Note that this problem ONLY occurs if javascript is enabled in the browser.
> Disabling javascript prevents the crash but the page does not load everything.
Today I turned off javascript and browsed many pages - and Mozilla didn't
crash - so it seems the bug IS javascript related.
I'll continue this test tomorrow...
Comment 40•22 years ago
|
||
After another day of javascript-off browsing it seems to be true -
something's wrong with js. It worked well the whole day until
I turned js on (I wanted to submit one form) - it crashed immediately.
Comment 41•22 years ago
|
||
Browsing without JS turned on doesn't prove anything. The browser
has a huge number of modules. Imagine this type of picture:
Module X ---> JS Module ---> Module Y ---> Module Z ---> Module W ---> ...
                  ^                                         ^
                  |                                         |
                  |                                         |
          user shuts off JS                   but the bug is actually here
Sure, shutting off JS will stop the bug, because the bug is farther
down the chain of interdependencies. That doesn't mean there is a
bug in JS !!! There conceivably could be; but not much can be learned
by turning it off; it's too low-level in the browser to tell.
Comment 42•22 years ago
|
||
> - I downloaded the site (with wget -p) but couldn't reproduce the error locally
> (I tried several times)
this seems like a clue, but it's hard to know what it means.
but you might download just the html and add a <base href="___"> to the <head>
-----------
<html>
<head>
<base href="http://www.oracle.com/">
...
-----------
that would get Mozilla to load everything but the html from the network (which
might be part of the problem). if so, you could then prune the html page down
to a simpler page that crashes (which would be very helpful!)
there could also be something happening with the cache. loading locally won't
hit the cache.
Also, since disabling Javascript prevents the crash, you might try stepping
through the javascript in the Javascript debugger to see what the Javascript is
doing when it dies.
Comment 43•22 years ago
|
||
Ok, I have something more specific:
It crashes ONLY when file is open via http connection (not file://)
I created following file
<html>
<head>
<base href="http://www.infopoll.com">
<script language="javascript" src="/live/infopoll.js"></script>
</head>
<body><h1>hello</h1></body>
</html>
and put it on a local Apache server - now every access makes mozilla
crash.
Another thing - it seems to crash more frequently when started
with an empty cache.
Comment 44•22 years ago
|
||
Sorry - I was wrong
It crashes mozilla w/o a server connection (just file open)
- seems there's something strange in this .js file
I'll try to find the evil line :)
Comment 45•22 years ago
|
||
Ok, finally got it - the following piece of code crashes mozilla:
<script language="javascript">
var expdate = new Date();
var base = new Date(0);
expdate.setTime (expdate.getTime() + (24 * 60 * 60 * 1000));
</script>
hope it will help you fix the bug :)
Comment 46•22 years ago
|
||
this looks like a dupe of bug 140544.
are you using Macro Express or JS Virtual Pager?
Comment 47•22 years ago
|
||
Yes, it seems to be a dupe BUT I don't run
either of those 2 tools. As I said before I experienced
this on two boxes so I'll list applications that I have on both:
- Trillian
- TClockEx (it may be suspect, I'll check after I post this)
- Tiny Personal Firewall
- Edit+
- IrfanView
- Windows Commander
- MS Office XP
- other common applications (I don't think it's the case) like
winamp or getright
Anyway, in 140544 there's a short piece of JS code
doing something with Date/time (like my example does).
Does it mean anything?
Comment 48•22 years ago
|
||
Marcin: thank you for finding a reduced testcase for this bug!
The JS code from bug 140544 is indeed very similar:
<SCRIPT language="javascript">
var now = new Date();
var tail = now.getTime();
document.write(tail);
</SCRIPT>
whereas your testcase is
<SCRIPT language="javascript">
var expdate = new Date();
var base = new Date(0);
expdate.setTime (expdate.getTime() + (24 * 60 * 60 * 1000));
</SCRIPT>
Mike:
1. Do you crash on either of these examples?
2. What applications besides Mozilla are running when you crash?
3. Any chance you are using Macro Express or JS Virtual Pager (cf. bug 140544)?
Comment 49•22 years ago
|
||
Note the reduced testcases from bug 140544 and this bug both
involve the .setTime() method of Date objects. This returns a
large integer (the number of milliseconds since 1970-01-01 GMT).
For example, here is what I get right now:
(new Date()).getTime() ---> 1029166874593
We have an open bug in JS Engine on numbers like this. The bug
shows up on the Windows OS only, and in optimized builds only;
very similar to what has been reported here and in bug 140544:
bug 140852
"String(819187200000) == '8191871:0000' in xpcshell, browser"
I'm wondering if the issue there, though no crash is mentioned,
might be related to the crashes that are occurring here. Note the
only stack trace we've been able to get so far is from bug 140544:
------- Additional Comment_ #15 From Hal Black 2002-05-09 14:50 -------
Yes, here is the stacktrace:
NTDLL! 77f83b27()
NTDLL! 77f83bae()
NTDLL! 77f82f0b()
XPCOM! 10033c78()
03670488()
XPCOM! 1000b862()
f18b5608()
Whiteboard: [Duplicate of bug 140544? ]
Comment 50•22 years ago
|
||
Typo above: I meant to say |getTime|, not |setTime|.
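To make "large integer" concrete: a getTime() result of this size is far outside the 32-bit integer range, yet still exactly representable as an IEEE double. The sketch below uses the value quoted above; the helper name `describeTimestamp` is made up for illustration.

```javascript
// Summarize how a millisecond timestamp relates to common numeric limits.
function describeTimestamp(ms) {
  return {
    fitsInInt32: ms <= 0x7fffffff,           // signed 32-bit maximum
    exactInDouble: Number.isSafeInteger(ms), // |ms| <= 2^53 - 1
    decimalDigits: String(ms).length,
  };
}

// fitsInInt32: false, exactInDouble: true, decimalDigits: 13
console.log(describeTimestamp(1029166874593));
```

So a conversion of this value to text has to go through the engine's floating-point formatting path rather than a simple 32-bit integer path.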
Comment 51•22 years ago
|
||
cc'ing rogerl, khanson to ask: could the problem in bug 140852,
"String(819187200000) == '8191871:0000' in xpcshell, browser",
cause a crash? I suppose the thing to do now is run the above
testcases under Purify on Win2K -
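One way to read the garbled string from bug 140852, offered as an assumption rather than something established at this point in the report: ':' is the character immediately after '9' in ASCII, so it is what a digit loop of the form '0' + digit would emit if a "digit" came out as 10. Under that reading, '8191871:0000' carries the same value as '819187200000', with one digit too small and the next one too large. The helper names below are hypothetical.

```javascript
// Map each character back to the "digit" it would encode as '0' + digit.
// A well-formed decimal string only yields 0..9; ':' yields 10.
function digitValues(text) {
  const zero = "0".charCodeAt(0);
  return Array.from(text, (ch) => ch.charCodeAt(0) - zero);
}

// Evaluate the digits positionally, letting an oversized digit carry over.
function valueWithCarries(text) {
  return digitValues(text).reduce((acc, digit) => acc * 10 + digit, 0);
}

console.log(digitValues("8191871:0000"));
console.log(valueWithCarries("8191871:0000") === 819187200000); // true
```

If digit generation can produce an out-of-range digit like this, it is at least plausible that the same code can also emit more characters than the caller expected.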
Comment 52•22 years ago
|
||
I ran the example from comment 45 under Purify and didn't have any problems.
Unfortunately my build was compiled with /O2 and not /O1. I'll try again after
getting it built using /O1. I would have expected that not to make that much of
a difference, since I think /O1 is a subset of /O2 IIRC.
Reporter | ||
Comment 53•22 years ago
|
||
Both of the sample JS code snippets will cause my browser to crash with the same invalid memory referenced errors as the web pages that I've reported. Looks like you may be on to something here.
Note that I do not have Macro Express or JS Virtual Pager running but I do have another pager called eDesk running. However, I have tested with eDesk shutdown and the crash still occurs.
Comment 54•22 years ago
|
||
I can confirm bug 140852 on my box; I played with stuff like this:
var expdate = new Date(100000000001);
alert(expdate.getTime());
and got:
100000000000.:
but it does NOT crash Mozilla - I played with values for a while; for
huge numbers I got NaN (expected result), and sometimes the result contains
garbage (like the above example) - but I got no crash.
So, maybe the Date() constructor is guilty?
Comment 55•22 years ago
|
||
I found something more:
var expdate = new Date();
alert(expdate.getFullYear());
gives
34583
hm... I took the red pill...
Comment 56•22 years ago
|
||
I forgot to mention that
var expdate = new Date(1029185125000); // the number is my box's timestamp*1000
alert(expdate.getFullYear());
gives the correct answer (2002); again - something's wrong with Date()
Comment 57•22 years ago
|
||
Marcin: does this javascript:URL give you 34583 ???
javascript: alert((new Date()).getFullYear());
(Note: if anyone on the bug is unfamiliar with javascript:URLs,
they are entered in the URL bar just like an http:URL)
Comment 58•22 years ago
|
||
Now it stopped working this way and shows 2002
(as mentioned before, the crashes also do not happen
every time).
But
javascript: alert((new Date()).getTime());
crashed Mozilla.
I'll try again to get the year wrong...
Comment 59•22 years ago
|
||
Mike and Marcin: thanks for testing this again! Will Mozilla still
crash for you if you remove the alert() and just do this?
javascript: (new Date()).getTime();
Comment 60•22 years ago
|
||
yes, it closes immediately.
Comment 61•22 years ago
|
||
Reassigning to JS Engine and cc'ing Brendan on this crasher.
I have never been able to reproduce this, and the contributors'
Talkback reports are all devoid of stack traces.
The same has been reported in bug 140544. I will continue to try
to reproduce this and see if I can get a debug stack trace -
Note we have an open bug in JS Engine for large integers, which is
what (new Date()).getTime() will produce. The bug shows up on the
Windows OS only, and in optimized builds only; very similar to what
has been reported here and in bug 140544:
bug 140852
"String(819187200000) == '8191871:0000' in xpcshell, browser"
Assignee: Matti → khanson
Component: Browser-General → JavaScript Engine
QA Contact: asa → pschwartau
Comment 62•22 years ago
|
||
*** Bug 140544 has been marked as a duplicate of this bug. ***
Comment 63•22 years ago
|
||
cc'ing contributors from the duplicate bug 140544. Note they report
their crashes depend on other software running in the background:
the JS Pager virtual desktop (http://hem.fyristorg.com/jspage/),
or Macro Express 3.
Contributors here find these apps are not necessary to crash (or are
these running as hidden processes, perhaps?) On Windows this can be
investigated by doing
Task Manager > Processes > (alphabetize the process names)
Meanwhile, Hal has come up with a binary stack trace,
and is currently trying to get a debug stack trace:
------- Additional Comment_ #29 From Hal Black 2002-08-12 18:20 -------
"Yes, same crash...
Note, this is only with JS pager on that I get this crash.
Otherwise, no problems.
Here's the stacktrace I get when running with build 2002071608
and using that javascript URL:
javascript: (new Date()).getTime();
NTDLL! 77f8e59a()
NTDLL! 77f8edc6()
NTDLL! 77f848a5()
XPCOM! 60eb1928()
02ba30c8()
XPCOM! 60e89a75()
f18b5608()
I can download/compile/debug with newer versions if there's something
you'd like to try. This is fairly reliably reproducible..."
Comment 64•22 years ago
|
||
Okay, verified (javascript URL crash) with latest build (2002081209), here's the
stacktrace.
NTDLL! 77f8e59a()
NTDLL! 77f8edc6()
NTDLL! 77f848a5()
XPCOM! 60eb18cf()
027999f8()
XPCOM! 60e899e7()
f18b5608()
I haven't been able to get it to crash (ever) while running a debug build.
Note, that JS Pager must be running when Mozilla STARTS to cause the crash.
Stopping it while Mozilla is running still causes the crash, and starting it
after mozilla doesn't cause the crash.
Comment 65•22 years ago
|
||
FYI, I've been able to reproduce this bug in an optimized build with symbols. I'm
trying to make sense of what I'm seeing right now. Wouldn't do much good to post
what I've found at this time, it's way too strange. Looks like the Windows
message queue is getting corrupted, and I'm trying to figure out how and by whom.
Comment 66•22 years ago
|
||
Ok, here's where I'm at so far. The FPU's CTRL register is getting clobbered. I
traced this down to CoInitialize, but I've seen it happen in other calls as
well, after commenting out CoInitialize. Could be those calls made calls to
CoInitialize, though. What happens is the precision the FPU uses changes.
Normally we run at 53 bits of precision but in this case it is getting bumped to
64. This then causes the rounding issue we see in this code. To verify this I
manually assigned 0x027f to the register (the value when JS Pager is not running)
within js_dtoa and the function did the conversion properly.
So I think we have two problems here.
1. Why is the FPU's control register changing?
2. Why are we overflowing the buffer when we were given a size?
I don't know enough about the FPU and the instructions to know if this is a
compiler issue, an OS issue, or a JS Pager not playing nice issue.
I'm sure we could slam in some assembler around CoInitialize to save off this
value and restore it, but I'd like to understand the problem a little better.
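For readers unfamiliar with the x87 control word: bits 8-9 select the precision (binary 10 for a 53-bit mantissa, 11 for a 64-bit one) and bits 10-11 select the rounding mode. The sketch below decodes the 0x027f value mentioned above. The value 0x037f is only an assumed example of what the register would hold if the precision field alone were raised to 64 bits; the comment does not give the clobbered value. `decodeX87ControlWord` is a hypothetical helper.

```javascript
// Decode the precision, rounding and exception-mask fields of an
// x87 FPU control word.
function decodeX87ControlWord(word) {
  const precisionByField = { 0: 24, 2: 53, 3: 64 }; // field value 1 is reserved
  const roundingByField = ["nearest", "down", "up", "toward zero"];
  const precisionField = (word >> 8) & 0x3;
  const roundingField = (word >> 10) & 0x3;
  return {
    precisionBits:
      precisionField in precisionByField ? precisionByField[precisionField] : "reserved",
    rounding: roundingByField[roundingField],
    allExceptionsMasked: (word & 0x3f) === 0x3f,
  };
}

console.log(decodeX87ControlWord(0x027f)); // 53-bit precision, round to nearest
console.log(decodeX87ControlWord(0x037f)); // assumed clobbered value: 64-bit precision
```

The two words differ in a single bit (0x0100), which is why a change like this is easy to miss unless the register is inspected directly.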
Comment 67•22 years ago
|
||
I tried an experiment. Wrapping the CoInitialize call isn't going to be enough.
Apparently there are many things that Mozilla calls that also call CoInitialize
that we have no control over. I also verified that this behavior occurs in other
applications as well.
I suspect this may be COM forcing higher precision for some of its types. What I
still don't understand is how JS Pager is triggering this, unless it's
registering something that needs this precision.
So as I see it we have two options.
1. Get the floating point conversion code working with the higher precision numbers
2. Wrap the code for Windows (or maybe Intel CPUs?) with assembler that saves
off the precision, sets it to what we need, and then restores it.
Comment 68•22 years ago
|
||
I would think that the floating point control register would be part of a
"context switch"? As in, if that is a value set by an application, when the OS
switches from one process to another, it should save it off and restore it when
it comes back.
Say you wrap the call, for instance... Well, if the context switches inside the
wrapper, wouldn't that mess things up also?
Maybe there is some initial value not being set by mozilla when it starts up?
Does it assume the FPU control register is a certain value, but not set it to
that on startup?
Comment 69•22 years ago
|
||
I wondered that myself, whether this FPU register wasn't getting preserved
properly. So I ran a little test, and it does. It's not getting changed because
it wasn't restored after the context switch. It appears that the code within
CoInitialize is setting it. Somehow JS Pager running triggers this effect within
CoInitialize. I wish the debugger would allow you to set breakpoints on that
register, unfortunately it doesn't. I tried walking through the assembler, but
it's pretty far in.
I also checked whether it has this effect on other applications that call
CoInitialize, and it does. I created a simple MFC app that called it, ran JS
Pager and then the MFC app and it got set. I suspect most applications don't
notice this, because floating point isn't used all that much, and when it is,
small rounding errors usually don't matter that much.
Comment 70•22 years ago
|
||
I would expect, but don't know for sure, that this register would be preserved on
a thread basis as well as a process basis.
Comment 71•22 years ago
|
||
Now that I think of it, this may be an optimization issue. I ran a debug build
and the FPU CTRL register has the same value. But the results it produces are
"correct" and it runs fine. So it may be VC++ not correctly clearing something
or taking some kind of short cut it shouldn't. I'll see if I can compare the
assembler and see what might be the problem.
Comment 72•22 years ago
|
||
I don't know if this will help or not, but if you disable the JS Pager option of
"Send to Desktop (x,y)" in the system menu the crash does not occur.
The crash when running JS Pager *seems* related to the extra system menu items.
Comment 73•22 years ago
|
||
Joe also discovered this (from bug 140544):
"Note: Mozilla1.0 did not have this problem. It started this behavior
with 1.1alpha."
Comment 74•22 years ago
|
||
I'm running on about 10 minutes of sleep in the past 30 hours, so take this
patch with that in mind ;-)
From what I could tell, the line L = (Long) (d / ds); had some rounding issues
when the precision was increased. I added a 0.5 to d and I think that clears up
the problem. Also there may be other similar issues in this function, but from
what I can tell the rest looks ok.
This will add to the time spent in this function; I don't know if there's a more
efficient way to achieve the same thing.
In any case, I see no more crashes wh
Comment 75•22 years ago
|
||
David: thanks!!!
cc'ing Daniel, Steve -
Comment 76•22 years ago
|
||
*** Bug 153402 has been marked as a duplicate of this bug. ***
Comment 77•22 years ago
|
||
David,
Given the apparent change in precision in the FPU, you might give a glance at
http://www.netlib.org/fp/gdtoa.tgz, which is Gay's dtoa code stretched to other
precisions. I glanced there myself, but was not struck by anything obvious.
None of the code there had your solution added.
Since this code exists, however, I'd say that we should *not* rely on the dtoa
code in Mozilla to work with anything other than 53 bit mantissas.
You might also give a shout on the netlib mailing list (if that's even
possible); see the contact info in bug 156253.
--scole
Comment 78•22 years ago
|
||
I need to learn to read the comments ;-)
The comments say to use _control87(PC_53, MCW_PC); Not sure how portable that
is. I effected the same thing with assembler and it did fix the problem.
Assuming this works on the Intel platforms we build on, do we sprinkle these
around code dealing with floating point numbers? Could this trip up plugins
that might rely on the 64 bit mode? Do we want to incur the overhead of trying
to wrap the code and preserving the mode? Some things to think about.
Kenton what's your take?
Comment 79•22 years ago
|
||
cc'ing Waldemar -
Whiteboard: [Duplicate of bug 140544? ] → [Related: bug 140852? ]
Assignee | ||
Comment 80•22 years ago
|
||
David,
Setting the FPU environment to the VC++ default is a perfectly valid defense
against corruption by other users. All of these routines (in dtoa) assume this
default environment.
Comment 81•22 years ago
|
||
I agree, that's probably the easiest solution given the current situation. I
wasn't sure how expensive the operation is. Should it be set to 53 bits and left
that way, or should we save off the previous state and restore it on exit?
The _control87 function seems to be supported by VC++, and I see references to gcc and
this function. I don't know if we can blanket all Intel-based environments or
not. I'm thinking of OS/2 and BeOS. Looks like OS/2 supports it. This function
allows us to set the flags and get the previous value so we could restore it on
exit.
Comment 82•22 years ago
|
||
Another possible patch. This patch takes the approach of switching the FPU to a
53 bit mantissa, aka IEEE compliant.
1. Should we worry about restoring it to the previous state?
2. Are there other areas using floating point math in the JS engine where this
should be used?
3. I don't know that we can assume that once it is set, it will remain
set, thus I set it on each call. We have no control over what plugin code or other
API calls might do.
4. These calls to _controlfp don't appear to be cheap, not sure how expensive
they are in a release build. I could do the inline assembler, it's easy enough
to do, but wasn't sure how supported _asm was on the various Intel based
compilers.
5. I used _controlfp rather than _control87 because at least according to the
MS documentation _controlfp is more widely supported in non MS compilers.
Attachment #95120 -
Attachment is obsolete: true
Comment 83•22 years ago
|
||
David, do you know if the "cheapness" varies between get and set or not? I'm
thinking that if getting is cheaper than setting, then we probably want to do a
get first, to see if we need to change the mode at all (since this probably will
happen only rarely). We probably also would only perform a restore if the
control word changed.
--scole
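The read-first idea above can be sketched roughly as follows. This is only an illustration, not the patch attached to this bug: the helper names are hypothetical, and on anything other than 32-bit VC++ the sketch falls back to a fake control word so the logic can still be compiled and exercised. The `_controlfp`, `_MCW_PC` and `_PC_53` names are the ones VC++ declares in `<float.h>`.

```c
#include <assert.h>

#if defined(_MSC_VER) && defined(_M_IX86)
#include <float.h>
/* _controlfp(0, 0) reads the control word without modifying it. */
unsigned int read_precision(void) { return _controlfp(0, 0) & _MCW_PC; }
void write_precision(unsigned int pc) { _controlfp(pc, _MCW_PC); }
#define WANTED_PRECISION _PC_53
#else
/* Stand-in for other toolchains: a fake precision field (assumption). */
static unsigned int fake_pc = 0x00010000u;
unsigned int read_precision(void) { return fake_pc; }
void write_precision(unsigned int pc) { fake_pc = pc; }
#define WANTED_PRECISION 0x00010000u
#endif

/* Hypothetical helper: switch to 53-bit precision only when the current
   setting differs, and hand back the old setting for a later restore. */
unsigned int enter_double_precision(void)
{
    unsigned int old = read_precision();
    if (old != WANTED_PRECISION)
        write_precision(WANTED_PRECISION);
    assert(read_precision() == WANTED_PRECISION);
    return old;
}

/* Hypothetical helper: restore only if enter_double_precision changed it. */
void leave_double_precision(unsigned int old)
{
    if (old != WANTED_PRECISION)
        write_precision(old);
}
```

In the common case where the precision is already correct, this costs one read and no writes. Whether a read is actually cheaper than an unconditional write would still have to be measured.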
Assignee | ||
Comment 84•22 years ago
|
||
David,
The Javascript standard avoids issues of non-standard arithmetic. It is
expected to run in the default IEEE setting. It expects floating-point
precision to be double (53 bits of mantissa), rounding to nearest, and no
floating-point trap handlers enabled. This default environment is set in
“js_InitRuntimeNumberState” in jsnum.c. Actually it doesn’t get set for all
environments (bug 109286), but it does for Intel. This implies that the
environment is getting changed after Javascript has started. The crash you are
seeing could be caused by any of these non default settings. It is somewhat
troublesome that it is happening. After you restore the settings to those
originally encountered, the remainder of the Javascript program is vulnerable to
a strange floating-point environment.
What I’d like to see is an alert box that occasionally and randomly checks for
non-standard fp environments and alerts us of this situation. At the minimum it
should be included in debug builds. It should report all non standard settings.
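A check of this kind might be sketched as below. It only decodes a raw x87 control word that has already been read from the hardware (for example with `fnstcw`); actually reading it is compiler-specific and left out. The bit positions are those of the x87 control word (exception masks in bits 0-5, precision control in bits 8-9, rounding control in bits 10-11); the function and flag names are invented for this illustration.

```c
#include <stdint.h>

/* x87 control word fields. */
enum {
    CW_EXC_MASKS  = 0x003F, /* all six exception mask bits            */
    CW_PC_MASK    = 0x0300, /* precision control field                */
    CW_PC_DOUBLE  = 0x0200, /* 53-bit mantissa                        */
    CW_RC_MASK    = 0x0C00, /* rounding control field                 */
    CW_RC_NEAREST = 0x0000  /* round to nearest                       */
};

/* Hypothetical report flags. */
enum {
    BAD_PRECISION = 1, /* precision is not 53 bits                    */
    BAD_ROUNDING  = 2, /* rounding is not round-to-nearest            */
    BAD_TRAPS     = 4  /* at least one FP exception is unmasked       */
};

/* Returns 0 for the environment JS expects, otherwise a set of BAD_* flags. */
int nonstandard_fp_settings(uint16_t cw)
{
    int bad = 0;
    if ((cw & CW_PC_MASK) != CW_PC_DOUBLE)
        bad |= BAD_PRECISION;
    if ((cw & CW_RC_MASK) != CW_RC_NEAREST)
        bad |= BAD_ROUNDING;
    if ((cw & CW_EXC_MASKS) != CW_EXC_MASKS)
        bad |= BAD_TRAPS;
    return bad;
}
```

For instance, 0x027F (53-bit, round to nearest, all exceptions masked) reports nothing, while 0x037F (the 64-bit precision setting) reports BAD_PRECISION.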
Comment 85•22 years ago
|
||
js_InitRuntimeNumberState only sets the interrupt mask and doesn't touch the
precision of the FPU. So maybe the macro FIX_FPU needs to be modified to change
the precision as well as the interrupt mask.
There's still a risk this could change after this call, although CoInitialize is
called before this function. Changing the FIX_FPU macro might at least address
99% of the cases.
Comment 86•22 years ago
|
||
Here's a short patch that changes FIX_FPU macro to set the precision as well as
the exception mask. I did a quick test that started JS Pager after CoInitialize
was called, and the FPU didn't appear to get set back. So looks like this may
take care of most of the cases for us.
Attachment #95579 -
Attachment is obsolete: true
Assignee | ||
Comment 87•22 years ago
|
||
Comment on attachment 95725 [details] [diff] [review]
Patch that changes FIX_FPU to set precision
r=khnason
I think this is the correct fix. I didn't realize precision wasn't getting set
correctly.
Attachment #95725 -
Flags: review+
Comment 88•22 years ago
|
||
Comment on attachment 95725 [details] [diff] [review]
Patch that changes FIX_FPU to set precision
"mantissa", and period at the end of sentence in comment, please. Fix those
nits and sr=brendan@mozilla.org.
/be
Attachment #95725 -
Flags: superreview+
Comment 89•22 years ago
|
||
Comment 90•22 years ago
|
||
fix checked in.
Do we want to try and get this on the 1.1 branch?
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Reporter | ||
Comment 91•22 years ago
|
||
I'm not familiar with the bug fix scenario here. If the patch was checked in
does that mean it will be in the nightly builds? If so should I just be able to
download the latest build and try it out?
Comment 92•22 years ago
|
||
You can try tomorrow's nightly trunk build
(ftp://ftp.mozilla.org/pub/mozilla/nightly/latest-trunk/)
Comment 93•22 years ago
|
||
Cc'ing some drivers to consider this for 1.0.1 and the 1.0 branch. I think we
should take it, but I sr'd, so I don't count as a driver here.
/be
Keywords: mozilla1.0.1
Comment 94•22 years ago
|
||
I downloaded Mozilla from trunk builds (binaries dated: 2002-08-20 8:53)
and it hasn't crashed yet (I did all the tests + some browsing).
Results from .getTime() and others are valid.
I just wonder if that version contains the fix from David; he announced
the fix at 06:06 (how long does the build take?)
Comment 95•22 years ago
|
||
Windows usually builds pretty quickly, so I would think it had the change, but
hard to know 100%.
Comment 96•22 years ago
|
||
Bad news.
My mozilla crashed on .getTime()
This is trunk build, binaries are dated 2002-08-21 9:27
Do you know the assembly that is produced by
_control87(MCW_EM | PC_53, MCW_EM | MCW_PC)
? (so I could check if this version contains the fix)
Comment 97•22 years ago
|
||
Here it is. The key difference is the two numbers in the push statements.
bytes: 68 1F 00 0B 00 68 1F 00 09 00 8B 77 14 FF 15 A0 00 04 01
0102374F push 0B001Fh
01023754 push 9001Fh
01023759 mov esi,dword ptr [edi+14h]
0102375C call dword ptr [__imp___control87 (010400a0)]
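For anyone checking their own build, the two pushed constants can be reproduced from the flag values. In the sketch below the three values are the ones VC++'s `<float.h>` gives for `_MCW_EM`, `_MCW_PC` and `_PC_53`; they are redefined under local, made-up names so the snippet compiles with any C compiler.

```c
/* Values as found in VC++'s <float.h>; local names are illustrative only. */
#define SKETCH_MCW_EM 0x0008001Fu /* exception mask bits      */
#define SKETCH_MCW_PC 0x00030000u /* precision control field  */
#define SKETCH_PC_53  0x00010000u /* 53-bit precision         */

/* First argument of _control87(MCW_EM | PC_53, MCW_EM | MCW_PC). */
unsigned int control87_new_value(void)
{
    return SKETCH_MCW_EM | SKETCH_PC_53;
}

/* Second argument: which bits of the control word may be changed. */
unsigned int control87_mask(void)
{
    return SKETCH_MCW_EM | SKETCH_MCW_PC;
}
```

The new value works out to 0x0009001F and the mask to 0x000B001F. Arguments are pushed right to left, so the mask (0B001Fh) appears first in the listing and the new value (9001Fh) second.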
Comment 98•22 years ago
|
||
Yeah, I got it in
js3250.dll:
.600C374F: 681F000B00 push 0000B001F ;" ♂ ▼"
.600C3754: 681F000900 push 00009001F ;" ○ ▼"
.600C3759: 8B7714 mov esi,[edi][00014]
.600C375C: FF15A0000E60 call _control87 ;MSVCRT.dll
Comment 99•22 years ago
|
||
When it crashes, are you getting the 0x39393939 numbers, or is it something
different? Trying to determine if this is the same crash or something different.
Which test case are you using? What other software is running on your system at
the time? I was testing with JS Pager.
Comment 100•22 years ago
|
||
It crashed with the standard 0x39393929 msg box.
As a test I used the "javascript: alert(new Date().getTime())" url.
Later, as a test, I removed almost all processes and stopped
almost all services - and it still crashed.
Later I tried harder at removing processes, and after I removed
CTFMON.EXE it stopped crashing (I'll test it again).
CTFMON is part of MS Office XP (it handles alternate keyboards etc.)
Comment 101•22 years ago
|
||
CTFMON may be causing some Windows API function we call to set the FPU back to
64-bit, just a guess. This must be occurring after the call to
js_InitRuntimeNumberState.
As Kenton stated, the JS numerical system is designed to work with IEEE doubles
and the FPU getting into this 64 bit mode is going to cause problems not only in
this area but others.
I wonder if this is really a Mozilla problem, or maybe some bug in COM?
Comment 102•22 years ago
|
||
I tested w/o CTFMON and it crashed too.
I'll do some more tests later...
Comment 103•22 years ago
|
||
Reopening since the problem still persists in some environments.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 104•22 years ago
|
||
I am one of the original reporters. In the two or three builds since this bug
was said to be resolved, Mozilla no longer crashes on washingtonpost.com or
cnet.com, and a couple other pages, but there are definitely more crashes for me
in general since the "fix". Coincidence that some new bug might be present in
Mozilla now or is this related to the code change?
Comment 105•22 years ago
|
||
Yes, in recent builds there are some new bugs that are causing frequent crashes,
which are unrelated to this issue. Fortunately this bug's unique 0x3939393?
pointer value makes it pretty easy to distinguish from other crashes.
Comment 106•22 years ago
|
||
For those still affected by this problem, bug 140852 is where the bulk of the
discussion/investigation is occurring. That bug deals with the colon. The failure
cases that remain for both bugs all deal with floating point errors and the
js_dtoa function having problems when the error occurs on the low side rather
than the high side.
Depends on: 140852
Comment 107•22 years ago
|
||
*** Bug 165580 has been marked as a duplicate of this bug. ***
Assignee | ||
Comment 108•22 years ago
|
||
Still pursuing this bug. I am going to put some test points in the code to
detect unusual settings of the rounding direction, precision control and FPU
exception handling settings.
Comment 109•22 years ago
|
||
*** Bug 167403 has been marked as a duplicate of this bug. ***
Comment 110•22 years ago
|
||
*** Bug 165220 has been marked as a duplicate of this bug. ***
Comment 111•22 years ago
|
||
*** Bug 165097 has been marked as a duplicate of this bug. ***
Comment 112•22 years ago
|
||
*** Bug 164880 has been marked as a duplicate of this bug. ***
Comment 113•22 years ago
|
||
Given that apparently there are still problems even with this patch in, and I've
seen no reports of 1.0.x suffering from this (though it's entirely believable),
I'm not going to give 1.0 branch approval yet.
If any of the people who can reproduce this could try it with 1.0.1 and report
here (good or bad) I'd _greatly_ appreciate it.
Reporter | ||
Comment 114•22 years ago
|
||
Since I was the one to originally open the bug I'll state that I have NOT had the browser crash since the patch was checked in. I am able to access all of the sites that were causing my problems earlier with no problems. I am currently running 1.1 and not having any problems at all.
Comment 115•22 years ago
|
||
Um, what exactly is the evidence that bug 167403, bug 165220, bug 165097, and
bug 164880 (the four recent duplicates) are in fact duplicates of this bug?
For the most part, those bugs begin with "this crashed for me", followed by
one or two comments by others of "WFM", and then "this bug has been
marked...".
Comment 116•22 years ago
|
||
I checked into that, too. The common evidence in all of them
seems to be the memory address 0x39393929. Don't know if that's
enough evidence, but the "WFM" results are also typical of this
particular crash: it's machine-dependent.
Resummarizing to provide the memory address shown in the typical
Windows alertbox for the crash: 0x39393929. See Comment #22 for
the meaning of this address. Changing summary from
"Browser crashes viewing web page"
to "Large integers, e.g. getTime(), causing crash at 0x39393929"
Also adding "Windows only" to the Status Whiteboard, as that seems
to be the case in this bug and in every bug duped against it -
Summary: Browser crashes viewing web page → Large integers, e.g. getTime(), causing crash at 0x39393929
Whiteboard: [Related: bug 140852? ] → [Windows-only] [Related: bug 140852? ]
Comment 117•22 years ago
|
||
Yes, the memory address listed is pretty good evidence. All but bug 165220 made
reference to the same memory address, 0x3939393? which is trademark of this bug.
Bug 165220 might be something else, hard to know for sure.
Assignee | ||
Comment 118•22 years ago
|
||
I isolated the jsdtoa.c routine and performed dtoa (double to ASCII)
conversions on some large numbers, 86400000 and 819187200000. I set the
rounding direction to all four possible settings. Everything ran as expected.
I then set the precision control to float (24 bits of mantissa). 86400000.0 ran
as expected but 819187200000.0 crashed. It might be useful to write a small
program that sets the fpu rounding precision to float and have it running while
testing the JavaScript engine. It might also be useful to test for the correct
rounding precision at the beginning of each call to jsdtoa (and exit with the
appropriate error message if the rounding precision is corrupt.) Some graphics
program may be reducing the rounding precision for performance.
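A cheap arithmetic probe along these lines might look like the sketch below. The function name is hypothetical. It relies only on `DBL_EPSILON` from `<float.h>`, and it can only detect precision that has been reduced below double (such as the 24-bit float setting described above); it cannot tell 53-bit from 64-bit precision, and on builds that do not use the x87 FPU for doubles it should always pass.

```c
#include <float.h>

/* Hypothetical probe: returns nonzero if 1.0 + 2^-52 is still
   distinguishable from 1.0, i.e. additions keep at least 53 bits.
   volatile keeps the compiler from folding the sum at compile time. */
int fpu_precision_at_least_double(void)
{
    volatile double one = 1.0;
    volatile double eps = DBL_EPSILON; /* 2^-52 for IEEE double */
    volatile double sum = one + eps;   /* rounds to 1.0 under 24-bit precision */
    return sum != one;
}
```

A debug build could call this on entry to the conversion routines and report a corrupt environment when it returns zero.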
Comment 119•22 years ago
|
||
But any decent OS should not let another program's FPU settings mess over
Mozilla's. Is the problem that some embedding, or some plugin, is calling into
in-process code that messes with the FPU? We don't want the overhead of setting
FPU control/status registers on every dtoa, of course.
/be
Comment 120•22 years ago
|
||
There are two issues. One, which the patch addresses, is the mantissa precision.
When initializing COM, some deep function called by it sets the FPU to the higher
64-bit precision. This only happens when certain programs are running. (I have no
idea why.) The patch sets the precision in the JS numerics initialization, which
addressed this problem. The second issue appears to be a compiler issue. Even
with the precision properly set, some people still experienced the problem. Bug
140852 deals more with this second issue. Basically it looks like certain
versions of the VC++ compiler optimize the floating point instructions in such a
way that the error compounds and creates problems during the conversion from
double to string, resulting in a colon appearing in the translated number.
This last issue, I've been able to reproduce on my system in isolation (putting
just the jsdtoa code in a small program of its own), but Waldemar was unable to
reproduce this with the program I created. /Op (Chooses more precise floating
point math over speed) fixes the problem on my system. But we were trying to
figure out the exact source of the problem to determine if the /Op option was
really the correct fix or masking another problem.
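The colon itself has a simple explanation once a quotient goes wrong: each digit is turned into a character by adding ASCII '0' to it, and the character right after '9' in ASCII is ':'. A minimal sketch, with a made-up helper name, assuming a quotient of 10 slips through where only 0 to 9 is expected:

```c
/* Converts one quotient "digit" to its ASCII character by adding '0'.
   Nothing here checks that the value is in the range 0..9. */
char digit_to_ascii(long digit)
{
    return (char)('0' + digit);
}
```

With a quotient of 9 this yields '9' (0x39); with a quotient of 10 it yields ':' (0x3A).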
Comment 121•22 years ago
|
||
If this does at least wallpaper the bug for many users (safely), we'll want it
for the 1.0 branch. Does it do so?
Comment 122•22 years ago
|
||
In my opinion the patch is simple, safe, and correct regardless of the outcome
of bug 140852. I just don't know how many people it will help.
Keywords: stackwanted
Comment 123•22 years ago
|
||
*** Bug 168664 has been marked as a duplicate of this bug. ***
Comment 124•22 years ago
|
||
Here's where my C2.dll came from:
http://msdn.microsoft.com/vstudio/downloads/tools/ppack/default.asp It's the
VC++ 6.0 Processor Pack.
So I guess some of the build machines had this as well. I'm still uncertain if
this is a "bug" in this VC++ update, or a bad assumption on the part of the jsdtoa
implementation.
Also I expect we may see this same problem with VC++ 7.0, not that it is an
issue at the moment.
Assignee | ||
Comment 125•22 years ago
|
||
David, Might it be useful to put some asserts in the source to check for
corrupt precision settings?
Comment 126•22 years ago
|
||
It wouldn't hurt, but I'm not sure it's going to do a lot of good. Unless
Mozilla developers run these types of programs while running debug version of
Mozilla it's not going to be that preventative. Right now with the patch in
place, this takes care of the JS Pager issue, and probably other programs that
cause the setting of precision within the call to CoInitialize. So outside of
the C2.dll issue, unless a developer happens across another program, service, OS
version, etc. that might affect this post-CoInitialize, we're probably not going
to see it till the users come across some odd program or OS version that trips
it up. And once we detect it, what do we do? Add yet another program to list of
programs not to run Mozilla with?
We know of several potential solutions, and I think we need to choose one, and
then incorporate the assert into that check-in. I just don't know enough about
floating point math to know if the routine is making an errant assumption about
the way FPU is doing the calcs or that this is a bug in this version of the
compiler. I'm fine with any of these. I think we need to pick one and run with
it. Since this doesn't seem to be generating talkbacks this could be a bigger
problem than we realize.
1. build with the earlier C2.dll (Can we detect this using the preprocessor)
May run into this again in VC++ 7.0
Other people may have this DLL and build and encounter the problem
2. Use the /Op option
May be a performance impact (Could compile only jsdtoa.c)
3. Modify the js_dtoa code
We may be coding around a VC++ specific
Assignee | ||
Comment 127•22 years ago
|
||
I agree with David. I'd probably pick option 3, setting the precision control to
the default on every entry to dtoa in VC++.
Comment 128•22 years ago
|
||
David, your comment #126 looks cut-off.
I don't think we should fiddle with FPU registers on every dtoa, or JS numeric
performance will regress unacceptably.
/be
Assignee | ||
Comment 129•22 years ago
|
||
The time spent in dtoa setting the fpu environment would be negligible compared
with the time spent in a call to dtoa. The call would only be made before any
conversion between binary and decimal. If speed were a concern we could simply
test the environment and do nothing if correct, otherwise set the correct
environment and issue a warning that a potentially bad environment was encountered.
Comment 130•22 years ago
|
||
khanson: I hope you're right, but measurements would be good to prove the
performance hit is negligible. Too many times a seemingly small change can prove
troublesome, so we should benchmark real-world and worst-case synthetic cases.
If reading the status/control register (or whatever it is) is faster than
setting unconditionally, do read before a conditional write.
/be
Comment 131•22 years ago
|
||
This is the assembler and diff with and without the /Op
I haven't had a chance to track down an old C2.dll to compare the assembler
generated there. It wasn't in the sp5, only c2.exe. So I suspect it's in sp4 or
sp3 or before.
Adding _controlfp to the js_dtoa call isn't going to fix the problem we're
seeing with this specific c2.dll issue. That has more to do with what's
happening in bug 140852. Comment #102 was the last one concerning this
particular bug's crash. I don't know if Marcia ever found if a specific program
was causing a problem or not. I think we need to keep discussions about the
0x3939393? crash in this bug, and the : appearing in numbers in 140852.
From what I see, there's no solid evidence saying we're seeing the FPU state
getting changed after the JS numerics is initialized. Also I did a timing test
and the js_dtoa function takes about 512 nanoseconds for the test number. With
_controlfp added, that added another 16 nanoseconds, or almost 10% to the
function. The numbers seem small, but 10% doesn't seem that small. /Op
increased the js_dtoa time to 527 nanoseconds or an increase of 25 nanoseconds.
I'm more concerned about bug 140852, even though it doesn't cause a crash.
Comment 132•22 years ago
|
||
> I don't know if Marcia ever found if a specific program
> was causing a problem or not.
I was not able to find if there's one specific application
that causes the problem.
I use david's /Op build .dll and it works fine.
Comment 133•22 years ago
|
||
I think the following is the key section of code:
; 1656 : L = (Long) (d / ds);
- fld ST(0)
+ fld1
fdiv QWORD PTR _ds$[ebp]
+ fst QWORD PTR -76+[ebp]
+ fmul ST(0), ST(1)
call __ftol
mov DWORD PTR _L$[ebp], eax
@@ -1463,42 +1527,42 @@
; 1666 : if (i == ilim) {
- cmp edi, 1
- mov BYTE PTR [ebx], al
+ cmp ebx, 1
+ mov BYTE PTR [esi], al
fmul QWORD PTR _ds$[ebp]
- lea esi, DWORD PTR [ebx+1]
+ lea edi, DWORD PTR [esi+1]
fsubp ST(1), ST(0)
- je SHORT $L1556
- mov eax, esi
- sub eax, ebx
+ je SHORT $L1993
+ mov eax, edi
+ sub eax, esi
mov DWORD PTR 16+[ebp], eax
-$L1370:
+$L1804:
; 1677 : }
; 1678 : break;
; 1679 : }
; 1680 : if (!(d *= 10.))
- fmul QWORD PTR __real@8@4002a000000000000000
+ fmul QWORD PTR __real@4024000000000000
fld ST(0)
- fcomp QWORD PTR __real@8@00000000000000000000
+ fcomp QWORD PTR __real@0000000000000000
fnstsw ax
- sahf
- je $L1614
- fld ST(0)
- fdiv QWORD PTR _ds$[ebp]
+ test ah, 68 ; 00000044H
+ jnp $L2052
+ fld QWORD PTR -76+[ebp]
+ fmul ST(0), ST(1)
call __ftol
mov DWORD PTR _L$[ebp], eax
fild DWORD PTR _L$[ebp]
add al, 48 ; 00000030H
- mov BYTE PTR [esi], al
- inc esi
+ mov BYTE PTR [edi], al
+ inc edi
fmul QWORD PTR _ds$[ebp]
inc DWORD PTR 16+[ebp]
- cmp DWORD PTR 16+[ebp], edi
+ cmp DWORD PTR 16+[ebp], ebx
fsubp ST(1), ST(0)
- jne SHORT $L1370
-$L1556:
+ jne SHORT $L1804
+$L1993:
; 1667 : d += d;
Comment 134•22 years ago
|
||
The L = (long) (d / ds); generates nearly the same assembler using the old
C2.dll and using the new one with /Op. The new version without /Op does a divide
and then a multiply, while the others just do a divide. The assembler further
down gets a bit more complex.
So I think what's happening is that the new version of c2.dll takes some
shortcuts that normally don't affect things.
The description of the option is at the link below. Reading the description, and
how the FPU registers are used more rather than memory, and how that increases
precision, sounds like this is probably the best solution IMO, since this
routine relies on 53 bit precision.
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore/html/_core_.2f.Op.asp
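To illustrate why "divide" and "divide then multiply" are not interchangeable when the result is truncated to an integer, here is a small sketch. It is not the js_dtoa code, the function names are made up, and whether this is exactly what bites js_dtoa is an assumption. It compares d / ds with d * (1 / ds), the shape suggested by the fld1 / fdiv / fmul sequence in the listing in Comment 133, using plain IEEE doubles.

```c
/* Quotient the way the source line is written: one division. */
long quotient_by_divide(double d, double ds)
{
    return (long)(d / ds);
}

/* Quotient via a stored reciprocal followed by a multiply.
   volatile forces each intermediate to be rounded to a double. */
long quotient_by_reciprocal(double d, double ds)
{
    volatile double recip = 1.0 / ds; /* rounded once here          */
    volatile double q = d * recip;    /* and rounded again here     */
    return (long)q;                   /* truncation toward zero     */
}
```

With d = 49.0 and ds = 49.0 the division gives exactly 1, while 49 * (1/49) lands just below 1 in double precision and truncates to 0. A quotient that comes out one too low leaves too large a remainder for the next step, which fits the "error on the low side" description.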
Comment 135•22 years ago
|
||
So is there any consensus on what the appropriate solution is?
Comment 136•22 years ago
|
||
In setting up my laptop I also noticed that the Win32 build instructions call
for the processor pack in addition to SP5. So I think we'll have to go the /Op
route unless someone wants to tackle making the code work without /Op.
Comment 137•22 years ago
|
||
*** Bug 165131 has been marked as a duplicate of this bug. ***
Comment 138•22 years ago
|
||
*** Bug 182624 has been marked as a duplicate of this bug. ***
Comment 139•22 years ago
|
||
*** Bug 186249 has been marked as a duplicate of this bug. ***
Comment 140•22 years ago
|
||
*** Bug 187704 has been marked as a duplicate of this bug. ***
Comment 141•22 years ago
|
||
More info on bug report 187704:
URL only fails when Mozilla browser window opened by clicking URL http://www.chez.com/sfaucourt/mediat_us.htm:
from an open message in a Mozilla Mail window. Then fails repeatedly.
mynews@pacbell.net
Comment 142•22 years ago
|
||
I'm not sure if this is really the same bug since the IPF address is slightly
different but I got here from bug #153402 which has been marked as a duplicate
of this one.
Shortly after printing the following occurs:
MOZILLA caused an invalid page fault in
module <unknown> at 0000:39393939.
Registers:
EAX=0064fb2c CS=0167 EIP=39393939 EFLGS=00010246
EBX=0064fb2c SS=016f ESP=00550038 EBP=00550058
ECX=005500dc DS=016f ESI=8167febc FS=19e7
EDX=bff76855 ES=016f EDI=00550104 GS=0000
Bytes at CS:EIP:
Stack dump:
bff76849 00550104 0064fb2c 00550120 005500dc 00550210 bff76855 0064fb2c 005500ec
bff87fe9 00550104 0064fb2c 00550120 005500dc 39393939 005502c8
The only thing I have done other than print a page is switch to a different
browser tab (often viewing news.com). I have been getting this bug since Mozilla
v1.1 and it is still present. It occurs every time I print no matter what site
is printed or what site I browse next as far as I can see.
I'm running Win98 SE fully patched and using Mozilla 1.2.1 - Mozilla/5.0
(Windows; U; Win98; en-US; rv:1.2.1) Gecko/20021130. OS doesn't appear to be any
more unstable than usual for Win98. :)
Comment 143•22 years ago
|
||
Just a refresher, the 0x39393939 is an indicator of a buffer overrun of the
ASCII 9's which is hex 0x39. It's possible that some other code did this, if
there is other code that converts floating point to ASCII and doesn't check the
buffer size. I know I've had this crash in testing, where the last number
varied. It depended on the number that was being converted.
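The link between a run of '9' characters and the crash address can be shown in a few lines. This is only an illustration with a made-up helper name: it reinterprets four character bytes as a 32-bit value, which is what happens when overrunning digits land on top of a stored pointer or return address.

```c
#include <stdint.h>
#include <string.h>

/* Reinterprets four bytes of text as a 32-bit value, the way the CPU
   would if those bytes had overwritten a pointer. */
uint32_t bytes_as_address(const char bytes[4])
{
    uint32_t value;
    memcpy(&value, bytes, sizeof value);
    return value;
}
```

Four '9' characters give 0x39393939 regardless of byte order. A variant such as 0x39393929 differs only in one byte, consistent with the last byte depending on the number being converted.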
Comment 144•22 years ago
|
||
*** Bug 188021 has been marked as a duplicate of this bug. ***
Comment 145•22 years ago
|
||
The bug I described in 187704 has not occurred since installing Windows 2000 critical update described below.
810649: Critical Update
This update contains several fixes to Windows components to better support default Web browsers other than Internet Explorer, as described in Microsoft Knowledge Base (KB) Article 810649. Download now to improve the interaction of certain Windows components with default web browsers other than Internet Explorer.
For more information about this issue, read Microsoft KB Article: 810649. (This site may be in English.)
System Requirements
This update applies to Windows 2000 with Service Pack 3.
I don't know if this is related to the bug or not.
Comment 146•22 years ago
|
||
There were two causes of this. One was the floating point precision changes on
calls to CoInitialize, the other was various large values fed to the function
caused rounding issues in the presence of compiler optimizations. It's possible
you might be experiencing the CoInitialize flavor and this update might have
"fixed" that.
Comment 147•22 years ago
|
||
*** Bug 180943 has been marked as a duplicate of this bug. ***
Comment 148•22 years ago
|
||
*** Bug 192002 has been marked as a duplicate of this bug. ***
Comment 149•22 years ago
|
||
*** Bug 189816 has been marked as a duplicate of this bug. ***
Comment 150•22 years ago
|
||
*** Bug 182789 has been marked as a duplicate of this bug. ***
Comment 151•22 years ago
|
||
*** Bug 190887 has been marked as a duplicate of this bug. ***
Comment 152•22 years ago
|
||
Note: this issue is now causing crashes for Windows users at
http://slashdot.org, explaining the recent spate of duplicates -
Comment 153•22 years ago
|
||
What about David Bradley's fix (/Op for js*.dll) in bug 140852?
It's been known for almost 6 months, it works, and without it my Mozilla crashes twice a day.
There're about 25 dupes for this...
Updated•22 years ago
|
Flags: blocking1.3?
Comment 154•22 years ago
|
||
Please also note Bug 180776 and Bug 185337, they're not necessarily dupes, though.
There is an interesting problem with startrek.com as the crash is time-delayed
i.e. I open the page and the browser crashes about a minute later. Also the
symptoms (talkback not triggered) match as well.
Also, the Windows XP "crash-catcher" doesn't get invoked either. I don't know
what prerequisites exist for the "contact Microsoft"-code.
Comment 155•22 years ago
|
||
We should try to get this in for final. Gathering lots of dupes from slashdot
crashers isn't good.
Flags: blocking1.3? → blocking1.3+
Comment 157•22 years ago
|
||
Patch was checked in for 140852, this bug can now be closed as well.
Status: REOPENED → RESOLVED
Closed: 22 years ago → 22 years ago
Resolution: --- → FIXED
Comment 158•22 years ago
|
||
Provisionally marking Verified, as I am seeing no new test failures
with the fix for bug 140852. However, from the beginning, I was never
able to reproduce the current bug in the browser.
Could other contributors report back on that? If you use today's
trunk build of the browser, have the crashes gone away?
From the reports, slashdot.org seemed to expose the crash; it would
be nice to know from the field that this no longer occurs -
Thanks - (remember to use a build from 2003-02-18 or after!)
Status: RESOLVED → VERIFIED
Comment 159•22 years ago
|
||
Can anyone confirm that this crash has gone away? Thanks -
Comment 160•22 years ago
|
||
I can verify it has been fixed. Win XP. Build ID: 200302022008.
Comment 161•22 years ago
|
||
I'd only crashed a couple of times at slashdot before the fix, but I can say
that I haven't crashed at all with nightlies (on win2k) since the fix.
Comment 162•22 years ago
|
||
No crash since fix, all testcases from this bug and from #140544 and #140852
work just fine. Thanks.
Updated•22 years ago
|
Whiteboard: [Windows-only] [Related: bug 140852? ] → [Windows-only] [Related: bug 140852? ] fixed1.3
Comment 163•21 years ago
|
||
*** Bug 80734 has been marked as a duplicate of this bug. ***