Closed Bug 140852 Opened 23 years ago Closed 23 years ago

String(819187200000) == '8191871:0000' in xpcshell, browser

Categories

(Core :: JavaScript Engine, defect)

x86
Windows 2000
defect
Not set
major

Tracking

()

VERIFIED FIXED

People

(Reporter: jrgmorrison, Assigned: khanson)

References

Details

(Whiteboard: [Duplicates: bug 161046, bug 160531 Related: bug 160602? ] fixed1.3)

Attachments

(4 files, 1 obsolete file)

I was doing some stuff with compacting time strings and typed in something like the following. The stringified form of the epoch time appears to be broken for some set of dates in 1995 and earlier (hmm, coincidence that JS didn't exist until 1995? :-). The second alert below will display '8191871:0000', which is wrong. Compiled with '-O1' on win2k with msvc6 sp5 with the appropriate processor pack installed. <html> <script> var now = new Date(95,11,17).toString(); alert(now); if (window.dump) dump(now); var now = (new Date(95,11,17)).getTime().toString(); alert(now); if (window.dump) dump(now); </script> </html>
Seems to be correct value in linux default trunk build (gcc 2.91.66, -O1) '819187200000'
works for me wih 1.0 rc1 on win2k
I haven't been able to reproduce this either, in the current JS debug or optimized shell on WinNT, Linux, or Mac9.1 GIVEN DATE: js> (new Date(95,11,17)).getTime().toString() 819187200000 ONE MILLISECOND BEFORE: js> (new Date(95,11,16,23,59,59,999)).getTime().toString() 819187199999 ONE MILLISECOND AFTER: js> (new Date(95,11,17,0,0,0,1)).getTime().toString() 819187200001 John: are all three of these dates malfunctioning for you? Or just the central one? Also, what happens if you try this on the Win2K box: (new Number(819187200000)).toString() Do you get '819187200000'?
Assignee: rogerl → khanson
cc'ing mccabe -
I also can't reproduce this. By the time we're calling toString, it's on a number, not a date. And '8191871:0000' looks a lot like 819187100000 with the first zero overwritten by another character (rather than any concievable formatting error while printing the number out.) In other words, some kind of unrelated memory bug that happens to overwrite part of the string we're seeing. Or maybe it's some kind of display bug (e.g. cygwin under windows sometimes gives me garbage characters.) Is this in the browser? If you say 'd = (new Date(91,11,17)).getTime(); d.toString()' what do you get?
<script> var s = (new Number(819187200000)).toString(); alert(s); // A1 var d = (new Date(91,11,17)).getTime(); var dx = d.toString(); alert(dx); // A2 </script> produces: with 2002-04-29-08-trunk win2k | with 2002-04-29-08-1.0.0 win2k A1) 8191871:0000 | A1) 819187200000 A2) 6929567:0000 | A2) 692956800000 This is using the browser, not the shell. These are the official mozilla builds. See also bug 102725 (which isn't necessarily the same; just similar).
cc'ing Brendan, Mike, jband, dbaron, bbaetz on this one -
Re-summarizing -
Summary: (new Date(95,11,17)).getTime().toString() == '8191871:0000' ?? → (new Number(819187200000)).toString() == '8191871:0000' ??
*** Bug 160531 has been marked as a duplicate of this bug. ***
Comments from the duplicate bug 160531 (note its duplicate bug 161046): ------- Additional Comment_ #9 From Phil Schwartau 2002-08-04 22:47 ------- The behavior noted in the duplicate bug 161046 has to do with parseFloat() and not alert(). As with this bug, it seems not to happen in the JS shell, but only in the xpcshell or browser: ---------------------- IN OPTIMIZED JS SHELL ----------------------- js> num = "+441111111111" +441111111111 js> parseFloat(num); 441111111111 -------------------------- IN XPCSHELL --------------------------- js> num = "+441111111111" +441111111111 js> parseFloat(num); 441111111110.: <<<-------------------------- here is the corruption Thus changing summary from "alert() corrupts display of certain integers" to "xpcshell, browser corrupt display of certain integers" NOTE: 1. I am only seeing this on Windows, and not on Mac9.1 or Linux 2. I see it in Mozilla WinNT at least all the way back to 2002-04-28 3. I do not see it in Mozilla WinNT 2002-04-22. 4. So: due to a checkin between 2002-04-22 and 2002-04-28 ? Again, I think this may be a duplicate of another report, but I can't find it yet - ------- Additional Comment_ #10 From Phil Schwartau 2002-08-04 23:00 ------- I notice a checkin to jsdtoa.c on 2002-04-24: http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&fi le=jsdtoa.c&root=/cvsroot&subdir=mozilla/js/src&command=DIFF_FRAMESET&rev1=3.16& rev2=3.17 That fits into the timeframe outlined in the above comment, but I don't know if it could be the cause. cc'ing rogerl, khanson, scole@planetweb.com, bratell@lysator.liu.se for advice Again, is only happening in the xpcshell and browser on Windows, not in the JS optimized shell on Windows, and not in the browser on Mac9.1 or Linux. I'm not sure if the options I use to build the JS shell match those used in building the xpcshell or browser, so perhaps that is a factor to consider - ------- Additional Comment_ #11 From Steven Cole 2002-08-05 08:47 ------- Hmmm. The change you found, Phil, is the fix for bug 138666. I think it's very unlikely that this is the cause of the problem, especially since the code modification for that bug should cause the same change in behavior for both jsshell and xpcshell. Can you recompile with a April 23, 2002 jsdtoa.c and see what happens? I suspect the bug will still be there, pointing the finger of blame away from jsdtoa.c. (I don't have a functional mozilla build environment here; we just do jsshell and our Spidermonkey embedding here [neither of which have this problem] --- so I don't get xpcshell...) --scole ------- Additional Comment_ #12 From Kenton Hanson 2002-08-05 09:29 ------- I vaguely remember this problem. It behaves like an integer being stored in a floating-point type with 30 bits of significand. Integers stored in double precision don’t suffer from this problem. ------- Additional Comment_ #13 From Steven Cole 2002-08-05 09:45 ------- Possible dup of bug 140852? --scole ------- Additional Comment_ #14 From Kenton Hanson 2002-08-05 11:08 ------- The recent bug 136100 was caused by an error in the underlying arithmetic of an ARM [gcc compiler bug]. This bug appears to be similar, a loss of precison caused by a shorter than double precison type.
Severity: normal → major
Whiteboard: [Note duplicates: bug 161046, bug 160531 ]
Note "Comment_ #9" within Comment #11 above might help developers pin this one down. To repeat it: 0. I am only seeing this in xpcshell and browser, not JS opt/debug shell 1. I am only seeing this on Windows, and not on Mac9.1 or Linux 2. I see it in Mozilla WinNT at least all the way back to 2002-04-28 3. I do not see it in Mozilla WinNT 2002-04-22. 4. So: due to a checkin between 2002-04-22 and 2002-04-28 ?
Summary: (new Number(819187200000)).toString() == '8191871:0000' ?? → String(819187200000) == '8191871:0000' in xpcshell, browser
Whiteboard: [Note duplicates: bug 161046, bug 160531 ] → [Duplicates: bug 161046, bug 160531 Related: bug 160602? ]
In investigating bug 160602, there are some related issues to this bug. Both cases are failing because of rounding issues. In bug 160602 case, this was due to increased precision of the FPU. When I was running some tests, I decided to try the test case here. When it came to the 2 in the number, it gets 1.99999999 ... This is where the problem occurs. It rounds down, and then the value for the next digit comes up 10. My first patch from bug 160602 corrects this problem. With that said, my floating point knowledge is dated and limitted, and the patch may pose other problems for other types of numbers. But in any case, I thought I'd point it out because I think it the fact that it does "correct" the problem might aid in finding a real fix, if it's not a good solution.
The ASCII charcater ':' is the next character after '9'. So I am checking IntToString to see if it is not bumping the ':' to '0' and incrementing the digit to the left. If ':' where permitted to represent the decimal digit 10, then, '819187200000' == '8191871:0000'.
What I was seeing is that result of d/ds in L = (Long) (d / ds); is 1.9999999999 http://lxr.mozilla.org/seamonkey/source/js/src/jsdtoa.c#2379 When this is cast to long it become 1 instead of 2. What I don't understand is the source of this rounding error. The FPU is set to 53 bit mantissa so that's not an issue. The patch I referenced adds 0.5 to d and that gets around the problem. In the past this was SOP for such division and conversion combination, but this function is far more complex so have no idea if that's a proper solution. So this routine seems to be sensitive to some errors calculations.
David, if the FPU round direction is set to "towardszero" or "down" rather than the defualt "tonearest" 1.9999999 will round to 1 when rounded to an integer. I suggest testing for non-default FPU settings in Javascript and presenting an alert box that reveals any non standard FPU settings.
This bug was introduced between the trunk builds 2002-04-25-12 and 2002-04-26-03. There weren't many changes done to the tree in those few hours, and none to the numeric code. See: http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=MozillaSourceWin&branch=HEAD&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=04%2F25%2F2002+10%3A00%3A00&maxdate=04%2F26%2F2002+04%3A00%3A00&cvsroot=%2Fcvsroot One small change to jsinterp.c, a few to jsstr.h, jsstr.c, and jsgc.c, and some to xpconnect. Anything ring a bell?
Phil or Waldemar, I don't see this bug on a recent Windows NT build. Could you verify that the problem still exists. Thanks
This is a simple program that contains the js_dtoa function and executes it with the control value, and demonstrated the problem. What is happening is the optimized code is keeping d on the FPU stack. The operations performed incur extremely small errors. Here's what I see: 1. 8.19187200000000004e+0000 2. 1.91871999999999998e+0000 3. 9.18719999999999892e+0000 4. 1.87199999999999988e+0000 5. 8.71999999999999886e+0000 6. 7.19999999999999928e+0000 7. 1.99999999999999977e+0000 The debug build works, because divide is used, and the error is in the other direction. The optimized build uses multiply with a small number 1.00000000000000000e-0011, so the other direction. So instead of 2.00000000000223 we get 1.9999999999932 Adding a small number like .001 to d in L = (Long)(d / ds) "hides" the problem. Essentially bumps the error back in the other direction. I don't see these errors on slightly smaller numbers. I wonder if Int_Max is set too large?
> Phil or Waldemar, I don't see this bug on a recent Windows NT build. > Could you verify that the problem still exists. Yes, the problem is still occurring. Using a Mozilla trunk binary from today, build ID 2002082308, on WinNT4.0 (SP6): javascript: alert(String(819187200000)) -----> '8191871:0000'
David: What are you building the thing with? I built your attachment optimized, non-debug using Visual C++ 6.0, and I don't get the error. I did take out the +.001.
Well I can no longer reproduce this error with the code I produced. The only thing I did was to turn off debug symbols, to sync with Waldemar, and then turned them back on when I couldn't reproduce. I still get errors, but they are on the high side instead of the low side. I checked the FPU's control register and its value is the same as before. The assembler looks the same as well. I've tried starting up the various apps I had running at the time I was reproduce and still can't reproduce. I know I'm not crazy, as I documented the results of the calculation in question. I'll see if I can't figure out what the difference is. I even set my clock back to this morning and it didn't have an effect. Did I ever say that I never did liked floating point types.
Well, I have no idea how this happened, but apparently some zero's from the test number were lost between the time I ran the test and the time I uploaded it. Change it to 819187200000.0 and it should work fine or rather fail. Sorry for the confusion.
The first number I see this error for is 100,000,000,000. String(100000000000) returns "0:". And it looks like most if not all numbers greater than this end up with this same problem.
*** Bug 164921 has been marked as a duplicate of this bug. ***
Blocks: 160602
I still cannot reproduce the problem building release with Visual C++ 6.0 -- I get the correct numeric printing behavior. What exact version of the compiler and build settings are you using?
A few notes: David wrote "most if not all numbers greater..." 100,000,000,000 fails, so do 200,000,000,000 but 300,000,000,000 is converted *right*, then 4* is wrong and 5* is right... I did some debugging and I'll do some more, but js_dtoa it pretty huge and not easy to track in optimized build. Anyway there's code like (few times) *s++ = '0' + (char)L; that can create ':' with L rounded to 10 instead of 9.
I deassembled js3250.dll from 1.0 release (that does not contain the bug) and from some recent trunk build. I tried to compare them and it seems that something changed in build environment between these builds (compiler version, options?) - is compiler version (that was used to build) traced? Example: push 00000001 mov ecx, eax pop edx is now xor edx, edx inc edx mov ecx, eax - it's not meaningful change, but it IS a change that is caused by something, I guess MS compilers are not based on neural nets giving different results in every build... Also - the way of testing fpu results changed in few places was: fcom qword ptr [XXX fstsw ax sahf jbe XXX is: fcom qword ptr [XXX fstsw ax test ah, 41 jne XXX and so on...
This is the full project, with the correct test number and minus the +0.001. Should be able to download and then build and run. There's only one configuration in this project. I'm using VC++ SP6 (Product ID: 82895-270-3144333-15432). CL.EXE - 3/14/2000, c1.dll - 5/16/2000 c1xx.dll - 8/21/2000 c2.dll - 11/9/2000. The project is setup for /O1 which from what I know is how we build Mozilla. I also tried /O2 and the various pentium optimizations and they all produced the same error. Marcin the differences you stated really should have no impact on the outcome of the calculations. I'll take a look and see if I can't step through both versions with a debugger at the same time and see what's different. I'm at a loss as to what build change might cause this. I've tried changing to various optimization levels, calling conventions, alignment, and the only thing that makes the error go away is turning off all optimizations.
Attachment #96450 - Attachment is obsolete: true
Note the build system for JS was changed recently on Windows: Makefile.in and makefile.win were fixed to get around an MSVC compiler bug. See bug 151066 and this attachment from that bug: http://bugzilla.mozilla.org/attachment.cgi?id=91211&action=view But that was checked in on 2002-07-23, so I guess it's not relevant to this bug opened on 2002-04-29, but I just thought I'd mention it. Waldemar also suspected some change in the build system which might have occurred within the timeframe he outlined in Comment #18 above: > This bug was introduced between the trunk builds 2002-04-25-12 > and 2002-04-26-03. There weren't many changes done to the tree > in those few hours, and none to the numeric code.
bug#164921 User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.1) Gecko/20020826 Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.1) Gecko/20020826 value1 = 99999999999; value1.toString(); // we got 99999999999, OK value2 = 100000000000; value2.toString(); // we got 0:0000000000, Wrong value3 = 426067200000; value3.toString(); // we got 4260671:0000, Wrong Reproducible: Always
The difference is the build systems. nmake used the "default" optimization level, while the gmake system is setup to use O1 (optimize for size). While this optimization change is the change that caused this problem, I'm not entirely convinced the routine itself doesn't have a problem. When walking through the code and watching the results of the calculations I see similar errors. I even see errors on the low side early on. It's just when it gets to the last few digits, it appears that most if not all numbers error on the high side, so we don't see the problem. For instance I see 1.91871999999999998 as the result d /ds of the L = (Long)(d / ds); This doesn't matter at this point. But I wonder if there are some numbers that will have this low error occur in processing the last digit.
Ok, I found a VC++ compiler option that looks like it deals with this problem. /Op improves floating point consistency. When I applied that option the routine works regardless of the optimization. So we should probably get this added to the Windows build. Also gcc has -ffloat-store: Do not store floating point variables in registers. This prevents undesirable excess precision on machines such as the 68000 where the floating registers (of the 68881) keep more precision than a double is supposed to have. For most programs, the excess precision does only good, but a few programs rely on the precise definition of IEEE floating point. Use `-ffloat-store' for such programs. I wonder if this should be used for gcc based builds?
/* log(x) ~=~ log(1.5) + (x-1.5)/1.5 * log10(x) = log(x) / log(10) * ~=~ log(1.5)/log(10) + (x-1.5)/(1.5*log(10)) * log10(d) = (i-Bias)*log(2)/log(10) + log10(d2) * * This suggests computing an approximation k to log10(d) by * * k = (i - Bias)*0.301029995663981 * + ( (d2-1.5)*0.289529654602168 + 0.176091259055681 ); * * We want k to be too large rather than too small. * The error in the first-order Taylor series approximation * is in our favor, so we just round up the constant enough * to compensate for any error in the multiplication of * (i - Bias) by 0.301029995663981; since |i - Bias| <= 1077, * and 1077 * 0.30103 * 2^-52 ~=~ 7.2e-14, * adding 1e-13 to the constant term more than suffices. * Hence we adjust the constant term to 0.1760912590558. * (We could get a more accurate k by invoking log10, * but this is probably not worthwhile.) */ This comment from jsdtoa.c suggests that the constants for calculating the log(x) necessary for binary to decimal conversions are fine tuned for double. Notice the approximation is supposed to always error on the same side. This suggests that David's theory about too much precision could indeed be causing the problem.
cc'ing cls re David's ideas in Comment #34 above -
> The difference is the build systems. nmake used the "default" optimization > level, while the gmake system is setup to use O1 (optimize for size). No, the nmake build used -O1 as well. http://lxr.mozilla.org/seamonkey/source/config/WIN32#72 Using -ffloat-store sounds fine to me. Someone might also want to look to see if -mieee-fp makes any difference as well. We should already be using -mieee on alpha processors.
Ugh, apparently when I moved to gmake I forgot all the stuff about nmake. I repulled the 1.0 branch and compiled with the proper settings and I do get the error. The /Op option does "fix" the problem. I pulled MOZILLA_0_9_4_BRANCH and I see the problem in xpcshell with that version. What puzzles me is that I don't see the problem on Netscape 7.0 PR1, which I thought was branched later. Oddly enough I built NETSCAPE_7_0_RTM_RELEASE and still can reproduce the problem xpcshell. So does the Netscape builds have a different set of compiler options?
Another thought, could the VC++ Service Pack version have an effect, 5 vs 6?
Still looking at this bug. I am going to isolate the jsdtoa code and check its behavior under a variety of FPU settings, i.e., precision control and rounding directions.
I built the project from comment 30 using Visual C++ 6.0, and it still executes correctly. I believe that I'm using service pack 5 (which Microsoft claims is the latest one available on their downloads web site), but I am not sure; if I try to download that service pack again to reinstall it, it downloads corrupted and refuses to decompress. I can't find any references to a service pack 6.
I was looking at what I thought was Visual C++ only stuff, but it also lists Source Safe, which has a service pack 6. I'm downloading SP5 and I'll apply it to make sure I'm using it, I'm pretty sure I am, but not 100%. I really wish they reported this in the about box. I've looked, around, if anyone knows where to spot the SP# let me know.
I just reinstalled Visual C++ Service Pack 5 and rebuilt the latest enclosed project from scratch. Still no error -- I get the correct result. The compiler file modification dates are the same as in comment 30, except that c2.dll has the date 8/23/2000.
Wanted to post this here as well. The C2.dll that I have came from Visual C++ 6.0 Processor Pack. http://msdn.microsoft.com/vstudio/downloads/tools/ppack/default.asp So is this a bug in that update, or does this update expose a problem in the jsdtoa code? /Op fixes the problem, but I don't know the impact to performance is if we globally compile with this option. We could compile just jsdtoa.c and with /Op to minmize the performance impact.
Update: this bug, and/or bug 160602, is now causing crashes for Windows users at http://slashdot.org. We need to get fixes in for both bugs -
cls or another make guru needs to verify that this is the correct way to add this option, but this will get the idea across. I don't think we need to globally use this flag, although an argument could be made. I did this across js/src because I think it could matter in other parts of JS as well. If others disagree, we could just specify the -Op for the jsdtoa.c file only. I don't know how this affects speed. I would expect to see a slight decrease for some floating point operations, but this is a trade off between speed and accuracy. The only other alternative I know of is to modify the jsdtoa.c code to work in the presence of the optimization.
Comment on attachment 113634 [details] [diff] [review] Adds -Op to the js/src yeah, that looks fine (we don't use c++/CXXFLAGS in js)
Attachment #113634 - Flags: review+
This bug blocks bug 160602, a Slashdot crasher. Nominating for either 1.3 beta or final.
Flags: blocking1.3b?
Flags: blocking1.3?
Flags: blocking1.3b? → blocking1.3b-
If this is necessary for fixing a high-profile crasher then we should try to get it for 1.3 final.
Flags: blocking1.3? → blocking1.3+
If it helps for cross-platform knowledge, I'm getting this on 1.3beta on WinMe, so it isn't just NT-specific.
Comment on attachment 113634 [details] [diff] [review] Adds -Op to the js/src I spent a little time trying to profile this to measure the impact. Oddly enough, JS_strtod code appears to be slightly faster with /Op. I ran three runs for each setting and compared. one run was slightly slower, the other two were slightly faster. I think the locking that goes on in this routine creates enough noise to hide any real differences. Quantify consistently reported the JS_strtod code minus the functions it calls, to be slightly faster with /Op, although js_strtod appeared to have increased 5% to 10%. So since the performance impact is small, I think it's probably better to be correct than a few microseconds faster. As mentioned earlier this has started crashing on slashdot.com, and it will probably only increase as time goes on.
Attachment #113634 - Flags: approval1.3?
Comment on attachment 113634 [details] [diff] [review] Adds -Op to the js/src Approved for 1.3 final. /be
Attachment #113634 - Flags: approval1.3? → approval1.3+
I just remembered the Makefile.ref file. Also I changed to /Op instead of -Op as jsinterp.c has a sed command that replaces -O to add optimizations.
Patch checked in
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Verified FIXED. The JS testsuite is happy in both the debug and optimized JS shell, and the following javascript:URLS now work correctly in today's Mozilla trunk build on WinNT4.0: javascript: alert(String(819187200000)) ------> 819187200000 javascript: alert(parseFloat("+441111111111")) ------> 441111111111 The same is true in today's xpcshell, with |alert| replaced by |print|. Recall that I was only ever able to reproduce the bug in the browser and xpcshell (and not the JS shell). Now it's all fixed -
Status: RESOLVED → VERIFIED
Whiteboard: [Duplicates: bug 161046, bug 160531 Related: bug 160602? ] → [Duplicates: bug 161046, bug 160531 Related: bug 160602? ] fixed1.3
*** Bug 199293 has been marked as a duplicate of this bug. ***
Flags: testcase?
Checking in regress-140852.js; /cvsroot/mozilla/js/tests/js1_5/Regress/regress-140852.js,v <-- regress-140852.js initial revision: 1.1
Flags: testcase? → testcase+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: