Closed Bug 635617 Opened 13 years ago Closed 13 years ago

64-bit crash [@ MakeDay]

Categories

(Core :: JavaScript Engine, defect)

x86_64
Windows 7
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla8

People

(Reporter: chofmann, Assigned: m_kato)

References

Details

(Keywords: topcrash, Whiteboard: [fixed-in-tracemonkey])

Crash Data

Attachments

(3 files, 2 obsolete files)

This bug was filed from the Socorro interface and is 
report bp-c0e3dc2e-c404-4598-b1d1-024ac2110220 .
============================================================= 

looks like this crash might have been around for awhile, but in the last couple of days there has been a slight bump in volume

         MakeDay

date     total    breakdown by build
         crashes  count build, count build, ...

20110213 2  	1 4.0b82010121417, 
        		1 3.6.132010120307, 
20110214   
20110215 1 4.0b12pre2011021403 	1 , 
20110216 1 4.0b12pre2011021603 	1 , 
20110217 1 4.0b12pre2011021703 	1 , 
20110218 1 4.0b12pre2011021803 	1 , 
20110219 7  	4 4.0b12pre2011021814, 
        		3 4.0b12pre2011021903, 

stacks mostly look like this, but there a few other variations

0 	mozjs.dll 	MakeDay 	js/src/jsdate.cpp:365
1 	mozjs.dll 	date_parseString 	js/src/jsdate.cpp:1141
2 	mozjs.dll 	js_Date 	js/src/jsdate.cpp:2551
3 	mozjs.dll 	js::InvokeConstructor 	js/src/jsinterp.cpp:1231
4 	mozjs.dll 	js::Interpret 	js/src/jsinterp.cpp:4616
5 	xul.dll 	nsIFrame::InvalidateInternalAfterResize 	layout/generic/nsFrame.cpp:4143

more reports at

https://crash-stats.mozilla.com/report/list?signature=MakeDay

looks like the code in the area has changed a bit over the last few weeks and months so there might be a possible low volume regression.
From 4.0b10pre/20110125, it is 64-bit build specific.

> but there a few other variations
The other stack trace looks like:

0 	mozjs.dll 	MakeDay 	js/src/jsdate.cpp:365
1 	mozjs.dll 	js_DoubleToInteger 	js/src/jsnum.h:603
2 	mozjs.dll 	date_msecFromArgs 	js/src/jsdate.cpp:611
3 	mozjs.dll 	js_GetPropertyHelper 	js/src/jsobj.cpp:5463
4 	mozjs.dll 	js_LookupPropertyWithFlags 	js/src/jsobj.cpp:5034
5 	mozjs.dll 	js_Date 	js/src/jsdate.cpp:2558
6 	mozjs.dll 	js::InvokeConstructor 	js/src/jsinterp.cpp:1230
7 	mozjs.dll 	js::Interpret 	js/src/jsinterp.cpp:4654
OS: Windows NT → Windows 7
Hardware: All → x86_64
Summary: crash [@ MakeDay] → 64-bit crash [@ MakeDay]
Version: unspecified → Trunk
This is strange crash.  But this may be http://support.microsoft.com/kb/982107
Here's a patch for MakeDay().

It removes this fmod() call which was useless anyway. It also removes the "if (month < 0) month += 12;".

If the fmod bug was the cause of this crash (I don't know), then it would fix the crash.
Attachment #515074 - Flags: review?
CC'ing Jeff and Nick because `hg annotate` says they last made changes around there. Can you review my patch?
Wrong Jeff.  :-)
So, um.  Am I misreading this wrong when I see it (with or without patch) as, first thing, doing += on an uninitialized |year| variable?
year is passed as argument to this function.
So, yes, I am misreading this.  Quit horribly!  :-)  Looking harder at this.
Ignoring the merits of the changes for possibly fixing a crash, this makes the algorithm deviate from the MakeDay algorithm in ES5 15.9.1.12 even more than the current algorithm does, further hindering analysis of its correctness.  That seems not cool to me.  If we did it more like the spec, integer-izing everything as a first pass, we could not have to worry about potential fmod bugs.  Right?  For the number of crashes here, I don't see an imperative to fix this in a way that does things less like the spec, when doing them more like the spec would apparently also avoid the MS fmod bug and would be easier to analyze as correct.
I don't think that my patch introduces any numerical difference. It just replaces

    result = fmod(x, 12.0)
    if (result < 0) result += 12;

by

    result = x - 12.0 * floor(x / 12.0)

The attached test program shows that that doesn't make any difference as long as the double values don't stay in extended 80bit registers on x87.

The spec does prescribe using fmod(), but do we really have to follow it literally, especially if this doesn't change the result? It tells us to do 3 operations: division, floor, fmod. That's redundant! The fmod result follows from the result of the division/floor operation.
To clarify about the 80bit registers thing --- note that this can easily even make a condition of the form

  expression == expression

be false, if the result of evaluating |expression| was once kept in registers (80bit) and once stored in memory (64bit). Just clarifying that I'm not cheating here :)
I agree it doesn't introduce numerical difference.  What I'm saying is having the implementation algorithm read like the spec algorithm, even if it's a little redundant, pays off so long as the code is not super-performance-sensitive.  Readability counts, notably for verifying correctness.  I don't see MakeDay as a performance-sensitive method, not to the level where an extra FPU operation or two would show up outside (maybe) a micro-benchmark.  If you could demonstrate otherwise, of course, that'd be a different matter.
Updated test program testing 1000x bigger numbers --- still no difference.
Attachment #515233 - Attachment is obsolete: true
(In reply to comment #12)
> I agree it doesn't introduce numerical difference.  What I'm saying is having
> the implementation algorithm read like the spec algorithm, even if it's a
> little redundant, pays off so long as the code is not
> super-performance-sensitive.  Readability counts, notably for verifying
> correctness.  I don't see MakeDay as a performance-sensitive method, not to the
> level where an extra FPU operation or two would show up outside (maybe) a
> micro-benchmark.  If you could demonstrate otherwise, of course, that'd be a
> different matter.

Ah OK. Sure! Didn't want to bother you with premature optimization. So, assuming that the bug is indeed with fmod(), what's the fix going to look like? Wes from #jsapi pointed me to the js_fmod() function and mentioned that there were 5 other occurrences of fmod() around there. Should we just replace them by js_fmod() ?
js_fmod just funnels to fmod (usually), so that's probably not adequate and/or sufficient.  I'm not entirely sure what would be, offhand.  Dates aren't really my specialty (not to say they're anyone else's, tho), and the change you saw in revision logs was probably a systemic fix which didn't have much to do with dates or times or chronological calculation.

We don't even know if the MS fmod bug is what's at fault, do we?  Really wishing we had a testcase right now.  Also, looking at the crash reports, half of them seem to have half-busted stacks: js_GetPropertyHelper or js::Shape::get as the caller for MakeDay, and other such clearly-wrong things.  On the other hand, there seem to be reports that finger date parsing output funneling into date_msecFromArgs, so it does seem real at least somewhat.
Attached patch fix for fmod workaround (obsolete) — Splinter Review
Attachment #515556 - Flags: review?(jwalden+bmo)
Assignee: general → m_kato
Comment on attachment 515556 [details] [diff] [review]
fix for fmod workaround

This appears reasonable to me, but Windows, Windows 64-bit, and Windows 64-bit assembly are not quite my strong points, so probably I should foist it on someone else.  :-)
Attachment #515556 - Flags: review?(jwalden+bmo) → review?(sstangl)
It would be very nice if this could be reviewed and land because it's now crashing on Youtube for me with this signature.
If nobody can review the assembly patch, please reconsider my patch, it was just 5 lines of c++.

The 2 patches are pretty much orthogonal, one avoids using fmod, one rewrites fmod.
Comment on attachment 515556 [details] [diff] [review]
fix for fmod workaround

Whatever, then.  It's short enough that I can probably review reasonably without too much worry about it being horribly wrong.  If something goes wrong, we can back out or figure out what to do then.
Attachment #515556 - Flags: review?(sstangl) → review+
I should add _MSC_VER check.
Attachment #515556 - Attachment is obsolete: true
Attachment #537979 - Flags: review?(jwalden+bmo)
Attachment #537979 - Flags: review?(jwalden+bmo) → review+
Crash Signature: [@ MakeDay]
cdleary-bot mozilla-central merge info:
http://hg.mozilla.org/mozilla-central/rev/c25968f1f4f0
http://hg.mozilla.org/mozilla-central/rev/763b64b88b59 (backout)
Note: not marking as fixed because last changeset is a backout.
http://hg.mozilla.org/tracemonkey/rev/76279a998f64
Whiteboard: [fixed-in-tracemonkey]
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla8
Depends on: 670091
I just had another MakeDay crash with the 2011-07-07 build which should have the fix already afaik.

https://crash-stats.mozilla.com/report/index/968133df-2fa5-4b12-b64a-f216c2110708
This compiles fine with VC9 assembler, but VC8 assembler doesn't like it...
(In reply to comment #26)
> This compiles fine with VC9 assembler, but VC8 assembler doesn't like it...
Microsoft (R) Macro Assembler (AMD64) Version 8.00.40310.39
Copyright (C) Microsoft Corporation.  All rights reserved.

MASM : warning A4018: invalid command-line option : -o
 Assembling: jswin64.asm
jswin64.asm(45) : error A2222: x87 and MMX instructions disallowed; legacy FP state not saved in Win64
jswin64.asm(47) : warning A4026: .ENDPROLOG found before EH directives : js_myfmod
I should turn off this on VC8.   But VC8 has fmod bug, too...

(In reply to comment #27)
> (In reply to comment #26)
> > This compiles fine with VC9 assembler, but VC8 assembler doesn't like it...
> Microsoft (R) Macro Assembler (AMD64) Version 8.00.40310.39
> Copyright (C) Microsoft Corporation.  All rights reserved.
> 
> MASM : warning A4018: invalid command-line option : -o
>  Assembling: jswin64.asm
> jswin64.asm(45) : error A2222: x87 and MMX instructions disallowed; legacy
> FP state not saved in Win64
> jswin64.asm(47) : warning A4026: .ENDPROLOG found before EH directives :
> js_myfmod
Hmm, after looking more carefully, it seems that this error code was removed in later versions of ml64.exe, so it's a bug in the assembler.

If VC8 has the fmod bug, then maybe I can work around it locally by using
byte 0dbh, 0e2h
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: