Closed Bug 791214 Opened 12 years ago Closed 12 years ago

crash in ToNumberSlow

Categories

(Core :: JavaScript Engine, defect)

18 Branch
x86
Windows 7
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla20
Tracking Status
firefox18 + fixed
firefox19 + fixed
firefox20 --- fixed
firefox-esr10 --- unaffected
firefox-esr17 --- unaffected

People

(Reporter: scoobidiver, Assigned: dvander)

Details

(4 keywords, Whiteboard: [js:p1][adv-main18-])

Crash Data

Attachments

(1 file)

It's a low-volume crash, but it spikes with IonMonkey.

Stack traces vary:
Frame 	Module 	Signature 	Source
0 	mozjs.dll 	ToNumberSlow 	js/src/jsnum.cpp:1374
1 	xul.dll 	mozilla::dom::ValueToPrimitive<double,0> 	obj-firefox/dist/include/mozilla/dom/PrimitiveConversions.h:339
2 	xul.dll 	mozilla::dom::CanvasRenderingContext2DBinding::translate 	obj-firefox/dom/bindings/CanvasRenderingContext2DBinding.cpp:164
3 	xul.dll 	mozilla::dom::CanvasRenderingContext2DBinding::genericMethod 	obj-firefox/dom/bindings/CanvasRenderingContext2DBinding.cpp:2577
4 	mozjs.dll 	js::InvokeKernel 	js/src/jsinterp.cpp:367
5 	mozjs.dll 	js::Interpret 	js/src/jsinterp.cpp:2454
...

Frame 	Module 	Signature 	Source
0 	mozjs.dll 	ToNumberSlow 	js/src/jsnum.cpp:1374
1 	mozjs.dll 	js::math_sin 	js/src/jsmath.cpp:601
2 	mozjs.dll 	js::InvokeKernel 	js/src/jsinterp.cpp:367
3 	mozjs.dll 	js::Interpret 	js/src/jsinterp.cpp:2454
...

Frame 	Module 	Signature 	Source
0 	mozjs.dll 	ToNumberSlow 	js/src/jsnum.cpp:1374
1 	mozjs.dll 	js::Interpret 	js/src/jsinterp.cpp:2080
2 	mozjs.dll 	js::InvokeKernel 	js/src/jsinterp.cpp:378
3 	mozjs.dll 	js::InvokeConstructorKernel 	js/src/jsinterp.cpp:442
4 	mozjs.dll 	js::InvokeConstructor 	js/src/jsinterp.cpp:467
5 	mozjs.dll 	js::ion::InvokeConstructor 	js/src/ion/VMFunctions.cpp:80

More reports at:
https://crash-stats.mozilla.com/report/list?signature=ToNumberSlow
It's the #13 top browser crasher in 18.0a1.
Keywords: topcrash
bp-b0388a3b-377f-487f-a9fd-bc4d12121014
bp-a89ef020-5390-416e-ac9a-79f742121014

http://hg.mozilla.org/mozilla-central/rev/57304bbf9c0e
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:19.0) Gecko/19.0 Firefox/19.0 ID:20121014030627

Reproducible: very often

Steps to reproduce:
1. Maximize the browser.
2. Open http://paperjs.org/examples/smoothing/
3. Double-click several times.
4. Reload.
5. Repeat steps 3-4.
There are only 2 crashes in 18.0a2.

It's only reproducible in Nightly.
Not tracking for 18 at the moment, as it is not a top crasher in 18.0a2.
It's the #23 top crasher in 18.0a2 and #27 in 19.0a1.

More reports at:
https://crash-stats.mozilla.com/report/list?signature=js%3A%3AToNumberSlow%28JSContext*%2C+JS%3A%3AValue%2C+double*%29
Crash Signature: [@ ToNumberSlow] → [@ ToNumberSlow] [@ js::ToNumberSlow(JSContext*, JS::Value, double*)]
Keywords: topcrash
It spiked in 19.0a1/20121106 from 5 crashes/build to 20. The regression range for the spike is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=0947e291578a&tochange=f9c2c266e7aa
Aurora 18.0a2 also crashes: bp-631a40d8-14a8-4da9-bcc8-c93f12121109
STR:
Open http://www.staggeringbeauty.com/ (Be careful with speaker volume.)
Shake the mouse vigorously.
Given this is a reproducible top crash, tracking for FF18. Not adding regressionwindow-wanted since this appears to have spiked with the IonMonkey landing. This needs engineer attention (David/Naveed?)
Phew, is this site ever... nauseating. Thanks for the sound warning, Alice :)

Using nightlies I bisected this crash between changesets aa5e3b445810 and ec10630b1a54. There are a few interesting JS patches in there, so I'm bisecting between them with try.
Assignee: general → dvander
Status: NEW → ASSIGNED
No luck trying to bisect to JS patches. Going to try a full bisect between those two csets, but it will take some time through try.
This is sensitive to PGO: a non-PGO build of the same changeset does not crash.

No crash in non-PGO build (from tinderbox-builds):
http://hg.mozilla.org/mozilla-central/rev/ec10630b1a54
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:19.0) Gecko/19.0 Firefox/19.0 ID:20121010003404
Crash in PGO build (from nightly):
http://hg.mozilla.org/mozilla-central/rev/ec10630b1a54
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:19.0) Gecko/19.0 Firefox/19.0 ID:20121010030605
(In reply to Alice0775 White from comment #12)
> This is sensitive to PGO: a non-PGO build of the same changeset does not
> crash.

Yup, I have been sending MOZ_PGO=1 builds to try.
Another way to reproduce:

  http://operasoftware.github.com/Emberwind/

click on the start screen, spacebar through the intro stuff, as soon as you get to the side-scroller portion hit D to (I think?) go right, and crash.
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #14)
> Another way to reproduce:
> 
>   http://operasoftware.github.com/Emberwind/
> 
> click on the start screen, spacebar through the intro stuff, as soon as you
> get to the side-scroller portion hit D to (I think?) go right, and crash.
I can reproduce this on the current 20.0a1 Nightly here on Windows.
> I can reproduce this on the current 20.0a1 Nightly here on Windows.
Forgot to mention, but the Signature is slightly different on 20.0a1: [@ js::ToNumberSlow(JSContext*, JS::Value, double*) ]

Here's my crash: https://crash-stats.mozilla.com/report/index/503ed4d9-b71f-4d6e-bd2a-574ca2121129
Whiteboard: [js:p1]
This is the #15 browser topcrash in 18.0b2 - dvander, can we get some progress here?
I have been investigating this bug but I don't have anything to report yet. PGO bugs are time consuming and difficult to look at. The slow turnaround time for builds does not help.
Group: core-security
Attached patch fix
tl;dr: Confirmed security-critical bug caused by PGO miscompilation. The attached patch selectively disables PGO for various functions. A try build does not crash for me on either site in this bug anymore.

Full analysis, for posterity:

The crashing address is in ToNumberSlow, but this is a red herring. It's trying to operate on a corrupt value that flows from the stack. The value is tagged as a JSString pointer, but the pointer is garbage.
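To illustrate how a value can be "tagged as a string" while carrying a garbage pointer, here is a minimal sketch of a tag-plus-payload value. Everything in it is an assumption made for illustration: the `BoxedValue` type, the helper names, and the tag constants are hypothetical and are not taken from the engine source.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical, simplified value layout: a 32-bit type tag in the upper half
// of a 64-bit word and a 32-bit payload in the lower half. The constants are
// illustrative and are not taken from the engine source.
constexpr std::uint32_t kTagBase   = 0xFFFFFF80u;
constexpr std::uint32_t kTagString = kTagBase | 0x05u;

struct BoxedValue {
    std::uint64_t bits;
};

inline std::uint32_t tagOf(BoxedValue v) {
    return static_cast<std::uint32_t>(v.bits >> 32);
}

inline std::uint32_t payloadOf(BoxedValue v) {
    return static_cast<std::uint32_t>(v.bits & 0xFFFFFFFFu);
}

// Anything whose upper word is below the tag range is read as a double.
inline bool isDouble(BoxedValue v) { return tagOf(v) < kTagBase; }
inline bool isString(BoxedValue v) { return tagOf(v) == kTagString; }

inline BoxedValue boxDouble(double d) {
    BoxedValue v;
    static_assert(sizeof(d) == sizeof(v.bits), "double must be 64 bits");
    std::memcpy(&v.bits, &d, sizeof(d));
    return v;
}

// A consumer that trusts the tag: for a string-tagged value it would treat
// the payload as a pointer. If the bits are garbage, so is the "pointer".
inline std::uint32_t stringPayload(BoxedValue v) {
    assert(isString(v));
    return payloadOf(v);
}
```

In this model a correctly boxed number never looks like a string, but a 64-bit word whose upper half happens to equal the string tag is accepted as one, and a number-conversion routine that dispatches on the tag would then follow the payload as if it were a valid pointer. That is why the crash site can be far from the code that produced the bad bits.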

Unfortunately the JavaScript function came from an eval(), so finding its source location was tricky. Eventually I used the script's atom list to find a function that looked plausible. That ended up being updateAppearance() in staggeringbeauty.com/src/main.js, which has this code:

  if (stress > 80) {
    hue += stress > 300 ? Math.pow(stress/5, 2) : stress;
    pathBody.strokeColor = 'hsl('+Math.round(hue)%360+', 100%, 50%)';

Based on the callsite, I guessed that the bad value was flowing from "hue", which is a variable higher on the scope chain, located on the heap. The same corrupt value was indeed located on the scope chain. Since that is the only place that modified "hue", I made an instrumented build to trap whenever we compile code for updating that variable. That, and a data breakpoint on the scope chain slot, was enough to catch that the line of code was, in fact, producing a corrupt value.
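The update to "hue" quoted above has two branches, and only one of them goes through a pow call. A rough C++ mirror of that snippet makes the split explicit. This is a sketch for illustration only: `nextHue` and `strokeColor` are hypothetical names, and the code mirrors the JavaScript arithmetic rather than anything in the engine.

```cpp
#include <cmath>
#include <string>

// Hypothetical C++ mirror of the quoted JavaScript update to "hue".
// Only the stress > 300 branch calls pow; the other branch adds stress directly.
double nextHue(double hue, double stress) {
    if (stress > 80) {
        hue += stress > 300 ? std::pow(stress / 5, 2) : stress;
    }
    return hue;
}

// Mirrors 'hsl(' + Math.round(hue) % 360 + ', 100%, 50%)'.
// floor(x + 0.5) is used here to approximate Math.round.
std::string strokeColor(double hue) {
    long rounded = static_cast<long>(std::floor(hue + 0.5));
    return "hsl(" + std::to_string(rounded % 360) + ", 100%, 50%)";
}
```

The string concatenation is where the number stored in "hue" gets consumed again, so a bad value written by the pow branch would only surface on a later read like this one.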

In the JIT code, the value was flowing directly from a call, which could only be Math.pow(). However, there are a bunch of intermediate C++ functions on the way to Math.pow(), so to be safe I turned PGO off for all of them. A try build confirmed that was the problem.
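For readers unfamiliar with per-function optimization control, the general shape of such a workaround looks roughly like the sketch below. It assumes MSVC's `#pragma optimize` is the mechanism (the review excerpt only shows the `#if defined(_MSC_VER)` guard, not the pragma itself), and `pow_int` is a hypothetical stand-in, not one of the functions touched by the patch.

```cpp
#include <cmath>

// Assumption: on MSVC, `#pragma optimize("", off)` switches optimizations off
// for the functions defined after it. Other compilers never see the pragma.
#if defined(_MSC_VER)
# pragma optimize("", off)
#endif

// Hypothetical stand-in for an integer-exponent pow helper
// (exponentiation by squaring).
double pow_int(double base, int exp) {
    unsigned n = exp < 0 ? 0u - static_cast<unsigned>(exp)
                         : static_cast<unsigned>(exp);
    double result = 1.0;
    for (double b = base; n != 0; n >>= 1, b *= b) {
        if (n & 1u) {
            result *= b;
        }
    }
    return exp < 0 ? 1.0 / result : result;
}

// Restore the command-line optimization settings for the rest of the file.
#if defined(_MSC_VER)
# pragma optimize("", on)
#endif
```

Scoping the pragma to a handful of functions keeps the rest of the file optimized as before, which limits the performance cost of the workaround.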
Attachment #689588 - Flags: review?(jdemooij)
Comment on attachment 689588 [details] [diff] [review]
fix

Review of attachment 689588 [details] [diff] [review]:
-----------------------------------------------------------------

Great detective work!

::: js/src/jsmath.cpp
@@ +416,5 @@
>      vp->setNumber(z);
>      return JS_TRUE;
>  }
>  
> +#if defined(_MSC_VER)

Nit: can you add a small comment here explaining why we disable PGO, maybe just link to this bug?

::: js/src/jsnum.cpp
@@ +1328,5 @@
>      JS_ASSERT(!cbuf.dbuf && cstrlen < cbuf.sbufSize);
>      return sb.appendInflated(cstr, cstrlen);
>  }
>  
> +#if defined(_MSC_VER)

Same here.
Attachment #689588 - Flags: review?(jdemooij) → review+
Comment on attachment 689588 [details] [diff] [review]
fix

Sure thing.

[Security approval request comment]
How easily can the security issue be deduced from the patch? not at all

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? no

Which older supported branches are affected by this flaw? 19, 18

If not all supported branches, which bug introduced the flaw? it's a compiler bug that randomly manifested after an m-i -> m-c merge around 10/3 or 10/4

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? this patch should apply on branch

How likely is this patch to cause regressions; how much testing does it need? unlikely, very little
Attachment #689588 - Flags: sec-approval?
Comment on attachment 689588 [details] [diff] [review]
fix

sec-approval+. Let's get this in. Please prepare branch patches and nominate them for branch approval so we can avoid shipping this issue.
Attachment #689588 - Flags: sec-approval? → sec-approval+
I just did a PGO build with VS2011 and I can -not- reproduce this with any of the URLs here.  We should consider upgrading compilers soon.
Good to know, though I'm not convinced that's a bug fix in PGO rather than just randomly different PGO behavior.
https://hg.mozilla.org/mozilla-central/rev/aeba5e501a21
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla20
Comment on attachment 689588 [details] [diff] [review]
fix

[Approval Request Comment]
Bug caused by (feature/regressing bug #): compiler PGO bug after 10/3 m-c merge
User impact if declined: prevalent crash in JavaScript
Testing completed (on m-c, etc.): yes
Risk to taking this patch (and alternatives if risky): none
String or UUID changes made by this patch:
Attachment #689588 - Flags: approval-mozilla-beta?
Attachment #689588 - Flags: approval-mozilla-aurora?
Attachment #689588 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #689588 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Whiteboard: [js:p1] → [js:p1][adv-main18-]
Group: core-security