Closed Bug 473552 Opened 15 years ago Closed 15 years ago

TM: SIGILL due to TraceMonkey emitting unsupported NOPL

Categories

(Core :: JavaScript Engine, defect, P1)

x86
All
defect

RESOLVED FIXED
mozilla1.9.1b3

People

(Reporter: vvy, Assigned: gal)

Details

(Keywords: fixed1.9.1, Whiteboard: fixed-in-tracemonkey)

Attachments

(1 file)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1b2) Gecko/20090113 Gentoo Firefox/3.1b2
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1b2) Gecko/20090113 Gentoo Firefox/3.1b2

Firefox 3.1b2 terminates with SIGILL on loading any page containing JavaScript. The illegal instruction, according to gdb, is nopl. My processor, VIA C3-2 Nehemiah, is among those that do not support nopl even though they probably should, being 686s.

Related discussion for binutils: http://sourceware.org/bugzilla/show_bug.cgi?id=6957
("There is also another related issue with i686 class CPUs. Some do not actually
support the NOPL instruction. Examples include Via C3, Via Eden, AMD Geode LX
(as used in OLPC), Transmeta Crusoe and it appears broken on Virtual PC.")

Also: http://kerneltrap.org/mailarchive/linux-kernel/2008/9/16/3313464



Reproducible: Always

Steps to Reproduce:
1. On e.g. a VIA C3-2 Nehemiah system, start Firefox 3.1b2 with javascript.options.jit.content enabled.
2. Visit a site containing JavaScript (reddit.com will do).

Actual Results:  
Program received signal SIGILL, Illegal instruction.
[Switching to Thread 0xb6333710 (LWP 27571)]
0xb5d70eb0 in ?? ()
(gdb) disas 0xb5d70eb0 0xb5d70eb1
Dump of assembler code from 0xb5d70eb0 to 0xb5d70eb1:
0xb5d70eb0:	nopl   (%eax)
End of assembler dump.
(gdb) bt
#0  0xb5d70eb0 in ?? ()
#1  0xb7047acc in js_ExecuteTree () from /usr/lib/xulrunner-1.9/libmozjs.so
#2  0xb70526e9 in js_MonitorLoopEdge () from /usr/lib/xulrunner-1.9/libmozjs.so
#3  0xb6ff8532 in js_Interpret () from /usr/lib/xulrunner-1.9/libmozjs.so
#4  0xb6ffaa4a in js_Execute () from /usr/lib/xulrunner-1.9/libmozjs.so
#5  0xb6fbf1f3 in JS_EvaluateUCScriptForPrincipals () from /usr/lib/xulrunner-1.9/libmozjs.so
#6  0xb790b0af in nsJSContext::EvaluateString () from /usr/lib/xulrunner-1.9/libxul.so
[...]


Workaround: Disable javascript.options.jit.* by editing prefs.js (Firefox will crash before you can go to about:config).
Assignee: nobody → general
Component: General → JavaScript Engine
Product: Firefox → Core
QA Contact: general → general
Summary: SIGILL due to TraceMonkey emitting unsupported NOPL → TM: SIGILL due to TraceMonkey emitting unsupported NOPL
Version: unspecified → Trunk
Reporter is right. We emit multi-byte NOPs without checking that they are supported by the machine. I will take out the alignment code. It's not helping us anyway.
Assignee: general → gal
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: blocking1.9.1?
Priority: -- → P1
Target Milestone: --- → mozilla1.9.1b3
If Adobe wants this, we have to make it optional and/or detect the architecture right when we merge.
OS: Linux → All
Attachment #361670 - Flags: review? → review+
Comment on attachment 361670 [details] [diff] [review]
don't align fragment entry, it never worked improved perf and is invalid for older processors

Might want to revert to just emitting a sequence of universally-valid nops, at least in the loop-target case where alignment might matter. For now this is probably fine though.
Pushed to TM.

http://hg.mozilla.org/tracemonkey/rev/1ce06a8fc67e
Whiteboard: fixed-in-tracemonkey
+1, alignment has nearly zero value, but agreed that if we bring back NOP alignment, we need a) valid CPU detection in nanojit, and b) a fallback to whatever valid NOPs the CPU supports.
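For illustration only, the fallback described in b) could look roughly like the sketch below. This is not nanojit code: the function name emit_nop_padding and the has_long_nop flag are hypothetical, and the flag is assumed to come from a separate CPU check that is not shown here. The bytes 0F 1F 00 are the encoding of the `nopl (%eax)` instruction seen in the gdb output above, and 0x90 is the single-byte NOP.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not taken from nanojit: fill `n` bytes at `buf`
 * with no-op instructions. `has_long_nop` is assumed to be the result of
 * a separate CPU capability check (not shown). */
static void emit_nop_padding(uint8_t *buf, size_t n, int has_long_nop)
{
    size_t i = 0;

    if (has_long_nop) {
        /* 0F 1F 00 encodes `nopl (%eax)`, a 3-byte NOP. */
        while (n - i >= 3) {
            buf[i++] = 0x0F;
            buf[i++] = 0x1F;
            buf[i++] = 0x00;
        }
    }

    /* 0x90 is the single-byte NOP, which every x86 processor accepts.
     * It covers the remainder, or everything when long NOPs are unsafe. */
    while (i < n)
        buf[i++] = 0x90;
}
```

With has_long_nop set to 0, the padding consists only of 0x90 bytes, which is the "universally-valid nops" option mentioned in the review comment.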
Flags: blocking1.9.1? → blocking1.9.1+
http://hg.mozilla.org/mozilla-central/rev/1ce06a8fc67e
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Is Bug 477471 a dup? The reported CPUs are AMD K6-2/500 and Intel Pentium-MMX/166

The crash stat report doesn't seem to show opcodes:
http://crash-stats.mozilla.com/report/index/e8853e64-9e9a-4ed4-a808-360402090207?p=1
Blocks: 477471
In Bug 477471 Comment 15 we are still seeing a SIGILL crash if you visit a page with Flash. The crash signature is different [js_MonitorLoopEdge(JSContext*, unsigned int&)] but it looks like JIT code is involved.
I have removed the code that generates NOPL, so this cannot be the same cause any more.
Philip: please file a new bug and cite it here. Thanks,

/be
> Philip: please file a new bug and cite it here. Thanks,

Brendan, I've filed Bug 480822 TM: SIGILL Crash [js_MonitorLoopEdge(JSContext*, unsigned int&)]
Flags: in-testsuite-