Closed
Bug 480822
Opened 15 years ago
Closed 15 years ago
TM: SIGILL Crash [@ js_MonitorLoopEdge(JSContext*, unsigned int&)]
Categories
(Core :: JavaScript Engine, defect, P2)
Tracking
()
VERIFIED
FIXED
mozilla1.9.1b4
People
(Reporter: philip.chee, Assigned: gal)
References
()
Details
(Keywords: crash, regression)
Crash Data
Attachments
(1 file)
+++ This bug was initially created as a clone of Bug #477471 +++
From Bug 477471 Comment 12:
Build ID: 20090228000503
Build identifier: Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.9.1b3pre) Gecko/20090228 SeaMonkey/2.0b1pre
Toggling javascript.options.jit.content back to true and not using safe-mode, it launches, but, if a web page has a Flash ad on it (and which ones do not?), SeaMonkey crashes every time. :(
Crash ID: bp-429da3c6-0fe7-4506-b651-a51542090228
Crash ID: bp-a559e867-c944-44b8-b94c-3de882090228
Crash ID: bp-f688369b-00c7-4a96-9a12-dc3b32090228
0 @0xb131bbad
1 @0xbfbd88d7
2 libmozjs.so js_MonitorLoopEdge js/src/jstracer.cpp:4285
3 libmozjs.so js_Interpret js/src/jsinterp.cpp:3098
Summary: TM: SIGILL Crash [js_MonitorLoopEdge(JSContext*, unsigned int&)] → TM: SIGILL Crash [@ js_MonitorLoopEdge(JSContext*, unsigned int&)]
Assignee | ||
Comment 1•15 years ago
Can you try to capture a disassembly of the crashing code? What instruction causes the SIGILL?
Perhaps just a "me too" comment, but I'm getting the SIGILL crashes with this signature on an AMD K6-III/450. The first page encountered after successfully logging into http://www.chase.com reliably triggers the crash. Setting javascript.options.jit.content to "false" seems to be a valid workaround. I've had to "roll my own" firefox for many months now (binary packages are for i686 and later). If you need more information about my build environment, don't hesitate to ask. I have attempted to perform a build with symbols in the recent past, but I have neither enough disk space nor enough memory (physical and swap). I can generate a core dump, but gdb can't seem to make any sense out of the mess: I get a "no function contains the referenced address" error when attempting disassembly of the function address given in the backtrace. Mozilla version info as below (my build is from the release 3.1b3 sources): Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.9.1b3) Gecko/20090315 Shiretoko/3.1b3
Comment 3•15 years ago
This is the number one crasher on the trunk. http://crash-stats.mozilla.com/query/query?do_query=1&branch=1.9.2&date=&range_value=1&range_unit=weeks&query_search=signature&query_type=exact&query=
Flags: blocking1.9.2?
OS: Linux → All
Assignee | ||
Comment 4•15 years ago
This is a SIGSEGV crash involving lr after an invocation, not SIGILL. Looks bad. Must fix.
Assignee: general → gal
Flags: blocking1.9.2? → blocking1.9.1?
Priority: -- → P1
Target Milestone: --- → mozilla1.9.1b4
Assignee | ||
Comment 5•15 years ago
I will try to take a look at this tonight. If someone can link a crashing website or a testcase or find the regression range, that would be enormously useful.
Whiteboard: need-regression-window, need-testcase
Comment 6•15 years ago
Comments in crash-stats say Google Reader or Facebook. I use Facebook a bit, but without any third-party Facebook applications, and have not run into this.
Comment 7•15 years ago
First reported crash was on February 10th. Was there a tm merge around then? http://crash-stats.mozilla.com/report/index/7ca39b1a-4e13-4900-ba4b-e01462090210
Updated•15 years ago
Flags: blocking1.9.1? → blocking1.9.1+
Priority: P1 → P2
Comment 8•15 years ago
(In reply to comment #4)
> This is a SIGSEGV crash involving lr after an invocation, not SIGILL. Looks
> bad. Must fix.
Should be resummarized, or are people still seeing SIGILL? Usually crashes due to a segv (load or store from a bad pointer) are quite different from illegal instruction crashes (jit bug, cpu version dependency, jump to random data). /be
Comment 9•15 years ago
<http://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A3.6a1pre&query_search=signature&query_type=exact&query=&date=&range_value=1&range_unit=weeks&do_query=1&signature=js_MonitorLoopEdge%28JSContext*%2C%20unsigned%20int%26%29> This looks fixed. There are still crashes on trunk, but the build ids are all a month old.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Updated•15 years ago
Resolution: FIXED → WORKSFORME
Comment 10•15 years ago
Any idea what fixed this? Still an issue for 1.9.1/3.5b4 http://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A3.5b4&query_search=signature&query_type=exact&query=&date=&range_value=1&range_unit=weeks&do_query=1&signature=js_MonitorLoopEdge%28JSContext*%2C%20unsigned%20int%26%29
Comment 11•15 years ago
All of the build ids in that link are from 2009-04-23. This push http://hg.mozilla.org/releases/mozilla-1.9.1/pushloghtml?changeset=6dde8411585a or something in the area probably fixed it.
Assignee | ||
Comment 12•15 years ago
It's not clear to me which patch in that merge fixed this crash. If anyone feels like bisecting this down to the patch that resolved it, that would be awesome.
Comment 13•15 years ago
NOT fixed here. Contrary to an earlier comment, this is a SIGILL crash (illegal instruction), not SIGSEGV. The problem still exists in the released firefox 3.5b99 on an AMD K6-III/450 built from unmodified source. The "about" information is Mozilla/5.0 (X11; U; Linux i586; en-US; rv:1.9.1b99) Gecko/20090609 Shiretoko/3.5b99 The crash is 100% reproducible by setting javascript.options.jit.content to its default value (true), and then attempting to access the USAA "bill pay" page (lots of javascript). Toggling the above config option to false is still a valid workaround.
Comment 14•15 years ago
Can you provide a stacktrace or crash-id (in about:crashes) of this crash?
Whiteboard: need-regression-window, need-testcase
Assignee | ||
Comment 15•15 years ago
I don't think the stacktrace will be very useful. This looks like bad instruction set detection. K6-III/450 is the important hint here. If the reporter is willing to help diagnose the problem and test patches, we can try to address this. We have no way to test locally (this is an issue with the K6-III not supporting SSE and conditional moves, and us not detecting that right).
Adding Ted here, who has some non-SSE hardware and mandate!
Comment 17•15 years ago
Can anyone provide a publicly accessible page that reproduces the problem for them? Pages behind login are unhelpful here. I have a Shiretoko nightly on an old Pentium III machine with the flash plugin installed, and I don't crash clicking around Yahoo Finance. I'll try running through our unittest suites and see if anything triggers in the meantime.
Comment 18•15 years ago
I don't think you will be able to reproduce the K6 problem on a PIII processor. There is an OS/2 user with a K6-2/500 CPU who also sees SIGILL crashes every now and then with JIT turned on. See http://groups.google.de/group/mozilla.dev.ports.os2/msg/e0f871f5ed375968 http://groups.google.de/group/mozilla.dev.ports.os2/msg/7c21fb982c57255a So his FF was crashing on www.spiegel.de and www.heise.de in conjunction with http://www.heise.de/open/Distributionsreigen-Zielgerade-und-Einlauf-fuer-Ubuntu-und-Mandriva--/artikel/136574/1 But that was with older OS/2 builds, and not fully reproducible, so probably doesn't help to fix the problem. (This bug should be reopened, right?)
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Comment 19•15 years ago
(In reply to comment #18)
> I don't think you will be able to reproduce the K6 problem on a PIII processor.
Hm, I don't know that we have anything older available. :-/ Do you think that this is a crash due to SSE1 instructions? The PIII would have that, but the K6 wouldn't. (Although the P3 lacks SSE2 or better.)
Comment 20•15 years ago
Definitely willing to help with diagnosis and patch testing. Let me know how I can help. However, do note that I cannot do a debug build: takes far more disk space than I've got, not to mention the RAM and swap required to compile/link it. I agree that a web page not hiding behind a login would be useful: I'll try to find one that reliably triggers the crash. Thanks for the assist!
I bet we could get a one-off debug build for you. Ted, am I crazy?
Comment 22•15 years ago
Careful on the one-off debug build suggestion :-). If the standard i686 build would work with my processor, I wouldn't be having to roll my own i586 version. The other thing is, the motherboard has the maximum RAM it can support: 384 MB. From what I've seen as far as the potential size of a debug build, I might not have enough system resources to execute it. That's not to say we can't enable debugging for a carefully chosen subset of the build tree if that would be useful. I guess it might be appropriate at this point to apologize for having a certain fondness for antiques.
Assignee | ||
Comment 23•15 years ago
I would suspect that we get the SSE detection wrong, so that piece should be easy to test separately.
No, no -- I mean a build of the usual i686 build with debug symbols. We want your brokenness, it's going to help us find the problem. (Debug symbols aren't loaded by default, so you should still be able to run it, but we might be able to just give you a JS shell to use.)
Assignee | ||
Comment 25•15 years ago
rct, want to run this on your machine and tell us what it returns?

static bool js_CheckForSSE2()
{
    int features = 0;
#if defined _MSC_VER
    __asm
    {
        pushad
        mov eax, 1
        cpuid
        mov features, edx
        popad
    }
#elif defined __GNUC__
    asm("xchg %%esi, %%ebx\n" /* we can't clobber ebx on gcc (PIC register) */
        "mov $0x01, %%eax\n"
        "cpuid\n"
        "mov %%edx, %0\n"
        "xchg %%esi, %%ebx\n"
        : "=m" (features)
        : /* We have no inputs */
        : "%eax", "%esi", "%ecx", "%edx"
       );
#elif defined __SUNPRO_C || defined __SUNPRO_CC
    asm("push %%ebx\n"
        "mov $0x01, %%eax\n"
        "cpuid\n"
        "pop %%ebx\n"
        : "=d" (features)
        : /* We have no inputs */
        : "%eax", "%ecx"
       );
#endif
    return (features & (1<<26)) != 0;
}
#endif

If this returns false correctly, we might have a bug in the FPU code. That would suck a lot since we have zero test coverage for that atm.
Comment 26•15 years ago
(In reply to comment #22)
> Careful on the one-off debug build suggestion :-). If the standard i686 build
> would work with my processor, I wouldn't be having to roll my own i586 version.
FWIW, our official nightly builds are not i686 targeted. They should run fine on any x86 machine. (Although clearly we have a JIT bug here somewhere.)
Comment 27•15 years ago
With a trivial main() wrapper that simply returns the value of js_CheckForSSE2(), the value returned is 0.
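A wrapper of the kind described here might look like the sketch below. It is not the reporter's program: it assumes a GCC or Clang toolchain and uses the `__get_cpuid` helper from `<cpuid.h>` instead of the inline assembly in comment 25, and the names `edx_has_bit`, `read_cpuid_edx` and `print_cpu_features` are made up for illustration. It also reports the CMOV bit (EDX bit 15) and the SSE bit (EDX bit 25), which the function in comment 25 does not look at.

```c
#include <assert.h>
#include <stdio.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h> /* GCC/Clang helper header (assumed toolchain) */
#endif

/* Feature bits in EDX after CPUID with EAX=1. */
enum { BIT_CMOV = 15, BIT_SSE = 25, BIT_SSE2 = 26 };

/* Return 1 if feature bit `bit` is set in the EDX word, else 0. */
static int edx_has_bit(unsigned int edx, int bit)
{
    assert(bit >= 0 && bit < 32);
    return (int)((edx >> bit) & 1u);
}

/* Store EDX of CPUID leaf 1 in *edx_out; return 1 on success, 0 otherwise. */
static int read_cpuid_edx(unsigned int *edx_out)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    *edx_out = edx;
    return 1;
#else
    (void)edx_out; /* not an x86 target: nothing to query */
    return 0;
#endif
}

/* Print the three bits; return the SSE2 bit, or -1 if CPUID was unreadable. */
static int print_cpu_features(void)
{
    unsigned int edx = 0;
    if (!read_cpuid_edx(&edx)) {
        puts("CPUID leaf 1 not available");
        return -1;
    }
    printf("cmov=%d sse=%d sse2=%d\n",
           edx_has_bit(edx, BIT_CMOV),
           edx_has_bit(edx, BIT_SSE),
           edx_has_bit(edx, BIT_SSE2));
    return edx_has_bit(edx, BIT_SSE2);
}
```

Calling `print_cpu_features()` from `main` gives a standalone check similar in spirit to the wrapper mentioned above; on a CPU without SSE2 the `sse2` field would be expected to read 0.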
Assignee | ||
Comment 28•15 years ago
Ok, found something. Taking the bug.
Assignee | ||
Comment 29•15 years ago
Can the reporter(s) test this patch? OPT or DEBUG are both fine. You can just rebuild the JS engine separately instead of a full browser rebuild.
Comment 30•15 years ago
Patch downloaded. Will apply it and start a build. Report to follow, but it will be several hours due to demands of life away from the keyboard. Thanks!
Assignee | ||
Comment 31•15 years ago
Moving the patch into a separate bug so we can block on that one. https://bugzilla.mozilla.org/show_bug.cgi?id=497455 Closing this back down, but added dependency.
No longer blocks: 468484
Status: REOPENED → RESOLVED
Closed: 15 years ago → 15 years ago
Resolution: --- → FIXED
Comment 32•15 years ago
is this fixed1.9.1 as well as per comment 31?
Assignee | ||
Comment 33•15 years ago
Not sure how to properly triage this. I opened a new bug to fix this issue, instead of re-opening this one. Whatever was done to this bug to re-open it should be undone.
Flags: blocking1.9.1+
Comment 34•15 years ago
Fix verified. No more SIGILL on K6-III/450 with javascript.options.jit.content set to default value of "true". Thanks Andreas!
Comment 35•15 years ago
Not worth checking for a regression range for an already fixed bug. Resolving as verified fixed based on comment 34.
Status: RESOLVED → VERIFIED
Keywords: regressionwindow-wanted
Comment 36•15 years ago
I no longer see this bug when javascript.options.jit.content is set to true, but when javascript.options.jit.chrome is set to true, FF crashes. I tested on FF 3.6b4, mozilla 1.9.2. Please let me know if anyone else is experiencing this. The machine it runs on has a Vortex processor that does not support cmov/sse2 instructions. Thank you.
What is the stack or crash report ID for the chrome.jit crash?
Comment 38•15 years ago
this is the cpu info:
processor : 0
vendor_id : Vortex86 SoC
cpu family : 5
model : 2
model name : 05/02
stepping : 2
cpu MHz : 1000.072
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu tsc cx8
bogomips : 2007.74
clflush size : 32

here is the stack dump:
*** Registering components in: Apprunner
WARNING: NS_ENSURE_TRUE(mHiddenWindow) failed: file nsAppShellService.cpp, line 399
pldhash: for the table at address 0xb52b07c8, the given entrySize of 48 probably favors chaining over double hashing.
++DOCSHELL 0xb52b0760 == 1
pldhash: for the table at address 0xb494a268, the given entrySize of 48 probably favors chaining over double hashing.
++DOMWINDOW == 1 (0xb7061020) [serial = 1] [outer = (nil)]
pldhash: for the table at address 0xb52b0ba8, the given entrySize of 48 probably favors chaining over double hashing.
++DOCSHELL 0xb52b0b40 == 2
++DOMWINDOW == 2 (0xb7061590) [serial = 2] [outer = (nil)]
++DOMWINDOW == 3 (0xb7061760) [serial = 3] [outer = 0xb7061560]
++DOMWINDOW == 4 (0xb7062410) [serial = 4] [outer = 0xb7060ff0]
pldhash: for the table at address 0xb52b1938, the given entrySize of 48 probably favors chaining over double hashing.
++DOCSHELL 0xb52b18d0 == 3
++DOMWINDOW == 5 (0xb7062d20) [serial = 5] [outer = (nil)]
pldhash: for the table at address 0xb52b1b28, the given entrySize of 48 probably favors chaining over double hashing.
++DOCSHELL 0xb52b1ac0 == 4
++DOMWINDOW == 6 (0xb7062ef0) [serial = 6] [outer = (nil)]
pldhash: for the table at address 0xb52b2aa8, the given entrySize of 48 probably favors chaining over double hashing.
++DOCSHELL 0xb52b2a40 == 5
WARNING: NS_ENSURE_TRUE(browserChrome) failed: file nsDocShell.cpp, line 9897
WARNING: Something wrong when creating the docshell for a frameloader!: file nsFrameLoader.cpp, line 912
WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file nsFrameLoader.cpp, line 936
WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file nsFrameLoader.cpp, line 193
pldhash: for the table at address 0xb52b2e88, the given entrySize of 48 probably favors chaining over double hashing.
++DOCSHELL 0xb52b2e20 == 6
++DOMWINDOW == 7 (0xb7054a30) [serial = 7] [outer = (nil)]
Program /opt/mozilla.org/lib/firefox-3.6b4/firefox-bin (pid = 4744) received signal 4.
Stack:
UNKNOWN 0xffffe420
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/libmozjs.so +0x000E0780]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/libmozjs.so +0x000F5EF3]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/libmozjs.so +0x0005DC94]
js_Invoke+0x00000746 [/opt/mozilla.org/lib/firefox-3.6b4/libmozjs.so +0x0007475C]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/libmozjs.so +0x00074B1E]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/libmozjs.so +0x00074CB9]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/libmozjs.so +0x00081539]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/libmozjs.so +0x000843F7]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/libmozjs.so +0x00065042]
js_Invoke+0x00000746 [/opt/mozilla.org/lib/firefox-3.6b4/libmozjs.so +0x0007475C]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/libmozjs.so +0x00074B1E]
JS_CallFunctionValue+0x00000066 [/opt/mozilla.org/lib/firefox-3.6b4/libmozjs.so +0x0001413C]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/components/libgklayout.so +0x004DE339]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/components/libgklayout.so +0x004D0B3F]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/components/libgklayout.so +0x004EB009]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/components/libgklayout.so +0x001630C0]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/components/libgklayout.so +0x004FC2E2]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/components/libgklayout.so +0x004FC633]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/components/libgklayout.so +0x00502153]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/components/libgklayout.so +0x00502642]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/components/libnecko.so +0x0005AC9F]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/components/libjar50.so +0x00010540]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/components/libnecko.so +0x000326DD]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/components/libnecko.so +0x000327B1]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/libxpcom_core.so +0x00064B3E]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/libxpcom_core.so +0x00086DC5]
NS_ProcessNextEvent_P(nsIThread*, int)+0x00000059 [/opt/mozilla.org/lib/firefox-3.6b4/libxpcom_core.so +0x0002F6AB]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/components/libwidget_gtk2.so +0x00049142]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/components/libtoolkitcomps.so +0x00006F8D]
XRE_main+0x00001C4F [/opt/mozilla.org/lib/firefox-3.6b4/libxul.so +0x0001CFA7]
UNKNOWN [/opt/mozilla.org/lib/firefox-3.6b4/firefox-bin +0x000017EF]
__libc_start_main+0x0000012E [/lib/libc.so.6 +0x0001620E]
Sleeping for 300 seconds.
Type 'gdb /opt/mozilla.org/lib/firefox-3.6b4/firefox-bin 4744' to attach your debugger to this thread.
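The `flags` line in the cpu info above lists only fpu, tsc and cx8, so neither `cmov` nor `sse2` is advertised by the kernel for this CPU. As a rough illustration, a check against such a flags string could be written as below. This is a sketch, not anything from the Mozilla tree: the helper name `cpuinfo_has_flag` is hypothetical, and it assumes the flags have already been read from /proc/cpuinfo into a string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if `name` appears as a whole whitespace-separated token in
 * `flags` (the text after "flags :" in /proc/cpuinfo), else 0.
 * Hypothetical helper for illustration only.
 */
static int cpuinfo_has_flag(const char *flags, const char *name)
{
    size_t n;
    const char *p;

    assert(flags != NULL && name != NULL);
    n = strlen(name);
    if (n == 0)
        return 0;

    p = flags;
    while ((p = strstr(p, name)) != NULL) {
        /* Require token boundaries so "sse" does not match inside "sse2". */
        int starts = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\t' || p[n] == '\n';
        if (starts && ends)
            return 1;
        p += 1; /* keep scanning after this partial match */
    }
    return 0;
}
```

With the flags string from this report, `cpuinfo_has_flag("fpu tsc cx8", "cmov")` and the same call for `"sse2"` both return 0.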
Assignee | ||
Comment 39•15 years ago
Looks like we are crashing in JIT code. What does the code in comment #25 return for your CPU? Maybe the CPU sets its flags wrong.
Comment 40•15 years ago
I compiled the code in comment #25 and ran it on the vortex machine; it returned 0 (as expected)
Assignee | ||
Comment 41•15 years ago
Ok, that's pretty strange. We really shouldn't be emitting cmovs or SSE instructions when that flag is off. I will review the corresponding code a bit to double-check.
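To make the idea of gating instruction selection on CPU feature bits concrete, here is a small sketch. It is not nanojit's code and not the patch from bug 497455; the types and functions (`CpuFeatures`, `choose_select`, `choose_float`) are invented for illustration. The only facts it relies on are that CPUID leaf 1 reports CMOV in EDX bit 15 and SSE2 in EDX bit 26, so the two can be tracked as separate flags. The value 0x111 used in the example corresponds to the fpu, tsc and cx8 bits listed for the Vortex86 in comment 38.

```c
/* Feature bits in EDX after CPUID with EAX=1. */
#define EDX_CMOV (1u << 15)
#define EDX_SSE2 (1u << 26)

/* Hypothetical feature record a code generator might consult. */
typedef struct {
    int has_cmov;
    int has_sse2;
} CpuFeatures;

/* Hypothetical lowering choices for a conditional select and for FP math. */
typedef enum { SELECT_WITH_CMOV, SELECT_WITH_BRANCH } SelectLowering;
typedef enum { FLOAT_WITH_SSE2, FLOAT_WITH_X87 } FloatLowering;

/* Decode the two feature bits from a CPUID leaf 1 EDX value. */
static CpuFeatures features_from_edx(unsigned int edx)
{
    CpuFeatures f;
    f.has_cmov = (edx & EDX_CMOV) != 0;
    f.has_sse2 = (edx & EDX_SSE2) != 0;
    return f;
}

/* Only pick cmov when the CPU advertises it; otherwise use a branch. */
static SelectLowering choose_select(CpuFeatures f)
{
    return f.has_cmov ? SELECT_WITH_CMOV : SELECT_WITH_BRANCH;
}

/* Only pick SSE2 arithmetic when the CPU advertises it; otherwise use x87. */
static FloatLowering choose_float(CpuFeatures f)
{
    return f.has_sse2 ? FLOAT_WITH_SSE2 : FLOAT_WITH_X87;
}
```

For example, `choose_select(features_from_edx(0x111u))` yields `SELECT_WITH_BRANCH` and `choose_float(features_from_edx(0x111u))` yields `FLOAT_WITH_X87`. Keeping the two flags separate means a missing CMOV bit is handled even if a different code path decides the floating-point strategy.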
Comment 42•15 years ago
Were you able to find out what happened? Here is the mozconfig for FF3.6b4 if you want to reproduce the bug.

# sh
# Build configuration script
# Options for client.mk.
# mk_add_options MOZ_MAKE_FLAGS=-j4
# Options for 'configure' (same as command-line options).
ac_add_options --prefix=/opt/mozilla.org
ac_add_options --libdir=/opt/mozilla.org/lib
ac_add_options --sysconfdir=/etc/firefox
ac_add_options --localstatedir=/var
ac_add_options --enable-default-mozilla-five-home
ac_add_options --with-default-mozilla-five-home=/opt/mozilla.org/lib/firefox-3.6b4
ac_add_options --host=i486-t2-linux-gnu
ac_add_options --disable-debug
ac_add_options --enable-optimize
ac_add_options --disable-dtd-debug
ac_add_options --disable-tests
ac_add_options --disable-logging
ac_add_options --disable-pedantic
ac_add_options --enable-xft
ac_add_options --enable-default-toolkit=gtk2
ac_add_options --with-system-zlib
ac_add_options --with-system-jpeg
ac_add_options --with-system-png
ac_add_options --with-system-mng
ac_add_options --enable-system-cairo
ac_add_options --enable-crypto
export BUILD_OFFICIAL=1
export MOZILLA_OFFICIAL=1
mk_add_options BUILD_OFFICIAL=1
mk_add_options MOZILLA_OFFICIAL=1
. $topsrcdir/browser/config/mozconfig
export MOZ_PHOENIX=1
mk_add_options MOZ_PHOENIX=1
ac_add_options --enable-default-toolkit=cairo-gtk2
ac_add_options --disable-mailnews
ac_add_options --disable-composer
ac_add_options --enable-extensions=default #,cookie,permissions,xml-rpc,xmlextras,pref,transformiix,webservices,auth
ac_add_options --enable-mathml
ac_add_options --enable-crypto
ac_add_options --enable-module=psm
ac_add_options --without-system-png
ac_add_options --disable-profilesharing
ac_add_options --disable-javaxpcom
ac_add_options --disable-startup-notification
ac_add_options --disable-necko-wifi
ac_add_options --disable-parental-controls
ac_add_options --disable-activex
ac_add_options --disable-activex-scripting
ac_add_options --disable-ogg
ac_add_options --disable-wave
ac_add_options --disable-accessibility
ac_add_options --disable-dbus
ac_add_options --disable-crashreporter
# speedup build
ac_add_options --disable-test
ac_add_options --disable-tests
ac_add_options --disable-glibtest
ac_add_options --disable-freetypetest
ac_add_options --disable-libIDLtest
# Some debug functions when firefox fail to start
#ac_add_options --enable-debug
#ac_add_options --enable-debug-modules
#ac_add_options --disable-strip
#ac_add_options --disable-install-strip
#ac_add_options --disable-optimize
# More to strip down functions in Firefox 3.6.x
ac_add_options --disable-libnotify
ac_add_options --disable-accessibility
ac_add_options --disable-view-source
ac_add_options --disable-plugins
ac_add_options --disable--jsd
ac_add_options --disable-universalchardet
##ac_add_options --disable-libxul
ac_add_options --disable-libIDL
ac_add_options --disable-profilelocking
##ac_add_options --disable-rdf
#ac_add_options --disable-necko-disk-cache
#ac_add_options --disable-necko-wifi
#ac_add_options --disable-necko-small-buffers
ac_add_options --disable-safe-browsing
ac_add_options --disable-help-viewer
#ac_add_options --disable-places
#ac_add_options --disable-canvas
ac_add_options --disable-canvas3d
ac_add_options --disable-updater
ac_add_options --disable-javaxpcom
ac_add_options --disable-xpctools
ac_add_options --disable-parental-control
ac_add_options --disable-leaky
ac_add_options --disable-ldap
Updated•13 years ago
Crash Signature: [@ js_MonitorLoopEdge(JSContext*, unsigned int&)]
Comment 43•11 years ago
Filter on qa-project-auto-change: Bug in removed tracer code, setting in-testsuite- flag.
Flags: in-testsuite-
Updated•9 years ago
Keywords: testcase-wanted