Closed Bug 689357 Opened 13 years ago Closed 11 years ago

Crash [@ js::mjit::JaegerShotAtSafePoint]

Categories

(Core :: JavaScript Engine, defect)

ARM
Android
defect
Not set
critical

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox17 + wontfix

People

(Reporter: nhirata, Unassigned)

Details

(Keywords: crash, topcrash, Whiteboard: [mobile-crash][native-crash][ARMv6])

Crash Data

This bug was filed from the Socorro interface and is report bp-25d4ce3a-dfda-477b-9217-5326c2110924.
============================================================= 


Frame 	Module 	Signature 	Source
0 		@0x411a49e4 	
1 	libxul.so 	js::mjit::JaegerShotAtSafePoint 	js/src/vm/Stack.h:1260
2 	libxul.so 	js::Interpret 	js/src/vm/Stack.h:1133
3 	libxul.so 	js::Invoke 	js/src/jsinterp.cpp:614
4 	libxul.so 	js_fun_apply 	js/src/vm/Stack.h:268
5 	libxul.so 	js::Invoke 	js/src/jscntxtinlines.h:286
6 	libxul.so 	js::Interpret 	js/src/jsinterp.cpp:4016
7 	libxul.so 	js::RunScript 	js/src/jsinterp.cpp:614
8 	libxul.so 	js::ExternalExecute 	js/src/jsinterp.cpp:913
9 	libxul.so 	JS_EvaluateUCScriptForPrincipalsVersion 	js/src/jsapi.cpp:4929
10 	libxul.so 	nsJSContext::EvaluateString 	dom/base/nsJSEnvironment.cpp:1451
11 	libxul.so 	nsScriptLoader::EvaluateScript 	nsCOMPtr.h:655
12 	libxul.so 	nsScriptLoader::ProcessRequest 	nsCOMPtr.h:800
13 	libxul.so 	nsScriptLoader::ProcessScriptElement 	content/base/src/nsScriptLoader.cpp:745
14 	libxul.so 	nsScriptElement::MaybeProcessScript 	content/base/src/nsScriptElement.cpp:187
15 	libxul.so 	nsHTMLScriptElement::MaybeProcessScript 	content/html/content/src/nsHTMLScriptElement.cpp:336
16 	libxul.so 	nsHTMLScriptElement::DoneAddingChildren 	content/html/content/src/nsHTMLScriptElement.cpp:261
17 	libxul.so 	nsHtml5TreeOpExecutor::RunScript 	parser/html/nsHtml5TreeOpExecutor.cpp:737
18 	libxul.so 	nsHtml5TreeOpExecutor::RunFlushLoop 	parser/html/nsHtml5TreeOpExecutor.cpp:531
19 	libxul.so 	nsHtml5ExecutorReflusher::Run 	parser/html/nsHtml5TreeOpExecutor.cpp:93
20 	libxul.so 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:631
21 	libxul.so 	NS_ProcessNextEvent_P 	obj-firefox/xpcom/build/nsThreadUtils.cpp:245
22 	libxul.so 	mozilla::ipc::MessagePump::Run 	ipc/glue/MessagePump.cpp:111
23 	libxul.so 	mozilla::ipc::MessagePumpForChildProcess::Run 	ipc/glue/MessagePump.cpp:230
24 	libxul.so 	MessageLoop::RunInternal 	ipc/chromium/src/base/message_loop.cc:222
25 	libxul.so 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:514
26 	libxul.so 	nsBaseAppShell::Run 	widget/src/xpwidgets/nsBaseAppShell.cpp:191
27 	libxul.so 	XRE_RunAppShell 	toolkit/xre/nsEmbedFunctions.cpp:677
28 	libxul.so 	mozilla::ipc::MessagePumpForChildProcess::Run 	ipc/glue/MessagePump.cpp:222
29 	libxul.so 	MessageLoop::RunInternal 	ipc/chromium/src/base/message_loop.cc:222
30 	libxul.so 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:514
31 	libxul.so 	XRE_InitChildProcess 	nsAutoPtr.h:155
32 	libmozutils.so 	ChildProcessInit 	other-licenses/android/APKOpen.cpp:794
33 	plugin-container 	main 	ipc/app/MozillaRuntimeMainAndroid.cpp:69
34 	libc.so 	__libc_preinit 	

Only Report : 
https://crash-stats.mozilla.com/report/list?range_value=7&range_unit=days&date=2011-09-26%2008%3A00%3A00&signature=js%3A%3Amjit%3A%3AJaegerShotAtSafePoint&version=Fennec%3A8.0a2

6th in Top 10 of Aurora Crashes
Whiteboard: [mobile-crash] → [mobile-crash][native-crash]
More reports at:
https://crash-stats.mozilla.com/report/list?signature=js%3A%3Amjit%3A%3AJaegerShotAtSafePoint
https://crash-stats.mozilla.com/report/list?signature=js%3A%3Amjit%3A%3AJaegerShotAtSafePoint%28JSContext*%2C+void*%2C+bool%29
Crash Signature: [@ js::mjit::JaegerShotAtSafePoint] → [@ js::mjit::JaegerShotAtSafePoint] [@ js::mjit::JaegerShotAtSafePoint(JSContext*, void*, bool)]
Version: 8 Branch → Trunk
Crash Signature: [@ js::mjit::JaegerShotAtSafePoint] [@ js::mjit::JaegerShotAtSafePoint(JSContext*, void*, bool)] → [@ js::mjit::JaegerShotAtSafePoint] [@ js::mjit::JaegerShotAtSafePoint(JSContext*, void*, bool) ]
It's #2 top crasher in 17.0b3 for ARMv6 devices.
tracking-fennec: --- → ?
Whiteboard: [mobile-crash][native-crash] → [mobile-crash][native-crash][ARMv6]
David/Naveed - can you help find somebody to do code investigation here while we pull URLs and device correlations?
Here are device correlations in 17.0:
LGE LG-P500 	8
LGE LG-P698f 	7
HUAWEI CM980 	5
LGE LG-VM701 	4
LGE LG-E510g 	4
ZTE Skate 	3
Samsung SCH-R720 	3
LGE LG-E510f 	3
LGE LG-P698 	3
LGE LG-E510 	3
ZTE Orange Monte Carlo 	2
Samsung GT-P1000 	2
Samsung Nexus S 	2
LGE LG-P690 	2
Samsung GT-P7510 	2
HTC Status 	2
Motorola XT530 	2
Samsung GT-S5660 	1
TeleEpoch Chaser 	1
ZTE Blade S 	1
Sony Ericsson R800a 	1
Unknown GOCLEVER TAB A73 	1
Motorola DROID2 GLOBAL 	1
KYOCERA C5120 	1
IVIO EVO_Tab 	1
HTC Desire HD 	1
HTC ChaCha A810e 	1
LGE LG-AS680 	1
ASUS Transformer Pad TF300T 	1
Samsung GT-P3113 	1
Samsung GT-I9100P 	1
Motorola XT531 	1
Samsung GT-P5110 	1
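For a rough sense of how the 17.0 correlations above cluster, the counts can be tallied by manufacturer. This is a minimal sketch, assuming the first whitespace-separated token of each row is the vendor (so "Sony Ericsson" is counted under "Sony"); the `tally_by_vendor` helper is hypothetical and not part of any Socorro tooling.

```python
from collections import Counter

# Device correlations for 17.0, copied from the list above (device name, then crash count).
ROWS = """\
LGE LG-P500 8
LGE LG-P698f 7
HUAWEI CM980 5
LGE LG-VM701 4
LGE LG-E510g 4
ZTE Skate 3
Samsung SCH-R720 3
LGE LG-E510f 3
LGE LG-P698 3
LGE LG-E510 3
ZTE Orange Monte Carlo 2
Samsung GT-P1000 2
Samsung Nexus S 2
LGE LG-P690 2
Samsung GT-P7510 2
HTC Status 2
Motorola XT530 2
Samsung GT-S5660 1
TeleEpoch Chaser 1
ZTE Blade S 1
Sony Ericsson R800a 1
Unknown GOCLEVER TAB A73 1
Motorola DROID2 GLOBAL 1
KYOCERA C5120 1
IVIO EVO_Tab 1
HTC Desire HD 1
HTC ChaCha A810e 1
LGE LG-AS680 1
ASUS Transformer Pad TF300T 1
Samsung GT-P3113 1
Samsung GT-I9100P 1
Motorola XT531 1
Samsung GT-P5110 1
"""


def tally_by_vendor(rows: str) -> Counter:
    """Sum crash counts per vendor, treating the first token of each row as the vendor."""
    totals = Counter()
    for line in rows.splitlines():
        if not line.strip():
            continue
        # The count is the last whitespace-separated field; everything before it is the device.
        device, count = line.rsplit(None, 1)
        totals[device.split()[0]] += int(count)
    return totals


totals = tally_by_vendor(ROWS)
for vendor, count in totals.most_common(5):
    print(f"{vendor}\t{count}")
```

On this list, LGE devices account for 35 of the 73 reports, followed by Samsung with 13.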
We have never had much luck investigating JIT crashes without STR (especially ones on ARM :( ), so there's not much we can do here. A few thoughts:

(1) Did this crash start occurring after bug 793740 landed? We disabled JM+TI to reduce a previous topcrash. It would be interesting to know whether this reduced the overall volume of topcrashes or not.

(2) Are we fuzzing on ARMv6? Fuzzing is one of the precious few testing mechanisms that makes us really confident about JIT stability. I'd go as far as to say we should not ship JITs without having fuzzed them :) so we may want to prioritize that. Fuzzing (and other random ways of bumping into STR) has had more success than investigating topcrashes directly.

(3) I don't have enough data to make this decision myself, but we could always consider disabling the JIT entirely on ARMv6 if we don't think it's stable enough to ship.
(In reply to David Anderson [:dvander] from comment #6)
> (1) Did this crash start occurring after bug 793740 landed?
Bug 793740 landed in 16.0 Beta 6 but this bug was already #4 top crasher in 16.0 Beta 5 for ARMv6 devices along with bug 670603. See https://crash-analysis.mozilla.com/rkaiser/2012-10-01/2012-10-01.fennecandroid.16.0b5.armv6.topcrash.html
(In reply to David Anderson [:dvander] from comment #6)

> (2) Are we fuzzing on ARMv6? 

I was able to come up with a prototype setup yesterday that fuzzes an ARMv6 JS shell on linux using an emulated ARM1176 CPU (shell built with --with-arch=armv6 --target=arm-linux-gnueabi --disable-ion --with-fpu=vfp, running it with -m -a). Let me know if changes to this configuration are required for better testing.
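The setup described above could be driven by a small harness along these lines. This is only a sketch: the configure flags and the `-m -a` shell options are the ones quoted in this comment, while the use of QEMU user-mode emulation (`qemu-arm -cpu arm1176`), the sysroot path, the shell path, and the helper names are assumptions for illustration, not the actual fuzzing configuration.

```python
import subprocess

# Configure flags quoted in the comment above (used when building the JS shell).
CONFIGURE_FLAGS = [
    "--with-arch=armv6",
    "--target=arm-linux-gnueabi",
    "--disable-ion",
    "--with-fpu=vfp",
]

# Shell options quoted in the comment above.
SHELL_FLAGS = ["-m", "-a"]


def shell_command(js_shell, testcase, sysroot="/usr/arm-linux-gnueabi"):
    """Build the argv for running one testcase on an emulated ARM1176.

    Assumption: QEMU user-mode emulation is used and `sysroot` points at the
    ARM libraries; the original comment does not name the emulator.
    """
    return ["qemu-arm", "-cpu", "arm1176", "-L", sysroot,
            js_shell, *SHELL_FLAGS, testcase]


def run_testcase(js_shell, testcase, timeout=60):
    """Run one testcase and return the process return code.

    On POSIX, subprocess reports a negative return code when the child
    process was terminated by a signal.
    """
    proc = subprocess.run(shell_command(js_shell, testcase),
                          capture_output=True, timeout=timeout)
    return proc.returncode
```

A fuzzer would call `run_testcase` in a loop over generated testcases and keep any input whose return code suggests a crash.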
(In reply to David Anderson [:dvander] from comment #6)
> (3) I don't have enough data to make this decision myself, but we could
> always consider disabling the JIT entirely on ARMv6 if we don't think it's
> stable enough to ship.

Comment 7 and daily crash reports show this is a top ARMv6 crash. We should produce a try build with JIT disabled to see if there's a major performance hit. The goal would be either to get this into beta 5 (going to build Tuesday) or to wontfix it for FF17, based on a QA review.
Keywords: steps-wanted
QA Contact: kbrosnan
3 	about:blank
2 	about:home
1 	http://www.nytimes.com/2012/10/31/us/hurricane-sandy-barrels-region-leaving-batt
1 	http://sn108w.snt108.mail.live.com/mail/EditMessageLight.aspx?ecui=false&n=96163
1 	https://www.facebook.com/
1 	http://m.weatherbug.com/OH/Huntsville-weather/local-forecast/detailed-day-foreca
1 	http://www.facebook.com/?m2w&refid=17
1 	http://capitalarealibrary.lib.overdrive.com/E9BE16CA-B1B1-4BA0-9B66-7867B53C7650
1 	about:ntab
1 	http://aerogril.poisk-podbor.ru/
1 	http://www.mabuhay-tv.com/
1 	http://book-online.com.ua/read.php?book=927
1 	https://monagence.edf.fr/AEL/servlet/RecapitulatifFactures
1 	http://www.engadget.com/2012/10/26/asus-vivotab-rt-review/
1 	http://www.google.co.uk/
1 	https://m.facebook.com/?refsrc=http%3A%2F%2Fwww.facebook.com%2F&_rdr
1 	http://www.verkkokauppa.com/fi/product/10609/dhffv/Scythe-Kozuti-SCKZT-1000-Inte
1 	https://mail.google.com/mail/?hl=es&shva=1#compose
1 	http://www.google.ro/search?q=html+learn&hl=ro&oq=html+learn&gs_l=mobile-heirloo
1 	http://www.youtube.com/watch?v=M2nn1X9Xlps&feature=youtu.be
1 	http://www.etsy.com/listing/113029804/3-x-polaroid-colour-film-impossible-px?ref
1 	http://vreale.tv/
1 	http://www.mozilla.org/en-US/firefox/mobile/faq/
1 	https://m.facebook.com/?refsrc=http%3A%2F%2Fde-de.facebook.com%2F&_rdr
1 	https://mail.google.com/mail/u/0/?shva=1#inbox
1 	http://forum.jorsindo.com/forum.php?mobile=yes
1 	http://disk.wedos.com/
1 	http://www.google.com/search?hl=en&tbo=d&site=&source=h
Keywords: needURLs
tracking-fennec: ? → 17+
https://tbpl.mozilla.org/?tree=Try&rev=c7fb59a504dc

I just pushed a Try run that should mark the JIT as broken on ARMv6, thus disabling it.
This needs steps to reproduce.

Using the Samsung Gio and the HTC Status, I have been unable to reproduce a crash with this signature on mozilla-central/mozilla-19. Given that, I cannot tell any difference with JIT disabled in the above Try build.
Keywords: qawanted, steps-wanted
I also don't want to see JIT disabled for ARMv6 before bug 792144 has been thoroughly investigated.
What is the relationship between bug 792144 and this bug?
I agree with Aaron that I don't see much of a difference between a build with JIT and one without. I browsed around Twitter mobile, Facebook mobile, and several other sites. The only site where I saw a difference was http://blogs.marketwatch.com/ (bug 795432), where the slow script warning came up on load, pan, and other interactions with the page on the no-JM build.
(In reply to David Anderson [:dvander] from comment #14)
Is there a chance that on ARMv6 the common way to hit this is on OOM abort? Reducing memory usage would help if that is true.
From the crash stacks it looks unlikely, it seems to mostly be segfaults at random addresses.
Given that we don't have much information to distinguish the builds with and without JIT on ARMv6, and that bug 792144 is investigating JIT issues on ARMv6, we'll wontfix this for 17 and see how this plays out in the call to ship ARMv6 on 17. If we ship it, hopefully that will get us more useful information from users.
tracking-fennec: 17+ → 20+
Finkle, why is this tracking 20?
Flags: needinfo?(mark.finkle)
(In reply to Brad Lassey [:blassey] from comment #19)
> Finkle, why is this tracking 20?

Because of the crashes for ARMv6. We wanted a fix or the OK to turn off JIT for ARMv6. We found out that having VFP required for ARMv6 would/could make fixing JIT on ARMv6 easier, because of not needing to deal with software floating point modes. We were hopeful about getting a fix.

Disabling JIT on ARMv6 is a last resort, but still on the table.
Flags: needinfo?(mark.finkle)
It's #2 top crasher in 19.0 for ARMv6. With the new rules in https://wiki.mozilla.org/CrashKill/Topcrash, that qualifies it for the topcrash keyword.
Keywords: topcrash
Naveed, what do you want to do? There is only a week left until the last 20 beta.
Flags: needinfo?(nihsanullah)
tracking-fennec: 20+ → ?
There are no crashes in 20.0 and above.
Status: NEW → RESOLVED
Closed: 11 years ago
Flags: needinfo?(nihsanullah)
Resolution: --- → WORKSFORME
tracking-fennec: ? → ---

Removing steps-wanted keyword because this bug has been resolved.

Keywords: steps-wanted