Closed Bug 819329 Opened 12 years ago Closed 12 years ago

crash in js::ion::AutoFlushCache::update

Product/Component: Core :: JavaScript Engine (defect)
Version: 20 Branch
Platform: ARM, Android
Priority: Not set
Severity: blocker
Status: RESOLVED FIXED
Target Milestone: mozilla20
Tracking Status: firefox20 + fixed
Reporter: scoobidiver
Assignee: bhackett1024
Whiteboard: [native-crash]
Keywords: 4
Attachments: 1 file

It first showed up in 20.0a1/20121129074803 and has been hit by five users.

Signature 	js::ion::AutoFlushCache::update(unsigned int, unsigned int)
UUID	88a9f344-e417-43f5-87cf-d5f7d2121207
Date Processed	2012-12-07 07:39:26
Uptime	354
Install Age	5.9 minutes since version was first installed.
Install Time	2012-12-07 07:33:24
Product	FennecAndroid
Version	20.0a1
Build ID	20121206030737
Release Channel	nightly
OS	Android
OS Version	0.0.0 Linux 3.0.54-KT747-g1efb136 #3 SMP PREEMPT Wed Dec 5 18:26:25 MST 2012 armv7l d2uc-user 4.0.4 IMM76D I747UCALEM release-keys
Build Architecture	arm
Build Architecture Info	
Crash Reason	SIGSEGV
Crash Address	0x10
App Notes 	
AdapterDescription: 'Qualcomm -- Adreno (TM) 225 -- OpenGL ES 2.0 AU_LINUX_ANDROID_JB.04.01.01.00.036 (CL2644550) -- Model: SAMSUNG-SGH-I747, Product: d2uc, Manufacturer: samsung, Hardware: qcom'
EGL? EGL+ GL Context? GL Context+ GL Layers? GL Layers+ WebGL? WebGL+ 
samsung SAMSUNG-SGH-I747
d2uc-user 4.0.4 IMM76D I747UCALEM release-keys
Processor Notes 	This dump is too long and has triggered the automatic truncation routine; /data/socorro/stackwalk/bin/exploitable: ERROR: unable to analyze dump
EMCheckCompatibility	True
Adapter Vendor ID	Qualcomm
Adapter Device ID	Adreno (TM) 225
Device	samsung SAMSUNG-SGH-I747
Android API Version	16 (REL)
Android CPU ABI	armeabi-v7a

Frame 	Module 	Signature 	Source
0 	libxul.so 	js::ion::AutoFlushCache::update 	Assembler-arm.cpp:2447
1 	libxul.so 	js::ion::AutoFlushCache::updateTop 	Ion.cpp:1990
2 	libxul.so 	js::ion::Assembler::executableCopy 	Assembler-arm.cpp:469
3 	libxul.so 	js::ion::IonCode::copyFrom 	Ion.cpp:341
4 	libxul.so 	js::ion::CodeGenerator::link 	IonLinker.h:58
5 	libxul.so 	js::ion::AttachFinishedCompilations 	Ion.cpp:1087
6 	libxul.so 	js_InvokeOperationCallback 	jscntxt.cpp:1294
7 	libxul.so 	SortComparatorStrings::operator 	jscntxt.h:2034
8 	libxul.so 	js::array_sort 	Sort.h:45
9 	libxul.so 	js::InvokeKernel 	jscntxtinlines.h:364
10 	libxul.so 	js::Interpret 	jsinterp.cpp:2321
11 	libxul.so 	js::RunScript 	jsinterp.cpp:326
12 	libxul.so 	js::Invoke 	jsinterp.cpp:384
13 	libxul.so 	JS_CallFunctionValue 	jsapi.cpp:5786
14 	libxul.so 	mozilla::dom::EventHandlerNonNull::Call 	EventHandlerBinding.cpp:44 
...

More reports at:
https://crash-stats.mozilla.com/report/list?signature=js%3A%3Aion%3A%3AAutoFlushCache%3A%3Aupdate%28unsigned+int%2C+unsigned+int%29
Whiteboard: [native-crash]
It's #5 top crasher in 20.0a1.
tracking-fennec: --- → ?
Keywords: topcrash
I also hit this today while exploring the My Stream page of the paid social network service http://app.net.
My report is at:
bp-eee1ae35-ac88-4893-bb0f-daceb2121212
It spiked in 20.0a1/20121212 with about 50 crashes per hour. The regression range for the spike is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=4dfe323a663d&tochange=634180132e68
Severity: critical → blocker
Depends on: 820855
Crashes for me on the front page of Ars Technica on Galaxy Nexus.
Copied over from the dupe I filed (bug 820855):


I'm getting a crash on a Galaxy Nexus running a local debug build of Android (cset 87f8165c5a0b + the patch on bug 818060). STR:

1) Start fennec
2) load about:memory
3) Scroll to the bottom and hit update
4) Pinch zoom in and out a little
5) Hit update again

GDB backtrace is attached [1]. It's happened 2/2 times so far so I assume it's easily reproducible.

[1] Now at https://bug820855.bugzilla.mozilla.org/attachment.cgi?id=691353
I can reliably reproduce this on my Galaxy Tab 10.1 running Nightly by loading a page on jsfiddle.net:
http://jsfiddle.net/B7RqJ/1/
Keywords: reproducible
More STR:
1) Open Fennec and do a Google search from the AwesomeScreen
2) Wait 1 minute after the page loads (don't let the screen turn off while waiting)
I'm guessing the regression is from bug 813559 - I'll test some before/after builds now to verify.
I can confirm that this is a regression from bug 813559.
Attached patch: patch
This should fix the crash.  There was no AutoFlushCache on the stack when finishing off-thread compilations, which would cause this to always crash on ARM.  (This also suggests that all the ARM tinderbox machines are either (a) not running/compiling with Ion, or (b) using a single core.)  An alternative fix is to set javascript.options.ion.parallel_compilation to false for Fennec.
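To illustrate the failure mode described in the comment above, here is a minimal, self-contained sketch. It is not the SpiderMonkey code or the attached patch. All names (`FlushRange`, `ScopedFlusher`, `g_currentFlusher`, `recordCodeWrite`, `attachFinishedCompilation`) are hypothetical stand-ins. The assumption is that the code-copying path looks up a "current flusher" installed by a stack guard and records the written range on it, so a path that runs without a guard ends up dereferencing a null pointer, which would be consistent with the small crash address (0x10) in the report.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical record of the code range that needs an instruction-cache flush.
struct FlushRange {
    uintptr_t start = 0;
    uintptr_t stop = 0;
    bool used = false;
};

// Hypothetical stand-in for the slot holding the currently active flusher.
static FlushRange* g_currentFlusher = nullptr;

// RAII guard: installs a flusher for the lifetime of the enclosing scope.
class ScopedFlusher {
  public:
    ScopedFlusher() : prev_(g_currentFlusher) { g_currentFlusher = &range_; }
    ~ScopedFlusher() {
        // Guards must be destroyed in reverse order of construction.
        assert(g_currentFlusher == &range_);
        // A real implementation would flush the icache for [start, stop) here.
        g_currentFlusher = prev_;
    }
    ScopedFlusher(const ScopedFlusher&) = delete;
    ScopedFlusher& operator=(const ScopedFlusher&) = delete;

    const FlushRange& range() const { return range_; }

  private:
    FlushRange range_;
    FlushRange* prev_;
};

// Records that [start, start + len) was written. Returns false when no guard
// is active; code without this check would dereference a null pointer instead.
bool recordCodeWrite(uintptr_t start, size_t len) {
    FlushRange* f = g_currentFlusher;
    if (!f)
        return false;
    uintptr_t stop = start + len;
    if (!f->used) {
        f->start = start;
        f->stop = stop;
        f->used = true;
    } else {
        if (start < f->start) f->start = start;
        if (stop > f->stop) f->stop = stop;
    }
    return true;
}

// Models the shape of the fix: the path that links finished off-thread
// compilations puts a guard on the stack before any code is copied.
bool attachFinishedCompilation(uintptr_t code, size_t len) {
    ScopedFlusher guard;
    return recordCodeWrite(code, len);
}
```

In this sketch, calling `recordCodeWrite` with no guard in scope reports failure, while `attachFinishedCompilation` succeeds because it installs its own guard first. The guard restores the previous flusher on destruction, so nested scopes keep working.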
Assignee: general → bhackett1024
Attachment #691462 - Flags: review?(dvander)
Attachment #691462 - Flags: review?(dvander) → review+
(In reply to Brian Hackett (:bhackett) from comment #11)
> (b) using a single core

This is likely true.
I also get this crash viewing the tinderbox pushlog.
(In reply to Brian Hackett (:bhackett) from comment #11)
> (This also suggests that all the ARM tinderbox machines are either (a)
> not running/compiling with Ion, or (b) using a single core.)  An alternative
> fix is to set javascript.options.ion.parallel_compilation to false for
> Fennec.

I thought our tegras were dual core?
http://en.wikipedia.org/wiki/Tegra#Tegra_2

Our currently-being-rolled-out Panda boards are also dual core:
http://pandaboard.org/node/300/#PandaES

Joel, can you confirm?
Looking on the tegras via android shell there is no indication we are using a double core processor.  This could be a software thing or maybe our old tegras are just old.
Firefox 20.0a1 (2012-12-12)
Devices: Galaxy S2, Asus EEE Transformer TF101
OS: Android 4.0.3

I was able to reproduce this issue on the latest Nightly build, by following these STR:
1. Go to google.com
2. Request the desktop site for google.com

After step 2, the app keeps crashing on launch until the session is no longer restored.
(In reply to Joel Maher (:jmaher) from comment #17)
> Looking on the tegras via android shell there is no indication we are using
> a double core processor.  This could be a software thing or maybe our old
> tegras are just old.

I asked Joel to look at this yesterday because of this bug. I think we might simply have a non-SMP kernel on the tegras, so they might have dual-core hardware but be functionally single-core. The Pandaboards are dual-core, so we'll have test coverage of that hardware configuration once we get them online.
https://hg.mozilla.org/mozilla-central/rev/64186de82d6d
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla20
Some followup. The fix here does not seem to fix the crash completely. Bug 821625 is happening on Nightly builds with the fix in this bug.

However, setting javascript.options.ion.parallel_compilation to false appears to stop the crashes. We probably need to make that change for Fennec. But I really want to fix the problem, not just turn it off. That creates technical debt that will hurt ARM in the future.
(In reply to Mark Finkle (:mfinkle) from comment #21)
> Some followup. The fix here does not seem to fix the crash completely.
I see no crashes in 20.0a1/20121214.
No longer depends on: 821625
(In reply to Scoobidiver from comment #22)
> (In reply to Mark Finkle (:mfinkle) from comment #21)
> > Some followup. The fix here does not seem to fix the crash completely.
> I see no crashes in 20.0a1/20121214.

Indeed. Closer testing shows the bug is fixed. Thanks for checking Scoobi.
Yes! I can verify that tbpl.mozilla.org and searching within about:config both work now without triggering this crash.
tracking-fennec: ? → ---