Closed Bug 604432 Opened 14 years ago Closed 14 years ago

slashdot.org crashes the tab 5 seconds after it loads on 10/14 nightly build

Categories

(Core :: JavaScript Engine, defect)

ARM
Android
defect
Not set
major

Tracking

()

VERIFIED FIXED
Tracking Status
fennec 2.0b2+ ---

People

(Reporter: aakashd, Assigned: cjones)

Details

(Keywords: crash)

Attachments

(1 file)

Build Id:
Mozilla/5.0 (Android; Linux armv71; rv:2.0b8pre) Gecko/20101014 Namoroka/4.0b8pre Fennec/4.0b2pre

Note: This is happening on the Samsung Vibrant and not on the Nexus One. So, I think this is a Galaxy S issue or a specific device issue.

Steps to Reproduce:
1. Go to www.slashdot.org/
2. Wait (after the page loads)

Actual Results:
The tab will crash.

Expected Results:
The tab shouldn't crash.
tracking-fennec: --- → ?
Keywords: crash
99.999% sure this is yarr.  Working around this needs to block b2.

Matt reports a crash on his G2 in the v8 suite, which is probably a separate yarr bug.
Assignee: nobody → general
Component: General → JavaScript Engine
Product: Fennec → Core
QA Contact: general → general
I'm not going to be able to hack on this because I've lost the ability to connect to my device, so no gdb.

cdleary --- is there a way for us to piggy-back off the JSContext's tjit and mjit flags to disable yarr JIT?  I'd hate to copy and paste the crappy blacklisting code from jscntxt.cpp to somewhere in yarr.
Ah, back on track.  I'll see if we're dying in trace-tests.
basic/bug593663-regexp.js
Exit code: -11
basic/testLirBufOOM.js
Exit code: -11
jaeger/bug563000/eif-call-newvar.js
/data/local/js/trace-test/tests/jaeger/bug563000/eif-call-newvar.js:8: Error: unexpected failure to JIT
Exit code: 3
jaeger/bug563000/eif-call-typechange.js
/data/local/js/trace-test/tests/jaeger/bug563000/eif-call-typechange.js:8: Error: unexpected failure to JIT
Exit code: 3
jaeger/bug563000/eif-call.js
/data/local/js/trace-test/tests/jaeger/bug563000/eif-call.js:8: Error: unexpected failure to JIT
Exit code: 3
jaeger/bug563000/eif-getter-newvar.js
/data/local/js/trace-test/tests/jaeger/bug563000/eif-getter-newvar.js:5: Error: unexpected failure to JIT
Exit code: 3
jaeger/bug563000/eif-getter-typechange.js
/data/local/js/trace-test/tests/jaeger/bug563000/eif-getter-typechange.js:5: Error: unexpected failure to JIT
Exit code: 3
jaeger/bug563000/eif-getter.js
/data/local/js/trace-test/tests/jaeger/bug563000/eif-getter.js:5: Error: unexpected failure to JIT
Exit code: 3
jaeger/bug563000/eif-trap-newvar.js
/data/local/js/trace-test/tests/jaeger/bug563000/eif-trap-newvar.js:5: Error: unexpected failure to JIT
Exit code: 3
jaeger/bug563000/eif-trap-typechange.js
/data/local/js/trace-test/tests/jaeger/bug563000/eif-trap-typechange.js:5: Error: unexpected failure to JIT
Exit code: 3
jaeger/bug563000/eif-trap.js
/data/local/js/trace-test/tests/jaeger/bug563000/eif-trap.js:5: Error: unexpected failure to JIT
Exit code: 3
sunspider/check-date-format-tofte.js
/data/local/js/trace-test/tests/sunspider/check-date-format-tofte.js:302:
[snip]
Exit code: 3
sunspider/check-regexp-dna.js
Exit code: -11

tofte is a known bad test.  The "failure to JIT" errors are expected because mjit is blacklisted.  The segfaults ("Exit code: -11") look very bad.
Segfault in basic/bug593663-regexp.js is easily reproducible, 5/5.
Classic, illuminating, broken-JIT backtrace

(gdb) bt
#0  js_ReportOutOfMemory (cx=0x20d05) at /home/cjones/mozilla/mozilla-central/js/src/jscntxt.cpp:1390
#1  0x000203fe in js::RegExp::executeInternal (this=<value optimized out>, cx=0x20d05, res=0x40804360, input=<value optimized out>, lastIndex=0x40804360, test=64, rval=0xac) at /home/cjones/mozilla/mozilla-central/js/src/jsregexpinlines.h:327
#2  0xbeae56b8 in ?? ()
#3  0xbeae56b8 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(In reply to comment #2)
> I'm not going to be able to hack on this because I've lost the ability to
> connect to my device, so no gdb.
> 
> cdleary --- is there a way for us to piggy-back off the JSContext's tjit and
> mjit flags to disable yarr JIT?  I'd hate to copy and paste the crappy
> blacklisting code from jscntxt.cpp to somewhere in yarr.

That's what we should do to keep Samsung's 2.1 phones working. A smarter thing to do is target only Android 2.2+.
I don't understand what you mean.  Smarter why?  Carrier-branded i9000s running 2.2 don't exist yet and there's no ETA.
(In reply to comment #8)
> I don't understand what you mean.  Smarter why?  Carrier-branded i9000s running
> 2.2 don't exist yet and there's no ETA.

If we want JITless Firefox running on these i9000s, we should turn off the JIT in the way you describe. But that's aiming squarely at the device landscape we have today. It will be a lot different 6 months from now.
I don't plan on shipping this disabling code in 4.0 final (unless carriers *really* drag their asses on 2.2).  In the meantime, I would like people using i9000s to be able to provide feedback on our betas.
(Aside: I hate that js/src and I have gotten off on the wrong foot; all I've added are gross hacks.  I hope we can do something productive at some point in the future.)
Assignee: general → jones.chris.g
Attachment #483360 - Flags: review?(cdleary)
100/100 pass on basic/bug593663-regexp.js with this patch, and a clean run through trace-tests (except for known-broken tests).
This should block beta2.
tracking-fennec: ? → 2.0b2+
(In reply to comment #10)
> I don't plan on shipping this disabling code in 4.0 final (unless carriers
> *really* drag their asses on 2.2).

As if on queue,

http://www.engadget.com/2010/10/15/galaxy-s-android-2-2-froyo-update-begins-to-trickle-out/

This is going to sound sarcastic, but I think our engineering planning should be more forward looking, rather than reactive to engadget articles.

How many hours have we spent on Android 2.1? We are not making the problem small enough. I think we should really be targeting 3.0 and forgetting 2.x, but it doesn't seem we're going to get the access we need there, so 2.2+ is the best we can do.

> In the meantime, I would like people using
> i9000s to be able to provide feedback on our betas.

Whenever someone is executing a bad plan at Mozilla, they inevitably rationalize it by saying they are interested in "getting feedback". Feeback is good--I'm in favor of it. But there should be concise reasoning beyond that. As bad plans go, this one is not so bad ( :) ), but I still think it is waste of resources.

Furthermore, I don't think we need feedback from Samsung i9000 users on Android 2.1 to make progress. We'll have quite enough feedback to act on from Android 2.2 phones. It's wishful thinking to suppose that it's time to work on device-specific polish issues.
That's the Samsung update.  I was aware of that.  No *carriers* have announced updates or ETAs.

I really think you're blowing this specific issue *waaayyy* out of proportion.  We spent many hours investigating the JIT crashes because it wasn't known who was at fault.  We know that now, and it wasn't our fault.  Good.

These temporary hacks have taken no time at all.  They don't tie us to any android version.  In exchange, I can use my phone to hack on fennec.  That's a good trade from my perspective!

Deciding which android API version to target is a totally different discussion out of my depth, and way beyond the scope of this bug.  I will point at http://developer.android.com/resources/dashboard/platform-versions.html without comment.  However, that has nothing to do with this bug.
Comment on attachment 483360 [details] [diff] [review]
Disable Yarr JIT on i9000s

Blech. r=me with a FIXME comment that describes when we can remove this hack and a remove-it bug number. (Also, SpiderMonkey style is C89-style comments.)
Attachment #483360 - Flags: review?(cdleary) → review+
Fixed comment style and added a FIXME/bug 604774.
http://hg.mozilla.org/mozilla-central/rev/05dfe52add3a
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
(In reply to comment #15)
> That's the Samsung update.  I was aware of that.  No *carriers* have announced
> updates or ETAs.
> 
> I really think you're blowing this specific issue *waaayyy* out of proportion. 

I do not agree that I'm blowing this issue out of proportion, especially if it is viewed as an example of something that will happen over and over. This is the third round here. First, we (mostly you, but 3 other engineers too) had to dig for this problem. Julian Seward looked, I read kernel changelogs, QA tested etc. Then, we almost fumbled the beta by disabling the JIT (systemic failure, not your fault). Here, we are spending QA and engineering time on yet another issue.


> 
> Deciding which android API version to target is a totally different discussion
> out of my depth, and way beyond the scope of this bug. 

I hope that everyone thinks critically about these kinds of issues, and shares their thoughts in the open. Of course, that doesn't mean you have to have an opinion on everything.

> I will point at
> http://developer.android.com/resources/dashboard/platform-versions.html without
> comment. 

Well, I am not betting on Android 2.1

> However, that has nothing to do with this bug.

Fully disagree.
(In reply to comment #19)
> (In reply to comment #15)
> > That's the Samsung update.  I was aware of that.  No *carriers* have announced
> > updates or ETAs.
> > 
> > I really think you're blowing this specific issue *waaayyy* out of proportion. 
> 
> I do not agree that I'm blowing this issue out of proportion, especially if it
> is viewed as an example of something that will happen over and over. This is
> the third round here. First, we (mostly you, but 3 other engineers too) had to
> dig for this problem. Julian Seward looked, I read kernel changelogs, QA tested
> etc.

I don't think investigating the crashes in our JITs was wasted time.  My only regret is that (to my knowledge) we didn't have channels to samsung or google people who could have told us upfront that the i9000 kernel was broken, stay away.

> Then, we almost fumbled the beta by disabling the JIT (systemic failure,
> not your fault). Here, we are spending QA and engineering time on yet another
> issue.

I disagree that we would have fumbled the beta by shipping a disabled JIT, considering we only noticed after looking at sunspider numbers (hmmmmmm).  But that's more of an aside.  I take full responsibility for fucking up.

I've spent more time replying in this bug than I did writing this patch.  cdleary's time was wasted, sorry.  Can't speak for QA.

> > Deciding which android API version to target is a totally different discussion
> > out of my depth, and way beyond the scope of this bug. 
> 
> I hope that everyone thinks critically about these kinds of issues, and shares
> their thoughts in the open. Of course, that doesn't mean you have to have an
> opinion on everything.

I don't know the r7 vs. r8 vs. ... android APIs, so I can't speak from an engineering perspective on what we lose by us currently targeting r7.

Deciding which devices/android versions we need to support is an economic decision, and again one for which I don't have hard data (I hope someone does!).  But, e.g., we continue to spend time supporting 10-year-old windows XP even though it costs us engineering time because of its crippled APIs, because a plurality of our windows users are still on it.  We'd need to make a similar call for android, but again, I don't have that data.

> > However, that has nothing to do with this bug.
> 
> Fully disagree.

This bug doesn't tie us to an android SDK release and it doesn't tie us to supporting a particular set of devices/OSes.  I'm signed up for the js/src maintenance of these hacks and their removal, and to me they're worth it if only for me, Mossop, sdwilsh, mevans, and the other mozilla folks with i9000s to be able run fennec betas for a couple of months.
verified FIXED on build:

Mozilla/5.0 (Android; Linux armv71; rv:2.0b8pre) Gecko/20101026 Namoroka/4.0b8pre Fennec/4.0b2pre
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: