Closed
Bug 471585
Opened 15 years ago
Closed 14 years ago
JIT causes maemo expedia.com hang/crash
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
mozilla1.9.1
People
(Reporter: mozilla, Assigned: vlad)
References
()
Details
(Keywords: mobile)
Attachments
(1 file)
381 bytes,
patch
|
gal
:
review+
jbramley
:
review+
|
Details | Diff | Splinter Review |
I've been isolating this and tp is consistently dying on hawaaworld.com on the n810's. Most of the time it's during cycle 1; occasionally it's in cycle 2, but I've seen over 5 crashes or hangs/disconnects while Fennec is hitting this page. I've got plenty of swap (>200mb), disabled the ptmalloc thresholds, and made sure I have enough disk on / . I was able to ssh in to an n810 while it was hanging on hawaaworld.com and verify that plenty of swap was available via the 'free' command; uptime showed load slightly above 3. My browser_wait is 90 and my pagetimeout is 700.
Reporter | ||
Comment 1•15 years ago
|
||
After commenting this page out of the manifest.txt, the tests started hanging on other pages. readnovel.com was the culprit this time; since it's right before the commented-out hawaaworld.com, I'm guessing it might have something to do with the *next* test, expedia.com. Still investigating on this end.
Reporter | ||
Comment 2•15 years ago
|
||
I believe I've commented out expedia.com and it crashed elsewhere. I've also tried adding dougt's patch from bug 454731, and I'm still crashing. Since it looks like a hang when I observe this on-screen, I'm wondering if it's actually a timeout. Also, with the amount of swap I have I'd be surprised if this is oom. If we (or maemo) rely on having free physical memory, however, that's certainly possibly the culprit.
Reporter | ||
Comment 3•15 years ago
|
||
On a device that's hung, maemo-n810-07:~# free total used free shared buffers Mem: 126796 103304 23492 0 12 Swap: 260020 162512 97508 Total: 386816 265816 121000 But in top, fennec appears to be gobbling up more and more memory: 1594 root RW 374M 1593 4.9301.5 fennec I saw it go up to 440M. Load went from ~3 to over 10. Possibly running this against a debug build would help. In the meantime, I'm going to try disabling JIT in all.js.
Reporter | ||
Comment 4•15 years ago
|
||
maemo-n810-07:~# free total used free shared buffers Mem: 126796 125228 1568 0 12 Swap: 260020 260020 0 Total: 386816 385248 1568 oom certainly is a possibility, but I'm not sure why it's using all that memory.
Reporter | ||
Comment 5•15 years ago
|
||
Changing the javascript.options.jit.content pref to false in all.js fixes the crash. Changing the summary and downgrading to major since we have a workaround.
Severity: critical → major
Summary: maemo talos tp crasher on hawaaworld.com → JIT causes maemo talos tp crash
Updated•15 years ago
|
Assignee: nobody → general
Component: General → JavaScript Engine
Product: Fennec → Core
QA Contact: general → general
Target Milestone: --- → mozilla1.9.1
(In reply to comment #5) > Changing the javascript.options.jit.content pref to false in all.js fixes the > crash. > > Changing the summary and downgrading to major since we have a workaround. I understand your reasoning here, but I would argue this is still critical because we are trying to measure end-user performance, and end-users will be running with default prefs (jit on). A set of perf numbers with jit off would be interesting, to measure its impact (once this is fixed), but in the end-game we must be able to measure this with jit on, so I think it should stay critical. My 2 cents...
Comment 8•15 years ago
|
||
https://bugzilla.mozilla.org/page.cgi?id=fields.html#importance says crashers have the "critical" severity. As a practical matter, tho, if the existence of a workaround meant severity could always be downgraded, pretty much every crasher (app-internal crashes excepted) would only be major, because you can just not visit the site that caused the crash. Workarounds generally don't affect severity.
We need stacks for crashes, and minimized test cases (down to a single page, then cut out script until it stops crashing, etc.), in separate bugs. This isn't really actionable for people who can't access the still-not-public pageset, and we don't have bandwidth in the JS team to be setting up Tp on devices to get stacks and so forth.
Reporter | ||
Comment 10•15 years ago
|
||
I've narrowed this down to the expedia.com page. Still working on narrowing down to a set of scripts and getting a stack trace.
Reporter | ||
Comment 11•14 years ago
|
||
Looks like hitting expedia.com with Fennec 1.0b1 pretty much hangs the browser; no need to set up Tp.
Comment 12•14 years ago
|
||
Adding Vlad in case he gets some time to look.
Assignee | ||
Comment 13•14 years ago
|
||
I'll check if I have time, but this probably wants latest tracemonkey tree arm fixes.
Reporter | ||
Updated•14 years ago
|
Summary: JIT causes maemo talos tp crash → JIT causes maemo expedia.com hang/crash
Comment 14•14 years ago
|
||
I had this in gdb. Here is "info all" from the crash: (gdb) inf all r0 0x0 0 r1 0xb83 2947 r2 0x6 6 r3 0x4000c750 1073792848 r4 0xb83 2947 r5 0x6 6 r6 0x41136c38 1091791928 r7 0x10c 268 r8 0xc34 3124 r9 0xc38 3128 r10 0x41136000 1091788800 r11 0xbe8dfe6c 3196976748 r12 0xbe8dfd50 3196976464 sp 0xbe8dfd40 0xbe8dfd40 lr 0x41050e40 1090850368 pc 0x41050e74 0x41050e74 f0 0 (raw 0x000189880000000000000000) f1 0 (raw 0x000189880000000000000000) f2 0 (raw 0x000189880000000000000000) f3 0 (raw 0x000189880000000000000000) f4 0 (raw 0x000189880000000000000000) f5 0 (raw 0x000189880000000000000000) f6 0 (raw 0x000189880000000000000000) f7 0 (raw 0x000189880000000000000000) fps 0x0 0 cpsr 0x20000010 536870928 (gdb)
Comment 15•14 years ago
|
||
Oh, this build is using tracemonkey from 6/30/2009
Comment 16•14 years ago
|
||
Here is a disas $pc-128 $pc+128: Dump of assembler code from 0x41050df4 to 0x41050ef4: 0x41050df4: moveq r5, #268435456 ; 0x10000000 0x41050df8: movne r5, #0 ; 0x0 0x41050dfc: orr r3, r3, r1 0x41050e00: mov r0, r6 0x41050e04: add r1, sp, #144 ; 0x90 0x41050e08: mov r2, sp 0x41050e0c: str r3, [r4, r12, lsl #2] 0x41050e10: str r5, [sp, #276] 0x41050e14: bl 0x41051088 0x41050e18: cmp r0, #0 ; 0x0 0x41050e1c: ldrge r1, [sp] 0x41050e20: mvnlt r1, #0 ; 0x0 0x41050e24: b 0x41050dac 0x41050e28: andeq r5, lr, r8, lsl #5 0x41050e2c: andeq r5, lr, r12, ror #6 0x41050e30: andeq r2, r0, r4, lsr #29 0x41050e34: stmdb sp!, {r4, r5, r7, lr} 0x41050e38: mov r5, r0 0x41050e3c: bl 0x4103c570 0x41050e40: ldr r4, [r0, #-1112] 0x41050e44: mov r3, r0 0x41050e48: cmp r4, #0 ; 0x0 0x41050e4c: ldr r0, [r0, #-1108] 0x41050e50: bne 0x41050e90 0x41050e54: mov r7, #224 ; 0xe0 0x41050e58: svc 0x00000000 0x41050e5c: mov r4, r0 0x41050e60: str r4, [r3, #-1112] 0x41050e64: mov r2, r5 0x41050e68: mov r1, r4 0x41050e6c: mov r7, #268 ; 0x10c 0x41050e70: svc 0x00000000 0x41050e74: cmn r0, #4096 ; 0x1000 0x41050e78: mov r1, r0 0x41050e7c: bhi 0x41050eac 0x41050e80: cmn r0, #1 ; 0x1 0x41050e84: beq 0x41050ec4 0x41050e88: mov r0, r1 0x41050e8c: ldmia sp!, {r4, r5, r7, pc} 0x41050e90: cmp r0, #0 ; 0x0 0x41050e94: bgt 0x41050e64 0x41050e98: bic r3, r0, #-2147483648 ; 0x80000000 0x41050e9c: cmp r3, #0 ; 0x0 0x41050ea0: rsbne r0, r0, #0 ; 0x0 0x41050ea4: moveq r0, r4 0x41050ea8: b 0x41050e64 0x41050eac: ldr r3, [pc, #104] ; 0x41050f1c 0x41050eb0: rsb r2, r1, #0 ; 0x0 0x41050eb4: bl 0x4103c570 0x41050eb8: ldr r3, [pc, r3] 0x41050ebc: mvn r1, #0 ; 0x0 0x41050ec0: str r2, [r0, r3] 0x41050ec4: ldr r3, [pc, #84] ; 0x41050f20 0x41050ec8: bl 0x4103c570 0x41050ecc: ldr r3, [pc, r3] 0x41050ed0: ldr r2, [r0, r3] 0x41050ed4: cmp r2, #38 ; 0x26 0x41050ed8: bne 0x41050e88 0x41050edc: mov r1, r5 0x41050ee0: mov r0, r4 0x41050ee4: mov r7, #238 ; 0xee 0x41050ee8: svc 0x00000000 0x41050eec: cmn r0, #4096 ; 0x1000 0x41050ef0: mov r2, r0 End of assembler dump.
Comment 17•14 years ago
|
||
running: x/4x 0x41050e6c 0x41050e6c: 0xe3a07f43 0xef000000 0xe3700a01 0xe1a01000
Assignee | ||
Comment 18•14 years ago
|
||
$pc is 0x41050e74, so: 0x41050e68: mov r1, r4 0x41050e6c: mov r7, #268 ; 0x10c 0x41050e70: svc 0x00000000 0x41050e74: cmn r0, #4096 ; 0x1000 0x41050e6c: 0xe3a07f43 0xef000000 0xe3700a01 0xe1a01000 That svc 0x0 (0xef000000) looks extremely suspicious. I'm not sure what gdb's $pc means, whether it's real pc + 8, as with the arm assembly view, or whether it's the next insn to be executed. If the latter then the crash is on that svc instruction, no idea where it's coming from; we insert some breakpoint instructions in a few places as filler, but they always follow branches which isn't the case here. And that's not a breakpoint instruction. There are a bunch of those that appear in the above disassembly which is also strange...
Comment 19•14 years ago
|
||
Indeed, we never emit SVC for anything, so these are very suspicious! There are some clues. The mystery SVCs usually (but not always) preceed a CMN instruction. A quick grep of the source indicates that CMN is only ever emitted by asm_cmpi. Tracing up the possible call stacks, we find that CMN can only be emitted by asm_branch, asm_cond or asm_cmov. Because each cmn in your disassembly is proceeded by a MOV, asm_cmov is probably a good place to start. Can we have a dump of the LIR stream that caused the instructions to be generated please? I'm not sure how you do that in the browser, but it would certainly be useful. asm_cmov can come from either LIR_qcmov or LIR_cmov.
Comment 20•14 years ago
|
||
#19: browser works the same way, via the environment export TMFLAGS=full
Assignee | ||
Comment 21•14 years ago
|
||
Hrm. In the future, please do a debug build and try there -- the ARM backend is really good at having lots of assertions, and I'll look into figuring out how to enable them unconditionally for a while. This problem is: Assertion failed: ((-(1<<12)) <= (dr) && (dr) < (1<<12)) (c:/proj/firefox-wince-dbg/js/src/../../../mozilla-central/js/src/nanojit/NativeARM.cpp:658) specifically, calling STR(ra, rb, dr) with a large dr. Patch coming up.
Assignee | ||
Comment 22•14 years ago
|
||
Fix -- if dr won't fit in S12, just load the full address into IP first before calling STR.
Assignee: general → vladimir
Attachment #389213 -
Flags: superreview?(Jacob.Bramley)
Attachment #389213 -
Flags: review?(gal)
Updated•14 years ago
|
Attachment #389213 -
Flags: superreview?(Jacob.Bramley)
Attachment #389213 -
Flags: review?(gal)
Attachment #389213 -
Flags: review?(Jacob.Bramley)
Attachment #389213 -
Flags: review+
Assignee | ||
Comment 23•14 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/e402c0d65942
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Comment 24•14 years ago
|
||
Comment on attachment 389213 [details] [diff] [review] fix Yep, looks good.
Attachment #389213 -
Flags: review?(Jacob.Bramley) → review+
Comment 25•14 years ago
|
||
I can confirm that expedia.com loads on the n810 with JIT enabled. Thanks Vlad.
You need to log in
before you can comment on or make changes to this bug.
Description
•