Closed Bug 978450 Opened 11 years ago Closed 11 years ago

Homescreen keeps crashing 1/3 nightly (Peak, Keon, Hamachi)

Categories: Core :: JavaScript Engine: JIT (defect)
Platform: ARM, Gonk (Firefox OS)
Type: defect
Priority: Not set
Severity: critical

Tracking: RESOLVED FIXED, milestone 1.4 S3 (14mar), blocking-b2g 1.4+
Tracking Status: b2g-v1.4 --- fixed

People: Reporter: past, Assignee: Unassigned

Details: 4 keywords, Whiteboard: [b2g-crash]

Attachments: 1 file

Attached file Logcat
The latest Peak nightly from Geeksphone (March 1st) results in a homescreen that keeps crashing and restarts. I tried both updating to the 1/3 build using the settings app and by flashing the Geeksphone-provided tarball. I've also had the Dialer app crash too, after a brief period where the homescreen remained stable, until I eventually reverted to an older nightly. Logcat is attached.
(In reply to Panos Astithas [:past] from comment #0)
> Created attachment 8384129 [details]
> Logcat
>
> The latest Peak nightly from Geeksphone (March 1st) results in a homescreen
> that keeps crashing and restarts. I tried both updating to the 1/3 build
> using the settings app and by flashing the Geeksphone-provided tarball. I've
> also had the Dialer app crash too, after a brief period where the homescreen
> remained stable, until I eventually reverted to an older nightly. Logcat is
> attached.
Can you get a crash report URL? See https://wiki.mozilla.org/B2G/QA/Tips_And_Tricks#Getting_crashes_off_the_Device. This might be related to the automation problem we are seeing in bug 978458.
Thanks for the link. Here are the latest crashes that I found on the device:
https://crash-stats.mozilla.com/report/index/0d162a2c-0705-4c5a-9716-4d1662140301
https://crash-stats.mozilla.com/report/index/12d85604-166b-412e-9a44-42e942140301
https://crash-stats.mozilla.com/report/index/8d64e5cd-bf92-40bd-82e6-dba3f2140301
https://crash-stats.mozilla.com/report/index/ffb5b976-cb76-4fdd-bf2b-8a48d2140301
Top of the stack:
0 @0x431899e8
1 js::jit::EnterBaselineMethod(JSContext*, js::RunState&) /home/geeksphone/FOS/peak/gecko/js/src/jit/BaselineJIT.cpp
2 js::Invoke /home/geeksphone/FOS/peak/gecko/js/src/vm/Interpreter.cpp
3 js_fun_call /home/geeksphone/FOS/peak/gecko/js/src/jsfun.cpp
4 js::Invoke /home/geeksphone/FOS/peak/gecko/js/src/jscntxtinlines.h
5 Interpret /home/geeksphone/FOS/peak/gecko/js/src/vm/Interpreter.cpp
6 js::Invoke /home/geeksphone/FOS/peak/gecko/js/src/vm/Interpreter.cpp
7 JS_CallFunction(JSContext*, JS::Handle<JSObject*>, JS::Handle<JSFunction*>, JS::HandleValueArray const&, JS::MutableHandle<JS::Value>) /home/geeksphone/FOS/peak/gecko/js/src/jsapi.cpp
8 mozJSComponentLoader::ObjectForLocation(nsIFile*, nsIURI*, JSObject**, JSScript**, char**, bool, JS::MutableHandle<JS::Value>) /home/geeksphone/FOS/peak/gecko/js/xpconnect/loader/mozJSComponentLoader.cpp
9 mozJSComponentLoader::ImportInto(nsACString_internal const&, JS::Handle<JSObject*>, JSContext*, JS::MutableHandle<JSObject*>) /home/geeksphone/FOS/peak/gecko/js/xpconnect/loader/mozJSComponentLoader.cpp
10 mozJSComponentLoader::Import(nsACString_internal const&, JS::Handle<JS::Value>, JSContext*, unsigned char, JS::MutableHandle<JS::Value>) /home/geeksphone/FOS/peak/gecko/js/xpconnect/loader/mozJSComponentLoader.cpp
11 nsXPCComponents_Utils::Import(nsACString_internal const&, JS::Handle<JS::Value>, JSContext*, unsigned char, JS::MutableHandle<JS::Value>) /home/geeksphone/FOS/peak/gecko/js/xpconnect/src/XPCComponents.cpp
12 NS_InvokeByIndex /home/geeksphone/FOS/peak/gecko/xpcom/reflect/xptcall/src/md/unix/xptcinvoke_arm.cpp
13 XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode) /home/geeksphone/FOS/peak/gecko/js/xpconnect/src/XPCWrappedNative.cpp
14 XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*) /home/geeksphone/FOS/peak/gecko/js/xpconnect/src/XPCWrappedNativeJSOps.cpp
15 js::Invoke /home/geeksphone/FOS/peak/gecko/js/src/jscntxtinlines.h
16 Interpret /home/geeksphone/FOS/peak/gecko/js/src/vm/Interpreter.cpp
Crash Signature: [@ @0x0 | js::jit::EnterBaselineMethod(JSContext*, js::RunState&) ]
Keywords: crash
Severity: normal → critical
Component: Gaia::Homescreen → JavaScript Engine
Product: Firefox OS → Core
Whiteboard: [b2g-crash]
Nicolas, can you reproduce this?
Component: JavaScript Engine → JavaScript Engine: JIT
Flags: needinfo?(nicolas.b.pierron)
I'm not sure this is the same issue, since I don't have Wi-Fi and so my dumps are not sent, but I see exactly the same behavior on a Fugu. When the Homescreen eventually launches, any app will crash, although on rare occasions I can launch one.
I reproduced the issue on a Keon device too. If the homescreen does launch, any app will crash.
If I roll back to before bug 976120 (m-c 171153:6cf927291112) was checked in, I don't see the crash.
We get the same issue on Hamachis. After the flash the HomeScreen app is crashing. I can't send a report because I can't get to the WiFi settings screen. The GaiaUI-tests are failing because of this.
Gaia a980b8f54956ed470667033630b02492efdf4a07
Gecko https://hg.mozilla.org/mozilla-central/rev/0085a162499f
BuildID 20140301160203
Version 30.0a1
ro.build.version.incremental=324
ro.build.date=Thu Dec 19 14:04:55 CST 2013
I *may* know what's going on here, would one of you be able to test a patch?
Yes of course, please attach it :) FWIW it seems that everything is working fine when I flash Gecko and b2g starts, but then after a reboot the bug happens. Happened twice already.
Summary: Homescreen keeps crashing in the Peak 1/3 nightly → Homescreen keeps crashing 1/3 nightly (Peak, Keon, Hamachi)
I can confirm the issue on hamachi.
Found on: Alcatel One Touch Fire production (got from T-mobile Poland)
B2G version: 1.4.0.0-prerelease master
Platform version: 30.0a1
Build Identifier: 20140301160203
Git commit info: 2014-02-28
blocking-b2g: --- → 1.4?
Although I guess it does not add much value, I can also confirm the issue on the Unagi using today's build Gecko-db5f706.Gaia-5684544.
I'm bisecting right now.
And Inari:
Gaia a980b8f54956ed470667033630b02492efdf4a07
Gecko https://hg.mozilla.org/mozilla-central/rev/0085a162499f
BuildID 20140301160203
Version 30.0a1
ro.build.version.incremental=eng.cltbld.20140227.192705
ro.build.date=Thu Feb 27 19:49:33 EST 2014
Breaking down on tinderbox builds (to get slightly more granularity than nightly builds).
The Gaia commit for the first breaking tinderbox build was:
Gaia a980b8f54956ed470667033630b02492efdf4a07
and Gecko revision:
Gecko 8abc76dedec2
Linked to here: https://pvtbuilds.mozilla.org/pvt/mozilla.org/b2gotoro/tinderbox-builds/mozilla-central-hamachi-eng/20140228130531/
Relevant JS Engine bugs in push log:
* bug 939562
* bug 977117
* bug 977224
* bug 978047
* bug 930477
* bug 957004
This is as deep as we can go from pvtbuilds, as we don't have mozilla-inbound device images right now. We'll need someone on the JS team to link the crash stack to one of those 6 bugs above to identify the regressing cause & get it backed out.
blocking-b2g: 1.4? → 1.4+
(In reply to Florin Strugariu [:Bebe] from comment #13)
> And Inari:
> Gaia a980b8f54956ed470667033630b02492efdf4a07
> Gecko https://hg.mozilla.org/mozilla-central/rev/0085a162499f
> BuildID 20140301160203
> Version 30.0a1
> ro.build.version.incremental=eng.cltbld.20140227.192705
> ro.build.date=Thu Feb 27 19:49:33 EST 2014
The ro.build.date is outdated compared to the BuildID and the Gecko changeset.
Assuming this was not another issue (*), AWFY failed to report frequently when the problem appeared: http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=39101e03fc13a8b7447b1555881cce5d439ba255&tochange=20c705d00e7c48ff5c82558af56be53a1a8f7f4c
(*) There is another issue, where benchmark cannot run because the keyboard position might have changed, and the marionette harness is typing "http:++" instead of "http://".
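The mismatch between ro.build.date and the BuildID can be checked mechanically. The following is a minimal sketch, assuming the BuildID is the usual 14-digit YYYYMMDDHHMMSS timestamp and that ro.build.date has the `date`-style layout quoted in this bug. The helper names are hypothetical, and the timezone token is simply dropped, so the gap is only approximate.

```python
from datetime import datetime


def parse_build_id(build_id: str) -> datetime:
    """Parse a 14-digit Gecko BuildID (YYYYMMDDHHMMSS)."""
    return datetime.strptime(build_id, "%Y%m%d%H%M%S")


def parse_ro_build_date(value: str) -> datetime:
    """Parse an ro.build.date string, ignoring its timezone abbreviation."""
    parts = value.split()
    # Drop the fifth token (e.g. "EST"); timezone offsets are ignored here.
    without_tz = " ".join(parts[:4] + parts[5:])
    return datetime.strptime(without_tz, "%a %b %d %H:%M:%S %Y")


# Values copied from the Inari report quoted above.
gecko_build = parse_build_id("20140301160203")
base_image = parse_ro_build_date("Thu Feb 27 19:49:33 EST 2014")
gap = gecko_build - base_image

print(f"Gecko build is about {gap.days} day(s) newer than the base image")
```

A positive gap like this only shows that the base image predates the Gecko build; it does not say which of the two is responsible for a crash.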
Keywords: smoketest
(In reply to Jason Smith [:jsmith] from comment #17)
> This is as deep as we can go from pvtbuilds, as we don't have
> mozilla-inbound device images right now. We'll need someone on the JS team
> to link the crash stack to one of those 6 bugs above to identify the
> regressing cause & get it backed out.
The stack trace does not tell us anything except that this is in Gecko's JS, as baseline fails after a "Cu.import". There is nothing else we can learn from this back trace.
Also, it would be nice to generate stack traces while disabling the JITs, by setting:
javascript.options.baselinejit.content -> false
javascript.options.baselinejit.chrome -> false
javascript.options.ion.content -> false
This can be done with ./edit-prefs.sh.
So far I am unable to configure a gecko locally to test on an Unagi. I set up AWFY to make Unagi images for each modification made in the js directory, and I should have these images tomorrow.
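For anyone applying the three preferences above by hand, here is a small sketch that renders them as preference lines. It assumes the file opened by ./edit-prefs.sh uses the usual Gecko `user_pref("name", value);` syntax; the `render_user_prefs` helper is hypothetical and only produces text to paste in.

```python
import json

# The three JIT preferences named in the comment above, all switched off.
JIT_PREFS = {
    "javascript.options.baselinejit.content": False,
    "javascript.options.baselinejit.chrome": False,
    "javascript.options.ion.content": False,
}


def render_user_prefs(prefs: dict) -> str:
    """Render a dict of prefs as user_pref(...) lines, one per pref."""
    # json.dumps turns Python booleans into the lowercase JS literals.
    return "\n".join(
        f'user_pref("{name}", {json.dumps(value)});'
        for name, value in prefs.items()
    )


pref_text = render_user_prefs(JIT_PREFS)
print(pref_text)
```

If the crash disappears with all three disabled, re-enabling them one at a time should narrow down which JIT tier is involved.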
Flags: needinfo?(nicolas.b.pierron)
(In reply to Nicolas B. Pierron [:nbp] from comment #19)
> (In reply to Jason Smith [:jsmith] from comment #17)
> > This is as deep as we can go from pvtbuilds, as we don't have
> > mozilla-inbound device images right now. We'll need someone on the JS team
> > to link the crash stack to one of those 6 bugs above to identify the
> > regressing cause & get it backed out.
>
> The stack trace does not tell us anything except that this is in Gecko's JS,
> as baseline fails after a "Cu.import". Then there is nothing else we can
> learn from this back trace.
>
> Also, it would be nice to generate stack traces, while disabling the jits,
> such as
> javascript.options.baselinejit.content -> false
> javascript.options.baselinejit.chrome -> false
> javascript.options.ion.content -> false
>
> This can be done with ./edit-prefs.sh .
>
> So far I am unable to configure a gecko locally to test on an Unagi. I
> setup AWFY to make Unagi's images for each modifications made in the js
> directory, and I would have these images tomorrow.
Tomorrow isn't good enough - we need this fixed asap within the next hour or two. We've got a busted m-c build with an entire b2g organization blocked here on testing. Someone needs to get on this now and get the regressing patch backed out.
(In reply to Jason Smith [:jsmith] from comment #20)
> (In reply to Nicolas B. Pierron [:nbp] from comment #19)
> > So far I am unable to configure a gecko locally to test on an Unagi. I
> > setup AWFY to make Unagi's images for each modifications made in the js
> > directory, and I would have these images tomorrow.
>
> Tomorrow isn't good enough - we need this fixed asap within the next hour or
> two. We've got a busted m-c build with an entire b2g organization blocked
> here on testing. Someone needs to get on this now and get the regressing
> patch backed out.
I am sorry if everybody is blocked on inbound, but as long as people are landing on inbound, AWFY's priority is to check these commits. When it has idle time, it builds images of the listed commits. Since I listed 20 changes, this will take a while and is unlikely to complete in the next 2 hours.
Bisection leads to:
178889 508848ad378a 2014-02-26 10:25 +0100 jdemooij
Bug 939562 part 3 - Move JIT flags from ContextOptions to RuntimeOptions. r=bent,bholley,luke
I'll try to revert this from the latest central, if it reverts cleanly, and see how it behaves.
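For readers unfamiliar with the technique, a bisection like the one reported above can be sketched as a binary search over an ordered list of revisions. This is only an illustration, not the tooling actually used here: the revision names and the `crashes` predicate are hypothetical stand-ins for "flash this build and check whether the homescreen crashes".

```python
def bisect_first_bad(revisions, is_bad):
    """Return the first revision for which is_bad() is true.

    Assumes revisions are ordered oldest to newest, that the newest one
    is bad, and that every revision after the first bad one is also bad.
    """
    lo, hi = 0, len(revisions) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # first bad revision is at mid or earlier
        else:
            lo = mid + 1  # first bad revision is after mid
    return revisions[lo]


# Hypothetical history of ten revisions where the regression lands at rev-6.
history = [f"rev-{i}" for i in range(10)]


def crashes(rev):
    """Stand-in for a manual test on a device."""
    return int(rev.split("-")[1]) >= 6


first_bad = bisect_first_bad(history, crashes)
print(first_bad)
```

Each step halves the remaining range, so ten candidate revisions need at most four device tests.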
(In reply to Julien Wajsberg [:julienw] from comment #22)
> Bisection leads to:
>
> 178889 508848ad378a 2014-02-26 10:25 +0100 jdemooij
> Bug 939562 part 3 - Move JIT flags from ContextOptions to RuntimeOptions.
> r=bent,bholley,luke
OK, that change must be exposing an existing bug, likely because we JIT more code now. Thanks for your help Julien, we should back it out then. Too bad we don't have tests for this on inbound that would have caught this.
Blocks: 939562
Ed is working on backing out bug 939562 right now.
The last part of bug 939562 has been backed out (and also 978456 which landed on top of it).
I just want to report that it works fine for me with 508848ad378a reverted, both on my Buri and my Fugu (which was crashing a lot). Should we close this bug now?
Yup.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Does anyone have any idea why this didn't break on the emulator builds we monitor with TBPL? If there's a known class of potential regressions that our test automation is missing, we should at least open a bug about it.
Nightly device builds have been re-spun; results will be found here (inside Ed Morley's backout tbpl push): https://tbpl.mozilla.org/?showall=all&rev=c8bea55437c1
(In reply to Jed Davis [:jld] from comment #29)
> Does anyone have any idea why this didn't break on the emulator builds we
> monitor with TBPL? If there's a known class of potential regressions that
> our test automation is missing, we should at least open a bug about it.
The emulator does not catch misaligned accesses, and SpiderMonkey tends to hit this deficiency more frequently than other code.
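To illustrate why the same code can pass on the emulator and crash on a device, here is a toy model of the difference described above. It is purely a sketch: the 4-byte word size, the `load_word` helper, and the strict/lenient switch are assumptions made for the example, and real ARM behavior depends on the instruction and on how the core is configured.

```python
class MisalignedAccess(Exception):
    """Raised by the strict model when an address is not word-aligned."""


WORD_SIZE = 4  # bytes; an assumption for this toy model


def load_word(memory: bytes, addr: int, strict: bool) -> int:
    """Read a little-endian 32-bit word from memory at addr."""
    if strict and addr % WORD_SIZE != 0:
        raise MisalignedAccess(f"address {addr:#x} is not {WORD_SIZE}-byte aligned")
    return int.from_bytes(memory[addr:addr + WORD_SIZE], "little")


memory = bytes(range(16))

# Lenient model (stands in for the emulator): the misaligned read succeeds.
lenient_value = load_word(memory, 1, strict=False)

# Strict model (stands in for a device that faults): the same read fails.
try:
    load_word(memory, 1, strict=True)
    strict_faulted = False
except MisalignedAccess:
    strict_faulted = True

print(hex(lenient_value), strict_faulted)
```

Tests that only run against the lenient model never exercise the failing path, which matches the gap in automation discussed in this bug.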
(In reply to Zac C (:zac) from comment #14)
> Breaking down on tinderbox builds (to get slightly more granularity than
> nightly builds)
>
> The Gaia commit for the first breaking tinderbox build was:
> Gaia a980b8f54956ed470667033630b02492efdf4a07
> and Gecko revision:
> Gecko 8abc76dedec2
>
> Linked to here:
> https://pvtbuilds.mozilla.org/pvt/mozilla.org/b2gotoro/tinderbox-builds/mozilla-central-hamachi-eng/20140228130531/
Can I have access to this link? Or is it restricted to employees only?
No more crashes on: Alcatel One Touch Fire production (got from T-mobile Poland)
B2G version: 1.4.0.0-prerelease master
Platform version: 30.0a1
Build Identifier: 20140303114510
Git commit info: 2014-03-03 10:34:58 dfae3744
Even after several restarts.
(In reply to Jed Davis [:jld] from comment #29)
> Does anyone have any idea why this didn't break on the emulator builds we
> monitor with TBPL? If there's a known class of potential regressions that
> our test automation is missing, we should at least open a bug about it.
Yep, I've discussed this with our automation friends today: this bug could have been caught if TBPL was running the integration tests on the emulator. We have a bug for this: Bug 916368.
(In reply to Marcela Oniga from comment #33)
> > Linked to here:
> > https://pvtbuilds.mozilla.org/pvt/mozilla.org/b2gotoro/tinderbox-builds/mozilla-central-hamachi-eng/20140228130531/
>
> Can I have access to this link? Or it is restricted to employees only..
Yes this is restricted, sorry about this.
(In reply to Julien Wajsberg [:julienw] from comment #35)
> (In reply to Jed Davis [:jld] from comment #29)
> > Does anyone have any idea why this didn't break on the emulator builds we
> > monitor with TBPL? If there's a known class of potential regressions that
> > our test automation is missing, we should at least open a bug about it.
>
> Yep, I've discussed this with our automation friends today: this bug could
> have been caught if TBPL was running the integration tests on the emulator.
>
> We have a bug, this is Bug 916368.
Oh, and we actually run this automation on a device, Bug 978458 (dupe) was created thanks to this. But it's not on TBPL.
Target Milestone: --- → 1.4 S3 (14mar)