Closed Bug 1076605 Opened 11 years ago Closed 10 years ago

[MTBF][App Launch] Apps failed to launch, stuck at icon splash

Categories

(Firefox OS Graveyard :: Gaia::System::Window Mgmt, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(blocking-b2g:2.1+, b2g-v2.1 fixed, b2g-v2.2 fixed)

RESOLVED FIXED
2.1 S9 (21Nov)
blocking-b2g 2.1+
Tracking Status
b2g-v2.1 --- fixed
b2g-v2.2 --- fixed

People

(Reporter: wachen, Assigned: alive)

References

Details

(Keywords: regression, Whiteboard: [mtbf])

Attachments

(10 files)

Attached image 2014-10-02-11-12-47.png
Logcat refreshes at a very high rate, and the device cannot get out of the stuck screen (screenshot attached).
Happened on both KK-based and JB-based v2.1 (current aurora).
Gaia 08be48c71d0b4999cedee89fe81de1a03c66436f
Gecko https://hg.mozilla.org/releases/mozilla-aurora/rev/6e7e0a39f73b
BuildID 20140930160205
Version 34.0a2
Blocks: MTBF-B2G
Attached image 2014-10-02-11-21-04.png
Gregor, does anything stand out in the log, or can you redirect us to people who can help here? The MTBF team can give more info as required to debug this. Please NI Walter.
Flags: needinfo?(anygregor)
[Blocking Requested - why for this release]: 1/3 of test devices encountered this issue; the user cannot switch back to the homescreen or perform any other action.
blocking-b2g: --- → 2.1?
Just checked the logs in attachment 8498608 [details], 8498609, and 8498611; they are not very helpful. Walter, is it that none of the apps can be launched, or just some of them?
Nothing sticks out in the log. Maybe someone in Thinkers team can take a look at such a device and check the basic things like memory, what the parent process is doing, where the child process is stuck, if we even have a child process...
Flags: needinfo?(anygregor)
It reproduced today.
Only one device reproduces the issue. I can still see a lot of:

E/GeckoConsole( 232): Content JS LOG at dummy file:352 in GaiaApps.getDisplayedApp: app with origin 'app://verticalhome.gaiamobile.org' is displayed

which should be fixed by bug 997909. The screen shows the splash screen of the camera app, and b2g-ps shows:

APPLICATION      SEC USER       PID   PPID VSIZE  RSS   WCHAN    PC         NAME
b2g                0 root       232   1    263188 71204 ffffffff b5d81d14 R /system/b2g/b2g
(Nuwa)             0 root       449   232  55104  476   ffffffff b6e878ac S /system/b2g/b2g
Homescreen         2 u0_a8575   8575  449  84684  2736  ffffffff b6e878ac S /system/b2g/b2g
(Preallocated a    2 u0_a10236  10236 449  61300  3540  ffffffff b6e878ac S /system/b2g/b2g

Pressing the home key doesn't bring up the homescreen; it stays at the camera's splash. Long-pressing home doesn't work either. After pressing the power key to turn the screen off and pressing it again to turn it on, the screen still shows the camera's splash after unlock. I don't see anything special in the adb logs on the MTBF server.
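A minimal sketch of how the b2g-ps listing above could be checked mechanically for a missing app process. The helper names (parseB2gPs, hasProcessFor) are hypothetical, and the assumption that a running camera app would be listed as "Camera" is mine, not something the output confirms.

```javascript
// Sample rows copied from the b2g-ps output in this comment.
const sample = [
  'APPLICATION SEC USER PID PPID VSIZE RSS WCHAN PC NAME',
  'b2g 0 root 232 1 263188 71204 ffffffff b5d81d14 R /system/b2g/b2g',
  '(Nuwa) 0 root 449 232 55104 476 ffffffff b6e878ac S /system/b2g/b2g',
  'Homescreen 2 u0_a8575 8575 449 84684 2736 ffffffff b6e878ac S /system/b2g/b2g',
  '(Preallocated a 2 u0_a10236 10236 449 61300 3540 ffffffff b6e878ac S /system/b2g/b2g'
].join('\n');

// Hypothetical helper: the application name may contain spaces, so the
// ten fixed fields (SEC .. NAME, including the state letter) are taken
// from the right-hand end of each row.
function parseB2gPs(text) {
  return text.split('\n').slice(1)
    .filter(line => line.trim().length > 0)
    .map(line => {
      const tokens = line.trim().split(/\s+/);
      const fixed = tokens.slice(-10);
      return {
        application: tokens.slice(0, -10).join(' '),
        pid: Number(fixed[2]),
        ppid: Number(fixed[3]),
        state: fixed[8]
      };
    });
}

// Hypothetical helper: true if some row carries exactly this application name.
function hasProcessFor(rows, appName) {
  return rows.some(row => row.application === appName);
}

const rows = parseB2gPs(sample);
console.log(rows.map(row => row.application));
console.log('Camera process listed:', hasProcessFor(rows, 'Camera'));
```

Under that naming assumption, the listing has no row for the camera app even though its splash screen is on display.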
While monitoring b2g-ps, I can still see Nuwa fork "Preallocated", and "Find My Device" is launched several times.
Unlocking to camera still shows only the camera's splash screen. And from the log:

10-06 23:44:40.260 232 232 E GeckoConsole: Content JS LOG at dummy file:352 in GaiaApps.getDisplayedApp: app with origin 'app://verticalhome.gaiamobile.org' is displayed

Gaia thinks the currently displayed app is verticalhome. I will enable some logs in the Gaia system app and try to reproduce locally.
Attached file dump_html.tgz
page source dump includes system verticalhome
From attachment 8500920 [details], the src of the verticalhome iframe is a string of length 14011:

app://verticalhome.gaiamobile.org/index.html#root1412590379634141259044985514125904884221412590510201...

Not quite sure what the number "1412590xxxxxx" is, and whether it relates to the issue. Can we have a Gaia developer take a look at system.html to see if there's anything unusual? Probably this is also a rendering issue.
(In reply to Ting-Yu Chou [:ting] from comment #14)
> app://verticalhome.gaiamobile.org/index.html#root1412590379634141259044985514125904884221412590510201...
>
> Not quite sure what the number "1412590xxxxxx" is, and whether it relates to
> the issue.

It is a Date; the long URL is created here: http://lxr.mozilla.org/gaia/source/apps/system/js/homescreen_window.js#198.
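To illustrate the "it is a Date" reading, here is a small sketch that splits the visible part of that src into millisecond timestamps. The helper name extractTimestamps is hypothetical, and it assumes every appended value is a 13-digit timestamp, which is only what the visible digits suggest; the elided remainder of the string is not reconstructed.

```javascript
// Visible prefix of the src quoted in comment 14; the rest is elided there.
const srcPrefix = 'app://verticalhome.gaiamobile.org/index.html' +
  '#root1412590379634141259044985514125904884221412590510201';

// Hypothetical helper: assumes each appended value is a 13-digit
// millisecond timestamp (an assumption based on the visible digits).
function extractTimestamps(src) {
  const hash = src.split('#')[1] || '';
  const digits = hash.replace(/^\D+/, ''); // drop the leading "root"
  const chunks = digits.match(/\d{13}/g) || [];
  return chunks.map(Number);
}

const stamps = extractTimestamps(srcPrefix);
console.log(stamps);
console.log(stamps.map(ms => new Date(ms).toISOString()));
```

If the whole 14011-character string follows the same pattern, then after the 49-character "app://verticalhome.gaiamobile.org/index.html#root" prefix there would be (14011 - 49) / 13 = 1074 appended values.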
Me & Paul just talked to Alive, and he thinks this seems related to Gaia.
Flags: needinfo?(alive)
Next time, please do this in the system app before ANY MTBF test:

AppWindow.prototype._DEBUG = true;
AppWindowManager.DEBUG = true;

Without logs, no one is able to fix anything. I will try to add a debug option in developer settings.
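For readers unfamiliar with this kind of switch, here is a rough sketch of how a prototype-level debug flag can gate logging for every instance at once. SketchWindow is a stand-in written for illustration only; it is not Gaia's AppWindow, and the message format is an assumption.

```javascript
// Stand-in class for illustration only; not Gaia's AppWindow.
function SketchWindow(name, instanceID) {
  this.name = name;
  this.instanceID = instanceID;
}

// Off by default so the console stays quiet during normal development.
SketchWindow.prototype._DEBUG = false;
SketchWindow.prototype.CLASS_NAME = 'AppWindow';

// Logs only when the shared flag is on; returns the line for inspection.
SketchWindow.prototype.debug = function(message) {
  if (!this._DEBUG) {
    return null;
  }
  const line = '[' + this.CLASS_NAME + '][' + this.name + '][' +
    this.instanceID + '] ' + message;
  console.log(line);
  return line;
};

const win = new SketchWindow('Camera', 'AppWindow_1');
const before = win.debug('opening');   // flag off: nothing is logged
SketchWindow.prototype._DEBUG = true;  // flip the flag on the prototype
const after = win.debug('opening');    // existing instances now log too
```

Because the flag lives on the prototype, flipping it once affects windows that were created earlier, which is why it has to be set before the test run rather than per window.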
Flags: needinfo?(alive)
Filed bug 1079699 for the long src of homescreen iframe in comment 14.
(In reply to Alive Kuo [:alive][NEEDINFO!] from comment #17)
> I will try to add a debug option in developer settings.

I'm not really a fan of this, but should we try to use the DUMP() function we already have for app windows as well?
https://github.com/mozilla-b2g/gaia/blob/master/shared/js/dump.js
Haven't reproduced the issue after enabling Gaia logs; Walter will set up 10 devices for another run during the weekend. Note that tomorrow (10/10) is a holiday in Taiwan.
(In reply to Kevin Grandon :kgrandon from comment #19)
> (In reply to Alive Kuo [:alive][NEEDINFO!] from comment #17)
> > I will try to add a debug option in developer settings.
>
> I'm not really a fan of this, but should we try to use the DUMP() function
> we already have for app windows as well?
>
> https://github.com/mozilla-b2g/gaia/blob/master/shared/js/dump.js

Maybe. I just don't want to flood the console for everyone working on the system app.
Whiteboard: [mtbf]
Triage reviewed, blocking+ for regression.
blocking-b2g: 2.1? → 2.1+
Keywords: regression
It reproduced now that the shutdown issue of Bug 1077292 has eased a little. 3 out of 25 devices reproduced it.
(In reply to Walter Chen[:ypwalter][:wachen] from comment #23)
> It reproduced now that the shutdown issue of Bug 1077292 has eased a little.
> 3 out of 25 devices reproduced it.

Can any log be provided?
Nope, sorry, there were some issues getting the log. However, I will rerun it with new settings today.
Walter, the log we need is what Alive asked in comment 17.
(In reply to Ting-Yu Chou [:ting] from comment #26) > Walter, the log we need is what Alive asked in comment 17.
Flags: needinfo?(wachen)
Summary: [MTBF][App Launch] Apps can't be launch → [MTBF][App Launch] Apps failed to launch, stuck at icon splash
Hi, Ting-Yu and Tim, I did understand that. However, it wasn't easy to reproduce with the battery issue and other blocking issues. I run about 25 devices every day, and it's hard for me to catch this bug because of the other bugs.
Flags: needinfo?(wachen)
Alive, sorry that I misunderstood the solution. Could you please provide me a diff file for applying?
Flags: needinfo?(alive)
Attached patch log.patch
First level win mgmt logging patch
Flags: needinfo?(alive) → needinfo?(wachen)
Thanks. Applied on the Jenkins server. However, I haven't seen it in the 2 most recent rounds today (25 devices). I will keep trying to reproduce it.
Should we set RESOLVED WORKSFORME if it is still not reproducible by Oct 24?
No, it shouldn't go that way. The other blocking issues have become even more serious, as I stated in comment 28, so we can hardly reproduce it. I believe that once some of those other issues are resolved, this can be easily reproduced.
We will track this for another week. If this is not reproducible next week, we will close it.
I think this is fine for now. I will open a new bug if this ever reproduces. Thanks for all the help from everyone.
Status: NEW → RESOLVED
Closed: 11 years ago
Flags: needinfo?(wachen)
Resolution: --- → WORKSFORME
This happened again with:
BASE IMAGE: v188
PVT BUILD v2.1: 20141113161205 (Shallow Flash)
It got stuck on the launch screen of UI Tests (home button not working). I think I also got logs for Alive to look at this time.
Status: RESOLVED → REOPENED
Flags: needinfo?(alive)
Resolution: WORKSFORME → ---
OS: Linux → Gonk (Firefox OS)
Hardware: x86_64 → ARM
Alive, this is with comment 17 debugging. Can you take a look?
(In reply to Walter Chen[:ypwalter][:wachen] from comment #38)
> Alive, this is with comment 17 debugging. Can you take a look?

WOW, plenty of files in the zip...
* What is the problem seen during the logging?
* At what time does the problem happen?
Flags: needinfo?(alive)
This is a bug where the app failed to launch and got stuck on the app launch screen. Even if we use the home button, it won't go back to the homescreen. The problem happens right before the test fails. I guess you can get the log file from
Assignee: nobody → alive
The log does not show anything wrong, and the last opened app should be message. Possible reasons: the log is incomplete, or something unrelated to win mgmt happened. Also, App Manager is not usable: adb shell b2g-info works, but the device cannot be remote-debugged.
OK got it

logcat454:11-16 05:31:52.038 205 205 E GeckoConsole: [JavaScript Error: "sandbox is null" {file: "chrome://marionette/content/marionette-listener.js" line: 643}]
logcat454:11-16 05:31:52.038 205 205 E GeckoConsole: [JavaScript Error: "TypeError: this.browser is null" {file: "app://system.gaiamobile.org/js/app_window.js" line: 240}]
logcat454:11-16 05:31:52.088 205 205 E GeckoConsole: [JavaScript Error: "sandbox is null" {file: "chrome://marionette/content/marionette-listener.js" line: 643}]
logcat454:11-16 05:31:52.088 205 205 E GeckoConsole: [JavaScript Error: "TypeError: this.iframe is null" {file: "app://system.gaiamobile.org/js/app_window.js" line: 1387}]
logcat454:11-16 05:31:52.238 [AppWindow][UI tests][AppWindow_8][129130.234] getScreenshot timeout!
logcat454:11-16 05:31:52.238 [AppWindow][UI tests][AppWindow_8][129130.254] nextpaint is timeouted.
logcat454:11-16 05:31:52.238 [AppWindow][UI tests][AppWindow_8][129130.430] Handling mozbrowsererror event...
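Since the MTBF runs produce many logcat files, a filter along these lines might help pick out the system app errors shown above. This is only a sketch: the helper name extractJsErrors and the regular expression are my own, written against the line format visible in this comment rather than any documented logcat grammar.

```javascript
// A few lines copied from the excerpt above (grep file prefix removed).
const logcat = [
  '11-16 05:31:52.038 205 205 E GeckoConsole: [JavaScript Error: "sandbox is null" {file: "chrome://marionette/content/marionette-listener.js" line: 643}]',
  '11-16 05:31:52.038 205 205 E GeckoConsole: [JavaScript Error: "TypeError: this.browser is null" {file: "app://system.gaiamobile.org/js/app_window.js" line: 240}]',
  '11-16 05:31:52.088 205 205 E GeckoConsole: [JavaScript Error: "TypeError: this.iframe is null" {file: "app://system.gaiamobile.org/js/app_window.js" line: 1387}]',
  '11-16 05:31:52.238 [AppWindow][UI tests][AppWindow_8][129130.254] nextpaint is timeouted.'
];

// Pattern derived from the lines above: message, file and line number.
const ERROR_RE = /\[JavaScript Error: "(.*)" \{file: "([^"]+)" line: (\d+)\}\]/;

// Hypothetical helper: keeps only errors whose file contains fileSubstring.
function extractJsErrors(lines, fileSubstring) {
  const found = [];
  for (const line of lines) {
    const match = ERROR_RE.exec(line);
    if (match && match[2].includes(fileSubstring)) {
      found.push({ message: match[1], file: match[2], line: Number(match[3]) });
    }
  }
  return found;
}

const systemErrors = extractJsErrors(logcat, 'app://system.gaiamobile.org/');
console.log(systemErrors);
```

Filtering on the system app origin leaves out the Marionette "sandbox is null" entries and keeps the two app_window.js null accesses.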
Rough thought: the app is killed between the time it is launched and the time setVisible/resize is called.
Attached file 2.1 patch
The root cause seems to be that the app is killed between the launch and the real open. A possible solution might be to reset only the URL rather than remove the whole browser element. But for v2.1 let's add some more protection. Walter, please apply this patch as well as the debugging patch, thx.
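The kind of protection meant here can be sketched as a null check before touching the browser element. SketchAppWindow below is a stand-in for illustration and is not the code in the attached 2.1 patch; the method and property names are assumptions.

```javascript
// Stand-in for illustration only; not the code in the attached 2.1 patch.
function SketchAppWindow() {
  this.browser = { element: { visible: false } };
}

// Simulates the app being killed: the browser element goes away.
SketchAppWindow.prototype.kill = function() {
  this.browser = null;
};

// Guarded access: returns false instead of throwing when the element is gone.
SketchAppWindow.prototype.setVisible = function(visible) {
  if (!this.browser || !this.browser.element) {
    return false;
  }
  this.browser.element.visible = visible;
  return true;
};

const appWindow = new SketchAppWindow();
const firstCall = appWindow.setVisible(true);   // element present
appWindow.kill();                               // killed before the real open
const secondCall = appWindow.setVisible(true);  // element removed, no TypeError
console.log(firstCall, secondCall);
```

Without the check, the second call would throw the same sort of "this.browser is null" TypeError seen in the log, and the opening transition would never finish.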
Attachment #8524323 - Flags: review?(timdream)
Attachment #8524323 - Flags: feedback?(wachen)
Attachment #8524323 - Flags: review?(timdream) → review+
Component: Stability → Gaia::System::Window Mgmt
Currently running on the flame.v2.1.mtbf7.319 task. We will know the result next week.
Flags: needinfo?(wachen)
It's even worse now. Every screen is stuck at the homescreen without any icons on it.
Flags: needinfo?(wachen) → needinfo?(alive)
(In reply to Walter Chen[:ypwalter][:wachen] from comment #47)
> It's even worse now. Every screen is stuck at the homescreen without any
> icons on it.

Log please.
Flags: needinfo?(alive) → needinfo?(wachen)
Flags: needinfo?(alive)
There's no http://mtbf-1:8080 website. What is that? BTW I can't repro.
Flags: needinfo?(alive)
https://mozilla.box.com/s/zbwk9pbd425phwfik0do

From this log (I cannot see any other mtbf-1 link):
* Somebody calls window.close for Smart Collection.
* Somebody calls window.close for Homescreen and triggers something that is really impossible in our code base...
* The Homescreen instance id is AppWindow6, which is already weird. It should be 'homescreen'.
* FTU is running but it should not be.

I wonder:
1. This is not the mozilla code base...
2. Your test framework is doing something weird, for example switching to the iframe and calling window.close on it.
3. Your test framework randomly calls mozApps.launch at boot instead of letting the system app do the launching.

I don't think this log could be reproduced in real device usage. And sorry, I cannot go on with a test like this.
So several questions:
* What is mtbf-1?
* What is your build info?
* Could you reproduce it manually? It only took 66 seconds to fail in your log, so I guess it will not be too difficult to repro if this really happens. But I cannot reproduce it in 10 tries (v2.1 pvt + 319MB + patch). Everything is just fine.
* What do you do in your test? Could you record it if you cannot reproduce it manually?
Flags: needinfo?(wachen)
1. If you are on the Mozilla intranet, you should be able to connect to it: http://mtbf-1.corp.tpe1.mozilla.com:8080/
2. Base v188 with v2.1 gaia/gecko (shallow flash).
3. In which file do you see it failed? I could try to reproduce it manually.
Flags: needinfo?(wachen)
Attachment #8524323 - Flags: feedback?(wachen) → feedback+
Alive, thanks. This patch is good to go.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 10 years ago
Resolution: --- → FIXED
Comment on attachment 8524323 [details] [review]
2.1 patch

[Approval Request Comment]
[Bug caused by] (feature/regressing bug #): Suspending app (bug 935750)
[User impact] if declined: The app can be killed after it is ready to open and before the real opening transition. The browser iframe is then removed, so any access after that causes a JavaScript error.
[Testing completed]: all green on tbpl
[Risk to taking this patch] (and alternatives if risky): The fix only adds protection against invalid DOM element access, so it carries essentially no risk.
[String changes made]: No
Attachment #8524323 - Flags: approval-gaia-v2.1?
Attachment #8524323 - Flags: approval-gaia-v2.1? → approval-gaia-v2.1+