[QRD KK Only] homescreen is not displayed after pressing home key

RESOLVED DUPLICATE of bug 1055299

Status

Firefox OS
Gaia::System::Window Mgmt
--
critical
RESOLVED DUPLICATE of bug 1055299
4 years ago
4 years ago

People

(Reporter: tkundu, Assigned: alive)

Tracking

unspecified
ARM
Gonk (Firefox OS)
Dependency tree / graph

Firefox Tracking Flags

(blocking-b2g:2.0+)

Details

(Whiteboard: [caf priority: p1], URL)

Attachments

(4 attachments)

+++ This bug was initially created as a clone of Bug #1050751 +++

+++ This bug was initially created as a clone of Bug #1045332 +++

[Blocking Requested - why for this release]:

Issue : Repeated presses of home button do not bring up homescreen, neither does holding the home button key.
Created attachment 8476059 [details]
about-memory-0.tar.bz2

I have all logs in https://bugzilla.mozilla.org/show_bug.cgi?id=1050751#c63

but I re-collected cc-logs again for bug 1050751 Comment 65
Flags: needinfo?(khuey)
Flags: needinfo?(kgrandon)
Please note that we also have gdb access to affected debug.
(In reply to Tapas Kumar Kundu from comment #2)
> Please note that we also have gdb access to affected debug.
Please note that we also have gdb access to affected device for debugging . Sorry for posting again .
I will look into this issue with kyle, or someone from the systems team.
Flags: needinfo?(kgrandon)
Is logcat available as well?
Ok found it in bug 1050751
(In reply to Kevin Grandon :kgrandon from comment #4)
> I will look into this issue with kyle, or someone from the systems team.

just as an update: We have both marionette and gdb access to affected device. Please let us know if you want to try something on device for debugging.
(In reply to Tapas Kumar Kundu from comment #7)
> (In reply to Kevin Grandon :kgrandon from comment #4)
> > I will look into this issue with kyle, or someone from the systems team.
> 
> just as an update: We have both marionette and gdb access to affected
> device. Please let us know if you want to try something on device for
> debugging.

Tapas, we need window mgmt folks here and that team is in Taipei, so give us a few hours and I will get in touch with you once alive/tim are able to look at this.
We think there might be another closing code path that needs the same workaround we did in bug 1047645.
Flags: needinfo?(khuey)
If we get this path after attempting to close the task manager, possibly through the browser - then it seems that we might be hitting this close path: https://github.com/mozilla-b2g/gaia/blob/5b7daa13fb14325f3571049118ff7e5ad1463c22/apps/system/js/app_window.js#L1653

It does seem like this could be racey.
Assigning to timdream to give it an owner, he can pass it along as necessary.
Assignee: nobody → timdream

Comment 12

4 years ago
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #11)
> Assigning to timdream to give it an owner, he can pass it along as necessary.

I also pinged Tim, and hope he can have time to look at this today.
Assignee: timdream → alive
Could you guys take this and append the log to me?
https://github.com/mozilla-b2g/gaia/pull/22695

You ignore my requests in previous bug :(
Flags: needinfo?(tkundu)
(In reply to Alive Kuo [:alive][NEEDINFO!] from comment #13)
> Could you guys take this and append the log to me?
> https://github.com/mozilla-b2g/gaia/pull/22695
> 
> You ignore my requests in previous bug :(

OK I am enabling this log. Sorry about that missing log . We thought that this issue is fixed but it has came back again !
Do you disagree with comment 10?  If we wait for new logs that means we're waiting several days ...
Flags: needinfo?(alive)
That comment is totally backwards. Closing the browser through the task manager is what I meant to say.
(In reply to Kevin Grandon :kgrandon from comment #16)
> That comment is totally backwards. Closing the browser through the task
> manager is what I meant to say.

(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #15)
> Do you disagree with comment 10?  If we wait for new logs that means we're
> waiting several days ...

Since we were not fixing the problem by a guess, we should not "fix it again" by another guess,
otherwise we will reopen, reopen, and reopen again. Is this expected?

Kevin if you want to try something out (but I still doubt because the STR seems task manager is not a must-have) please do.
Flags: needinfo?(alive)
Created attachment 8476767 [details]
cr700028_newlogs.zip
Flags: needinfo?(tkundu)
Flags: needinfo?(khuey)
Flags: needinfo?(kgrandon)
Flags: needinfo?(alive)
Flags: needinfo?(khuey)
Flags: needinfo?(kgrandon)
(In reply to Tapas Kumar Kundu from comment #18)
> Created attachment 8476767 [details]
> cr700028_newlogs.zip

Hi, still looking at the log, could you show me the DOM structure under system app #windows element?
Flags: needinfo?(alive) → needinfo?(tkundu)
(In reply to Alive Kuo [:alive][NEEDINFO!] from comment #19)
> (In reply to Tapas Kumar Kundu from comment #18)
> > Created attachment 8476767 [details]
> > cr700028_newlogs.zip
> 
> Hi, still looking at the log, could you show me the DOM structure under
> system app #windows element?

It's in the cc logs.  Grep it for 'windows'.  You can piece together the DOM by looking at the parent/children pointers.
Created attachment 8477177 [details]
simple.log

From the simplified log, what we could see is, in the last minute, several apps are launched "at the same time" without going to homescreen
Gallery -> Browser -> FM -> ...

So I think what happened is:
AppWindowManager switches from Gallery to Browser, and FM is opened between this, so the Gallery app is still opened.

First thing to my mind is: Are you spamming mozApps launch instead of clicking icon in homescreen?

Another strange point is homescreen/browser app is having themselves as activity callee. I think this is also related to the strange launching sequences.

I am going to do some local experiments and try to provide another logging patch to prove my thought.
Kyle has answered in Comment 20. Please put NI on me if you think that I can help you to move ahead quickly.
Flags: needinfo?(tkundu)
Sorry, totally don't know what he/you means. Grep your file with "windows" get me some pointer like stuff..
Yes, this is the C++ side of the DOM.  Treat the pointers as opaque identifiers and follow the edges.  e.g.

grep for 'windows' and you find 

0xb0bcb060 [rc=13] FragmentOrElement (xhtml) div id='windows' app://system.gaiamobile.org/index.html
> 0xb0bc32e0 mNodeInfo
> 0xb0bc3330 GetParent()
> 0xa8f93840 mSlots->mChildNodes
> 0xb0bc96c0 mAttrsAndChildren[i]
> 0xb0bc9700 mAttrsAndChildren[i]
> 0xb0bc9780 mAttrsAndChildren[i]
> 0xa8adea10 mAttrsAndChildren[i]
> 0xace1d240 mAttrsAndChildren[i]
> 0xa8f4c100 mAttrsAndChildren[i]
> 0xadbf4800 mSlots->mClassList

if you follow the pointer for GetParent() you'll find

0xb0bc3330 [rc=111] FragmentOrElement (xhtml) div id='screen' class='software-button-disabled' app://system.gaiamobile.org/index.html
...

if you follow the last pointer from mAttrsAndChildren[i] on the 'windows' div you'll find

0xa8f4c100 [rc=9] FragmentOrElement (xhtml) div id='homescreen' class='appWindow homescreen active render' app://system.gaiamobile.org/index.html
...

You can poke around the DOM tree this way and see what it looks like.  It's slightly tedious but it works.
https://github.com/mozilla-b2g/gaia/pull/23177

Tapas, here is another logging patch, please try it out and give me the log again, thanks.
Flags: needinfo?(tkundu)
(In reply to Alive Kuo [:alive][NEEDINFO!] from comment #25)
> https://github.com/mozilla-b2g/gaia/pull/23177
> 
> Tapas, here is another logging patch, please try it out and give me the log
> again, thanks.

Should I enable both 

https://github.com/mozilla-b2g/gaia/pull/23177
https://github.com/mozilla-b2g/gaia/pull/22695 

Please confirm again
Flags: needinfo?(tkundu) → needinfo?(alive)
I discussed with alive in #gaia IRC and enabled only https://github.com/mozilla-b2g/gaia/pull/23177 in our test build . I will update here soon
Created attachment 8477263 [details] [review]
Spammer + Blocker

I added spammer module to programmatically launch apps in a short timing.
<Usage>
var s = new Spammer();
s.start();
s.launch(10); // Launch 10 times

With this module I am able to have two opened apps at the same time.
That means, when pressing home button, only AppWindowManager._activeApp is closed. So the other one keeps opened.

The problem might be: when AppWindowManager had the appopened event it will update the activeApp, but it may come from an app which is not the current app nor the previous app. This is because we are launching apps too quickly.

My thought/solution right now: it's difficult to change the logic of transition state machine right now, so I tried to close all 'opened' app when active app is updated via appopened event. One thing we need to note is, the spammer is not a normal user behavior. If we really launch apps so quickly, the whole phone may become un-usable and the apps won't load anymore. So I want to know how you test because I suspect those launch events are not coming from the click from homescreen but programmatically triggered just like my spammer.

Also please try out this patch to see if it fixes this issue or not.

Please note this is not a right fix.
Flags: needinfo?(alive) → needinfo?(tkundu)
I also noted following problems (irrelated I think) in the log
* Homescreen is Homescreen's activity caller/callee.
* StackManager/EdgeSwipeDetector has some Javascript error.
* Sometimes when active app is updating, "this._activeApp is undefined" shows.
(In reply to Alive Kuo [:alive][NEEDINFO!] from comment #28)
> Created attachment 8477263 [details] [review]
> Spammer + Blocker
> 
> I added spammer module to programmatically launch apps in a short timing.
> <Usage>
> var s = new Spammer();
> s.start();
> s.launch(10); // Launch 10 times
> 
> With this module I am able to have two opened apps at the same time.
> That means, when pressing home button, only AppWindowManager._activeApp is
> closed. So the other one keeps opened.
> 
> The problem might be: when AppWindowManager had the appopened event it will
> update the activeApp, but it may come from an app which is not the current
> app nor the previous app. This is because we are launching apps too quickly.
> 
> My thought/solution right now: it's difficult to change the logic of
> transition state machine right now, so I tried to close all 'opened' app
> when active app is updated via appopened event. One thing we need to note
> is, the spammer is not a normal user behavior. If we really launch apps so
> quickly, the whole phone may become un-usable and the apps won't load
> anymore. So I want to know how you test because I suspect those launch
> events are not coming from the click from homescreen but programmatically
> triggered just like my spammer.
> 
> Also please try out this patch to see if it fixes this issue or not.
> 
> Please note this is not a right fix.

I landed this patch in our test build and asked internal test team to test it.
Flags: needinfo?(tkundu)
Flags: needinfo?(tkundu)

Updated

4 years ago
blocking-b2g: --- → 2.0+
I tried with your patch and I can reproduce it again. Regular stability test team is still trying to reproduce with your patch.

I added a comment in bug 1055299 comment 31 suspecting root cause of this issue .Could you please take a look ?
Flags: needinfo?(tkundu) → needinfo?(alive)
(In reply to Tapas  (always NI me) from comment #31)
> I tried with your patch and I can reproduce it again. Regular stability test
> team is still trying to reproduce with your patch.
> 
> I added a comment in bug 1055299 comment 31 suspecting root cause of this
> issue .Could you please take a look ?

I left some comments there. Could you give the log again?
Flags: needinfo?(alive)
I took a look at the simple log again and found something strange.

01-10 23:24:44.809 [Dump: AppWindow][Clock][AppWindow_6][234.688]invokingApp has been ENDED!
01-10 23:24:44.809 [Dump: AppWindow][Clock][AppWindow_6][234.688]closing,closed,::,complete
01-10 23:24:44.809 [Dump: AppWindow][Clock][AppWindow_6][234.689] publishing internal event: closed
01-10 23:24:44.809 [Dump: AppWindow][Clock][AppWindow_6][234.691]set visibility -> ,false
01-10 23:24:44.809 [Dump: AppWindow][Clock][AppWindow_6][234.694]screenshot state -> ,screenshot
01-10 23:24:44.809 [Dump: AppWindow][Clock][AppWindow_6][234.695] publishing external event: closedundefined
01-10 23:24:44.819 [Dump: AppWindow][Clock][AppWindow_6][234.695] publishing internal event: close
01-10 23:24:44.819 [Dump: AppWindow][Clock][AppWindow_6][234.696] publishing external event: closeundefined
01-10 23:24:44.819 [Dump: AppWindow][Video][AppWindow_1][235.839] Handling mozbrowservisibilitychange event...
01-10 23:24:44.819 [Dump: AppWindow][Video][AppWindow_1][235.844] Handling mozbrowservisibilitychange event...
01-10 23:24:44.819 [Dump: AppWindow][Video][AppWindow_1][235.845] publishing internal event: foreground
01-10 23:24:44.819 [Dump: AppWindow][Video][AppWindow_1][235.845] publishing external event: foregroundundefined

We are sending the video app to background at 234.691 but at 235.845 gecko tells me it's now foreground.

Comment 34

4 years ago
Hi Tapas, please confirm those launch events are coming from human clicking homescreen or programmatically triggered as comment 28 mentioned. Thanks.
Flags: needinfo?(tkundu)
(In reply to howie [:howie] from comment #34)
> Hi Tapas, please confirm those launch events are coming from human clicking
> homescreen or programmatically triggered as comment 28 mentioned. Thanks.

All launch events are coming from human clicking homescreen. We are not using any automated test for this CR .
Flags: needinfo?(tkundu)
(In reply to Tapas  (always NI me) from comment #35)
> (In reply to howie [:howie] from comment #34)
> > Hi Tapas, please confirm those launch events are coming from human clicking
> > homescreen or programmatically triggered as comment 28 mentioned. Thanks.
> 
> All launch events are coming from human clicking homescreen. We are not
> using any automated test for this CR .

I hope you had the log turned on in my patch. Still cannot reproduce it.
Most of us think this is not a valid bug because we cannot see it on our side. Please provide more information before we go on.

I was looking at the old log and have trouble to know why the patch does not fix it.
There is no foreground app until the gallery app becomes always opened. This is different from what you said in bug 1055299.
(In reply to Alive Kuo [:alive][NEEDINFO!] from comment #36)
> I hope you had the log turned on in my patch. Still cannot reproduce it.
> Most of us think this is not a valid bug because we cannot see it on our
> side. Please provide more information before we go on.
> 

I can reproduce it on FLAME (msm8610) 256MB (running JB with FFOS 2.0) . IMO, I think that we should fix bug 1055299 and that will fix this issue too (see bug 1055299 comment 31) . I mentioned STR in bug 1055299 comment 30.
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1055299
Whiteboard: [caf priority: p1][CR 700028] → [caf priority: p1]
You need to log in before you can comment on or make changes to this bug.