Closed Bug 1081528 Opened 6 years ago Closed 6 years ago

[Dialer] Blue screen after terminating a call in Dialer

Categories

(Firefox OS Graveyard :: Gaia::System::Window Mgmt, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(blocking-b2g:2.1+, b2g-v2.0 unaffected, b2g-v2.1 verified, b2g-v2.2 unaffected)

VERIFIED FIXED
2.1 S7 (24Oct)
blocking-b2g 2.1+
Tracking Status
b2g-v2.0 --- unaffected
b2g-v2.1 --- verified
b2g-v2.2 --- unaffected

People

(Reporter: marcia, Assigned: sfoster)

References

Details

(4 keywords, Whiteboard: [caf priority: p2][systemsfe][CR 737916])

Attachments

(4 files)

[Blocking Requested - why for this release]: Very visible UI regression.

Flame, while running:

Gaia   f5d4ff60ffed8961f7d0380ada9d0facfdfd56b1
SourceStamp d813d79d3eae
BuildID 20141011000201
Version 34.0a2
319 MB RAM

STR:
1. Perform an OTA from yesterday's build to today's build
2. Open the Contacts App
3. Select a Favorite and press the phone icon to dial the contact
4. Terminate the call 

Expected: I would see something other than a blue screen
Actual: I see a Blue Screen

This is 100% reproducible with these STR in today's build. For reference this blue screen was also seen in Bug 1080632. Pressing the home button allows a recovery option. 

I also checked, and it isn't limited to dialing a contact from the Contacts app - it happens with any call I make from Dialer, Missed Calls, etc.

Logcat attached.
Attached file bluescreen.txt
We just ran into this on today's build as well. We did not do an OTA but the blue screen occurred after ending a phone call. We checked and this issue did not occur on yesterday's build. Adding qawanted for the regression window.
This issue only occurs on 2.1 Flame KK full flash and shallow flash.

Environmental Variables (shallow):
Device: Flame 2.1
Build ID: 20141011053325
Gaia: f5d4ff60ffed8961f7d0380ada9d0facfdfd56b1
Gecko: e05a92abe9a8
Version: 34.0a2
Firmware Version: v180
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0

Environmental Variables (full)
Device: Flame 2.1
BuildID: 20141011000201
Gaia: f5d4ff60ffed8961f7d0380ada9d0facfdfd56b1
Gecko: d813d79d3eae
Gonk: 52c909e821d107d414f851e267dedcd7aae2cebf
Version: 34.0a2 (2.1)
Firmware: V180
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0

Issue does not occur:

Environmental Variables (shallow):
Device: Flame 2.0
BuildID: 20141010074705
Gaia: 6effca669c5baaf6cd7a63c91b71a02c6bd953b3
Gecko: 54ec9cb26b59
Version: 32.0 (2.0) 
Firmware Version: L1TC10011800
User Agent: Mozilla/5.0 (Mobile; rv:32.0) Gecko/32.0 Firefox/32.0

Environmental Variables (shallow):
Device: Flame 2.2
BuildID: 20141011031924
Gaia: 95f580a1522ffd0f09302372b78200dab9b6f322
Gecko: 3f6a51950eb5
Version: 35.0a1 (2.2) 
Firmware Version: L1TC10011800
User Agent: Mozilla/5.0 (Mobile; rv:35.0) Gecko/35.0 Firefox/35.0
Flags: needinfo?(jmitchell)
QA Contact: jmercado
Bug 1067649 seems to be the cause of this issue.

Aurora Regression Window:

Last Working 
Environmental Variables:
Device: Flame 2.1
BuildID: 20141010084606
Gaia: d18e130216cd3960cd327179364d9f71e42debda
Gecko: 610ee0e6a776
Version: 34.0a2 (2.1) 
Firmware Version: L1TC10011800
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0

First Broken 
Environmental Variables:
Device: Flame 2.1
BuildID: 20141010090103
Gaia: 6cc93702284648c4f697cf89e3cecea6f26187dc
Gecko: 6c30565dae54
Version: 34.0a2 (2.1) 
Firmware Version: L1TC10011800
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0

Last Working gaia / First Broken gecko - Issue does NOT occur
Gaia: d18e130216cd3960cd327179364d9f71e42debda
Gecko: 6c30565dae54

First Broken gaia / Last Working gecko - Issue DOES occur
Gaia: 6cc93702284648c4f697cf89e3cecea6f26187dc
Gecko: 610ee0e6a776

Gaia Pushlog: https://github.com/mozilla-b2g/gaia/compare/d18e130216cd3960cd327179364d9f71e42debda...6cc93702284648c4f697cf89e3cecea6f26187dc
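
For illustration, the swap test above can be read as a small decision function. This is only a sketch of the reasoning, not QA tooling; the function name `blameComponent` and its parameter names are hypothetical.

```javascript
// Hypothetical helper: given whether the issue reproduces for each mixed
// gaia/gecko pairing, report which side of the window carries the regression.
function blameComponent({ lastWorkingGaiaFirstBrokenGecko, firstBrokenGaiaLastWorkingGecko }) {
  if (firstBrokenGaiaLastWorkingGecko && !lastWorkingGaiaFirstBrokenGecko) {
    return 'gaia';   // the issue follows the gaia revision
  }
  if (lastWorkingGaiaFirstBrokenGecko && !firstBrokenGaiaLastWorkingGecko) {
    return 'gecko';  // the issue follows the gecko revision
  }
  return 'inconclusive'; // both or neither pairing reproduces
}

// Results reported in this comment:
const verdict = blameComponent({
  lastWorkingGaiaFirstBrokenGecko: false, // Issue does NOT occur
  firstBrokenGaiaLastWorkingGecko: true,  // Issue DOES occur
});
console.log(verdict);
```

With the two results reported here, the function returns 'gaia', which is why the Gaia pushlog is the one linked.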
QA Whiteboard: [QAnalyst-Triage?]
Broken by Bug 1067649 - can you take a look Sam?
Blocks: 1067649
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(jmitchell) → needinfo?(sfoster)
(In reply to Joshua Mitchell [:Joshua_M] from comment #5)
> Broken by Bug 1067649 - can you take a look Sam?

This seems unlikely as 1067649 didn't touch any window management stuff outside the task manager, but yes I'll take a look.
Flags: needinfo?(sfoster)
I can also reproduce this by simply dialing a call and disconnecting. Flagging this as smoketest and qablocker.

Also attaching my logcat for review
Keywords: qablocker, smoketest
Attached file logcat
my logcat
This also requires a battery pull for me, so it's certainly urgent.
(In reply to Tony Chung [:tchung] from comment #9)
> This also requires a battery pull for me, so it's certainly urgent.

For me, I can recover by long pressing the home button and swiping the card away.
Attached image BlueScreen.png
1. Call out. 
2. Hang up.

Home button is not functioning, need to long tap on home.
Per the comments it looks more related to systemsfe, so I'm changing the component.
Feel free to correct.
Component: Gaia::Dialer → Gaia::System::Window Mgmt
Whiteboard: [systemsfe]
Duplicate of this bug: 1081912
blocking-b2g: 2.1? → 2.1+
Possible dupe of 1081565, investigating.
Assignee: nobody → sfoster
This got missed in the confusion: we were always listening for and handling attentionscreenopened events, instead of only while the task manager is showing. This resulted in a call to exitToApp, which would open the homescreen without closing the call log window. There may be other dupes out there, since this would have affected any attention window.

We need test coverage for which events are handled in which state. We're going to review coverage this week; I'll see what is tested in master and file a new bug to fill this gap if necessary.
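
To illustrate the intended behaviour, here is a minimal sketch, not the actual Gaia code. The class `TaskManagerSketch`, its `exitCount` counter and the plain `EventTarget` standing in for `window` are all assumptions made for the example; only the event name and `exitToApp` come from this bug.

```javascript
// Hypothetical, simplified stand-in for the task manager (not the Gaia source).
class TaskManagerSketch {
  constructor(target) {
    this.target = target;   // an EventTarget standing in for `window`
    this.shown = false;
    this.exitCount = 0;     // counts exitToApp calls, for illustration only
    this.onAttention = () => this.exitToApp();
  }

  show() {
    if (this.shown) return;
    this.shown = true;
    // Listen only while the card view is actually on screen.
    this.target.addEventListener('attentionscreenopened', this.onAttention);
  }

  hide() {
    if (!this.shown) return;
    this.shown = false;
    this.target.removeEventListener('attentionscreenopened', this.onAttention);
  }

  exitToApp() {
    this.exitCount++;       // the real method would open an app or the homescreen
    this.hide();
  }
}

const target = new EventTarget();
const tm = new TaskManagerSketch(target);

target.dispatchEvent(new Event('attentionscreenopened')); // hidden: ignored
tm.show();
target.dispatchEvent(new Event('attentionscreenopened')); // showing: handled
console.log(tm.exitCount, tm.shown);
```

The point of the sketch is that the listener lives exactly as long as the task manager is visible, so an attention window opening at any other time (a call screen, for example) never reaches exitToApp.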
Attachment #8504183 - Flags: review?(alive)
Comment on attachment 8504183 [details] [review]
Github PR: Only handle attentionscreenopened events while TM is showing

r=me with a small unit test asserting that we do none of the exitToApp work if the event is dispatched while we're not displayed.
Attachment #8504183 - Flags: review?(alive) → review+
Updated the PR with these changes: 

* remove 'home' listener and move it to _registerShowingEvents, as it was simply doing a this.isShown() anyhow. 
* reordered the event listener statements within _registerShowingEvents/_unregisterShowingEvents for consistency with master (gives better diff)
* added unit tests to ensure the events are handled as expected when hidden and when showing. 
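
A rough sketch of what these tests assert, under stated assumptions: the `taskManager` stub, the `exits` counter and the `check` helper below are hypothetical, and treating 'home' as triggering the same exit work is a simplification. Only the `_registerShowingEvents`/`_unregisterShowingEvents` names and the two event names come from the PR description.

```javascript
// Hypothetical stub: only the listener bookkeeping is modelled.
const SHOWING_EVENTS = ['home', 'attentionscreenopened'];

const taskManager = {
  shown: false,
  exits: 0,                   // stands in for the exitToApp work
  target: new EventTarget(),  // stands in for `window`
  _registerShowingEvents() {
    SHOWING_EVENTS.forEach((type) => this.target.addEventListener(type, this));
  },
  _unregisterShowingEvents() {
    SHOWING_EVENTS.forEach((type) => this.target.removeEventListener(type, this));
  },
  show() {
    if (this.shown) return;
    this.shown = true;
    this._registerShowingEvents();
  },
  hide() {
    if (!this.shown) return;
    this.shown = false;
    this._unregisterShowingEvents();
  },
  handleEvent(evt) {          // listener-object pattern: the object itself is the listener
    this.exits++;
    this.hide();
  },
};

function check(condition, message) {
  if (!condition) throw new Error(message);
}

// While hidden, neither event should trigger any exit work.
SHOWING_EVENTS.forEach((type) => taskManager.target.dispatchEvent(new Event(type)));
check(taskManager.exits === 0, 'events must be ignored while hidden');

// While showing, each event is handled once and dismisses the task manager.
SHOWING_EVENTS.forEach((type) => {
  taskManager.show();
  taskManager.target.dispatchEvent(new Event(type));
  check(!taskManager.shown, type + ' should dismiss the task manager');
});
check(taskManager.exits === 2, 'expected one exit per event while showing');
```

Because 'home' no longer has a permanent listener, the `this.isShown()` check it used to perform becomes unnecessary: the handler can only run while showing.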

Gaia-Try job at https://tbpl.mozilla.org/?tree=Gaia-Try&rev=98a991bcc0bb - I'll request approval for 2.1 when that's green
One thing to note here - this bug was reproduced 100% on an OTA to today's aurora build

Gecko: ad497694e258
Gaia: f5d4ff60ffed8961f7d0380ada9d0facfdfd56b1

but I haven't yet been able to repro using https://pvtbuilds.mozilla.org/pvt/mozilla.org/b2gotoro/nightly/mozilla-b2g34_v2_1-flame-kk/

Gaia   d18e130216cd3960cd327179364d9f71e42debda
SourceStamp 610ee0e6a776
BuildID 20141013001201
Version 34.0a2

Since this bug has come and gone in other builds (See Bug 1080632) let's see how it looks after the fix.
Sam, was this a merge mistake or is this bug also present on master?
I want to understand why we didn't see this bug after we landed bug 1067649 on master.
Flags: needinfo?(sfoster)
The task manager code has diverged between 2.1 and master as a result of not uplifting bug 1061324. Ultimately, bug 1067649 was fallout from that and only affected 2.1 - we didn't land that patch on master. What I don't yet understand here is why this wouldn't repro every time on 2.1, and why it apparently only showed up after landing bug 1067649 - it's a clear mistake with what should be clear consequences. It must be that the change in the exitToApp logic in 1067649 revealed this bug. I can figure it out; in the meantime the patch is ready to land.
Flags: needinfo?(sfoster)
(In reply to Sam Foster [:sfoster] from comment #20)
> The task manager code has diverged between 2.1 and master as a result of not
> uplifting bug 1061324. Ultimately, bug 1067649 was fallout from that and
> only affected 2.1 - we didn't land that patch on master. What I don't yet
> understand here is why this wouldn't repro every time on 2.1, and why it
> apparently only showed up after landing bug 1067649 - it's a clear mistake
> with what should be clear consequences. Must be that the change in the
> exitToApp logic in 1067649 revealed this bug. I can figure it out, in the
> meantime the patch is ready to land.

OK, so waiting for green smoke tests on master was not really helpful in this case. Maybe we should add this criterion to our uplift approvals. Thanks for the clarification!
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage+][COM=Gaia::System]
Duplicate of this bug: 1082511
Target Milestone: --- → 2.1 S7 (24Oct)
Whiteboard: [systemsfe] → [systemsfe][CR 737916]
Comment on attachment 8504183 [details] [review]
Github PR: Only handle attentionscreenopened events while TM is showing

[Approval Request Comment]
[Bug caused by] (feature/regressing bug #): Task manager
[User impact] if declined: All attentionscreens including calls will exit improperly
[Testing completed]: New unit tests, Gaia-Try and manual testing on 2.1 flame
[Risk to taking this patch] (and alternatives if risky): Low risk. I could pare the patch down further to eliminate a couple of lines of refactoring, but this patch aligns closer to master now and the new tests should guard against similar regressions in task manager. 
[String changes made]: None
Attachment #8504183 - Flags: approval-gaia-v2.1?(fabrice)
Attachment #8504183 - Flags: approval-gaia-v2.1?(fabrice) → approval-gaia-v2.1+
I suspect if we look we'll find we were throwing exceptions in TaskManager's exitToApp when an attentionopened event was fired while the task manager was hidden (e.g. by the call screen), which was masking this bug. When that was fixed in bug 1047143 (testing this.unfilteredStack before iterating through it), we started seeing the results of the exitToApp calls. Interestingly, this may also have been the root cause of bug 1065511.
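
A sketch of how that masking could work, assuming the suspicion above is right. Everything here is a hypothetical reduction: `makeTaskManager`, the `guardStack` switch and the `homescreenOpened` flag are invented for the example; only `exitToApp` and `unfilteredStack` are names from this bug.

```javascript
// Hypothetical reduction: names mirror the comment, bodies are invented.
function makeTaskManager({ guardStack }) {
  return {
    unfilteredStack: null,   // assumed to be unset while the task manager is hidden
    homescreenOpened: false,
    exitToApp() {
      if (!guardStack || this.unfilteredStack) {
        // Without the guard this throws a TypeError when the stack is null.
        this.unfilteredStack.forEach(() => {});
      }
      // Only reached when nothing above threw.
      this.homescreenOpened = true;
    },
  };
}

// Before the guard: the exception aborts exitToApp, hiding the stray call.
const unguarded = makeTaskManager({ guardStack: false });
let threw = false;
try {
  unguarded.exitToApp();
} catch (e) {
  threw = e instanceof TypeError;
}

// After the guard: exitToApp runs to completion and its side effect shows up.
const guarded = makeTaskManager({ guardStack: true });
guarded.exitToApp();

console.log(threw, unguarded.homescreenOpened, guarded.homescreenOpened);
```

In this reading, the guard was correct in itself; it simply let a call that should never have happened run to the end.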
Just making a note that there are a few Gij failures in the pull request. I think that this is ok given that these errors are also happening on mozilla-aurora, see: https://treeherder.mozilla.org/ui/#/jobs?repo=mozilla-aurora&revision=ad497694e258&searchQuery=b2g_ubuntu64_vm%20mozilla-aurora%20opt%20test%20gaia-integration

I'm looking into these now, but I suspect that we just need to uplift some things.
Merged into v2.1: https://github.com/mozilla-b2g/gaia/commit/a93410a4ef8ff11ff042e2ccbb26001eddd03285
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Whiteboard: [systemsfe][CR 737916] → [caf priority: p1][systemsfe][CR 737916]
Seth, please verify that this issue is fixed on tomorrow's build.
Flags: needinfo?(smiko)
Keywords: verifyme
Duplicate of this bug: 1082669
Verified fixed on Flame 2.1 (319 MB/full flash)

Actual result:
Blue screen is NOT present after ending a call. 

Environmental Variables:
Device: Flame 2.1
BuildID: 20141015001201
Gaia: 379ea4c9dd6d3f8ca2f79ce59c15f6afe6e557c3
Gecko: 4853208cb48a
Gonk: 52c909e821d107d414f851e267dedcd7aae2cebf
Version: 34.0 (2.1)
Firmware: V180
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0
Status: RESOLVED → VERIFIED
Flags: needinfo?(smiko) → needinfo?(ktucker)
Keywords: verifyme
Flags: needinfo?(ktucker)
Whiteboard: [caf priority: p1][systemsfe][CR 737916] → [caf priority: p2][systemsfe][CR 737916]