Closed Bug 1084495 Opened 6 years ago Closed 6 years ago

[Bluetooth] White screen appears after confirming pairing error

Categories

(Firefox OS Graveyard :: Gaia::System::Window Mgmt, defect)

ARM
Gonk (Firefox OS)
defect
Not set

Tracking

(blocking-b2g:2.1+, b2g-v2.0 unaffected, b2g-v2.1 verified, b2g-v2.2 verified)

VERIFIED FIXED
2.1 S8 (7Nov)

People

(Reporter: ychung, Assigned: alive)

Details

(Keywords: regression, Whiteboard: [caf priority: p2][2.1-flame-test-run-3] [CR 754769])

Attachments

(3 files)

Description:
A white screen appears when the user selects "OK" on the pairing error message after the pairing request times out.
   
Repro Steps:
1) Update a Flame device to BuildID: 20141017001201.
2) Open Settings > Bluetooth and search for other devices.
3) Select one of the detected devices.
4) When the pairing request screen appears on both devices, do not select any options.
5) When the pairing times out, select "OK" on the "Unable to pair devices" dialog on DUT.
  
Actual:
A white screen appears.
  
Expected: 
The Bluetooth settings screen reappears.
  
Environmental Variables:
Device: Flame 2.1
BuildID: 20141017001201
Gaia: 1ea74943cfe525c76a074ca1d7de8e51a70f6b98
Gecko: 2befa902ff5c
Gonk: 05aa7b98d3f891b334031dc710d48d0d6b82ec1d
Version: 34.0 (2.1)
Firmware: V180
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0
  
Repro frequency: 100%
Link to failed test case: https://moztrap.mozilla.org/manage/case/3510/
See attached: video clip, logcat
http://youtu.be/-vCMwk5RQ7k
This issue also reproduces on Flame 2.2:

Flame 2.2 

Device: Flame 2.2 Master KK (319mb) (Full Flash)
BuildID: 20141017040208
Gaia: abef62c0623e5504a97b4fd411e879a67b285b52
Gecko: ae1dfa192faf
Gonk: 52c909e821d107d414f851e267dedcd7aae2cebf
Version: 36.0a1 (2.2 Master)
Firmware: V180
User Agent: Mozilla/5.0 (Mobile; rv:36.0) Gecko/36.0 Firefox/36.0

A white screen appears after pairing timeout > "ok" on error dialog.
==========================================

This issue does NOT reproduce on Flame 2.0:

Flame 2.0

Device: Flame 2.0 KK (319mb) (Full Flash)
BuildID: 20141017000203
Gaia: 9c7dec14e058efef81f2267b724dad0850fc07e4
Gecko: c17df9fe087d
Gonk: 05aa7b98d3f891b334031dc710d48d0d6b82ec1d
Version: 32.0 (2.0)
Firmware: V180
User Agent: Mozilla/5.0 (Mobile; rv:32.0) Gecko/32.0 Firefox/32.0

The Bluetooth settings screen reappears properly after selecting "ok" on the error dialog.
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(dharris)
See Also: → 1082669
I've noticed this as well. I don't have a logcat or stack trace ATM, but it seemed related to signal observers not being cleaned up correctly. This makes an assertion in Gecko fail, and the screen remains white.
Request regressionwindow-wanted here.
[Blocking Requested - why for this release]:

Nominating this to block for 2.1. If the user times out on the BT pairing page, they will get a white screen. This is poor UX and a regression.
blocking-b2g: --- → 2.1?
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(dharris)
QA Contact: jmercado
Duplicate of this bug: 1087843
I'm able to reproduce the issue on v2.2 with the following build. After the white screen is displayed, long-press the home key to open card view, then tap the Settings app in card view to return to it. The panel then renders normally. This is a graphics issue. I also found a warning in the log when going back to the Settings app from card view.

W/GeckoConsole( 8599): [JavaScript Warning: "Error in parsing value for 'visibility'.  Declaration dropped." {file: "app://system.gaiamobile.org/index.html" line: 0 column: 12 source: "visibility: undefined"}]

================================================================================
Gaia-Rev        76d80e5b137dbe96dd41b132fc6a57deae8ea157
Gecko-Rev       https://hg.mozilla.org/mozilla-central/rev/34c66dadd802
Build-ID        20141022175116
Version         36.0a1
Device-Name     flame
FW-Release      4.4.2
FW-Incremental  eng.cltbld.20141022.225857
FW-Date         Wed Oct 22 22:59:07 EDT 2014
Bootloader      L1TC10011800
QA Whiteboard: [COM=Bluetooth]
The symptom looks the same as the one described in bug 1074063, so I marked this as a duplicate. Feel free to reopen if my suspicion is wrong.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1074063
I hadn't finished the regression window for this bug before it was closed, but here is the central window in case it helps.

Central Regression Window:

Last Working 
Environmental Variables:
Device: Flame 2.2
BuildID: 20141001055321
Gaia: 0e280591881d44b80f456bc27e12d9114c218868
Gecko: fe0afc101ad2
Version: 35.0a1 (2.2) 
Firmware Version: v184
User Agent: Mozilla/5.0 (Mobile; rv:35.0) Gecko/35.0 Firefox/35.0

First Broken 
Environmental Variables:
Device: Flame 2.2
BuildID: 20141001060621
Gaia: a23d2c490b39c4699c9375e25c4acdf396a2fa85
Gecko: 835ef55e175e
Version: 35.0a1 (2.2) 
Firmware Version: v184
User Agent: Mozilla/5.0 (Mobile; rv:35.0) Gecko/35.0 Firefox/35.0

Last Working gaia / First Broken gecko - Issue does NOT occur
Gaia: 0e280591881d44b80f456bc27e12d9114c218868
Gecko: 835ef55e175e

First Broken gaia / Last Working gecko - Issue DOES occur
Gaia: a23d2c490b39c4699c9375e25c4acdf396a2fa85
Gecko: fe0afc101ad2

Gaia Pushlog: https://github.com/mozilla-b2g/gaia/compare/0e280591881d44b80f456bc27e12d9114c218868...a23d2c490b39c4699c9375e25c4acdf396a2fa85
QA Whiteboard: [COM=Bluetooth] → [QAnalyst-Triage?][COM=Bluetooth]
Flags: needinfo?(jmitchell)
QA Whiteboard: [QAnalyst-Triage?][COM=Bluetooth] → [QAnalyst-Triage+][COM=Bluetooth]
Flags: needinfo?(jmitchell)
QA Contact: jmercado
blocking-b2g: 2.1? → 2.1+
Even though bug 1074063 is fixed, this issue still reproduces. Once the white screen appears, long-press the home key to open card view, then tap the Settings app in card view. The Settings app is rendered again. This looks like a graphics issue.

Alive, do you have any idea? Thanks.
Flags: needinfo?(alive)
This is not a duplicate of bug 1074063.
Status: RESOLVED → REOPENED
Flags: needinfo?(alive)
Resolution: DUPLICATE → ---
Since the bug is reopened, please help find the regression window. Thanks.
Depends on: 1047645
This bug comes from https://bugzilla.mozilla.org/show_bug.cgi?id=1047645 and https://bugzilla.mozilla.org/show_bug.cgi?id=927862
Assignee: nobody → alive
Component: Gaia::Bluetooth File Transfer → Gaia::System::Window Mgmt
Depends on: attention-window
In bug 1047645, we introduced a workaround to make sure the destroy function is invoked after kill is called. However, bug 927862 then landed a mechanism that checks whether the bottom app/attention window has been repainted.

This case happens because:
(1) The repaint of the bottom window comes too late.
(2) The killed attention window is destroyed before (1).
(3) Even when the attention window manager grants the close request of the killed attention window, the element of the attention window no longer exists. Hence it does not fire the closing/closed events anymore, and the attention window manager will not fire attention-inactive.

The proposed solution is to publish the closed event before destroying, if destroy is called from the timeout, so that all the modules that need the attentionclosed event keep working.
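
A minimal sketch of the proposed ordering, for illustration only. FakeAttentionWindow, its listener callback, and the fromTimeout flag are hypothetical stand-ins and are not taken from the Gaia source or from the attached patch. The sketch only assumes that events are dispatched through the window's element.

```javascript
// Hypothetical stand-in for an attention window; not the real Gaia AppWindow.
class FakeAttentionWindow {
  constructor(listener) {
    this.element = {};            // stands in for the window's DOM element
    this.listener = listener;     // stands in for the attention window manager
    this.closedPublished = false;
  }

  // Assumption: events travel through the element, so nothing fires without it.
  publish(name) {
    if (!this.element) {
      return false;
    }
    if (name === 'closed') {
      this.closedPublished = true;
    }
    this.listener(name);
    return true;
  }

  // Proposed ordering: when destroy() is reached through the kill timeout,
  // announce 'closed' before the element is dropped.
  destroy(fromTimeout) {
    if (fromTimeout && !this.closedPublished) {
      this.publish('closed');
    }
    this.element = null;
  }

  // A close() that arrives after destroy() can no longer publish anything.
  close() {
    return this.publish('closed');
  }
}

const heardByManager = [];
const fakeWindow = new FakeAttentionWindow((name) => heardByManager.push(name));
fakeWindow.destroy(true);               // the kill timer wins the race
const lateClose = fakeWindow.close();   // the late close from the manager is a no-op
console.log(heardByManager, lateClose);
```

In this model, calling destroy(false) instead would leave the listener with no event at all, which corresponds to the stuck state described in (3).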
Attachment #8515747 - Flags: review?(etienne)
Comment on attachment 8515747 [details] [review]
https://github.com/mozilla-b2g/gaia/pull/25741

Can you tell me more about the sequence of events causing this situation (and this.element to be removed from underneath us)?

It feels weird to publish('closed') from a timeout outside of an appTransitionController... so I want to make sure I really understand what's happening. Thanks!
Attachment #8515747 - Flags: review?(etienne)
Comment 12 seems to satisfy the regression window tag, as it points to the cause of this bug. Additionally, a patch already seems to be in the works. Please re-add the tag if you disagree.
(In reply to Etienne Segonzac (:etienne) from comment #14)
> Comment on attachment 8515747 [details] [review]
> https://github.com/mozilla-b2g/gaia/pull/25741
> 
> Can you tell me more about the sequence of events causing this situation
> (and this.element to be removed from underneath us)?
> 
> It feels weird to publish('closed') from a timeout outside of an
> appTransitionController... so I want to make sure I really understand what's
> happening. Thanks!

1. The attention window is terminated by window.close().
2. Because it is currently active, the attention window calls requestClose to tell the attention window manager that it wants to close.
3. The attention window manager waits for the underlying Settings app to be repainted (app.ready).
4. The attention window times out because of the timer introduced in bug 1047645.
5. The attention window destroys itself and removes this.element.
6. The ready callback completes and the attention window manager tries to call close on the attention window, but this fails because this.element is null by then.
7. Because nobody fires attentionclosed, the attention window manager never gets attentionclosing and so never triggers attention-inactive.
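
The sequence above can be replayed as a small synchronous model. This is only a sketch: replayTimeoutRace, the trace labels, and the attentionActive flag are hypothetical and do not mirror the real attention window manager code.

```javascript
// Hypothetical model of the sequence; names do not mirror the real Gaia modules.
function replayTimeoutRace() {
  const trace = [];
  const win = { element: {} };   // attention window with its DOM element
  const manager = { attentionActive: true, pendingReady: null };

  // Steps 1-2: window.close() makes the window ask the manager to close it.
  trace.push('requestClose');

  // Step 3: the manager defers the close until the app underneath repaints.
  manager.pendingReady = () => {
    // Step 6: the close can only publish events while the element exists.
    if (!win.element) {
      trace.push('close-failed');
      return;
    }
    trace.push('attentionclosed');
    manager.attentionActive = false;   // would lead to attention-inactive
  };

  // Steps 4-5: the kill timer fires first and destroy() drops the element.
  trace.push('kill-timeout');
  win.element = null;
  trace.push('destroyed');

  // Step 6: the repaint finally completes and the deferred close runs.
  manager.pendingReady();

  // Step 7: nobody published attentionclosed, so the manager still believes
  // an attention window is active.
  return { trace, attentionActive: manager.attentionActive };
}

const raceResult = replayTimeoutRace();
console.log(raceResult.trace.join(' -> '));
console.log('attention still active:', raceResult.attentionActive);
```

The model leaves attentionActive set to true at the end, which stands in for the manager never triggering attention-inactive.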

I understand that publishing closed outside of appTransitionController is strange, but that's a side effect of bug 1047645, whose root cause was not known. If we want to remove the timeout hack in kill(), we need to re-evaluate bug 1047645, which is a CAF blocker.

What do you suggest?
Flags: needinfo?(etienne)
Thanks for the detailed explanation.

(In reply to Alive Kuo [:alive][NEEDINFO!] from comment #16)
> 1. Attention window is terminated by window.close()
> 2. Attention window tells attention window manager that it wants to close by
> requestClose because it's active now
> 3. Attention window manager is waiting the underlying settings app to be
> repainted by app.ready
> 4. Attention window is timeout-ed because of the timer introduced in bug
> 1047645.

This looks suspicious, the appTransitionController already has a safetytimeout and this one is even using the same duration...

> 5. Attention window destroyed itself and remove this.element

You mean by gecko?

> 6. The ready callback is done and attention window manager try call close on
> the attention window but it failed because this.element is null then.

Could we always go through "immediate-close" in AppTransitionController#requestClose if app.element is falsy?
(if 5. is out of our control we need to guard against it anyway)
Flags: needinfo?(etienne)
(In reply to Etienne Segonzac (:etienne) from comment #17)
> Thanks for the detailed explanation.
> 
> (In reply to Alive Kuo [:alive][NEEDINFO!] from comment #16)
> > 1. Attention window is terminated by window.close()
> > 2. Attention window tells attention window manager that it wants to close by
> > requestClose because it's active now
> > 3. Attention window manager is waiting the underlying settings app to be
> > repainted by app.ready
> > 4. Attention window is timeout-ed because of the timer introduced in bug
> > 1047645.
> 
> This looks suspicious, the appTransitionController already has a
> safetytimeout and this one is even using the same duration...

My guess is that AppWindowManager somehow does not respond to the requestClose, so the timer in appTransitionController doesn't apply.

> 
> > 5. Attention window destroyed itself and remove this.element
> 
> You mean by gecko?

by AppWindow.prototype.destroy()
https://github.com/mozilla-b2g/gaia/blob/master/apps/system/js/app_window.js#L429

> 
> > 6. The ready callback is done and attention window manager try call close on
> > the attention window but it failed because this.element is null then.
> 
> Could we always go through "immediate-close" in
> AppTransitionController#requestClose if app.element is falsy?
> (if 5. is out of our control we need to guard against it anyway)

See my guess:
AppWindow requestClose => AppWindowManager
AppWindowManager fails to reply and call AppWindow.close, hence destroy is never called.

For your proposal, if app.element is null, no event will be dispatched from the element (by publish()), so AttentionWindowManager cannot know that something was closed.
I am looking for an alternative way to fix bug 1047645 from now on.
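
To illustrate the objection to the falsy-element guard, here is a hedged sketch. requestCloseGuarded and notify are hypothetical names, and the returned strings only label the two paths. This is not how AppTransitionController is actually written.

```javascript
// Hypothetical guard; not the real AppTransitionController#requestClose.
function requestCloseGuarded(app, notify) {
  if (!app.element) {
    // Guarded path: avoids touching a missing element, but since events are
    // published from the element, no listener is told about the close.
    return 'immediate-close';
  }
  // Normal path: the element exists, so the closed event reaches listeners.
  notify('closed');
  return 'transition-close';
}

const guardEvents = [];
const destroyedApp = { element: null };
const liveApp = { element: {} };
const destroyedPath = requestCloseGuarded(destroyedApp, (e) => guardEvents.push(e));
const livePath = requestCloseGuarded(liveApp, (e) => guardEvents.push(e));
console.log(destroyedPath, livePath, guardEvents);
```

Only the call with a live element produces a closed notification in this model. The guarded call returns without telling any listener.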
Attached file v2, commit close
What do you think about this one?

@Tapas:
The fix for bug 1047645 is causing us trouble. Unfortunately I am not able to reproduce it on my side, so I need your feedback on this patch to make sure it fixes bug 1047645 for you. If you cannot commit, we may still need to land it and revisit during the CAF run for v2.1.
Sorry for that.
Attachment #8517273 - Flags: review?(etienne)
Attachment #8517273 - Flags: feedback?(tkundu)
Comment on attachment 8517273 [details] [review]
v2, commit close

I'm fine with the change (comments on github), but if we get solid STR for bug 1047645 we should definitely add a marionette test as part of this patch :)
Attachment #8517273 - Flags: review?(etienne) → review+
https://treeherder.mozilla.org/ui/#/jobs?repo=gaia-try&revision=3364e0a197ce
Tests passed. Not sure what's next, but I want to close this and at least land on master by the end of this week.
If :tapas had some problems we could revisit.
(In reply to Alive Kuo [:alive][NEEDINFO!] from comment #22)
> https://treeherder.mozilla.org/ui/#/jobs?repo=gaia-try&revision=3364e0a197ce
> Tests passed. Not sure what's the next but I want to close and at least land
> to master by end of this week.
> If :tapas had some problems we could revisit.

Most likely we can only confirm by the end of next week, as we are blocked by other P1 issues right now. Sorry about that.

Please go ahead and land this fix if you are fine with this patch.
Flags: needinfo?(tkundu)
Attachment #8517273 - Flags: feedback?(tkundu)
Status: REOPENED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Comment on attachment 8517273 [details] [review]
v2, commit close

[Approval Request Comment]
[Bug caused by] (feature/regressing bug #):
Regression of bug 1047645
[User impact] if declined:
Users will see a white screen after BT pairing times out.
[Testing completed]:
Yes
[Risk to taking this patch] (and alternatives if risky):
Riskless
[String changes made]:
No
Attachment #8517273 - Flags: approval-gaia-v2.1?
This issue is verified fixed on Flame 2.2.

Result: The white screen does not appear after selecting "OK" on the dialog.

Device: Flame 2.2 (319mb, KK, Full Flash)
BuildID: 20141107040206
Gaia: 779f05fead3d009f6e7fe713ad0fea16b6f2fb31
Gecko: 64f4392d0bdc
Gonk: 48835395daa6a49b281db62c50805bd6ca24077e
Version: 36.0a1 (2.2 Master)
Firmware: V188
User Agent: Mozilla/5.0 (Mobile; rv:36.0) Gecko/36.0 Firefox/36.0
===============================================

Leaving verifyme for 2.1 uplift.
Status: RESOLVED → VERIFIED
QA Whiteboard: [QAnalyst-Triage+][COM=Bluetooth] → [QAnalyst-Triage?][COM=Bluetooth]
Flags: needinfo?(ktucker)
Keywords: verifyme
Attachment #8517273 - Flags: approval-gaia-v2.1? → approval-gaia-v2.1+
QA Whiteboard: [QAnalyst-Triage?][COM=Bluetooth] → [QAnalyst-Triage+][COM=Bluetooth]
Flags: needinfo?(ktucker)
This issue is verified fixed on Flame 2.1.

Result: The white screen does not appear after selecting "OK" on the dialog.

Device: Flame 2.1 (319mb, KK, Full Flash)
BuildID: 20141110001201
Gaia: 0ec1925fc37b7c71d129ae44e42516a0cfb013c4
Gecko: 97487a2d1ee6
Gonk: 48835395daa6a49b281db62c50805bd6ca24077e
Version: 34.0 (2.1) 
Firmware Version: v188-1
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0
QA Whiteboard: [QAnalyst-Triage+][COM=Bluetooth] → [QAnalyst-Triage?][COM=Bluetooth]
Flags: needinfo?(ktucker)
QA Whiteboard: [QAnalyst-Triage?][COM=Bluetooth] → [QAnalyst-Triage+][COM=Bluetooth]
Flags: needinfo?(ktucker)
Keywords: verifyme
Whiteboard: [2.1-flame-test-run-3] → [2.1-flame-test-run-3] [CR 754769]
Whiteboard: [2.1-flame-test-run-3] [CR 754769] → [caf priority: p2][2.1-flame-test-run-3] [CR 754769]