
[Windows Management][Notification] If the user drags down the notification tray while plugging/unplugging the device the tray will get stuck on the screen, and the user will not be able to interact with anything else

RESOLVED WORKSFORME

Status

Firefox OS
Gaia::System::Window Mgmt
RESOLVED WORKSFORME
3 years ago
3 years ago

People

(Reporter: DerekH, Unassigned)

Tracking

({regression})

unspecified
ARM
Gonk (Firefox OS)
regression

Firefox Tracking Flags

(blocking-b2g:2.5+, b2g-v2.0 unaffected, b2g-v2.1 affected, b2g-v2.2 affected, b2g-master affected)

Details

(Whiteboard: [3.0-Daily-Testing][systemsfe], URL)

Attachments

(1 attachment)

Description:
If the user plugs in or unplugs the phone while pulling the notification tray down, the device becomes unusable until the notification tray is tapped. The home and power buttons remain usable, though.


Repro Steps:
1) Update a Flame to 20150519010201
2) Pull down notification bar while plugging in or unplugging the device

Actual:
Notification tray gets stuck and phone becomes unresponsive to user input until the user interacts with notification tray again


Expected:
Notification tray is able to be pulled down smoothly at all times without issue


Environmental Variables:
Device: Flame 3.0 (319mb)(Kitkat)(Full Flash)
Build ID: 20150519010201
Gaia: 762cbd16712484f93f485e89f5363686540a3db7
Gecko: f65cc0022a0e
Gonk: 040bb1e9ac8a5b6dd756fdd696aa37a8868b5c67
Version: 41.0a1 (3.0)
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:41.0) Gecko/41.0 Firefox/41.0


Repro frequency: 5/10
See attached: Logcat, Video - https://youtu.be/JPIZ0OKGz1A
This issue DOES occur on Flame 2.2

Notification tray gets stuck and phone becomes unresponsive to user input until the user interacts with notification tray again

Environmental Variables:
Device: Flame 2.2 (319mb)(Kitkat)(Full Flash)
Build ID: 20150517162505
Gaia: f73891b8fcc5f34de81868640754f7cc331fa709
Gecko: 8785a53b8d6e
Gonk: bd9cb3af2a0354577a6903917bc826489050b40d
Version: 37.0 (2.2)
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:37.0) Gecko/37.0 Firefox/37.0

=============================================================================================================

This issue does NOT occur on Flame 2.1

Notification tray is able to be pulled down smoothly at all times without issue

Environmental Variables:
Device: Flame 2.1
Build ID: 20150517001201
Gaia: c80865cb0bf73f1b97defbc646083b404feb3ac4
Gecko: 86182f8fc3f1
Gonk: ab265fb203390c70b8f2a054f38cf4b2f2dad70a
Version: 34.0 (2.1)
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(pbylenga)
Summary: [Windows Management][Notification] If the user drags down the notification tray while plugging/unplugging the device → [Windows Management][Notification] If the user drags down the notification tray while plugging/unplugging the device the tray will get stuck on the screen, and the user will not be able to interact with anything else
[Blocking Requested - why for this release]:
Functional regression, starting with a 3.0 nomination.

Requesting a window.
blocking-b2g: --- → 3.0?
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage?
Flags: needinfo?(pbylenga)
Keywords: regressionwindow-wanted
blocking-b2g: 3.0? → 3.0+
This issue reproduces much more consistently if the user has USB storage enabled (9/10 times instead of 5/10).  Redoing the branch checks first with this new information.
Keywords: qawanted
This issue affects 2.1 but does NOT affect 2.0 and is thus still a regression.

Issue DOES occur

Environmental Variables:
Device: Flame 2.1
BuildID: 20150519204142
Gaia: c80865cb0bf73f1b97defbc646083b404feb3ac4
Gecko: bf8615dd05d6
Version: 34.0 (2.1) 
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0

Issue does NOT occur

Environmental Variables:
Device: Flame 2.0
BuildID: 20150519205445
Gaia: 84898cadf28b1a1fcd03b726cff658de470282f0
Gecko: 84aec7fd8770
Version: 32.0 (2.0) 
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:32.0) Gecko/32.0 Firefox/32.0
QA Whiteboard: [QAnalyst-Triage? → [QAnalyst-Triage?]
status-b2g-v2.0: --- → unaffected
status-b2g-v2.1: unaffected → affected
Flags: needinfo?(ktucker)
Keywords: qawanted
QA Contact: jmercado
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
All commits in the pushlog come from fx-team inbound, which I don't have access to, so I can't narrow this down further.

Central Regression Window:

Last Working 
Environmental Variables:
Device: Flame 3.0
BuildID: 20150401093537
Gaia: 4bb3a933bd805e8df1e11827cb247754c3565b0b
Gecko: e5b72a8edb82
Version: 40.0a1 (3.0) 
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:40.0) Gecko/40.0 Firefox/40.0

First Broken 
Environmental Variables:
Device: Flame 3.0
BuildID: 20150401134635
Gaia: 4bb3a933bd805e8df1e11827cb247754c3565b0b
Gecko: e044f4d172e2
Version: 40.0a1 (3.0) 
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:40.0) Gecko/40.0 Firefox/40.0

Last Working gaia / First Broken gecko - Issue DOES occur
Gaia: 4bb3a933bd805e8df1e11827cb247754c3565b0b
Gecko: e044f4d172e2

First Broken gaia / Last Working gecko - Issue does NOT occur
Gaia: 4bb3a933bd805e8df1e11827cb247754c3565b0b
Gecko: e5b72a8edb82

Gecko Pushlog: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=e5b72a8edb82&tochange=e044f4d172e2
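The window above can be sketched in a few lines of Python. This is only an illustration: the base URL and the two Gecko revisions are copied from this comment, while the function names pushlog_url and suspect_component are hypothetical helpers, not part of any Mozilla tooling. The second helper just encodes the swap logic used above (old Gaia with new Gecko versus new Gaia with old Gecko).

```python
from urllib.parse import urlencode

# Base URL taken from the pushlog link in this comment.
PUSHLOG_BASE = "http://hg.mozilla.org/mozilla-central/pushloghtml"


def pushlog_url(last_working, first_broken, base=PUSHLOG_BASE):
    """Build a pushlog URL covering last_working..first_broken (hypothetical helper)."""
    query = urlencode({"fromchange": last_working, "tochange": first_broken})
    return f"{base}?{query}"


def suspect_component(occurs_old_gaia_new_gecko, occurs_new_gaia_old_gecko):
    """Interpret the two cross-flashed builds (hypothetical helper).

    Each argument is True if the issue reproduced on that combination.
    """
    if occurs_old_gaia_new_gecko and not occurs_new_gaia_old_gecko:
        return "gecko"
    if occurs_new_gaia_old_gecko and not occurs_old_gaia_new_gecko:
        return "gaia"
    return "inconclusive"


if __name__ == "__main__":
    # Revisions from the Last Working / First Broken builds above.
    print(pushlog_url("e5b72a8edb82", "e044f4d172e2"))
    # Issue occurs with first broken gecko, not with last working gecko.
    print(suspect_component(True, False))
```

Note that both builds in this window share the same Gaia revision, so the swap can only ever point at Gecko here.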
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
Keywords: regressionwindow-wanted
Naoki, can you take a look at this please?
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker) → needinfo?(nhirata.bugzilla)
Is this message always in the logcat when it's not working? 

###!!! [Parent][MessageChannel] Error: (msgtype=0x3A0033,name=PContent::Msg_NotifyIdleObserver) Channel error: cannot send/recv
Yes, this message seems to appear very frequently when this bug occurs.
I guess it's some weird race condition.

I can't seem to reproduce it on today's pvtbuild; I spoke with DerekH via irc and he can.  Not sure what I'm doing wrong in order to reproduce the bug...

Still trying to figure it out.
Ah!  I figured it out.  It looks like enabling USB Storage and having the USB connect/disconnect status visible in Settings makes a difference.

I just realized, we also need to fix access so that QAnalyst can access the fx-team inbound builds.  bug 1167473 was filed to get them access.
I had the adb devtools on as well as bluetooth and nfc.  Seems like having a lot of stuff open helps see the issue more.  Having said that I can't seem to reproduce on the inbound fxteam builds.

looking at the pushlog, ... it's pushing from mc to fxteam right?  doesn't that mean that it shouldn't be the cause of the glitch?

Derek or Jayme, could you please take a look to see if you get different results?  The bug doesn't seem to be occurring for me in these builds.
https://drive.google.com/a/mozilla.com/file/d/0B_0LdM1CVycIa0V2V1ZqMzNVQ0U/view?usp=sharing
https://drive.google.com/a/mozilla.com/file/d/0B_0LdM1CVycIYW55anlGc0FheEU/view?usp=sharing
Flags: needinfo?(jmercado)
Flags: needinfo?(dharris)
Flags: needinfo?(nhirata.bugzilla)
Dave, my first guess would be that the Msg_NotifyIdleObserver error msg indicates some real issue. Can you take a look here if unplugging the phone and the idleobserver are somehow related?
Flags: needinfo?(dhylands)
(In reply to Naoki Hirata :nhirata (please use needinfo instead of cc) from comment #13)
> I had the adb devtools on as well as bluetooth and nfc.  Seems like having a
> lot of stuff open helps see the issue more.  Having said that I can't seem
> to reproduce on the inbound fxteam builds.
> 
> looking at the pushlog, ... it's pushing from mc to fxteam right?  doesn't
> that mean that it shouldn't be the cause of the glitch?
> 
> Derek or Jayme, could you please take a look to see if you get different
> results?  The bug doesn't seem to be occurring for me in these builds.
> https://drive.google.com/a/mozilla.com/file/d/0B_0LdM1CVycIa0V2V1ZqMzNVQ0U/
> view?usp=sharing
> https://drive.google.com/a/mozilla.com/file/d/0B_0LdM1CVycIYW55anlGc0FheEU/
> view?usp=sharing

I was able to reproduce this issue on both the provided builds without any apps open or USB sharing enabled.

On the first shared build I am seeing this error show up very frequently:
05-22 18:07:30.719: E/HWComposer(207): Non-uniform vsync interval: 17315260
05-22 18:07:30.779: E/HWComposer(207): Non-uniform vsync interval: 17316407
05-22 18:07:30.869: E/HWComposer(207): Non-uniform vsync interval: 17315886
05-22 18:07:30.929: E/HWComposer(207): Non-uniform vsync interval: 17316094
05-22 18:07:30.989: E/HWComposer(207): Non-uniform vsync interval: 17307603
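For anyone comparing logcats, a rough Python sketch for pulling the interval values out of lines like the ones above. It assumes the values are nanoseconds and that the panel nominally runs at 60 Hz; neither is confirmed in this bug. The names vsync_intervals and summarize are made up for illustration.

```python
import re
from statistics import mean

# Matches the HWComposer message text shown above.
VSYNC_RE = re.compile(r"Non-uniform vsync interval: (\d+)")

# Assumption: a 60 Hz display, with intervals reported in nanoseconds.
NOMINAL_60HZ_NS = 1_000_000_000 / 60


def vsync_intervals(lines):
    """Return the interval values found in the given logcat lines."""
    return [int(m.group(1)) for line in lines if (m := VSYNC_RE.search(line))]


def summarize(intervals_ns):
    """Mean interval, min-to-max spread, and offset from the assumed 60 Hz period."""
    avg = mean(intervals_ns)
    return {
        "mean_ms": avg / 1e6,
        "spread_us": (max(intervals_ns) - min(intervals_ns)) / 1e3,
        "offset_from_60hz_us": (avg - NOMINAL_60HZ_NS) / 1e3,
    }


if __name__ == "__main__":
    sample = [
        "05-22 18:07:30.719: E/HWComposer(207): Non-uniform vsync interval: 17315260",
        "05-22 18:07:30.779: E/HWComposer(207): Non-uniform vsync interval: 17316407",
        "05-22 18:07:30.869: E/HWComposer(207): Non-uniform vsync interval: 17315886",
        "05-22 18:07:30.929: E/HWComposer(207): Non-uniform vsync interval: 17316094",
        "05-22 18:07:30.989: E/HWComposer(207): Non-uniform vsync interval: 17307603",
    ]
    print(summarize(vsync_intervals(sample)))
```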


On the second shared build I am seeing this message consistently: 
IPDL protocol error: Handler for UpdateNoSwap returned error code
05-22 17:44:18.024   211   779 I Gecko   : 
05-22 17:44:18.024   211   779 I Gecko   : ###!!! [Child][DispatchAsyncMessage] Error: (msgtype=0x78000A,name=PLayerTransaction::Msg_UpdateNoSwap) Processing error: message was deserialized, but the handler returned false (indicating failure)
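To check how often each of these IPC errors shows up in a given logcat, something like the following Python sketch could be used. The regular expression is written against the two "###!!!" lines quoted in this bug only; other Gecko error formats may not match. The names parse_ipc_error and count_ipc_errors are hypothetical.

```python
import re
from collections import Counter

# Pattern derived from the two "###!!!" lines quoted in this bug.
IPC_ERROR_RE = re.compile(
    r"###!!! \[(?P<side>\w+)\]\[(?P<where>\w+)\] Error: "
    r"\(msgtype=(?P<msgtype>0x[0-9A-Fa-f]+),name=(?P<name>[\w:]+)\) "
    r"(?P<detail>.*)"
)


def parse_ipc_error(line):
    """Return a dict of fields for an IPC error line, or None if it does not match."""
    m = IPC_ERROR_RE.search(line)
    return m.groupdict() if m else None


def count_ipc_errors(lines):
    """Count IPC errors per (side, message name)."""
    counts = Counter()
    for line in lines:
        fields = parse_ipc_error(line)
        if fields:
            counts[(fields["side"], fields["name"])] += 1
    return counts


if __name__ == "__main__":
    log = [
        "###!!! [Parent][MessageChannel] Error: (msgtype=0x3A0033,"
        "name=PContent::Msg_NotifyIdleObserver) Channel error: cannot send/recv",
        "05-22 17:44:18.024   211   779 I Gecko   : ###!!! [Child][DispatchAsyncMessage] "
        "Error: (msgtype=0x78000A,name=PLayerTransaction::Msg_UpdateNoSwap) Processing error: "
        "message was deserialized, but the handler returned false (indicating failure)",
    ]
    for key, n in count_ipc_errors(log).items():
        print(key, n)
```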
Flags: needinfo?(nhirata.bugzilla)
Flags: needinfo?(jmercado)
Flags: needinfo?(dharris)
(In reply to Gregor Wagner [:gwagner] from comment #14)
> Dave, my first guess would be that the Msg_NotifyIdleObserver error msg
> indicates some real issue. Can you take a look here if unplugging the phone
> and the idleobserver are somehow related?

Not directly that I'm aware of. It looks like observers which register with the idle service get told about becoming idle and no longer being idle.

If you unplug the USB cable and a volume was being shared with the PC, that will trigger the AutoMounter, which will try to mount the volume. Before mounting the volume, vold will do a chkdsk on the volume, which involves launching another process, which may very well cause child processes to be killed due to OOM.

The "Channel error: cannot send/recv" error typically happens when the parent tries to send something to the child and the child was killed off for some reason.

So it could be that a child which registered to be notified of idle events got killed, and the error is just because the parent is telling the child that the system isn't idle anymore. It depends on how "idle" is determined, because a bunch of stuff happens anytime the USB cable is plugged in or unplugged.
Flags: needinfo?(dhylands)
Based on what's been said and the regression window, I think it's not just one single change, but a change in memory consumption?

Based on further testing:
"Using the three builds that you linked I was unable to reproduce the bug on the second 2 builds which were (20150331203056) and (20150401023004). I tried turning USB sharing on, Wi-Fi on, and opening multiple apps to slow down the memory but I was not able to get the notification bar to stick. I was only able to get this issue to occur 1 time yesterday on build 2015033123308 after about 50 attempts." - from Derek.

If I was to include 2015033123308 as part of the regression range, end point as bad : http://hg.mozilla.org/integration/fx-team/pushloghtml?fromchange=6b167d125ec9&tochange=3eb16083368b

If I was to mark 20150401023004 as good, http://hg.mozilla.org/integration/fx-team/pushloghtml?fromchange=b0ed0d8f967a&tochange=3b159e46a90d

There isn't any gaia change between 20150331203056 and 2015033123308, or between 20150401023004 and 20150401053004.

The only diff is the gonk layer : 
nhirata-19333:Desktop nhirata$ diff 20150401023004.xml 20150401053004.xml 
155c155
<   <!--Mercurial-Information: <project name="https://hg.mozilla.org/integration/fx-team" path="gecko" remote="hgmozillaorg" revision="b0ed0d8f967a"/>-->
---
>   <!--Mercurial-Information: <project name="https://hg.mozilla.org/integration/fx-team" path="gecko" remote="hgmozillaorg" revision="3b159e46a90d"/>-->
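The same comparison can be scripted. Below is a minimal Python sketch that reads the Mercurial-Information comment out of manifest lines shaped like the two in the diff above; it has not been checked against other manifest layouts, and the function name mercurial_info is made up for illustration.

```python
import re

# Matches the commented-out <project .../> element shown in the diff above.
HG_INFO_RE = re.compile(r"<!--Mercurial-Information:\s*<project\s+([^>]*?)/>\s*-->")
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


def mercurial_info(line):
    """Return the project attributes from a Mercurial-Information comment, or None."""
    m = HG_INFO_RE.search(line)
    if not m:
        return None
    return dict(ATTR_RE.findall(m.group(1)))


if __name__ == "__main__":
    old = ('<   <!--Mercurial-Information: <project name="https://hg.mozilla.org/integration/fx-team" '
           'path="gecko" remote="hgmozillaorg" revision="b0ed0d8f967a"/>-->')
    new = ('>   <!--Mercurial-Information: <project name="https://hg.mozilla.org/integration/fx-team" '
           'path="gecko" remote="hgmozillaorg" revision="3b159e46a90d"/>-->')
    a, b = mercurial_info(old), mercurial_info(new)
    # Print which path changed and between which revisions.
    print(a["path"], a["revision"], "->", b["revision"])
```

The two revisions it pulls out are the same ones used in the second fx-team pushlog link above.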
Flags: needinfo?(nhirata.bugzilla)
I can't reproduce this any more. Is this still happening?
Keywords: qawanted
This issue no longer reproduces on the latest Nightly Flame or Aries builds.

Actual Results: The notification tray will still be active and the phone will still be interactive.

Environmental Variables:
Device: Aries 2.5
BuildID: 20150827170205
Gaia: d784c81961d82cbe9e111405468c590a8345856c
Gecko: ca086f9ef8bca2d6cdfa79bfc4c854f56a59859e
Gonk: 2916e2368074b5383c80bf5a0fba3fc83ba310bd
Version: 43.0a1 (2.5) 
Firmware Version: D5803_23.1.A.1.28_NCB.ftf
User Agent: Mozilla/5.0 (Mobile; rv:43.0) Gecko/43.0 Firefox/43.0

Environmental Variables:
Device: Flame 2.5
BuildID: 20150827083151
Gaia: d784c81961d82cbe9e111405468c590a8345856c
Gecko: b33eae31bd7188024b54228e0c0345800a65e595
Gonk: c4779d6da0f85894b1f78f0351b43f2949e8decd
Version: 43.0a1 (2.5) 
Firmware Version: v18D
User Agent: Mozilla/5.0 (Mobile; rv:43.0) Gecko/43.0 Firefox/43.0
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
Keywords: qawanted
Closing this issue since it no longer reproduces.  Please reopen if it is seen again.
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → WORKSFORME
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)