[Aries] wifi fails to connect if data connection with SIM is enabled in the FTU

RESOLVED FIXED in Firefox 41

People

(Reporter: nhirata, Assigned: hchang)

Tracking

unspecified
2.2 S14 (12june)
ARM
Gonk (Firefox OS)

Firefox Tracking Flags

(blocking-b2g:2.5+, firefox39 wontfix, firefox40 wontfix, firefox41 fixed, b2g-master fixed)

Details

(Whiteboard: [spark])

Attachments

(4 attachments, 1 obsolete attachment)

Created attachment 8609110 [details]
logcat.txt

1. download build : 
https://tools.taskcluster.net/index/artifacts/#gecko.v1.mozilla-central.revision.linux.2f6ea66057fe37794d7fc061407495cdcece5443.aries/gecko.v1.mozilla-central.revision.linux.2f6ea66057fe37794d7fc061407495cdcece5443.aries.opt

2. flash the device
3. when going through the FTU, turn on SIM data
4. then try to connect to wifi

Expected: wifi connects to Mozilla-Guest
Actual: wifi fails
[Blocking Requested - why for this release]: FTU issue.
blocking-b2g: --- → spark?
Flags: needinfo?(lissyx+mozillians)
Whiteboard: [spark]
Association with the Mozilla Guest AP seems to be working. What are you flashing? Gecko/Gaia or the whole device?
Can you provide dmesg output?
Flags: needinfo?(lissyx+mozillians) → needinfo?(nhirata.bugzilla)
It would have been much more helpful to get a logcat with timestamps (logcat -v threadtime). Apart from that, the symptoms and logs look like they might be bug 1154690.
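As a rough illustration of why the threadtime format helps: with timestamps, the gap between two events can be computed directly. The sketch below assumes the usual threadtime layout (date, time, PID, TID, level, tag, message) and uses two WifiWorker lines from the second logcat attached to this bug. The function names and the default year are assumptions made for the example.

```python
import re
from datetime import datetime

# Assumed threadtime layout: "MM-DD HH:MM:SS.mmm  PID  TID LEVEL TAG: message"
THREADTIME = re.compile(
    r"^(\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+(\d+)\s+(\d+)\s+([VDIWEF])\s+(.*?)\s*:\s(.*)$"
)


def parse_threadtime(line, year=2015):
    """Return (timestamp, tag, message), or None if the line does not match.

    threadtime lines carry no year; 2015 is an assumption based on the
    dates that appear elsewhere in this bug.
    """
    m = THREADTIME.match(line)
    if m is None:
        return None
    stamp, _pid, _tid, _level, tag, message = m.groups()
    when = datetime.strptime(f"{year}-{stamp}", "%Y-%m-%d %H:%M:%S.%f")
    return when, tag, message


def seconds_between(first_line, second_line):
    """Elapsed seconds between two threadtime-formatted lines."""
    start = parse_threadtime(first_line)[0]
    end = parse_threadtime(second_line)[0]
    return (end - start).total_seconds()


associated = (
    "05-26 14:38:00.621   322   322 I Gecko   : -*- WifiWorker component: "
    "Event coming in: CTRL-EVENT-STATE-CHANGE id=0 state=9 "
    "BSSID=ac:4b:c8:0f:79:89 SSID=Mozilla Guest"
)
disconnected = (
    "05-26 14:38:06.801   322   322 I Gecko   : -*- WifiWorker component: "
    "Event coming in: CTRL-EVENT-DISCONNECTED bssid=ac:4b:c8:0f:79:89 "
    "reason=3 locally_generated=1"
)

gap = seconds_between(associated, disconnected)
print(f"disconnected {gap:.3f} s after association")
```

For these two lines the gap is 6.180 seconds, which is the kind of detail that is lost when the logcat has no timestamps.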
On an up-to-date Z3 device I'm hitting a similar issue: FTU gets blocked on "Scanning for networks" ...
(In reply to Alexandre LISSY :gerard-majax from comment #4)
> On an uptodate Z3 device I'm hitting a similar issue: FTU gets blocked on
> "Scanning for networks" ...

Never mind, it's an unrelated issue.
Created attachment 8610832 [details]
logcat.txt

I think I figured out a better way to reproduce the issue.

For wifi, select Mozilla (don't wait to sign in), then switch to Mozilla Guest again.

Logcat attached.
Flags: needinfo?(nhirata.bugzilla)
Please provide a video.
blocking-b2g: spark? → spark+
Flags: needinfo?(nhirata.bugzilla)
(In reply to Naoki Hirata :nhirata (please use needinfo instead of cc) from comment #6)
> Created attachment 8610832 [details]
> logcat.txt
> 
> I think I figured out a better way to reproduce the issue.
> 
> For wifi, select Mozilla (don't wait to sign in ) and switch to Mozilla
> Guest again.
> 
> Logcat attached.

I do see Mozilla Guest being associated:
> 05-26 14:38:00.621   322   322 I Gecko   : -*- WifiWorker component: Event coming in: CTRL-EVENT-STATE-CHANGE id=0 state=9 BSSID=ac:4b:c8:0f:79:89 SSID=Mozilla Guest

And then we disassociate:
> 05-26 14:38:06.801   322   322 I Gecko   : -*- WifiWorker component: Event coming in: CTRL-EVENT-DISCONNECTED bssid=ac:4b:c8:0f:79:89 reason=3 locally_generated=1

If you could cross-check the output of dmesg after a fresh boot to confirm the wifi driver version, that would be valuable. In the dmesg output, look for a string like: "Dongle Host Driver, version ". We expect 1.88.57.4 to be the one behaving correctly.
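A minimal sketch of that check, assuming the dmesg output has already been saved as text. Only the "Dongle Host Driver, version " prefix and the expected version come from this comment; the helper name and the sample input line are hypothetical.

```python
import re

EXPECTED_VERSION = "1.88.57.4"  # version this comment expects to behave correctly

# Matches the string mentioned above, followed by a dotted version number.
DHD_PATTERN = re.compile(r"Dongle Host Driver, version (\d+(?:\.\d+)+)")


def find_dhd_version(dmesg_text):
    """Return the first driver version found in the dmesg text, or None."""
    m = DHD_PATTERN.search(dmesg_text)
    return m.group(1) if m else None


# Hypothetical input for illustration; real dmesg lines carry more context.
sample = "Dongle Host Driver, version 1.88.57.4\n"

version = find_dhd_version(sample)
if version is None:
    print("driver version string not found")
elif version == EXPECTED_VERSION:
    print(f"driver {version}: expected version")
else:
    print(f"driver {version}: differs from expected {EXPECTED_VERSION}")
```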

FYI, on a fresh Z3 build from current master, with the driver fix as of bug 1154690, I don't have any issue:
 - reset device,
 - restart device into FTU,
 - enable data on Orange F SIM, wait until statusbar shows 'H' or 'H+',
 - go to next panel, connect to an open Wifi.

If I add the STR from comment 6, it looks like I do trigger the same issue. It even looks like there is no need to enable data: I could reproduce by just tapping a WPA-PSK SSID, cancelling the form asking for credentials, and then tapping an SSID for an open network.

I remember seeing bugs about being unable to connect because of forgetting networks in Settings. Could this be related?
The more I investigate on my Z3 with wifi driver 1.88.57.4, the more I observe behavior similar to bug 1154690, on master from May 26th. In the meantime, I got reports of repeated wifi connection issues on Flame and on Open C. Those might involve IPv6-enabled networks. The network I tested had IPv6, as Mozilla Guest does.
Flags: needinfo?(vchang)
Not sure if bug 1164322 helps to prevent this issue. I saw that the Android framework disables the automatic reconnect timer in wpa_supplicant, which could probably prevent wpa_supplicant from going insane in certain corner cases.
Flags: needinfo?(vchang)
We need your help, Henry.
Assignee: nobody → hchang

Comment 12

4 years ago
Henry,

Do you have an update on this? It's critical for Spark to have Wifi working.
I think bug 1164322 is what I am experiencing. I made this bug blocked by that bug, and marked that bug a blocker. Maybe the right thing to do is to dupe this bug to that one?
Flags: needinfo?(nhirata.bugzilla)
(Assignee)

Comment 14

4 years ago
I enabled the DHCP debug log [1] and tried the STR. Most of the failures show
something like the following log:

I/DHCPCD  ( 1665): version 5.5.6 starting
D/DHCPCD  ( 1665): wlan0: using hwaddr bc:6e:64:db:ba:28
D/DHCPCD  ( 1665): wlan0: executing `/system/etc/dhcpcd/dhcpcd-run-hooks', reason PREINIT
D/DHCPCD  ( 1665): wlan0: executing `/system/etc/dhcpcd/dhcpcd-run-hooks', reason CARRIER
D/DHCPCD  ( 1665): wlan0: reading lease `/data/misc/dhcp/dhcpcd-wlan0.lease'
I/DHCPCD  ( 1665): wlan0: rebinding lease of 10.247.37.153
D/DHCPCD  ( 1665): wlan0: sending REQUEST (xid 0xb710f5b), next in 3.21 seconds
D/DHCPCD  ( 1665): wlan0: sending REQUEST (xid 0xb710f5b), next in 8.48 seconds
I/DHCPCD  ( 1665): wlan0: broadcasting for a lease
D/DHCPCD  ( 1665): wlan0: sending DISCOVER (xid 0x81571f1c), next in 3.79 seconds
D/DHCPCD  ( 1665): wlan0: sending DISCOVER (xid 0x81571f1c), next in 8.42 seconds
D/DHCPCD  ( 1665): wlan0: sending DISCOVER (xid 0x81571f1c), next in 16.50 seconds
E/DHCPCD  ( 1665): timed out
W/DHCPCD  ( 1665): allowing 8 seconds for IPv4LL timeout
I/DHCPCD  ( 1665): wlan0: probing for an IPv4LL address
I/DHCPCD  ( 1665): wlan0: checking for 169.254.59.207
D/DHCPCD  ( 1665): wlan0: sending ARP probe (1 of 3), next in 1.16 seconds
I/wpa_supplicant( 1648): wlan0: CTRL-EVENT-DISCONNECTED bssid=3c:94:d5:76:17:48 reason=3 locally_generated=1
I/DHCPCD  ( 1665): wlan0: carrier lost
D/DHCPCD  ( 1665): wlan0: executing `/system/etc/dhcpcd/dhcpcd-run-hooks', reason NOCARRIER

And I am investigating the root cause of the DHCP issue now.


[1] http://androidxref.com/4.4.4_r1/xref/external/dhcpcd/common.h#86
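The pattern in the log above (REQUEST retries, DISCOVER retries, a timeout, then a locally generated disconnect) can be spotted mechanically. The following is only a sketch: the sample lines are copied from the log in this comment, while the function name and the summary keys are assumptions for the example.

```python
import re

# "sending REQUEST (xid 0x...)" or "sending DISCOVER (xid 0x...)"
SEND = re.compile(r"sending (REQUEST|DISCOVER) \(xid 0x[0-9a-f]+\)")


def summarize_dhcp(log_lines):
    """Count DHCP sends and flag the timeout/disconnect markers seen in the log."""
    summary = {
        "REQUEST": 0,
        "DISCOVER": 0,
        "timed_out": False,
        "local_disconnect": False,
        "carrier_lost": False,
    }
    for line in log_lines:
        m = SEND.search(line)
        if m:
            summary[m.group(1)] += 1
        if "DHCPCD" in line and "timed out" in line:
            summary["timed_out"] = True
        if "CTRL-EVENT-DISCONNECTED" in line and "locally_generated=1" in line:
            summary["local_disconnect"] = True
        if "carrier lost" in line:
            summary["carrier_lost"] = True
    return summary


# Lines copied from the log in this comment.
log = [
    "I/DHCPCD  ( 1665): wlan0: rebinding lease of 10.247.37.153",
    "D/DHCPCD  ( 1665): wlan0: sending REQUEST (xid 0xb710f5b), next in 3.21 seconds",
    "D/DHCPCD  ( 1665): wlan0: sending REQUEST (xid 0xb710f5b), next in 8.48 seconds",
    "I/DHCPCD  ( 1665): wlan0: broadcasting for a lease",
    "D/DHCPCD  ( 1665): wlan0: sending DISCOVER (xid 0x81571f1c), next in 3.79 seconds",
    "D/DHCPCD  ( 1665): wlan0: sending DISCOVER (xid 0x81571f1c), next in 8.42 seconds",
    "D/DHCPCD  ( 1665): wlan0: sending DISCOVER (xid 0x81571f1c), next in 16.50 seconds",
    "E/DHCPCD  ( 1665): timed out",
    "W/DHCPCD  ( 1665): allowing 8 seconds for IPv4LL timeout",
    "I/wpa_supplicant( 1648): wlan0: CTRL-EVENT-DISCONNECTED bssid=3c:94:d5:76:17:48 reason=3 locally_generated=1",
    "I/DHCPCD  ( 1665): wlan0: carrier lost",
]

result = summarize_dhcp(log)
print(result)
```

Running this over a full logcat would show how often an attempt ends in a timeout followed by a locally generated disconnect, as opposed to obtaining a lease.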
Thanks Henry, that also reminds me of some similar behavior I saw while debugging my IPv6/WiFi issues.
Henry, do you have any updates on this?
Flags: needinfo?(hchang)
(Assignee)

Comment 17

4 years ago
(In reply to Doug Sherk (:drs) (use needinfo?) from comment #16)
> Henry, do you have any updates on this?

Sorry for not having updated for a while.

For the logs in comment 14, I guess an incorrect or corrupt '/data/misc/dhcp/dhcpcd-wlan0.lease'
was read to send the first REQUEST, so there was no response from the DHCP server.
But this doesn't explain why there was no response to the subsequent DISCOVER packets.
I am trying to set up a laptop to monitor the DHCP packets between devices.

I also see another weird case: the callback of a previous failed or interrupted DHCP
request asynchronously disconnects the ongoing connection. AOSP seems to use a
DHCP state machine [1] to manage DHCP requests. I am not sure whether that is necessary for
us to avoid all the DHCP issues. By the way, Bug 1152991 also has a DHCP no-response
issue after reboot.

[1] http://androidxref.com/5.1.0_r1/xref/frameworks/base/core/java/android/net/DhcpStateMachine.java
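One common way to keep a stale asynchronous callback from affecting a newer request is to tag each request with a generation number and ignore results whose tag is out of date. The sketch below illustrates that general idea only. It is not the patch attached to this bug, nor AOSP's DhcpStateMachine, and the class and method names are hypothetical.

```python
class DhcpRequestGuard:
    """Ignore results from any DHCP request that has been superseded."""

    def __init__(self):
        self._generation = 0   # bumped every time a new request starts
        self.disconnects = 0   # how many times we asked for a disconnect

    def start_request(self):
        """Begin a new request and return the callback for its result."""
        self._generation += 1
        token = self._generation

        def on_result(success):
            if token != self._generation:
                # A newer request has started since; this result is stale.
                return "ignored"
            if not success:
                self.disconnects += 1
                return "disconnect"
            return "connected"

        return on_result


guard = DhcpRequestGuard()
first = guard.start_request()    # user taps network A
second = guard.start_request()   # user quickly switches to network B

print(first(False))   # late failure for A is dropped
print(second(True))   # B completes normally
print(guard.disconnects)
```

In this sketch the late failure for the first network returns "ignored" and never triggers a disconnect, so the connection to the second network is left alone.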
Flags: needinfo?(hchang)
(Assignee)

Comment 18

4 years ago
Some updates below.

1) After updating the kernel driver [1], I don't see any DHCP non-responding
   issue.

2) If we switch networks very quickly, the previous DHCP request is
   cancelled due to [2], and a "DISCONNECT" command [3] is issued to wpa_supplicant.

I currently have no perfect solution to this issue. A temporary solution is to
remove [4], which is for DHCP retry.

[1] https://github.com/mozilla-b2g/b2g-manifest/pull/337/commits
[2] https://dxr.mozilla.org/mozilla-central/source/dom/wifi/WifiNetUtil.jsm#34
[3] https://dxr.mozilla.org/mozilla-central/source/dom/wifi/WifiWorker.js?from=wifiworker.js#665
[4] https://dxr.mozilla.org/mozilla-central/source/dom/wifi/WifiWorker.js?from=wifiworker.js#658-667
Henry, can we get a temporary fix (your point [4]?) landed and improve on it later? We're having several related issues, and it would be good to know if this bug is the cause of them or not.
(Assignee)

Comment 21

4 years ago
Attached a patch which I think is close to the final solution. I tested with the patch a couple of
times (quickly switching between networks) and saw no issue at all. Could anyone help test this patch
as well? Thanks!
Thanks, Henry. I'll ask others to test this.
Comment on attachment 8616622 [details] [diff] [review]
Bug1167466.patch

Naoki, could you make a build with this patch applied? We can distribute it to people who have been having this problem.

Henry, due to our timelines, we should just go ahead and land this ASAP but assume that it doesn't fix the issue, and thus continue working on it. To be clear, I haven't tried it yet, and neither has anybody else, so I don't have any reason to believe that it's still broken. But since we only have a week left to fix this, we should explore it from all sides, just to be safe. Thanks for your understanding.
Flags: needinfo?(nhirata.bugzilla)
Flags: needinfo?(hchang)
Build in progress.
(Assignee)

Updated

4 years ago
Flags: needinfo?(hchang)
Attachment #8616622 - Flags: review?(changyihsin)
Comment on attachment 8616622 [details] [diff] [review]
Bug1167466.patch

Review of attachment 8616622 [details] [diff] [review]:
-----------------------------------------------------------------

Thank you, Henry.
Attachment #8616622 - Flags: review?(changyihsin) → review+
Build here.  flash.sh script included:
https://drive.google.com/a/mozilla.com/file/d/0B_0LdM1CVycISVhDaWp5MzZ1aVk/view?usp=sharing

Build ID               20150608164528
Gaia Revision          92ce25e46faf3d3c66aa884759ff84ede039f71d
Gaia Date              2015-06-08 20:19:43
Gecko Revision         https://hg.mozilla.org/mozilla-central/rev/ec02dc4772d2
Gecko Version          41.0a1
Device Name            aries
Firmware(Release)      4.4.2
Firmware(Incremental)  eng.nhirata.20150608.163847
Firmware Date          Mon Jun  8 16:40:17 PDT 2015
Bootloader             s1



changeset:   247580:ec02dc4772d2
tag:         Bug1167466.patch
tag:         qbase
tag:         qtip
tag:         tip
user:        hchang <hchang@mozilla.com>
date:        Mon Jun 08 18:49:50 2015 +0800
summary:     [PATCH] Bug 1167466 - Prevent from previous failed DHCP callback

changeset:   247579:e10e2e8d8bf2
tag:         qparent
user:        Kim Moir <kmoir@mozilla.com>
date:        Mon Jun 08 15:14:54 2015 -0400
summary:     a=RyanVM backed out changeset 64d4c71b2def no bug
Wifi seems to work better. I haven't had any issues so far.

Adding qawanted so others can try switching wifi networks and see whether it is better for them. Also NI? on Marcia, as she mentioned she had issues.
Flags: needinfo?(nhirata.bugzilla) → needinfo?(mozillamarcia.knous)
Keywords: qawanted
(Assignee)

Comment 28

4 years ago
Created attachment 8617164 [details] [diff] [review]
Bug1167466 (Refined commit message, carry r+)
Attachment #8616622 - Attachment is obsolete: true
Attachment #8617164 - Flags: review+
Created attachment 8617511 [details]
screenshot of one issue observed

(In reply to Naoki Hirata :nhirata (please use needinfo instead of cc) from comment #27)
> Qawanted for others to try switching wifi networks and see if it seems
> better for them.

Previously it seemed to have trouble connecting to a guest wifi right after I enabled Data connection on the previous page; now it seems to work OK. Switching wifi networks now also seems okay, but the problem where it visually keeps trying to connect to the guest wifi even after I switch to another network persists.

I've attached a screenshot showing that I've switched to connecting to 'Asus_2.4GHz' but it seems to keep trying to connect to 'TRcenter-guest'. Note that I don't have the password to access TRcenter-guest, which could be why it appears to keep connecting to it(?). It doesn't stop me from connecting to the hotspot that I want, so I think it's okay.
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
Keywords: qawanted
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
(Assignee)

Updated

4 years ago
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/475a5b2d58db
Status: NEW → RESOLVED
Last Resolved: 4 years ago
status-firefox41: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → 2.2 S14 (12june)

Updated

4 years ago
Duplicate of this bug: 1152991
Duplicate of this bug: 1174004
blocking-b2g: spark+ → 2.5+
status-b2g-v2.5: --- → fixed
status-b2g-master: --- → fixed
status-firefox39: --- → wontfix
status-firefox40: --- → wontfix
status-b2g-v2.5: fixed → ---
Flags: needinfo?(mozillamarcia.knous)