Closed
Bug 987760
Opened 11 years ago
Closed 10 years ago
[v1.4][v2.0] Investigate test_settings_wifi.py failure
Categories
(Firefox OS Graveyard :: Gaia::UI Tests, defect, P1)
Tracking
(tracking-b2g:backlog)
RESOLVED
FIXED
tracking-b2g | backlog |
People
(Reporter: RobertC, Unassigned)
References
Details
(Keywords: qablocker, regression, Whiteboard: [fromAutomation][p=2], Priority 2)
Attachments
(23 files)
115.49 KB,
text/plain
|
Details | |
1.75 MB,
text/html
|
Details | |
4.23 MB,
application/zip
|
Details | |
9.40 MB,
application/zip
|
Details | |
29.10 KB,
image/png
|
Details | |
6.31 MB,
text/plain
|
Details | |
5.12 MB,
application/x-gzip
|
Details | |
4.71 KB,
text/plain
|
Details | |
2.14 MB,
text/plain
|
Details | |
412.46 KB,
text/plain
|
Details | |
5.32 MB,
application/octet-stream
|
Details | |
391.84 KB,
text/plain
|
Details | |
428.88 KB,
text/html
|
Details | |
358.48 KB,
text/plain
|
Details | |
312.41 KB,
text/plain
|
Details | |
511.55 KB,
text/plain
|
Details | |
2.41 MB,
text/x-log
|
Details | |
1.36 MB,
text/x-log
|
Details | |
875.89 KB,
text/html
|
Details | |
60.11 KB,
application/x-executable
|
Details | |
8.13 MB,
text/plain
|
Details | |
475.59 KB,
text/html
|
Details | |
1.23 MB,
text/plain
|
Details |
The test test_settings_wifi.py is failing intermittently because the expected wifi is not present in the available list. We could not reproduce this issue on the local devices. The earliest failure with this issue on Jenkins is from March 18.
Is there an issue with the local wifi network? Can someone from MV confirm this issue?
Stacktrace:
Traceback (most recent call last):
File "/var/jenkins/workspace/b2g.hamachi.mozilla-aurora.ui/.env/local/lib/python2.7/site-packages/marionette_client-0.7.5-py2.7.egg/marionette/marionette_test.py", line 163, in run
testMethod()
File "/var/jenkins/workspace/b2g.hamachi.mozilla-aurora.ui/tests/python/gaia-ui-tests/gaiatest/tests/functional/settings/test_settings_wifi.py", line 21, in test_connect_to_wifi_via_settings_app
wifi_settings.connect_to_network(self.testvars['wifi'])
File "/var/jenkins/workspace/b2g.hamachi.mozilla-aurora.ui/tests/python/gaia-ui-tests/gaiatest/apps/settings/regions/wifi.py", line 32, in connect_to_network
this_network = self.wait_for_element_present(*this_network_locator)
File "/var/jenkins/workspace/b2g.hamachi.mozilla-aurora.ui/tests/python/gaia-ui-tests/gaiatest/apps/base.py", line 29, in wait_for_element_present
lambda m: m.find_element(by, locator))
File "/var/jenkins/workspace/b2g.hamachi.mozilla-aurora.ui/.env/local/lib/python2.7/site-packages/marionette_client-0.7.5-py2.7.egg/marionette/wait.py", line 143, in until
cause=last_exc)
TimeoutException: Traceback (most recent call last):
File "/var/jenkins/workspace/b2g.hamachi.mozilla-aurora.ui/.env/local/lib/python2.7/site-packages/marionette_client-0.7.5-py2.7.egg/marionette/wait.py", line 122, in until
rv = condition(self.marionette)
File "/var/jenkins/workspace/b2g.hamachi.mozilla-aurora.ui/tests/python/gaia-ui-tests/gaiatest/apps/base.py", line 29, in <lambda>
lambda m: m.find_element(by, locator))
File "/var/jenkins/workspace/b2g.hamachi.mozilla-aurora.ui/.env/local/lib/python2.7/site-packages/marionette_client-0.7.5-py2.7.egg/marionette/marionette.py", line 1189, in find_element
response = self._send_message('findElement', 'value', **kwargs)
File "/var/jenkins/workspace/b2g.hamachi.mozilla-aurora.ui/.env/local/lib/python2.7/site-packages/marionette_client-0.7.5-py2.7.egg/marionette/marionette.py", line 613, in _send_message
self._handle_error(response)
File "/var/jenkins/workspace/b2g.hamachi.mozilla-aurora.ui/.env/local/lib/python2.7/site-packages/marionette_client-0.7.5-py2.7.egg/marionette/marionette.py", line 634, in _handle_error
raise NoSuchElementException(message=message, status=status, stacktrace=stacktrace)
TimeoutException: Timed out after 30.3313939571 seconds, caused by <class 'marionette.errors.NoSuchElementException'>
Jenkins builds link:
http://selenium.qa.mtv2.mozilla.com:8080/job/b2g.hamachi.mozilla-aurora.ui/70/testReport/junit/test_settings_wifi/TestSettingsWifi/test_connect_to_wifi_via_settings_app/history/?
Latest build on which this issue is reproduced:
Gaia a2a88d0638594a6510f878d2c5e99a6ead7520ad
Gecko https://hg.mozilla.org/releases/mozilla-aurora/rev/67bdb575d833
BuildID 20140325000201
Version 30.0a2
ro.build.version.incremental=eng.tclxa.20131223.163538
ro.build.date=Mon Dec 23 16:36:04 CST 2013
Also reproduced on:
Gaia 730670951e40b2317a167fcd07c398bb662d6e87
Gecko https://hg.mozilla.org/releases/mozilla-aurora/rev/a44f8b39c2c8
BuildID 20140324000202
Version 30.0a2
ro.build.version.incremental=eng.tclxa.20131223.163538
ro.build.date=Mon Dec 23 16:36:04 CST 2013
Reporter | ||
Comment 1•11 years ago
|
||
Started an adhoc run with this test.
http://selenium.qa.mtv2.mozilla.com:8080/view/B2G%20Hamachi/job/b2g.hamachi.mozilla-aurora.ui.adhoc/22/HTML_Report/?
Reporter | ||
Comment 2•11 years ago
|
||
We cannot rerpoduce this locally, but we are seeing conatnt failures on Jenkins. Can someone from MV office verify this manually?
Flags: needinfo?(stephen.donner)
Flags: needinfo?(mozillamarcia.knous)
Comment 3•11 years ago
|
||
Raymond, can you try, today before you leave:
1) multiple (at least 15 or so) runs of test_settings_wifi.py on mozilla-central/master build
2) the same, but against the Aurora build for 1.4?
Let us know here what you see, please, and help us try to understand whether this is an automation/manual bug with the build(s), or perhaps a Wi-Fi issue with that node on Mozilla Guest; thanks!
Flags: needinfo?(stephen.donner) → needinfo?(mozbugs.retornam)
Comment 4•11 years ago
|
||
This was from a replacement candidate (but not-yet in rotation) B2G-11 phone.
Comment 5•11 years ago
|
||
The command I ran was: gaiatest --testvars=/Users/sdonner/Desktop/ui-tests/b2g-7.json --address=localhost:2828 --xml-output=results/results.xml --html-output=results/index.html --type=b2g-unit --restart --repeat=9 --timeout=30000 gaiatest/tests/functional/settings/test_settings_wifi.py
sdonner-10779:B2G-flash-tool sdonner$ ./check_versions.sh
Gaia 80af23f8c74d9d2e9388d8ed3c204040b5c528ec
Gecko https://hg.mozilla.org/mozilla-central/rev/c69c55582faa
BuildID 20140326040202
Version 31.0a1
ro.build.version.incremental=eng.tclxa.20131223.163538
ro.build.date=Mon Dec 23 16:36:04 CST 2013
Updated•11 years ago
|
Flags: needinfo?(mozillamarcia.knous)
Flags: needinfo?(mozbugs.retornam)
Comment 7•11 years ago
|
||
:bebe, can you try a good # of runs on mozilla-central builds /before/ we branched for 1.4?
(Note: it's getting a bit convoluted -- at least to me!-- to know what's where, with bug 986349 and bug 988117 in the mix.)
Comment 8•11 years ago
|
||
(In reply to Stephen Donner [:stephend] from comment #6)
> :dlee, can you please take a look? Thanks!
Hi, Stephen,
I tried to reproduce this issue on my platform(unagi) but i cannot reproduce it.
From the log in description it seems after scanning it can't find the ssid specified in testvars files.
But for scan result, we cannot promise that we could get all the wifi-network each time, it depends on driver, device.
So if this issue is not always reproducible and it is not only happened after specific version, i will suggest maybe we can retry this(scan and get the wifi network) multiple times.
Flags: needinfo?(dlee)
Comment 9•11 years ago
|
||
Hi Dimi -
Luckily, Eric Chang helped reproduce this issue in the Taipei office:
gaiatest --testvars=testvars.json --address=localhost:2828 --xml-output=results.xml --html-output=index.html --type=b2g-unit --restart --repeat=9 --timeout=30000 gaiatest/tests/functional/settings/test_settings_wifi.py
GAIA_REV=c8bdb8fdd176f794d703a9ba24cca8364313cd45
GECKO_REV=68f47b1cc64d415ac77785a264257bc7a16ba09e
GECKO_BRANCH=mozillaorg/master
GAIA_BRANCH=mozillaorg/master
BUILD_TAG=jenkins-B2G.master.hamachi-206
BuildID=20140326051552
Can you work with him/the QA team? Thanks to both of you!
Updated•11 years ago
|
Severity: normal → major
Comment 10•11 years ago
|
||
I checked again on this case and my failed one, the timeout on my case is as follows.
> GAIA_REV=c8bdb8fdd176f794d703a9ba24cca8364313cd45
> GECKO_REV=68f47b1cc64d415ac77785a264257bc7a16ba09e
> GECKO_BRANCH=mozillaorg/master
> GAIA_BRANCH=mozillaorg/master
> BUILD_TAG=jenkins-B2G.master.hamachi-206
> BuildID=20140326051552
> File "/usr/local/lib/python2.7/dist-packages/marionette_client-0.7.5-py2.7.egg/marionette/marionette_test.py", line 145, in run
> self.setUp()
> File "/home/ericcc/repo/mozilla-b2g/gaia/tests/python/gaia-ui-tests/gaiatest/tests/functional/settings/test_settings_wifi.py", line 13, in setUp
> self.data_layer.disable_wifi()
> File "/home/ericcc/repo/mozilla-b2g/gaia/tests/python/gaia-ui-tests/gaiatest/gaia_test.py", line 263, in disable_wifi
> result = self.marionette.execute_async_script("return GaiaDataLayer.disableWiFi()", special_powers=True)
> File "/usr/local/lib/python2.7/dist-packages/marionette_client-0.7.5-py2.7.egg/marionette/marionette.py", line 1162, in execute_async_script
> filename=os.path.basename(frame[0]))
> File "/usr/local/lib/python2.7/dist-packages/marionette_client-0.7.5-py2.7.egg/marionette/marionette.py", line 613, in _send_message
> self._handle_error(response)
> File "/usr/local/lib/python2.7/dist-packages/marionette_client-0.7.5-py2.7.egg/marionette/marionette.py", line 662, in _handle_error
> raise ScriptTimeoutException(message=message, status=status, stacktrace=stacktrace)
> ScriptTimeoutException: ScriptTimeoutException: timed out
Let me try to run again of this test, change my setup to - Buri, v1.4 and run again.
Comment 11•11 years ago
|
||
v1.4
Gaia 7e705dd4718d528974d99ac31866318d7e201152
Gecko https://hg.mozilla.org/releases/mozilla-aurora/rev/4889124accfa
BuildID 20140326000201
Version 30.0a2
ro.build.version.incremental=eng.archermind.20131114.105818
ro.build.date=Thu Nov 14 10:58:33 CST 2013
This is what I've got out of 100 tests, not easy to reproduce..
2/100 Timeouts of the same as above
1/100 Timeout when waiting for X to show up
97/100 Pass
Updated•11 years ago
|
blocking-b2g: --- → 1.4?
Updated•11 years ago
|
Component: Gaia::UI Tests → Wifi
Whiteboard: [fromAutomation]
Comment 12•11 years ago
|
||
Recent automation regression - needed to ensure that we have stable on device automation runs.
Keywords: qablocker,
regression
Dimi, can you or someone on your team look at this ASAP, please? Thx :-)
Flags: needinfo?(dlee)
Updated•11 years ago
|
Assignee: nobody → dlee
Flags: needinfo?(dlee)
Comment 14•11 years ago
|
||
Hi Eric
Could you help run the test with adb log ? it will be very helpful ! thanks
Flags: needinfo?(echang)
Comment 15•11 years ago
|
||
Got a small script to run test with adb, for 100 runs, I need maybe 90 minutes
Flags: needinfo?(echang)
Comment 17•11 years ago
|
||
Currently I am trying to reproduce this again with adb logcat on, first 100 tests(a for loop of 100 times, each test with separate adb logcat), another 100 times with a single logcat and gaiatest --repeat=99
200 PASS.
V1.4 AURORA, Buri, Engineering build
Gaia a2a88d0638594a6510f878d2c5e99a6ead7520ad
Gecko https://hg.mozilla.org/releases/mozilla-aurora/rev/67bdb575d833
BuildID 20140325000201
Version 30.0a2
ro.build.version.incremental=eng.archermind.20131114.105818
ro.build.date=Thu Nov 14 10:58:33 CST 2013
wifi with wpa
Are you able to try both authenticated and open nodes?
Comment 19•11 years ago
|
||
V1.4 AURORA, Buri, Engineering build
Gaia a2a88d0638594a6510f878d2c5e99a6ead7520ad
Gecko https://hg.mozilla.org/releases/mozilla-aurora/rev/67bdb575d833
BuildID 20140325000201
Version 30.0a2
ro.build.version.incremental=eng.archermind.20131114.105818
ro.build.date=Thu Nov 14 10:58:33 CST 2013
wifi without authentication
99 PASS
1 FAIL (but not a timeout)
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/marionette_client-0.7.5-py2.7.egg/marionette/marionette_test.py", line 163, in run
testMethod()
File "/home/ericcc/repo/mozilla-b2g/gaia/tests/python/gaia-ui-tests/gaiatest/tests/functional/settings/test_settings_wifi.py", line 21, in test_connect_to_wifi_via_settings_app
wifi_settings.connect_to_network(self.testvars['wifi'])
File "/home/ericcc/repo/mozilla-b2g/gaia/tests/python/gaia-ui-tests/gaiatest/apps/settings/regions/wifi.py", line 32, in connect_to_network
this_network = self.wait_for_element_present(*this_network_locator)
File "/home/ericcc/repo/mozilla-b2g/gaia/tests/python/gaia-ui-tests/gaiatest/apps/base.py", line 29, in wait_for_element_present
lambda m: m.find_element(by, locator))
File "/usr/local/lib/python2.7/dist-packages/marionette_client-0.7.5-py2.7.egg/marionette/wait.py", line 143, in until
cause=last_exc)
TimeoutException: Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/marionette_client-0.7.5-py2.7.egg/marionette/wait.py", line 122, in until
rv = condition(self.marionette)
File "/home/ericcc/repo/mozilla-b2g/gaia/tests/python/gaia-ui-tests/gaiatest/apps/base.py", line 29, in <lambda>
lambda m: m.find_element(by, locator))
File "/usr/local/lib/python2.7/dist-packages/marionette_client-0.7.5-py2.7.egg/marionette/marionette.py", line 1189, in find_element
response = self._send_message('findElement', 'value', **kwargs)
File "/usr/local/lib/python2.7/dist-packages/marionette_client-0.7.5-py2.7.egg/marionette/marionette.py", line 613, in _send_message
self._handle_error(response)
File "/usr/local/lib/python2.7/dist-packages/marionette_client-0.7.5-py2.7.egg/marionette/marionette.py", line 634, in _handle_error
raise NoSuchElementException(message=message, status=status, stacktrace=stacktrace)
TimeoutException: Timed out after 30.5163249969 seconds, caused by <class 'marionette.errors.NoSuchElementException'>
Comment 20•11 years ago
|
||
Need to remove the part says "(but not a timeout)" @@"
(In reply to Eric Chang [:ericcc] [:echang] from comment #19)
> Created attachment 8400474 [details]
> 20140402-3-open-wifi.zip
>
> wifi without authentication
>
> 99 PASS
> 1 FAIL (but not a timeout)
Comment 21•11 years ago
|
||
Is the firmware a factor here? I have the v1.2_US_20131115.cfg on this Buri.
(In reply to Eric Chang [:ericcc] [:echang] from comment #21)
> Is the firmware a factor here? I have the v1.2_US_20131115.cfg on this Buri.
I don't think so - I can/could reproduce this manually (thought it's easier/more problematic with automation, b/c it feels like a race condition or something), and with the latest from TeleWeb, which should be v1.2-device.cfg - are you on that?
Comment 23•11 years ago
|
||
Thanks for the info, I will try that. Currently I am waiting for dimi's private build for getting more debugging output.
(In reply to Eric Chang [:ericcc] [:echang] from comment #23)
> Thanks for the info, I will try that. Currently I am waiting for dimi's
> private build for getting more debugging output.
Thanks; can you also be sure you're both testing on Hamachi or Buri? We haven't used Unagis since before October or so of last year, IIRC...
Comment 25•11 years ago
|
||
Sure, I use Buri only.
Comment 26•11 years ago
|
||
One of the root cause that case this test case fail will be fixed in Bug 991499.
But there still other issues related to this bug, i will keep checking it.
Comment 27•11 years ago
|
||
(In reply to Dimi Lee[:dimi][:dlee] from comment #26)
> One of the root cause that case this test case fail will be fixed in Bug
> 991499.
> But there still other issues related to this bug, i will keep checking it.
Should bug 991499 be nominated to block then?
Comment 28•11 years ago
|
||
Hi Dimi, Got 1 with private build, could you help to check the log, thanks..
Flags: needinfo?(dlee)
Comment 29•11 years ago
|
||
(In reply to Eric Chang [:ericcc] [:echang] from comment #28)
> Created attachment 8402435 [details]
> 20140405-1.zip
>
> Hi Dimi, Got 1 with private build, could you help to check the log, thanks..
Hi Eric,
This error happened because wpa_supplicant did not find the ap during scan, this is the same problem as i described in Comment 8, could you help test again in office environment ? thanks
Flags: needinfo?(dlee)
Comment 30•11 years ago
|
||
Running for 5 hours now, all pass, will leave that in the office for the night.
Comment 31•11 years ago
|
||
Running for 13+ hours now, 800 runs all pass, (59~61 sec/a run) with private build from Dimi and open wifi.
Flags: needinfo?(dlee)
Comment 32•11 years ago
|
||
According to Eric test result, maybe we can check if this bug is still reproducible after bug 991499 land to 1.4
Flags: needinfo?(dlee)
Comment 33•11 years ago
|
||
According to comment 31 and comment 32, set to RESOLVED WORKSFORME
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WORKSFORME
Comment 34•11 years ago
|
||
Stephen - Can you check to see if this has been running green here in our CI? Just want to double check that this is truly fixed.
Flags: needinfo?(stephen.donner)
Comment 35•11 years ago
|
||
Reopen it because this requires bug 991499 to be uplifted to 1.4.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
(In reply to Jason Smith [:jsmith] from comment #34)
> Stephen - Can you check to see if this has been running green here in our
> CI? Just want to double check that this is truly fixed.
Still having problems on master, too (see http://selenium.qa.mtv2.mozilla.com:8080/job/b2g.hamachi.mozilla-central.unittests.adhoc/17/). Likely, they are related to bug 990580.
Flags: needinfo?(stephen.donner)
Comment 37•11 years ago
|
||
This issue is reproducible on Jenkins, on v1.3 latest build. From the screenshot we can notice that the wireless network is present, but the phone is not able to connect to it before timeout.
Traceback (most recent call last):
File "/var/jenkins/workspace/b2g.hamachi.mozilla-b2g28_v1_3.ui.2/.env/local/lib/python2.7/site-packages/marionette_client-0.7.1-py2.7.egg/marionette/marionette_test.py", line 143, in run
testMethod()
File "/var/jenkins/workspace/b2g.hamachi.mozilla-b2g28_v1_3.ui.2/tests/python/gaia-ui-tests/gaiatest/tests/functional/settings/test_settings_wifi.py", line 21, in test_connect_to_wifi_via_settings_app
wifi_settings.connect_to_network(self.testvars['wifi'])
File "/var/jenkins/workspace/b2g.hamachi.mozilla-b2g28_v1_3.ui.2/tests/python/gaia-ui-tests/gaiatest/apps/settings/regions/wifi.py", line 48, in connect_to_network
timeout = max(self.marionette.timeout and self.marionette.timeout / 1000, 60))
File "/var/jenkins/workspace/b2g.hamachi.mozilla-b2g28_v1_3.ui.2/tests/python/gaia-ui-tests/gaiatest/apps/base.py", line 103, in wait_for_condition
raise TimeoutException(message)
TimeoutException: Condition timed out
Gaia 100a159af11281d965e9d70d0f3d3111a38775ee
Gecko https://hg.mozilla.org/releases/mozilla-b2g28_v1_3/rev/ea2a473b122f
BuildID 20140409004002
Version 28.0
ro.build.version.incremental=eng.tclxa.20131223.163538
ro.build.date=Mon Dec 23 16:36:04 CST 2013
I couldn't reproduce this locally. I attached a screenshot of the failure.
Comment 38•11 years ago
|
||
Comment 39•11 years ago
|
||
(In reply to Viorela Ioia [:viorela] from comment #37)
> This issue is reproducible on Jenkins, on v1.3 latest build. From the
> screenshot we can notice that the wireless network is present, but the phone
> is not able to connect to it before timeout.
>
> Traceback (most recent call last):
> File
> "/var/jenkins/workspace/b2g.hamachi.mozilla-b2g28_v1_3.ui.2/.env/local/lib/
> python2.7/site-packages/marionette_client-0.7.1-py2.7.egg/marionette/
> marionette_test.py", line 143, in run
> testMethod()
> File
> "/var/jenkins/workspace/b2g.hamachi.mozilla-b2g28_v1_3.ui.2/tests/python/
> gaia-ui-tests/gaiatest/tests/functional/settings/test_settings_wifi.py",
> line 21, in test_connect_to_wifi_via_settings_app
> wifi_settings.connect_to_network(self.testvars['wifi'])
> File
> "/var/jenkins/workspace/b2g.hamachi.mozilla-b2g28_v1_3.ui.2/tests/python/
> gaia-ui-tests/gaiatest/apps/settings/regions/wifi.py", line 48, in
> connect_to_network
> timeout = max(self.marionette.timeout and self.marionette.timeout / 1000,
> 60))
> File
> "/var/jenkins/workspace/b2g.hamachi.mozilla-b2g28_v1_3.ui.2/tests/python/
> gaia-ui-tests/gaiatest/apps/base.py", line 103, in wait_for_condition
> raise TimeoutException(message)
> TimeoutException: Condition timed out
>
>
> Gaia 100a159af11281d965e9d70d0f3d3111a38775ee
> Gecko https://hg.mozilla.org/releases/mozilla-b2g28_v1_3/rev/ea2a473b122f
> BuildID 20140409004002
> Version 28.0
> ro.build.version.incremental=eng.tclxa.20131223.163538
> ro.build.date=Mon Dec 23 16:36:04 CST 2013
>
> I couldn't reproduce this locally. I attached a screenshot of the failure.
What is the reproduce rate of this failure ? And is it possible to have log for this error ? Thanks!
Flags: needinfo?(viorela.ioia)
We still fail on master, and -- oddly enough -- it's now passing on the 1st 3 tries, but the 4th fails (see http://selenium.qa.mtv2.mozilla.com:8080/job/b2g.hamachi.mozilla-central.unittests.adhoc/20/).
I'll attach a logfile, next, too.
19:28:41 Showing version details
19:28:45 application_buildid: 20140409142348
19:28:45 application_changeset: 690c810c8e3e
19:28:45 application_name: B2G
19:28:45 application_repository: https://hg.mozilla.org/mozilla-central
19:28:45 application_version: 31.0a1
19:28:45 build_changeset: 1ad48c4be51b279f7f63c1a13025b52fe087d231
19:28:45 device_firmware_date: 1387433095
19:28:45 device_firmware_version_incremental: 324
19:28:45 device_firmware_version_release: 4.0.4
19:28:45 device_id: msm7627a
19:28:45 gaia_changeset: 9d0b1bdf746823a94b13e6574c1d8304dc584763
19:28:45 gaia_date: 1397017099
19:28:45 platform_buildid: 20140409142348
19:28:45 platform_changeset: 690c810c8e3e
19:28:45 platform_repository: https://hg.mozilla.org/mozilla-central
The 4th test fails, here -- the 1st 3 pass, oddly enough!
Comment 42•11 years ago
|
||
(In reply to Stephen Donner [:stephend] from comment #41)
> Created attachment 8404424 [details]
> logcat.txt from latest failed run on master/m-c
>
> The 4th test fails, here -- the 1st 3 pass, oddly enough!
WifiWorker cannot retrieve dhcp info, but log info is not enough to know it stucked in gecko, dhcpcd or AP side.
Hi, Stephen,
Can we test connect to different AP ?
And is it possible to provide an engineer build to you to get more information from log ?
Thanks for help.
Flags: needinfo?(viorela.ioia) → needinfo?(stephen.donner)
Updated•11 years ago
|
Whiteboard: [fromAutomation] → [fromAutomation][p=2]
Target Milestone: --- → 1.4 S6 (25apr)
(In reply to Dimi Lee[:dimi][:dlee] from comment #42)
> (In reply to Stephen Donner [:stephend] from comment #41)
> > Created attachment 8404424 [details]
> > logcat.txt from latest failed run on master/m-c
> >
> > The 4th test fails, here -- the 1st 3 pass, oddly enough!
> WifiWorker cannot retrieve dhcp info, but log info is not enough to know it
> stucked in gecko, dhcpcd or AP side.
> Hi, Stephen,
> Can we test connect to different AP ?
> And is it possible to provide an engineer build to you to get more
> information from log ?
> Thanks for help.
I will try this from home tonight. As for the 2nd question, I'm not quite clear on what you're asking -- do you want us to try a special build, for more debug info? If so, sure, that'd be fine by me. Please make sure it's got Marionette enabled, too.
Flags: needinfo?(stephen.donner)
Updated•11 years ago
|
Flags: needinfo?(dlee)
Comment 44•11 years ago
|
||
Hi Stephen,
Attach debug files using latest mozilla-central
Detail information is included in README.
Please ping me if there are any problem, thanks
Flags: needinfo?(dlee)
Updated•11 years ago
|
Flags: needinfo?(stephen.donner)
Updated•11 years ago
|
Whiteboard: [fromAutomation][p=2] → [fromAutomation][p=2], Priority 2
Here's the output (and command to run) this test, with 9 repeats (10 total), /without/ your patch. Logcat, also /without/ patch, for comparison, attaching next.
Flags: needinfo?(stephen.donner)
Updated•11 years ago
|
Comment 47•11 years ago
|
||
This issue now may related to many other bugs cause test case fail.
We should check this after other bugs is fixed then we will have a more clear environment to investigate.
Suggest set to non-blocking now.
blocking-b2g: 1.4+ → 1.4?
Flags: needinfo?(praghunath)
Updated•11 years ago
|
Target Milestone: 1.4 S6 (25apr) → ---
Comment 48•11 years ago
|
||
(In reply to Dimi Lee[:dimi][:dlee] from comment #47)
> This issue now may related to many other bugs cause test case fail.
> We should check this after other bugs is fixed then we will have a more
> clear environment to investigate.
>
> Suggest set to non-blocking now.
Why? As long as there's intermittent failures due to wifi regressions, this remains as a blocker. If there's nothing else to do here that's related to wifi, then this should be closed out.
(In reply to Jason Smith [:jsmith] from comment #48)
> (In reply to Dimi Lee[:dimi][:dlee] from comment #47)
> > This issue now may related to many other bugs cause test case fail.
> > We should check this after other bugs is fixed then we will have a more
> > clear environment to investigate.
> >
> > Suggest set to non-blocking now.
>
> Why? As long as there's intermittent failures due to wifi regressions, this
> remains as a blocker. If there's nothing else to do here that's related to
> wifi, then this should be closed out.
Agreed; I suggested we clear through bug 990580 (and perhaps bug 991262) because it appears far-reaching -- this should still block, though (esp. if we can still reproduce, at all, after bug 990580's fix lands).
Comment 50•11 years ago
|
||
(In reply to Jason Smith [:jsmith] from comment #48)
> (In reply to Dimi Lee[:dimi][:dlee] from comment #47)
> > This issue now may related to many other bugs cause test case fail.
> > We should check this after other bugs is fixed then we will have a more
> > clear environment to investigate.
> >
> > Suggest set to non-blocking now.
>
> Why? As long as there's intermittent failures due to wifi regressions, this
> remains as a blocker. If there's nothing else to do here that's related to
> wifi, then this should be closed out.
We can not reproduce the failure caused by wifi at Taipei, Eric also had do lots of test (Comment 31). Stephen provided us failure logs and some of them are not related to wifi. But we did see some of logs show that gecko cannot retrieve ip information from DHCP. From the log it is not related to any wifi regression and we think it is more likely related to the environment.
We still want to keep track on this bug so it should not be closed out.
Comment 51•11 years ago
|
||
(In reply to Dimi Lee[:dimi][:dlee] from comment #50)
> (In reply to Jason Smith [:jsmith] from comment #48)
> > (In reply to Dimi Lee[:dimi][:dlee] from comment #47)
> > > This issue now may related to many other bugs cause test case fail.
> > > We should check this after other bugs is fixed then we will have a more
> > > clear environment to investigate.
> > >
> > > Suggest set to non-blocking now.
> >
> > Why? As long as there's intermittent failures due to wifi regressions, this
> > remains as a blocker. If there's nothing else to do here that's related to
> > wifi, then this should be closed out.
>
> We can not reproduce the failure caused by wifi at Taipei, Eric also had do
> lots of test (Comment 31). Stephen provided us failure logs and some of them
> are not related to wifi. But we did see some of logs show that gecko cannot
> retrieve ip information from DHCP. From the log it is not related to any
> wifi regression and we think it is more likely related to the environment.
The reality is that this environment is where our devices are running CI automation, so it needs to work. We can't just point at the environment with no idea of the solution in mind, as that doesn't provide firm evidence that there are no more problems on the gecko side. We either need to prove firmly about what is failing in our wifi setup or we need to continue investigating what gecko bugs are present here. As such, this stays as a blocker, as this still is a regression in existing on device automation.
Note - The issues not related to wifi are being fixed separately and are unrelated to this bug, so I'm removing as dependencies.
Updated•11 years ago
|
blocking-b2g: 1.4? → 1.4+
Flags: needinfo?(praghunath)
Comment 52•11 years ago
|
||
Can we please check if this issue is reproducible once bug 979130 is uplifted?
Comment 53•11 years ago
|
||
(In reply to Preeti Raghunath(:Preeti) from comment #52)
> Can we please check if this issue is reproducible once bug 979130 is
> uplifted?
That's unrelated to this bug.
Keywords: qawanted
Comment 54•11 years ago
|
||
(In reply to Jason Smith [:jsmith] from comment #51)
> The reality is that this environment is where our devices are running CI
> automation, so it needs to work. We can't just point at the environment with
> no idea of the solution in mind, as that doesn't provide firm evidence that
> there are no more problems on the gecko side. We either need to prove firmly
> about what is failing in our wifi setup or we need to continue investigating
> what gecko bugs are present here.
Agree with you. Because Dimi wants some helps to reproduce this bugs with his debugging patch and Dimi can not reproduce this bug with local network, can someone help him?
Keywords: qawanted
Comment 55•11 years ago
|
||
(In reply to Ken Chang[:ken] from comment #54)
> (In reply to Jason Smith [:jsmith] from comment #51)
> > The reality is that this environment is where our devices are running CI
> > automation, so it needs to work. We can't just point at the environment with
> > no idea of the solution in mind, as that doesn't provide firm evidence that
> > there are no more problems on the gecko side. We either need to prove firmly
> > about what is failing in our wifi setup or we need to continue investigating
> > what gecko bugs are present here.
> Agree with you. Because Dimi wants some helps to reproduce this bugs with
> his debugging patch and Dimi can not reproduce this bug with local network,
> can someone help him?
What patch? Which attachment?
(In reply to Jason Smith [:jsmith] from comment #55)
> What patch? Which attachment?
https://bugzilla.mozilla.org/attachment.cgi?id=8406080, from comment 44. I'll try to get this today.
(In reply to Stephen Donner [:stephend] from comment #56)
> (In reply to Jason Smith [:jsmith] from comment #55)
>
> > What patch? Which attachment?
>
> https://bugzilla.mozilla.org/attachment.cgi?id=8406080, from comment 44.
> I'll try to get this today.
Tried this today, on a just-flashed Hamachi, but get the following:
sdonner-10779:Desktop sdonner$ adb push omni.ja /system/b2g
failed to copy 'omni.ja' to '/system/b2g/omni.ja': No space left on device
I've done an adb remount, but still no luck; what am I doing wrong?
(In reply to Stephen Donner [:stephend] from comment #58)
> Created attachment 8410013 [details]
> wifi_logcat.txt
This was done like so:
1) I pushed:
sdonner-10779:Desktop sdonner$ sudo adb remount
Password:
remount succeeded
sdonner-10779:Desktop sdonner$ sudo adb push libnetutils.so /system/lib/
to:
2) sdonner-10779:B2G-flash-tool sdonner$ ./check_versions.sh
Gaia 84f4835833c4cb8ac7e5d9e0c94738b1cb0ef45a
Gecko https://hg.mozilla.org/mozilla-central/rev/c962bde5ac0b
BuildID 20140421160201
Version 31.0a1
ro.build.version.incremental=324
ro.build.date=Thu Dec 19 14:04:55 CST 2013
3) Then ran:
gaiatest --testvars=/Users/sdonner/Desktop/ui-tests/b2g-7.json --address=localhost:2828 --xml-output=results/results.xml --html-output=results/index.html --type=b2g-unit --restart gaiatest/tests/functional/settings/test_settings_wifi.py
4) What I see when I run the automation, is that it's just "Searching..." indefinitely, under "Available networks." Also, manually (after automation has run), I see that under the Settings app top-level, "Wi-Fi" is in the "Initializing..." state. When I try this manually with the same build (after a restart), I don't get it into this state.
Comment 60•11 years ago
|
||
(In reply to Dimi Lee[:dimi][:dlee] from comment #60)
> Created attachment 8410035 [details]
> omni.ja with debug message
Unfortunately, this didn't work with my build; it causes the phone to boot into a blank/black screen and never finishes its startup.
This should show a single test_settings_wifi.py failure, with Debug turned on, with the appropriate, new omni.ja replaced!
Comment 63•11 years ago
|
||
(In reply to Stephen Donner [:stephend] from comment #62)
> Created attachment 8410049 [details]
> logcat.txt
>
> This should show a single test_settings_wifi.py failure, with Debug turned
> on, with the appropriate, new omni.ja replaced!
Hi, Stephen
Could you help upload the output html (index.html), thanks !
Comment 65•11 years ago
|
||
Hi Dimi, I run this for 200*30 times, 3 days, I got tons of reports and logs, and the b2g starts but it stays in the blue flash screen. Will send you the logs soon.
:davehunt, any thoughts on the failures between a run of test_settings.wifi.py with and without --restart?
For me, locally, it passes without --restart, but fails _with_ --restart. However, http://selenium.qa.mtv2.mozilla.com:8080/job/b2g.hamachi.mozilla-central.unittests.adhoc/ fares better than I, and it uses --restart (but I flash with ./fullflash.sh -n and also don't do the stuff our Jenkins box does).
Flags: needinfo?(dave.hunt)
Comment 67•11 years ago
|
||
How good is the wifi signal in the testing env.? For us in Taipei, the signal of APs we are testing with are quite strong.
Flags: needinfo?(stephen.donner)
Comment 68•11 years ago
|
||
(In reply to Stephen Donner [:stephend] from comment #66)
> :davehunt, any thoughts on the failures between a run of
> test_settings.wifi.py with and without --restart?
>
> For me, locally, it passes without --restart, but fails _with_ --restart.
> However,
> http://selenium.qa.mtv2.mozilla.com:8080/job/b2g.hamachi.mozilla-central.
> unittests.adhoc/ fares better than I, and it uses --restart (but I flash
> with ./fullflash.sh -n and also don't do the stuff our Jenkins box does).
Could you update GAIA to latest one and test again ?
In my testing environment, the always "Searching..." issue is fixed after updating GAIA
Comment 69•11 years ago
|
||
(In reply to Stephen Donner [:stephend] from comment #66)
> :davehunt, any thoughts on the failures between a run of
> test_settings.wifi.py with and without --restart?
>
> For me, locally, it passes without --restart, but fails _with_ --restart.
> However,
> http://selenium.qa.mtv2.mozilla.com:8080/job/b2g.hamachi.mozilla-central.
> unittests.adhoc/ fares better than I, and it uses --restart (but I flash
> with ./fullflash.sh -n and also don't do the stuff our Jenkins box does).
With --restart we stop b2g and clear a bunch of files from the device. Without it, we try to achieve the same in software - like using the API to forget WiFi networks, etc. Usually --restart is better at achieving a clean state, but sometimes we need to tweak the collection of files we delete. Perhaps something has changed in the way WiFi details are cached?
Flags: needinfo?(dave.hunt)
Updated•11 years ago
|
Whiteboard: [fromAutomation][p=2], Priority 2 → [fromAutomation][p=2], Priority 2,[ETA:4/30]
(In reply to Eric Chang [:ericcc] [:echang] from comment #67)
> How good is the wifi signal in the testing env.? For us in Taipei, the
> signal of APs we are testing with are quite strong.
The Wi-Fi router (Juniper) is literally right above the devices, in the lab.
Flags: needinfo?(stephen.donner)
Comment 72•11 years ago
|
||
Update current status of this issue :
1. We are tracking the issue that test case fail when trying to obtain IP address (Comment 42).
So I provide some components with debug information to Stephen to get more debug message because we cannot reproduce this in Taipei (Comment 31).
2. Stephen encounter a problem that wifi always show Searching when running test case (Comment 59).
This problem may caused by unsync version of GAIA and GECKO because it is fixed in my side if update to latest GAIA and GECKO (mozilla-central).
Stephen, Could you help confirm this in your side ? Thanks
Flags: needinfo?(stephen.donner)
Comment 73•11 years ago
|
||
(In reply to Dimi Lee[:dimi][:dlee] from comment #72)
> Update current status of this issue :
> 1. We are tracking the issue that test case fail when trying to obtain IP
> address (Comment 42).
> So I provide some components with debug information to Stephen to get more
> debug message because we cannot reproduce this in Taipei (Comment 31).
>
> 2. Stephen encounter a problem that wifi always show Searching when running
> test case (Comment 59).
> This problem may caused by unsync version of GAIA and GECKO because it is
> fixed in my side if update to latest GAIA and GECKO (mozilla-central).
> Stephen, Could you help confirm this in your side ? Thanks
I raise "qawanted" again because we still not get the log for original problem, the item 1 what Dimi mentioned and we can not reproduce it.
Keywords: qawanted
Zac, can you help get a local run which has failed, and a logcat? Thx.
Flags: needinfo?(stephen.donner) → needinfo?(zcampbell)
(In reply to Stephen Donner [:stephend] from comment #71)
> (In reply to Eric Chang [:ericcc] [:echang] from comment #67)
> > How good is the wifi signal in the testing env.? For us in Taipei, the
> > signal of APs we are testing with are quite strong.
>
> The Wi-Fi router (Juniper) is literally right above the devices, in the lab.
So, we thought this was true, but we are now told that the "Mozilla Guest" node to which we connect is, in fact, "somewhere else" -- likely in the ceiling in the adjacent kitchen to this lab.
As a next step on our side, we'll also try connecting to an authenticated, WPA2 Personal node: the one that the A-team uses (for which we'll need to isolate a device, add its MAC address to the Access Control List, and then change a credentials file so our automation can use it).
Raymond, please slot this as #2 on your priorities, just behind launching Australis tomorrow; thanks!
Flags: needinfo?(mozbugs.retornam)
(In reply to Ken Chang[:ken] from comment #73)
> I raise "qawanted" again because we still not get the log for original
> problem, the item 1 what Dimi mentioned and we can not reproduce it.
Actually, on Thursday the 24th, I sent Dimi the logfiles of runs with failures and without, as he requested. I'll post them here, for posterity.
Here's the case without specifying "--restart"; it passes.
Here's a logcat where we pass in "--restart", and fail to pass the test.
Additionally, I was told this isn't failing in Tiapei's environment, but yet:
http://release1-qa.corp.tpe1.mozilla.com:8080/view/B2G.Test.Hamachi/job/B2G.master.pvt.hamachi.gaiatest.unit.wifi/4/console
Eric, can you get that job running with HTML Reports (w/screenshots) and logcat (with debug Wi-Fi output) enabled? Thanks!
Updated•11 years ago
|
Flags: needinfo?(echang)
(And sorry, one last spam for me tonight, but I notice Taipei QA's job is running the gaiatest/tests/unit/settings/test_wifi_settings.py/ *test_set_wifi* [1] test; we're after gaiatest/tests/functional/settings/test_settings_wifi.py/ *test_connect_to_wifi_via_settings_app* [2] test.
Thanks again for helping us all try to nail this down!
[1] https://github.com/mozilla-b2g/gaia/blob/master/tests/python/gaia-ui-tests/gaiatest/tests/unit/settings/test_wifi_settings.py#L10
[2] https://github.com/mozilla-b2g/gaia/blob/master/tests/python/gaia-ui-tests/gaiatest/tests/functional/settings/test_settings_wifi.py#L15
Ok, worked quite a bit again with Dimi, and we've got a better handle on this, now.
1) we're no longer seeing -- locally or in Jenkins -- the "Searching..." problem
2) The real problem (and for which this bug was filed) is still present: with --restart --repeat=n, we're still seeing problems with two access points 1) "Mozilla Guest" and 2) "ateam" taking a while to dole out DHCP leases/wireless sessions.
We're looking into that, tomorrow, after Australis launches :-) We'll report back with more as we have it.
Comment 82•11 years ago
|
||
Actually, that was happening when we have build problem last week, the build was large, cannot be installed onto phone completely, and the build inside was messed up.
(In reply to Stephen Donner [:stephend] from comment #79)
> Additionally, I was told this isn't failing in Tiapei's environment, but yet:
>
> http://release1-qa.corp.tpe1.mozilla.com:8080/view/B2G.Test.Hamachi/job/B2G.
> master.pvt.hamachi.gaiatest.unit.wifi/4/console
>
> Eric, can you get that job running with HTML Reports (w/screenshots) and
> logcat (with debug Wi-Fi output) enabled? Thanks!
Flags: needinfo?(echang)
FWIW, the connect_to_wifi() failure that underlies this and other tests (such as in test_cleanup_gaia) is also experienced -- ocassionally -- by Taipei, so I've asked over in bug 997552 comment 9 for help looking into that as another possible method of reproducing and nailing this issue.
Additionally, test_connect_and_forget_all_networks has also now started failing in our unittests; see http://selenium.qa.mtv2.mozilla.com:8080/view/B2G%20Hamachi/job/b2g.hamachi.mozilla-central.unittests/549/testReport/test_wifi/TestWiFi/test_connect_and_forget_all_networks/
Summary: [v1.4] Investigate test_settings_wifi.py failure → [v1.4][v2.0] Investigate test_settings_wifi.py failure
Comment 85•11 years ago
|
||
(In reply to Stephen Donner [:stephend] from comment #81)
> Ok, worked quite a bit again with Dimi, and we've got a better handle on
> this, now.
>
> 1) we're no longer seeing -- locally or in Jenkins -- the "Searching..."
> problem
> 2) The real problem (and for which this bug was filed) is still present:
> with --restart --repeat=n, we're still seeing problems with two access
> points 1) "Mozilla Guest" and 2) "ateam" taking a while to dole out DHCP
> leases/wireless sessions.
>
> We're looking into that, tomorrow, after Australis launches :-) We'll
> report back with more as we have it.
Hi, Stephen,
Have any update or test result on this bug ? Thanks
Flags: needinfo?(stephen.donner)
Comment 86•11 years ago
|
||
I ran tests with --restart and it failed. Attached is the logcat output
Gaia d23e479e8a4ce0bc620acb2d7e2f82801aa4d0ea
Gecko https://hg.mozilla.org/releases/mozilla-aurora/rev/36f67ce46855
BuildID 20140428000206
Version 30.0a2
ro.build.version.incremental=eng.tclxa.20131223.163538
ro.build.date=Mon Dec 23 16:36:04 CST 2013
Flags: needinfo?(mozbugs.retornam)
Comment 87•11 years ago
|
||
I ran again with the Omni.ja and libnetutils.so stephend sent me and it failed after 7 tries.
Attached is the logcat for all the runs
Comment 88•11 years ago
|
||
Logcat from 3rd run
Updated•11 years ago
|
Flags: needinfo?(zcampbell)
Flags: needinfo?(stephen.donner)
Dimi, Ken: do you now have the information you need, from us, at this point?
FWIW, we asked IT, and learned the following:
6:22:12 PM <adam> well, the DHCP server is seeing it.
6:22:16 PM <adam> Apr 30 22:09:28 admin1a dhcpd: DHCPREQUEST for 10.252.33.54 (10.252.75.6) from e4:2d:02:2b:da:4b via 10.252.32.1
6:22:16 PM <adam> Apr 30 22:09:28 admin1a dhcpd: DHCPACK on 10.252.33.54 to e4:2d:02:2b:da:4b via 10.252.32.1
6:23:04 PM <adam> don't see an error yet, not rate limiting.
6:46:44 PM <adam> It appears to try to send it that 33.54 addr
6:51:06 PM <adam> I'm not seeing errors.
Flags: needinfo?(dlee)
Comment 90•11 years ago
|
||
Hi Stephen,
Yes, it did provide some useful information, but we still need more test to investigate this issue.
From log, we can confirm that gecko part is OK and the problem may occur in dhcpcd(dhcp client) or dhcp server.
Could your help check following:
1. The libnetutils.so seems not replace correctly because i cannot see any debug information from latest log. Could you help reproduce this with pushing it to system/lib
2. In the lab environment, we tried both "ateam" and "mozilla guest". Do they both connect to the same dhcp server ?
3. For question 2, if they both connect to the same dhcp server, is it possible to try this case with different dhcp server ? For example, use your cell phone as wifi hotspot and try to connect to your phone.
Thanks for help!
Flags: needinfo?(dlee) → needinfo?(stephen.donner)
(In reply to Dimi Lee[:dimi][:dlee] from comment #90)
> Hi Stephen,
> Yes, it did provide some useful information, but we still need more test
> to investigate this issue.
> From log, we can confirm that gecko part is OK and the problem may occur in
> dhcpcd(dhcp client) or dhcp server.
>
> Could your help check following:
> 1. The libnetutils.so seems not replace correctly because i cannot see any
> debug information from latest log. Could you help reproduce this with
> pushing it to system/lib
Fixed; sorry about that. The logs from today should include it.
> 2. In the lab environment, we tried both "ateam" and "mozilla guest". Do
> they both connect to the same dhcp server ?
According to #it, no; we have "admin1a" and "admin1b", but both have, or should have, the same dhcpd.conf, which has:
5:08:09 PM <limed> we have these 2 values in dhcpd.conf
5:08:09 PM <limed> max-response-delay 60;
5:08:09 PM <limed> max-unacked-updates 10;
The full log of a Gaia UI Test run is here: https://app.box.com/s/dmk40286u1obssfdje4t. I'll attach the HTML report next.
> 3. For question 2, if they both connect to the same dhcp server, is it
> possible to try this case with different dhcp server ? For example, use your
> cell phone as wifi hotspot and try to connect to your phone.
>
> Thanks for help!
I'll work again with them tomorrow on this issue, and one of them will be here in-person, to help us debug it, too.
Flags: needinfo?(stephen.donner)
Also, according to them, sometimes the device keeps requesting a lease even after it's been given a lease:
May 5 23:49:32 admin1a dhcpd: DHCPREQUEST for 10.252.120.213 from e4:2d:02:38:d9:c4 via 10.252.120.1
May 5 23:49:32 admin1a dhcpd: DHCPACK on 10.252.120.213 to e4:2d:02:38:d9:c4 via 10.252.120.1
May 5 23:51:09 admin1a dhcpd: DHCPREQUEST for 10.252.120.213 from e4:2d:02:38:d9:c4 via 10.252.120.1
May 5 23:51:09 admin1a dhcpd: DHCPACK on 10.252.120.213 to e4:2d:02:38:d9:c4 via 10.252.120.1
(In reply to Stephen Donner [:stephend] from comment #91)
> The full log of a Gaia UI Test run is here:
> https://app.box.com/s/dmk40286u1obssfdje4t. I'll attach the HTML report
> next.
Sorry, in some spring cleaning, I accidentally a) uploaded the wrong file b) deleted the one I meant to upload :-(
Here's a reproduction of at least some of the problems, manually, though:
https://app.box.com/s/fslrqtubceap5bqx7h48
Comment 95•11 years ago
|
||
(In reply to Stephen Donner [:stephend] from comment #94)
> (In reply to Stephen Donner [:stephend] from comment #91)
>
> > The full log of a Gaia UI Test run is here:
> > https://app.box.com/s/dmk40286u1obssfdje4t. I'll attach the HTML report
> > next.
>
> Sorry, in some spring cleaning, I accidentally a) uploaded the wrong file b)
> deleted the one I meant to upload :-(
>
> Here's a reproduction of at least some of the problems, manually, though:
>
> https://app.box.com/s/fslrqtubceap5bqx7h48
Thanks for providing the log.
From log we can know that the problem is between dhcp client and server.
dhcp client(dhcpcd) could not get ip address so the testcase fail.
Could you help check following :
1. Try to use your personal phone as wifi hotspot, and modify jenkins testcase config file to connect to your phone. Then we can know if this issue is related to network environment.
2. Replace uploaded dhcpcd to /system/bin/ and dump the debug message if testcase fail
Thanks!
Flags: needinfo?(stephen.donner)
Comment 96•11 years ago
|
||
Push this file to /system/bin/ to get dhcpcd debug message
Flags: needinfo?(stephen.donner)
The above logcat is from the same run as the following HTML Report, which had two failures: 1 "AssertionError: WiFi was not connected via Settings app" and 1 "TimeoutException: TimeoutException: Timed out after 60.1 seconds" which timed out looking for "Connected".
Comment 100•11 years ago
|
||
test_settings_wifi.py testcase fail (Comment 37) on jenkins is because of gecko could not retrieve ip information from DHCP server.
After replacing gecko and libnetutils to get more debug information we found that libnetutils dump the following debug message:
"Timed out waiting for DHCP to finish" (in system/core/libnetutils/dhcp_utils.c::dhcp_do_request)
This debug message means that this issue is not related to gecko and it may cause by either dhcpcd (dhcp client) ,dhcp server or network environment. Also according to comment 92, it did have some problem between client and server.
I will keep checking this issue, but from current investigate result, this issue is neither a wifi regression nor a gecko bug so i will suggest to set this bug to non-blocking b2g.
Updated•11 years ago
|
blocking-b2g: 1.4+ → 1.4?
Comment 101•11 years ago
|
||
Removing the blocking flag as the conclusion on this bug indicates this is not a gaia/gecko bug. We'll keep investigating though.
blocking-b2g: 1.4? → ---
Component: Wifi → Gaia::UI Tests
This is from a manual run, where I try to connect to "ateam", but also connect a few times to "Mozilla Guest" with some issues with both SSIDs :(
(In reply to Stephen Donner [:stephend] from comment #102)
> Created attachment 8419111 [details]
> mywifi-woes_logcat.txt
>
> This is from a manual run, where I try to connect to "ateam", but also
> connect a few times to "Mozilla Guest" with some issues with both SSIDs :(
And here's the video capture that goes along with it: https://drive.google.com/file/d/0B_XMhxrTPIrWWnJTYy1EMVVLd3M/edit?usp=sharing
Dimi, does this help?
Comment 104•11 years ago
|
||
(In reply to Stephen Donner [:stephend] from comment #103)
> (In reply to Stephen Donner [:stephend] from comment #102)
> > Created attachment 8419111 [details]
> > mywifi-woes_logcat.txt
> >
> > This is from a manual run, where I try to connect to "ateam", but also
> > connect a few times to "Mozilla Guest" with some issues with both SSIDs :(
>
> And here's the video capture that goes along with it:
> https://drive.google.com/file/d/0B_XMhxrTPIrWWnJTYy1EMVVLd3M/edit?usp=sharing
>
> Dimi, does this help?
Yes, it is useful, I checked the timeline of viedo between
1:29(log time 20:25:44) - 1:41(log time 20:25:55) (waiting for ip address -> connected)
And following is the debug message of dhcpcd
05-07 20:25:43.709 990 990 E dhcpcd : version 5.2.10 starting
05-07 20:25:43.719 990 990 F dhcpcd : wlan0: using hwaddr e4:2d:02:38:d9:c4
05-07 20:25:43.779 990 990 E dhcpcd : wlan0: rebinding lease of 10.252.33.142
05-07 20:25:43.779 990 990 F dhcpcd : wlan0: sending REQUEST (xid 0x583817f), next in 4.99 seconds
05-07 20:25:48.789 990 990 F dhcpcd : wlan0: sending REQUEST (xid 0x583817f), next in 7.91 seconds
05-07 20:25:53.789 990 990 E dhcpcd : wlan0: broadcasting for a lease
05-07 20:25:53.789 990 990 F dhcpcd : wlan0: sending DISCOVER (xid 0xab7ef042), next in 3.24 seconds
05-07 20:25:54.619 990 990 E dhcpcd : wlan0: offered 10.252.120.213 from 10.252.75.6
05-07 20:25:54.619 990 990 F dhcpcd : wlan0: sending REQUEST (xid 0xab7ef042), next in 4.43 seconds
05-07 20:25:54.619 990 990 E dhcpcd : wlan0: ignoring offer of 10.252.120.213 from 10.252.75.7
05-07 20:25:54.679 990 990 E dhcpcd : wlan0: acknowledged 10.252.120.213 from 10.252.75.7
From the log message, we can see that dhcpcd wait for 11 sec then get server ack. I think it is related to the testcase fail we encountered, because if dhcpcd takes too long to get server ack, the test case will fail cause of timeout.
Updated•11 years ago
|
blocking-b2g: --- → backlog
Whiteboard: [fromAutomation][p=2], Priority 2,[ETA:4/30] → [fromAutomation][p=2], Priority 2
Updated•11 years ago
|
Priority: -- → P1
(In reply to Dimi Lee[:dimi][:dlee] from comment #104)
> And following is the debug message of dhcpcd
>
> 05-07 20:25:43.709 990 990 E dhcpcd : version 5.2.10 starting
> 05-07 20:25:43.719 990 990 F dhcpcd : wlan0: using hwaddr
> e4:2d:02:38:d9:c4
> 05-07 20:25:43.779 990 990 E dhcpcd : wlan0: rebinding lease of
> 10.252.33.142
> 05-07 20:25:43.779 990 990 F dhcpcd : wlan0: sending REQUEST (xid
> 0x583817f), next in 4.99 seconds
> 05-07 20:25:48.789 990 990 F dhcpcd : wlan0: sending REQUEST (xid
> 0x583817f), next in 7.91 seconds
> 05-07 20:25:53.789 990 990 E dhcpcd : wlan0: broadcasting for a lease
> 05-07 20:25:53.789 990 990 F dhcpcd : wlan0: sending DISCOVER (xid
> 0xab7ef042), next in 3.24 seconds
> 05-07 20:25:54.619 990 990 E dhcpcd : wlan0: offered 10.252.120.213
> from 10.252.75.6
> 05-07 20:25:54.619 990 990 F dhcpcd : wlan0: sending REQUEST (xid
> 0xab7ef042), next in 4.43 seconds
> 05-07 20:25:54.619 990 990 E dhcpcd : wlan0: ignoring offer of
> 10.252.120.213 from 10.252.75.7
> 05-07 20:25:54.679 990 990 E dhcpcd : wlan0: acknowledged
> 10.252.120.213 from 10.252.75.7
>
> From the log message, we can see that dhcpcd wait for 11 sec then get server
> ack. I think it is related to the testcase fail we encountered, because if
> dhcpcd takes too long to get server ack, the test case will fail cause of
> timeout.
James, is it possible to play with the default timeout value(s), specifically, the threshold for the time between the DHCPREQUEST and the DHCPACK? Is that value 10 seconds, currently? Thanks!
Flags: needinfo?(jbarnell)
Comment 106•11 years ago
|
||
Stephen, we're using the default values from ISC dhcpd, those are globals which would impact many portions of the environment. This looks like a problem similar to an ateam issue we're troubleshooting and I don't beleive its dhcp related. You should reasonably expect dhcp to work is less than 10 seconds on a normal exhcange. Can we match/link the logs above to the system start and wlan0 association events? If I were to guess the dhcp request is going out before association completes. Again this is 100% guess so take it with a grain of salt.
There are a couple of items I've read recently that indicate certain hardware combinations with android/linux work great with a manual connect, but not when started via rc.d. You're the best judge for that and I'll try to locate the links.
When can we coordinate a time to be in front of the devices? I'd like to see whats happening and turn up the log level on the wifi controllers. I also have a laptop with wireless packet capture to use at MTV2. I do need the captures as we may need to engage Juniper on this.
Flags: needinfo?(jbarnell)
Comment 107•11 years ago
|
||
Can you try setting up a static IP and see if this issue still happens? That way we will be sure it's related to the wireless/DHCP combo.
Also is "e4:2d:02:38:d9:c4" a good mac address to look at the logs on the wireless side? Or you have a better test device?
(In reply to Arzhel Younsi [:XioNoX] from comment #107)
> Can you try setting up a static IP and see if this issue still happens? That
> way we will be sure it's related to the wireless/DHCP combo.
> Also is "e4:2d:02:38:d9:c4" a good mac address to look at the logs on the
> wireless side? Or you have a better test device?
Not sure how we can set up a static IP on these Firefox OS devices; the problem doesn't seem to be in the dhcpd, though, now -- what we're now seeing is that the "ateam" node doesn't always appear after the wpa_supplicant's scan finds the rest of the SSIDs.
I've filed bug 1011358 for that. We're seeing it on both Hamachis and Flames, as well. I can get you the MAC address of the Flame tomorrow.
Depends on: 1011358
Comment 110•11 years ago
|
||
We have 2 failing tests in the latest v1.4 build, because of not being able to connect to wifi in time:
http://selenium.qa.mtv2.mozilla.com:8080/view/B2G%20Hamachi/job/b2g.hamachi.mozilla-b2g30_v1_4.ui.non-smoketest/44/HTML_Report/
http://selenium.qa.mtv2.mozilla.com:8080/view/B2G%20Hamachi/job/b2g.hamachi.mozilla-b2g30_v1_4.ui.smoketest/45/HTML_Report/
Gaia 17b102ee8d9a724b62285547cc5f1c5d30bfb01c
Gecko https://hg.mozilla.org/releases/mozilla-b2g30_v1_4/rev/95be84248033
BuildID 20140520000201
Version 30.0
ro.build.version.incremental=eng.tclxa.20131223.163538
ro.build.date=Mon Dec 23 16:36:04 CST 2013
Comment 111•11 years ago
|
||
Tests that are using wifi connection have been failing in each of the latest v1.4 builds.
This is the most recent build report: http://selenium.qa.mtv2.mozilla.com:8080/view/B2G%20Hamachi/job/b2g.hamachi.mozilla-b2g30_v1_4.ui.non-smoketest/49/HTML_Report/
http://selenium.qa.mtv2.mozilla.com:8080/job/b2g.hamachi.mozilla-b2g30_v1_4.ui.smoketest/50/HTML_Report/
Gaia 16f7ef17feb08562aa6a3cfd6d561b2fcca652c0
Gecko https://hg.mozilla.org/releases/mozilla-b2g30_v1_4/rev/9cc96d59cc30
BuildID 20140523000202
Version 30.0
ro.build.version.incremental=eng.tclxa.20131223.163538
ro.build.date=Mon Dec 23 16:36:04 CST 2013
Comment 112•11 years ago
|
||
I started an adhoc run of one of the test that uses wifi connection on v1.4, and it failed each time, with 3 different errors: no wireless networks are found on the list, ateam SSID is not found, or the connection to the ateam fails.
See report below:
http://selenium.qa.mtv2.mozilla.com:8080/view/B2G%20Hamachi/job/b2g.hamachi.mozilla-b2g30_v1_4.adhoc/10/HTML_Report/
I ran the same test locally several times and it passes each time.
Any updates on the wifi issues?
Flags: needinfo?(stephen.donner)
Flags: needinfo?(mozbugs.retornam)
Comment 113•11 years ago
|
||
(In reply to Viorela Ioia [:viorela] from comment #112)
> I started an adhoc run of one of the test that uses wifi connection on v1.4,
> and it failed each time, with 3 different errors: no wireless networks are
> found on the list, ateam SSID is not found, or the connection to the ateam
> fails.
> See report below:
> http://selenium.qa.mtv2.mozilla.com:8080/view/B2G%20Hamachi/job/b2g.hamachi.
> mozilla-b2g30_v1_4.adhoc/10/HTML_Report/
> I ran the same test locally several times and it passes each time.
>
> Any updates on the wifi issues?
The network engineer we are working with in the Mountain View office is out this week. We'll have to pick this up next week.
Flags: needinfo?(mozbugs.retornam)
We are going to potentially be having Wi-Fi issues (esp. on master) for a bit, as bug 1001897 and bug 1011358 are still pending, and, as Raymond says, James Barnell was out most of last week, and will be back to work on this with us next week.
Flags: needinfo?(stephen.donner)
Reporter | ||
Comment 115•11 years ago
|
||
We had several wifi related failures in today's test run.
Build info:
Gaia 8f4201a44676eb70926a3d2645d94bf92fcd6718
Gecko https://hg.mozilla.org/releases/mozilla-b2g30_v1_4/rev/4985bf79f425
BuildID 20140526000201
Version 30.0
ro.build.version.incremental=eng.tclxa.20131223.163538
ro.build.date=Mon Dec 23 16:36:04 CST 2013
http://selenium.qa.mtv2.mozilla.com:8080/view/B2G/job/b2g.hamachi.mozilla-b2g30_v1_4.ui.smoketest/56/HTML_Report/
http://selenium.qa.mtv2.mozilla.com:8080/view/B2G/job/b2g.hamachi.mozilla-b2g30_v1_4.ui.non-smoketest/55/HTML_Report/
Traceback (most recent call last):
File "/var/jenkins/workspace/b2g.hamachi.mozilla-b2g30_v1_4.ui.smoketest/.env/local/lib/python2.7/site-packages/marionette_client-0.7.5-py2.7.egg/marionette/marionette_test.py", line 145, in run
self.setUp()
File "/var/jenkins/workspace/b2g.hamachi.mozilla-b2g30_v1_4.ui.smoketest/tests/python/gaia-ui-tests/gaiatest/tests/functional/ftu/test_ftu_skip_tour.py", line 92, in setUp
self.wait_for_condition(lambda m: self.data_layer.is_wifi_enabled)
File "/var/jenkins/workspace/b2g.hamachi.mozilla-b2g30_v1_4.ui.smoketest/tests/python/gaia-ui-tests/gaiatest/gaia_test.py", line 1058, in wait_for_condition
Wait(self.marionette, timeout).until(method, message=message)
File "/var/jenkins/workspace/b2g.hamachi.mozilla-b2g30_v1_4.ui.smoketest/.env/local/lib/python2.7/site-packages/marionette_client-0.7.5-py2.7.egg/marionette/wait.py", line 122, in until
rv = condition(self.marionette)
File "/var/jenkins/workspace/b2g.hamachi.mozilla-b2g30_v1_4.ui.smoketest/tests/python/gaia-ui-tests/gaiatest/tests/functional/ftu/test_ftu_skip_tour.py", line 92, in <lambda>
self.wait_for_condition(lambda m: self.data_layer.is_wifi_enabled)
File "/var/jenkins/workspace/b2g.hamachi.mozilla-b2g30_v1_4.ui.smoketest/tests/python/gaia-ui-tests/gaiatest/gaia_test.py", line 254, in is_wifi_enabled
return self.marionette.execute_script("return window.navigator.mozWifiManager.enabled;")
File "/var/jenkins/workspace/b2g.hamachi.mozilla-b2g30_v1_4.ui.smoketest/.env/local/lib/python2.7/site-packages/marionette_client-0.7.5-py2.7.egg/marionette/marionette.py", line 1114, in execute_script
filename=os.path.basename(frame[0]))
File "/var/jenkins/workspace/b2g.hamachi.mozilla-b2g30_v1_4.ui.smoketest/.env/local/lib/python2.7/site-packages/marionette_client-0.7.5-py2.7.egg/marionette/marionette.py", line 613, in _send_message
self._handle_error(response)
File "/var/jenkins/workspace/b2g.hamachi.mozilla-b2g30_v1_4.ui.smoketest/.env/local/lib/python2.7/site-packages/marionette_client-0.7.5-py2.7.egg/marionette/marionette.py", line 648, in _handle_error
raise JavascriptException(message=message, status=status, stacktrace=stacktrace)
JavascriptException: JavascriptException: TypeError: window.navigator.mozWifiManager is null
stacktrace:
execute_script @gaia_test.py, line 254
inline javascript, line 1084
src: "let __marionetteFunc = function(){return window.navigator.mozWifiManager.enabled;};__marionetteFunc.apply(null, __marionetteParams);"
Traceback (most recent call last):
File "/var/jenkins/workspace/b2g.hamachi.mozilla-b2g30_v1_4.ui.non-smoketest/.env/local/lib/python2.7/site-packages/marionette_client-0.7.5-py2.7.egg/marionette/marionette_test.py", line 163, in run
testMethod()
File "/var/jenkins/workspace/b2g.hamachi.mozilla-b2g30_v1_4.ui.non-smoketest/tests/python/gaia-ui-tests/gaiatest/tests/functional/settings/test_settings_wifi.py", line 21, in test_connect_to_wifi_via_settings_app
wifi_settings.connect_to_network(self.testvars['wifi'])
File "/var/jenkins/workspace/b2g.hamachi.mozilla-b2g30_v1_4.ui.non-smoketest/tests/python/gaia-ui-tests/gaiatest/apps/settings/regions/wifi.py", line 50, in connect_to_network
lambda m: m.find_element(*self._connected_message_locator).text == "Connected")
File "/var/jenkins/workspace/b2g.hamachi.mozilla-b2g30_v1_4.ui.non-smoketest/.env/local/lib/python2.7/site-packages/marionette_client-0.7.5-py2.7.egg/marionette/wait.py", line 143, in until
cause=last_exc)
TimeoutException: TimeoutException: Timed out after 60.1016900539 seconds
Team, we will update you as we have progress; no need to continue posting more stacktraces - they just add to the length of the bug. We know there is an issue with, especially, the "ateam" Wi-Fi node, so we are looking at, now, understanding the difference between our new test "qalab" node and the problems with the "ateam" node.
Updated•10 years ago
|
Assignee: dlee → nobody
Comment 117•10 years ago
|
||
We haven't seen wifi issues for some time now.
I guess this is no longer an issue in our tests.
I am going to close this bug.
Thanks to all for the investigation!
Status: REOPENED → RESOLVED
Closed: 11 years ago → 10 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•10 years ago
|
blocking-b2g: backlog → ---
tracking-b2g:
--- → backlog
You need to log in
before you can comment on or make changes to this bug.
Description
•