Closed Bug 1035039 Opened 10 years ago Closed 9 years ago

[MTBF][Marionette] filedescriptor out of range in select()

Categories

(Firefox OS Graveyard :: MTBF, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: wachen, Unassigned)

http://mtbf-1:8080/job/flame.v200.mtbf/label=mtbf-2/8/consoleFull
Suspected reason: Marionette server/client issue (ADB is still alive, so we can still get a minidump)

07:35:26 INFO:Marionette:running webserver on http://10.247.24.112:48910/ serving content from /var/jenkins/workspace/flame.v200.mtbf@2/label/mtbf-2/.env/local/lib/python2.7/site-packages/marionette_client-0.7.10-py2.7.egg/marionette/www
07:35:26 Exception in thread Thread-1140:
07:35:26 Traceback (most recent call last):
07:35:26   File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
07:35:26     self.run()
07:35:26   File "/usr/lib/python2.7/threading.py", line 504, in run
07:35:26     self.__target(*self.__args, **self.__kwargs)
07:35:26   File "/usr/lib/python2.7/SocketServer.py", line 225, in serve_forever
07:35:26     r, w, e = select.select([self], [], [], poll_interval)
07:35:26 ValueError: filedescriptor out of range in select()

07:36:03 *Total MTBF Time: 250714.730s
07:36:03 INFO:mtbf_driver.mtbf:
07:36:03 MTBF TEST SUMMARY
07:36:03 -----------------
07:36:03 INFO:mtbf_driver.mtbf:passed: 6564
07:36:03 INFO:mtbf_driver.mtbf:failed: 492


Most of the failed test cases are due to test case instability, which should improve over time.
Hi, Malini,

Can you help with this bug?
Flags: needinfo?(mdas)
Hi Walter, I'll be away for some meetings this week.

Jgriffin, can you find someone to take a look at this in the meantime? In the worst case, I can take a look on Wednesday.
Flags: needinfo?(mdas) → needinfo?(jgriffin)
(In reply to Walter Chen[:ypwalter][:wachen] from comment #0)
> http://mtbf-1:8080/job/flame.v200.mtbf/label=mtbf-2/8/consoleFull

Unfortunately I can't access this url.  Can you attach the full console log?
Flags: needinfo?(jgriffin) → needinfo?(wachen)
http://mtbf-1:8080/job/flame.v200.mtbf/label=mtbf-2/30/consoleFull 
http://mtbf-1:8080/job/flame.v200.mtbf/label=mtbf-2/29/consoleFull 

Error getting log: [Errno 32] Broken pipe: Connection to Marionette server is lost. Check gecko.log (desktop firefox) or logcat (b2g) for errors.

Reproduced, but without a logcat; trying to reproduce again.
Thanks.  The part of the log this shows up in:

23:45:16 passed: 6
23:45:16 INFO:Marionette:passed: 6
23:45:16 failed: 0
23:45:16 INFO:Marionette:failed: 0
23:45:16 todo: 0
23:45:16 INFO:Marionette:todo: 0
23:45:16 running webserver on http://10.247.24.112:36302/ serving content from /var/jenkins/workspace/flame.v200.mtbf@2/label/mtbf-2/.env/local/lib/python2.7/site-packages/marionette_client-0.7.10-py2.7.egg/marionette/www
23:45:16 INFO:Marionette:running webserver on http://10.247.24.112:36302/ serving content from /var/jenkins/workspace/flame.v200.mtbf@2/label/mtbf-2/.env/local/lib/python2.7/site-packages/marionette_client-0.7.10-py2.7.egg/marionette/www
23:45:16 TEST-START test_dummy_case.py
23:45:16 INFO:Marionette:TEST-START test_dummy_case.py
23:45:16 Exception in thread Thread-1021:
23:45:16 Traceback (most recent call last):
23:45:16   File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
23:45:16     self.run()
23:45:16   File "/usr/lib/python2.7/threading.py", line 504, in run
23:45:16     self.__target(*self.__args, **self.__kwargs)
23:45:16   File "/usr/lib/python2.7/SocketServer.py", line 225, in serve_forever
23:45:16     r, w, e = select.select([self], [], [], poll_interval)
23:45:16 ValueError: filedescriptor out of range in select()

The tests are executed with:

09:57:27 ++ MOZ_IGNORE_NUWA_PROCESS=true
09:57:27 ++ MTBF_TIME=120h
09:57:27 ++ MTBF_CONF=conf/flame_master.json
09:57:27 ++ mtbf --address=localhost:2831 --testvars=testvars.json tests/mtbf/keyboard

The MTBF tests re-run a set of tests, and this error occurs between runs.  For each run the webserver is shut down and restarted; there may be a race condition when this happens.
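
For context on the ValueError in the traceback: CPython raises it when a descriptor number handed to select.select() is at or above FD_SETSIZE (commonly 1024 on Linux), so one plausible reading is that descriptors accumulate across runs until a new socket lands past that limit. The sketch below only illustrates that limit. It assumes Linux or another POSIX platform, uses Python 3 rather than the Python 2.7 shown in the log, and fd_is_selectable is a hypothetical helper, not part of Marionette or MTBF.

```python
import select
import socket


def fd_is_selectable(fd):
    """Return True if select.select() accepts this descriptor number.

    Hypothetical helper for illustration only.
    """
    try:
        # Zero timeout: we only care whether the call is accepted.
        select.select([fd], [], [], 0)
    except ValueError:
        # Raised when the number is out of range for select()
        # (at or above FD_SETSIZE), as in the traceback above.
        return False
    except OSError:
        # The number is in range but does not refer to an open descriptor.
        return True
    return True


if __name__ == "__main__":
    sock = socket.socket()
    print(sock.fileno(), fd_is_selectable(sock.fileno()))
    sock.close()
    # A number far above the usual limit, standing in for a descriptor
    # allocated after many earlier ones were leaked.
    print(1 << 20, fd_is_selectable(1 << 20))
```

If this is what is happening, checking the number of open descriptors of the Python process between runs (for example, the entries under /proc/<pid>/fd on Linux) should show steady growth.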
This probably occurs because MTBF isn't cleaning up the runner between runs, so the webserver instance from the previous run isn't shut down cleanly.

Can you try adding:

   self.runner.cleanup()

to the MTBF driver, here:

https://github.com/Mozilla-TWQA/MTBF-Driver/blob/master/mtbf_driver/mtbf.py#L154
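
As a rough illustration of what a clean shutdown between runs involves, here is a minimal sketch using only the Python 3 standard library. FixtureServer and run_repeatedly are hypothetical names made up for this example; this is not the Marionette runner's code and not necessarily what self.runner.cleanup() does internally. The point is that the serving loop is stopped, the thread is joined, and the listening socket is closed before the next run starts, so descriptor numbers do not climb.

```python
import threading
from http.server import HTTPServer, SimpleHTTPRequestHandler


class FixtureServer:
    """Hypothetical stand-in for a per-run test webserver."""

    def __init__(self, host="127.0.0.1"):
        # Port 0 asks the OS for any free port.
        self.httpd = HTTPServer((host, 0), SimpleHTTPRequestHandler)
        self.thread = threading.Thread(
            target=self.httpd.serve_forever,
            kwargs={"poll_interval": 0.05},
            daemon=True,
        )

    def start(self):
        self.thread.start()

    def cleanup(self):
        self.httpd.shutdown()      # blocks until the serve_forever loop exits
        self.thread.join()         # make sure the server thread has finished
        self.httpd.server_close()  # close the listening socket, freeing its fd


def run_repeatedly(runs):
    """Start and cleanly stop a server `runs` times; return the fds used."""
    filenos = []
    for _ in range(runs):
        server = FixtureServer()
        server.start()
        filenos.append(server.httpd.fileno())
        server.cleanup()
    return filenos


if __name__ == "__main__":
    print(run_repeatedly(5))
```

Without the cleanup() call in the loop, each iteration would leave a listening socket and a server thread behind.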
That's probably not the case. My Jenkins job always starts and flashes the new build. Even if I don't flash a new build, the phone will still restart.
(In reply to Walter Chen[:ypwalter][:wachen] from comment #8)
> That's probably not the case. My Jenkins job always starts and flashes the
> new build. Even if I don't flash a new build, the phone will still restart.

This is a cleanup that occurs between repeats of a particular test within the same test job; I didn't mean to imply it was happening between jobs in Jenkins.
Component: General → MTBF
We haven't seen this bug for a year, so it's WFM for now.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Resolution: FIXED → WORKSFORME