Closed
Bug 1084614
Opened 9 years ago
Closed 8 years ago
ssltunnel intermittently fails to bind on Android 4.0
Categories
(Testing :: General, defect)
Tracking
(firefox39 fixed)
RESOLVED
FIXED
mozilla39
Tracking | Status | |
---|---|---|
firefox39 | --- | fixed |
People
(Reporter: gbrown, Assigned: gbrown)
References
Details
(Whiteboard: [ateam_harness_work])
Attachments
(2 files, 1 obsolete file)
1.07 KB,
patch
|
ted
:
review+
|
Details | Diff | Splinter Review |
4.99 KB,
patch
|
kmoir
:
review+
|
Details | Diff | Splinter Review |
Most (all?) mochitest and robocop test logs for Android 4.0 show something like: 12:46:52 INFO - runtests.py | Server pid: 20201 12:46:52 INFO - runtests.py | Websocket server pid: 20203 12:46:52 INFO - runtests.py | SSL tunnel pid: 20205 12:46:52 INFO - failed to bind socket 12:46:52 INFO - Shutting down... 12:46:53 INFO - runtests.py | Running tests: start. "failed to bind socket / Shutting down..." does not appear in Android 2.3 logs. I don't know that this is causing any test failures, but it seems wrong.
![]() |
Assignee | |
Comment 1•9 years ago
|
||
On Android 4.0, ssltunnel is run on a "foopy" which typically serves multiple pandaboards. Process listings from the foopy typically show 8 or more instances of ssltunnel running. I wonder if we are running into port conflicts. http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-central-android/1413468522/mozilla-central_panda_android_test-mochitest-1-bm101-tests1-panda-build632.txt.gz 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: calling [ps xU cltbld] 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: PID TTY STAT TIME COMMAND 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 4040 ? S 4:39 /builds/panda-0173/test/build/venv/bin/python /builds/panda-0173/test/build/tests/mochitest/pywebsocket_wrapper.py -p 9988 -w /builds/panda-0173/test/build/tests/mochitest -l /builds/panda-0173/test/build/tests/mochitest/websock.log --log-level=debug --allow-handlers-outside-root-dir 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 4042 ? Sl 0:08 /builds/panda-0173/test/build/hostutils/bin/ssltunnel /tmp/ssltunnelm9NB5D.cfg 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 4106 ? S 4:37 /builds/panda-0175/test/build/venv/bin/python /builds/panda-0175/test/build/tests/mochitest/pywebsocket_wrapper.py -p 9988 -w /builds/panda-0175/test/build/tests/mochitest -l /builds/panda-0175/test/build/tests/mochitest/websock.log --log-level=debug --allow-handlers-outside-root-dir 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 4108 ? Sl 0:07 /builds/panda-0175/test/build/hostutils/bin/ssltunnel /tmp/ssltunnel0km9nn.cfg 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 4253 ? Sl 0:01 /tools/buildbot/bin/python2.7 /tools/buildbot/bin/twistd --no_save --rundir=/builds/panda-0179 --pidfile=/builds/panda-0179/twistd.pid --python=/builds/panda-0179/buildbot.tac 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 4548 ? S 0:54 /builds/panda-0173/test/build/venv/bin/python /builds/panda-0173/test/build/tests/mochitest/pywebsocket_wrapper.py -p 9988 -w /builds/panda-0173/test/build/tests/mochitest -l /builds/panda-0173/test/build/tests/mochitest/websock.log --log-level=debug --allow-handlers-outside-root-dir 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 5945 ? S 13:18 /builds/panda-0171/test/build/venv/bin/python /builds/panda-0171/test/build/tests/mochitest/pywebsocket_wrapper.py -p 9988 -w /builds/panda-0171/test/build/tests/mochitest -l /builds/panda-0171/test/build/tests/mochitest/websock.log --log-level=debug --allow-handlers-outside-root-dir 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 5947 ? Sl 0:09 /builds/panda-0171/test/build/hostutils/bin/ssltunnel /tmp/ssltunnel0nP5lt.cfg 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 10279 ? S 4:49 /builds/panda-0177/test/build/venv/bin/python /builds/panda-0177/test/build/tests/mochitest/pywebsocket_wrapper.py -p 9988 -w /builds/panda-0177/test/build/tests/mochitest -l /builds/panda-0177/test/build/tests/mochitest/websock.log --log-level=debug --allow-handlers-outside-root-dir 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 10281 ? Sl 0:05 /builds/panda-0177/test/build/hostutils/bin/ssltunnel /tmp/ssltunnelKrCucv.cfg 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 13382 ? S 5:19 /builds/panda-0171/test/build/venv/bin/python /builds/panda-0171/test/build/tests/mochitest/pywebsocket_wrapper.py -p 9988 -w /builds/panda-0171/test/build/tests/mochitest -l /builds/panda-0171/test/build/tests/mochitest/websock.log --log-level=debug --allow-handlers-outside-root-dir 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 13956 ? S 1:54 /builds/panda-0171/test/build/venv/bin/python /builds/panda-0171/test/build/tests/mochitest/pywebsocket_wrapper.py -p 9988 -w /builds/panda-0171/test/build/tests/mochitest -l /builds/panda-0171/test/build/tests/mochitest/websock.log --log-level=debug --allow-handlers-outside-root-dir 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 14670 ? S 3:54 /builds/panda-0179/test/build/venv/bin/python /builds/panda-0179/test/build/tests/mochitest/pywebsocket_wrapper.py -p 9988 -w /builds/panda-0179/test/build/tests/mochitest -l /builds/panda-0179/test/build/tests/mochitest/websock.log --log-level=debug --allow-handlers-outside-root-dir 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 14672 ? Sl 0:05 /builds/panda-0179/test/build/hostutils/bin/ssltunnel /tmp/ssltunnel8v5sF8.cfg 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 15016 ? S 5:51 /builds/panda-0180/test/build/venv/bin/python /builds/panda-0180/test/build/tests/mochitest/pywebsocket_wrapper.py -p 9988 -w /builds/panda-0180/test/build/tests/mochitest -l /builds/panda-0180/test/build/tests/mochitest/websock.log --log-level=debug --allow-handlers-outside-root-dir 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 15018 ? Sl 0:09 /builds/panda-0180/test/build/hostutils/bin/ssltunnel /tmp/ssltunneltprySu.cfg 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 17989 ? S 0:32 /builds/panda-0176/test/build/venv/bin/python /builds/panda-0176/test/build/tests/mochitest/pywebsocket_wrapper.py -p 9988 -w /builds/panda-0176/test/build/tests/mochitest -l /builds/panda-0176/test/build/tests/mochitest/websock.log --log-level=debug --allow-handlers-outside-root-dir 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 17991 ? Sl 0:00 /builds/panda-0176/test/build/hostutils/bin/ssltunnel /tmp/ssltunnelYI55ST.cfg 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 21711 ? S 0:34 /builds/panda-0171/test/build/venv/bin/python /builds/panda-0171/test/build/tests/mochitest/pywebsocket_wrapper.py -p 9988 -w /builds/panda-0171/test/build/tests/mochitest -l /builds/panda-0171/test/build/tests/mochitest/websock.log --log-level=debug --allow-handlers-outside-root-dir 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 23684 ? Sl 0:04 /tools/buildbot/bin/python2.7 /tools/buildbot/bin/twistd --no_save --rundir=/builds/panda-0178 --pidfile=/builds/panda-0178/twistd.pid --python=/builds/panda-0178/buildbot.tac 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 23810 ? S 0:17 /tools/buildbot/bin/python scripts/scripts/android_panda.py --cfg android/android_panda_releng.py --jsreftest-suite jsreftest-1 --blob-upload-branch mozilla-inbound --download-symbols true 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 24799 ? S 0:01 python -u /builds/sut_tools/logcat.py 10.26.129.73 /builds/panda-0178/test/build/blobber_upload_dir/logcat.log -v time 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 24987 ? S 0:00 python -u /builds/panda-0178/test/build/tests/reftest/remotereftest.py --deviceIP=10.26.129.73 --xre-path=../hostutils/xre --utility-path=../hostutils/bin --app=org.mozilla.fennec --ignore-window-size --bootstrap --extra-profile-file=jsreftest/tests/user.js jsreftest/tests/jstests.list --http-port=30178 --ssl-port=31178 --symbols-path=https://ftp-ssl.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-inbound-android-debug/1413466183/fennec-36.0a1.en-US.android-arm.crashreporter-symbols.zip --total-chunks=3 --this-chunk=1 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 24997 ? Sl 1:57 ../hostutils/bin/xpcshell -g /builds/panda-0178/test/build/hostutils/xre -v 170 -f /builds/panda-0178/test/build/hostutils/bin/components/httpd.js -e const _PROFILE_PATH = '/tmp/tmpop4Qn5';const _SERVER_PORT = '30178'; const _SERVER_ADDR ='10.26.129.17'; -f /builds/panda-0178/test/build/tests/reftest/server.js 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 25631 ? S 0:32 /builds/panda-0180/test/build/venv/bin/python /builds/panda-0180/test/build/tests/mochitest/pywebsocket_wrapper.py -p 9988 -w /builds/panda-0180/test/build/tests/mochitest -l /builds/panda-0180/test/build/tests/mochitest/websock.log --log-level=debug --allow-handlers-outside-root-dir 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 25731 ? Sl 0:01 /tools/buildbot/bin/python2.7 /tools/buildbot/bin/twistd --no_save --rundir=/builds/panda-0174 --pidfile=/builds/panda-0174/twistd.pid --python=/builds/panda-0174/buildbot.tac 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 25736 ? Sl 0:01 /tools/buildbot/bin/python2.7 /tools/buildbot/bin/twistd --no_save --rundir=/builds/panda-0177 --pidfile=/builds/panda-0177/twistd.pid --python=/builds/panda-0177/buildbot.tac 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 26087 ? Sl 0:01 /tools/buildbot/bin/python2.7 /tools/buildbot/bin/twistd --no_save --rundir=/builds/panda-0171 --pidfile=/builds/panda-0171/twistd.pid --python=/builds/panda-0171/buildbot.tac 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 26090 ? Sl 0:01 /tools/buildbot/bin/python2.7 /tools/buildbot/bin/twistd --no_save --rundir=/builds/panda-0170 --pidfile=/builds/panda-0170/twistd.pid --python=/builds/panda-0170/buildbot.tac 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 26137 ? S 0:11 /tools/buildbot/bin/python scripts/scripts/android_panda.py --cfg android/android_panda_releng.py --crashtest-suite crashtest --blob-upload-branch mozilla-central --download-symbols ondemand 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 27685 ? S 0:11 /tools/buildbot/bin/python scripts/scripts/android_panda.py --cfg android/android_panda_releng.py --reftest-suite reftest-6 --blob-upload-branch try --download-symbols ondemand 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 27854 ? Ss 0:00 /bin/sh -c /builds/watch_devices.sh > /dev/null 2>&1 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 27856 ? S 0:00 /bin/bash -eu /builds/watch_devices.sh 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 27859 ? S 0:00 tee -a /builds/watcher.log 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 27869 ? Sl 0:03 /tools/buildbot/bin/python2.7 /tools/buildbot/bin/twistd --no_save --rundir=/builds/panda-0173 --pidfile=/builds/panda-0173/twistd.pid --python=/builds/panda-0173/buildbot.tac 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 27938 ? S 0:00 /bin/bash -eu /builds/watch_devices.sh panda-0181 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 28081 ? S 0:00 /bin/bash /builds/tools/buildfarm/mobile/manage_buildslave.sh gettac panda-0181 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 28093 ? S 0:00 python /builds/tools/buildfarm/mobile/../utils/retry.py --stderr-regexp ERROR 404: Not Found --fail-if-match wget -q -O/builds/panda-0181/buildbot.tac.new http://slavealloc.pvt.build.mozilla.org/gettac/panda-0181 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 28129 ? Sl 0:01 /tools/buildbot/bin/python2.7 /tools/buildbot/bin/twistd --no_save --rundir=/builds/panda-0180 --pidfile=/builds/panda-0180/twistd.pid --python=/builds/panda-0180/buildbot.tac 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 28132 ? Sl 0:01 /tools/buildbot/bin/python2.7 /tools/buildbot/bin/twistd --no_save --rundir=/builds/panda-0176 --pidfile=/builds/panda-0176/twistd.pid --python=/builds/panda-0176/buildbot.tac 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 28141 ? S 0:00 python -u /builds/sut_tools/logcat.py 10.26.129.65 /builds/panda-0170/test/build/blobber_upload_dir/logcat.log -v time 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 28160 ? S 0:00 python -u /builds/panda-0170/test/build/tests/reftest/remotereftest.py --deviceIP=10.26.129.65 --xre-path=../hostutils/xre --utility-path=../hostutils/bin --app=org.mozilla.fennec --ignore-window-size --bootstrap --http-port=30170 --ssl-port=31170 --symbols-path=https://ftp-ssl.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-central-android/1413468522/fennec-36.0a1.en-US.android-arm.crashreporter-symbols.zip reftest/tests/testing/crashtest/crashtests.list 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 28170 ? Sl 0:13 ../hostutils/bin/xpcshell -g /builds/panda-0170/test/build/hostutils/xre -v 170 -f /builds/panda-0170/test/build/hostutils/bin/components/httpd.js -e const _PROFILE_PATH = '/tmp/tmpyPA0RS';const _SERVER_PORT = '30170'; const _SERVER_ADDR ='10.26.129.17'; -f /builds/panda-0170/test/build/tests/reftest/server.js 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 28423 ? S 0:00 python -u /builds/sut_tools/logcat.py 10.26.129.74 /builds/panda-0179/test/build/blobber_upload_dir/logcat.log -v time 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 28441 ? R 0:06 /tools/buildbot/bin/python scripts/scripts/android_panda.py --cfg android/android_panda_releng.py --mochitest-suite mochitest-1 --blob-upload-branch mozilla-central --download-symbols ondemand 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 28449 ? S 0:00 python -u /builds/panda-0179/test/build/tests/reftest/remotereftest.py --deviceIP=10.26.129.74 --xre-path=../hostutils/xre --utility-path=../hostutils/bin --app=org.mozilla.fennec --ignore-window-size --bootstrap --http-port=30179 --ssl-port=31179 --symbols-path=https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/try-builds/nsilva@mozilla.com-efa0352ac458/try-android/fennec-36.0a1.en-US.android-arm.crashreporter-symbols.zip reftest/tests/layout/reftests/reftest.list --total-chunks=8 --this-chunk=6 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 28460 ? Sl 0:03 ../hostutils/bin/xpcshell -g /builds/panda-0179/test/build/hostutils/xre -v 170 -f /builds/panda-0179/test/build/hostutils/bin/components/httpd.js -e const _PROFILE_PATH = '/tmp/tmpFl0F1G';const _SERVER_PORT = '30179'; const _SERVER_ADDR ='10.26.129.17'; -f /builds/panda-0179/test/build/tests/reftest/server.js 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 28839 ? R 0:00 /tools/buildbot/bin/python /builds/sut_tools/verify.py 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 28844 ? R 0:00 ps xU cltbld 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 28909 ? S 2:13 /builds/panda-0178/test/build/venv/bin/python /builds/panda-0178/test/build/tests/mochitest/pywebsocket_wrapper.py -p 9988 -w /builds/panda-0178/test/build/tests/mochitest -l /builds/panda-0178/test/build/tests/mochitest/websock.log --log-level=debug --allow-handlers-outside-root-dir 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 28911 ? Sl 0:04 /builds/panda-0178/test/build/hostutils/bin/ssltunnel /tmp/ssltunnel24ePhU.cfg 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 29375 ? Sl 0:03 /tools/buildbot/bin/python2.7 /tools/buildbot/bin/twistd --no_save --rundir=/builds/panda-0175 --pidfile=/builds/panda-0175/twistd.pid --python=/builds/panda-0175/buildbot.tac 08:21:15 INFO - 10/16/2014 08:21:15: DEBUG: 30534 ? S 0:48 /builds/panda-0180/test/build/venv/bin/python /builds/panda-0180/test/build/tests/mochitest/pywebsocket_wrapper.py -p 9988 -w /builds/panda-0180/test/build/tests/mochitest -l /builds/panda-0180/test/build/tests/mochitest/websock.log --log-level=debug --allow-handlers-outside-root-dir
Comment 2•9 years ago
|
||
Probably the main reason that this isn't causing test failures is that https support in mochitests is still relatively new. The port conflict thing seems less likely when you consider that the exact same process is used to select the HTTP server port. Maybe landing a patch that runs netstat -p alongside the ps command might help inform that a little better.
![]() |
Assignee | |
Comment 3•8 years ago
|
||
This seems to be fairly rare now, but still happens. With ps -ef and netstat -p logged just before starting ssltunnel... With "failed to bind": http://ftp.mozilla.org/pub/mozilla.org/mobile/try-builds/gbrown@mozilla.com-c583b58fbb11/try-android-api-11/try_panda_android_test-mochitest-2-bm89-tests1-panda-build67.txt.gz Successful run (no "failed to bind"): http://ftp.mozilla.org/pub/mozilla.org/mobile/try-builds/gbrown@mozilla.com-c583b58fbb11/try-android-api-11/try_panda_android_test-mochitest-1-bm89-tests1-panda-build91.txt.gz
![]() |
Assignee | |
Comment 4•8 years ago
|
||
It looks like ssltunnel is failing at http://hg.mozilla.org/mozilla-central/annotate/2209a8de5d23/testing/mochitest/ssltunnel/ssltunnel.cpp#l964. I *assume* it is trying to bind to the port passed to runtestsremote.py. For the failed case: 12:15:12 INFO - Running command: ['python', '-u', '/builds/panda-0329/test/build/tests/mochitest/runtestsremote.py', ... '--http-port=30329', '--ssl-port=31329', ... which should be set by http://hg.mozilla.org/build/mozharness/annotate/08cd66cbfacd/scripts/android_panda.py#l436, which looks OK to me -- on device panda-XXX, it uses --ssl-port=31XXX. In the failed log, I do not see any reason to believe 31329 is already in use.
![]() |
Assignee | |
Comment 5•8 years ago
|
||
Another failure -- same port, 31329: http://ftp.mozilla.org/pub/mozilla.org/mobile/try-builds/gbrown@mozilla.com-f96e618a84af/try-android-api-11/try_panda_android_test-robocop-3-bm89-tests1-panda-build62.txt.gz Another failure, port 31499: http://ftp.mozilla.org/pub/mozilla.org/mobile/try-builds/gbrown@mozilla.com-f96e618a84af/try-android-api-11/try_panda_android_test-robocop-4-bm102-tests1-panda-build124.txt.gz
Comment 6•8 years ago
|
||
As an aside, we should probably stick the port number in the error message here: https://dxr.mozilla.org/mozilla-central/source/testing/mochitest/ssltunnel/ssltunnel.cpp#963
![]() |
Assignee | |
Comment 7•8 years ago
|
||
Assignee: nobody → gbrown
Attachment #8568611 -
Flags: review?(ted)
![]() |
Assignee | |
Comment 8•8 years ago
|
||
:kmoir -- Can you see if there is something running on the foopy for panda-0329 that is using port 31329? I'm hoping "netstat -p" from the root account will show something.
Flags: needinfo?(kmoir)
Comment 9•8 years ago
|
||
[root@foopy58.p3.releng.scl3.mozilla.com ~]# netstat -tulpn Active Internet connections (only servers) Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name tcp 0 0 0.0.0.0:30322 0.0.0.0:* LISTEN 30436/xpcshell tcp 0 0 0.0.0.0:30324 0.0.0.0:* LISTEN 1418/xpcshell tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 4982/sshd tcp 0 0 0.0.0.0:31320 0.0.0.0:* LISTEN 1958/ssltunnel tcp 0 0 0.0.0.0:25 0.0.0.0:* LISTEN 5088/master tcp 0 0 0.0.0.0:31322 0.0.0.0:* LISTEN 30440/ssltunnel tcp 0 0 0.0.0.0:31324 0.0.0.0:* LISTEN 1430/ssltunnel tcp 0 0 0.0.0.0:30332 0.0.0.0:* LISTEN 30507/../hostutils/ tcp 0 0 0.0.0.0:31328 0.0.0.0:* LISTEN 26060/ssltunnel tcp 0 0 0.0.0.0:31329 0.0.0.0:* LISTEN 27032/ssltunnel [root@foopy58.p3.releng.scl3.mozilla.com ~]# ps -ef | grep 27032 root 2159 32678 0 09:28 pts/0 00:00:00 grep 27032 cltbld 27032 1 0 Feb22 ? 00:00:01 /builds/panda-0329/test/build/hostutils/bin/ssltunnel /tmp/ssltunnelEs06qr.cfg perhaps the ssltunnel session process isn't getting killed between test suites?
Flags: needinfo?(kmoir)
![]() |
Assignee | |
Comment 10•8 years ago
|
||
Yes, I think that's it exactly. Now that I know what I'm looking for, I see similar, long-running ssltunnel's in the process list for the other failure I found, http://ftp.mozilla.org/pub/mozilla.org/mobile/try-builds/gbrown@mozilla.com-f96e618a84af/try-android-api-11/try_panda_android_test-robocop-4-bm102-tests1-panda-build124.txt.gz: 18:46:45 INFO - cltbld 18068 1 0 Feb15 ? 00:00:05 /builds/panda-0499/test/build/hostutils/bin/ssltunnel /tmp/ssltunnel5_RTRH.cfg 18:46:45 INFO - cltbld 18139 1 0 Feb15 ? 00:00:05 /builds/panda-0503/test/build/hostutils/bin/ssltunnel /tmp/ssltunnelAQ2Pnt.cfg We need to make sure ssltunnel is always killed when a test is done with it.
![]() |
Assignee | |
Updated•8 years ago
|
Summary: ssltunnel seems to fail on Android 4.0 → ssltunnel intermittently fails to bind on Android 4.0
Updated•8 years ago
|
Attachment #8568611 -
Flags: review?(ted) → review+
![]() |
Assignee | |
Comment 11•8 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/0f9069a68fa3
Keywords: leave-open
![]() |
Assignee | |
Comment 13•8 years ago
|
||
One possibility that I am concerned about is when android_panda.py times out, like this: http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-inbound-android-api-11/1425579369/mozilla-inbound_panda_android_test-mochitest-1-bm89-tests1-panda-build162.txt.gz command timed out: 2400 seconds without output running ['/tools/buildbot/bin/python', 'scripts/scripts/android_panda.py', '--cfg', 'android/android_panda_releng.py', '--mochitest-suite', 'mochitest-1', '--blob-upload-branch', 'mozilla-inbound', '--download-symbols', 'ondemand'], attempting to kill process killed by signal 9 program finished with exit code -1 elapsedTime=2460.474057 ========= Finished '/tools/buildbot/bin/python scripts/scripts/android_panda.py ...' failed (results: 2, elapsed: 41 mins, 0 secs) (at 2015-03-05 11:50:34.291248) ========= Did android_panda.py start runtestsremote.py? Did runtestsremote.py start ssltunnel? Were all processes killed, or were some orphaned? :kmoir -- Can you check the foopy for panda-0381 and see if the foopy has any running processes that were started between 2015-03-05 11:10 and 2015-03-05 11:50?
Flags: needinfo?(kmoir)
Comment 14•8 years ago
|
||
No, not from today but there are some orphaned processes from earlier before the patch was applied. Perhaps they should be killed on all the foopies since they seem to be there from before. [root@foopy62.p4.releng.scl3.mozilla.com ~]# ps -ef | grep ssltunnel cltbld 1059 789 0 18:01 ? 00:00:00 /builds/panda-0627/test/build/hostutils/bin/ssltunnel /tmp/ssltunnelbJa31a.cfg cltbld 3844 3802 0 18:11 ? 00:00:00 [ssltunnel] <defunct> root 4610 4554 0 18:18 pts/0 00:00:00 grep ssltunnel cltbld 12500 1 0 Feb23 ? 00:00:05 /builds/panda-0375/test/build/hostutils/bin/ssltunnel /tmp/ssltunnelV8PU5b.cfg cltbld 24954 1 0 Feb15 ? 00:00:10 /builds/panda-0380/test/build/hostutils/bin/ssltunnel /tmp/ssltunnelBK0B1x.cfg cltbld 32749 1 0 Feb19 ? 00:00:08 /builds/panda-0374/test/build/hostutils/bin/ssltunnel /tmp/ssltunnelQcmx_x.cfg
Flags: needinfo?(kmoir)
![]() |
Assignee | |
Comment 15•8 years ago
|
||
Thanks Kim. Yes, the ssltunnel orphans should be cleaned up, but I still don't understand how they are being orphaned and assume we'll just get more.
Whiteboard: [ateam_harness_work]
![]() |
Assignee | |
Comment 16•8 years ago
|
||
It looks to me like the harness normally kills ssltunnel just fine. I'm not sure of exception cases, like when buildbot kills a mozharness script that exceeds its timeout -- but Comment 14 suggests that ssltunnel is cleaned up even in those cases. Still, ssltunnel orphans are happening somehow. I suppose it is always a risk. I am tempted to fix this by having the harness kill orphaned ssltunnels: before runtests.py starts ssltunnel, check for and kill running ssltunnel instances with ppid=1, or something like that. (It would be nice to only kill the ssltunnel using the needed port, but that would require something like parsing netstat output or parsing the temporary config files...seems error-prone.) Any better suggestions?
Comment 17•8 years ago
|
||
We do something similar when running them on the emulators. http://hg.mozilla.org/build/mozharness/file/bc2b3ab7c9c6/scripts/android_emulator_unittest.py#l572 I agree that killing the process seems like the right approach
![]() |
Assignee | |
Comment 18•8 years ago
|
||
This patch updates the android_panda.py mozharness script to kill orphan ssltunnel processes before starting tests. At first I was going to make this change in run_tests.py. That seemed right because that's where we start ssltunnel. On the other hand, that sort of process checking and cleanup seemed out of scope for run_tests.py and this problem seems to be specific to the panda jobs (because of the lifetime of the foopy, I suppose) -- why clutter run_tests.py and run this check on all platforms if it only happens on pandas? So I updated the android_panda.py script instead; does that seem reasonable? Here it is in action: http://ftp.mozilla.org/pub/mozilla.org/mobile/try-builds/gbrown@mozilla.com-2cc28ee1bb03/try-android-api-11/try_panda_android_test-crashtest-bm100-tests1-panda-build282.txt.gz 06:56:42 INFO - Running main action method: run_test 06:56:42 INFO - Running command: ['ps', '-f', '-C', 'ssltunnel', '--no-headers'] 06:56:42 INFO - Copy/paste: ps -f -C ssltunnel --no-headers 06:56:42 INFO - cltbld 3544 1 0 Mar05 ? 00:00:05 /builds/panda-0611/test/build/hostutils/bin/ssltunnel /tmp/ssltunnel6w5cry.cfg 06:56:42 INFO - cltbld 8319 1 0 Feb15 ? 00:00:20 /builds/panda-0102/test/build/hostutils/bin/ssltunnel /tmp/ssltunnelVZLKf7.cfg 06:56:42 INFO - cltbld 13811 1 0 Feb22 ? 00:00:13 /builds/panda-0096/test/build/hostutils/bin/ssltunnel /tmp/ssltunnelbMTNbB.cfg 06:56:42 INFO - cltbld 15367 1 0 Feb28 ? 00:00:11 /builds/panda-0104/test/build/hostutils/bin/ssltunnel /tmp/ssltunnelVSzmEr.cfg 06:56:42 INFO - cltbld 31981 1 0 Mar09 ? 00:00:04 /builds/panda-0106/test/build/hostutils/bin/ssltunnel /tmp/ssltunnel2cfbEH.cfg 06:56:42 INFO - cltbld 32105 1 0 Mar09 ? 00:00:04 /builds/panda-0103/test/build/hostutils/bin/ssltunnel /tmp/ssltunnelVSAw5K.cfg 06:56:42 INFO - Return code: 0 06:56:42 INFO - Running command: ['ps', '-o', 'pid,ppid', '-C', 'ssltunnel', '--no-headers'] 06:56:42 INFO - Copy/paste: ps -o pid,ppid -C ssltunnel --no-headers 06:56:42 INFO - killing orphan with pid 3544 06:56:42 INFO - killing orphan with pid 8319 06:56:42 INFO - killing orphan with pid 13811 06:56:42 INFO - killing orphan with pid 15367 06:56:42 INFO - killing orphan with pid 31981 06:56:42 INFO - killing orphan with pid 32105 06:56:42 INFO - Return code: 0 06:56:42 INFO - mkdir: /builds/panda-0104/test/build/blobber_upload_dir
Attachment #8578051 -
Flags: review?(kmoir)
![]() |
Assignee | |
Comment 19•8 years ago
|
||
I meant to mention: I noticed some un-used code just where I wanted to update; I deleted it.
![]() |
Assignee | |
Comment 20•8 years ago
|
||
Comment on attachment 8578051 [details] [diff] [review] kill orphan ssltunnel before starting test on panda It looks like this reports an error when there are no ssltunnel processes running. That's annoying. http://ftp.mozilla.org/pub/mozilla.org/mobile/try-builds/gbrown@mozilla.com-2cc28ee1bb03/try-android-api-11/try_panda_android_test-crashtest-bm100-tests1-panda-build283.txt.gz 08:35:06 INFO - Running command: ['ps', '-f', '-C', 'ssltunnel', '--no-headers'] 08:35:06 INFO - Copy/paste: ps -f -C ssltunnel --no-headers 08:35:06 ERROR - Return code: 1 08:35:06 INFO - Running command: ['ps', '-o', 'pid,ppid', '-C', 'ssltunnel', '--no-headers'] 08:35:06 INFO - Copy/paste: ps -o pid,ppid -C ssltunnel --no-headers 08:35:06 ERROR - Return code: 1
Attachment #8578051 -
Flags: review?(kmoir)
![]() |
Assignee | |
Comment 21•8 years ago
|
||
Bug 915212 and bug 919920 convinced me that there is at least a chance that a similar problem sometimes affects mochitest and reftest jobs on other platforms -- so I changed back to implementing this in the mochitest and reftest harnesses. This patch makes two passes through ps output: The first to log any processes containing 'ssltunnel' or 'xpcshell' on the command line; the second to kill processes named 'ssltunnel' or 'xpcshell' with ppid=1. https://treeherder.mozilla.org/#/jobs?repo=try&revision=2f5cd88ed407 shows this in action. For most platforms, logs just show: 18:02:45 INFO - Checking for orphan ssltunnel processes... 18:02:45 INFO - Checking for orphan xpcshell processes... 18:02:46 INFO - SUITE-START | Running 1225 tests On Android 4.0, logs often find matching ssltunnel orphans: 18:08:03 INFO - Checking for orphan ssltunnel processes... 18:08:03 INFO - cltbld 5948 1 0 Mar05 ? 00:00:06 /builds/panda-0485/test/build/hostutils/bin/ssltunnel /tmp/ssltunnelKhCPKH.cfg 18:08:03 INFO - cltbld 14154 1 0 Feb23 ? 00:00:15 /builds/panda-0494/test/build/hostutils/bin/ssltunnel /tmp/ssltunnel59p8io.cfg 18:08:03 INFO - cltbld 14510 1 0 Feb12 ? 00:00:18 /builds/panda-0488/test/build/hostutils/bin/ssltunnel /tmp/ssltunnelQbAO3V.cfg 18:08:03 INFO - cltbld 15468 1 0 Feb16 ? 00:00:15 /builds/panda-0487/test/build/hostutils/bin/ssltunnel /tmp/ssltunnelwgVbzn.cfg 18:08:03 INFO - cltbld 16292 16250 0 18:05 ? 00:00:00 [ssltunnel] <defunct> 18:08:03 INFO - cltbld 26349 1 0 Mar11 ? 00:00:04 /builds/panda-0486/test/build/hostutils/bin/ssltunnel /tmp/ssltunnel1QCSd4.cfg 18:08:03 INFO - cltbld 29292 1 0 Mar05 ? 00:00:06 /builds/panda-0491/test/build/hostutils/bin/ssltunnel /tmp/ssltunnel9NXWNn.cfg 18:08:03 INFO - killing ssltunnel orphan with pid 5948 18:08:03 INFO - killing ssltunnel orphan with pid 14154 18:08:03 INFO - killing ssltunnel orphan with pid 14510 18:08:03 INFO - killing ssltunnel orphan with pid 15468 18:08:03 INFO - killing ssltunnel orphan with pid 26349 18:08:03 INFO - killing ssltunnel orphan with pid 29292 18:08:03 INFO - Checking for orphan xpcshell processes... 18:08:03 INFO - cltbld 12358 12348 1 17:39 ? 00:00:19 ../hostutils/bin/xpcshell -g /builds/panda-0488/test/build/hostutils/xre -v 170 -f /builds/panda-0488/test/build/hostutils/bin/components/httpd.js -e const _PROFILE_PATH = '/tmp/tmpfR6Dl3';const _SERVER_PORT = '30488'; const _SERVER_ADDR ='10.26.132.21'; -f /builds/panda-0488/test/build/tests/reftest/server.js 18:08:03 INFO - cltbld 15375 15365 3 17:56 ? 00:00:21 ../hostutils/bin/xpcshell -g /builds/panda-0483/test/build/hostutils/xre -v 170 -f /builds/panda-0483/test/build/hostutils/bin/components/httpd.js -e const _PROFILE_PATH = '/tmp/tmpdNS9JU';const _SERVER_PORT = '30483'; const _SERVER_ADDR ='10.26.132.21'; -f /builds/panda-0483/test/build/tests/reftest/server.js 18:08:03 INFO - cltbld 16286 16250 3 18:05 ? 00:00:04 /builds/panda-0491/test/build/hostutils/bin/xpcshell -g /builds/panda-0491/test/build/hostutils/xre -v 170 -f /builds/panda-0491/test/build/hostutils/bin/components/httpd.js -e const _PROFILE_PATH = '/tmp/tmpSL7q9X.mozrunner'; const _SERVER_PORT = '30491'; const _SERVER_ADDR = '10.26.132.21'; const _TEST_PREFIX = undefined; const _DISPLAY_RESULTS = false; -f /builds/panda-0491/test/build/tests/mochitest/server.js 18:08:03 INFO - SUITE-START | Running 467 tests
Attachment #8578051 -
Attachment is obsolete: true
Attachment #8579086 -
Flags: review?(kmoir)
Updated•8 years ago
|
Attachment #8579086 -
Flags: review?(kmoir) → review+
![]() |
Assignee | |
Comment 22•8 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/abddc93fd8c4
Keywords: leave-open
https://hg.mozilla.org/mozilla-central/rev/abddc93fd8c4
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox39:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
You need to log in
before you can comment on or make changes to this bug.
Description
•