Intermittent "Automation error: Error receiving data from socket (possible reboot). cmd={'cmd': 'ps'}; err=[Errno 54] Connection reset by peer" in Android tests

RESOLVED WORKSFORME

Status

defect
7 years ago
6 years ago

People

(Reporter: philor, Unassigned)

Tracking

({intermittent-failure})

Firefox Tracking Flags

(Not tracked)

https://tbpl.mozilla.org/php/getParsedLog.php?id=14352945&tree=Mozilla-Inbound
Android Tegra 250 mozilla-inbound opt test reftest-2 on 2012-08-13 16:28:11 PDT for push b441413e4c2d
slave: tegra-182

FIRE PROC: '"MOZ_CRASHREPORTER=1,XPCOM_DEBUG_BREAK=stack,MOZ_CRASHREPORTER_NO_REPORT=1,NO_EM_RESTART=1,MOZ_PROCESS_LOG=/tmp/tmpEIJrvbpidlog,XPCOM_MEM_BLOAT_LOG=/tmp/tmplL6OI8/runreftest_leaks.log" org.mozilla.fennec -no-remote -profile /mnt/sdcard/tests/reftest/profile/'
INFO | automation.py | Application pid: 1462
DeviceManager: error pulling file '/mnt/sdcard/tests/reftest/reftest.log': No such file or directory
DeviceManager: error pulling file '/mnt/sdcard/tests/reftest/reftest.log': No such file or directory
DeviceManager: error pulling file '/mnt/sdcard/tests/reftest/reftest.log': No such file or directory
DeviceManager: error pulling file '/mnt/sdcard/tests/reftest/reftest.log': No such file or directory
DeviceManager: error pulling file '/mnt/sdcard/tests/reftest/reftest.log': No such file or directory
Automation error: Error receiving data from socket (possible reboot). cmd={'cmd': 'ps'}; err=[Errno 54] Connection reset by peer
reconnecting socket
DeviceManager: error pulling file '/mnt/sdcard/tests/reftest/reftest.log': No such file or directory

INFO | automation.py | Application ran for: 0:01:43.844409
========= Started 'python reftest/remotereftest.py ...' warnings (results: 1, elapsed: 2 mins, 1 secs) (at 2012-08-13 16:43:44.185353) =========
python reftest/remotereftest.py --deviceIP 10.250.50.92 --xre-path ../hostutils/xre --utility-path ../hostutils/bin --app org.mozilla.fennec --http-port 30182 --ssl-port 31182 --pidfile /builds/tegra-182/test/../remotereftest.pid --enable-privilege --bootstrap --total-chunks 3 --this-chunk 2 reftest/tests/layout/reftests/reftest.list --symbols-path=http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-inbound-android/1344892063/fennec-17.0a1.en-US.android-arm.crashreporter-symbols.zip

....

INFO | remotereftests.py | Server pid: 39176
{'uptime': ['0 days 0 hours 3 minutes 13 seconds 21 ms'], 'power': ['Power status:', ' AC power ONLINE', ' Battery charge NO BATTERY', ' Remaining charge: 0%', ' Battery Temperature: 0.0 (c)'], 'process': [['10031', '1355', 'com.mozilla.SUTAgentAndroid'], ['10018', '1154', 'com.android.launcher'], ['10013', '1398', 'com.cooliris.media'], ['10004', '1291', 'android.process.media'], ['10009', '1374', 'com.android.quicksearchbox'], ['10002', '1388', 'com.android.music'], ['10032', '1366', 'com.mozilla.watcher'], ['10007', '1139', 'com.android.inputmethod.latin'], ['1000', '1020', 'system'], ['1001', '1149', 'com.android.phone'], ['10006', '1335', 'com.android.mms'], ['10010', '1319', 'com.android.providers.calendar'], ['10014', '1302', 'com.android.email'], ['10017', '1281', 'com.android.bluetooth'], ['10029', '1270', 'com.android.deskclock'], ['10015', '1186', 'android.process.acore'], ['1000', '1164', 'com.android.settings']], 'screen': ['X:1600 Y:1200'], 'memory': ['PA:835268608, FREE: 758198272'], 'systime': ['2012/08/13 04:43:48:537'], 'rotation': ['ROTATION:0'], 'disk': [], 'os': ['harmony-eng 2.2 FRF91 20110202.102810 test-keys'], 'id': ['00:21:e8:70:95:60'], 'uptimemillis': ['193050']}

...

Automation error: Error receiving data from socket (possible reboot). cmd={'cmd': 'ps'}; err=[Errno 54] Connection reset by peer
reconnecting socket
DeviceManager: error pulling file '/mnt/sdcard/tests/reftest/reftest.log': No such file or directory

...

{'uptime': ['0 days 0 hours 0 minutes 43 seconds 383 ms'], 'power': ['Power status:', ' AC power ONLINE', ' Battery charge NO BATTERY', ' Remaining charge: 0%', ' Battery Temperature: 0.0 (c)'], 'process': [['10031', '1420', 'com.mozilla.SUTAgentAndroid'], ['10018', '1239', 'com.android.launcher'], ['10013', '1463', 'com.cooliris.media'], ['10009', '1440', 'com.android.quicksearchbox'], ['10002', '1455', 'com.android.music'], ['10032', '1429', 'com.mozilla.watcher'], ['10007', '1214', 'com.android.inputmethod.latin'], ['1000', '1019', 'system'], ['1001', '1228', 'com.android.phone'], ['10004', '1333', 'android.process.media'], ['10006', '1399', 'com.android.mms'], ['10010', '1384', 'com.android.providers.calendar'], ['10014', '1369', 'com.android.email'], ['10017', '1361', 'com.android.bluetooth'], ['10029', '1349', 'com.android.deskclock'], ['10015', '1272', 'android.process.acore'], ['1000', '1241', 'com.android.settings']], 'screen': ['X:1600 Y:1200'], 'memory': ['PA:834838528, FREE: 757678080'], 'systime': ['1970/01/01 12:00:59:825'], 'rotation': ['ROTATION:0'], 'disk': [], 'os': ['harmony-eng 2.2 FRF91 20110202.102810 test-keys'], 'id': ['00:21:e8:70:95:60'], 'uptimemillis': ['43411']}
program finished with exit code 0
elapsedTime=121.491821
TinderboxPrint: reftest-2<br/><em class="testfail">T-FAIL</em>
========= Finished 'python reftest/remotereftest.py ...' warnings (results: 1, elapsed: 2 mins, 1 secs) (at 2012-08-13 16:45:45.717000) =========
So yeah, this looks like either a bad tegra or the reboot issue coming back somehow.
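For what it's worth, the two device-info dumps in the log back up the reboot theory: uptimemillis drops from 193050 to 43411 and systime falls back to 1970. A minimal sketch of that check, assuming the dumps are parsed into dicts shaped like the ones printed above (the function name rebooted_between is made up for illustration, not part of DeviceManager):

```python
def rebooted_between(info_before, info_after):
    """Return True if device uptime went backwards between two info dumps.

    Hypothetical helper; it only assumes the dict layout printed in the log,
    where each value is a list of strings.
    """
    before_ms = int(info_before["uptimemillis"][0])
    after_ms = int(info_after["uptimemillis"][0])
    # Uptime can only shrink if the device restarted in between.
    return after_ms < before_ms


# Values copied from the two dumps in the log for tegra-182.
first_dump = {"uptimemillis": ["193050"], "systime": ["2012/08/13 04:43:48:537"]}
second_dump = {"uptimemillis": ["43411"], "systime": ["1970/01/01 12:00:59:825"]}

print(rebooted_between(first_dump, second_dump))  # True
```

This only compares uptime; the clock reset to 1970 is a second hint, but it is not checked here.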
https://tbpl.mozilla.org/php/getParsedLog.php?id=14355748&tree=Services-Central
slave: tegra-182

/me pushes another 20 chips onto "bad tegra"
Depends on: tegra-182
This really looks like a bad tegra to me. I say we apply a three-strikes-and-you're-out rule.
https://tbpl.mozilla.org/php/getParsedLog.php?id=14716774&tree=Mozilla-Inbound
tegra-349

Not that anything in the 300s in any way says anything other than "bad tegra" but at least it's not 182 again.
Only one of them had this message in the log, but three of the last three in bug 689856 are also tegra-349.
Depends on: tegra-349
https://tbpl.mozilla.org/php/getParsedLog.php?id=14845531&tree=Ionmonkey
tegra-182

Interesting that (by the terrible standards of tegras) 182 and 349 actually have reasonable enough success rates.
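A rough tally of the three-strikes idea suggested earlier, counting only the four logs linked in this bug. The function name and the threshold parameter are assumptions for illustration, not an existing releng tool:

```python
from collections import Counter


def slaves_over_limit(reports, strikes=3):
    """Return the slaves that appear at least `strikes` times in `reports`."""
    counts = Counter(reports)
    return sorted(name for name, hits in counts.items() if hits >= strikes)


# One entry per log linked in this bug, in the order reported.
reports = ["tegra-182", "tegra-182", "tegra-349", "tegra-182"]

print(slaves_over_limit(reports))  # ['tegra-182']
```

This ignores the tegra-349 failures filed under bug 689856, so it understates that slave's count.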
Whiteboard: [orange]
Resolving WFM keyword:intermittent-failure bugs last modified >3 months ago, whose whiteboard contains none of:
{random,disabled,marked,fuzzy,todo,fails,failing,annotated,time-bomb,leave open}

There will inevitably be some false positives; for that (and the bugspam) I apologise. Filter on orangewfm.
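The selection rule above can be sketched roughly as follows. This is a guess at the logic, not the actual script that was run: the function name, the case-insensitive substring match, and the months_since_modified argument are all assumptions.

```python
# Whiteboard words that exempt a bug from the bulk resolve, as listed above.
EXEMPT_WORDS = (
    "random", "disabled", "marked", "fuzzy", "todo", "fails",
    "failing", "annotated", "time-bomb", "leave open",
)


def should_resolve_wfm(keywords, whiteboard, months_since_modified):
    """Return True if a bug matches the bulk WORKSFORME criteria (hypothetical)."""
    if "intermittent-failure" not in keywords:
        return False
    if months_since_modified <= 3:
        return False
    board = whiteboard.lower()
    # Assumed: a plain substring match against the whiteboard text.
    return not any(word in board for word in EXEMPT_WORDS)


# This bug's whiteboard is "[orange]"; the month count is a placeholder value.
print(should_resolve_wfm(["intermittent-failure"], "[orange]", 4))  # True
```

A whiteboard such as "[orange][leave open]" would return False under the same assumptions.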
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WORKSFORME
Product: mozilla.org → Release Engineering