Bug 730726 (tegra-082)

tegra-082 problem tracking

RESOLVED FIXED

Status

P3
normal
RESOLVED FIXED
7 years ago
6 months ago

People

(Reporter: philor, Unassigned)

Details

(Whiteboard: [buildduty][capacity][buildslaves], URL)

Description

7 years ago
Of its last 300 jobs, 1 was purple, and 299 were retries.
stop_cp has been run against this tegra.

Updated

7 years ago
Assignee: nobody → server-ops-releng
Component: Release Engineering → Server Operations: RelEng
QA Contact: release → arich
Summary: Please disable tegra-082 and enroll it in tegra recovery → recover tegra-082
Is tegra-082 a frequent offender?  Does it need to be removed from prod?
Assignee: server-ops-releng → mlarrain
colo-trip: --- → mtv1
reimaged
Assignee: mlarrain → nobody
Component: Server Operations: RelEng → Release Engineering
QA Contact: arich → release
Priority: -- → P3
Whiteboard: [buildduty][capacity][buildslaves]
back in production
Alias: tegra-082
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Component: Release Engineering → Release Engineering: Machine Management
QA Contact: release → armenzg
Resolution: --- → FIXED
Summary: recover tegra-082 → tegra-082 problem tracking

Comment 5

7 years ago
PING CRITICAL
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Updated

7 years ago
Depends on: 750787
This recovered on its own.
Status: REOPENED → RESOLVED
Last Resolved: 7 years ago → 7 years ago
No longer depends on: 750787
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 778812

Updated

6 years ago
Status: REOPENED → RESOLVED
Last Resolved: 7 years ago → 6 years ago
Resolution: --- → FIXED
Needs recovery.
Status: RESOLVED → REOPENED
Depends on: 786315
Resolution: FIXED → ---
Back in production.
Status: REOPENED → RESOLVED
Last Resolved: 6 years ago → 6 years ago
Resolution: --- → FIXED
Offline, trying a PDU reboot.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Back in production.
Status: REOPENED → RESOLVED
Last Resolved: 6 years ago → 6 years ago
Resolution: --- → FIXED
Last job 9 days, 23:16:54 ago

error.flg [Remote Device Error: Unable to properly remove /mnt/sdcard/tests] 

remotely reformatted sdcard
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
This machine is actively taking jobs, and generally successful.
Status: REOPENED → RESOLVED
Last Resolved: 6 years ago → 6 years ago
Resolution: --- → FIXED
error.flg [Remote Device Error: Unable to properly remove /mnt/sdcard/tests] 

back in production
No jobs taken on this device for more than a week (but less than 3 weeks).
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(mass change: filter on tegraCallek02reboot2013)

I just rebooted this device, hoping that many of the ones I'm rebooting tonight come back automatically. I'll check back tomorrow to see whether it did; if it did not, I'll triage the next step manually on a per-device basis.

---
Command I used (with a manual patch to the fabric script to allow this command):

(fabric)[jwood@dev-master01 fabric]$  python manage_foopies.py -j15 -f devices.json `for i in 021 032 036 039 046  048 061 064 066 067 071 074 079 081 082 083 084 088 093 104 106 108 115 116 118 129 152 154 164 168 169 174 179 182 184 187 189 200 207 217 223 228 234 248 255 264 270 277 285 290 294 295 297 298 300 302 304 305 306 307 308 309 310 311 312 314 315 316 319 320 321 322 323 324 325 326 328 329 330 331 332 333 335 336 337 338 339 340 341 342 343 345 346 347 348 349 350 354 355 356 358 359 360 361 362 363 364 365 367 368 369; do echo '-D' tegra-$i; done` reboot_tegra

The command reboots the devices one at a time from the foopy each device is connected to, with one ssh connection per foopy.
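For readers who find the backtick loop hard to follow, here is a minimal Python sketch of the same argument expansion, plus a grouping step that illustrates the "one ssh connection per foopy" idea. The helper names `device_args` and `group_by_foopy` are hypothetical, and the assumed shape of `devices.json` (device name mapping to a dict with a `"foopy"` key) is an assumption, not something confirmed in this bug. It does not call or reproduce `manage_foopies.py` itself.

```python
from collections import defaultdict


def device_args(ids):
    """Expand zero-padded ids into the repeated '-D tegra-NNN' arguments."""
    args = []
    for dev_id in ids:
        args.extend(["-D", "tegra-" + dev_id])
    return args


def group_by_foopy(devices, names):
    """Group device names by foopy so each foopy needs only one connection.

    `devices` is assumed (hypothetically) to map a device name to a dict
    with a "foopy" key; the real devices.json layout may differ.
    """
    groups = defaultdict(list)
    for name in names:
        groups[devices[name]["foopy"]].append(name)
    return dict(groups)


# A short subset of the ids from the command above, for illustration only.
ids = ["021", "032", "082"]
cmd = (["python", "manage_foopies.py", "-j15", "-f", "devices.json"]
       + device_args(ids) + ["reboot_tegra"])
print(" ".join(cmd))
```

Building the argument list in a script rather than in an inline shell loop makes it easier to review which devices are in a mass reboot before it runs.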
The last few failures were the SD card failing to clean, and this one is as well --> replace the SD card.

Updated

6 years ago
Depends on: 838687
now taking jobs
Status: REOPENED → RESOLVED
Last Resolved: 6 years ago → 6 years ago
Resolution: --- → FIXED
Depends on: 877722
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Updated

6 years ago
Status: REOPENED → RESOLVED
Last Resolved: 6 years ago → 6 years ago
Resolution: --- → FIXED
Updated

5 years ago
Product: mozilla.org → Release Engineering
Wednesday, October 02, 2013 10:37:58 AM
Status: RESOLVED → REOPENED
Depends on: 944498
Resolution: FIXED → ---
SD card has been replaced and reimaged/flashed.
Back in production.
Status: REOPENED → RESOLVED
Last Resolved: 6 years ago → 5 years ago
Resolution: --- → FIXED
Perma-retrying; disabled.

python /builds/sut_tools/installApp.py 10.26.85.59 build/fennec-33.0a1.en-US.android-arm-armv6.apk org.mozilla.fennec
 in dir /builds/tegra-082/test/. (timeout 1200 secs)
 watching logfiles {}
 argv: ['python', '/builds/sut_tools/installApp.py', '10.26.85.59', u'build/fennec-33.0a1.en-US.android-arm-armv6.apk', 'org.mozilla.fennec']
 environment:
  HOME=/home/cltbld
  LOGNAME=cltbld
  OLDPWD=/home/cltbld
  PATH=/usr/local/bin:/usr/local/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/cltbld/bin
  PWD=/builds/tegra-082/test
  PYTHONPATH=/builds/sut_tools
  SHELL=/bin/sh
  SHLVL=4
  SUT_IP=10.26.85.59
  SUT_NAME=tegra-082
  USER=cltbld
  _=/tools/buildbot/bin/python2.7
 using PTY: False
06/13/2014 00:26:07: INFO: copying build/fennec/application.ini to build/talos/remoteapp.ini
06/13/2014 00:26:07: DEBUG: calling [cp build/fennec/application.ini build/talos/remoteapp.ini]
06/13/2014 00:26:07: DEBUG: cp: cannot create regular file `build/talos/remoteapp.ini': No such file or directory
06/13/2014 00:26:07: INFO: connecting to: 10.26.85.59
reconnecting socket
06/13/2014 00:26:07: INFO: devroot /mnt/sdcard/tests
06/13/2014 00:26:07: INFO: 10.26.84.15, 50082
06/13/2014 00:26:07: INFO: Current device time is 2014/06/13 00:26:06
06/13/2014 00:26:07: INFO: Setting device time to 2014/06/13 00:26:07
06/13/2014 00:26:07: INFO: Current device time is 2014/06/13 00:26:07
results: {'process': [['1001', '1216', 'com.android.phone'], ['10007', '1207', 'com.android.inputmethod.latin'], ['1000', '1020', 'system'], ['10029', '1321', 'com.android.deskclock'], ['10032', '1498', 'com.mozilla.SUTAgentAndroid'], ['10018', '1228', 'com.android.launcher'], ['10017', '1330', 'com.android.bluetooth'], ['10013', '1430', 'com.cooliris.media'], ['10009', '1407', 'com.android.quicksearchbox'], ['1000', '1234', 'com.android.settings'], ['10002', '1422', 'com.android.music'], ['10004', '1349', 'android.process.media'], ['10031', '1397', 'com.mozilla.watcher'], ['10006', '1376', 'com.android.mms'], ['10010', '1361', 'com.android.providers.calendar'], ['10014', '1338', 'com.android.email'], ['10015', '1260', 'android.process.acore']]}
results: {'memory': ['PA:835723264, FREE: 763985920']}
results: {'uptime': ['0 days 0 hours 59 minutes 26 seconds 4 ms']}
results: {'screen': ['X:1024 Y:768']}
06/13/2014 00:26:07: INFO: Installing /mnt/sdcard/tests/fennec-33.0a1.en-US.android-arm-armv6.apk
in push file with: build/fennec-33.0a1.en-US.android-arm-armv6.apk, and: /mnt/sdcard/tests/fennec-33.0a1.en-US.android-arm-armv6.apk
sending: push /mnt/sdcard/tests/fennec-33.0a1.en-US.android-arm-armv6.apk
push returned: 2fddef50e51bf4a332a32dcb443205d9
Push File Validated!
06/13/2014 00:26:24: INFO: /builds/tegra-082/test/../error.flg
Remote Device Error: updateApp() call failed - exiting
program finished with exit code 1
elapsedTime=47.628204
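When triaging logs like the one above, the two lines that matter are usually the "Remote Device Error" message and the exit code. The following is a rough sketch of pulling those out of a saved log; the function name `summarize` and the regular expressions are illustrative assumptions based only on the lines quoted here, not part of sut_tools.

```python
import re

# Patterns derived from the log lines quoted in this comment.
ERROR_RE = re.compile(r"Remote Device Error: (.*)")
EXIT_RE = re.compile(r"program finished with exit code (\d+)")


def summarize(log_text):
    """Return (list of device error messages, exit code or None)."""
    errors = [m.group(1).strip() for m in ERROR_RE.finditer(log_text)]
    match = EXIT_RE.search(log_text)
    exit_code = int(match.group(1)) if match else None
    return errors, exit_code


sample = (
    "Push File Validated!\n"
    "Remote Device Error: updateApp() call failed - exiting\n"
    "program finished with exit code 1\n"
)
print(summarize(sample))
```

The same pattern also matches the earlier `error.flg` messages about `/mnt/sdcard/tests`, so repeated SD card failures could be counted across several logs.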
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Updated

4 years ago
Depends on: 1031502

Comment 22

4 years ago
formatted SD card, flashed and reimaged tegra.

vle@vle-10516 ~ $ telnet tegra-082.tegra.releng.scl3.mozilla.com 20701
Trying 10.26.85.59...
Connected to tegra-082.tegra.releng.scl3.mozilla.com.
Escape character is '^]'.
$>^]
telnet> q
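The telnet session above only confirms that something is accepting connections on port 20701. A scripted version of that reachability check might look like the sketch below; `port_open` is a hypothetical helper, and it checks TCP connectivity only, not that the agent behind the port is healthy.

```python
import socket


def port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS lookup failures.
        return False


# Example call, using the host and port from the telnet session above:
# port_open("tegra-082.tegra.releng.scl3.mozilla.com", 20701)
```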
Updated

4 years ago
Status: REOPENED → RESOLVED
Last Resolved: 5 years ago → 4 years ago
QA Contact: armenzg → bugspam.Callek
Resolution: --- → FIXED

Updated

6 months ago
Product: Release Engineering → Infrastructure & Operations