Closed Bug 754900 Opened 12 years ago Closed 12 years ago

Make updateSUT.py not fail on its first run

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task, P3)

x86
macOS

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: armenzg, Unassigned)

Details

(Whiteboard: [tegra][mobile][testing])

Attachments

(1 file)

I tried as much as possible to make updateSUT.py not fail on its first run, or at least to produce output when it fails (I even added sys.stdout.flush() calls to help with this).

I don't know why we get a "signal 15" after 16 seconds, or what returns the "-1" exit code, since updateSUT.py itself does not do that.

I would like to figure out how to have more output when things fail and hopefully discover how to prevent the failure.

Perhaps we have to add an explicit reboot step?
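
One way to get more output when the script is killed is to trap signal 15 (SIGTERM) and write a final line before exiting. The sketch below is only an illustration: install_sigterm_logger is a hypothetical helper, not something that exists in updateSUT.py or sut_lib.py, and it only helps if the process receives SIGTERM rather than an uncatchable SIGKILL.

```python
import signal
import sys
import time


def install_sigterm_logger(stream=sys.stdout):
    """Hypothetical helper: log and flush when SIGTERM (signal 15) arrives."""
    start = time.time()

    def handler(signum, frame):
        # Record which signal arrived and how long the script had been running.
        elapsed = time.time() - start
        stream.write("caught signal %d after %.1f s, exiting\n" % (signum, elapsed))
        stream.flush()
        # Exit with an explicit, recognizable code instead of dying silently.
        sys.exit(1)

    signal.signal(signal.SIGTERM, handler)
    return handler
```

Calling install_sigterm_logger() once near the top of the script would be enough; the handler has to be installed from the main thread.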

For reference, this is the file:
https://hg.mozilla.org/build/tools/file/16fc4f354b44/sut_tools/updateSUT.py
and this is the step:
dm.sendCMD(['updt com.mozilla.SUTAgentAndroid /mnt/sdcard/%s' % apkfilename])

Maybe the -1 comes from sut_lib.py?

python updateSUT.py 10.250.50.46
 in dir /builds/tegra-136/test/build (timeout 1200 secs)
 watching logfiles {}
 argv: ['python', 'updateSUT.py', '10.250.50.46']
 environment:
  PATH=/opt/local/bin:/opt/local/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin
  PWD=/builds/tegra-136/test/build
  SUT_IP=10.250.50.46
  SUT_NAME=tegra-136
  __CF_USER_TEXT_ENCODING=0x1F5:0:0
 closing stdin
 using PTY: False
process killed by signal 15
program finished with exit code -1
elapsedTime=16.007517
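
To separate what the script itself returns from what its parent reports, a small harness can launch a command, optionally send it SIGTERM, and describe how it ended. On POSIX, Python's subprocess module reports a child killed by signal N as return code -N, so a SIGTERM death shows up as -15, not -1. That suggests, without confirming it, that the "-1" in the log above is assigned by whatever launched the step rather than by updateSUT.py. The run_and_describe helper is hypothetical and is not part of the sut_tools code.

```python
import signal
import subprocess
import sys


def run_and_describe(argv, kill_after=None):
    """Run argv; optionally send SIGTERM after kill_after seconds; report the outcome."""
    proc = subprocess.Popen(argv)
    if kill_after is not None:
        try:
            proc.wait(timeout=kill_after)
        except subprocess.TimeoutExpired:
            # Still running after the grace period: terminate it with signal 15.
            proc.send_signal(signal.SIGTERM)
    rc = proc.wait()
    if rc < 0:
        # On POSIX a negative return code is the negated signal number.
        return "killed by signal %d" % -rc
    return "exited with code %d" % rc
```

For example, run_and_describe([sys.executable, "updateSUT.py", "10.250.50.46"]) would show whether the script exits on its own or is terminated from outside.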
Shouldn't this be an ateam bug? It's the updt command that is failing, not the buildbot step.
I am not taking this bug, but I wanted to see if this patch inspires anyone.
oh... I don't know...
I am not sure where the problem is; I run updateSUT locally and it works over and over. Maybe there is something quirky about how it is run via buildbot.
Something similar happens with cleanup.py, where my sys.stdout.flush() calls are useless:
process killed by signal 15
program finished with exit code -1
elapsedTime=593.846133

No output. Just signal 15 and -1 exit code.
Component: Release Engineering → Release Engineering: Platform Support
Priority: -- → P3
QA Contact: release → coop
Whiteboard: [tegra][mobile][testing]
This is fixed by running updateSUT from verify.py, and by only ever updating SUTAgent through a cascading deploy that takes the tegras out of production first, so that any tegras that take a while to come back up after updating do not fail the job they were assigned.
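
As an illustration of the "take a while to come back up" part, a deploy script could poll the device until it accepts TCP connections again before returning it to production. This is a minimal sketch under assumptions: wait_for_port is a hypothetical helper, the port to probe is left as a parameter because this bug does not state it, and nothing here is taken from verify.py.

```python
import socket
import time


def wait_for_port(host, port, timeout=300.0, interval=5.0):
    """Poll until a TCP connection to (host, port) succeeds or timeout elapses."""
    deadline = time.time() + timeout
    while True:
        try:
            # A successful connect is treated as "the device is back up".
            conn = socket.create_connection((host, port), timeout=interval)
            conn.close()
            return True
        except OSError:
            # Refused or timed out: give up once the deadline has passed.
            if time.time() >= deadline:
                return False
            time.sleep(interval)
```

A device that never answers within the timeout returns False, which the caller could use to keep it out of production rather than let it fail a job.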
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard