Closed
Bug 734221
Opened 13 years ago
Closed 13 years ago
update sutagent to version 1.07
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: jmaher, Assigned: armenzg)
References
Details
Attachments
(2 files, 5 obsolete files)
5.22 KB,
patch
|
bear
:
review+
jmaher
:
review+
armenzg
:
checked-in+
|
Details | Diff | Splinter Review |
5.64 KB,
patch
|
bear
:
review+
jmaher
:
review+
|
Details | Diff | Splinter Review |
While debugging why the robocop tests were having trouble, I found that sutagent was causing the problem. We have an older version of sutagent (it reports version 1.0) on the tegras and we need the newest version (1.07). This will also allow us to run xpcshell unit tests on the tegras. Double win!
In order to update a sutagent, we need to grab a sutagent from a recent build:
* wget http://ftp.mozilla.org/pub/mozilla.org/mobile/nightly/latest-mozilla-central-android-r7/fennec-13.0a1.en-US.android-arm.tests.zip
* unzip fennec-13.0a1.en-US.android-arm.tests.zip
* push bin/sutAgentAndroid.apk /mnt/sdcard/sutAgentAndroid1.07.apk
* in telnet to current sutagent, run:
** updt com.mozilla.SUTAgentAndroid /mnt/sdcard/sutAgentAndroid1.07.apk
** NOTE: the session will immediately terminate **
* sleep a minute
* reconnect
* in telnet to *new* sutagent, run:
** ver
** verify output == '1.07'
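The last step of the list above (run `ver` and compare against 1.07) can be sketched in Python. This is only an illustration, not the script used in this bug: `parse_sut_version` and `needs_update` are hypothetical helper names, and the banner format `SUTAgentAndroid Version 1.07` is assumed from the agent output quoted in comment 17. The sketch targets Python 3.

```python
import re

TARGET_VERSION = "1.07"  # the version this bug wants on the tegras


def parse_sut_version(ver_output):
    """Extract the version number from the agent's reply to `ver`.

    Assumes the first line looks like 'SUTAgentAndroid Version 1.07'.
    Returns None when no version can be found.
    """
    lines = ver_output.strip().splitlines()
    first_line = lines[0] if lines else ""
    match = re.search(r"Version\s+(\d+(?:\.\d+)*)", first_line)
    return match.group(1) if match else None


def needs_update(ver_output, target=TARGET_VERSION):
    """True when the reported version differs from the target or is unreadable."""
    return parse_sut_version(ver_output) != target
```

Treating an unreadable reply as "needs update" is a design choice of this sketch; a caller could just as well treat it as an error.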
Assignee | ||
Updated•13 years ago
|
Assignee: nobody → armenzg
Component: Release Engineering → Release Engineering: Machine Management
Priority: -- → P1
QA Contact: release → armenzg
Assignee | ||
Comment 1•13 years ago
|
||
This script helps me do the work:
http://people.mozilla.org/~jmaher/sutagent/updateSUT.py
Thanks jmaher!
Assignee | ||
Comment 2•13 years ago
|
||
jmaher has been guiding me and I am making more progress.
I am updating the code to use devicemanagerSUT.py from m-c/mobile [1].
I believe this file comes inside of the talos.zip we create.
I noticed that something weird was happening when calling this:
version = dm2.verifySendCMD(['ver'], newline=False).split('\n')[0]
which would block in [1]:
> temp = self._sock.recv(1024)
[1] http://hg.mozilla.org/mozilla-central/file/bfb1b7520ce9/build/mobile/devicemanagerSUT.py#l243
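One generic way to keep a `recv()` call like the one above from hanging forever is to put a timeout on the socket. The sketch below is not a patch to devicemanagerSUT.py and does not claim to explain why the call blocked here; `recv_with_timeout` is a hypothetical helper, written for Python 3.

```python
import socket


def recv_with_timeout(sock, timeout_secs, bufsize=1024):
    """Read up to bufsize bytes, or return None if nothing arrives in time."""
    previous = sock.gettimeout()
    sock.settimeout(timeout_secs)
    try:
        return sock.recv(bufsize)
    except socket.timeout:
        # No data within timeout_secs; let the caller decide what to do.
        return None
    finally:
        # Restore whatever timeout the socket had before.
        sock.settimeout(previous)
```

Returning `None` instead of raising keeps the calling loop simple, at the cost of the caller having to tell "no data yet" apart from an empty read on a closed connection (which returns `b""`).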
Assignee | ||
Comment 3•13 years ago
|
||
It does not need to know where the apk is since the script updateSUT.py knows that.
This probably won't be the final location of updateSUT.py.
We probably need another modification for unit tests.
Assignee | ||
Comment 4•13 years ago
|
||
Assignee | ||
Comment 5•13 years ago
|
||
It seems that time.sleep(90), or the updt command followed by the reboot, is not well liked by buildbot.
Flushing the Python output would probably have made this log more meaningful.
python updateSUT.py 10.250.49.9
in dir /builds/tegra-022/test/../talos-data/talos/mozdevice (timeout 1200 secs)
watching logfiles {}
argv: ['python', 'updateSUT.py', '10.250.49.9']
environment:
PATH=/opt/local/bin:/opt/local/sbin:/opt/local/Library/Frameworks/Python.framework/Versions/2.6/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin
PWD=/builds/tegra-022/talos-data/talos/mozdevice
SUT_IP=10.250.49.9
SUT_NAME=tegra-022
__CF_USER_TEXT_ENCODING=0x1F6:0:0
closing stdin
using PTY: False
process killed by signal 15
program finished with exit code -1
elapsedTime=11.060346
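The log above contains none of the script's own messages, which is what buffered output looks like when a process is killed. A minimal sketch of flushing after every message follows; `log` is a hypothetical helper, not something from updateSUT.py, and the code targets Python 3.

```python
import sys


def log(message, stream=None):
    """Write one line and flush at once, so a killed process still leaves its output."""
    stream = stream if stream is not None else sys.stdout
    stream.write(message + "\n")
    stream.flush()
```

Running the interpreter with `python -u` gives a similar effect without touching the script.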
Assignee | ||
Comment 6•13 years ago
|
||
I am trying to look for a place to download the older sut agent so I can downgrade the staging slaves.
Any suggestions on where I could find it?
I am still trying to figure out how to deal with the problem we hit in comment 5.
Assignee | ||
Comment 7•13 years ago
|
||
BTW I could also have a downtime and forget about making the update of the SUT agent a standard procedure.
Assignee | ||
Comment 8•13 years ago
|
||
Attachment #605026 -
Attachment is obsolete: true
Attachment #605439 -
Flags: review?(jmaher)
Attachment #605439 -
Flags: review?(bear)
Reporter | ||
Comment 9•13 years ago
|
||
Comment on attachment 605439 [details] [diff] [review]
[tools/sut_tools] script to download latest sut agent if tegra is running older version
Review of attachment 605439 [details] [diff] [review]:
-----------------------------------------------------------------
this looks good, I think there is enough looping logic in here to avoid any looping logic in any other code that calls this.
::: sut_tools/updateSUT.py
@@ +52,5 @@
> + (target_version, version)
> + sys.exit(1)
> + print "INFO: updateSUT.py: We're now running %s" % version
> + sys.exit(0)
> + except:
when do we get in this except clause? is that during dm2.sendCMD() ?
Attachment #605439 -
Flags: review?(jmaher) → review+
Assignee | ||
Comment 10•13 years ago
|
||
I am going to wrap up bug 735260 in here.
I am doing a last run on staging for both unit tests and talos.
Attachment #605439 -
Attachment is obsolete: true
Attachment #605548 -
Flags: review?(jmaher)
Attachment #605548 -
Flags: review?(bear)
Attachment #605439 -
Flags: review?(bear)
Assignee | ||
Comment 12•13 years ago
|
||
Attachment #605025 -
Attachment is obsolete: true
Attachment #605549 -
Flags: review?(jmaher)
Attachment #605549 -
Flags: review?(bear)
Reporter | ||
Comment 13•13 years ago
|
||
Comment on attachment 605549 [details] [diff] [review]
[buildbotcustom] steps to update the SUT agent if it is needed
Review of attachment 605549 [details] [diff] [review]:
-----------------------------------------------------------------
just need some adjustments on the file locations.
::: process/factory.py
@@ +7208,5 @@
> ))
>
> def addTearDownSteps(self):
> + self.addStep(DownloadFile(
> + url='http://build.mozilla.org/talos/mobile/devicemanager.py',
this should be: http://hg.mozilla.org/mozilla-central/file/tip/build/mobile/devicemanager.py
@@ +7213,5 @@
> + workdir='.',
> + description="Download devicemanager.py",
> + ))
> + self.addStep(DownloadFile(
> + url='http://build.mozilla.org/talos/mobile/devicemanagerSUT.py',
http://hg.mozilla.org/mozilla-central/file/tip/build/mobile/devicemanagerSUT.py
@@ +7218,5 @@
> + workdir='.',
> + description="Download devicemanagerSUT.py",
> + ))
> + self.addStep(DownloadFile(
> + url='http://build.mozilla.org/talos/mobile/updateSUT.py',
I suspect updateSUT.py will live in tools/sut_tools/
Attachment #605549 -
Flags: review?(jmaher) → review+
Reporter | ||
Comment 14•13 years ago
|
||
Comment on attachment 605548 [details] [diff] [review]
[tools] updateSUT.py and clientproxy.py changes to call it
Review of attachment 605548 [details] [diff] [review]:
-----------------------------------------------------------------
r- for the use of sys.argv[1] in the function. Otherwise this is looking pretty good with a couple nits.
::: sut_tools/updateSUT.py
@@ +22,5 @@
> + data = f.read()
> + f.close()
> + dm.sendCMD(['push /mnt/sdcard/%s %s\r\n' % (apkfile, str(len(data))), data], newline=False)
> + dm.debug = 5
> + dm.sendCMD(['ls /mnt/sdcard'])
this seems unnecessary?
@@ +25,5 @@
> + dm.debug = 5
> + dm.sendCMD(['ls /mnt/sdcard'])
> + dm.sendCMD(['updt com.mozilla.SUTAgentAndroid /mnt/sdcard/%s' % apkfile])
> + # XXX devicemanager.py might need to close the sockets so we won't need these 2 steps
> + dm._sock.close()
add a:
    if dm._sock:
        dm._sock.close()
@@ +60,5 @@
> + print "INFO: updateSUT.py: We're going to sleep for 90 seconds"
> + time.sleep(90)
> +
> + print "INFO: updateSUT.py: Connecting to: " + sys.argv[1]
> + return devicemanager.DeviceManagerSUT(sys.argv[1])
I don't like sys.argv[1] being used in a function. I would rather assign this to a global variable or pass it in.
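The "pass it in" option could look roughly like the sketch below. It is an assumption-laden illustration rather than the patch that landed: `device_factory` is a hypothetical stand-in for `devicemanager.DeviceManagerSUT`, and the code is written for Python 3.

```python
def connect(device_ip, device_factory):
    """Build a device manager for device_ip; the caller decides where the IP comes from."""
    print("INFO: updateSUT.py: Connecting to: " + device_ip)
    return device_factory(device_ip)


def main(argv, device_factory):
    """Parse the command line once, here, and pass values down explicitly."""
    if len(argv) != 2:
        print("usage: updateSUT.py <device ip>")
        return 1
    connect(argv[1], device_factory)
    return 0
```

A script entry point would then call something like `sys.exit(main(sys.argv, devicemanager.DeviceManagerSUT))`, and tests can hand in a fake factory instead of a real device.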
Attachment #605548 -
Flags: review?(jmaher) → review-
Comment 15•13 years ago
|
||
Comment on attachment 605548 [details] [diff] [review]
[tools] updateSUT.py and clientproxy.py changes to call it
+def main():
+ if (len(sys.argv) <> 2):
this is a holdover from hal's changing of my code to be more easily testable - we should have fixed it then to pass in to main the appropriate parameters.
+ download_apk()
+ f = open(apkfile, 'rb')
I think this would be better (more robust to errors) if apkfile is returned from download_apk() - then you can use its value as a sanity check that the file exists before trying to do an open() on it.
+ while tries < 5:
+ try:
+ dm = connect(sleep=90)
+ break
+ except:
+ tries += 1
+ print "WARNING: updateSUT.py: We have tried to connect %s time(s) after trying to update." % tries
+
+ ver = version(dm)
I would move the "ver = version(dm)" line to inside of the try: block - this will let you use ver as your signal that the reconnect worked and also let you avoid having to error trap/check that dm is valid when calling version()
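A sketch of that suggestion, with the version query inside the `try:` block so that a returned version string is the signal that the reconnect worked. `reconnect_and_get_version` is a hypothetical name, `connect` and `get_version` are supplied by the caller, and the code is written for Python 3 rather than the Python 2 of the patch.

```python
import time


def reconnect_and_get_version(connect, get_version, max_tries=5,
                              wait_secs=90, sleep=time.sleep):
    """Return the agent version once a reconnect succeeds, or None after max_tries.

    connect() and get_version(dm) come from the caller; either may raise.
    """
    for attempt in range(1, max_tries + 1):
        sleep(wait_secs)  # give the device time to come back after the update
        try:
            dm = connect()
            return get_version(dm)  # a version string means the reconnect worked
        except Exception as exc:
            print("WARNING: updateSUT.py: attempt %d of %d failed: %s"
                  % (attempt, max_tries, exc))
    return None
```

Because `dm` is only used inside the `try:` block, there is no need to check separately whether it is valid before asking for the version.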
+def version(dm):
+ ver = dm.sendCMD(['ver']).split("\n")[0]
+ print "INFO: updateSUT.py: We're running %s" % ver
+ return ver
I feel you should wrap the call to dm.sendCMD() in a try block or check that dm is valid before making the call
+def download_apk():
+ url = 'http://build.mozilla.org/talos/mobile/sutAgentAndroid.%s.apk' % target_version
+ print "INFO: updateSUT.py: We're downloading the apk: %s" % url
+ req = urllib2.Request(url)
+ f = urllib2.urlopen(req)
+ local_file = open(apkfile, 'wb')
+ local_file.write(f.read())
+ local_file.close()
since we are writing to disk whatever we receive from the url call, any html error page would end up being written to the apk filename and we would not know what is wrong until someone thinks to look at the contents.
check the return of urllib2.urlopen() for a status code and ensure it's 200 at least before writing to the file.
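A sketch of the status check, combined with returning the file path from the download function as suggested earlier in this review. It uses Python 3's `urllib.request` instead of the patch's `urllib2`; the `opener` parameter is an assumption added here so the function can be exercised without a network.

```python
import urllib.request


def download_apk(url, dest_path, opener=urllib.request.urlopen):
    """Download url to dest_path and return dest_path, or None on a non-200 reply.

    Nothing is written to disk unless the server answered 200.
    """
    response = opener(url)
    status = response.getcode()
    if status != 200:
        print("ERROR: updateSUT.py: got HTTP %s for %s" % (status, url))
        return None
    data = response.read()
    with open(dest_path, "wb") as local_file:
        local_file.write(data)
    return dest_path
```

Note that `urlopen` normally raises `HTTPError` for 4xx and 5xx responses, so a caller would probably also want to catch that exception.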
other than the nits above this looks really good - you're getting the hang of this \o/
Attachment #605548 -
Flags: review?(bear) → review-
Updated•13 years ago
|
Attachment #605549 -
Flags: review?(bear) → review+
Assignee | ||
Comment 16•13 years ago
|
||
(In reply to Mike Taylor [:bear] from comment #15)
> +def version(dm):
> + ver = dm.sendCMD(['ver']).split("\n")[0]
> + print "INFO: updateSUT.py: We're running %s" % ver
> + return ver
>
> I feel you should wrap the call to dm.sendCMD() in a try block or check that
> dm is valid before making the call
We don't do that for the other dm.sendCMD() calls we do.
I have seen that sendCMD() will throw an exception if dm is not valid.
I filed a separate bug for dm to indicate that it did not initialize correctly (see bug 735451).
Assignee | ||
Comment 17•13 years ago
|
||
The output shows that the script manages to get the slave back after the 4th failed attempt.
I have also run the script against boards that were already upgraded.
foopy06:bug734221 cltbld$ python updateSUT.py 10.250.49.3
INFO: updateSUT.py: Connecting to: 10.250.49.3
reconnecting socket
INFO: updateSUT.py: We're running SUTAgentAndroid Version 1.00
INFO: updateSUT.py: We're going to try to install SUTAgentAndroid Version 1.07
INFO: updateSUT.py: We're downloading the apk: http://build.mozilla.org/talos/mobile/sutAgentAndroid.1.07.apk
send cmd: updt com.mozilla.SUTAgentAndroid /mnt/sdcard/sutAgentAndroid.apk
recv'ing...
response: exit
$>
INFO: updateSUT.py: We're going to sleep for 90 seconds
^@INFO: updateSUT.py: Connecting to: 10.250.49.3
reconnecting socket
unable to connect socket
reconnecting socket
unable to connect socket
reconnecting socket
unable to connect socket
reconnecting socket
unable to connect socket
reconnecting socket
unable to connect socket
WARNING: updateSUT.py: We have tried to connect 1 time(s) after trying to update.
INFO: updateSUT.py: We're going to sleep for 90 seconds
^@INFO: updateSUT.py: Connecting to: 10.250.49.3
reconnecting socket
unable to connect socket
reconnecting socket
unable to connect socket
reconnecting socket
unable to connect socket
reconnecting socket
unable to connect socket
reconnecting socket
unable to connect socket
WARNING: updateSUT.py: We have tried to connect 2 time(s) after trying to update.
INFO: updateSUT.py: We're going to sleep for 90 seconds
^@^@INFO: updateSUT.py: Connecting to: 10.250.49.3
reconnecting socket
unable to connect socket
reconnecting socket
unable to connect socket
reconnecting socket
unable to connect socket
reconnecting socket
unable to connect socket
reconnecting socket
unable to connect socket
WARNING: updateSUT.py: We have tried to connect 3 time(s) after trying to update.
INFO: updateSUT.py: We're going to sleep for 90 seconds
^@INFO: updateSUT.py: Connecting to: 10.250.49.3
reconnecting socket
unable to connect socket
reconnecting socket
unable to connect socket
reconnecting socket
unable to connect socket
reconnecting socket
unable to connect socket
reconnecting socket
unable to connect socket
WARNING: updateSUT.py: We have tried to connect 4 time(s) after trying to update.
INFO: updateSUT.py: We're going to sleep for 90 seconds
^@^@INFO: updateSUT.py: Connecting to: 10.250.49.3
reconnecting socket
INFO: updateSUT.py: We're running SUTAgentAndroid Version 1.07
INFO: updateSUT.py: We're now running SUTAgentAndroid Version 1.07
Attachment #605548 -
Attachment is obsolete: true
Attachment #605951 -
Flags: review?(jmaher)
Attachment #605951 -
Flags: review?(bear)
Assignee | ||
Comment 18•13 years ago
|
||
(In reply to Joel Maher (:jmaher) from comment #13)
I will have to work on the buildbotcustom patch to have a reliable DownloadFile since hg web tends to fail us.
jmaher asked me on IRC to grab the file from the source of truth.
Comment 19•13 years ago
|
||
(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #16)
> (In reply to Mike Taylor [:bear] from comment #15)
> > +def version(dm):
> > + ver = dm.sendCMD(['ver']).split("\n")[0]
> > + print "INFO: updateSUT.py: We're running %s" % ver
> > + return ver
> >
> > I feel you should wrap the call to dm.sendCMD() in a try block or check that
> > dm is valid before making the call
>
> We don't do that for the other dm.sendCMD() calls we do.
> I have seen that sendCMD() will throw an exception if dm is not valid.
> I filed a separate bug for dm to indicate that it did not initialize
> correctly (see bug 735451).
if we are not wrapping the calls, then that's my mistake from earlier work :/
Comment 20•13 years ago
|
||
Comment on attachment 605951 [details] [diff] [review]
[tools] updateSUT.py and clientproxy.py changes to call it
thanks for making the changes - the code looks great. I'm looking forward to seeing it run in staging.
Attachment #605951 -
Flags: review?(bear) → review+
Assignee | ||
Comment 21•13 years ago
|
||
(In reply to Mike Taylor [:bear] from comment #20)
> Comment on attachment 605951 [details] [diff] [review]
> [tools] updateSUT.py and clientproxy.py changes to call it
>
> thanks for making the changes - the code looks great. I'm looking forward
> to seeing it run in staging.
This has been running for a week on staging with good results :)
http://dev-master01.build.scl1.mozilla.com:8043/one_line_per_build
Reporter | ||
Updated•13 years ago
|
Attachment #605951 -
Flags: review?(jmaher) → review+
Assignee | ||
Comment 22•13 years ago
|
||
Finally I got this right!
* this should retry the command and the job if hg web fails
* in a future patch I will do some cleanup to download stuff through talos_from_code.py
* unfortunately, I am grabbing the old versions of devicemanager from the talos repo (updateSUT.py is not written for the newer version); this will come later
* I am updating talos.zip to match what talos.json uses (this will ease transitioning to talos.json and be on par with mobile)
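The "retry the command if hg web fails" idea from the first bullet can be sketched generically as below. This is not the implementation of `RetryingShellCommand` or of retry.py; `run_with_retries` and its parameters are hypothetical, and the code is written for Python 3.

```python
import subprocess
import time


def run_with_retries(command, attempts=3, wait_secs=5, sleep=time.sleep):
    """Run command until it exits 0; return True on success, False otherwise."""
    for attempt in range(1, attempts + 1):
        if subprocess.call(command) == 0:
            return True
        print("WARNING: attempt %d of %d failed: %s"
              % (attempt, attempts, " ".join(command)))
        if attempt < attempts:
            sleep(wait_secs)  # brief pause before trying again
    return False
```

For example, `run_with_retries(['wget', '--no-check-certificate', url])` would try the download up to three times before giving up.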
Attachment #605549 -
Attachment is obsolete: true
Attachment #606401 -
Flags: review?(jmaher)
Attachment #606401 -
Flags: review?(bear)
Reporter | ||
Comment 23•13 years ago
|
||
Comment on attachment 606401 [details] [diff] [review]
[buildbotcustom] steps to update the SUT agent if it is needed (take 5)
Review of attachment 606401 [details] [diff] [review]:
-----------------------------------------------------------------
just some simple nits.
::: process/factory.py
@@ +7031,5 @@
> + self.addStep(RetryingShellCommand(
> + name='get_device_manager_SUT_py',
> + description="Download devicemanagerSUT.py",
> + command=['wget', '--no-check-certificate',
> + 'http://hg.mozilla.org/build/talos/raw-file/6e5f5cadd9e9/talos/devicemanagerSUT.py'],
these don't live in talos, this should be m-c
@@ +7038,5 @@
> + ))
> + self.addStep(RetryingShellCommand(
> + name='get_updateSUT_py',
> + command=['wget', '--no-check-certificate',
> + 'http://build.mozilla.org/talos/mobile/updateSUT.py'],
for some reason I thought updateSUT.py would live in sut_tools.
@@ +7718,5 @@
> + name='get_talos_zip',
> + command=['wget', '-O', 'talos.zip', '--no-check-certificate',
> + 'http://build.mozilla.org/talos/zips/talos.bug732835.zip'],
> + workdir=self.workdirBase,
> + haltOnFailure=True,
why is all this duplicated?
Attachment #606401 -
Flags: review?(jmaher) → review+
Comment 24•13 years ago
|
||
Comment on attachment 606401 [details] [diff] [review]
[buildbotcustom] steps to update the SUT agent if it is needed (take 5)
I have to agree with Joel on wondering why updateSUT.py doesn't live in sut_tools
other than that looks good!
Attachment #606401 -
Flags: review?(bear) → review+
Assignee | ||
Comment 25•13 years ago
|
||
(In reply to Joel Maher (:jmaher) from comment #23)
> > + command=['wget', '--no-check-certificate',
> > + 'http://hg.mozilla.org/build/talos/raw-file/6e5f5cadd9e9/talos/devicemanagerSUT.py'],
>
> these don't live in talos, this should be m-c
>
I mentioned it in my comment:
* unfortunately, I am grabbing the old versions of devicemanager from the talos repo (updateSUT.py is not written for the newer version); this will come later
> > + command=['wget', '--no-check-certificate',
> > + 'http://build.mozilla.org/talos/mobile/updateSUT.py'],
>
> for some reason I thought updateSUT.py would live in sut_tools.
>
It is. I just had not yet landed it there. I will fix it.
> @@ +7718,5 @@
> > + name='get_talos_zip',
> > + command=['wget', '-O', 'talos.zip', '--no-check-certificate',
> > + 'http://build.mozilla.org/talos/zips/talos.bug732835.zip'],
> > + workdir=self.workdirBase,
> > + haltOnFailure=True,
>
> why is all this duplicated?
What is duplicated?
FTR, I am changing the DownloadFile for talos.zip to also be retrying and syncing up the version of talos.zip to what is being used for Desktop.
Assignee | ||
Comment 26•13 years ago
|
||
Comment on attachment 605951 [details] [diff] [review]
[tools] updateSUT.py and clientproxy.py changes to call it
http://hg.mozilla.org/build/tools/rev/b94a850405d4
In the next hour I will be landing the custom changes and reconfiguring the masters.
Attachment #605951 -
Flags: checked-in+
Assignee | ||
Comment 27•13 years ago
|
||
This got merged into production around 8:45 AM PDT.
This means that over today we will get most of our tegras updating to SUT Agent version 1.07.
I will notify in dev.tree-management.
Assignee | ||
Comment 28•13 years ago
|
||
Unless we hit any issues this is done.
https://groups.google.com/forum/?fromgroups#!topic/mozilla.dev.tree-management/xM_2I4aZCPU
I will file a bug to use the newer devicemanager* files and modify updateSUT.py to make use of them.
FTR, this is how I generated retry.zip:
zip retry.zip buildfarm/utils/retry.py buildfarm/utils/unix_util.py lib/python/util/retry.py lib/python/util/__init__.py
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Updated•11 years ago
|
Product: mozilla.org → Release Engineering
Updated•7 years ago
|
Product: Release Engineering → Infrastructure & Operations
Updated•5 years ago
|
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard