Closed Bug 614173 Opened 11 years ago Closed 11 years ago

SUT agent/device manager: More robust way to download files

Categories

(Testing :: General, defect)

x86
All
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mcote, Unassigned)

References

Details

Attachments

(3 files, 2 obsolete files)

The current implementation of the device manager uses the "cat" agent command to transfer files from the agent to the user.  The device manager scans the output, looking for the prompt ("$>\0"), which indicates the end of the file.  Already one bug was found in devicemanager.py in which it expected the prompt to be on its own line, which is only the case if the catted file ends with a newline character.  This can easily be fixed, but there always runs the risk of finding the agent prompt within a file (think log files), which will cause premature termination of the file (and device manager confusion down the line).  In general, a more robust way to grab files is to provide the length of the file before the data, so the client knows exactly when to stop reading (and the input need not be scanned, just directed straight to a file).

Hence I propose an agent command, perhaps called "streamfile":

streamfile <remote path>

When issued, the agent shall output the length of the file found at <path> in bytes as a decimal integer, followed by a newline, followed by the full contents of the file.  The prompt shall then be output and the agent await a new command.  The client shall read the indicated number of bytes, then expect the prompt; if not found, the file transfer will be aborted.  It is probably best at this point for the device manager to terminate the connection and reconnect, since the agent may be in a weird state.

If <path> doesn't exist, a suitable error message (to be defined) shall be output, and the command prompt displayed as usual.
This patch adds DeviceManager.pullFile(), using the new agent "pull" command.  There are also a few miscellaneous changes... seems that the copy of devicemanager.py got most, but not all, my changes from bug 612461.  More reason to get all the copies of devicemanager.py synced up as soon as possible.
Attachment #493834 - Flags: review?(jmaher)
Attachment #493834 - Attachment is obsolete: true
Attachment #494163 - Flags: review?(ctalbert)
Attachment #493834 - Flags: review?(jmaher)
Comment on attachment 494163 [details] [diff] [review]
Support "pull" command; apply to remote-testing/tests/devicemanager.py

This is good stuff.  I think we need to remove some of the debugging and do a little tweaking and it should be good to go.

>diff --git a/tests/devicemanager.py b/tests/devicemanager.py
>--- a/tests/devicemanager.py
>+++ b/tests/devicemanager.py
>@@ -466,50 +472,137 @@ class DeviceManager:
>     return self.stripPrompt(data).strip('\n')
> 
>   def catFile(self, remoteFile):
>     data = self.sendCMD(['cat ' + remoteFile])
>     if data == None:
>         return None
>     return self.stripPrompt(data)
>   
>+  def pullFile(self, remoteFile):
>+    def err(error_msg):
>+        err_str = 'bad response to pull: %s!' % error_msg
Nit: Change the text?  Error returned from pull?  
>+
>+    def read(to_recv, error_msg):
>+      data = self._sock.recv(to_recv)
>+      if not data:
>+        err(error_msg)
>+        return None
>+      return data
>+    
>+    self.sendCMD(['pull ' + remoteFile])
A comment here explaining the data that you require to come back as metadata would be nice.
Or even better: a test codifying that the proper data is returned (i.e. will fail if the stuff returned is at all not what we expect).
>+    buffer = ''
>+    while not '\n' in buffer:
>+      data = read(1024, 'could not find metadata')
>+      if data == None:
>+        return
>+      buffer += data
>+    nl = buffer.find('\n')
>+    metadata = buffer[:nl]
>+    print 'metadata: %s' % metadata
Should be a if (self.debug...) type of statement.  Pick a level that seems appropriate - maybe 2 for now until we get some experience with the pull command.
>+    filedata = buffer[nl+1:]  # skip newline
>+    sep = metadata.rfind(',')
>+    if sep == -1:
>+      err('could not find file size')
This error message isn't going to make much sense when seen in a long print out of other stuff from talos and mochitest.  Can you make it more specific?
>+      return None
>+    filename = metadata[:sep]
>+    filesizestr = metadata[sep+1:]
>+    prompt = self.base_prompt + self.prompt_sep
>+    try:
>+        filesize = int(filesizestr)
>+    except ValueError:
>+      err('invalid file size')
again, specificity
>+      return None
>+    if filesize == -1:
Comments?  Read the metadata...
>+      while not '\n' in filedata:
>+        data = read(1024, 'could not find metadata')
>+        if data == None:
>+          return None
>+        filedata += data
>+      nl = filedata.find('\n')
>+      error_str = filedata[:nl]
>+      filedata = filedata[nl+1:]
Comments would be great here
>+      while filedata < len(prompt):
>+        data = read(1024, 'could not find metadata')
>+        if data == None:
>+          return None
>+      print 'error pulling file: %s' % error_str
mention devicemanager or something to namespace this
>+      return None
>+        
>+    total_to_recv = filesize + len(prompt)
Comments and now, we finally read the file...
>+    while len(filedata) < total_to_recv:
>+      to_recv = min(total_to_recv - len(filedata), 1024)
>+      data = read(to_recv, 'could not get all file data')
>+      if data == None:
>+        return None
>+      filedata += data
>+    if filedata[-len(prompt):] != prompt:
>+      err('no prompt')
Better error message

>+      return filedata
>+    return filedata[:-len(prompt)]
>+    
>   # copy file from device (remoteFile) to host (localFile)
>   def getFile(self, remoteFile, localFile = ''):
>     if localFile == '':
>         localFile = os.path.join(self.tempRoot, "temp.txt")
>   
>-    promptre = re.compile(self.prompt_regex + '.*')
>-    retVal = self.catFile(remoteFile)
>+    retVal = self.pullFile(remoteFile)
>+    if retVal == None:
>+      return None
>     fhandle = open(localFile, 'wb')
>     fhandle.write(retVal)
>     fhandle.close()
>+    if not self.validateFile(remoteFile, localFile):
>+        print 'failed to validate file when downloading %s!' % remoteFile
>+        return None
>     return retVal
>     
>   # copy directory structure from device (remoteDir) to host (localDir)
>   def getDirectory(self, remoteDir, localDir):
>     if (self.debug >= 2): print "getting files in '" + remoteDir + "'"
>     filelist = self.listFiles(remoteDir)
>+    print 'files: %s' % filelist
This is uneeded as we already print the filelist in debug mode below
>     if (filelist == None):
>       return None
>     if (self.debug >= 3): print filelist
>     if not os.path.exists(localDir):
>       os.makedirs(localDir)
>   
>     for f in filelist:
>-      if (self.isDir(os.path.join(remoteDir, f))):
>-        if (self.getDirectory(remoteDir + '/' + f, os.path.join(localDir, f)) == None):
>+      if f == '.' or f == '..':
>+          continue
>+      remotePath = remoteDir + '/' + f
>+      localPath = os.path.join(localDir, f)
>+      print 'remotePath is %s' % remotePath
>+      print 'localPath is %s' % localPath
These should be if (self.debug > 4 or removed or something...
>+      try:
>+        is_dir = self.isDir(remotePath)
>+      except FileError:
>+        print 'bad file "%s"!' % remotePath
error message, mention that we are continuing mention that we are continuing in spite of the error.
>+        continue
>+      if is_dir:
>+        if (self.getDirectory(remotePath, localPath) == None):
>+          print 'aborted when getting directory'
better message
>           return None
>       else:
>-        if (self.getFile(remoteDir + '/' + f, os.path.join(localDir, f)) == None):
>-          return None
>+        # It's sometimes acceptable to have getFile() return None, such as
>+        # when the agent encounters broken symlinks.
>+        # FIXME: This should be improved so we know when a file transfer really
>+        # failed.
>+        self.getFile(remotePath, localPath)
Let's do two things: add a check here to see if getFile returns None, and if so, print out that it returned none in an error message.  That way it won't be invisible when it fails.  And let's file a follow on bug to fix this properly.  I agree that it is a bit out of scope for these changes right now and since its full of corner cases and probably agent changes, it's better left for a follow-on bug.
>     return filelist
> 
>   def isDir(self, remotePath):
>     data = self.sendCMD(['isdir ' + remotePath])
>     retVal = self.stripPrompt(data).strip()
>+    if not retVal:
>+        raise FileError('isdir returned null')
It'd be great to start namespacing these errors - DEVICEMANAGER::<error message> or something.

So, I like the spirit of the patch, I like what you've done.  Most of these comments are nits, but I want to test it with mochitest as well before I declare us completely out of the woods here, so tentative r+ with the changes listed above (and the caveat that I might have more for you) and I'll report back my results from testing with mochitest.
Attachment #494163 - Flags: review?(ctalbert) → review+
Verified that it all works on mochitest and reftest.  I am happy r+
This patch implements the pull command as well as fixing a few other issues.
Attachment #497360 - Flags: review?(ctalbert)
Comment on attachment 497360 [details]
Latest changes, adds pull command among others

Looks good!
Attachment #497360 - Flags: review?(ctalbert) → review+
Refactored the pull patch.  Pulled out more of the reading code into functions, cleaned up log messages.  I mostly left out the namespacing of log messages for now--I think that should be another bug and all the messages should be done at once, probably through a function wrapping "print".
Attachment #494163 - Attachment is obsolete: true
Attachment #497671 - Flags: review?(ctalbert)
This patch adds bounds checking to the ini file routines to handle the case where there is a key but no value.
Attachment #497855 - Flags: review?(ctalbert)
Comment on attachment 497671 [details] [diff] [review]
Refactored pull patch, with test

cool, thanks for the tests and the changes
Attachment #497671 - Flags: review?(ctalbert) → review+
Comment on attachment 497855 [details] [diff] [review]
Added bounds check to ini file routine

Should we remove these commented out lines instead of just commenting out code without an explanation?  

The actual code change here looks good.
Attachment #497855 - Flags: review?(ctalbert) → review+
(In reply to comment #9)
> Comment on attachment 497671 [details] [diff] [review]
> Refactored pull patch, with test
> 
> cool, thanks for the tests and the changes

These changes to the devicemanager were included as part of the overall changes on bug 617815.  Mark, can you double check and verify that's the case?
(In reply to comment #12)
> Landed:
> http://hg.mozilla.org/mozilla-central/rev/69cc2a93f509

So this change busted android builds, I have backed it out and since this bug's purpose is kind of two things I'll open a new bug for the fix to the bustage. This will now be tracked in bug 619839
Component: New Frameworks → General
You need to log in before you can comment on or make changes to this bug.