Closed
Bug 1410723
Opened 7 years ago
Closed 7 years ago
Intermittent DistutilsFileError: cannot copy tree
Categories
(Firefox for Android Graveyard :: Testing, defect, P5)
Firefox for Android Graveyard
Testing
Tracking
(firefox58 fixed, firefox59 fixed)
RESOLVED
FIXED
Firefox 59
People
(Reporter: intermittent-bug-filer, Assigned: gbrown)
References
Details
(Keywords: intermittent-failure)
Attachments
(2 files, 2 obsolete files)
2.20 KB,
patch
|
bc
:
review+
|
Details | Diff | Splinter Review |
1.92 KB,
patch
|
bc
:
review+
|
Details | Diff | Splinter Review |
Filed by: archaeopteryx [at] coole-files.de
https://treeherder.mozilla.org/logviewer.html#?job_id=138730283&repo=mozilla-central
https://queue.taskcluster.net/v1/task/fFKL89L5QIyY5ZIdpzJHjQ/runs/0/artifacts/public/logs/live_backing.log
[task 2017-10-22T11:02:06.969Z] 11:02:06 INFO - TEST-START | toolkit/components/telemetry/tests/unit/test_TelemetryScalars.js
[task 2017-10-22T11:02:13.012Z] 11:02:13 INFO - mozdevice killProcess failed force-stop of xpcshell
[task 2017-10-22T11:02:13.032Z] 11:02:13 INFO - TEST-PASS | toolkit/components/telemetry/tests/unit/test_TelemetryScalars.js | took 6063ms
[task 2017-10-22T11:02:13.065Z] 11:02:13 INFO - Following exceptions were raised:
[task 2017-10-22T11:02:13.066Z] 11:02:13 INFO - Traceback (most recent call last):
[task 2017-10-22T11:02:13.066Z] 11:02:13 INFO - File "/builds/worker/workspace/build/tests/xpcshell/runxpcshelltests.py", line 181, in run
[task 2017-10-22T11:02:13.067Z] 11:02:13 INFO - self.run_test()
[task 2017-10-22T11:02:13.068Z] 11:02:13 INFO - File "/builds/worker/workspace/build/tests/xpcshell/runxpcshelltests.py", line 792, in run_test
[task 2017-10-22T11:02:13.068Z] 11:02:13 INFO - if self.checkForCrashes(self.tempDir, self.symbolsPath, test_name=name):
[task 2017-10-22T11:02:13.069Z] 11:02:13 INFO - File "/builds/worker/workspace/build/tests/xpcshell/remotexpcshelltests.py", line 181, in checkForCrashes
[task 2017-10-22T11:02:13.069Z] 11:02:13 INFO - self.device.getDirectory(self.remoteMinidumpDir, dumpDir)
[task 2017-10-22T11:02:13.070Z] 11:02:13 INFO - File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/mozdevice/devicemanagerADB.py", line 541, in getDirectory
[task 2017-10-22T11:02:13.070Z] 11:02:13 INFO - dir_util.copy_tree(localDir, originalLocal)
[task 2017-10-22T11:02:13.071Z] 11:02:13 INFO - File "/usr/lib/python2.7/distutils/dir_util.py", line 128, in copy_tree
[task 2017-10-22T11:02:13.071Z] 11:02:13 INFO - "cannot copy tree '%s': not a directory" % src
[task 2017-10-22T11:02:13.072Z] 11:02:13 INFO - DistutilsFileError: cannot copy tree '/tmp/tmps55NSj/minidumps': not a directory
[task 2017-10-22T11:02:13.072Z] 11:02:13 INFO - Traceback (most recent call last):
[task 2017-10-22T11:02:13.072Z] 11:02:13 INFO - File "/builds/worker/workspace/build/tests/xpcshell/remotexpcshelltests.py", line 625, in <module>
[task 2017-10-22T11:02:13.072Z] 11:02:13 INFO - main()
[task 2017-10-22T11:02:13.072Z] 11:02:13 INFO - File "/builds/worker/workspace/build/tests/xpcshell/remotexpcshelltests.py", line 620, in main
[task 2017-10-22T11:02:13.072Z] 11:02:13 INFO - mobileArgs=xpcsh.mobileArgs):
[task 2017-10-22T11:02:13.073Z] 11:02:13 INFO - File "/builds/worker/workspace/build/tests/xpcshell/runxpcshelltests.py", line 1385, in runTests
[task 2017-10-22T11:02:13.073Z] 11:02:13 INFO - mobileArgs, **kwargs)
[task 2017-10-22T11:02:13.073Z] 11:02:13 INFO - File "/builds/worker/workspace/build/tests/xpcshell/runxpcshelltests.py", line 1580, in runTestList
[task 2017-10-22T11:02:13.073Z] 11:02:13 INFO - raise exceptions[0]
[task 2017-10-22T11:02:13.073Z] 11:02:13 INFO - distutils.errors.DistutilsFileError: cannot copy tree '/tmp/tmps55NSj/minidumps': not a directory
[task 2017-10-22T11:02:13.114Z] 11:02:13 ERROR - Return code: 1
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Updated•7 years ago
|
Flags: needinfo?(gbrown)
Assignee | ||
Comment 3•7 years ago
|
||
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-central&job_id=138730283&lineNumber=2140
[task 2017-10-22T11:02:06.969Z] 11:02:06 INFO - TEST-START | toolkit/components/telemetry/tests/unit/test_TelemetryScalars.js
[task 2017-10-22T11:02:13.012Z] 11:02:13 INFO - mozdevice killProcess failed force-stop of xpcshell
[task 2017-10-22T11:02:13.032Z] 11:02:13 INFO - TEST-PASS | toolkit/components/telemetry/tests/unit/test_TelemetryScalars.js | took 6063ms
[task 2017-10-22T11:02:13.065Z] 11:02:13 INFO - Following exceptions were raised:
[task 2017-10-22T11:02:13.066Z] 11:02:13 INFO - Traceback (most recent call last):
[task 2017-10-22T11:02:13.066Z] 11:02:13 INFO - File "/builds/worker/workspace/build/tests/xpcshell/runxpcshelltests.py", line 181, in run
[task 2017-10-22T11:02:13.067Z] 11:02:13 INFO - self.run_test()
[task 2017-10-22T11:02:13.068Z] 11:02:13 INFO - File "/builds/worker/workspace/build/tests/xpcshell/runxpcshelltests.py", line 792, in run_test
[task 2017-10-22T11:02:13.068Z] 11:02:13 INFO - if self.checkForCrashes(self.tempDir, self.symbolsPath, test_name=name):
[task 2017-10-22T11:02:13.069Z] 11:02:13 INFO - File "/builds/worker/workspace/build/tests/xpcshell/remotexpcshelltests.py", line 181, in checkForCrashes
[task 2017-10-22T11:02:13.069Z] 11:02:13 INFO - self.device.getDirectory(self.remoteMinidumpDir, dumpDir)
[task 2017-10-22T11:02:13.070Z] 11:02:13 INFO - File "/builds/worker/workspace/build/venv/lib/python2.7/site-packages/mozdevice/devicemanagerADB.py", line 541, in getDirectory
[task 2017-10-22T11:02:13.070Z] 11:02:13 INFO - dir_util.copy_tree(localDir, originalLocal)
[task 2017-10-22T11:02:13.071Z] 11:02:13 INFO - File "/usr/lib/python2.7/distutils/dir_util.py", line 128, in copy_tree
[task 2017-10-22T11:02:13.071Z] 11:02:13 INFO - "cannot copy tree '%s': not a directory" % src
[task 2017-10-22T11:02:13.072Z] 11:02:13 INFO - DistutilsFileError: cannot copy tree '/tmp/tmps55NSj/minidumps': not a directory
[task 2017-10-22T11:02:13.072Z] 11:02:13 INFO - Traceback (most recent call last):
[task 2017-10-22T11:02:13.072Z] 11:02:13 INFO - File "/builds/worker/workspace/build/tests/xpcshell/remotexpcshelltests.py", line 625, in <module>
[task 2017-10-22T11:02:13.072Z] 11:02:13 INFO - main()
[task 2017-10-22T11:02:13.072Z] 11:02:13 INFO - File "/builds/worker/workspace/build/tests/xpcshell/remotexpcshelltests.py", line 620, in main
[task 2017-10-22T11:02:13.072Z] 11:02:13 INFO - mobileArgs=xpcsh.mobileArgs):
[task 2017-10-22T11:02:13.073Z] 11:02:13 INFO - File "/builds/worker/workspace/build/tests/xpcshell/runxpcshelltests.py", line 1385, in runTests
[task 2017-10-22T11:02:13.073Z] 11:02:13 INFO - mobileArgs, **kwargs)
[task 2017-10-22T11:02:13.073Z] 11:02:13 INFO - File "/builds/worker/workspace/build/tests/xpcshell/runxpcshelltests.py", line 1580, in runTestList
[task 2017-10-22T11:02:13.073Z] 11:02:13 INFO - raise exceptions[0]
[task 2017-10-22T11:02:13.073Z] 11:02:13 INFO - distutils.errors.DistutilsFileError: cannot copy tree '/tmp/tmps55NSj/minidumps': not a directory
[task 2017-10-22T11:02:13.114Z] 11:02:13 ERROR - Return code: 1
[task 2017-10-22T11:02:13.114Z] 11:02:13 ERROR - No tests run or test summary not found
Bug 1328301 updated getDirectory(), but that is conditional on using a modern adb. In automation, we were using an older adb until I updated it in bug 1409057.
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 5•7 years ago
|
||
I haven't quite figured out what is going wrong here or how to fix it, and probably won't get to it for at least a few days. :bc, checking with you to see if you have come across this, and/or if you want to take it?? (No pressure -- nothing urgent here afaik.)
Flags: needinfo?(bob)
Comment 6•7 years ago
|
||
Yes, I have seen it in autophone. It is common but not that frequent. I would like to take a look some time but it is really low on my priorities. The main issue I'd like to tackle now is disabling the tests which time out since they result in dramatic reductions in throughput. If someone would like to investigate that would be great. Otherwise, I'll get to it when I have the time. Thanks for the heads up though. I had thought this was Autophone specific. It must lie in devicemanager somewhere.
Flags: needinfo?(bob)
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → gbrown
Assignee | ||
Updated•7 years ago
|
Flags: needinfo?(gbrown)
Comment hidden (Intermittent Failures Robot) |
Updated•7 years ago
|
Summary: Intermittent DistutilsFileError: cannot copy tree '/tmp/tmps55NSj/minidumps': not a directory → Intermittent DistutilsFileError: cannot copy tree
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 10•7 years ago
|
||
I don't understand how/why that directory is not there after the pull. Rather than pull my hair out, here's a patch to improve the error message -- hopefully that will help me understand the problem.
Attachment #8933153 -
Flags: review?(bob)
Assignee | ||
Updated•7 years ago
|
Keywords: leave-open
Comment 11•7 years ago
|
||
Comment on attachment 8933153 [details] [diff] [review]
better diagnostics on getDirectory failure
Review of attachment 8933153 [details] [diff] [review]:
-----------------------------------------------------------------
Let's chat on irc.
::: testing/mozbase/mozdevice/mozdevice/devicemanagerADB.py
@@ +538,5 @@
> localDir = os.path.join(tempParent, remoteName)
> else:
> localDir = '/'.join(localDir.rstrip('/').split('/')[:-1])
> + cmd = ["pull", remoteDir, localDir]
> + proc = self._runCmd(cmd)
The way _runCmd is written, proc will be None if _runCmd times out. At first I thought we needed the wait() but it seems _runCmd will guarantee that it is called if proc is not None.
@@ +543,1 @@
> if copyRequired:
Maybe handle no proc by:
if not proc:
raise Exception("getDirectory() _runCmd %s failed" % str(cmd))
try:
...
But really, the caller's of _runCmd seem to expect proc to be returned regardless of the timeout. Perhaps we should make _runCmd always return proc even if it does timeout. Then we would need the proc.wait().
Attachment #8933153 -
Flags: review?(bob)
Assignee | ||
Comment 12•7 years ago
|
||
Thanks. I like the idea of always returning proc from _runCmd, since many existing callers assume a non-None return value. Also added some warnings to _runCmd to give us more info about retries. Let's see how that works.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=ba362d7944eba488a31d41e2c64fa56c8d24138f
Attachment #8933153 -
Attachment is obsolete: true
Attachment #8933354 -
Flags: review?(bob)
Comment 13•7 years ago
|
||
Comment on attachment 8933354 [details] [diff] [review]
better diagnostics on getDirectory failure
Review of attachment 8933354 [details] [diff] [review]:
-----------------------------------------------------------------
Minor style comments. Looks good to me. r+
::: testing/mozbase/mozdevice/mozdevice/devicemanagerADB.py
@@ +544,5 @@
> + try:
> + dir_util.copy_tree(localDir, originalLocal)
> + mozfile.remove(tempParent)
> + except:
> + self._logger.error("getDirectory() failed after %s" % str(cmd))
Since we have proc now, what do you think of adding proc.returncode to the error message?
@@ +696,5 @@
> retries += 1
> else:
> return proc
> + self._logger.warning("_runCmd exceeded all retries")
> + return proc
Personal style issue. Feel free to ignore.
proc = None
while retries < retryLimit:
proc = ProcessHandler(finalArgs, storeOutput=True,
processOutputLine=self._log, onTimeout=_timeout)
proc.returncode = proc.wait()
if proc.returncode is not None:
break
proc.kill()
self._logger.warning("_runCmd failed for '%s'" % ' '.join(finalArgs))
retries += 1
if retries >= retryLimit:
self._logger.warning("_runCmd exceeded all retries")
return proc
Attachment #8933354 -
Flags: review?(bob) → review+
Assignee | ||
Comment 14•7 years ago
|
||
Great suggestions -- will incorporate both on landing.
Comment 15•7 years ago
|
||
Pushed by gbrown@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/0e3f8ec51869
Improve diagnostics when Android devicemanager getDirectory fails; r=bc
Comment 16•7 years ago
|
||
bugherder |
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 18•7 years ago
|
||
https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=149266366&lineNumber=1875
[task 2017-12-02T08:41:40.533Z] 08:41:40 INFO - mozdevice getDirectory() failed after ['pull', '/storage/sdcard/tests/xpc/minidumps', '/tmp/tmp8XVf2s/minidumps']
[task 2017-12-02T08:41:40.533Z] 08:41:40 INFO - mozdevice rc=1 out=['adb: error: failed to get feature set: no devices/emulators found']
...
[task 2017-12-02T08:41:40.545Z] 08:41:40 INFO - DistutilsFileError: cannot copy tree '/tmp/tmp8XVf2s/minidumps': not a directory
Assignee | ||
Comment 26•7 years ago
|
||
All the recent failures with extra diagnostics show the same error message, as in comment 18: The emulator is suddenly unresponsive. That's a common intermittent problem with the emulator, usually handled by retrying the task whenever a DMError is seen. From that perspective, the only problem here is that getDirectory fails to throw DMError in this case; using _checkCmd should rectify that.
Attachment #8934718 -
Flags: review?(bob)
Assignee | ||
Comment 27•7 years ago
|
||
Comment 28•7 years ago
|
||
Comment on attachment 8934718 [details] [diff] [review]
use _checkCmd instead of _runCmd for getDirectory
Review of attachment 8934718 [details] [diff] [review]:
-----------------------------------------------------------------
::: testing/mozbase/mozdevice/mozdevice/devicemanagerADB.py
@@ +539,4 @@
> else:
> localDir = '/'.join(localDir.rstrip('/').split('/')[:-1])
> cmd = ["pull", remoteDir, localDir]
> + proc = self._checkCmd(cmd)
_checkCmd returns the returncode of the process.
Attachment #8934718 -
Flags: review?(bob)
Assignee | ||
Comment 29•7 years ago
|
||
It looks like 'adb pull' will return non-zero for other conditions -- an empty directory, or a directory that doesn't exist, so I think it is necessary to parse the output to decide if throwing a DMError and triggering a task retry is reasonable.
Here I check for the "no devices/emulators found" message as an explicit indication that the emulator is down, but also allow for other conditions: if adb pull failed and didn't report any of the expected messages:
0 files pulled
1 file pulled
<remote dir> does not exist
Doesn't break the common cases: https://treeherder.mozilla.org/#/jobs?repo=try&revision=60e69ae6e6892d809f1c5833a7d6ac19cf399c7f
Attachment #8934718 -
Attachment is obsolete: true
Attachment #8935133 -
Flags: review?(bob)
Comment 30•7 years ago
|
||
Comment on attachment 8935133 [details] [diff] [review]
throw DMError when a bad emulator is suspected
Review of attachment 8935133 [details] [diff] [review]:
-----------------------------------------------------------------
r+. Feel free to ignore the style comment.
::: testing/mozbase/mozdevice/mozdevice/devicemanagerADB.py
@@ +544,5 @@
> + # Raise a DMError when the device is missing, but not when the
> + # directory is empty or missing.
> + if ("no devices/emulators found" in proc.output or
> + not ("pulled" in proc.output or
> + "does not exist" in proc.output)):
A comment on style. I find I need to concentrate a bit to make sure this does what we expect. What do you think of?
if ("no devices/emulators found" in proc.output or
"pulled" not in proc.output and
"does not exist" not in proc.output):
or with parens which aren't absolutely necessary but do provide a 1 char indent of the raise from the condition?
if ("no devices/emulators found" in proc.output or
("pulled" not in proc.output and
"does not exist" not in proc.output)):
Attachment #8935133 -
Flags: review?(bob) → review+
Comment 31•7 years ago
|
||
Pushed by gbrown@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1a48edb857f3
Throw DMError when getDirectory() fails suspiciously; r=bc
Assignee | ||
Updated•7 years ago
|
Keywords: leave-open
Comment 32•7 years ago
|
||
Backed out changeset 1a48edb857f3 (bug 1410723) for autophone mass failures r=backout on a CLOSED TREE
https://treeherder.mozilla.org/logviewer.html#?job_id=150341938&repo=mozilla-inbound
https://hg.mozilla.org/integration/mozilla-inbound/rev/70fd148133435d20416262be28f51ee4c430a6f6
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=1a48edb857f33c02e23104d9d1f7b6e1e5c42a75&selectedJob=150341938
Flags: needinfo?(gbrown)
Assignee | ||
Comment 33•7 years ago
|
||
I forgot that proc.output is a []. This should fix it up:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=ece6d35061a391956e9c11c12953a6a826f57e5f
Assignee | ||
Comment 35•7 years ago
|
||
autophone passes on try, with a simple modification to account for [] output.
Flags: needinfo?(gbrown)
Comment 36•7 years ago
|
||
Pushed by gbrown@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5115d6bc9009
Throw DMError when getDirectory() fails suspiciously; r=bc
Comment 37•7 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 7 years ago
status-firefox59:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 59
Comment 38•7 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/888df7563642
https://hg.mozilla.org/releases/mozilla-beta/rev/c7a07000546a
status-firefox58:
--- → fixed
Assignee | ||
Comment 39•7 years ago
|
||
Confirmation this is working: https://treeherder.mozilla.org/logviewer.html#?job_id=150809794&repo=mozilla-inbound&lineNumber=1884
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Updated•4 years ago
|
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•