Last Comment Bug 793855 - runxpcshelltests.py's profile cleanup should be more resilient and output TBPL-compatible errors
: runxpcshelltests.py's profile cleanup should be more resilient and output TBP...
Status: RESOLVED FIXED
[sheriff-want]
:
Product: Testing
Classification: Components
Component: XPCShell Harness (show other bugs)
: Trunk
: All All
: -- normal (vote)
: mozilla18
Assigned To: Ed Morley [:emorley]
:
Mentors:
Depends on:
Blocks: 752243 778688 509439 717448 762032 767968 799532 808545 809071 823996
  Show dependency treegraph
 
Reported: 2012-09-24 14:46 PDT by Ed Morley [:emorley]
Modified: 2012-12-21 10:09 PST (History)
3 users (show)
emorley: in‑testsuite-
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
fixed


Attachments
Patch v1 (1.61 KB, patch)
2012-10-04 07:27 PDT, Ed Morley [:emorley]
gps: feedback+
Details | Diff | Review
Check process exited v1 (4.09 KB, patch)
2012-10-06 15:31 PDT, Ed Morley [:emorley]
gps: review+
jmaher: review+
Details | Diff | Review
Catch exceptions when removing profile (1.70 KB, patch)
2012-10-06 16:12 PDT, Ed Morley [:emorley]
gps: review+
Details | Diff | Review

Description Ed Morley [:emorley] 2012-09-24 14:46:51 PDT
eg:
Bug 762032
https://tbpl.mozilla.org/php/getParsedLog.php?id=15413259&tree=Services-Central
{
TEST-INFO | C:\talos-slave\test\build\xpcshell\tests\toolkit\mozapps\update\test_svc\unit\test_0201_app_launch_apply_update_svc.js | running test ...
TEST-PASS | C:\talos-slave\test\build\xpcshell\tests\toolkit\mozapps\update\test_svc\unit\test_0201_app_launch_apply_update_svc.js | test passed (time: 12012.000ms)
Traceback (most recent call last):
  File "xpcshell/runxpcshelltests.py", line 992, in <module>
    main()
  File "xpcshell/runxpcshelltests.py", line 988, in main
    if not xpcsh.runTests(args[0], testdirs=args[1:], **options.__dict__):
  File "xpcshell/runxpcshelltests.py", line 866, in runTests
    self.removeDir(self.profileDir)
  File "xpcshell/runxpcshelltests.py", line 325, in removeDir
    shutil.rmtree(dirname)
  File "c:\mozilla-build\python25\Lib\shutil.py", line 169, in rmtree
    rmtree(fullname, ignore_errors, onerror)
  File "c:\mozilla-build\python25\Lib\shutil.py", line 169, in rmtree
    rmtree(fullname, ignore_errors, onerror)
  File "c:\mozilla-build\python25\Lib\shutil.py", line 169, in rmtree
    rmtree(fullname, ignore_errors, onerror)
  File "c:\mozilla-build\python25\Lib\shutil.py", line 174, in rmtree
    onerror(os.remove, fullname, sys.exc_info())
  File "c:\mozilla-build\python25\Lib\shutil.py", line 172, in rmtree
    os.remove(fullname)
WindowsError: [Error 13] The process cannot access the file because it is being used by another process: 'c:\\users\\cltbld\\appdata\\local\\temp\\tmpmlu2dc\\ExecutableDir.tmp\\bin\\uninstall\\helper.exe'
program finished with exit code 1
}

and bug 752243
https://tbpl.mozilla.org/php/getParsedLog.php?id=15487614&tree=Mozilla-Inbound
{
TEST-PASS | /Users/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_service_wipeServer.js | test passed (time: 1489.360ms)
Traceback (most recent call last):
  File "xpcshell/runxpcshelltests.py", line 992, in <module>
    main()
  File "xpcshell/runxpcshelltests.py", line 988, in main
    if not xpcsh.runTests(args[0], testdirs=args[1:], **options.__dict__):
  File "xpcshell/runxpcshelltests.py", line 866, in runTests
    self.removeDir(self.profileDir)
  File "xpcshell/runxpcshelltests.py", line 325, in removeDir
    shutil.rmtree(dirname)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/shutil.py", line 216, in rmtree
    rmtree(fullname, ignore_errors, onerror)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/shutil.py", line 216, in rmtree
    rmtree(fullname, ignore_errors, onerror)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/shutil.py", line 208, in rmtree
    onerror(os.listdir, path, sys.exc_info())
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/shutil.py", line 206, in rmtree
    names = os.listdir(path)
OSError: [Errno 13] Permission denied: '/var/folders/Hs/HsDn6a9SG8idoIya6p9mtE+++TI/-Tmp-/tmpYfO7zV/Cache/D'
program finished with exit code 1
}
Comment 1 Ed Morley [:emorley] 2012-10-04 07:27:37 PDT
Created attachment 667960 [details] [diff] [review]
Patch v1

The bugs that cause this failure type have become extremely common in the last week (see dependency tree); this patch will actually let them be starrable.

Not sure if this is the right approach, somewhat of a copypasta with the xunitResult part etc
Comment 2 Ed Morley [:emorley] 2012-10-04 09:00:07 PDT
https://tbpl.mozilla.org/?tree=Try&rev=662b31af8e1a

Example output:
TEST-UNEXPECTED-FAIL | /builds/slave/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_service_changePassword.js | Failed to clean up after test: <type 'exceptions.OSError'>

https://tbpl.mozilla.org/php/getParsedLog.php?id=15811737&tree=Try#error0
Comment 3 Gregory Szorc [:gps] 2012-10-04 09:52:03 PDT
Comment on attachment 667960 [details] [diff] [review]
Patch v1

Review of attachment 667960 [details] [diff] [review]:
-----------------------------------------------------------------

The fact we need this patch saddens me.

We remove the profile directory *after* the xpcshell process terminates. So, there should be no contention on the directory and thus nothing lingering to prevent us from deleting it.

Looking at the code in more detail, I /think/ it may be possible for the finally block to get entered before the process terminates. The reason is we don't have an exception handler or code in the finally that ensures the process has actually exited! I'm tempted that as a first step we try to add more robust process exit detection to the xpcshell test runner. If that fails, we can go with the approach in this patch.

This patch also only puts a bandage over the wound without treating it. I would love to see some extra diagnostics like the list of files still in the directory, the permissions and owners of them, etc.

::: testing/xpcshell/runxpcshelltests.py
@@ +864,5 @@
>          # or check-one.
>          if self.profileDir and not self.interactive and not self.singleFile:
> +          try:
> +            self.removeDir(self.profileDir)
> +          except:

You should never do bareword except. At the least you should catch the base class Exception. Ideally you should only catch the exception you care about. But, in this case we actually do want to catch all exceptions.

That being said, you aren't using the raised exception instance, so meh.

@@ +867,5 @@
> +            self.removeDir(self.profileDir)
> +          except:
> +            message = "TEST-UNEXPECTED-FAIL | %s | Failed to clean up after test: %s" % (name, sys.exc_info()[0])
> +            self.log.error(message)
> +            print_stdout(stdout)

Please throw in:

    print_stdout(traceback.format_exc())

@@ +870,5 @@
> +            self.log.error(message)
> +            print_stdout(stdout)
> +            self.failCount += 1
> +            xunitResult["passed"] = False
> +            

Nit: leading whitespace

@@ +874,5 @@
> +            
> +            xunitResult["failure"] = {
> +              "type": "TEST-UNEXPECTED-FAIL",
> +              "message": message,
> +              "text": stdout

And capture the full formatted traceback in the xunit result for bonus points.
Comment 4 Ed Morley [:emorley] 2012-10-06 15:31:06 PDT
Created attachment 668842 [details] [diff] [review]
Check process exited v1

I've changed the approach as suggested: As opposed to just catching the exception on directory removal, we now check the process has exited - and if not, show a TBPL-compatible error and then force kill the process.

Is green on Try:
https://tbpl.mozilla.org/?tree=Try&rev=68312a2cec8e

Unfortunately even with many retriggers I couldn't get one of the known-oranges to occur so we could see it work.

Asking review from Joel for the deviceManager aspects of the remotexpcshelltests.py changes.
Comment 5 Ed Morley [:emorley] 2012-10-06 16:05:18 PDT
After a few more retriggers I got:
https://tbpl.mozilla.org/php/getParsedLog.php?id=15886294&tree=Try
{
OSError: [Errno 13] Permission denied: '/var/folders/fq/5_h5_hvs48xdqf996vtnrgd000000w/T/tmpRk5tWb/Cache/5'
}

...which I guess means we need both patches; since it's not always due to the process still being active.
Comment 6 Ed Morley [:emorley] 2012-10-06 16:12:09 PDT
Created attachment 668850 [details] [diff] [review]
Catch exceptions when removing profile

With comment 3 nits addressed.
Comment 7 Joel Maher (:jmaher) 2012-10-07 03:52:48 PDT
Comment on attachment 668842 [details] [diff] [review]
Check process exited v1

Review of attachment 668842 [details] [diff] [review]:
-----------------------------------------------------------------

thanks!
Comment 8 Ed Morley [:emorley] 2012-10-07 04:12:11 PDT
Thank you for the review :-)

Adjusting bug summary (again) to reflect the dual approach we're now taking; sorry for churn.
Comment 9 Gregory Szorc [:gps] 2012-10-07 10:32:55 PDT
Comment on attachment 668842 [details] [diff] [review]
Check process exited v1

Review of attachment 668842 [details] [diff] [review]:
-----------------------------------------------------------------

At some point we should probably switch runxpcshelltests.py to use mozprocess. But, that's for another bug.
Comment 10 Gregory Szorc [:gps] 2012-10-07 10:34:13 PDT
Comment on attachment 668850 [details] [diff] [review]
Catch exceptions when removing profile

Review of attachment 668850 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM
Comment 11 Ed Morley [:emorley] 2012-10-07 13:31:45 PDT
Thank you for the reviews :-)

https://hg.mozilla.org/integration/mozilla-inbound/rev/9a685a97bfec
https://hg.mozilla.org/integration/mozilla-inbound/rev/db8552af8d73

(Made one last tweak; added the traceback import to runxpcshelltests.py in response to a Try run)

Final example output (for the catch exceptions patch; I couldn't get the other oranges to recreate):
https://tbpl.mozilla.org/php/getParsedLog.php?id=15898864&full=1&branch=try#error0
Comment 13 Ed Morley [:emorley] 2012-10-15 05:44:09 PDT
Landed on beta so we have this for ESR17 for the next year.

https://hg.mozilla.org/releases/mozilla-beta/rev/92ca71259d1b
https://hg.mozilla.org/releases/mozilla-beta/rev/189299c0bb50

Note You need to log in before you can comment on or make changes to this bug.