Closed Bug 793855 Opened 12 years ago Closed 12 years ago

runxpcshelltests.py's profile cleanup should be more resilient and output TBPL-compatible errors

Categories

(Testing :: XPCShell Harness, defect)

defect
Not set
normal

Tracking

(firefox17 fixed)

RESOLVED FIXED
mozilla18
Tracking Status
firefox17 --- fixed

People

(Reporter: emorley, Assigned: emorley)

References

(Blocks 1 open bug)

Details

(Whiteboard: [sheriff-want])

Attachments

(2 files, 1 obsolete file)

eg:
Bug 762032
https://tbpl.mozilla.org/php/getParsedLog.php?id=15413259&tree=Services-Central
{
TEST-INFO | C:\talos-slave\test\build\xpcshell\tests\toolkit\mozapps\update\test_svc\unit\test_0201_app_launch_apply_update_svc.js | running test ...
TEST-PASS | C:\talos-slave\test\build\xpcshell\tests\toolkit\mozapps\update\test_svc\unit\test_0201_app_launch_apply_update_svc.js | test passed (time: 12012.000ms)
Traceback (most recent call last):
  File "xpcshell/runxpcshelltests.py", line 992, in <module>
    main()
  File "xpcshell/runxpcshelltests.py", line 988, in main
    if not xpcsh.runTests(args[0], testdirs=args[1:], **options.__dict__):
  File "xpcshell/runxpcshelltests.py", line 866, in runTests
    self.removeDir(self.profileDir)
  File "xpcshell/runxpcshelltests.py", line 325, in removeDir
    shutil.rmtree(dirname)
  File "c:\mozilla-build\python25\Lib\shutil.py", line 169, in rmtree
    rmtree(fullname, ignore_errors, onerror)
  File "c:\mozilla-build\python25\Lib\shutil.py", line 169, in rmtree
    rmtree(fullname, ignore_errors, onerror)
  File "c:\mozilla-build\python25\Lib\shutil.py", line 169, in rmtree
    rmtree(fullname, ignore_errors, onerror)
  File "c:\mozilla-build\python25\Lib\shutil.py", line 174, in rmtree
    onerror(os.remove, fullname, sys.exc_info())
  File "c:\mozilla-build\python25\Lib\shutil.py", line 172, in rmtree
    os.remove(fullname)
WindowsError: [Error 13] The process cannot access the file because it is being used by another process: 'c:\\users\\cltbld\\appdata\\local\\temp\\tmpmlu2dc\\ExecutableDir.tmp\\bin\\uninstall\\helper.exe'
program finished with exit code 1
}

and bug 752243
https://tbpl.mozilla.org/php/getParsedLog.php?id=15487614&tree=Mozilla-Inbound
{
TEST-PASS | /Users/cltbld/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_service_wipeServer.js | test passed (time: 1489.360ms)
Traceback (most recent call last):
  File "xpcshell/runxpcshelltests.py", line 992, in <module>
    main()
  File "xpcshell/runxpcshelltests.py", line 988, in main
    if not xpcsh.runTests(args[0], testdirs=args[1:], **options.__dict__):
  File "xpcshell/runxpcshelltests.py", line 866, in runTests
    self.removeDir(self.profileDir)
  File "xpcshell/runxpcshelltests.py", line 325, in removeDir
    shutil.rmtree(dirname)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/shutil.py", line 216, in rmtree
    rmtree(fullname, ignore_errors, onerror)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/shutil.py", line 216, in rmtree
    rmtree(fullname, ignore_errors, onerror)
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/shutil.py", line 208, in rmtree
    onerror(os.listdir, path, sys.exc_info())
  File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/shutil.py", line 206, in rmtree
    names = os.listdir(path)
OSError: [Errno 13] Permission denied: '/var/folders/Hs/HsDn6a9SG8idoIya6p9mtE+++TI/-Tmp-/tmpYfO7zV/Cache/D'
program finished with exit code 1
}
Blocks: 509439, 717448, 767968
Attached patch Patch v1 (obsolete) — Splinter Review
The bugs that cause this failure type have become extremely common in the last week (see dependency tree); this patch will actually let them be starrable.

Not sure if this is the right approach, somewhat of a copypasta with the xunitResult part etc
Attachment #667960 - Flags: review?(gps)
https://tbpl.mozilla.org/?tree=Try&rev=662b31af8e1a

Example output:
TEST-UNEXPECTED-FAIL | /builds/slave/talos-slave/test/build/xpcshell/tests/services/sync/tests/unit/test_service_changePassword.js | Failed to clean up after test: <type 'exceptions.OSError'>

https://tbpl.mozilla.org/php/getParsedLog.php?id=15811737&tree=Try#error0
Comment on attachment 667960 [details] [diff] [review]
Patch v1

Review of attachment 667960 [details] [diff] [review]:
-----------------------------------------------------------------

The fact we need this patch saddens me.

We remove the profile directory *after* the xpcshell process terminates. So, there should be no contention on the directory and thus nothing lingering to prevent us from deleting it.

Looking at the code in more detail, I /think/ it may be possible for the finally block to get entered before the process terminates. The reason is we don't have an exception handler or code in the finally that ensures the process has actually exited! I'm tempted that as a first step we try to add more robust process exit detection to the xpcshell test runner. If that fails, we can go with the approach in this patch.

This patch also only puts a bandage over the wound without treating it. I would love to see some extra diagnostics like the list of files still in the directory, the permissions and owners of them, etc.

::: testing/xpcshell/runxpcshelltests.py
@@ +864,5 @@
>          # or check-one.
>          if self.profileDir and not self.interactive and not self.singleFile:
> +          try:
> +            self.removeDir(self.profileDir)
> +          except:

You should never do bareword except. At the least you should catch the base class Exception. Ideally you should only catch the exception you care about. But, in this case we actually do want to catch all exceptions.

That being said, you aren't using the raised exception instance, so meh.

@@ +867,5 @@
> +            self.removeDir(self.profileDir)
> +          except:
> +            message = "TEST-UNEXPECTED-FAIL | %s | Failed to clean up after test: %s" % (name, sys.exc_info()[0])
> +            self.log.error(message)
> +            print_stdout(stdout)

Please throw in:

    print_stdout(traceback.format_exc())

@@ +870,5 @@
> +            self.log.error(message)
> +            print_stdout(stdout)
> +            self.failCount += 1
> +            xunitResult["passed"] = False
> +            

Nit: leading whitespace

@@ +874,5 @@
> +            
> +            xunitResult["failure"] = {
> +              "type": "TEST-UNEXPECTED-FAIL",
> +              "message": message,
> +              "text": stdout

And capture the full formatted traceback in the xunit result for bonus points.
Attachment #667960 - Flags: review?(gps) → feedback+
I've changed the approach as suggested: As opposed to just catching the exception on directory removal, we now check the process has exited - and if not, show a TBPL-compatible error and then force kill the process.

Is green on Try:
https://tbpl.mozilla.org/?tree=Try&rev=68312a2cec8e

Unfortunately even with many retriggers I couldn't get one of the known-oranges to occur so we could see it work.

Asking review from Joel for the deviceManager aspects of the remotexpcshelltests.py changes.
Attachment #667960 - Attachment is obsolete: true
Attachment #668842 - Flags: review?(jmaher)
Attachment #668842 - Flags: review?(gps)
Summary: runxpcshelltests.py should catch shutil.rmtree(dirname) exceptions and give a TBPL-parser-friendly error message → runxpcshelltests.py should check the test process has exited and if not, give a TBPL-compatible error
After a few more retriggers I got:
https://tbpl.mozilla.org/php/getParsedLog.php?id=15886294&tree=Try
{
OSError: [Errno 13] Permission denied: '/var/folders/fq/5_h5_hvs48xdqf996vtnrgd000000w/T/tmpRk5tWb/Cache/5'
}

...which I guess means we need both patches; since it's not always due to the process still being active.
With comment 3 nits addressed.
Attachment #668850 - Flags: review?(gps)
Comment on attachment 668842 [details] [diff] [review]
Check process exited v1

Review of attachment 668842 [details] [diff] [review]:
-----------------------------------------------------------------

thanks!
Attachment #668842 - Flags: review?(jmaher) → review+
Thank you for the review :-)

Adjusting bug summary (again) to reflect the dual approach we're now taking; sorry for churn.
Summary: runxpcshelltests.py should check the test process has exited and if not, give a TBPL-compatible error → runxpcshelltests.py's profile cleanup should be more resilient and output TBPL-compatible errors
Comment on attachment 668842 [details] [diff] [review]
Check process exited v1

Review of attachment 668842 [details] [diff] [review]:
-----------------------------------------------------------------

At some point we should probably switch runxpcshelltests.py to use mozprocess. But, that's for another bug.
Attachment #668842 - Flags: review?(gps) → review+
Comment on attachment 668850 [details] [diff] [review]
Catch exceptions when removing profile

Review of attachment 668850 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM
Attachment #668850 - Flags: review?(gps) → review+
Thank you for the reviews :-)

https://hg.mozilla.org/integration/mozilla-inbound/rev/9a685a97bfec
https://hg.mozilla.org/integration/mozilla-inbound/rev/db8552af8d73

(Made one last tweak; added the traceback import to runxpcshelltests.py in response to a Try run)

Final example output (for the catch exceptions patch; I couldn't get the other oranges to recreate):
https://tbpl.mozilla.org/php/getParsedLog.php?id=15898864&full=1&branch=try#error0
Flags: in-testsuite-
Target Milestone: --- → mozilla18
Blocks: 799532
Blocks: 808545
Blocks: 809071
Blocks: 823996
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: