Closed Bug 582821 Opened 14 years ago Closed 11 years ago

[Win7][disabled on Windows] intermittent exception in test_nsIProcess.js : from nsILocalFile.moveTo in rename_and_test (causing the slave to fail with "rm: cannot remove directory `build/xpcshell/tests/xpcom/tests/unit': Directory not empty" till clobber)

Categories

(Core :: XPCOM, defect)

x86
Windows 7
defect
Not set
critical

Tracking

()

RESOLVED FIXED

People

(Reporter: dbaron, Unassigned)

References

Details

(Keywords: intermittent-failure, Whiteboard: [test which breaks the slave][test disabled on Windows])

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1280361270.1280364347.13736.gz
Rev3 WINNT 6.1 mozilla-central opt test xpcshell
s: talos-r3-w7-037
TEST-UNEXPECTED-FAIL | c:\talos-slave\mozilla-central-win7-opt-u-xpcshell\build\xpcshell\tests\xpcom\unit\test_nsIProcess.js | test failed (with xpcshell return code: 0), see following log:
TEST-UNEXPECTED-FAIL | (xpcshell/head.js) | [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsILocalFile.moveTo]" nsresult: "0x80004005 (NS_ERROR_FAILURE)" location: "JS frame :: c:/talos-slave/mozilla-central-win7-opt-u-xpcshell/build/xpcshell/tests/xpcom/unit/test_nsIProcess.js :: rename_and_test :: line 159" data: no]

I don't think this has been observed before, but we've only had the Win7 xpcshell unhidden for a few days.
This may be similar to problems we've been seeing where Windows holds on to conftest.exe for a few seconds after you've run it. I've heard that making the "Application Experience" service start automatically instead of delay-start will help things (I've done that on my machine).

But I also think we could just use copyTo instead of moveTo in this test.
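If Windows really does keep the executable locked for a few seconds after it has run, one generic way to tolerate that is to retry the rename with a growing delay. The following is only a minimal sketch in Python, not the nsILocalFile API and not anything that was tried in this bug; `move_with_retry`, its parameters, and the injectable `replace` hook are hypothetical names introduced for illustration.

```python
import os
import time


def move_with_retry(src, dst, attempts=5, delay=0.5, replace=os.replace):
    """Rename src to dst, retrying while the OS reports the file as busy.

    Returns the number of tries it took. `replace` is injectable so the
    retry logic can be exercised without a real locked file.
    """
    for attempt in range(attempts):
        try:
            replace(src, dst)
            return attempt + 1
        except PermissionError:
            # Assumption: a file still held open by another process
            # typically surfaces as PermissionError on Windows.
            if attempt == attempts - 1:
                raise  # give up and let the caller see the failure
            time.sleep(delay * (2 ** attempt))  # exponential backoff
```

A retry like this only helps if the lock is short-lived; if the wedged state lasts much longer than a few seconds, the last attempt still raises and the caller sees the original error.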
(In reply to comment #1)
> This may be similar to problems we've been seeing where Windows holds on to
> conftest.exe for a few seconds after you've run it. I've heard that making the
> "Application Experience" service start automatically instead of delay-start
> will help things (I've done that on my machine).
>
> But I also think we could just use copyTo instead of moveTo in this test.

For what it's worth, we sleep for 30 seconds before launching Buildbot on this platform. I had a look through services.msc on a 32-bit Windows 7 machine and "Application Experience" has its Startup Type set to "Manual". On this machine, it has been started, though I'm not sure by what. Based on all the stops and starts in the System log, it seems to launch on demand, which I guess is what your comment implies, Benjamin.
philringnalda%gmail.com
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1281496696.1281499913.4488.gz
Rev3 WINNT 6.1 mozilla-central opt test xpcshell on 2010/08/10 20:18:16
s: talos-r3-w7-034
TEST-UNEXPECTED-FAIL | c:\talos-slave\mozilla-central_win7_test-xpcshell\build\xpcshell\tests\xpcom\unit\test_nsIProcess.js | test failed (with xpcshell return code: 0), see following log:
TEST-UNEXPECTED-FAIL | (xpcshell/head.js) | [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsILocalFile.moveTo]" nsresult: "0x80004005 (NS_ERROR_FAILURE)" location: "JS frame :: c:/talos-slave/mozilla-central_win7_test-xpcshell/build/xpcshell/tests/xpcom/unit/test_nsIProcess.js :: rename_and_test :: line 159" data: no]
http://tinderbox.mozilla.org/showlog.cgi?log=Mozilla-Aurora/1305062068.1305064280.24657.gz
Rev3 WINNT 6.1 mozilla-aurora opt test xpcshell on 2011/05/10 14:14:28
s: talos-r3-w7-025
TEST-UNEXPECTED-FAIL | c:\talos-slave\test\build\xpcshell\tests\xpcom\tests\unit\test_nsIProcess.js | test failed (with xpcshell return code: 0), see following log:
TEST-UNEXPECTED-FAIL | (xpcshell/head.js) | [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsILocalFile.moveTo]" nsresult: "0x80004005 (NS_ERROR_FAILURE)" location: "JS frame :: c:/talos-slave/test/build/xpcshell/tests/xpcom/tests/unit/test_nsIProcess.js :: rename_and_test :: line 159" data: no]
And if you then look at the rest of that slave's morning in https://build.mozilla.org/buildapi/recent/talos-r3-w7-035 you'll see that it went on to turn red every job that feels the need for an rm -rf build, because of "rm: cannot remove directory `build/xpcshell/tests/xpcom/tests/unit': Directory not empty". I don't really know whether this test is the cause of that fairly frequent failure (sometimes mentioned in bug 692715, other times not) or just its first victim.
Blocks: 692715
(In reply to Benjamin Smedberg [:bsmedberg] from comment #1)
> This may be similar to problems we've been seeing where Windows holds on
> to conftest.exe for a few seconds after you've run it.

Seems to hold on for ever so slightly more than a few seconds: https://build.mozilla.org/buildapi/recent/talos-r3-w7-037 won't show it forever, but right now it shows another 12 red runs after the comment 20 run, broken up by two green Talos runs (green because they don't try to rm -rf build and thus don't get "rm: cannot remove directory `build/xpcshell/tests/xpcom/tests/unit': Directory not empty"). I've never just let one go, but sometimes when I've sent whoever I could find off to help the slave, it has already healed, which I suspect means it did another xpcshell run. My bet is that rather than "for a few seconds" the wedged state lasts "until this test runs again."
Whiteboard: [orange] → [orange][test which breaks the slave]
And indeed, 037 "burned" (a misnomer, since the tests run just fine, but colored and no bug suggested by tbpl means retrigger for most people, so practically they burn) another 31 jobs until I triggered 30 or 40 xpcshell jobs on a try push to trap it into having to do one, after which it has been green again.
Depends on: 727543
I'm no longer willing to make releng run around manually cleaning up after this test, so we need to do something about it, whether or not that something is to stop running it.
Severity: normal → critical
Summary: [Win7] intermittent exception in test_nsIProcess.js : from nsILocalFile.moveTo in rename_and_test → [Win7] intermittent exception in test_nsIProcess.js : from nsILocalFile.moveTo in rename_and_test (causing the slave to fail with "rm: cannot remove directory `build/xpcshell/tests/xpcom/tests/unit': Directory not empty" till clobber)
https://tbpl.mozilla.org/?tree=Try&rev=7357e37293a7 is a naive s/moveTo/copyTo/, just copying and leaving the copied file, since it looked to me at a glance like the existing scheme was "create it, move it, move it back, abandon it without deleting it." Failed despite the test passing, on every flavor of Windows, because the copied file couldn't be removed.
Oh, awesomesauce: the reason comment 34 looked like more things than talos-r3-w7-034 should have been able to run overnight is because my try run created four more slaves in this state.
Ahah, you broke our slaves :) Btw, couldn't this test move/copy the files to the profile folder that is automatically thrown away? As a side note, Neil pointed out on IRC that he had issues (basically the same bug) moving Unicode-named files in the past.
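The idea of working in a throwaway location rather than inside the build directory can be sketched as follows. This is a Python illustration under stated assumptions, not the xpcshell profile-folder mechanism itself: `run_from_scratch_copy` and the `run` callback are hypothetical names, and a plain temporary directory stands in for the profile folder.

```python
import shutil
import tempfile
from pathlib import Path


def run_from_scratch_copy(executable, run):
    """Copy `executable` into a temporary directory, call run(copy_path),
    then remove the directory, tolerating files that cannot be deleted yet.
    """
    scratch = Path(tempfile.mkdtemp(prefix="process-test-"))
    try:
        copy = scratch / ("renamed-" + Path(executable).name)
        shutil.copy2(executable, copy)  # the original file is never moved
        return run(copy)
    finally:
        # ignore_errors=True: a copy that is still locked is left behind in
        # the temp area instead of failing the run or blocking a later
        # cleanup of the build directory.
        shutil.rmtree(scratch, ignore_errors=True)
```

The point of the design is that a file the OS refuses to delete ends up somewhere a later "rm -rf build" never looks, so a leftover costs some disk space in the temp area rather than a broken slave.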
So that's another slave broken
Depends on: 733071
And that's enough of this. Disabled on Windows in https://hg.mozilla.org/integration/mozilla-inbound/rev/eed9e3c1ea13
Summary: [Win7] intermittent exception in test_nsIProcess.js : from nsILocalFile.moveTo in rename_and_test (causing the slave to fail with "rm: cannot remove directory `build/xpcshell/tests/xpcom/tests/unit': Directory not empty" till clobber) → [Win7][disabled on Windows] intermittent exception in test_nsIProcess.js : from nsILocalFile.moveTo in rename_and_test (causing the slave to fail with "rm: cannot remove directory `build/xpcshell/tests/xpcom/tests/unit': Directory not empty" till clobber)
(In reply to Phil Ringnalda (:philor) from comment #78)
> And that's enough of this. Disabled on Windows in
> https://hg.mozilla.org/integration/mozilla-inbound/rev/eed9e3c1ea13

Merged that to m-c: https://hg.mozilla.org/mozilla-central/rev/eed9e3c1ea13
Depends on: 745404
Whiteboard: [orange][test which breaks the slave] → [orange][test which breaks the slave][test disabled on Windows]
Whiteboard: [orange][test which breaks the slave][test disabled on Windows] → [test which breaks the slave][test disabled on Windows]
No longer blocks: 692715
Status: NEW → RESOLVED
Closed: 11 years ago
Depends on: 692715
Resolution: --- → FIXED