Closed Bug 1192307 Opened 4 years ago Closed 2 years ago

"Access denied" when killing xpcshell process on Win8/Win10

Categories

(Testing :: XPCShell Harness, defect)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jgriffin, Assigned: ted)

Sadly, it looks like we've been having xpcshell test timeouts on Windows 8 for some time, but these are showing as green in Treeherder: https://treeherder.mozilla.org/logviewer.html#?job_id=12629080&repo=mozilla-inbound

I'm not sure if this is a mozharness problem, or if the issue is elsewhere.

Interesting, Callek stood the tests up on Windows 10, and there the failures are flagged correctly:  https://treeherder.mozilla.org/logviewer.html#?job_id=10197137&repo=try
Hmm, scratch that: the same problem exists on Windows 10; the failures flagged there are different failures.
The "TEST-TIMEOUT" lines in the attached log are the expected behavior of bug 906510 (if I search for "test_provider_appinfo.js" in the log, for instance, it's retried at the end of the run and passes).
Thanks, so the only real bug here is our failure to shut down an xpcshell process correctly when it times out.
log snippet:

08:34:48 INFO - TEST-TIMEOUT | toolkit/components/places/tests/expiration/test_notifications_onDeleteVisits.js | took 300000ms
08:34:48 INFO - Can't trigger Breakpad, just killing process
08:34:48 INFO - xpcshell return code: None
08:34:48 INFO - Exception in thread Thread-4263:
08:34:48 INFO - Traceback (most recent call last):
08:34:48 INFO - File "c:\mozilla-build\python27\Lib\threading.py", line 551, in __bootstrap_inner
08:34:48 INFO - self.run()
08:34:48 INFO - File "c:\mozilla-build\python27\Lib\threading.py", line 755, in run
08:34:48 INFO - self.function(*self.args, **self.kwargs)
08:34:48 INFO - File "C:\slave\test\build\tests\xpcshell\runxpcshelltests.py", line 682, in <lambda>
08:34:48 INFO - testTimer = Timer(testTimeoutInterval, lambda: self.testTimeout(proc))
08:34:48 INFO - File "C:\slave\test\build\tests\xpcshell\runxpcshelltests.py", line 341, in testTimeout
08:34:48 INFO - self.postCheck(proc)
08:34:48 INFO - File "C:\slave\test\build\tests\xpcshell\runxpcshelltests.py", line 309, in postCheck
08:34:48 INFO - self.kill(proc)
08:34:48 INFO - File "C:\slave\test\build\tests\xpcshell\runxpcshelltests.py", line 213, in kill
08:34:48 INFO - return proc.kill()
08:34:48 INFO - File "c:\mozilla-build\python27\Lib\subprocess.py", line 1019, in terminate
08:34:48 INFO - _subprocess.TerminateProcess(self._handle, 1)
08:34:48 INFO - WindowsError: [Error 5] Access is denied
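For reference, the traceback above shows the kill running on a threading.Timer thread, so the WindowsError is printed as "Exception in thread ..." instead of being raised in the code that started the test. A minimal sketch of one way to capture such an error, assuming hypothetical names (run_timeout_handler, failing_kill) that are not part of runxpcshelltests.py:

```python
import threading


def run_timeout_handler(on_timeout, interval):
    """Run on_timeout on a Timer thread and collect any exception it raises.

    Hypothetical helper for illustration only; not the harness's code.
    """
    errors = []

    def guarded():
        try:
            on_timeout()
        except Exception as exc:
            # Without this guard the exception would only be printed by the
            # threading module and never reach the caller.
            errors.append(exc)

    timer = threading.Timer(interval, guarded)
    timer.start()
    timer.join()  # wait for the timer to fire and the callback to finish
    return errors


def failing_kill():
    # Stand-in for proc.kill() failing the way the log shows.
    raise OSError(5, "Access is denied")


errors = run_timeout_handler(failing_kill, 0.01)
```

The caller can then decide whether a failed kill should be reported as a test failure; whether the harness should do that is a separate question from this sketch.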
Summary: xpcshell test timeouts are green on Windows 8 → "Access denied" when killing xpcshell process on Win8/Win10
So in the simple case, Popen.kill works on my local Windows 10 machine:
>>> p = subprocess.Popen(['sleep', '100'])
>>> p.poll() is None
True
>>> p.kill()
>>> p.poll()
1

There must be something specific to running xpcshell here.
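A script version of the same check, in case anyone wants to run it on a slave. This is only a sketch: it uses sys.executable instead of `sleep` so it does not depend on MozillaBuild, and safe_kill is a hypothetical helper, not something in the harness. WindowsError is a subclass of OSError (an alias in Python 3), so catching OSError covers the "Access is denied" case.

```python
import subprocess
import sys


def safe_kill(proc):
    """Try to kill proc; return True if it is no longer running afterwards."""
    if proc.poll() is not None:
        return True  # already exited, nothing to do
    try:
        proc.kill()
    except OSError as exc:
        # On Windows a TerminateProcess failure surfaces here.
        print("kill failed: %s" % exc)
        return proc.poll() is not None
    proc.wait()  # reap the process so returncode is set
    return True


# Long-running child, equivalent to `sleep 100`.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(100)"])
still_running = child.poll() is None
killed = safe_kill(child)
```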
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #5)
> So in the simple case, Popen.kill works on my local Windows 10 machine:
> >>> p = subprocess.Popen(['sleep', '100'])
> >>> p.poll() is None
> True
> >>> p.kill()
> >>> p.poll()
> 1
> 
> There must be something specific to running xpcshell here.

What version of Python? Can you also try with Python 2.7.3?
This is Python 2.7.10 in the latest MozillaBuild release. I'll try 2.7.3 for sanity.
2.7.3 gives the same result for that simple test. I'll have to try making an xpcshell test hang and see what the harness does for me locally.
Hangs work fine locally, although I realized I'm doing a 32-bit build and the log is from a 64-bit build. I'll try doing a 64-bit build...

Another theory I had was that maybe 32-bit Python couldn't TerminateProcess on a 64-bit process, but I built a small 64-bit test program and that still worked fine, so it's not that.
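For anyone repeating the 32-bit vs 64-bit check, a quick way to confirm which kind of Python is running. This is a generic sketch using the standard struct module; the variable names are just for illustration.

```python
import struct
import sys

# Size of a C pointer in this interpreter, in bits: 32 or 64.
pointer_bits = struct.calcsize("P") * 8

print("Python %s, %d-bit" % (sys.version.split()[0], pointer_bits))
```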
A local Win64 build with test_sample.js hacked to have an infinite loop seems to handle it OK:
TEST-INFO | crashinject: exit status 0
 0:21.72 LOG: Thread-37 INFO xpcshell return code: None
 0:21.72 LOG: Thread-37 ERROR testing/xpcshell/example/unit/test_sample.js | Process still running after test!

I guess crashinject doesn't work, but I don't get the same error as in the log above.
I wrote a patch for bug 1193738, I'd like to see if that makes this any better.
I have a green try push for the patch from that bug:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=8cf7f60f999c

I scheduled Win10 xpcshell tests on it to see if that patch makes this any better.
Verdict: looks better.
Assignee: nobody → ted
Depends on: 1193738
Depends on: 1316309
No longer depends on: 1193738
This bug is from 2015 and has been superseded by the migration we've already done for talos tests to Windows 10 and the ongoing taskcluster work.
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED