Processes still running after task on Windows
Categories: Taskcluster :: General, defect
Tracking: Not tracked
People: Reporter: glandium, Unassigned
As can be seen at the end of: https://queue.taskcluster.net/v1/task/G0j-vl9PS861uEMQ7u-6Dg/runs/0/artifacts/public/logs/live_backing.log
[taskcluster:error] [mounts] Could not unmount <nil> due to: 'Could not persist cache "level-3-checkouts" due to remove Z:\task_1550107918\build\src\vs2017_15.4.2\VC\bin\Hostx64\x64\msvcp140.dll: Access is denied.'
[taskcluster 2019-02-14T02:24:22.928Z] Uploading redirect artifact public/logs/live.log to URL https://queue.taskcluster.net/v1/task/G0j-vl9PS861uEMQ7u-6Dg/runs/1/artifacts/public/logs/live_backing.log with mime type "text/plain; charset=utf-8" and expiry 2020-02-14T00:40:39.774Z
[taskcluster:error] Could not persist cache "level-3-checkouts" due to remove Z:\task_1550107918\build\src\vs2017_15.4.2\VC\bin\Hostx64\x64\msvcp140.dll: Access is denied.
The first line suggests some process using the msvcp140.dll file is still running, which shouldn't be happening: all processes should be killed at the end of a task.
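To see which files block cache persistence across several logs, the error lines can be parsed mechanically. The following is a minimal sketch, assuming the message format is exactly as quoted above; `PERSIST_ERROR` and `find_locked_files` are hypothetical names, not part of the worker.

```python
import re

# Pattern modelled on the error lines quoted above; the worker's exact
# message format is an assumption and may differ between versions.
PERSIST_ERROR = re.compile(
    r'Could not persist cache "(?P<cache>[^"]+)" due to remove '
    r"(?P<path>.+?): Access is denied\."
)


def find_locked_files(log_text):
    """Return the unique (cache name, locked file path) pairs, in log order."""
    seen = []
    for match in PERSIST_ERROR.finditer(log_text):
        pair = (match.group("cache"), match.group("path"))
        if pair not in seen:
            seen.append(pair)
    return seen
```

Both the "[mounts] Could not unmount" line and the plain "Could not persist cache" line contain the same fragment, so the duplicate is collapsed into a single entry.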
Comment 1 (Reporter) • 6 years ago
List of processes still running at the end of the failing task:
https://taskcluster-artifacts.net/ZFZlawJ5QxSovzeUV3ALFw/0/public/logs/live_backing.log
Z:\task_1550112176>wmic process get description,executablepath
Description ExecutablePath
System Idle Process
System
smss.exe
csrss.exe
wininit.exe
csrss.exe
winlogon.exe
services.exe
lsass.exe
svchost.exe
svchost.exe
dwm.exe
svchost.exe
svchost.exe
svchost.exe
svchost.exe
svchost.exe
spoolsv.exe
LiteAgent.exe
svchost.exe
dirmngr.exe
nssm.exe
IpOverUsbSvc.exe
cmd.exe
conhost.exe
nxlog.exe
svchost.exe
Ec2Config.exe
WmiPrvSE.exe
taskhostex.exe C:\Windows\system32\taskhostex.exe
explorer.exe C:\Windows\Explorer.EXE
WmiPrvSE.exe
svchost.exe
svchost.exe
msdtc.exe
generic-worker.exe
livelog.exe
vctip.exe z:\task_1550112176\build\src\vs2017_15.4.2\VC\bin\Hostx64\x64\VCTIP.EXE
cmd.exe C:\Windows\system32\cmd.exe
conhost.exe C:\Windows\system32\conhost.exe
WMIC.exe C:\Windows\System32\Wbem\WMIC.exe
Clearly, the culprit is vctip, which is presumably left over from running VC++. The job could avoid leaving it running, or avoid running it at all, but it is also clearly a problem that the worker process doesn't clean up running processes before unmounting caches.
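Spotting the leftover process by eye works for one log; for repeated use, the listing can be filtered for executables that live under the task directory. This is a rough sketch, assuming the `wmic process get description,executablepath` output looks like the listing above (description, then an optional drive-letter path). `processes_under` is a hypothetical helper.

```python
import re

# Splits a line such as "vctip.exe z:\task_1\...\VCTIP.EXE" into a
# description and a drive-letter path; lines without a path do not match.
LINE = re.compile(r"^(?P<desc>.+?)\s+(?P<path>[A-Za-z]:\\.*\S)\s*$")


def processes_under(wmic_output, task_dir):
    """Return (description, path) for processes whose executable is inside task_dir."""
    # Windows paths are case-insensitive, so compare in lower case.
    prefix = task_dir.rstrip("\\").lower() + "\\"
    found = []
    for line in wmic_output.splitlines():
        m = LINE.match(line.strip())
        if m and m.group("path").lower().startswith(prefix):
            found.append((m.group("desc"), m.group("path")))
    return found
```

Note that this only catches processes started from inside the task directory. A system executable holding a handle to a cached file would not show up here.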
Comment 2 • 6 years ago
It is by design that if a cache cannot be unmounted, it is not persisted.
The problem with killing processes in order to unmount a cache is this: if a process still running after the task has completed holds an open file handle to a file inside the cache, we can't be sure the cache is in a clean state. If we kill a process that is writing to the cache, we could leave the cache corrupt. We therefore only unmount caches if the task completed successfully and no still-running processes have open file handles to files inside the cache.
It is the responsibility of the task to ensure that no locks are held on caches when the task completes. If killing vctip is the preferred fix, the task should do so explicitly, since the worker cannot know whether killing a particular process will leave a cache in a bad state. The task, in contrast, understands what its processes are and whether it is safe to kill them.
Note workers are rebooted between tasks, so no zombie processes persist across task boundaries.
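As an illustration of handling this in the task, a final cleanup step could terminate known leftovers before the task exits. The sketch below uses the Windows `taskkill` command (`/F` forces termination, `/T` includes child processes, `/IM` selects by image name). The helper names are hypothetical, and whether a given process is safe to kill remains the task author's call.

```python
import subprocess


def taskkill_command(image_name):
    # /F forces termination, /T includes child processes, /IM selects by image name.
    return ["taskkill", "/F", "/T", "/IM", image_name]


def kill_leftovers(image_names, run=subprocess.run):
    """Run taskkill for each image name; return the names for which it exited non-zero.

    A non-zero exit also occurs when no such process is running, so a
    non-empty result is not necessarily a problem.
    """
    failed = []
    for name in image_names:
        result = run(taskkill_command(name), capture_output=True, text=True)
        if result.returncode != 0:
            failed.append(name)
    return failed
```

A task would call something like `kill_leftovers(["vctip.exe"])` as its last step. The `run` parameter exists only so the logic can be exercised without actually invoking `taskkill`.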
Comment 3 (Reporter) • 6 years ago
> It is by design that if a cache cannot be unmounted, it is not persisted.
Fine, but why does it have to make the task fail?
Comment 4 (Reporter) • 6 years ago
And after bug 1527798:
[taskcluster:error] [mounts] Could not unmount <nil> due to: 'Could not persist cache "level-3-checkouts" due to remove Z:\task_1550178085\build\src\vs2017_15.8.4\VC\bin\Hostx64\x64\mspdbcore.dll: Access is denied.'
[taskcluster:error] Could not persist cache "level-3-checkouts" due to remove Z:\task_1550178085\build\src\vs2017_15.8.4\VC\bin\Hostx64\x64\mspdbcore.dll: Access is denied.
Comment 5 • 6 years ago
(In reply to Mike Hommey [:glandium] from comment #3)
> > It is by design that if a cache cannot be unmounted, it is not persisted.
> Fine, but why does it have to make the task fail?
The task declares a cache to be persisted after the task completes, but once the task has completed there are still open file handles in the cache, so the cache cannot be released. This is a good enough reason to mark the task as failed. The task should release its open file handles before completing. Not doing so either means the worker can't release the cache, or forces it to aggressively kill processes, which could leave the cache in a compromised state. This isn't something we would want from a successful task. The open file handles indicate that the task did not complete successfully, as resources were not released.
By making it the responsibility of the task, rather than the worker, to release file handles, the worker makes no assumptions about which processes can be safely killed and which will interfere with the cache. Having a successful task whose cache is not persisted would also be strange and misleading behaviour.
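The policy described in this thread can be summarised as a small decision function. This is only a sketch of the reasoning in the comments, not the worker's actual code; `Outcome`, `resolve`, and the status strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    task_status: str      # "completed" or "failed" (illustrative labels)
    persist_cache: bool


def resolve(command_succeeded, cache_in_use):
    """Apply the policy from the comments to a single task run."""
    if not command_succeeded:
        # A failed task never has its cache persisted.
        return Outcome("failed", False)
    if cache_in_use:
        # Handles are still open: kill nothing, persist nothing, fail the task.
        return Outcome("failed", False)
    return Outcome("completed", True)
```

The combination the thread rules out is the one the reporter asks about: a task reported as successful while its declared cache is silently dropped.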
Comment 6 (Reporter) • 6 years ago
I guess we can agree this is wontfix. Still painful to discover that changes that affect tasks don't trigger said tasks.