Closed Bug 1523860 Opened 5 years ago Closed 5 years ago

generic-worker-386 panics with error code 69 on aarch64

Categories

(Taskcluster :: Workers, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: egao, Assigned: pmoore)

System

  • winver: 17134.523, version 1803
  • generic-worker: 12.0.0
  • generic-worker arch: 386

Description

I had Generic-Worker running on an aarch64 reference system for two days before issues started occurring.

The following occurred, in chronological order.

Issue Begins

Sometime on January 29, I performed a wholesale removal of the user accounts that G-W created in order to execute tasks. The cleanup process involved the following:
  • removal of user accounts from Account Manager
  • deletion of directories named task_<integer> from C:\Users\

I then noticed that new tasks were not being executed.

Reinstallation of Generic-Worker

These steps were taken, all with an Administrator shell:

  1. deletion of G-W service using sc delete "Generic Worker"
[SC] DeleteService SUCCESS
  2. installation of G-W as a LocalSystem service
c:\generic-worker>generic-worker install service --config generic-worker.config
2019/01/30 00:34:06 Making system call GetProfilesDirectoryW with args: [0 18C3AC1C]
2019/01/30 00:34:06   Result: 0 7FFFFFF7 The data area passed to a system call is too small.
2019/01/30 00:34:06 Making system call GetProfilesDirectoryW with args: [18C42380 18C3AC1C]
2019/01/30 00:34:06   Result: 1 7FFFFFF7 The operation completed successfully.
2019/01/30 00:34:06 Command args: []string{"generic-worker", "install", "service", "--config", "generic-worker.config"}
2019/01/30 00:34:06 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'install' 'Generic Worker' 'c:\generic-worker\run-generic-worker.bat'
Service "Generic Worker" installed successfully!
2019/01/30 00:34:06 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppDirectory' 'c:\generic-worker'
Set parameter "AppDirectory" for service "Generic Worker".
2019/01/30 00:34:06 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'DisplayName' 'Generic Worker'
Reset parameter "DisplayName" for service "Generic Worker" to its default.
2019/01/30 00:34:06 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'Description' 'A taskcluster worker that runs on all mainstream platforms'
Set parameter "Description" for service "Generic Worker".
2019/01/30 00:34:06 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'Start' 'SERVICE_AUTO_START'
Set parameter "Start" for service "Generic Worker".
2019/01/30 00:34:06 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'Type' 'SERVICE_WIN32_OWN_PROCESS'
Set parameter "Type" for service "Generic Worker".
2019/01/30 00:34:06 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppPriority' 'NORMAL_PRIORITY_CLASS'
Reset parameter "AppPriority" for service "Generic Worker" to its default.
2019/01/30 00:34:06 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppNoConsole' '1'
Set parameter "AppNoConsole" for service "Generic Worker".
2019/01/30 00:34:06 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppAffinity' 'All'
Reset parameter "AppAffinity" for service "Generic Worker" to its default.
2019/01/30 00:34:06 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppStopMethodSkip' '0'
Set parameter "AppStopMethodSkip" for service "Generic Worker".
2019/01/30 00:34:07 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppStopMethodConsole' '1500'
Reset parameter "AppStopMethodConsole" for service "Generic Worker" to its default.
2019/01/30 00:34:07 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppStopMethodWindow' '1500'
Reset parameter "AppStopMethodWindow" for service "Generic Worker" to its default.
2019/01/30 00:34:07 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppStopMethodThreads' '1500'
Reset parameter "AppStopMethodThreads" for service "Generic Worker" to its default.
2019/01/30 00:34:07 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppThrottle' '1500'
Reset parameter "AppThrottle" for service "Generic Worker" to its default.
2019/01/30 00:34:07 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppExit' 'Default' 'Exit'
Set parameter "AppExit" for service "Generic Worker".
2019/01/30 00:34:07 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppRestartDelay' '0'
Set parameter "AppRestartDelay" for service "Generic Worker".
2019/01/30 00:34:07 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppStdout' 'c:\generic-worker\generic-worker-service.log'
Set parameter "AppStdout" for service "Generic Worker".
2019/01/30 00:34:07 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppStderr' 'c:\generic-worker\generic-worker-service.log'
Set parameter "AppStderr" for service "Generic Worker".
2019/01/30 00:34:07 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppStdoutCreationDisposition' '4'
Reset parameter "AppStdoutCreationDisposition" for service "Generic Worker" to its default.
2019/01/30 00:34:07 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppStderrCreationDisposition' '4'
Reset parameter "AppStderrCreationDisposition" for service "Generic Worker" to its default.
2019/01/30 00:34:07 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppRotateFiles' '1'
Set parameter "AppRotateFiles" for service "Generic Worker".
2019/01/30 00:34:07 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppRotateOnline' '1'
Set parameter "AppRotateOnline" for service "Generic Worker".
2019/01/30 00:34:07 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppRotateSeconds' '3600'
Set parameter "AppRotateSeconds" for service "Generic Worker".
2019/01/30 00:34:08 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppRotateBytes' '0'
Set parameter "AppRotateBytes" for service "Generic Worker".

c:\generic-worker>generic-worker install service --config generic-worker.config
2019/01/30 00:36:41 Making system call GetProfilesDirectoryW with args: [0 18C34C1C]
2019/01/30 00:36:41   Result: 0 7FFFFFF7 The data area passed to a system call is too small.
2019/01/30 00:36:41 Making system call GetProfilesDirectoryW with args: [18C3C380 18C34C1C]
2019/01/30 00:36:41   Result: 1 7FFFFFF7 The operation completed successfully.
2019/01/30 00:36:41 Command args: []string{"generic-worker", "install", "service", "--config", "generic-worker.config"}
2019/01/30 00:36:41 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'install' 'Generic Worker' 'c:\generic-worker\run-generic-worker.bat'
Service "Generic Worker" installed successfully!
2019/01/30 00:36:41 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppDirectory' 'c:\generic-worker'
Set parameter "AppDirectory" for service "Generic Worker".
2019/01/30 00:36:41 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'DisplayName' 'Generic Worker'
Reset parameter "DisplayName" for service "Generic Worker" to its default.
2019/01/30 00:36:41 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'Description' 'A taskcluster worker that runs on all mainstream platforms'
Set parameter "Description" for service "Generic Worker".
2019/01/30 00:36:41 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'Start' 'SERVICE_AUTO_START'
Set parameter "Start" for service "Generic Worker".
2019/01/30 00:36:41 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'Type' 'SERVICE_WIN32_OWN_PROCESS'
Set parameter "Type" for service "Generic Worker".
2019/01/30 00:36:41 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppPriority' 'NORMAL_PRIORITY_CLASS'
Reset parameter "AppPriority" for service "Generic Worker" to its default.
2019/01/30 00:36:41 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppNoConsole' '1'
Set parameter "AppNoConsole" for service "Generic Worker".
2019/01/30 00:36:42 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppAffinity' 'All'
Reset parameter "AppAffinity" for service "Generic Worker" to its default.
2019/01/30 00:36:42 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppStopMethodSkip' '0'
Set parameter "AppStopMethodSkip" for service "Generic Worker".
2019/01/30 00:36:42 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppStopMethodConsole' '1500'
Reset parameter "AppStopMethodConsole" for service "Generic Worker" to its default.
2019/01/30 00:36:42 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppStopMethodWindow' '1500'
Reset parameter "AppStopMethodWindow" for service "Generic Worker" to its default.
2019/01/30 00:36:42 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppStopMethodThreads' '1500'
Reset parameter "AppStopMethodThreads" for service "Generic Worker" to its default.
2019/01/30 00:36:42 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppThrottle' '1500'
Reset parameter "AppThrottle" for service "Generic Worker" to its default.
2019/01/30 00:36:42 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppExit' 'Default' 'Exit'
Set parameter "AppExit" for service "Generic Worker".
2019/01/30 00:36:42 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppRestartDelay' '0'
Set parameter "AppRestartDelay" for service "Generic Worker".
2019/01/30 00:36:42 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppStdout' 'c:\generic-worker\generic-worker-service.log'
Set parameter "AppStdout" for service "Generic Worker".
2019/01/30 00:36:42 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppStderr' 'c:\generic-worker\generic-worker-service.log'
Set parameter "AppStderr" for service "Generic Worker".
2019/01/30 00:36:42 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppStdoutCreationDisposition' '4'
Reset parameter "AppStdoutCreationDisposition" for service "Generic Worker" to its default.
2019/01/30 00:36:43 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppStderrCreationDisposition' '4'
Reset parameter "AppStderrCreationDisposition" for service "Generic Worker" to its default.
2019/01/30 00:36:43 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppRotateFiles' '1'
Set parameter "AppRotateFiles" for service "Generic Worker".
2019/01/30 00:36:43 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppRotateOnline' '1'
Set parameter "AppRotateOnline" for service "Generic Worker".
2019/01/30 00:36:43 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppRotateSeconds' '3600'
Set parameter "AppRotateSeconds" for service "Generic Worker".
2019/01/30 00:36:43 Running command: 'C:\nssm-2.24\win64\nssm.exe' 'set' 'Generic Worker' 'AppRotateBytes' '0'
Set parameter "AppRotateBytes" for service "Generic Worker".
  3. attempted start of Generic-Worker service
C:\windows\system32>net start "Generic Worker"
The Generic Worker service is starting.
The Generic Worker service could not be started.

A service specific error occurred: 69.

More help is available by typing NET HELPMSG 3547.

Troubleshooting

Various changes were made to the generic-worker.config file in order to troubleshoot.

  • runTasksAsCurrentUser: toggled between false and true
  • openpgp key: regenerated
  • tasksDir: attempted multiple values, e.g. C:\ and C:\Users\Edwin
  • using another local administrator account: did not yield positive results

Logs

generic-worker.log

2019/01/30 00:36:45 Making system call GetProfilesDirectoryW with args: [0 18C84BCC]
2019/01/30 00:36:45   Result: 0 7FFFFFF7 The data area passed to a system call is too small.
2019/01/30 00:36:45 Making system call GetProfilesDirectoryW with args: [18C8C320 18C84BCC]
2019/01/30 00:36:45   Result: 1 7FFFFFF7 The operation completed successfully.
2019/01/30 00:36:45 Making system call GetProfilesDirectoryW with args: [0 18CE4278]
2019/01/30 00:36:45   Result: 0 7FFFFFF7 The data area passed to a system call is too small.
2019/01/30 00:36:45 Making system call GetProfilesDirectoryW with args: [18C8C980 18CE4278]
2019/01/30 00:36:45   Result: 1 7FFFFFF7 The operation completed successfully.
2019/01/30 00:36:45 Loading generic-worker config file 'c:\generic-worker\generic-worker.config'...
2019/01/30 00:36:45 Creating file c:\generic-worker\generic-worker.config...
2019/01/30 00:36:45 Saving file c:\generic-worker\generic-worker.config (absolute path: c:\generic-worker\generic-worker.config)
2019/01/30 00:36:45 Config: {
  "accessToken": "*************",
  "authBaseURL": "https://auth.taskcluster.net/v1",
  "availabilityZone": "",
  "cachesDir": "caches",
  "certificate": "",
  "checkForNewDeploymentEverySecs": 1800,
  "cleanUpTaskDirs": true,
  "clientId": "mozilla-auth0/ad|Mozilla-LDAP|egao/",
  "deploymentId": "",
  "disableReboots": false,
  "downloadsDir": "downloads",
  "ed25519SigningKeyLocation": "C:\\generic-worker\\ed_key",
  "idleTimeoutSecs": 0,
  "instanceId": "",
  "instanceType": "",
  "livelogCertificate": "",
  "livelogExecutable": "livelog",
  "livelogGETPort": 60023,
  "livelogKey": "",
  "livelogPUTPort": 60022,
  "livelogSecret": "*************",
  "numberOfTasksToRun": 0,
  "openpgpSigningKeyLocation": "C:\\generic-worker\\gpg_file",
  "privateIP": "",
  "provisionerBaseURL": "",
  "provisionerId": "test-provisioner",
  "publicIP": "75.156.86.153",
  "purgeCacheBaseURL": "https://purge-cache.taskcluster.net/v1/",
  "queueBaseURL": "https://queue.taskcluster.net/v1",
  "region": "",
  "requiredDiskSpaceMegabytes": 10240,
  "rootURL": "https://taskcluster.net",
  "runAfterUserCreation": "",
  "runTasksAsCurrentUser": false,
  "sentryProject": "",
  "shutdownMachineOnIdle": false,
  "shutdownMachineOnInternalError": false,
  "subdomain": "taskcluster-worker.net",
  "taskclusterProxyExecutable": "taskcluster-proxy",
  "taskclusterProxyPort": 80,
  "tasksDir": "C:\\Users",
  "workerGroup": "test-worker-group",
  "workerId": "test-worker-id",
  "workerType": "aarch64windowsedwin",
  "workerTypeMetadata": {
    "config": {
      "deploymentId": "",
      "runTasksAsCurrentUser": false
    },
    "generic-worker": {
      "go-arch": "386",
      "go-os": "windows",
      "go-version": "go1.10.3",
      "release": "https://github.com/taskcluster/generic-worker/releases/tag/v12.0.0",
      "revision": "636ac900e91adae5ca40c1b1709a521524385d42",
      "source": "https://github.com/taskcluster/generic-worker/commits/636ac900e91adae5ca40c1b1709a521524385d42",
      "version": "12.0.0"
    }
  }
}
2019/01/30 00:36:45 Detected windows platform
2019/01/30 00:36:45 Making system call GetProfilesDirectoryW with args: [0 18CE4E24]
2019/01/30 00:36:45   Result: 0 7FFFFFF7 The data area passed to a system call is too small.
2019/01/30 00:36:45 Making system call GetProfilesDirectoryW with args: [18C8D340 18CE4E24]
2019/01/30 00:36:45   Result: 1 7FFFFFF7 The operation completed successfully.
2019/01/30 00:36:45 Looking for existing task users to delete...
2019/01/30 00:36:45 Initialising task feature Live Log...
2019/01/30 00:36:45 Initialising task feature Taskcluster Proxy...
2019/01/30 00:36:45 Initialising task feature OS Groups...
2019/01/30 00:36:45 Initialising task feature Mounts/Caches...
2019/01/30 00:36:45 Loaded file file-caches.json
2019/01/30 00:36:45 Loaded file directory-caches.json
2019/01/30 00:36:45 Initialising task feature Supersede...
2019/01/30 00:36:45 Initialising task feature RDP...
2019/01/30 00:36:45 Initialising task feature Run As Administrator...
2019/01/30 00:36:45 Initialising task feature Chain of Trust...
2019/01/30 00:36:45 All features initialised.
2019/01/30 00:36:45 Making system call WTSGetActiveConsoleSessionId with args: []
2019/01/30 00:36:45   Result: 1 18AC98F8 The operation completed successfully.
2019/01/30 00:36:45 Making system call WTSQueryUserToken with args: [1 18CE5658]
2019/01/30 00:36:45   Result: 1 2 The operation completed successfully.
2019/01/30 00:36:45 Making system call GetUserProfileDirectoryW with args: [2BC 0 18CE5690]
2019/01/30 00:36:45   Result: 0 2 The data area passed to a system call is too small.
2019/01/30 00:36:45 Making system call GetUserProfileDirectoryW with args: [2BC 18D04820 18CE5690]
2019/01/30 00:36:45   Result: 1 2 The operation completed successfully.
2019/01/30 00:36:45 Saving file file-caches.json (absolute path: c:\generic-worker\file-caches.json)
2019/01/30 00:36:45 Saving file directory-caches.json (absolute path: c:\generic-worker\directory-caches.json)
2019/01/30 00:36:45 goroutine 1 [running]:
runtime/debug.Stack(0x0, 0x0, 0x18d62130)
	/home/travis/.gimme/versions/go1.10.3.src/src/runtime/debug/stack.go:24 +0x8a
main.HandleCrash(0x8bfa20, 0x18d5e000)
	/home/travis/gopath/src/github.com/taskcluster/generic-worker/main.go:583 +0x1e
main.RunWorker.func1(0x18ac9ea0)
	/home/travis/gopath/src/github.com/taskcluster/generic-worker/main.go:602 +0x3d
panic(0x8bfa20, 0x18d5e000)
	/home/travis/.gimme/versions/go1.10.3.src/src/runtime/panic.go:502 +0x1d0
main.prepareTaskUser(0x18d04680, 0xf, 0x2)
	/home/travis/gopath/src/github.com/taskcluster/generic-worker/plat_windows.go:179 +0x6d9
main.PrepareTaskEnvironment(0x6fc1526c)
	/home/travis/gopath/src/github.com/taskcluster/generic-worker/main.go:1355 +0xe2
main.RunWorker(0x0)
	/home/travis/gopath/src/github.com/taskcluster/generic-worker/main.go:654 +0x346
main.main()
	/home/travis/gopath/src/github.com/taskcluster/generic-worker/main.go:431 +0x9c3
2019/01/30 00:36:45  *********** PANIC occurred! *********** 
2019/01/30 00:36:45 exit status 1332
2019/01/30 00:36:45 No sentry project defined, not reporting to sentry
2019/01/30 00:36:45 Exiting worker with exit code 69

Permission of tasks directory:

C:\windows\system32>icacls "C:\Users\task_1548723780"
C:\Users\task_1548723780 NT AUTHORITY\SYSTEM:(I)(OI)(CI)(F)
                         BUILTIN\Administrators:(I)(OI)(CI)(F)
                         BUILTIN\Users:(I)(RX)
                         BUILTIN\Users:(I)(OI)(CI)(IO)(GR,GE)
                         Everyone:(I)(RX)
                         Everyone:(I)(OI)(CI)(IO)(GR,GE)

Successfully processed 1 files; Failed processing 0 files
Assignee: nobody → pmoore
Status: NEW → ASSIGNED

Recent update:

I performed a Windows Reset/Refresh with the option Keep My Files.

Configuration Changes

  • runTasksAsCurrentUser was toggled to false in generic-worker.config
  • tasksDir was modified to C:\\tasksDir where tasksDir has the following permissions:
C:\tasksDir NT AUTHORITY\Authenticated Users:(OI)(CI)(F)
            BUILTIN\Users:(OI)(CI)(F)
            BUILTIN\Administrators:(I)(OI)(CI)(F)
            NT AUTHORITY\SYSTEM:(I)(OI)(CI)(F)
            BUILTIN\Users:(I)(OI)(CI)(RX)
            NT AUTHORITY\Authenticated Users:(I)(M)
            NT AUTHORITY\Authenticated Users:(I)(OI)(CI)(IO)(M)

Successfully processed 1 files; Failed processing 0 files

Process

After the refresh was successfully completed by Windows, I performed the following steps using the files that were retained as part of the reset:

  • ensured Generic Worker service was not present
  • confirmed that original generic-worker.exe v12.0.0 was retained from prior to the reset
  • net localgroup "Remote Desktop Users" /add
  • generic-worker.exe install service --config generic-worker.config --nssm c:\nssm-2.24\win32\nssm.exe

What happened:

  • localgroup was added successfully
  • generic-worker service was installed successfully
  • generic-worker requested a reboot
  • post-reboot, generic-worker has been confirmed to be running

What this means

  • resetting the Windows installation repaired the permission and/or user account problems that I caused by removing the accounts.
  • this issue likely did not arise in the CI environment because the machines are torn down after each test, whereas this aarch64 laptop is constantly running in a 'dirty' state.

For the time being, it seems that Generic-Worker is able to run again on my system and I can continue with the experiments to get CI tasks running.

(In reply to Edwin Gao (:egao) from comment #0)

2019/01/30 00:36:45 goroutine 1 [running]:
runtime/debug.Stack(0x0, 0x0, 0x18d62130)
/home/travis/.gimme/versions/go1.10.3.src/src/runtime/debug/stack.go:24 +0x8a
main.HandleCrash(0x8bfa20, 0x18d5e000)
/home/travis/gopath/src/github.com/taskcluster/generic-worker/main.go:583 +0x1e
main.RunWorker.func1(0x18ac9ea0)
/home/travis/gopath/src/github.com/taskcluster/generic-worker/main.go:602 +0x3d
panic(0x8bfa20, 0x18d5e000)
/home/travis/.gimme/versions/go1.10.3.src/src/runtime/panic.go:502 +0x1d0
main.prepareTaskUser(0x18d04680, 0xf, 0x2)
/home/travis/gopath/src/github.com/taskcluster/generic-worker/plat_windows.go:179 +0x6d9
main.PrepareTaskEnvironment(0x6fc1526c)
/home/travis/gopath/src/github.com/taskcluster/generic-worker/main.go:1355 +0xe2
main.RunWorker(0x0)
/home/travis/gopath/src/github.com/taskcluster/generic-worker/main.go:654 +0x346
main.main()
/home/travis/gopath/src/github.com/taskcluster/generic-worker/main.go:431 +0x9c3
2019/01/30 00:36:45 *********** PANIC occurred! ***********
2019/01/30 00:36:45 exit status 1332
2019/01/30 00:36:45 No sentry project defined, not reporting to sentry
2019/01/30 00:36:45 Exiting worker with exit code 69

This failure comes from here:

err = exec.Command("icacls", taskContext.TaskDir, "/grant", taskContext.LogonSession.User.Name+":(OI)(CI)F").Run()

The icacls command is exiting with code 1332 (the Win32 error ERROR_NONE_MAPPED, "No mapping between account names and security IDs was done"), which suggests that the user does not exist. This user is taken from the autologon credentials in the system registry.

I think the problem was this:

  1. The worker was successfully running tasks, and there was a problem when upgrading the worker. Possibly runTasksAsCurrentUser was set to true accidentally which meant that the worker process was not able to e.g. update the C:\Users folder.
  2. Every time a new task user is created, the task user's username and password are written to the Windows registry so that the task user can log on automatically, and then the system is rebooted.
  3. At some point the existing task user was manually deleted.
  4. The worker assumes that the current task user is the username that is specified in the winlogon registry. Probably the winlogon registry still had the old task user username in it, so the worker tried to grant that user access to the task directory, but that user no longer exists on the system, so it panics.

There is probably a fix to be made here: use a different mechanism to determine who the current task user is (or whether there is one), and make sure it matches the user currently logged in to the interactive desktop session.

In any case, I'm glad you have it working again. If it happens again, and you delete a task_* user account manually, see if this problem gets solved by also resetting the registry keys:

  • HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon\AutoAdminLogon
  • HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon\DefaultUserName
  • HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon\DefaultPassword
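
For reference, a minimal Go sketch that prints reg.exe commands for those three values. It assumes that "resetting" means setting AutoAdminLogon to 0 and deleting DefaultUserName and DefaultPassword; that interpretation, and the helper names resetAutologonCommands and quoteArg, are assumptions for illustration. The commands are only printed, not executed, so they can be reviewed before being run from an Administrator shell.

```go
package main

import (
	"fmt"
	"strings"
)

// winlogonKey is the registry key that holds the three values listed above.
const winlogonKey = `HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon`

// resetAutologonCommands returns reg.exe argument lists that disable
// automatic logon and remove the stored credentials.
// ASSUMPTION: "resetting" means AutoAdminLogon=0 plus deleting
// DefaultUserName and DefaultPassword.
func resetAutologonCommands() [][]string {
	return [][]string{
		{"reg", "add", winlogonKey, "/v", "AutoAdminLogon", "/t", "REG_SZ", "/d", "0", "/f"},
		{"reg", "delete", winlogonKey, "/v", "DefaultUserName", "/f"},
		{"reg", "delete", winlogonKey, "/v", "DefaultPassword", "/f"},
	}
}

// quoteArg wraps an argument in double quotes if it contains a space,
// so the printed command can be pasted into cmd.exe.
func quoteArg(arg string) string {
	if strings.Contains(arg, " ") {
		return `"` + arg + `"`
	}
	return arg
}

func main() {
	for _, args := range resetAutologonCommands() {
		quoted := make([]string, len(args))
		for i, a := range args {
			quoted[i] = quoteArg(a)
		}
		fmt.Println(strings.Join(quoted, " "))
	}
}
```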
Flags: needinfo?(egao)

Thank you for looking into this issue - much appreciated :pmoore.

Your account of the possible things that went wrong sounds like what I did wrong with my hardware.

For future reference, do not delete task users and directories, or if that must be done, modify registry values. Got it.

Would you prefer to re-title and keep this bug open for the potential future work? Otherwise I would say this bug should be closed.

Flags: needinfo?(egao)
No longer blocks: 1523722
Component: Generic-Worker → Workers

I think this is unrelated to bug 1492617; let me know if I'm wrong.

No longer blocks: 1492617

(In reply to Edwin Gao (:egao) from comment #3)

In generic-worker 15.1.0 the task user logic has been considerably reworked: if an autologon user is configured in the registry, the worker now confirms that it matches the user that is currently interactively logged in (if there is one), and that the username matches the expected format for a task user. If any of these checks fails for any reason, a descriptive failure message is logged and the worker exits with a non-zero exit code. So I think we can probably mark this as done.
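
The checks described above could look roughly like the following Go sketch. This is not the actual 15.1.0 code: the function name checkAutologonUser is hypothetical, and the task_<integer> username pattern is an assumption based on the names seen earlier in this bug. Reading the registry and querying the interactive session are left out; both names are passed in as plain strings.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// taskUserPattern is an ASSUMED format for task usernames, based on the
// task_<integer> names that appear earlier in this bug.
var taskUserPattern = regexp.MustCompile(`^task_[0-9]+$`)

// checkAutologonUser is a hypothetical helper. autologonUser is the name
// read from the Winlogon registry key ("" if none is configured);
// interactiveUser is the user logged in to the console session ("" if none).
func checkAutologonUser(autologonUser, interactiveUser string) error {
	if autologonUser == "" {
		return nil // no autologon user configured, nothing to verify
	}
	if !taskUserPattern.MatchString(autologonUser) {
		return fmt.Errorf("autologon user %q does not look like a task user", autologonUser)
	}
	// Windows account names are case-insensitive, so compare accordingly.
	if interactiveUser != "" && !strings.EqualFold(interactiveUser, autologonUser) {
		return fmt.Errorf("autologon user %q does not match logged-in user %q", autologonUser, interactiveUser)
	}
	return nil
}

func main() {
	// Example only: a stale task user in the registry while someone else is logged in.
	if err := checkAutologonUser("task_1548723780", "Edwin"); err != nil {
		fmt.Println("worker would exit:", err)
	}
}
```

A failed check returns a descriptive error instead of panicking later when the stale username is handed to icacls.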

Is that OK with you Edwin?

I would recommend upgrading generic-worker in any case to pick up the latest fixes and features. Note the release notes advise of breaking changes, so when upgrading from version X to version Y, be sure to check all release notes between version X and Y, to make sure you capture all required changes. But I'm around to support if you get stuck in any way.

Thanks!

Flags: needinfo?(egao)

:pmoore - I agree that this bug can be closed. Generic-worker for the windows10-aarch64 hardware running at Bitbar has proven quite stable so far with v14.0.1 that was set up using OpenCloudConfig.

Flags: needinfo?(egao)
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED