Closed Bug 1404450 Opened 7 years ago Closed 6 years ago

taskcluster win7 loaner keeps shutting down

Categories: Taskcluster :: Services, defect (priority: Not set, severity: normal)

Tracking: Not tracked

Status: RESOLVED FIXED

People: Reporter: rwood, Assigned: grenade

I'm trying to debug a talos suite on a tc win 7 loaner, obtained via these instructions [1].

The tc win7 loaner task spins up fine, and once I grab the credentials I can remote desktop into the loaner as soon as it is available. I get started setting up the dev environment, and then the AWS instance is killed and my connection terminates.

I thought maybe it was being bumped by TC because the wiki says it can be pulled at any time (i.e. maybe there was a shortage, or lots of other higher-priority tasks pending). However, thanks to :dustin looking into one of my loaner instances, it looks like it is dying due to another issue:

HaltOnIdle: loaner state is unknown and has not been rectified since last check at 6:07:07 PM. instance will be terminated.

<•dustin> full logs from papertrail if you want to file it

https://irccloud.mozilla.com/pastebin/Y2QzfU8O/

[1] https://wiki.mozilla.org/ReleaseEngineering/How_To/Self_Provision_a_TaskCluster_Windows_Instance
Sep 29 13:03:44 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com generic-worker: 2017/09/29 18:03:42 1 put requests issued to https://taskcluster-public-artifacts.s3-us-west-2.amazonaws.com/EfXNlGOXQ_uMKBVrByyeKA/0/public/logs/live_backing.log?AWSAccessKeyId=AKIAJQESBGXODWDRTZUA&Content-Type=text%2Fplain%3B%20charset%3Dutf-8&Expires=1506710033&Signature=SDPfteRjOopQ4dxh6EV1FxKyZhg%3D 
Sep 29 13:03:44 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com generic-worker: 2017/09/29 18:03:42 Response 
Sep 29 13:03:44 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com generic-worker: 2017/09/29 18:03:42 HTTP/1.1 200 OK 
Sep 29 13:03:44 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com generic-worker: Content-Length: 0 
Sep 29 13:03:44 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com generic-worker: Date: Fri, 29 Sep 2017 18:03:44 GMT 
Sep 29 13:03:44 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com generic-worker: Etag: "098b85640af4eb80182b554ae27137a6" 
Sep 29 13:03:44 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com generic-worker: Server: AmazonS3 
Sep 29 13:03:44 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com generic-worker: X-Amz-Id-2: 5B0hnGDmoV9xTrSLmIRH/NaUpyYdZSyVYxdxhxYkLxkWtXcKZieoykG34W2lgxFPKGoRb47nDH0= 
Sep 29 13:03:44 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com generic-worker: X-Amz-Request-Id: A2C88E8C7293AE5F 
Sep 29 13:03:44 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com generic-worker: X-Amz-Version-Id: S.Jax.J1FJ9VA7fTD2AVLosmCV8axv_9 
Sep 29 13:03:44 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com generic-worker: 2017/09/29 18:03:42 Resolving task... 
Sep 29 13:03:45 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com generic-worker: 2017/09/29 18:03:43 ERROR encountered: Task not successful due to following exception(s): 
Sep 29 13:03:45 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com generic-worker: Exception 1) 
Sep 29 13:03:45 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com generic-worker: [] 
Sep 29 13:03:45 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com generic-worker: Exit code: 4294967295 
Sep 29 13:03:45 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com generic-worker: 2017/09/29 18:03:43 No previous task user desktop, so no need to close any open desktops 
Sep 29 13:03:45 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com generic-worker: 2017/09/29 18:03:43 Trying to remove directory 'Z:\task_1506707840' via os.RemoveAll(path) call as GenericWorker user... 
Sep 29 13:04:43 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com generic-worker: 2017/09/29 18:04:41 Checking if there is a new deploymentId... 
Sep 29 13:04:43 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com generic-worker: 2017/09/29 18:04:41 No change to deploymentId - "baebbfda25e0" == "baebbfda25e0" 
Sep 29 13:05:08 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com HaltOnIdle: loaner state unknown 
Sep 29 13:07:09 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com HaltOnIdle: loaner state unknown 
Sep 29 13:09:09 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com HaltOnIdle: loaner state is unknown and has not been rectified since last check at 6:07:07 PM. instance will be terminated. 
Sep 29 13:09:09 i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com USER32: The process C:\windows\system32\shutdown.exe (I-0BFD47798AA31) has initiated the shutdown of computer I-0BFD47798AA31 on behalf of user NT AUTHORITY\SYSTEM for the following reason: Application: Maintenance (Planned)   Reason Code: 0x80040001   Shutdown Type: shutdown   Comment: HaltOnIdle :: loaner state unknown and unrectified
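
For anyone triaging a similar papertrail dump, here is a minimal Python sketch that pulls out only the HaltOnIdle entries and measures the gap between consecutive checks. The function names, the regex, and the `year` parameter are illustrative assumptions (the syslog prefix carries no year); this is not part of HaltOnIdle or the loaner prep scripts.

```python
import re
from datetime import datetime

# Assumed line shape, based on the excerpt above:
# "<Mon> <day> <HH:MM:SS> <host> <program>: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<program>[^:]+): (?P<message>.*)$"
)

HOST = "i-0bfd47798aa31f4be.gecko-t-win7-32.use1.mozilla.com"

# A few lines copied from the log above (the last message is shortened).
SAMPLE = [
    f'Sep 29 13:04:43 {HOST} generic-worker: 2017/09/29 18:04:41 Checking if there is a new deploymentId...',
    f'Sep 29 13:05:08 {HOST} HaltOnIdle: loaner state unknown',
    f'Sep 29 13:07:09 {HOST} HaltOnIdle: loaner state unknown',
    f'Sep 29 13:09:09 {HOST} HaltOnIdle: loaner state is unknown and has not been rectified',
]


def halt_on_idle_events(lines, year=2017):
    """Return (timestamp, message) pairs for lines logged by HaltOnIdle."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None or match.group("program") != "HaltOnIdle":
            continue
        # %b is locale-dependent; this assumes English month abbreviations.
        ts = datetime.strptime(f"{year} {match.group('ts')}", "%Y %b %d %H:%M:%S")
        events.append((ts, match.group("message")))
    return events


def gaps_in_seconds(events):
    """Seconds elapsed between each pair of consecutive events."""
    return [
        (later[0] - earlier[0]).total_seconds()
        for earlier, later in zip(events, events[1:])
    ]


if __name__ == "__main__":
    events = halt_on_idle_events(SAMPLE)
    for ts, message in events:
        print(ts.isoformat(), message)
    print(gaps_in_seconds(events))
```

On the sample lines the gaps come out at roughly two minutes each, which matches the log: the state is reported unknown on two consecutive checks, and the third check triggers the shutdown.
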
Assignee: nobody → rthijssen
Status: NEW → ASSIGNED
Hi Rob, in the meantime, is there another way I would be able to get a taskcluster win 7 (VM) loaner? I'm trying to figure out an issue with one of our talos test suites failing on a win 7 VM, as part of Bug 1369537.
Flags: needinfo?(rthijssen)
Robert, i've sent you an email with a temporary workaround.
Flags: needinfo?(rthijssen)
(In reply to Rob Thijssen (:grenade - UTC+3) from comment #3)
> Robert, i've sent you an email with a temporary workaround.

Workaround works great, thanks Rob!
we did eventually resolve this with a patch to the loaner prep scripts
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Component: AWS-Provisioner → Services