794248 - Re-imaging XP machines can come back in a non-usable state

Reporter

Description


13 years ago

Hi Van,

Is there anything different in our XP imaging process? It seems that all recent XP machines cannot use taskkill properly, and this causes various problems.

To check whether a machine is affected, type:
* tasklist /svc
and you will see a message like this:
"ERROR: Logon failure: unknown user name or bad password."

talos-r3-xp-ref does not show this symptom. talos-r3-xp-001 got re-imaged today and it has this symptom.

What snapshot are we using? Have any of the tools for re-imaging XP slaves changed in the last 1-2 months? hosts? tools update? bootcamp?

There might be nothing going on, but it will help me a lot to cut down the list of possible problems. Excuse me if I'm asking nonsense! :)
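The manual check above could be scripted roughly as follows. This is only a sketch: the function names are hypothetical, it assumes a Python interpreter is available on the slave, and it matches only the error text quoted above (it merges stdout and stderr because the report does not say which stream the message lands on).

```python
import subprocess

# Error text quoted in the description; only the stable prefix is matched.
LOGON_FAILURE = "ERROR: Logon failure"


def has_logon_failure(output):
    """Return True if the tasklist output contains the logon failure error."""
    return LOGON_FAILURE.lower() in output.lower()


def tasklist_is_broken():
    """Run `tasklist /svc` locally and report whether it shows the symptom.

    Hypothetical helper: must be run on the Windows machine being checked.
    """
    proc = subprocess.Popen(
        ["tasklist", "/svc"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so the error is seen either way
        universal_newlines=True,
    )
    output, _ = proc.communicate()
    return has_logon_failure(output)
```

Calling tasklist_is_broken() on a re-imaged slave would then return True if the symptom is present; has_logon_failure() can be used on its own against captured output.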

Van Le [:van]

Comment 1


13 years ago

I PXE boot these XP hosts to get them to image, so nothing has changed on my end. Van

hwine

Comment 2


13 years ago

Van - this is in reference to Amy's comment in bug 788382 comment 20.

I believe Armen is asking for:
a) identification of the current PXE image you're using
b) the list of available "roll back" images (in case he needs to do a binary search on when the problem occurred)

Based on the ref image working, and the re-imaged machine not working, the validity of the PXE image is questionable.

Amy Rich [:arr] [:arich]

Assignee

Comment 3


13 years ago

Van can't answer any of these questions because all dcops does is kick off the automated imaging process that relops controls. DCops doesn't have access to the imaging server or the captured images.

Releng asked relops to take a new snapshot of the ref machine on 20120612, and this image has been used since then. We can instead image machines with the image snapshot that was requested on 20120508, or go back even further to 20111215 or 20111129. We can also try taking a new ref image, but it's unlikely that this will fix the problem unless there was a problem with the ref server in the first place that has since been corrected.

Neither relops nor dcops changes any of the tools on the ref servers, and no changes were made to the imaging server where the images were captured. The only change that's been made on the deployment server end was to the password that the imaging account uses, but an error there would have resulted in the machine not installing at all, rather than causing issues on the client end after the installation.
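If it comes to the binary search mentioned in comment 2, the four snapshot dates above could be bisected along these lines. This is a sketch only: first_bad_snapshot and the is_bad callback are hypothetical names, is_bad stands for "re-image a machine with this snapshot and check for the tasklist error", and the search assumes that once a snapshot is bad, every later one is bad too.

```python
def first_bad_snapshot(snapshots, is_bad):
    """Return the earliest snapshot for which is_bad() is True, or None.

    snapshots must be sorted oldest to newest. Assumes that every snapshot
    taken after the first bad one is also bad.
    """
    lo, hi = 0, len(snapshots)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(snapshots[mid]):
            hi = mid        # first bad snapshot is at mid or earlier
        else:
            lo = mid + 1    # first bad snapshot is after mid
    return snapshots[lo] if lo < len(snapshots) else None


# Snapshot dates listed in this comment, oldest first.
SNAPSHOTS = ["20111129", "20111215", "20120508", "20120612"]
```

With four snapshots this needs at most two or three test re-images rather than one per snapshot.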

Assignee: server-ops → arich

Component: Server Operations: DCOps → Server Operations: RelEng

QA Contact: dmoore → arich

Amy Rich [:arr] [:arich]

Assignee

Comment 4


13 years ago

FYI, based on the error, my first guess would be that the issue might lie with the root/administrator/cltbld password change that happened just prior to the last snapshot requested by releng.