Closed Bug 1058559 Opened 10 years ago Closed 10 years ago

(Auto-)login for QA builder does not work on OS X

Categories: Infrastructure & Operations :: RelOps: Puppet (task)
Hardware: All
OS: macOS
Type: task
Priority: Not set
Severity: major
Tracking: Not tracked
Status: RESOLVED FIXED
People: Reporter: whimboo, Assigned: whimboo


There is one already-puppetized OS X box (mm-osx-107.qa.scl3.mozilla.com) on which the builder does not auto-login after a reboot. When you log in via VNC you see the login window. Interestingly, it shows two builder accounts! Not sure what went wrong here. Dustin, do you have any idea?

We cannot move to Puppet as long as this hasn't been solved. Marking the dependency.
Not sure whether anything visible in this log would help us get this fixed:

Aug 26 05:00:09 mm-osx-107 loginwindow[64]: Login Window Application Started
Aug 26 05:00:11 mm-osx-107 XProtectUpdater[49]: Ignoring new signature plist: Not an increase in version
Aug 26 05:00:12 mm-osx-107 com.apple.launchd[1] (com.apple.xprotectupdater[49]): Exited with code: 252
Aug 26 05:00:13 mm-osx-107 rpcsvchost[104]: sandbox_init: com.apple.msrpc.netlogon.sb succeeded
Aug 26 05:00:17 mm-osx-107 WindowServer[100]: kCGErrorFailure: Set a breakpoint @ CGErrorBreakpoint() to catch errors as they are logged.
Aug 26 05:00:27 mm-osx-107 loginwindow[64]: **DMPROXY** Found `/System/Library/CoreServices/DMProxy'.
Aug 26 05:00:27 mm-osx-107 loginwindow[64]: 
	O3REQUIRE FAILED: ret == 0
	   - file: /SourceCache/BezelServices/BezelServices-232.3/Utilities/Utilities.c
	   - line: 91
Aug 26 05:00:27 mm-osx-107 loginwindow[64]: Failed to lookup Bezel UI Server port: Bootstrap Unknown Service.
Aug 26 05:00:27 mm-osx-107 loginwindow[64]: Login Window Started Security Agent
Aug 26 05:00:28 mm-osx-107 com.apple.launchctl.LoginWindow[125]: com.apple.findmymacmessenger: Already loaded
Aug 26 05:00:28 mm-osx-107 com.apple.launchctl.LoginWindow[125]: com.apple.store_helper: Already loaded
Aug 26 05:00:28 mm-osx-107 com.apple.launchctl.LoginWindow[125]: com.apple.storeagent: Already loaded
Aug 26 05:00:28 mm-osx-107 SecurityAgent[130]: Echo enabled
Aug 26 05:00:28 mm-osx-107 SecurityAgent[130]: User info context values set for mozauto
Aug 26 05:00:34 mm-osx-107 UserEventAgent[11]: servermgr_certs[11] -[CertsRequestHandler(HelperAdditions) identitiesFromKeychain:]:	SecItemCopyMatching (err = -25300)
Aug 26 05:00:34: --- last message repeated 1 time ---
Aug 26 05:00:34 mm-osx-107 com.apple.UserEventAgent-System[11]: SecItemCopyMatching: The specified item could not be found in the keychain.
Aug 26 05:00:34: --- last message repeated 1 time ---
Aug 26 05:00:34 mm-osx-107 UserEventAgent[11]: CertsKeychainMonitor: ready to process keychain & timer events
Aug 26 05:00:55 mm-osx-107 com.oracle.java.Helper-Tool[482]: launchctl: Error unloading: com.oracle.java.Java-Updater
Aug 26 05:01:21 mm-osx-107 screenresolution[884]: starting screenresolution argv=/usr/local/bin/screenresolution get 
Aug 26 05:01:21 mm-osx-107 screenresolution[884]: 3891612: (connectAndCheck) Untrusted apps are not allowed to connect to or launch Window Server before login.
Aug 26 05:01:21 mm-osx-107 screenresolution[884]: kCGErrorFailure: Set a breakpoint @ CGErrorBreakpoint() to catch errors as they are logged.
Aug 26 05:01:21 mm-osx-107 screenresolution[884]: Error: failed to get list of active displays
Aug 26 05:01:21 mm-osx-107 screenresolution[886]: starting screenresolution argv=/usr/local/bin/screenresolution set 1024x768x32 
Aug 26 05:01:21 mm-osx-107 screenresolution[886]: 3891612: (connectAndCheck) Untrusted apps are not allowed to connect to or launch Window Server before login.
Aug 26 05:01:21 mm-osx-107 screenresolution[886]: kCGErrorFailure: Set a breakpoint @ CGErrorBreakpoint() to catch errors as they are logged.
Aug 26 05:01:21 mm-osx-107 screenresolution[886]: Error: failed to get list of active displays
Aug 26 05:01:21 mm-osx-107 puppet-agent[112]: (/Stage[main]/Screenresolution/Exec[set-resolution]/returns) 2014-08-26 05:01:21.856 screenresolution[886:707] starting screenresolution argv=/usr/local/bin/screenresolution set 1024x768x32 
Aug 26 05:01:21 mm-osx-107 puppet-agent[112]: (/Stage[main]/Screenresolution/Exec[set-resolution]/returns) 2014-08-26 05:01:21.859 screenresolution[886:707] Error: failed to get list of active displays
Aug 26 05:01:21 mm-osx-107 puppet-agent[112]: /usr/local/bin/screenresolution set 1024x768x32 returned 1 instead of one of [0]
Aug 26 05:01:21 mm-osx-107 puppet-agent[112]: (/Stage[main]/Screenresolution/Exec[set-resolution]/returns) change from notrun to 0 failed: /usr/local/bin/screenresolution set 1024x768x32 returned 1 instead of one of [0]
Aug 26 05:01:26 mm-osx-107 puppet-agent[112]: Finished catalog run in 44.00 seconds
Aug 26 05:03:10 mm-osx-107 screenresolution[1692]: starting screenresolution argv=/usr/local/bin/screenresolution get 
Aug 26 05:03:10 mm-osx-107 screenresolution[1692]: 3891612: (connectAndCheck) Untrusted apps are not allowed to connect to or launch Window Server before login.
Aug 26 05:03:10 mm-osx-107 screenresolution[1692]: kCGErrorFailure: Set a breakpoint @ CGErrorBreakpoint() to catch errors as they are logged.
Aug 26 05:03:10 mm-osx-107 screenresolution[1692]: Error: failed to get list of active displays
Aug 26 05:03:11 mm-osx-107 screenresolution[1694]: starting screenresolution argv=/usr/local/bin/screenresolution set 1024x768x32 
Aug 26 05:03:11 mm-osx-107 screenresolution[1694]: 3891612: (connectAndCheck) Untrusted apps are not allowed to connect to or launch Window Server before login.
Aug 26 05:03:11 mm-osx-107 screenresolution[1694]: kCGErrorFailure: Set a breakpoint @ CGErrorBreakpoint() to catch errors as they are logged.
Aug 26 05:03:11 mm-osx-107 screenresolution[1694]: Error: failed to get list of active displays
Aug 26 05:03:11 mm-osx-107 puppet-agent[946]: (/Stage[main]/Screenresolution/Exec[set-resolution]/returns) 2014-08-26 05:03:11.021 screenresolution[1694:707] starting screenresolution argv=/usr/local/bin/screenresolution set 1024x768x32 
Aug 26 05:03:11 mm-osx-107 puppet-agent[946]: (/Stage[main]/Screenresolution/Exec[set-resolution]/returns) 2014-08-26 05:03:11.024 screenresolution[1694:707] Error: failed to get list of active displays
Aug 26 05:03:11 mm-osx-107 puppet-agent[946]: /usr/local/bin/screenresolution set 1024x768x32 returned 1 instead of one of [0]
Aug 26 05:03:11 mm-osx-107 puppet-agent[946]: (/Stage[main]/Screenresolution/Exec[set-resolution]/returns) change from notrun to 0 failed: /usr/local/bin/screenresolution set 1024x768x32 returned 1 instead of one of [0]
Aug 26 05:03:15 mm-osx-107 puppet-agent[946]: Finished catalog run in 41.03 seconds

Not sure whether the failing screenresolution call is what prevents auto-login, but it is a possible cause.
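To see at a glance which Puppet Exec resources failed in a system log like the one pasted above, a small filter can help. This is only a sketch: the function name `failed_execs` and the regular expression are my own, written against the line format visible in this excerpt, and they may need adjusting for other Puppet versions.

```python
import re

# Matches puppet-agent lines of the form seen in the excerpt:
#   puppet-agent[PID]: /path/to/cmd args returned N instead of one of [0]
# The longer "(/Stage[main]/...) change from notrun to 0 failed: ..." lines
# repeat the same failure and are deliberately not matched.
FAILED_EXEC = re.compile(
    r"puppet-agent\[(?P<pid>\d+)\]: (?P<cmd>/\S.*?) "
    r"returned (?P<code>\d+) instead of one of \[[^\]]*\]"
)


def failed_execs(lines):
    """Return (pid, command, exit_code) for each failed Exec line."""
    results = []
    for line in lines:
        match = FAILED_EXEC.search(line)
        if match:
            results.append(
                (int(match.group("pid")), match.group("cmd"), int(match.group("code")))
            )
    return results


# Two lines copied from the log above.
SAMPLE = [
    "Aug 26 05:01:21 mm-osx-107 puppet-agent[112]: /usr/local/bin/screenresolution "
    "set 1024x768x32 returned 1 instead of one of [0]",
    "Aug 26 05:01:21 mm-osx-107 puppet-agent[112]: "
    "(/Stage[main]/Screenresolution/Exec[set-resolution]/returns) change from notrun "
    "to 0 failed: /usr/local/bin/screenresolution set 1024x768x32 returned 1 "
    "instead of one of [0]",
]

for pid, cmd, code in failed_execs(SAMPLE):
    print(f"puppet-agent[{pid}]: {cmd!r} exited with {code}")
```

In this log the only failing Exec is the screenresolution call, and the surrounding lines say it was refused because nobody was logged in yet.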
Something is broken with the passwords on that machine, and I'm not sure why. I have not changed anything here myself. I will try deleting all accounts and letting puppet re-create everything.
That there are two builder accounts sounds very suspicious -- I have no idea how that could happen!
OK, so there were mozauto and mozauto.bak under /Users. Not sure where this comes from. Maybe it is because we already had a mozauto account before we puppetized this machine? I cannot tell right now.

Anyway, I deleted all the user accounts on this box except for 'puppet'. I don't know what that account is for, and no profile exists for it under /Users. Any idea?

I rebooted the machine and puppet re-created the correct user account. Now only a single Builder account is available. Surprisingly, I'm still not able to log in! Checking 'tail -100 /var/log/secure.log', I see:

Aug 26 05:53:29 mm-osx-107 SecurityAgent[141]: User info context values set for mozauto
Aug 26 05:53:29 mm-osx-107 authorizationhost[148]: in pam_sm_authenticate(): Got user: mozauto
Aug 26 05:53:29 mm-osx-107 authorizationhost[148]: in pam_sm_authenticate(): Got ruser: (null)
Aug 26 05:53:29 mm-osx-107 authorizationhost[148]: in pam_sm_authenticate(): Got service: authorization
Aug 26 05:53:29 mm-osx-107 authorizationhost[148]: in od_principal_for_user(): No authentication authority returned
Aug 26 05:53:29 mm-osx-107 authorizationhost[148]: in od_principal_for_user(): failed: 7
Aug 26 05:53:29 mm-osx-107 authorizationhost[148]: in pam_sm_authenticate(): Failed to determine Kerberos principal name.
Aug 26 05:53:29 mm-osx-107 authorizationhost[148]: in pam_sm_authenticate(): Done cleanup3
Aug 26 05:53:29 mm-osx-107 authorizationhost[148]: in pam_sm_authenticate(): Kerberos 5 refuses you
Aug 26 05:53:29 mm-osx-107 authorizationhost[148]: in pam_sm_authenticate(): pam_sm_authenticate: ntlm
Aug 26 05:53:29 mm-osx-107 authorizationhost[148]: in pam_sm_authenticate(): OpenDirectory - The authtok is incorrect.
Aug 26 05:53:29 mm-osx-107 authorizationhost[148]: Failed to authenticate user <mozauto> (error: 9).
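For longer secure.log files, the authentication trail shown above can be condensed per failed attempt. The following is a rough sketch, assuming the line format from this excerpt; `summarize_auth_failures` and the patterns are hypothetical helpers, not part of any existing tool.

```python
import re
from collections import defaultdict

# Patterns written against the secure.log lines quoted above.
HOST_LINE = re.compile(r"authorizationhost\[(?P<pid>\d+)\]: (?P<msg>.*)$")
FAILED = re.compile(r"Failed to authenticate user <(?P<user>[^>]+)> \(error: (?P<code>\d+)\)")
PAM_STEP = re.compile(r"in \w+\(\): (?P<detail>.*)$")


def summarize_auth_failures(lines):
    """Return one dict per failed attempt, with the PAM messages that preceded it."""
    pending = defaultdict(list)  # pid -> PAM messages seen so far for that process
    failures = []
    for line in lines:
        host = HOST_LINE.search(line)
        if not host:
            continue
        pid, msg = int(host.group("pid")), host.group("msg")
        failed = FAILED.match(msg)
        if failed:
            failures.append({
                "user": failed.group("user"),
                "error": int(failed.group("code")),
                "trail": pending.pop(pid, []),
            })
            continue
        step = PAM_STEP.match(msg)
        if step:
            pending[pid].append(step.group("detail"))
    return failures


# Three lines copied from the log above.
SAMPLE = [
    "Aug 26 05:53:29 mm-osx-107 authorizationhost[148]: in pam_sm_authenticate(): Kerberos 5 refuses you",
    "Aug 26 05:53:29 mm-osx-107 authorizationhost[148]: in pam_sm_authenticate(): OpenDirectory - The authtok is incorrect.",
    "Aug 26 05:53:29 mm-osx-107 authorizationhost[148]: Failed to authenticate user <mozauto> (error: 9).",
]

for failure in summarize_auth_failures(SAMPLE):
    print(failure["user"], failure["error"], failure["trail"][-1])
```

The last entry of the trail is usually the interesting one; here it is the OpenDirectory message saying the supplied password (authtok) is incorrect.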

Dustin, does this have something to do with the LDAP changes you made recently?
When I run 'passwd mozauto' as root and set the correct password, it works and I can log in. So something is going wrong during user creation.
Severity: normal → major
I have to add that the login only works until the next reboot. Then it fails again.
Blocks: 733534
So the problem here was that the values for builder_pw_kcpassword_base64 and builder_pw_saltedsha512 were stored as the cleartext password rather than as the encoded values those keys are named for. Thanks Dustin for the hint! :)
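For reference, here is a minimal sketch of how such encoded values could be produced and sanity-checked. Several things here are assumptions on my side, not taken from the Puppet manifests: that builder_pw_kcpassword_base64 is the base64 of the XOR-obscured /etc/kcpassword content, that builder_pw_saltedsha512 is the hex of a 4-byte salt followed by SHA512(salt + password) as used by OS X 10.7, and the NUL padding to a multiple of 12 bytes (generator scripts differ on padding). All function names are hypothetical.

```python
import base64
import hashlib
import os
import re
from typing import Optional

# Widely documented 11-byte XOR key used for /etc/kcpassword, applied cyclically.
KCPASSWORD_KEY = bytes([0x7D, 0x89, 0x52, 0x23, 0xD2, 0xBC, 0xDD, 0xEA, 0xA3, 0xB9, 0x1F])


def _xor_with_key(data: bytes) -> bytes:
    return bytes(b ^ KCPASSWORD_KEY[i % len(KCPASSWORD_KEY)] for i, b in enumerate(data))


def kcpassword_base64(password: str) -> str:
    """XOR-obscure the password and base64-encode the result."""
    data = password.encode("utf-8")
    # Assumption: pad with NUL bytes to the next multiple of 12 (at least one NUL).
    padded_len = (len(data) // 12 + 1) * 12
    return base64.b64encode(_xor_with_key(data.ljust(padded_len, b"\0"))).decode("ascii")


def kcpassword_decode(encoded: str) -> str:
    """Reverse kcpassword_base64; useful to verify a stored value."""
    plain = _xor_with_key(base64.b64decode(encoded))
    return plain.split(b"\0", 1)[0].decode("utf-8")


def salted_sha512_hex(password: str, salt: Optional[bytes] = None) -> str:
    """Hex of 4-byte salt + SHA512(salt + password): 136 hex characters."""
    salt = os.urandom(4) if salt is None else salt
    digest = hashlib.sha512(salt + password.encode("utf-8")).digest()
    return (salt + digest).hex()


def looks_like_salted_sha512(value: str) -> bool:
    """A cleartext password stored by mistake will almost never pass this check."""
    return re.fullmatch(r"[0-9a-fA-F]{136}", value) is not None


# "example-password" is a placeholder, not a real credential.
print(kcpassword_base64("example-password"))
print(looks_like_salted_sha512(salted_sha512_hex("example-password")))
print(looks_like_salted_sha512("example-password"))
```

A format check like `looks_like_salted_sha512` run against the secrets would have flagged the cleartext value before it reached the machine.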

I'm going to update those passwords, and at the same time I will introduce a new, more secure password for our mozauto user.
Assignee: relops → hskupin
Status: NEW → ASSIGNED
So I have set a new password for the mozauto user and generated the corresponding values for the various keys in hiera's secrets.eyaml file. This is not done for all platforms yet; I will do the rest as we add new platforms.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Summary: Auto-login for QA builder does not work on OS X → (Auto-)login for QA builder does not work on OS X
Whiteboard: [qa-automation-blocked]