Closed Bug 1512121 (t-yosemite-r7-300) Opened 6 years ago Closed 5 years ago

[MDC1] t-yosemite-r7-300 problem tracking

Categories

(Infrastructure & Operations :: RelOps: Hardware, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: relops-bug-generator, Assigned: dividehex)

No description provided.
Depends on: 1512122
This worker has not run any jobs for the past 9 hours, and its last task was resolved as exception. The reboot controller failed, and the SSH connection is not working either.

The last logs from Papertrail:

Dec 04 18:00:00 t-yosemite-r7-300.test.releng.mdc1.mozilla.com sntp: time set +0.009968 s
Dec 04 18:10:00 t-yosemite-r7-300.test.releng.mdc1.mozilla.com /usr/sbin/cron: (root) CMD (. /usr/local/bin/proxy_reset_env.sh && PUPPET_SERVER=releng-puppet2.srv.releng.mdc1.mozilla.com /usr/local/bin/renew_cert.sh > /dev/null 2>&1)
Dec 04 18:16:08 t-yosemite-r7-300.test.releng.mdc1.mozilla.com diagnostics_agent: message repeated 14 times: [ ]
Dec 04 18:24:09 t-yosemite-r7-300.test.releng.mdc1.mozilla.com BezelServices: 250.15[92]: ASSERTION FAILED: dvcAddrRef != ((void *)0) -[DriverServices getDeviceAddress:] line: 2727
Dec 04 18:24:09 t-yosemite-r7-300.test.releng.mdc1.mozilla.com BezelServices: 250.15[92]: ASSERTION FAILED: dvcAddrRef != ((void *)0) -[DriverServices getDeviceAddress:] line: 2727
Dec 04 18:25:00 t-yosemite-r7-300.test.releng.mdc1.mozilla.com /usr/sbin/cron: (root) CMD (. /usr/local/bin/proxy_reset_env.sh && PUPPET_SERVER=releng-puppet2.srv.releng.mdc1.mozilla.com /usr/local/bin/renew_cert.sh > /dev/null 2>&1)
Dec 04 18:40:00 t-yosemite-r7-300.test.releng.mdc1.mozilla.com /usr/sbin/cron: (root) CMD (. /usr/local/bin/proxy_reset_env.sh && PUPPET_SERVER=releng-puppet2.srv.releng.mdc1.mozilla.com /usr/local/bin/renew_cert.sh > /dev/null 2>&1)
Dec 04 18:55:01 t-yosemite-r7-300.test.releng.mdc1.mozilla.com /usr/sbin/cron: (root) CMD (. /usr/local/bin/proxy_reset_env.sh && PUPPET_SERVER=releng-puppet2.srv.releng.mdc1.mozilla.com /usr/local/bin/renew_cert.sh > /dev/null 2>&1)
Dec 04 19:00:00 t-yosemite-r7-300.test.releng.mdc1.mozilla.com /usr/sbin/cron: (root) CMD (/usr/bin/killall ntpd)
Checked this machine again. Van Le brought it back to life 3 days ago.

riman@riman-VM:~$ fping t-yosemite-r7-300.test.releng.mdc1.mozilla.com
t-yosemite-r7-300.test.releng.mdc1.mozilla.com is alive

However, it is still missing from TC, and the SSH connection seems to get stuck after the DUO request.

The Papertrail logs for the last 5 days:

Dec 10 03:35:04 t-yosemite-r7-300 com.apple.xpc.launchd: (com.apple.bsd.dirhelper[6786]): Endpoint has been activated through legacy launch(3) APIs. Please switch to XPC or bootstrap_check_in(): com.apple.bsd.dirhelper
Dec 10 09:04:15 t-yosemite-r7-300 com.apple.xpc.launchd: (com.openssh.sshd.07B42D93-D1FE-455E-9AA1-FF10707250CB): Service instances do not support events yet.
Dec 10 11:24:18 t-yosemite-r7-300 com.apple.xpc.launchd: assertion failed: 14F27: launchd + 163797 [C0446878-E8D0-3461-A226-91FF1C2B2DA6]: 0xe
Dec 10 11:24:18 t-yosemite-r7-300 com.apple.xpc.launchd: assertion failed: 14F27: launchd + 163797 [C0446878-E8D0-3461-A226-91FF1C2B2DA6]: 0xe
Dec 11 02:41:11 t-yosemite-r7-300 com.apple.xpc.launchd: (com.openssh.sshd.A9C42FBC-B048-4E74-A5B6-95A4224C2502): Service instances do not support events yet.
Dec 11 02:41:56 t-yosemite-r7-300 com.apple.xpc.launchd: (com.openssh.sshd.31B570FE-1A1A-474C-8A5A-19CAFC0023EE): Service instances do not support events yet.
Dec 11 03:35:05 t-yosemite-r7-300 com.apple.xpc.launchd: (com.apple.bsd.dirhelper[7628]): Endpoint has been activated through legacy launch(3) APIs. Please switch to XPC or bootstrap_check_in(): com.apple.bsd.dirhelper
Dec 11 11:24:21 t-yosemite-r7-300 com.apple.xpc.launchd: assertion failed: 14F27: launchd + 163797 [C0446878-E8D0-3461-A226-91FF1C2B2DA6]: 0xe
Dec 11 11:24:21 t-yosemite-r7-300 com.apple.xpc.launchd: assertion failed: 14F27: launchd + 163797 [C0446878-E8D0-3461-A226-91FF1C2B2DA6]: 0xe
Dec 12 03:35:05 t-yosemite-r7-300 com.apple.xpc.launchd: (com.apple.bsd.dirhelper[8467]): Endpoint has been activated through legacy launch(3) APIs. Please switch to XPC or bootstrap_check_in(): com.apple.bsd.dirhelper
Dec 12 11:24:25 t-yosemite-r7-300 com.apple.xpc.launchd: assertion failed: 14F27: launchd + 163797 [C0446878-E8D0-3461-A226-91FF1C2B2DA6]: 0xe
Dec 12 11:24:25 t-yosemite-r7-300 com.apple.xpc.launchd: assertion failed: 14F27: launchd + 163797 [C0446878-E8D0-3461-A226-91FF1C2B2DA6]: 0xe
Dec 13 03:35:03 t-yosemite-r7-300 com.apple.xpc.launchd: (com.apple.bsd.dirhelper[9306]): Endpoint has been activated through legacy launch(3) APIs. Please switch to XPC or bootstrap_check_in(): com.apple.bsd.dirhelper
Dec 13 11:24:28 t-yosemite-r7-300 com.apple.xpc.launchd: assertion failed: 14F27: launchd + 163797 [C0446878-E8D0-3461-A226-91FF1C2B2DA6]: 0xe
Dec 13 11:24:28 t-yosemite-r7-300 com.apple.xpc.launchd: assertion failed: 14F27: launchd + 163797 [C0446878-E8D0-3461-A226-91FF1C2B2DA6]: 0xe
Dec 14 03:35:05 t-yosemite-r7-300 com.apple.xpc.launchd: (com.apple.bsd.dirhelper[10145]): Endpoint has been activated through legacy launch(3) APIs. Please switch to XPC or bootstrap_check_in(): com.apple.bsd.dirhelper
Dec 14 11:24:01 t-yosemite-r7-300 com.apple.xpc.launchd: assertion failed: 14F27: launchd + 163797 [C0446878-E8D0-3461-A226-91FF1C2B2DA6]: 0xe
Dec 14 11:24:01 t-yosemite-r7-300 com.apple.xpc.launchd: assertion failed: 14F27: launchd + 163797 [C0446878-E8D0-3461-A226-91FF1C2B2DA6]: 0xe
Dec 14 12:45:18 t-yosemite-r7-300 com.apple.xpc.launchd: (com.openssh.sshd.E402FB55-02F7-4252-944F-10ADC1CAC884): Service instances do not support events yet.
Flags: needinfo?(dhouse)
Depends on: 1516573
Looks good: QTS reimaged the worker (https://bugzilla.mozilla.org/show_bug.cgi?id=1516573#c1). Marking it as fixed.
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(dhouse)
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 1539109

Rebooted via roller, but it didn't help.
Tried to log on to it via SSH but received:

Stdio forwarding request failed: Session open refused by peer
ssh_exchange_identification: Connection closed by remote host

Tried to ping the machine:

Pinging t-yosemite-r7-300.test.releng.mdc1.mozilla.com [10.49.56.241] with 32 bytes of data:
Request timed out.
Request timed out.
Request timed out.
Request timed out.

Ping statistics for 10.49.56.241:
Packets: Sent = 4, Received = 0, Lost = 4 (100% loss),
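
A ping timeout alone does not show whether the SSH port is accepting connections, so a TCP-level probe can be a useful second check. The following is a minimal sketch, not part of the roller or reboot controller tooling: the function name `tcp_port_open` and the 3-second default timeout are assumptions chosen for illustration, and a successful connect only shows that the port is open, not that the SSH handshake or the DUO prompt will complete.

```python
import socket


def tcp_port_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port can be established.

    Hypothetical helper for illustration. Any failure (DNS error,
    refused connection, timeout) is reported as False.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For this worker the call would be `tcp_port_open("t-yosemite-r7-300.test.releng.mdc1.mozilla.com")`. A `False` result together with the ping loss above would point at the host or network rather than at sshd.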

The last entries in Papertrail:

Mar 25 16:09:11 t-yosemite-r7-300.test.releng.mdc1.mozilla.com taskgated: binary have embedded signature that validated /Users/cltbld/tasks/task_1553555232/build/venv/bin/python[753]
Mar 25 16:09:12 t-yosemite-r7-300.test.releng.mdc1.mozilla.com taskgated: binary have embedded signature that validated /Users/cltbld/tasks/task_1553555232/build/venv/bin/python[757]
Mar 25 16:09:24 t-yosemite-r7-300.test.releng.mdc1.mozilla.com taskgated: binary have embedded signature that validated /System/Library/CoreServices/backupd.bundle/Contents/Resources/TMHelperAgent.app[776]
Mar 25 16:09:24 t-yosemite-r7-300.test.releng.mdc1.mozilla.com TMHelperAgent: LSExceptions [0x7fe260f15620] loaded
Mar 25 16:09:34 t-yosemite-r7-300.test.releng.mdc1.mozilla.com TMHelperAgent: LSExceptions [0x7fe260f15620] unloaded
Mar 25 16:09:38 t-yosemite-r7-300.test.releng.mdc1.mozilla.com taskgated: binary have embedded signature that validated /Users/cltbld/tasks/task_1553555232/build/venv/bin/python[783]
Mar 25 16:09:38 t-yosemite-r7-300.test.releng.mdc1.mozilla.com Dock: LSExceptions [0x6000002b4ac0] loaded
Mar 25 16:09:39 t-yosemite-r7-300.test.releng.mdc1.mozilla.com firefox: LSExceptions [0x1098809a0] loaded
Mar 25 16:09:43 t-yosemite-r7-300.test.releng.mdc1.mozilla.com sandboxd: ([791]): plugin-container(791) deny forbidden-sandbox-reinit
Mar 25 16:09:44 t-yosemite-r7-300.test.releng.mdc1.mozilla.com sandboxd: ([792]): plugin-container(792) deny forbidden-sandbox-reinit
Mar 25 16:09:46 t-yosemite-r7-300.test.releng.mdc1.mozilla.com sandboxd: ([793]): plugin-container(793) deny forbidden-sandbox-reinit
Mar 25 16:09:49 t-yosemite-r7-300.test.releng.mdc1.mozilla.com Dock: LSExceptions [0x6000002b4ac0] unloaded
Mar 25 16:09:49 t-yosemite-r7-300.test.releng.mdc1.mozilla.com firefox: LSExceptions [0x1098809a0] unloaded
Mar 25 16:09:50 t-yosemite-r7-300.test.releng.mdc1.mozilla.com Install: in Progress[795]: LSExceptions [0x7f8158e014e0] loaded
Mar 25 16:09:50 t-yosemite-r7-300.test.releng.mdc1.mozilla.com diagnostics_agent: message repeated 13 times: [ ]
Mar 25 16:09:50 t-yosemite-r7-300.test.releng.mdc1.mozilla.com diagnostics_agent: message repeated 3 times: [ ]
Mar 25 16:09:50 t-yosemite-r7-300.test.releng.mdc1.mozilla.com spindump: [793] Not monitoring spin due to thermal pressure
Mar 25 16:09:50 t-yosemite-r7-300.test.releng.mdc1.mozilla.com taskgated: binary have embedded signature that validated /System/Library/CoreServices/SubmitDiagInfo[800]
Mar 25 16:09:50 t-yosemite-r7-300.test.releng.mdc1.mozilla.com SubmitDiagInfo: Couldn't load config file from on-disk location. Falling back to default location. Reason: Won't serialize in _readDictionaryFromJSONData due to nil object
Mar 25 16:09:51 t-yosemite-r7-300.test.releng.mdc1.mozilla.com spindump: [791] Not monitoring spin due to thermal pressure
Mar 25 16:09:51 t-yosemite-r7-300.test.releng.mdc1.mozilla.com spindump: [787] Not monitoring spin due to thermal pressure
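
An excerpt like the one above is easier to scan when grouped by process. The following is a minimal sketch, assuming each line follows the `Mon DD HH:MM:SS host process: message` layout seen here. The names `summarize_by_process` and `lines_containing` are made up for illustration and are not part of any Papertrail tooling.

```python
import re
from collections import Counter

# Timestamp, host, process name (up to the first colon), then the message.
LINE_RE = re.compile(r"^(\w{3} \d{2} \d{2}:\d{2}:\d{2}) (\S+) ([^:\s]+): (.*)$")

HOST = "t-yosemite-r7-300.test.releng.mdc1.mozilla.com"

# Three lines copied from the excerpt above.
SAMPLE = [
    f"Mar 25 16:09:49 {HOST} firefox: LSExceptions [0x1098809a0] unloaded",
    f"Mar 25 16:09:50 {HOST} spindump: [793] Not monitoring spin due to thermal pressure",
    f"Mar 25 16:09:51 {HOST} spindump: [791] Not monitoring spin due to thermal pressure",
]


def summarize_by_process(lines):
    """Count log lines per process name; lines that do not parse are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group(3)] += 1
    return counts


def lines_containing(lines, needle):
    """Return the lines whose text contains the given substring."""
    return [line for line in lines if needle in line]
```

Calling `lines_containing(SAMPLE, "thermal pressure")` picks out the spindump messages, which are the last thing the worker logged before it stopped responding.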

Please power cycle this so we can reimage it.

Flags: needinfo?(dhouse)

(In reply to Attila Craciun [:arny] from comment #5)

Please power cycle this so we can reimage it.

I'm going to power it off for a few days to see if it needs a rest. I'll mark it as ignore in the spreadsheet.

Depends on: 1542116
Type: task → defect
Component: CIDuty → RelOps: Hardware
QA Contact: dlabici
Assignee: nobody → jwatkins

No problems with this machine on Mojave.

Status: REOPENED → RESOLVED
Closed: 6 years ago → 5 years ago
Flags: needinfo?(dhouse)
Resolution: --- → WORKSFORME