Closed Bug 1481794 (T-W1064-MS-121) Opened 6 years ago Closed 6 years ago

[MDC1] T-W1064-MS-121 problem tracking

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: arny, Unassigned)

The worker ran no jobs for one day. It was rebooted and still ran no jobs for hours. After a reimage it is running jobs again. Logs from before the reimage:

Aug 07 00:17:59 T-W1064-MS-121.mdc1.mozilla.com generic-worker: 2018/08/07 07:17:59 Resolving task...
Aug 07 00:17:59 T-W1064-MS-121.mdc1.mozilla.com generic-worker: 2018/08/07 07:17:59 Command finished successfully!
Aug 07 00:18:00 T-W1064-MS-121.mdc1.mozilla.com generic-worker: 2018/08/07 07:17:59 No previous task user desktop, so no need to close any open desktops
Aug 07 00:18:00 T-W1064-MS-121.mdc1.mozilla.com generic-worker: 2018/08/07 07:17:59 Trying to remove directory 'C:\Users\task_1533618450' via os.RemoveAll(path) call as GenericWorker user...
Aug 07 00:18:03 T-W1064-MS-121.mdc1.mozilla.com generic-worker: Graphic Card being used "Intel(R) Iris(R) Pro Graphics P580 "
Aug 07 00:18:03 T-W1064-MS-121.mdc1.mozilla.com generic-worker: Removing temp dir contents
Aug 07 00:18:03 T-W1064-MS-121.mdc1.mozilla.com generic-worker: C:\Users\GenericWorker\AppData\Local\Temp\aria-debug-2352.log
Aug 07 00:18:03 T-W1064-MS-121.mdc1.mozilla.com generic-worker: Deleted file - C:\Users\GenericWorker\AppData\Local\Temp\livelog761575723\stream
Aug 07 00:18:04 T-W1064-MS-121.mdc1.mozilla.com generic-worker: Deleted file - C:\ProgramData\Package Cache\{B74E65FD-CC47-41C5-4B89-791A3F61942D}v8.100.25984\Installers\Kits Configuration Installer-x86_en-us.msi
Aug 07 00:18:04 T-W1064-MS-121.mdc1.mozilla.com generic-worker: Removing log files older than 1 day
Aug 07 00:18:04 T-W1064-MS-121.mdc1.mozilla.com generic-worker: Removing Windows log files older than 7 days
Aug 07 00:18:04 T-W1064-MS-121.mdc1.mozilla.com generic-worker: Removing Recycle.bin contents
Aug 07 00:18:05 T-W1064-MS-121.mdc1.mozilla.com User32: The process C:\windows\system32\shutdown.exe (T-W1064-MS-121) has initiated the restart of computer T-W1064-MS-121 on behalf of user T-W1064-MS-121\GenericWorker for the following reason: No title for this reason could be found Reason Code: 0x800000ff Shutdown Type: restart Comment: Rebooting as generic worker ran successfully
Aug 07 00:18:08 T-W1064-MS-121.mdc1.mozilla.com Service_Control_Manager: The sshd service terminated unexpectedly. It has done this 1 time(s).

Logs after the reboot and before the reimage:

Aug 08 04:15:56 T-W1064-MS-121.mdc1.mozilla.com Microsoft-Windows-DSC: Job {55D603E6-9AFC-11E8-883D-F40343DF3609} : This event indicates that failure happens when LCM is processing the configuration. Error Id is 0x1. Error Detail is The SendConfigurationApply function did not succeed.. Resource Id is [Script]FirewallRule_ICMPv6In and Source Info is C:\windows\TEMP\xDynamicConfig.ps1::584::9::Script. Error Message is PowerShell DSC resource MSFT_ScriptResource failed to execute Set-TargetResource functionality with error message: Error formatting a string: Index (zero based) must be greater than or equal to zero and less than the size of the argument list.. .
Aug 08 04:15:56 T-W1064-MS-121.mdc1.mozilla.com Microsoft-Windows-DSC: Job {55D603E6-9AFC-11E8-883D-F40343DF3609} : MIResult: 1 Error Message: PowerShell DSC resource MSFT_ScriptResource failed to execute Set-TargetResource functionality with error message: Error formatting a string: Index (zero based) must be greater than or equal to zero and less than the size of the argument list.. Message ID: ProviderOperationExecutionFailure Error Category: 7 Error Code: 1 Error Type: MI
Aug 08 04:15:58 T-W1064-MS-121.mdc1.mozilla.com Microsoft-Windows-DSC: Job {55D603E6-9AFC-11E8-883D-F40343DF3609} : Job runs under the following LCM setting. ConfigurationMode: ApplyAndMonitor ConfigurationModeFrequencyMins: 15 RefreshMode: PUSH RefreshFrequencyMins: 30 RebootNodeIfNeeded: NONE DebugMode: False
Aug 08 04:15:58 T-W1064-MS-121.mdc1.mozilla.com Microsoft-Windows-DSC: RESULT 1 [NXLOG@14506 Keywords="4611686018427387904" EventType="ERROR" EventID="4252" ProviderGuid="{50DF9E12-A8C4-4939-B281-47E1325BA63E}" Version="0" Task="0" OpcodeValue="0" RecordNumber="68329" ActivityID="{F03E5353-2EF5-0000-21AA-3EF0F52ED401}" ThreadID="4280" Channel="Microsoft-Windows-DSC/Operational" Domain="NT AUTHORITY" AccountName="SYSTEM" UserID="S-1-5-18" AccountType="User" Opcode="Info" JobId="{55D603E6-9AFC-11E8-883D-F40343DF3609}" MIResult="1" ErrorMessage="The SendConfigurationApply function did not succeed." ErrorCategory="0" ErrorCode="1" ErrorType="MI" EventReceivedTime="2018-08-08 11:15:58" SourceModuleName="eventlog" SourceModuleType="im_msvistalog"] Job {55D603E6-9AFC-11E8-883D-F40343DF3609} : MIResult: 1 Error Message: The SendConfigurationApply function did not succeed. Message ID: MI RESULT 1 Error Category: 0 Error Code: 1 Error Type: MI
Aug 08 04:15:58 T-W1064-MS-121.mdc1.mozilla.com Microsoft-Windows-DSC: Job {55D603E6-9AFC-11E8-883D-F40343DF3609} : Details logging completed for C:\windows\System32\Configuration\ConfigurationStatus\{55D603E6-9AFC-11E8-883D-F40343DF3609}-0.details.json.
Aug 08 04:15:58 T-W1064-MS-121.mdc1.mozilla.com Microsoft-Windows-DSC: Job DscTimerConsistencyOperationResult : DSC Engine Error : Error Message: NULL Error Code : 1
Aug 08 04:15:58 T-W1064-MS-121.mdc1.mozilla.com generic-worker: Checking for C:\dsc\task-claim-state.valid file...
Aug 08 04:16:03 T-W1064-MS-121.mdc1.mozilla.com generic-worker: Checking for C:\dsc\task-claim-state.valid file...
Aug 08 04:16:08 T-W1064-MS-121.mdc1.mozilla.com generic-worker: Checking for C:\dsc\task-claim-state.valid file...
Aug 08 04:16:13 T-W1064-MS-121.mdc1.mozilla.com generic-worker: Checking for C:\dsc\task-claim-state.valid file...
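
For triage, log entries like the ones above can be summarized by source to make the pattern easier to spot: a failed DSC run followed by generic-worker repeatedly polling for C:\dsc\task-claim-state.valid. The following is only a rough sketch, not tooling used on this bug. The names `parse_line` and `summarize` are hypothetical, and it assumes one entry per line in the "month day time host source: message" layout, with any "#015" (escaped carriage return) markers stripped.

```python
import re
from collections import Counter

# Assumed layout: "Aug 08 04:15:58 <host> <source>: <message>"
LINE_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2} \d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) (?P<source>[^:\s]+): (?P<message>.*)$"
)


def parse_line(line):
    """Split one log entry into timestamp, host, source and message."""
    cleaned = line.replace("#015", "").strip()  # drop escaped carriage returns
    match = LINE_RE.match(cleaned)
    return match.groupdict() if match else None


def summarize(lines):
    """Count entries per source and the polls for the claim-state file."""
    sources = Counter()
    polls = 0
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # skip anything that is not a log entry
        sources[entry["source"]] += 1
        if "task-claim-state.valid" in entry["message"]:
            polls += 1
    return sources, polls


# Three entries copied from the logs in this comment.
sample = [
    r"Aug 08 04:15:58 T-W1064-MS-121.mdc1.mozilla.com Microsoft-Windows-DSC: Job DscTimerConsistencyOperationResult : DSC Engine Error : Error Message: NULL Error Code : 1",
    r"Aug 08 04:15:58 T-W1064-MS-121.mdc1.mozilla.com generic-worker: Checking for C:\dsc\task-claim-state.valid file... #015",
    r"Aug 08 04:16:03 T-W1064-MS-121.mdc1.mozilla.com generic-worker: Checking for C:\dsc\task-claim-state.valid file... #015",
]

sources, polls = summarize(sample)
print(dict(sources), polls)
```

A high poll count with no "Resolving task..." entries in between would match the symptom reported here: the worker is up but never claims work.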
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 1490313
The worker appeared to be shut down. I powered it up and waited for it to show up in Taskcluster; it did not, so I started the re-image process on it.
Status: REOPENED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Depends on: 1495253
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Depends on: 1531048

Re-opening the bug as the machine isn't available on Taskcluster:

https://tools.taskcluster.net/provisioners/releng-hardware/worker-types/gecko-t-win10-64-hw/workers/mdc1/T-W1064-MS-121

The last Papertrail entries:

Mar 16 17:02:30 T-W1064-MS-121.mdc1.mozilla.com mlx4eth63: Mellanox ConnectX-3 Pro Ethernet Adapter device detected that the link connected to port 2 is down. This can occur if the physical link is disconnected or damaged, or if the other end-port is down.
Mar 16 17:02:30 T-W1064-MS-121.mdc1.mozilla.com Microsoft-Windows-DNS-Client: The system failed to register host (A or AAAA) resource records (RRs) for network adapter with settings: Adapter Name : {7086DA67-9AE0-4135-B2B0-1A847D8914A8} Host Name : T-W1064-MS-121 Primary Domain Suffix : mdc1.mozilla.com DNS server list : 10.48.75.120, 10.50.75.120 Sent update to server : <?> IP Address(es) : 10.49.40.77 The reason the system could not register these RRs during the update request was because of a system problem. You can manually retry DNS registration of the network adapter and its settings by typing 'ipconfig /registerdns' at the command prompt. If problems still persist, contact your DNS server or network systems administrator. See event details for specific error code information.
Mar 16 17:02:31 T-W1064-MS-121.mdc1.mozilla.com generic-worker-wrapper: Checking for manifest completion
Mar 16 17:02:33 T-W1064-MS-121.mdc1.mozilla.com Microsoft-Windows-DNS-Client: The system failed to register host (A or AAAA) resource records (RRs) for network adapter with settings: Adapter Name : {7086DA67-9AE0-4135-B2B0-1A847D8914A8} Host Name : T-W1064-MS-121 Primary Domain Suffix : mdc1.mozilla.com DNS server list : 10.48.75.120, 10.50.75.120 Sent update to server : <?> IP Address(es) : 10.49.40.77 The reason the system could not register these RRs during the update request was because of a system problem. You can manually retry DNS registration of the network adapter and its settings by typing 'ipconfig /registerdns' at the command prompt. If problems still persist, contact your DNS server or network systems administrator. See event details for specific error code information.

I tried to ping the machine and received replies:

ping t-w1064-ms-121.wintest.releng.mdc1.mozilla.com

Pinging t-w1064-ms-121.wintest.releng.mdc1.mozilla.com [10.49.40.77] with 32 bytes of data:
Reply from 10.49.40.77: bytes=32 time=202ms TTL=126
Reply from 10.49.40.77: bytes=32 time=202ms TTL=126
Reply from 10.49.40.77: bytes=32 time=202ms TTL=126
Reply from 10.49.40.77: bytes=32 time=202ms TTL=126

Ping statistics for 10.49.40.77:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 202ms, Maximum = 202ms, Average = 202ms
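
The replies above show the host is reachable on the network even though the worker is missing from Taskcluster, so a ping alone does not tell us the worker is healthy. As a rough sketch only, the summary can be extracted from this output for comparison against the worker's state. The function name `parse_ping_summary` is hypothetical, and the patterns assume the English-locale Windows ping format shown here.

```python
import re


def parse_ping_summary(output):
    """Extract packet counts and average round-trip time from ping output."""
    packets = re.search(r"Sent = (\d+), Received = (\d+), Lost = (\d+)", output)
    if packets is None:
        return None  # no statistics block found
    average = re.search(r"Average = (\d+)ms", output)
    sent, received, lost = (int(n) for n in packets.groups())
    return {
        "sent": sent,
        "received": received,
        "lost": lost,
        "average_ms": int(average.group(1)) if average else None,
    }


# Statistics block copied from the ping output in this comment.
ping_output = """\
Ping statistics for 10.49.40.77:
Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
Minimum = 202ms, Maximum = 202ms, Average = 202ms
"""

result = parse_ping_summary(ping_output)
print(result)
```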

Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Ran the re-image process. The worker can now be found on Taskcluster, but it hasn't run any jobs yet:

https://tools.taskcluster.net/provisioners/releng-hardware/worker-types/gecko-t-win10-64-hw/workers/mdc1/T-W1064-MS-121

We will keep on monitoring it.

The machine seems to be up and running and taking jobs.
https://tools.taskcluster.net/provisioners/releng-hardware/worker-types/gecko-t-win10-64-hw/workers/mdc1/T-W1064-MS-121
We will close the bug for now. If the problem persists, we will re-open this bug.

Status: REOPENED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard