Closed Bug 1627889 Opened 6 years ago Closed 5 years ago

Intermittent Raptor, Talos [taskcluster:error] Aborting task... [hang while fetching artifacts]

Categories

(Testing :: Talos, defect, P3)

Version 3

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: malexandru, Unassigned)

Details

(Keywords: intermittent-failure)

This happens for various artifacts like minidump_stackwalk.tar.xz and others. I wonder if it is still somewhat related to bug 1616556, but also for Windows now. Mike, any idea?

Flags: needinfo?(mh+mozilla)
Summary: Talos and Raptor tests timing out while fetching artifacts → Intermittent Raptor, Talos [taskcluster:error] Aborting task... [hang while fetching artifacts]

It doesn't seem related, and it's not only happening when downloading artifacts. Seems like problems with Windows workers?

Flags: needinfo?(mh+mozilla)

:markco, was there reimaging of the workers recently that could affect this? I assume this is datacenter and not bitbar.

Flags: needinfo?(mcornmesser)
Priority: -- → P3

(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #3)

:markco, was there reimaging of the workers recently that could affect this? I assume this is datacenter and not bitbar.

jmaher: There has been spot reimaging of nodes that have not been taking tasks, but no mass reimaging. It looks like these are failing on missing files under c:\users\task_* directories. Those directories do not persist between reboots, since a new task user is created for each task.

pmoore: Any ideas?

Flags: needinfo?(mcornmesser) → needinfo?(pmoore)

The last failure of this type was seen on April 7th, or maybe more recent occurrences have been classified against another bug.

The logs in comment 0 contain "[taskcluster:error] Task aborted - max run time exceeded" so it looks like the task maxRunTime is less than the amount of time the tasks require. This could be due to the task hanging, or taking longer than it previously did, or the maxRunTime being less than it previously was.

Flags: needinfo?(pmoore)

No, here is one excerpt from the logs:

[fetches 2020-04-07T07:11:11.575Z] Removing C:\Users\task_1586238152\fetches\minidump_stackwalk.tar.xz
[taskcluster:error] Aborting task...
[taskcluster 2020-04-07T07:36:05.239Z] SUCCESS: The process with PID 356 (child process of PID 7416) has been terminated.

There is no timestamp for "Aborting task...", which would be great to have btw, but as the excerpt shows, the task hangs while removing minidump_stackwalk.tar.xz.
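Since the "Aborting task..." line carries no timestamp, one way to locate the hang is to measure the silence between the timestamped lines around it. The following is only a rough sketch in Python, not part of any Taskcluster tooling: the `largest_gap` helper and the `STAMP` pattern are hypothetical names, and the pattern assumes every timestamped line uses the `[source YYYY-MM-DDTHH:MM:SS.mmmZ]` prefix seen in the excerpt. Run on the three lines above, it reports a gap of roughly 25 minutes after the "Removing" line.

```python
import re
from datetime import datetime

# Assumed prefix format, e.g. "[fetches 2020-04-07T07:11:11.575Z]".
STAMP = re.compile(r"^\[\S+ (\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3})Z\]")


def largest_gap(lines):
    """Return (seconds, line_before, line_after) for the largest gap
    between consecutive timestamped lines, or None if fewer than two
    lines carry a timestamp."""
    best = None
    prev_time = prev_line = None
    for line in lines:
        match = STAMP.match(line)
        if not match:
            # Lines such as "[taskcluster:error] Aborting task..." have
            # no timestamp and are skipped.
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S.%f")
        if prev_time is not None:
            gap = (stamp - prev_time).total_seconds()
            if best is None or gap > best[0]:
                best = (gap, prev_line, line)
        prev_time, prev_line = stamp, line
    return best


# The excerpt quoted in this comment, copied verbatim.
EXCERPT = [
    r"[fetches 2020-04-07T07:11:11.575Z] Removing C:\Users\task_1586238152\fetches\minidump_stackwalk.tar.xz",
    r"[taskcluster:error] Aborting task...",
    r"[taskcluster 2020-04-07T07:36:05.239Z] SUCCESS: The process with PID 356 (child process of PID 7416) has been terminated.",
]

if __name__ == "__main__":
    result = largest_gap(EXCERPT)
    if result is not None:
        seconds, before, _after = result
        print(f"{seconds:.3f}s of silence after: {before}")
```

A gap like this only shows where the log went quiet. It does not by itself tell whether the removal was hanging or the abort was triggered by something else during that window.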

Severity: normal → S3
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → INCOMPLETE