Closed Bug 1516575 Opened 5 years ago Closed 4 years ago

Protect workers against the OOM killer

Categories

(Taskcluster :: Workers, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: glandium, Assigned: wcosta)

Details

As seen with the landing of bug 1516374 yesterday, workers can be killed under OOM conditions, which makes jobs fail as "claim-expired" without any logs.

It would be better if the worker weren't killed, so that it has a chance to report an actual failure with OOM messages.

It is possible to protect processes against the OOM killer with:

echo -17 > /proc/<pid>/oom_adj

(per https://linux-mm.org/OOM_Killer)
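
For illustration, the command above could be wrapped roughly as follows. This is only a sketch, not what the worker actually does. It assumes a kernel that may expose the newer /proc/<pid>/oom_score_adj file (where -1000 disables OOM killing for the process) and falls back to the legacy oom_adj file otherwise. The function name protect_from_oom and the PROC_ROOT override are hypothetical, introduced here only so the logic can be exercised without touching the real /proc.

```shell
#!/bin/sh
# Sketch: exempt one process from the OOM killer.
# protect_from_oom and PROC_ROOT are hypothetical names used for illustration.

protect_from_oom() {
    pid="$1"
    # PROC_ROOT lets the function be pointed at a fake tree; defaults to /proc.
    proc_root="${PROC_ROOT:-/proc}"

    if [ -e "$proc_root/$pid/oom_score_adj" ]; then
        # Newer interface: -1000 disables OOM killing for this process.
        echo -1000 > "$proc_root/$pid/oom_score_adj"
    elif [ -e "$proc_root/$pid/oom_adj" ]; then
        # Legacy interface from the comment above: -17 disables OOM killing.
        echo -17 > "$proc_root/$pid/oom_adj"
    else
        echo "no OOM adjustment file found for pid $pid" >&2
        return 1
    fi
}
```

Calling `protect_from_oom <pid>` against the real /proc normally requires root (or CAP_SYS_RESOURCE), since it lowers the adjustment value. Note also that the setting is inherited by child processes, so a worker protected this way would presumably need to reset the value for the task processes it spawns. Otherwise the tasks themselves would be exempt too.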

It seems we should protect the worker processes this way.
Component: General → Worker
QA Contact: pmoore
Component: Worker → Workers
Assignee: nobody → wcosta

Wander: what's the status here?

Flags: needinfo?(wcosta)
Status: NEW → ASSIGNED
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED