Closed Bug 1616282 Opened 6 years ago Closed 6 years ago

taskclusteretl - pods were being evicted due to low ephemeral-storage

Categories

(Data Platform and Tools :: General, defect, P1)

Points:
3

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: trink, Unassigned)

Details

Some taskcluster workers started outputting logs in the 10-40 GB range, blowing out the ephemeral storage and causing the ETL pods to be continually evicted and recreated. I believe this started at the end of last week, Feb 13/14 (I was on PTO and traveling, so the initial events have rolled off). Throughput was throttled on Feb 15 to prevent the evictions, but I need to follow up on the root cause.
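For illustration only, one way to keep a single oversized log from filling ephemeral storage is to cap the number of bytes written per log. This is a minimal sketch, not the taskclusteretl code: `copy_with_limit`, `LogTooLargeError`, and the byte budget are hypothetical names and values chosen for the example.

```python
import io


class LogTooLargeError(Exception):
    """Raised when a log stream exceeds the configured byte budget (hypothetical)."""


def copy_with_limit(src, dst, max_bytes, chunk_size=64 * 1024):
    """Copy src to dst in chunks and abort once the total would exceed max_bytes.

    src and dst are binary file-like objects. Returns the number of bytes written.
    """
    written = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return written
        # Check before writing so dst never holds more than max_bytes.
        if written + len(chunk) > max_bytes:
            raise LogTooLargeError(f"log exceeds {max_bytes} bytes")
        dst.write(chunk)
        written += len(chunk)


if __name__ == "__main__":
    # Assumed budget of 1 KiB for the demo; a real limit would be far larger.
    source = io.BytesIO(b"x" * 4096)
    try:
        copy_with_limit(source, io.BytesIO(), max_bytes=1024)
    except LogTooLargeError as err:
        print("skipped:", err)
```

A guard like this trades completeness for stability: an oversized log is skipped or truncated rather than taking the pod down with it.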

Points: --- → 3
Priority: -- → P1

Based on the cost and number of tasks, I am betting it started on Feb 12, which corresponds with: https://bugzilla.mozilla.org/show_bug.cgi?id=1547111

The number of log processors was decreased to reduce the I/O on the drive. They will be returned to their normal level today when the latest updates are pushed.
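As a rough sketch of what a tunable processor count can look like, the snippet below bounds how many logs are handled at once with a fixed-size thread pool. It assumes a thread-based design, which the bug does not confirm; `process_logs`, `handler`, and `max_processors` are hypothetical names for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def process_logs(log_ids, handler, max_processors):
    """Run handler(log_id) for each id using at most max_processors threads.

    Results are returned in the same order as log_ids.
    """
    if max_processors < 1:
        raise ValueError("max_processors must be at least 1")
    # The pool size is the upper bound on concurrent disk readers and writers.
    with ThreadPoolExecutor(max_workers=max_processors) as pool:
        return list(pool.map(handler, log_ids))


if __name__ == "__main__":
    # Assumed values: three fake log ids and a reduced pool of 2 processors.
    print(process_logs(["log-a", "log-b", "log-c"], str.upper, max_processors=2))
```

Lowering `max_processors` reduces concurrent drive I/O at the cost of throughput, and raising it again restores the normal level.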

Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Component: Pipeline Ingestion → General