Closed
Bug 1269725
Opened 8 years ago
Closed 8 years ago
The CrashAggregateView watchdog should check that a _SUCCESS file exists
Categories
(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: rvitillo, Assigned: mdoglio)
On Friday we transferred ownership of the moz-crash-rate-aggregates job from azhang to mdoglio.
We noticed that the moz-crash-rate-aggregates job has never run to completion: the scheduled EMR clusters terminate with the error "Shut down as step failed", and no logs are available.
It looks, though, as if the data on S3 is complete: each partition has a _SUCCESS file, which Spark should write only after all the files belonging to that partition have been written. The job therefore seems to be failing at the very end, for some other reason.
Furthermore, the watchdog job, which should send an alert when the job fails, seems to check only that the partition exists on S3, not that the partition contains a _SUCCESS file. We should fix that.
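To illustrate the difference between the two checks, here is a minimal sketch. It is not the actual watchdog script: the function names, the `list_keys` callable, and the partition path are all hypothetical, and the S3 listing is replaced by an in-memory stand-in so the example runs without network access. A real watchdog would presumably pass in a function that wraps an S3 listing call instead.

```python
from typing import Callable, Iterable, List

# Marker file name from the description above: it is expected to appear
# only after all files of a partition have been written.
SUCCESS_MARKER = "_SUCCESS"

# Hypothetical type: a function that returns every object key under a prefix.
KeyLister = Callable[[str], Iterable[str]]


def _normalize(partition_prefix: str) -> str:
    """Ensure the prefix ends with exactly one trailing slash."""
    return partition_prefix.rstrip("/") + "/"


def partition_exists(list_keys: KeyLister, partition_prefix: str) -> bool:
    """Weak check: true as soon as any object lives under the prefix."""
    prefix = _normalize(partition_prefix)
    return any(True for _ in list_keys(prefix))


def partition_is_complete(list_keys: KeyLister, partition_prefix: str) -> bool:
    """Stricter check: true only if the _SUCCESS marker is present."""
    prefix = _normalize(partition_prefix)
    return (prefix + SUCCESS_MARKER) in set(list_keys(prefix))


def make_fake_lister(keys: List[str]) -> KeyLister:
    """In-memory stand-in for an S3 listing call (illustration only)."""
    def list_keys(prefix: str) -> List[str]:
        return [key for key in keys if key.startswith(prefix)]
    return list_keys


if __name__ == "__main__":
    # Hypothetical layout: one partition finished, one interrupted mid-write.
    lister = make_fake_lister([
        "crash_aggregates/date=2016-05-01/part-00000.parquet",
        "crash_aggregates/date=2016-05-01/_SUCCESS",
        "crash_aggregates/date=2016-05-02/part-00000.parquet",
    ])
    for day in ("2016-05-01", "2016-05-02", "2016-05-03"):
        prefix = "crash_aggregates/date=" + day
        print(day, partition_exists(lister, prefix),
              partition_is_complete(lister, prefix))
```

In this sketch the interrupted partition passes the existence check but fails the marker check, which is the case the watchdog should alert on.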
Updated•8 years ago
Assignee: nobody → mdoglio
Updated•8 years ago
Points: --- → 3
Priority: -- → P1
Comment 1•8 years ago
This has been fixed in https://github.com/mozilla/telemetry-batch-view/pull/71 (under review).
Depends on: 1275346
Updated•8 years ago
Status: NEW → ASSIGNED
Comment 2•8 years ago
The first part of this bug is a WONTFIX. We replaced the old Python job with a new version written in Scala, which is not affected by the bug described.
The second part is still valid, as we are using the same watchdog script as before.
Summary: moz-crash-rate-aggregates job is failing → The CrashAggregateView watchdog should check that a _SUCCESS file exists
Comment 3•8 years ago
Deployed to production.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Updated•6 years ago
Product: Cloud Services → Cloud Services Graveyard