Closed
Bug 1309876
Opened 8 years ago
Closed 8 years ago
Mobile jobs are failing
Categories
(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: rvitillo, Assigned: mdoglio)
User Story
The following atmo v1 jobs have been failing for a while:
- mobile-android-addons-v1
- mobile-android-events-v1
- mobile-android-clients-v1

Mauro, are those jobs still being used?
No description provided.
Updated • 8 years ago (Reporter)
Summary: Mobl → Mobile jobs are failing
Updated • 8 years ago (Reporter)
Flags: needinfo?(mdoglio)
Comment 1 • 8 years ago (Assignee)
I suspect nobody is using them, but I don't know tbh. I meant to add them to Airflow in bug 1305423 but I haven't done it yet. Let's ask bbermes whether those datasets are still needed.
Flags: needinfo?(mdoglio) → needinfo?(bbermes)
Updated • 8 years ago
Points: --- → 2
Priority: -- → P2
Comment 2 • 8 years ago
When did they stop working? Most of my presto queries query android_events_v1, mobile_events_v1, android_clients_v1, android_addons_v1, and mobile_clients_v1, so I think we should try to figure out what the issue is. Thanks to you both for following up.
Flags: needinfo?(rvitillo)
Flags: needinfo?(mdoglio)
Flags: needinfo?(bbermes)
Comment 3 • 8 years ago (Assignee)
They seem to have failed on random days since Sept 29th. I'll move them to Airflow (in bug 1305423) as soon as possible; atmo v1 doesn't help me monitor these jobs very much. I'll take care of the eventual backfill as well.
Flags: needinfo?(mdoglio)
Updated • 8 years ago (Reporter)
Flags: needinfo?(rvitillo)
Comment 4 • 8 years ago
Thanks Mauro. Please let us know when we can expect this to be fixed. We are currently waiting for some Activity Stream data to come into re:dash for Android...
Updated • 8 years ago (Assignee)
Assignee: nobody → mdoglio
Status: NEW → ASSIGNED
Priority: P2 → P1
Comment 5 • 8 years ago (Assignee)
I'm migrating the files right now. It shouldn't take more than a day to backfill the missing data, so I would say EOD tomorrow.
Comment 6 • 8 years ago (Assignee)
:barbara, running the backfill is taking longer than expected. I'll give you an update by EOD today or earlier.
Comment 7 • 8 years ago (Assignee)
After some investigation, it turns out that the filter on build_id between 20100101000000 and 99999999999999 is slowing down the job A LOT. On a 20-node cluster it takes about 30 minutes to run an android_addons job on 1% of the data with the filter set. Without the filter it takes about 10 minutes to run the same job on 100%. I'm wondering whether the filter is actually needed. :barbara, do you have an opinion on that? In the meantime I'll change the notebooks to apply the build_id filter once the data is in Spark. That should make the backfill extremely fast.
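For illustration only, here is a minimal sketch of what applying the build_id bounds after loading could look like, instead of passing them to the query that fetches the data. It is not the notebook code from this bug: the record layout (a dict with a `build_id` key), the 14-digit string format, and the helper names `has_valid_build_id` and `filter_loaded` are all assumptions. Going by the figures quoted above, and assuming runtime scales roughly linearly with data volume, 30 minutes for 1% versus 10 minutes for 100% is about a 300x difference per unit of data.

```python
# Bounds quoted in the comment above. Build IDs are ASSUMED to be
# 14-digit timestamp strings (YYYYMMDDhhmmss).
MIN_BUILD_ID = "20100101000000"
MAX_BUILD_ID = "99999999999999"


def has_valid_build_id(record, field="build_id"):
    """Return True if the record's build ID lies within the bounds.

    `record` is assumed to be a dict; `field` is a hypothetical key name.
    """
    build_id = record.get(field)
    if not isinstance(build_id, str):
        return False
    if len(build_id) != 14 or not build_id.isdigit():
        return False
    # Equal-length digit strings order the same way lexicographically
    # as they do numerically, so plain string comparison is enough.
    return MIN_BUILD_ID <= build_id <= MAX_BUILD_ID


def filter_loaded(records):
    """Apply the predicate to records that are already in memory."""
    return [r for r in records if has_valid_build_id(r)]
```

On a Spark RDD the same predicate could be passed to `rdd.filter(has_valid_build_id)`, so the bounds are checked on the workers after the records have been read.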
Comment 8 • 8 years ago (Assignee)
I found (and fixed) bug 1315243, which explains why the jobs were taking so long. I'm backfilling the last month of data for android_clients and android_events; android_addons was already backfilled last Friday. :barbara, can you please confirm that the numbers generated make sense to you?
Updated • 8 years ago (Assignee)
Flags: needinfo?(bbermes)
Comment 9 • 8 years ago
I think it looks good now, thanks. Was there also an issue with mobile_clients?
Flags: needinfo?(bbermes)
Comment 10 • 8 years ago (Assignee)
No, mobile_clients wasn't affected by that bug.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Updated • 6 years ago
Product: Cloud Services → Cloud Services Graveyard