Status

Cloud Services
Metrics: Pipeline
P1
normal
RESOLVED FIXED
a year ago
a year ago

People

(Reporter: rvitillo, Assigned: mdoglio)

Tracking

(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

User Story

The following atmo v1 jobs have been failing for a while:

- mobile-android-addons-v1
- mobile-android-events-v1
- mobile-android-clients-v1

Mauro, are those jobs still being used?
(Reporter)

Updated

a year ago
Blocks: 1255755
(Reporter)

Updated

a year ago
Summary: Mobl → Mobile jobs are failing
(Reporter)

Updated

a year ago
Flags: needinfo?(mdoglio)
(Assignee)

Comment 1

a year ago
I suspect nobody is using them, but I don't know tbh. I meant to add them to airflow in bug 1305423 but I haven't done it yet. Let's ask bbermes if those datasets are still needed.
Flags: needinfo?(mdoglio) → needinfo?(bbermes)

Updated

a year ago
Points: --- → 2
Priority: -- → P2
When did they stop working?

Most of my Presto queries use android_events_v1, mobile_events_v1, android_clients_v1, android_addons_v1, and mobile_clients_v1, so I think we should try to figure out what the issue is.

Thanks to you both for following up.
Flags: needinfo?(rvitillo)
Flags: needinfo?(mdoglio)
Flags: needinfo?(bbermes)
(Assignee)

Comment 3

a year ago
They seem to have failed on random days since Sept 29th. I'll move them to airflow (in bug 1305423) as soon as possible, since atmo v1 doesn't help me monitor these jobs very much. I'll take care of the eventual backfill as well.
Flags: needinfo?(mdoglio)
(Reporter)

Updated

a year ago
Flags: needinfo?(rvitillo)
Thanks Mauro,

Please let us know when we can expect this to be fixed. 

We are currently waiting for some Activity Stream data to come into re:dash for Android...
(Assignee)

Updated

a year ago
Assignee: nobody → mdoglio
Status: NEW → ASSIGNED
Priority: P2 → P1
(Assignee)

Comment 5

a year ago
I'm migrating the files right now. It shouldn't take more than a day to backfill the missing data, so I would say EOD tomorrow.
(Assignee)

Comment 6

a year ago
:barbara, running the backfill is taking longer than expected. I'll give you an update by EOD today or earlier.
(Assignee)

Comment 7

a year ago
After some investigation, it turns out the filter on build_id between 20100101000000 and 99999999999999 is slowing down the job A LOT. On a 20-node cluster it takes about 30 minutes to run an android_addons job on 1% of the data with the filter set. Without the filter it takes about 10 minutes to run the same job on 100% of the data. I'm wondering if the filter is actually needed or not. :barbara do you have an opinion on that? In the meantime I'll change the notebooks to apply the build_id filter once the data is in Spark. That should make the backfill extremely fast.
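A minimal sketch of the approach described in this comment: load the records without the build_id filter, then drop out-of-range build IDs afterwards. This is plain Python, not the actual notebook code; the record layout (a dict with a `build_id` string) and the helper names are assumptions made for illustration. In a Spark job, the same predicate could be passed to an RDD's `filter`.

```python
# Bounds taken from the comment above; build IDs are 14-digit timestamps.
MIN_BUILD_ID = "20100101000000"
MAX_BUILD_ID = "99999999999999"


def has_valid_build_id(record):
    """Return True if the record carries a build_id inside the accepted range.

    `record` is assumed to be a dict-like object (hypothetical layout).
    """
    build_id = record.get("build_id")
    if not isinstance(build_id, str):
        return False
    if len(build_id) != 14 or not build_id.isdigit():
        return False
    # Equal-length digit strings compare lexicographically like numbers.
    return MIN_BUILD_ID <= build_id <= MAX_BUILD_ID


def filter_loaded_records(records):
    """Apply the build_id filter to records that are already loaded."""
    return [r for r in records if has_valid_build_id(r)]


sample = [
    {"client": "a", "build_id": "20161101030205"},  # in range
    {"client": "b", "build_id": "20091231235959"},  # before 2010
    {"client": "c", "build_id": None},               # missing value
    {"client": "d"},                                 # missing key
    {"client": "e", "build_id": "2016"},             # malformed
]
kept = filter_loaded_records(sample)
```

Filtering after the load keeps the range check out of the data fetch, which is where the slowdown was observed.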
(Assignee)

Comment 8

a year ago
I found (and fixed) bug 1315243, which explains why the jobs were taking so long. I'm backfilling the last month of data for android_clients and android_events; android_addons was already backfilled last Friday. :barbara can you please confirm that the numbers generated make sense to you?
(Assignee)

Updated

a year ago
Flags: needinfo?(bbermes)
I think it looks good now, thanks.

Was there also an issue with mobile_clients?
Flags: needinfo?(bbermes)
(Assignee)

Comment 10

a year ago
No, mobile_clients wasn't affected by that bug.
Status: ASSIGNED → RESOLVED
Last Resolved: a year ago
Resolution: --- → FIXED