Make Test Pilot data available in Re:dash

RESOLVED INCOMPLETE

Status

Status: RESOLVED INCOMPLETE
Product: Cloud Services
Component: Metrics: Product Metrics
Priority: P1
Severity: normal
Opened: 2 years ago
Last modified: 2 years ago

People

(Reporter: clouserw, Assigned: Rebecca Weiss)

Tracking

(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

2 years ago
I'd like to use https://sql.telemetry.mozilla.org/ to analyze the Test Pilot data (bug 1255182 and bug 1255184).

Thanks!

Comment 1

2 years ago
Hey Will, my limited understanding is that you can use Spark to answer questions right now. What's the timeline for needing Re:dash?
Flags: needinfo?(wclouser)
(Reporter)

Comment 2

2 years ago
We're launching on May 10th and I'd like to make dashboards for the projects we launch and make sure we're recording stuff in the right shape for them.  So.... April 22nd?

I'm flexible, but I haven't done this before so "sooner the better, but don't stress out."  I also don't know how much work this is for you or what your backlog is like.  Let me know if April 22nd is unreasonable?
Flags: needinfo?(wclouser)

Comment 3

2 years ago
I know we discussed that April 22nd would be out of scope, is there a more reasonable timeline?

Regarding Spark: we'll need to confirm that Spark will be enough for our dashboard requirements at launch.
Flags: needinfo?(thuelbert)
(Assignee)

Comment 4

2 years ago
Reassigning to me; I'm working on the Spark notebook KPI check right now.

If I can demonstrate that we can compute basic DAU and MAU within the notebook, we can check to see about getting this data available through re:dash then.
Assignee: nobody → rweiss
(Assignee)

Comment 5

2 years ago
A notebook that creates a CSV of MAU and DAU for Test Pilot is available here:
https://gist.github.com/rjweiss/fabee4d22b6d272c3758aeca75b9728a

We will need to schedule this notebook to run regularly and also verify that this data is available from within re:dash (and can be used to populate some simple dashboard widgets therein).  I will consult with :mreid offline about this.
Thank you!
Flags: needinfo?(thuelbert)
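[Editor's note: the notebook itself is only linked above, but the DAU/MAU computation it performs can be sketched in plain Python. This is a minimal illustration, not the notebook's actual code; the `events` records and client IDs are hypothetical stand-ins for the Test Pilot telemetry read via Spark.]

```python
from datetime import date, timedelta

# Hypothetical (client_id, activity_date) records standing in for
# the per-client Test Pilot pings the notebook reads from Spark.
events = [
    ("a", date(2016, 5, 1)),
    ("a", date(2016, 5, 2)),
    ("b", date(2016, 5, 2)),
    ("c", date(2016, 3, 10)),
]

def dau(events, day):
    """Distinct clients active on exactly `day`."""
    return len({cid for cid, d in events if d == day})

def mau(events, day, window=28):
    """Distinct clients active in the `window` days ending on `day`."""
    start = day - timedelta(days=window - 1)
    return len({cid for cid, d in events if start <= d <= day})

print(dau(events, date(2016, 5, 2)))  # 2 (clients a and b)
print(mau(events, date(2016, 5, 2)))  # 2 (client c is outside the 28-day window)
```

The same set-of-distinct-clients logic maps directly onto a Spark `countDistinct` over a date filter; the window length here is an assumption (28 days is a common MAU convention, not necessarily what the notebook used).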
(Assignee)

Comment 7

2 years ago
Spoke to mreid, he said he will be checking to see if it is easy to migrate the CSV I uploaded in that notebook to Presto.  Once in Presto, it will be accessible within re:dash (and I have created a placeholder dashboard in re:dash that is awaiting data).

I also need to schedule the notebook to run on a daily basis to push that data to the CSV bucket, so it's important that the migration from the CSV S3 bucket to Presto is performed on a schedule.

:mreid, can you let me know what the status is on migrating to Presto?
Flags: needinfo?(mreid)
(Assignee)

Updated

2 years ago
Priority: -- → P1

Comment 8

2 years ago
It's relatively easy to make CSV data queryable by Presto. There's a small quirk where the header line with field names actually appears as data, which can be fixed by bug 1268896 (and is easy to work around in queries).

I forked Rebecca's notebook from Comment 5 to make a few minor tweaks[1] and tested it out via Presto[2] and all seems well.

Once a finalized notebook is scheduled and the output location of the CSV data is set, I can easily update the table definition in presto and we should be good to go. Nothing further needs to happen on the presto/redash side upon scheduled CSV updates.

[1] https://gist.github.com/mreid-moz/dac9c5b67f01ea3734a207821b120668
[2] https://sql.telemetry.mozilla.org/queries/263
Flags: needinfo?(mreid)
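[Editor's note: the header quirk above can be mirrored in a small Python sketch: when a CSV is ingested row-by-row with no header handling, as Presto does here over the S3 CSVs, the header line shows up as an ordinary data row, and the query-side workaround is simply to filter it out. The column names below are illustrative, not taken from the actual table.]

```python
import csv
import io

# Raw CSV as it would sit in the S3 bucket; the first line is the header.
raw = """date,dau,mau
2016-05-01,1200,8000
2016-05-02,1350,8100
"""

# Naive ingestion treats every line as data, header included --
# this is the quirk described in the comment above.
rows = list(csv.reader(io.StringIO(raw)))
assert rows[0] == ["date", "dau", "mau"]  # header appears as a "row"

# Query-side workaround: drop rows whose first field equals the column
# name, analogous to a `WHERE date <> 'date'` predicate in Presto SQL.
data = [r for r in rows if r[0] != "date"]
print(data)  # [['2016-05-01', '1200', '8000'], ['2016-05-02', '1350', '8100']]
```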
(Assignee)

Comment 9

2 years ago
I will take this back to Javaun to make sure that our computation of DAU and MAU for test pilot suits their needs. Additionally, we will likely need to break this out further to compute DAU for each individual test separately.

Updated

2 years ago
Blocks: 1270961

Updated

2 years ago
Component: Metrics: Pipeline → Metrics: Product Metrics
(Assignee)

Comment 10

2 years ago
I scheduled a job with the version of the notebook in comment 8 above using a.t.m.o's job scheduler. Output below:

Your code has been uploaded to s3://telemetry-analysis-code-2/jobs/TxP DAU MAU v1/Telemetry - Test Pilot KPI Validity Check.ipynb.
Any output files found relative to where the notebook will be executed will be published at s3://telemetry-private-analysis-2/TxP DAU MAU v1/data/. The output files will overwrite anything already in that location in S3.
The job will be run daily at 4:00 UTC.
The job will be allowed to run for a max of 120 minutes, after which it will be killed. 
Cron spec will be 0 4 * * *
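[Editor's note: the cron spec `0 4 * * *` means "minute 0, hour 4, every day," i.e. daily at 04:00 UTC as the scheduler output states. A small sketch can sanity-check what the next run time would be; this is an illustration only, not part of a.t.m.o's scheduler.]

```python
from datetime import datetime, timedelta, timezone

def next_daily_run(now, hour=4):
    """Next occurrence of `hour`:00 UTC strictly after `now`,
    matching a cron spec of `0 4 * * *` when hour=4."""
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate

now = datetime(2016, 5, 12, 10, 30, tzinfo=timezone.utc)
print(next_daily_run(now))  # 2016-05-13 04:00:00+00:00
```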
(Assignee)

Comment 11

2 years ago
I created another notebook, which constructs a CSV of DAU and MAU broken out by individual Test Pilot test.

This is available here: https://gist.github.com/rjweiss/1193b079c3bfaa7038c41ca4c2ceadff

This notebook is currently NOT scheduled as it is waiting for review.

:mreid, can you review the notebook for badness?  If you sign off, I will file another bug to create a new table in presto using the csv created by this notebook as well as schedule the job to continue uploading on a daily basis.
Flags: needinfo?(mreid)
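[Editor's note: the per-test breakout described above amounts to grouping the same distinct-client count by test name. A minimal pure-Python sketch, with hypothetical records and illustrative test names, not the notebook's actual code:]

```python
from collections import defaultdict
from datetime import date

# Hypothetical (client_id, test_name, activity_date) records.
events = [
    ("a", "tab-center", date(2016, 5, 2)),
    ("b", "tab-center", date(2016, 5, 2)),
    ("a", "activity-stream", date(2016, 5, 2)),
    ("c", "activity-stream", date(2016, 5, 1)),
]

def dau_per_test(events, day):
    """Distinct active clients per Test Pilot test on `day`."""
    clients = defaultdict(set)
    for cid, test, d in events:
        if d == day:
            clients[test].add(cid)
    return {test: len(cids) for test, cids in clients.items()}

print(dau_per_test(events, date(2016, 5, 2)))
# {'tab-center': 2, 'activity-stream': 1}
```

In the notebook this would correspond to adding the test name to the Spark group-by key before counting distinct clients, writing one CSV row per (date, test) pair.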

Comment 12

2 years ago
Per IRC discussion with :rweiss, this dataset has sort of become obsolete - a new bug will be filed when the mechanics of the testpilot / testpilottest pings are finalized.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Flags: needinfo?(mreid)
Resolution: --- → INCOMPLETE