Closed Bug 1649074 Opened 4 years ago Closed 4 years ago

FxA Metrics ETL: Make device-commands events available for querying in redash

Categories

(Data Platform and Tools :: General, task, P2)

task

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rfkelly, Assigned: klukas)

References

Details

(Whiteboard: [SACI])

Attachments

(1 file)

(I'm filing this by analogy to Bug 1633918 but please don't hesitate to redirect me as appropriate)

For monitoring the quality of the "send tab to device" feature, we currently have a redash dashboard here that is build using client-submitted metrics from the sync ping. To get a more reliable sense of whether and how sending tabs is succeeding, we'd like to add some server-generated metrics to this dashboard. I have a PR to emit them on the FxA server here and they're documented here.

In order to add them to our dashboard, I believe we need an ETL job that extracts these events from the raw server logs, applies the HMAC anonymization logic to any user-linked identifiers, and then makes them available in a table queryable from redash. Leif, that's based on your recent suggestions in the send-tab metrics google doc, does it sound right to you?

I'm quite happy to have a go at putting together the necessary queries here, based on existing FxA jobs in the bigquery-etl repo. But any advice or guidance would of course be appreciated.

Whiteboard: [SACI]
Assignee: nobody → jklukas

These metrics have started to become available in bigquery; noting the query that I used to find them for reference:

WITH funnel_events AS (
  SELECT
    TIMESTAMP_TRUNC(timestamp, HOUR) AS hour,
    jsonPayload.type,
    jsonPayload.fields.uid,
    jsonPayload.fields.index,
    jsonPayload.fields.command,
    jsonPayload.fields.target,
    jsonPayload.fields.targetOS,
    jsonPayload.fields.targetType,
    jsonPayload.fields.sender,
    jsonPayload.fields.senderOS,
    jsonPayload.fields.senderType,
  FROM `moz-fx-fxa-prod-0712.fxa_prod_logs.docker_fxa_auth_2020*`
  WHERE jsonPayload.type IN ("device.command.invoked", "device.command.notified", "device.command.retrieved")
  -- This date is when the metrics were deployed and stable in prod;
  -- there is some data for earlier dates but it's from a failed deployment so we don't count it.
  AND _TABLE_SUFFIX >= '0708'
),

:klukas, any chance you can recommend a good existing example script in the bigquery-etl repo, that I could use as a starting point to poke at extracting these events over to redash?

Flags: needinfo?(jklukas)

I'll plan to take a crack at this today. Your sample query here give me the bulk of what I need to productionize the query. You'll need a data engineer involved anyway to provision the table.

Flags: needinfo?(jklukas)

See your example query translated into something appropriate for bigquery-etl in https://github.com/mozilla/bigquery-etl/pull/1150

Feel free to comment there or use it as a basis for your own PR. It's a bit difficult of a workflow, since the HMAC piece requires privs available only to Airflow, so you'll need to comment that piece out if you want to iterate on the query in the BQ console.

This new table/query is now scheduled and backfilled.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED

Thank you! Linking for completeness, I was able to use this data to make some nice graphs:

:tcsc, could you please incorporate these into your new send-tab dashboard for easy future reference? I had a look but I don't think I have edit permissions.

Flags: needinfo?(tchiovoloni)

I can't because those aren't published, but you should have edit permissions now.

Flags: needinfo?(tchiovoloni)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: