Bug 1628740 Comment 0 Edit History

See description in https://docs.google.com/document/d/1UNulczGgBSepy0V1YISAfEuczbtvXaR59HB7xwTCFCk/edit#heading=h.yku9rl5thbdk

We need to 

The existing schema for the Amplitude events is:

```
User    fxa_uid        string    
Event    device_id        string    
User    ua_browser        string    
User    ua_version        string    
Event    flow_id        string                               
```

In BigQuery, we have the `telemetry.sync` ping table which includes fields `id` (which is likely the hashed `fxa_uid`), `payload.device_id`, and user agent info parsed under `metadata.user_agent`, though we may want to check on the format of those fields and whether they match previous expectations.

I do not see an obvious `flow_id` field in the ping table, so some investigation is needed to understand where that comes from.

This may be a fairly straightforward task of creating a query to write out these events 1:1 to a derived table with schema matching the above, and then tying that table into the same nightly bulk export machinery we use for other Amplitude sends.

See description in https://docs.google.com/document/d/1UNulczGgBSepy0V1YISAfEuczbtvXaR59HB7xwTCFCk/edit#heading=h.yku9rl5thbdk

We need to 

The existing schema for the Amplitude events is:

```
User     fxa_uid        string    
Event    device_id      string    
User     ua_browser     string    
User     ua_version     string    
Event    flow_id        string                               
```

In BigQuery, we have the `telemetry.sync` ping table which includes fields `id` (which is likely the hashed `fxa_uid`), `payload.device_id`, and user agent info parsed under `metadata.user_agent`, though we may want to check on the format of those fields and whether they match previous expectations.

I do not see an obvious `flow_id` field in the ping table, so some investigation is needed to understand where that comes from.

This may be a fairly straightforward task of creating a query to write out these events 1:1 to a derived table with schema matching the above, and then tying that table into the same nightly bulk export machinery we use for other Amplitude sends.

Send tab metrics are currently broken, so it's not clear at this point how we can validate the output.

See description in https://docs.google.com/document/d/1UNulczGgBSepy0V1YISAfEuczbtvXaR59HB7xwTCFCk/edit#heading=h.yku9rl5thbdk

We need to export "tab sent" and "tab received" events to Amplitude based on sync telemetry.

The existing schema for the Amplitude events is:

```
User     fxa_uid        string    
Event    device_id      string    
User     ua_browser     string    
User     ua_version     string    
Event    flow_id        string                               
```

In BigQuery, we have the `telemetry.sync` ping view which includes fields `id` (which is likely the hashed `fxa_uid`), `payload.device_id`, and user agent info parsed under `metadata.user_agent`, though we may want to check on the format of those fields and whether they match previous expectations.

I do not see an obvious `flow_id` field in the ping table, so some investigation is needed to understand where that comes from.

This may be a fairly straightforward task of creating a query to write out these events 1:1 to a derived table with schema matching the above, and then tying that table into the same nightly bulk export machinery we use for other Amplitude sends.

Send tab metrics are currently broken, so it's not clear at this point how we can validate the output.

See description in https://docs.google.com/document/d/1UNulczGgBSepy0V1YISAfEuczbtvXaR59HB7xwTCFCk/edit#heading=h.yku9rl5thbdk

We need to export "tab sent" and "tab received" events to Amplitude based on sync telemetry.

The existing schema for the Amplitude events is:

```
User     fxa_uid        string    
Event    device_id      string    
User     ua_browser     string    
User     ua_version     string    
Event    flow_id        string                               
```

In BigQuery, we have the `telemetry.sync` ping view, which includes the fields `payload.uid` (which should be the hashed `fxa_uid`), `payload.device_id`, and user agent info parsed under `metadata.user_agent`. We should still check the format of those fields and whether they match previous expectations. See https://firefox-source-docs.mozilla.org/toolkit/components/telemetry/data/sync-ping.html for more details on the ping structure.

The `flow_id` will be extracted from `payload.events` where it appears in the object map as key "flowID".
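
To make the extraction concrete, here is a minimal Python sketch of pulling `flowID` out of `payload.events` for a single ping. It rests on several assumptions that are not confirmed in this bug: that each event is an array of `[timestamp, category, method, object, value, extra]`, that send tab events use the methods `sendcommand` and `processcommand` with object `displayURI`, and that the parsed user agent exposes `browser` and `version` keys. The function name and the event labels are hypothetical, and the real implementation would most likely be a BigQuery query rather than Python.

```python
from typing import Any, Dict, Iterator

# Assumed mapping from sync event method to an Amplitude event label.
METHOD_TO_EVENT = {"sendcommand": "tab sent", "processcommand": "tab received"}


def extract_tab_events(ping: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one row per send tab event found in a sync ping (hypothetical helper)."""
    payload = ping.get("payload") or {}
    user_agent = (ping.get("metadata") or {}).get("user_agent") or {}
    for event in payload.get("events") or []:
        # Assumed layout: [timestamp, category, method, object, value, extra].
        # Trailing elements may be missing, so pad the event to six fields.
        fields = list(event) + [None] * (6 - len(event))
        _timestamp, category, method, obj, _value, extra = fields[:6]
        if category != "sync" or obj != "displayURI":
            continue
        if method not in METHOD_TO_EVENT:
            continue
        yield {
            "event_type": METHOD_TO_EVENT[method],
            "fxa_uid": payload.get("uid"),
            "device_id": payload.get("device_id"),
            "ua_browser": user_agent.get("browser"),
            "ua_version": user_agent.get("version"),
            # flowID lives in the event's extra map; it may be absent.
            "flow_id": (extra or {}).get("flowID"),
        }
```

Events without an `extra` map still produce a row, with `flow_id` set to `None`, so missing flow IDs stay visible in the output instead of being silently dropped.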

This may be a fairly straightforward task of creating a query to write out these events 1:1 to a derived table with schema matching the above, and then tying that table into the same nightly bulk export machinery we use for other Amplitude sends.
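
As a rough illustration of the 1:1 mapping, the sketch below shapes one derived-table row into an event dictionary. It assumes the first column of the schema above marks each field as an Amplitude user property or event property, and that `fxa_uid` also serves as the Amplitude `user_id`. Both are assumptions rather than confirmed behavior of the existing export machinery, and `FIELD_SCOPE` and `to_amplitude_event` are hypothetical names.

```python
from typing import Any, Dict

# Scope of each field, copied from the schema table above.
FIELD_SCOPE = {
    "fxa_uid": "User",
    "device_id": "Event",
    "ua_browser": "User",
    "ua_version": "User",
    "flow_id": "Event",
}


def to_amplitude_event(row: Dict[str, Any], event_type: str) -> Dict[str, Any]:
    """Split a derived-table row into user and event properties (sketch)."""
    user_properties = {}
    event_properties = {}
    for field, scope in FIELD_SCOPE.items():
        value = row.get(field)
        if value is None:
            # Skip missing values rather than sending explicit nulls.
            continue
        if scope == "User":
            user_properties[field] = value
        else:
            event_properties[field] = value
    return {
        "event_type": event_type,
        # Assumption: the hashed FxA uid doubles as the Amplitude user id.
        "user_id": row.get("fxa_uid"),
        "user_properties": user_properties,
        "event_properties": event_properties,
    }
```

Keeping the scope table in one place means the derived table's columns and the exported payload cannot drift apart without a visible change to `FIELD_SCOPE`.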

Send tab metrics are currently broken, so it's not clear at this point how we can validate the output.
