Closed Bug 1873482 Opened 2 years ago Closed 3 days ago

structured missing columns in `firefox_desktop.metrics_v1` for `events`.[...].`timestamp`

Categories

(Data Platform and Tools :: Glean: SDK, defect, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: kik, Assigned: janerik)

References

Details

(Whiteboard: [dataquality])

Attachments

(2 files)

structured missing columns in firefox_desktop.metrics_v1 for events.[...].timestamp

Last 7 day count: 5,935
2 weeks ago count: 3,710

A 59.97% increase.
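For reference, the reported percentage can be reproduced from the two counts above. This is only a minimal sketch of the arithmetic; the function name `percent_change` is illustrative and not taken from the tooling that files these bugs.

```python
def percent_change(previous: int, current: int) -> float:
    """Relative change from previous to current, in percent."""
    return (current - previous) / previous * 100


# Counts copied from the report above.
two_weeks_ago = 3710
last_7_days = 5935

print(f"{percent_change(two_weeks_ago, last_7_days):.2f}% increase")
```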

Assignee: nobody → srose

The moz-fx-data-shared-prod.firefox_desktop_stable.metrics_v1 table does have an events.timestamp column.

It turns out these "missing column" errors occur because Glean's event timestamp field is an unsigned 64-bit integer, while BigQuery's integer type is a signed 64-bit integer. When an event timestamp exceeds BigQuery's maximum supported integer value of 9,223,372,036,854,775,807, it gets shunted into the additional_properties column instead and is therefore reported as a missing column.
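The mismatch described above can be illustrated with a short sketch. This is not the ingestion pipeline's actual code: `fits_bigquery_int64` and `route_event_fields` are hypothetical names, and the routing logic is a simplified assumption about how an out-of-range value ends up in additional_properties.

```python
# Illustrative only: hypothetical helpers, not the real ingestion code.
INT64_MAX = 2**63 - 1   # 9,223,372,036,854,775,807, upper bound of a signed 64-bit integer
INT64_MIN = -(2**63)
UINT64_MAX = 2**64 - 1  # largest value an unsigned 64-bit field can hold


def fits_bigquery_int64(value: int) -> bool:
    """Return True if value can be stored in a signed 64-bit integer column."""
    return INT64_MIN <= value <= INT64_MAX


def route_event_fields(event: dict) -> tuple:
    """Split an event into (typed columns, additional_properties).

    Assumption: a timestamp that does not fit the column type is moved
    aside rather than dropped, which is what makes it look "missing".
    """
    columns, additional = {}, {}
    for key, value in event.items():
        if key == "timestamp" and not fits_bigquery_int64(value):
            additional[key] = value
        else:
            columns[key] = value
    return columns, additional


columns, additional = route_event_fields({"name": "click", "timestamp": UINT64_MAX})
print(columns)     # the typed columns no longer contain "timestamp"
print(additional)  # the oversized timestamp lands here instead
```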

I have reported the underlying issue to the Glean team in Slack, and since this is only happening very rarely (8.4k errors in the last week, or only 0.002% of pings) I'm closing it as won't-fix.

Status: NEW → RESOLVED
Closed: 3 months ago
Resolution: --- → WONTFIX

Those seem like invalid values regardless. We should handle these more gracefully.
Re-opening and moving to the SDK; at the very least we should not send out these large values.
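One way an SDK-side guard could look is sketched below. This bug does not describe how the Glean SDK actually resolved the issue, so this is only an assumption-laden illustration: `checked_timestamp` is a hypothetical helper, and returning `None` stands in for whatever error reporting the SDK would use.

```python
# Hypothetical guard, not the Glean SDK's implementation.
INT64_MAX = 2**63 - 1  # largest value a signed 64-bit integer column accepts


def checked_timestamp(raw_timestamp: int):
    """Return the timestamp if it is safe to send, otherwise None.

    A caller could record an error instead of sending the event
    when None is returned.
    """
    if 0 <= raw_timestamp <= INT64_MAX:
        return raw_timestamp
    return None


for candidate in (1234, 2**63, 2**64 - 1):
    result = checked_timestamp(candidate)
    status = "ok" if result is not None else "rejected"
    print(candidate, status)
```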

Status: RESOLVED → REOPENED
Component: General → Glean: SDK
Resolution: WONTFIX → ---
Assignee: srose → nobody
Assignee: nobody → jrediger
Status: REOPENED → ASSIGNED
Priority: -- → P2

FYI, here's a Redash query I made to make it easier to check how often this is still happening: https://sql.telemetry.mozilla.org/queries/112212?p_date_range=d_last_30_days

Attachment #9536134 - Flags: data-review?(chutten)

Comment on attachment 9536134 [details]
1873482-data-review.txt

DATA COLLECTION REVIEW RESPONSE:

Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?

Yes.

Is there a control mechanism that allows the user to turn the data collection on and off?

Yes. This collection can be controlled through the product's preferences.

If the request is for permanent data collection, is there someone who will monitor the data over time?

No. This collection will expire on 2026-06-31.

Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 1, Technical.

Is the data collection request for default-on or default-off?

Default on for all channels.

Does the instrumentation include the addition of any new identifiers?

No.

Is the data collection covered by the existing Firefox privacy notice?

Yes.

Does the data collection use a third-party collection tool?

No.


Result: datareview+

Attachment #9536134 - Flags: data-review?(chutten) → data-review+
Status: ASSIGNED → RESOLVED
Closed: 3 months ago → 3 days ago
Resolution: --- → FIXED