Closed Bug 1625547 Opened 5 years ago Closed 5 years ago

Investigate validation errors in `telemetry.sync` for `#/payload/devices/0/id`

Categories

(Data Platform and Tools :: General, defect, P2)

defect
Points: 2

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: ascholtz, Assigned: ascholtz)

Details

(Whiteboard: [dataquality])

Attachments

(1 file)

See https://datastudio.google.com/u/0/reporting/1MKUu478GKR7myUUe5eItvyCpxP5TBIqm/page/QDQ4

Validation errors in telemetry.sync for #/payload/devices/0/id reached 280k in the last 28 days.

I did some investigation here:
https://console.cloud.google.com/bigquery?sq=545667491807:2c900a95c08140e59f934157fc7c0fce

The device ID causing this error is "00000000000000000000000000000000":
org.everit.json.schema.ValidationException: #/payload/devices/0/id: string [00000000000000000000000000000000] does not match pattern ^[0-9a-f]{64}$
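
To make the mismatch concrete, here is a minimal Python sketch that applies the pattern from the error message to the default ID. The helper name `is_valid_device_id` is made up for illustration; the pipeline itself validates with org.everit.json.schema, not with this code.

```python
import re

# Pattern copied from the validation error above.
DEVICE_ID_PATTERN = re.compile(r"^[0-9a-f]{64}$")


def is_valid_device_id(device_id: str) -> bool:
    """Return True if device_id is exactly 64 lowercase hex characters."""
    # fullmatch avoids Python's quirk where `$` also matches before a trailing newline.
    return DEVICE_ID_PATTERN.fullmatch(device_id) is not None


default_id = "0" * 32  # the ID seen in the rejected pings
print(len(default_id), is_valid_device_id(default_id))
print(is_valid_device_id("0" * 64))
```

The 32-character default fails only on length: every character is valid hex, but the pattern requires exactly 64 of them.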

It looks like this problem is mainly caused by clients with Firefox 74.0 buildID: 20200309095159

Hi Lina, do you know of any recent changes that would account for this unexpected device id string?

Flags: needinfo?(lina)
Points: --- → 2
Priority: -- → P2
Whiteboard: data-quality → [data-quality]

That is the default ID, which we use if we can't get the real ID for "reasons". We use it as the default for both the uid and device.id.

But it looks like you're expecting a 64-character string for device.id. (That matches my real profile: my hashedUID() is 32 characters, but hashedDeviceID() is 64.) I think what we'll need to do is fix the client code to send 64 zeroes if we can't get the device ID, and maybe also loosen up the RegExp in the pipeline schema?
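
As a rough illustration of the client-side half of that proposal, the fallback could look like the sketch below. This is Python rather than the actual Firefox sync telemetry code, and `DEFAULT_DEVICE_ID` and `device_id_for_ping` are hypothetical names.

```python
from typing import Optional

# Assumption: a 64-character all-zero default, as proposed above.
# The client currently sends 32 zeroes instead.
DEFAULT_DEVICE_ID = "0" * 64


def device_id_for_ping(hashed_device_id: Optional[str]) -> str:
    """Return the hashed device ID, or the all-zero default if it is unavailable."""
    return hashed_device_id if hashed_device_id else DEFAULT_DEVICE_ID
```

With a default of this length, a ping without a real device ID would still satisfy a 64-character pattern.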

I'm also a little curious about why we're seeing the uptick in 74—is it because it's our current release now, or other reasons? Bug 1582263, which moved device info into the top-level devices field, was uplifted to 71.

Flags: needinfo?(lina)

We will not know the uid or device id for users signed in to FxA but not to sync. My best guess is that maybe 74 started an about:welcome or similar promo which caused more people to sign up for an account without sync?

I also think that we've always sent this string when no device ID is available - which can happen in scenarios other than the one above - so if we've always been rejecting those pings some other data might be skewed.

Assignee: nobody → ascholtz

The schemas have been changed to accept the default device id (00000000000000000000000000000000). Once the new schemas get deployed these validation errors should disappear.
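
The schema change itself is not quoted in this bug. One way such a change could look, purely as an assumption, is a pattern that accepts either a 64-character hex ID or the 32-zero default. The sketch below compares the strict pattern with that hypothetical loosened one; `LOOSENED_PATTERN` and `accepts` are illustrative names, not taken from the deployed schema.

```python
import re

# Strict pattern from the original validation error.
STRICT_PATTERN = re.compile(r"[0-9a-f]{64}")

# Hypothetical loosened pattern: 64 hex characters OR the 32-zero default.
LOOSENED_PATTERN = re.compile(r"[0-9a-f]{64}|0{32}")


def accepts(pattern: "re.Pattern[str]", device_id: str) -> bool:
    """Return True if the whole device_id matches the pattern."""
    return pattern.fullmatch(device_id) is not None


default_id = "0" * 32
print(accepts(STRICT_PATTERN, default_id))    # rejected before the change
print(accepts(LOOSENED_PATTERN, default_id))  # accepted by the loosened pattern
```

Allowing only the exact all-zero string, rather than any 32-character hex value, keeps other malformed IDs visible as validation errors.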

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Whiteboard: [data-quality] → [dataquality]
