Remove client/ping info from the dau-reporting ping
Categories
(Data Platform and Tools :: Glean: SDK, enhancement, P1)
Tracking
(firefox134 fixed)
Release | Tracking | Status
---|---|---
firefox134 | --- | fixed
People
(Reporter: janerik, Assigned: janerik)
References
(Blocks 1 open bug)
Details
Attachments
(3 files)
We should reduce the amount of data sent in dau-reporting pings, such as the client info.
In a follow-up we need to re-implement a subset of those fields as ordinary metrics in the respective apps.
For now we just remove the fields and make sure our pipeline knows how to ingest the resulting pings, which requires the glean-min schema.
This bug is going to do that for Desktop.
Comment 1•23 days ago
This is the same situation we encountered in https://bugzilla.mozilla.org/show_bug.cgi?id=1901256.
Going from glean to glean-min schema is backward incompatible and not supported on the platform without manual intervention. The easiest and safest solution would be to define a new ping with glean-min schema and start dropping dau-reporting at ingestion when it's no longer needed. I'd probably lean towards this option since this ping is meant to be the primary way to count dau and there seems to be some time pressure.
ni? :whd for visibility because other options require SRE interventions.
If we consider the manual intervention route, we need to keep in mind that:
- dau-reporting is defined in gecko-dev, so it has tables created for 16 applications
- We are currently receiving these pings from firefox_desktop. If we change the schema to glean-min, these pings will fail validation and go to the error stream, so we'll lose continuity in reporting (:janerik - is this acceptable?)
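The validation concern above can be illustrated with a toy sketch. The schemas, the `validate` helper, and the payloads below are hypothetical stand-ins, not the real glean or glean-min schemas, and the sketch assumes the minimal schema rejects the info sections that current clients still send (the thread does not spell out the exact reason for the failure).

```python
# Toy model only: these are NOT the real glean / glean-min schemas.
# Each "schema" just lists required and allowed top-level fields.
FULL_SCHEMA = {
    "required": {"client_info", "ping_info"},
    "allowed": {"client_info", "ping_info", "metrics"},
}
MIN_SCHEMA = {
    "required": set(),
    "allowed": {"metrics"},  # assumption: info sections are not accepted
}


def validate(ping: dict, schema: dict) -> list[str]:
    """Return validation errors; an empty list means the ping is accepted."""
    keys = set(ping)
    errors = [
        f"missing required field: {name}"
        for name in sorted(schema["required"] - keys)
    ]
    errors += [
        f"unexpected field: {name}"
        for name in sorted(keys - schema["allowed"])
    ]
    return errors


# Hypothetical payloads: what clients send today vs. after the change.
old_ping = {"client_info": {}, "ping_info": {}, "metrics": {}}
new_ping = {"metrics": {}}

old_under_min = validate(old_ping, MIN_SCHEMA)    # old clients, new schema
new_under_full = validate(new_ping, FULL_SCHEMA)  # new clients, old schema
```

Under these assumptions neither schema accepts both payload shapes, which is why switching the schema in place would send pings from not-yet-updated clients to the error stream.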
Now for the actual steps I think we could either:
1. Do what we did for the user-characteristics ping, i.e. delete and recreate the tables after changing the schema. This would involve following a stricter version of the process from https://bugzilla.mozilla.org/show_bug.cgi?id=1898105#c10. We would lose existing data unless we back it up.
2. Do what Wil proposed in a Slack discussion:
   - We add special-case handling in the schema generator (it looks like there is some precedent for that) to adjust the output schema, making the info fields optional. The BQ schema should be fine since we set the info fields to NULLABLE by default.
   - Eventually we will want to delete the incompatible pings and probably block them at the ingestion pipeline.
   - After that we can drop the info fields from the schema and BQ.
Option 2 ensures data continuity (is this important?), but is significantly more complex. Dropping columns in production tables seems a bit risky but might work.
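The first step of option 2 could look roughly like the sketch below. This is not the actual schema generator code: `make_info_optional` and `example_schema` are hypothetical, and the sketch assumes a JSON-Schema-style document that lists `client_info` and `ping_info` under `required`.

```python
import copy

# The Glean info sections that would become optional.
INFO_FIELDS = ("client_info", "ping_info")


def make_info_optional(schema: dict) -> dict:
    """Return a copy of `schema` in which the info sections are not required.

    The properties themselves are kept, so pings that still carry the
    info sections continue to validate.
    """
    relaxed = copy.deepcopy(schema)
    required = [
        name for name in relaxed.get("required", []) if name not in INFO_FIELDS
    ]
    if required:
        relaxed["required"] = required
    else:
        # Drop the key entirely rather than leaving an empty list behind.
        relaxed.pop("required", None)
    return relaxed


# Hypothetical input, heavily simplified compared to a real ping schema.
example_schema = {
    "type": "object",
    "properties": {
        "client_info": {"type": "object"},
        "ping_info": {"type": "object"},
        "metrics": {"type": "object"},
    },
    "required": ["client_info", "ping_info"],
}

relaxed = make_info_optional(example_schema)
```

Because the properties stay in place and only the `required` constraint is relaxed, pings with and without the info sections would both pass, which is what preserves continuity until the old pings can be dropped.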
Assignee
Comment 2•23 days ago
I'll discuss that with the responsible people and see what they say.
Comment 3•19 days ago
We decided that we'll rename the ping in the client code. From the platform point of view this is equivalent to defining a new ping so we won't need any manual interventions (clearing ni?:whd).
Assignee
Comment 4•14 days ago
This also removes it from the baseline schedule and should stop it from being sent. It will be fully removed at a later point.
Assignee
Comment 5•14 days ago
This is done for Desktop, Fenix and Focus all at once.
Assignee
Comment 6•14 days ago
Comment 8•12 days ago
bugherder
Assignee
Comment 9•10 days ago
thatswinnie merged PR [mozilla-mobile/firefox-ios]: Bug 1929832 - Mark dau-reporting ping as deprecated (#23281) in d16f53c.
Now also landed in iOS.