Support the distribution name reset
Categories
(Data Platform and Tools :: Glean: SDK, enhancement, P1)
Tracking
(Not tracked)
People
(Reporter: baku, Assigned: chutten)
References
(Depends on 1 open bug)
Details
Attachments
(1 file)
There is a need to reset the distribution name stored in Glean's internal persistent storage (see bug 2027788).
At the moment, this is not possible. The only available workaround is to update the distribution name to an empty string.
This bug proposes either introducing a reset_distribution API or allowing None as a valid value for the name parameter in update_distribution.
I'm not fully aware of all the implications of this change, so if this approach is not appropriate, please let me know.
Updated • 14 days ago (Reporter)
Comment 1 • 14 days ago
original proposal for that API: https://docs.google.com/document/d/1TIZhpBeZcJSEnIZJwj0Cj9saL2Tfs0X6oi_4gFU35eM/edit?tab=t.0
Notably includes a short discussion on clearing the value.
From the doc:
Updating an attribution field with a None value will leave any existing value in-place. Attribution fields are stored persistently as though they have user lifetime.
and attached discussion:
jer: Do we need a way to unset them though? Can attribution change over (client-)time?
chutten: It would simplify testing, but outside of that I don't believe so. Except to perhaps (accidentally?) identify problems in the service maintaining attribution data between runs
jrconlin: What about in the case of an error? (e.g. a value was specified that was later determined to be incorrect or misapplied?)
chutten: That's true. We might discover through data a mistake in attribution instrumentation then ship an improved algorithm in a later version. We need to be able to overwrite later, and a subset of that is being able to clear.
It seems we're now at the stage where we have a use case for clearing.
cc :chutten for visibility
Comment 2 • 11 days ago (Assignee)
Yup, we can get this added. When (which versions, dates) do you need it for?
Comment 3 • 11 days ago (Reporter)
Luckily, KPI queries are built on top of legacy telemetry. Still, sooner is better, so that Glean and legacy telemetry stay in sync after the MozillaOnline user migration.
Comment 4 • 9 days ago (Assignee)
Okay, so design time.
Current Status:
In the Glean SDK, update_{att|dist}ribution uses None fields to signal "We're only updating some of these fields (the ones with Some(...)). The others (the ones with None) we are leaving alone."
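The partial-update semantics described above can be sketched roughly as follows. This is an illustration only, not the Glean SDK's code: the `Attribution` struct, its three fields, and the `apply_update` method are hypothetical stand-ins for the stored attribution data and the update call.

```rust
/// Hypothetical stand-in for the attribution fields that get persisted.
/// This is not the Glean SDK's own type.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Attribution {
    pub source: Option<String>,
    pub medium: Option<String>,
    pub campaign: Option<String>,
}

impl Attribution {
    /// Overwrite only the fields the caller supplied as `Some(...)`.
    /// Fields passed as `None` keep whatever value is already stored.
    pub fn apply_update(&mut self, update: Attribution) {
        if update.source.is_some() {
            self.source = update.source;
        }
        if update.medium.is_some() {
            self.medium = update.medium;
        }
        if update.campaign.is_some() {
            self.campaign = update.campaign;
        }
    }
}
```

Under these semantics no argument to `apply_update` can return a field to `None` once it has been set, which is the gap this bug describes.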
In FOG, the JS FOG API is presently using xpidl's behaviour of defaulting to void/empty nsCString for unsupplied arguments. FOG then coerces empty nsCString values to None.
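A rough sketch of that coercion, written in Rust for illustration. `coerce_xpidl_arg` is a hypothetical name, and a plain `&str` stands in for the nsCString that FOG actually receives.

```rust
/// Hypothetical model of the coercion: an empty (or unsupplied) string
/// argument becomes `None`; anything else becomes `Some(value)`.
pub fn coerce_xpidl_arg(arg: &str) -> Option<String> {
    if arg.is_empty() {
        None
    } else {
        Some(arg.to_string())
    }
}
```

In this model a JS caller cannot store "": an explicit empty string and an omitted argument both arrive as `None`, which the SDK then reads as "leave this field alone".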
Options
- We could be a little more clever about FOG's JS API design for update{Att|Dist}ribution to support updating specific fields to an empty string "".
  - This doesn't satisfy the request, as it wouldn't clear values in a way that would turn into NULL in SQL, but would be clear semantically.
  - But at that point we might as well set "<not MozillaOnline any more>" as it'd have the same effect (a sentinel value).
  - Fixing the API might be a reasonable task outside of this request, as the API is a little clunky and does artificially restrict the acceptable values by treating "" as a sentinel for None.
- Add clear_{att|dist}ribution which clears the stored values.
  - Easy to understand and straightforward to use.
  - Necessarily makes the attribution and distribution fields act different from typical string metrics (which cannot be cleared). Cannot be implemented against the public metric API.
- Change the meaning of None in the update_{att|dist}ribution API to mean "please clear this field".
  - Uses the existing API
  - Breaking change
  - How would we enable partial updates? (Do we even need that lever?)
- Actually, come to think on it, setting "<not MozillaOnline any more>" (or a more sensible indicator) would be quite a good thing to identify this population.
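To make the trade-off between the second and third options concrete (a separate clear method versus redefining None), here is a minimal sketch. The `Attribution` struct, its two fields, and all three method names are hypothetical stand-ins chosen for illustration; none of this is the Glean SDK's API.

```rust
/// Hypothetical two-field model of the stored attribution data.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Attribution {
    pub source: Option<String>,
    pub medium: Option<String>,
}

impl Attribution {
    /// Existing behaviour: `None` means "leave this field alone".
    pub fn update(&mut self, new: Attribution) {
        if new.source.is_some() {
            self.source = new.source;
        }
        if new.medium.is_some() {
            self.medium = new.medium;
        }
    }

    /// Separate clear method: resets every stored field and leaves
    /// the meaning of `update` untouched.
    pub fn clear(&mut self) {
        *self = Attribution::default();
    }

    /// Redefined `None`: "clear this field". Every call now replaces
    /// all fields, so a partial update is no longer expressible.
    pub fn update_none_clears(&mut self, new: Attribution) {
        *self = new;
    }
}
```

In this model the clear method is purely additive, while redefining None silently drops any field the caller did not restate, which is why it would be a breaking change.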
:baku, do you have opinions on which of these options (or another of your choice) would best suit your use case?
Comment 5 • 8 days ago (Reporter)
I like option 2. It's "self-contained" (no impacts on existing methods), is easy to understand and to use. But anything works for me. Thanks!
Comment 6 • 8 days ago
Updated • 4 days ago
Comment 7 • 3 days ago (Assignee)
travis79 merged PR [mozilla/glean]: bug 2043535 - Add clear_{att|dist}ribution methods to clear core attribution and distribution data. (#3482) in 3715fcf.
Next up: vendor the release