Instrument database loading cases
Categories
(Data Platform and Tools :: Glean: SDK, task, P2)
Tracking
(Not tracked)
People
(Reporter: chutten, Assigned: perry.mcmanis)
References
(Blocks 1 open bug)
Attachments (2 files)
- text/x-github-pull-request, 42 bytes
- text/plain, 2.97 KB (travis_: data-review+)
rkv likely tells us the difference between the "Tried to open the db and there wasn't one" and "Tried to open the db and it was broken" cases, both of which result in Glean performing first-run actions (generating client_id, resetting seq and first_run_hour, etc.).
We should consider instrumenting these first-run cases so we can help diagnose what proportion of first runs aren't actually "first".
Comment 1•2 years ago (Reporter)
(( Could've sworn I wrote a comment about this already... ah well ))
We're looking to instrument three cases:
- db isn't present (Glean starts afresh)
- db is present, and is bad (Glean starts afresh)
- db is present, and is good (Glean uses existing data)
I looked at this a couple weeks ago and rkv_new is the right place to look for Case 2, but for differentiating Cases 1 and 3 you'll need to use something like Path::exists.
Plus, there's the wrinkle that this is happening while opening the db. Glean doesn't exist, meaning you can't directly instrument this using e.g. set_sync. If you can get to the dispatcher, you might be able to dispatch something for after init's done... but otherwise, your data can't be added to a Glean that doesn't fully exist yet.
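As a rough illustration of the three cases, here is a minimal Rust sketch using only the standard library. `DbLoadState`, `classify_load`, and the `open` closure are hypothetical names, not Glean or rkv APIs. The closure stands in for whatever actually opens the store, and treating every open error as "db is bad" is an assumption made for brevity.

```rust
use std::path::Path;

/// The three load cases (hypothetical type, not part of Glean or rkv).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DbLoadState {
    /// Case 1: no db file on disk, Glean starts afresh.
    Absent,
    /// Case 2: db file exists but opening it failed, Glean starts afresh.
    PresentBad,
    /// Case 3: db file exists and opened fine, Glean uses existing data.
    PresentGood,
}

/// Classify a load attempt. `open` is a stand-in for the real open call.
pub fn classify_load<E, F>(db_file: &Path, open: F) -> DbLoadState
where
    F: FnOnce(&Path) -> Result<(), E>,
{
    // Check for existence before opening: opening may create the file,
    // which would make Cases 1 and 3 indistinguishable afterwards.
    if !db_file.exists() {
        return DbLoadState::Absent;
    }
    match open(db_file) {
        Ok(()) => DbLoadState::PresentGood,
        // Assumption: any open error counts as a bad db. Real code would
        // likely inspect the specific error returned by rkv.
        Err(_) => DbLoadState::PresentBad,
    }
}
```

The classification only produces a value; it deliberately does not record anything, since at this point there is no Glean to record into.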
Comment 3•2 years ago (Assignee)
Implementation is in progress. RKV does give us a handy error that makes this a bit simpler.
What adds complexity is the need to persist the error (since, well, there's no RKV storage if RKV got messed up) and get it into a metric. Will update as I make progress.
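One way to picture the "persist the error" problem is to hold the error in memory until there is a Glean to record into. The following is a minimal sketch under that assumption, not the landed implementation: `PendingLoadError`, `store`, and `flush` are hypothetical names, and the `record` closure stands in for the eventual metric call, which is not modelled here.

```rust
use std::sync::Mutex;

/// In-memory holder for a load error captured before init has finished
/// (hypothetical type; it does not touch RKV, which may be unusable).
#[derive(Default)]
pub struct PendingLoadError {
    slot: Mutex<Option<String>>,
}

impl PendingLoadError {
    /// Remember the first error seen; later errors are ignored.
    pub fn store(&self, err: impl ToString) {
        let mut slot = self.slot.lock().unwrap();
        if slot.is_none() {
            *slot = Some(err.to_string());
        }
    }

    /// Hand the stored error (if any) to `record`, then clear it.
    /// Returns true if something was recorded.
    pub fn flush<F: FnOnce(&str)>(&self, record: F) -> bool {
        // The lock is released at the end of this statement, so `record`
        // runs without holding it.
        let taken = self.slot.lock().unwrap().take();
        match taken {
            Some(msg) => {
                record(&msg);
                true
            }
            None => false,
        }
    }
}
```

The idea would be to call `store` while opening the db and `flush` from a task that runs once init is done, for example one queued on the dispatcher as suggested in comment 1.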
Comment 6•2 years ago (Assignee)
No, not the instrumentation.
I landed the code to correctly handle the error and make the behavior itself match what we described it as doing (and desired it to do). However, after discussing with Travis, it became apparent that actually plumbing this all the way up to being sent in a new error metric was a pretty meaty task and we decided to split the work off into this ticket for dealing with later.
Comment 7•2 years ago
(In reply to Perry McManis [:perry.mcmanis] from comment #6)
> No, not the instrumentation.
Thanks, please untake it if you're no longer planning on working on it :-) Consider bringing this up in the next SDK meeting for re-triage?
Comment 8•2 years ago (Assignee)
Update: after discussing we have decided this is worth doing.
I will take it back and get it completed with help from Jan-Erik.
Comment 9•1 year ago
It might also be worth instrumenting any errors around trying to clear the database or write to it (or at least ensuring we already have adequate instrumentation around these things).
Comment 12•1 year ago
Comment on attachment 9368167: Data Review request
Data Review
- Is there or will there be documentation that describes the schema for the ultimate data set in a public, complete, and accurate way?
Yes, through the metrics.yaml file and the Glean Dictionary.
- Is there a control mechanism that allows the user to turn the data collection on and off?
Yes, through the data preferences in the application settings.
- If the request is for permanent data collection, is there someone who will monitor the data over time?
Permanent collection to be monitored over time by pmcmanis@mozilla.com and glean-team@mozilla.com
- Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?
Category 1, Technical data
- Is the data collection request for default-on or default-off?
Default-on
- Does the instrumentation include the addition of any new identifiers (whether anonymous or otherwise; e.g., username, random IDs, etc. See the appendix for more details)?
No
- Is the data collection covered by the existing Firefox privacy notice?
Yes
- Does the data collection use a third-party collection tool?
No
Result: data-review+
Comment 13•1 year ago (Assignee)
A small update: we will be changing the name and description as follows.
Name: rkv_load_error
Description: If there was an error loading the RKV database, record it.
All other aspects of this collection are identical.