Open Bug 1326060 Opened 9 years ago Updated 3 years ago

Document and test the behavior of telemetry environment profile.creationDate and .resetDate

Categories

(Toolkit :: Startup and Profile System, defect)

People

(Reporter: benjamin, Unassigned)

Details

User Story

* profile creation date is recorded when we first create a profile (typically the first run of Firefox)
* For a profile created before Firefox 42, we calculate the profile creation date from the creation date of the oldest file in the profile directory
* After this date is recorded, it will not change
* The profile resetDate will be null until the profile is refreshed
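
As a rough illustration of the pre-Firefox-42 fallback described above, here is a minimal Python sketch that finds the oldest file timestamp in a directory. This is not the Firefox implementation. The function name `oldest_file_days` is hypothetical, the result is expressed in whole days since the Unix epoch as an assumed unit, and the file modification time is used as a stand-in because a true creation time is not portably available from Python.

```python
from pathlib import Path

SECONDS_PER_DAY = 86_400


def oldest_file_days(profile_dir, stat_time=lambda st: st.st_mtime):
    """Return the oldest file timestamp under profile_dir, in whole days
    since the Unix epoch, or None if the directory contains no files.

    stat_time picks the timestamp from an os.stat_result. The default
    (st_mtime) is an assumption made for portability, not what Firefox does.
    """
    oldest = None
    for path in Path(profile_dir).rglob("*"):
        if not path.is_file():
            continue  # skip subdirectories; only files are considered
        ts = stat_time(path.stat())
        if oldest is None or ts < oldest:
            oldest = ts
    if oldest is None:
        return None
    return int(oldest // SECONDS_PER_DAY)
```

A manual test could compare the value returned for a copy of an old profile directory against the creationDate that the profile reports.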

Profile refresh/reset behavior:
* When the user chooses profile refresh (e.g. from about:support or visiting SUMO), a new profile is created behind the scenes
* The new profile has the same telemetry clientID as the old profile
* The new profile has the creationDate of the old profile
* The resetDate will be recorded as the current date
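
The expected refresh behavior above can be written down as a small model that test scenarios can be checked against. This is only a sketch: `ProfileInfo` and `refresh_profile` are hypothetical names, and dates are represented as integer days since the Unix epoch, which is an assumption of this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ProfileInfo:
    client_id: str
    creation_date: int                 # days since epoch (assumed unit)
    reset_date: Optional[int] = None   # None until the profile is refreshed


def refresh_profile(old: ProfileInfo, today: int) -> ProfileInfo:
    """Model a profile refresh: a new profile record is produced that keeps
    the old clientID and creationDate and records today as resetDate."""
    return ProfileInfo(
        client_id=old.client_id,
        creation_date=old.creation_date,
        reset_date=today,
    )
```

A second refresh on a later day should only move reset_date forward; client_id and creation_date stay the same.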
From discussion last week, we want to rely on the profile creationDate and resetDate to identify "new users" for the purpose of other metrics. We have relatively little confidence in these metrics right now, and even some confusion about the expected behavior when the profile is reset. I'm going to record the behavior we believe is correct in the user story, and we can refine if necessary.

I think we should do a combination of checks on the data:
* manually test telemetry data on all the desktop OSes for correct creation and reset dates
** check the behavior of each of these if the local computer clock is wrong
* query telemetry data to compare the date of a user's first session in telemetry to the profile creation date, to check for how common clock skew could be
* cross-check profile creation dates for funnelcake builds which are available only in short windows
* query and examine client IDs where the profile creation date changes over time
* query and examine client IDs where the profile reset date changes, to get a baseline for how many users use profile refresh and how often they use it. Check whether the profile creation date stays constant as expected.
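
Two of the query-style checks above (creation dates that change for a client ID, and first session versus creation date) could be prototyped on a small sample before writing real queries. The sketch below assumes each ping has been reduced to a dict with the hypothetical keys `client_id`, `creation_date` and `session_day`, all in days since the Unix epoch. It does not reflect the schema of any actual telemetry dataset.

```python
from collections import defaultdict


def clients_with_changing_creation_date(pings):
    """Map client_id -> sorted distinct creation dates, keeping only
    clients that reported more than one distinct value."""
    seen = defaultdict(set)
    for p in pings:
        seen[p["client_id"]].add(p["creation_date"])
    return {cid: sorted(vals) for cid, vals in seen.items() if len(vals) > 1}


def first_session_skew_days(pings):
    """Map client_id -> (earliest session day) - (creation date reported
    in that earliest session). A negative value means the first session
    predates the recorded creation date, which hints at clock skew."""
    first_seen = {}
    creation = {}
    for p in pings:
        cid = p["client_id"]
        if cid not in first_seen or p["session_day"] < first_seen[cid]:
            first_seen[cid] = p["session_day"]
            creation[cid] = p["creation_date"]
    return {cid: first_seen[cid] - creation[cid] for cid in first_seen}
```

The same grouping applied to the reset date instead of the creation date would give a first estimate of how often profile refresh is used.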
Georg can you confirm the user story and review the checking strategy here?
Flags: needinfo?(gfritzsche)
I can confirm the behavior in the user story. The checking strategy sounds good. I would expect that one outcome is documenting a percentage/confidence value for how far we can trust these date values.

(Commenting on User Story)
> * profile creation date is recorded when we first create a profile
> (typically the first run of Firefox)

I see, that is a bit hard to discover. That comes from here:
https://dxr.mozilla.org/mozilla-central/rev/d192a99be4b436f2dc839435319f7630d5d8f4b0/toolkit/profile/nsToolkitProfileService.cpp#899

Is the idea to update the environment documentation in this bug as well, or should I open a separate one? (referring to toolkit/components/telemetry/docs/data/environment.rst)
Flags: needinfo?(gfritzsche)
I think we can use the same bug for both testing and documentation.
thuelbert, can you add this to the data team backlog and QA priorities?
Flags: needinfo?(thuelbert)
+ krupa and madalin for the qa side. Krupa for timeline/prioritization.
Flags: needinfo?(thuelbert)
I have created some test scenarios based on the user story described above. The results so far can be found here: https://docs.google.com/spreadsheets/d/1luec3XPudgZC87g2EskfSi3RVwRczp6C1ZTzICJSqZ8/edit#gid=0 I would need more info about the second part of comment 0. I presume those queries need to be run on stmo. I would need more details about which tables contain that information. Is there any documentation about the db structure?
The profile's creationDate and most-recent resetDate are stored in the environment portion of each "main" ping[1]. There are several views onto this data, so it depends on what kind of questions you're hoping to answer. We have a guide to help choose[2].

[1]: http://gecko.readthedocs.io/en/latest/toolkit/components/telemetry/telemetry/data/environment.html
[2]: https://github.com/mozilla/telemetry-batch-view/blob/master/docs/choosing_a_dataset.md
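
For anyone inspecting a raw ping by hand, a minimal sketch of pulling the two values out of the environment portion might look like this. It assumes the ping has already been parsed into a Python dict, that the values sit under `environment.profile`, and that they are integer days since the Unix epoch. Check both assumptions against the environment documentation before relying on them. `profile_dates` is a hypothetical helper name.

```python
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)


def profile_dates(ping):
    """Return (creation_date, reset_date) as datetime.date objects.

    Either element is None when the corresponding field is missing or
    null, for example resetDate on a profile that was never refreshed.
    """
    profile = ping.get("environment", {}).get("profile", {})

    def to_date(days):
        # Assumed unit: whole days since 1970-01-01.
        return None if days is None else EPOCH + timedelta(days=days)

    return to_date(profile.get("creationDate")), to_date(profile.get("resetDate"))
```

Converting to calendar dates makes it easier to compare the values against the dates noted down during manual testing.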
Oh yes, forgot to list some useful resources. The biggest one is probably to join #telemetry on irc.mozilla.org for real-time assistance on all aspects of data collection, collation, and communication. Other ones include this[1] particular resource for the Longitudinal dataset. The folder[2] contains some additional documentation.

[1]: https://github.com/mozilla/telemetry-batch-view/blob/master/docs/longitudinal_examples.md
[2]: https://github.com/mozilla/telemetry-batch-view/blob/master/docs/
(In reply to Madalin Cotetiu from comment #6)
> I have created some test scenarios based on the user story described above.
> The results so far can be found here:
> https://docs.google.com/spreadsheets/d/1luec3XPudgZC87g2EskfSi3RVwRczp6C1ZTzICJSqZ8/edit#gid=0
>
> I would need more info about the second part of comment 0. I presume those
> queries need to be run on stmo. I would need more details about which tables
> I will find those information. Is there and documentation about the db
> structure?

Madalin, please add tests for:
* behavior when a new profile is copied from an existing one (the date is still set correctly)
* create a new profile -> cause a crash -> restart profile and check the dates are still accurate
* checks for funnelcake builds

Thanks!
I have added tests as suggested in comment 9. All the test results can be seen here: https://docs.google.com/spreadsheets/d/1luec3XPudgZC87g2EskfSi3RVwRczp6C1ZTzICJSqZ8/edit#gid=0
Severity: normal → S3