Closed Bug 1161515 Opened 9 years ago Closed 9 years ago

Add previousSessionId to Telemetry payloads

Categories

(Toolkit :: Telemetry, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla41
Tracking Status
firefox40 --- fixed
firefox41 --- fixed

People

(Reporter: gfritzsche, Assigned: qeole, Mentored)

References

Details

(Whiteboard: [lang=js] [good next bug] [uplift])

Attachments

(3 files, 7 obsolete files)

9.89 KB, patch
qeole
: review+
Details | Diff | Splinter Review
6.02 KB, patch
qeole
: review+
Details | Diff | Splinter Review
2.22 KB, patch
qeole
: review+
Details | Diff | Splinter Review
While we submit the previousSubsessionId, we lose the ability to reliably chain if we happen to miss a ping.

Submitting the previousSessionId too should help with reconstructing.
Blocks: 1120356
Depends on: 1157359
This is easy to test by extending test_SubsessionChaining.js.
Keeping a record of the previous session state seems like unnecessary complexity.

Why is reliable chaining across sessions considered important?
We already keep track of the previous session state (for previousSubsessionId and profileSubsessionCounter).

I think we have to live with the fact that we will lose some subsessions in corner-cases.
Brendan said it could be useful for reconstruction of the chains (across a profiles life-time).
I imagine it's good for detecting missing sessions too.

Happy to close this if we have better and/or existing options for this.
Whiteboard: [lang=js] [good next bug]
Assignee: nobody → qeole
Hi Georg,

I think I made the edits corresponding to steps 1 to 4 of your last comment.

I haven't finished updating the test yet: the Assert.equal() I added at line 64 does not work (see example output below).
Is it an expected behavior, or should session IDs chain just the same as subsession IDs?

----
(subset of output from |mach test …/test_SubsessionChaining.js| with debug prints, IDs were truncated for “readability”)
>  LOG: Thread-1 INFO "Archive entry 1 - id:…, reason: shutdown"
>  PROCESS_OUTPUT: Thread-1 (pid:11138) "previousSessionId: f52be1c… f52be1c… :currentSessionId"
>  LOG: Thread-1 INFO "Archive entry 2 - id:…, reason: aborted-session"
>  PROCESS_OUTPUT: Thread-1 (pid:11138) "previousSessionId: eb47dd4… eb47dd4… :currentSessionId"
>  LOG: Thread-1 INFO "Archive entry 3 - id:…, reason: environment-change"
>  PROCESS_OUTPUT: Thread-1 (pid:11138) "previousSessionId: 5ea6ec1… 5ea6ec1… :currentSessionId"
>  LOG: Thread-1 INFO "Archive entry 4 - id:…, reason: shutdown"
>  PROCESS_OUTPUT: Thread-1 (pid:11138) "previousSessionId: 5ea6ec1… ef7fd14… :currentSessionId"
>  LOG: Thread-1 INFO "Archive entry 5 - id:…, reason: daily"
>  PROCESS_OUTPUT: Thread-1 (pid:11138) "previousSessionId: ef7fd14… ef7fd14… :currentSessionId"
>  LOG: Thread-1 INFO "Archive entry 6 - id:…, reason: environment-change"
>  PROCESS_OUTPUT: Thread-1 (pid:11138) "previousSessionId: ef7fd14… 15b5431… :currentSessionId"
>  LOG: Thread-1 INFO "Archive entry 7 - id:…, reason: shutdown"
>  PROCESS_OUTPUT: Thread-1 (pid:11138) "previousSessionId: ef7fd14… 15b5431… :currentSessionId"
>  LOG: Thread-1 INFO "Archive entry 8 - id:…, reason: environment-change"
>  PROCESS_OUTPUT: Thread-1 (pid:11138) "previousSessionId: 15b5431… 15b5431… :currentSessionId"
>  LOG: Thread-1 INFO "Archive entry 9 - id:…, reason: aborted-session"
>  PROCESS_OUTPUT: Thread-1 (pid:11138) "previousSessionId: 15b5431… 0182c57… :currentSessionId"
>  LOG: Thread-1 INFO "Archive entry 10 - id:…, reason: daily"
>  PROCESS_OUTPUT: Thread-1 (pid:11138) "previousSessionId: 0182c57… 0182c57… :currentSessionId"
>  LOG: Thread-1 INFO "Archive entry 11 - id:…, reason: environment-change"
>  PROCESS_OUTPUT: Thread-1 (pid:11138) "previousSessionId: 0182c57… dd32899… :currentSessionId"
>  LOG: Thread-1 INFO "Archive entry 12 - id:…, reason: aborted-session"
>  PROCESS_OUTPUT: Thread-1 (pid:11138) "previousSessionId: 0182c57… dd32899… :currentSessionId"
Attachment #8606691 - Flags: feedback?(gfritzsche)
(In reply to Qeole [:qeole] from comment #5)
> I haven't finished updating the test yet: the Assert.equal() I added at line
> 64 does not work (see example output below).
> Is it an expected behavior, or should session IDs chain just the same as
> subsession IDs?

That is expected, subsession ids and session ids behave differently.
A session covers a whole "browser session", it consists of 1 to N subsessions.
Each session ends with a subsession with a |reason| of "shutdown" (the browser shut down cleanly) or "aborted-session" (the browser did not shut down cleanly, probably crashed).

We expect the previousSessionId to:
* start out with null
* stay null until the first match on SESSION_END_PING_REASONS
* match the sessionId of the previous session until the next match on SESSION_END_PING_REASONS
* etc.
Comment on attachment 8606691 [details] [diff] [review]
bug1161515-managing_previousSessionId.patch

Review of attachment 8606691 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/TelemetrySession.jsm
@@ +1911,3 @@
>                                  SESSION_STATE_FILE_NAME);
>  
>      // Try to load the "profileSubsessionCounter" from the state file.

We could change the comment while we're here.
"Try to load info about the previous session from the state file."?
Attachment #8606691 - Flags: feedback?(gfritzsche) → feedback+
Thanks Georg!
I think I got it, the test is passing on my computer.

Asking for review.
Attachment #8606691 - Attachment is obsolete: true
Attachment #8606740 - Flags: review?(gfritzsche)
Comment on attachment 8606740 [details] [diff] [review]
bug1161515-managing_previousSessionId.patch v2

Review of attachment 8606740 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, this is nearly done, only minor points now.

::: toolkit/components/telemetry/TelemetrySession.jsm
@@ +1442,5 @@
>      }
>  
>      // Generate a unique id once per session so the server can cope with duplicate
>      // submissions, orphaning and other oddities. The id is shared across subsessions.
> +    this._previousSessionId = this._sessionId;

This doesn't look right, we should just remove that line.
At this point |this._sessionId| is still null, so it happens to be harmless.

We don't have any data on the previous session until |_loadSessionData()| has finished.
The only time where |this._sessionId| could already have a value is in tests where we fake a browser restart and forget to reset it.

@@ +1910,5 @@
>    _loadSessionData: Task.async(function* () {
>      let dataFile = OS.Path.join(OS.Constants.Path.profileDir, DATAREPORTING_DIRECTORY,
>                                  SESSION_STATE_FILE_NAME);
>  
> +    // Try to load info about previous session / subsession from the state file.

Nit: Just "about the previous session"?

::: toolkit/components/telemetry/tests/unit/test_SubsessionChaining.js
@@ +51,5 @@
>    Assert.equal(previousPing.payload.info.subsessionCounter, 1,
>                 "subsessionCounter must be 1 the first time.");
>  
>    let expectedSubsessionCounter = 1;
> +  let actualPreviousSessionId = previousPing.payload.info.previousSessionId;

Nit: expectedPreviousSessionId?

@@ +61,5 @@
>      do_print("Archive entry " + i + " - id: " + currentPing.id + ", reason: " + currentInfo.reason);
>  
>      Assert.equal(aExpectedReasons.shift(), currentInfo.reason,
>                   "Telemetry should only get pings with expected reasons.");
> +    if (SESSION_END_PING_REASONS.has(previousInfo.reason)) {

I think we don't need to handle this explicitly here.
I'd suggest to just test for |Assert.equal(expectedPreviousSessionId, currentInfo.previousSessionId)| here.

Then change how we handle expectedSubsessionCounter below to also update expectedPreviousSessionId as needed.
Attachment #8606740 - Flags: review?(gfritzsche) → feedback+
OK, all points fixed.
Asking for feedback because I don't know what you prefer for handling |expectedPreviousSessionId|: I used the same code pattern as for |expectedSubsessionCounter|, but in its current state it's not really elegant.

Maybe I could use something like:
> if (SESSION_END_PING_REASONS.has(currentInfo.reason)) {
>   expectedSubsessionCounter = 1;
>   expectedPreviousSessionId = currentInfo.sessionId;
> } else {
>   expectedSubsessionCounter++;
> }
to avoid doing twice the comparison on |currentInfo.reason|?
Attachment #8606740 - Attachment is obsolete: true
Attachment #8607049 - Flags: feedback?(gfritzsche)
(In reply to Qeole [:qeole] from comment #10)
> OK, all points fixed.
> Asking for feedback because I don't know what you prefer for handling
> |expectedPreviousSessionId|: I used the same code pattern as for
> |expectedSubsessionCounter|, but in its current state it's not really
> elegant.
> 
> Maybe I could use something like:
> > if (SESSION_END_PING_REASONS.has(currentInfo.reason)) {
> >   expectedSubsessionCounter = 1;
> >   expectedPreviousSessionId = currentInfo.sessionId;
> > } else {
> >   expectedSubsessionCounter++;
> > }
> to avoid doing twice the comparison on |currentInfo.reason|?

Yes please, that looks much more readable :)
Attachment #8607049 - Flags: feedback?(gfritzsche) → feedback+
Updated.
Should I go for a Try push?
Attachment #8607049 - Attachment is obsolete: true
Attachment #8607072 - Flags: review?(gfritzsche)
Comment on attachment 8607072 [details] [diff] [review]
bug1161515-managing_previousSessionId.patch v4

Review of attachment 8607072 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, that looks good.

Yes, let's do a try push - xpcshell only on mac/linux/windows/android.

::: toolkit/components/telemetry/tests/unit/test_SubsessionChaining.js
@@ +73,5 @@
>  
>      // Store the current ping as previous.
>      previousPing = currentPing;
>      // Reset the expected subsession counter, if required. Otherwise increment the expected
>      // subsession counter.

Let's fix the comment too. Suggesting:
"// If this is the final subsession of a session we need to update expected values accordingly."
Attachment #8607072 - Flags: review?(gfritzsche) → review+
First try run failed:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a62b88ca76bf

I added a line about sessionId in test test_TelemetrySessions.js, now everything passes:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=bbb03aff2a90

There are tests in this file asserting that subsession IDs and counters are sent correctly, and written correctly to datareporting/session-state.json file.
Do we want to update them as well?
Attachment #8607072 - Attachment is obsolete: true
Attachment #8607523 - Flags: feedback?(gfritzsche)
(In reply to Qeole [:qeole] from comment #14)
> I added a line about sessionId in test test_TelemetrySessions.js, now
> everything passes:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=bbb03aff2a90
> 
> There are tests in this file asserting that subsession IDs and counters are
> sent correctly, and written correctly to datareporting/session-state.json
> file.
> Do we want to update them as well?

Yes, good point, sorry i missed that.

These are the location i can see on a quick pass that could be extended to previousSessionId / persisted sessionId handling:
https://hg.mozilla.org/mozilla-central/annotate/f65cc0022a0e/toolkit/components/telemetry/tests/unit/test_TelemetrySession.js#l239
https://hg.mozilla.org/mozilla-central/annotate/f65cc0022a0e/toolkit/components/telemetry/tests/unit/test_TelemetrySession.js#l1170
https://hg.mozilla.org/mozilla-central/annotate/f65cc0022a0e/toolkit/components/telemetry/tests/unit/test_TelemetrySession.js#l1245
https://hg.mozilla.org/mozilla-central/annotate/f65cc0022a0e/toolkit/components/telemetry/tests/unit/test_TelemetrySession.js#l1256
https://hg.mozilla.org/mozilla-central/annotate/f65cc0022a0e/toolkit/components/telemetry/tests/unit/test_TelemetrySession.js#l1274
Comment on attachment 8607523 [details] [diff] [review]
bug1161515-managing_previousSessionId.patch v5

This look fine, thanks.
To ease things we could put extending the test coverage into another patch on this bug and treat this one as reviewed.
Attachment #8607523 - Flags: feedback?(gfritzsche) → review+
Question about |fakeGenerateUUID()|: sometimes it's called after a test to reset fake generation to default (see [1] or [2]), and sometimes it is not (e.g. [3] and [4]).
It does not seem to cause problems in current state, but maybe harmonizing it (resetting at the end of each test that called the function) could be an idea?

[1] https://hg.mozilla.org/mozilla-central/annotate/f65cc0022a0e/toolkit/components/telemetry/tests/unit/test_TelemetrySession.js#l559
[2] https://hg.mozilla.org/mozilla-central/annotate/f65cc0022a0e/toolkit/components/telemetry/tests/unit/test_TelemetrySession.js#l1236

[3] https://hg.mozilla.org/mozilla-central/annotate/f65cc0022a0e/toolkit/components/telemetry/tests/unit/test_TelemetrySession.js#l1185
[4] https://hg.mozilla.org/mozilla-central/annotate/f65cc0022a0e/toolkit/components/telemetry/tests/unit/test_TelemetrySession.js#l1264
Attachment #8607523 - Attachment is obsolete: true
I found nothing to add at line [1].
I added some checks on sessionId in all other locations stated in comment #15.
The test passes on my computer.

[1] https://hg.mozilla.org/mozilla-central/annotate/f65cc0022a0e/toolkit/components/telemetry/tests/unit/test_TelemetrySession.js#l1256
Attachment #8608317 - Flags: feedback?(gfritzsche)
Comment on attachment 8608317 [details] [diff] [review]
bug1161515-test_TelemetrySession.js-extended_test_coverage.patch v1

Review of attachment 8608317 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good, thanks!
Attachment #8608317 - Flags: feedback?(gfritzsche) → feedback+
(In reply to Qeole [:qeole] from comment #17)
> Question about |fakeGenerateUUID()|: sometimes it's called after a test to
> reset fake generation to default (see [1] or [2]), and sometimes it is not
> (e.g. [3] and [4]).
> It does not seem to cause problems in current state, but maybe harmonizing
> it (resetting at the end of each test that called the function) could be an
> idea?

That is a good point. If you wanted to do that we could put it in a separate patch here or in a follow-up bug.
(In reply to Georg Fritzsche [:gfritzsche] from comment #20)
> Comment on attachment 8608317 [details] [diff] [review]
> bug1161515-test_TelemetrySession.js-extended_test_coverage.patch v1
> 
> Review of attachment 8608317 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This looks good, thanks!

Should I consider that one as reviewed as well?

(In reply to Georg Fritzsche [:gfritzsche] from comment #21)
> (In reply to Qeole [:qeole] from comment #17)
> > Question about |fakeGenerateUUID()|: sometimes it's called after a test to
> > reset fake generation to default (see [1] or [2]), and sometimes it is not
> > (e.g. [3] and [4]).
> > It does not seem to cause problems in current state, but maybe harmonizing
> > it (resetting at the end of each test that called the function) could be an
> > idea?
> 
> That is a good point. If you wanted to do that we could put it in a separate
> patch here or in a follow-up bug.

Added in a separate (third) patch and asking for review.
Attachment #8608656 - Flags: review?(gfritzsche)
Comment on attachment 8608317 [details] [diff] [review]
bug1161515-test_TelemetrySession.js-extended_test_coverage.patch v1

Review of attachment 8608317 [details] [diff] [review]:
-----------------------------------------------------------------

Ah, sorry - yes, this patch is fine.
Attachment #8608317 - Flags: feedback+ → review+
Comment on attachment 8608656 [details] [diff] [review]
bug1161515-reset_fakeGenerateUUID.patch v1

Review of attachment 8608656 [details] [diff] [review]:
-----------------------------------------------------------------

I see fakeGenerateUUID() being used in more test functions.
Ideally we would reset it at the end of each test function that sets it so that we don't depend on a specific test order etc.
Attachment #8608656 - Flags: review?(gfritzsche) → feedback+
(In reply to Georg Fritzsche [:gfritzsche] from comment #24)
> Comment on attachment 8608656 [details] [diff] [review]
> bug1161515-reset_fakeGenerateUUID.patch v1
> 
> Review of attachment 8608656 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I see fakeGenerateUUID() being used in more test functions.
Could you be more specific please? I found 4 uses in test_TelemetrySession.js (listed in comment #17), 2 of them already being followed with a |fakeGenerateUUID(generateUUID, generateUUID)| to reset generators.
The patch addresses the other 2; but I couldn't find any more calls, in this file or in any other?

> Ideally we would reset it at the end of each test function that sets it so
> that we don't depend on a specific test order etc.
Agreed. Does this mean you expect |fakeGenerateUUID(generateUUID, generateUUID)| to be the very last instruction in the test, or is it sufficient to do it after all IDs generation in the test (as I did in current version of patch)?
(In the former case, should I also move existing calls to the end of the test function?)
I forgot to set any flag, needinfo for Georg
Status: NEW → ASSIGNED
Flags: needinfo?(gfritzsche)
I see, i missed the existing resets - thanks!
Flags: needinfo?(gfritzsche)
Attachment #8608656 - Flags: review+
(In reply to Qeole [:qeole] from comment #25)
> > Ideally we would reset it at the end of each test function that sets it so
> > that we don't depend on a specific test order etc.
> Agreed. Does this mean you expect |fakeGenerateUUID(generateUUID,
> generateUUID)| to be the very last instruction in the test, or is it
> sufficient to do it after all IDs generation in the test (as I did in
> current version of patch)?
> (In the former case, should I also move existing calls to the end of the
> test function?)

I was assuming that we only reset fakeGenerateUUID() in 2 of the 4 places, so i thought we'd depend on test order for cleanup. My mistake.
Lets do a try push with xpcshell tests across desktop platforms, Android & B2G desktop.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=effbaa5b3309

Everything seems to be good. May I ask for check-in?
Flags: needinfo?(gfritzsche)
Yes, that looks good, thanks :)

You can set the review+ yourself for the first patch in the attachment details so it's clear that it was reviewed.
Flags: needinfo?(gfritzsche)
Attachment #8608656 - Attachment is obsolete: true
Attachment #8608313 - Flags: review+
Attachment #8610535 - Flags: review+
Comment on attachment 8610537 [details] [diff] [review]
bug1161515-reset_fakeGenerateUUID.patch v1 r+

Review of attachment 8610537 [details] [diff] [review]:
-----------------------------------------------------------------

Updating comments of patches with name of reviewer.
I also set back the r+ flag on the three of them.

Asking for check-in for the three patches.
Attachment #8610537 - Flags: review+
Keywords: checkin-needed
Whiteboard: [lang=js] [good next bug] → [lang=js] [good next bug] [uplift]
Comment on attachment 8608313 [details] [diff] [review]
bug1161515-managing_previousSessionId.patch v5 r+

Approval Request Comment
[Feature/regressing bug #]:
Unified Telemetry, https://wiki.mozilla.org/Unified_Telemetry
This is part of the first (main) batch of uplifts to 40 to enable shipping on that train, see bug 1120356, comment 2.
[User impact if declined]:
Data & measurement insight projects delayed or blocked with direct impact on projects depending on this.
[Describe test coverage new/current, TreeHerder]:
We have good automation coverage of the feature.
We also had manual tests of the main tasks as well as confirmation of correct behavior on Nightly for the patches here.
[Risks and why]:
Low-risk - these patches are rather isolated to Telemetry and have been on Nightly for a while with no bad reports.
We intend to track on-going data quality and other issues during the 40 aurora & beta and flip the new behavior off when it doesn't meet the requirements.
[String/UUID change made/needed]:
The only string changes were to the about:telemetry page.
We decided that we can live with missing translations on that page for a cycle as that page is not exactly user-facing.
Attachment #8608313 - Flags: approval-mozilla-aurora?
Comment on attachment 8610535 [details] [diff] [review]
bug1161515-test_TelemetrySession.js-extended_test_coverage.patch v1 r+

Approval Request Comment
[Feature/regressing bug #]:
Unified Telemetry, https://wiki.mozilla.org/Unified_Telemetry
This is part of the first (main) batch of uplifts to 40 to enable shipping on that train, see bug 1120356, comment 2.
[User impact if declined]:
Data & measurement insight projects delayed or blocked with direct impact on projects depending on this.
[Describe test coverage new/current, TreeHerder]:
We have good automation coverage of the feature.
We also had manual tests of the main tasks as well as confirmation of correct behavior on Nightly for the patches here.
[Risks and why]:
Low-risk - these patches are rather isolated to Telemetry and have been on Nightly for a while with no bad reports.
We intend to track on-going data quality and other issues during the 40 aurora & beta and flip the new behavior off when it doesn't meet the requirements.
[String/UUID change made/needed]:
The only string changes were to the about:telemetry page.
We decided that we can live with missing translations on that page for a cycle as that page is not exactly user-facing.
Attachment #8610535 - Flags: approval-mozilla-aurora?
Comment on attachment 8610537 [details] [diff] [review]
bug1161515-reset_fakeGenerateUUID.patch v1 r+

Approval Request Comment
[Feature/regressing bug #]:
Unified Telemetry, https://wiki.mozilla.org/Unified_Telemetry
This is part of the first (main) batch of uplifts to 40 to enable shipping on that train, see bug 1120356, comment 2.
[User impact if declined]:
Data & measurement insight projects delayed or blocked with direct impact on projects depending on this.
[Describe test coverage new/current, TreeHerder]:
We have good automation coverage of the feature.
We also had manual tests of the main tasks as well as confirmation of correct behavior on Nightly for the patches here.
[Risks and why]:
Low-risk - these patches are rather isolated to Telemetry and have been on Nightly for a while with no bad reports.
We intend to track on-going data quality and other issues during the 40 aurora & beta and flip the new behavior off when it doesn't meet the requirements.
[String/UUID change made/needed]:
The only string changes were to the about:telemetry page.
We decided that we can live with missing translations on that page for a cycle as that page is not exactly user-facing.
Attachment #8610537 - Flags: approval-mozilla-aurora?
Comment on attachment 8610537 [details] [diff] [review]
bug1161515-reset_fakeGenerateUUID.patch v1 r+

Unified telemetry is an important new feature. It is blocking some other projects. Taking it.
Attachment #8610537 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8610535 [details] [diff] [review]
bug1161515-test_TelemetrySession.js-extended_test_coverage.patch v1 r+

Unified telemetry is an important new feature. It is blocking some other projects. Taking it.
Attachment #8610535 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8608313 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.