Closed Bug 786788 Opened 12 years ago Closed 11 years ago

Distribution Prediction, Build Migration and Early/New Adopters

Categories

(Toolkit :: Telemetry, defect)

normal

Tracking


RESOLVED WONTFIX

People

(Reporter: joy, Assigned: froydnj)

References

Details

(Whiteboard: [leave open])

Attachments

(5 files, 8 obsolete files)

2.25 KB, patch
vladan
: review+
Details | Diff | Splinter Review
2.97 KB, patch
vladan
: review+
Details | Diff | Splinter Review
3.96 KB, patch
Details | Diff | Splinter Review
4.87 KB, patch
Details | Diff | Splinter Review
8.49 KB, patch
Details | Diff | Splinter Review
A bug (see ) requested that:

(1) We be able to 'predict the distribution' of measurements based on a certain
number of days of data. (See, related to (1) of
https://metrics.mozilla.com/projects/browse/METRICS-995, though to be honest I
don't understand this comment; Taras can chime in here.)

(2) A comment hypothesizes that early adopters submit different measurements
compared to later adopters. (See (2) of
https://metrics.mozilla.com/projects/browse/METRICS-995.)

(3) We understand the dynamics of migration from buildid to buildid.
(See https://bugzilla.mozilla.org/show_bug.cgi?id=765010.)

To answer these questions, we did some rough analysis.

Data collected:
- all Nightly builds between 20120702 and 20120801

- For a given buildid, 99% of the submissions come within 21 days of the release of the next
  buildid. Within the first 14 days this is about 96%.

We then collected 15 days of data for 91 buildids, i.e. for each buildid we collected 15 days of
data starting from the day of the buildid. Thus every buildid has 15 days of data.

- The percentage picked up depends on when the build went out: if the build goes out on a Saturday,
  a larger percentage is picked up within 3 days than for a build released on a Friday. This is
  expected, since Friday builds hit the weekend. As a result we should collect 7 days of data;
  depending on the day of release, 7 days picks up about 90% of the data.
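The cumulative percentages above (96% within 14 days, 99% within 21 days, about 90% within 7 days) can be computed with a helper like the following. This is an illustrative sketch, not the actual analysis code; the function name and the sample offsets are made up.

```javascript
// Given submission offsets (days elapsed since the buildid's release),
// return the percentage of submissions that arrived within `days` days.
// Illustrative sketch; the offsets below are made-up sample data.
function percentWithinDays(offsets, days) {
  if (offsets.length === 0) {
    return 0;
  }
  let within = offsets.filter(d => d <= days).length;
  return 100 * within / offsets.length;
}

// Most submissions arrive quickly; a long tail trickles in.
let offsets = [0, 1, 1, 2, 3, 3, 5, 6, 9, 20];
console.log(percentWithinDays(offsets, 7)); // 80
```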


* How does a measurement change based on the number of days of data collected?

We looked at EVENTLOOP_UI_LAG_EXP_MS, comparing its distribution based on the first 3 days, days
4-7, and days 8-11. For this measurement there is *no difference* between the first 3 days, days
4-7, etc.

Keep in mind, data returned in the first 3 days could possibly be from 'early adopters'; by the 4th
day they might have moved on to new builds, hence the data in days 4-7 would be from slower
adopters. However, this need not be the case.

Since the distribution for EVENTLOOP_UI_LAG_EXP_MS does not change, the distribution based on 7
days is the predicted distribution! However, there is no guarantee that this is the case for other
measures (especially ones that depend on the dynamics of the internet).

For example, for DISK_CACHE_CORRUPT_DETAILS, a categorical variable, the distribution for days 4-7
*is* different from the distribution for the first 3 days.

A priori, it is difficult to say whether a variable's distribution will change with the number of days.

** Summary for (1)
So to answer (1), 'predict the distribution': it is not always possible, and ideally we should wait
for 7 days of data (for a given buildid), since that captures about 90% of the data submitted.
Waiting another 14 days for the remaining 10% is probably not worthwhile if we assume that the
extra 10% will not drastically change the distribution.

** Summary for (2) and (3)


In some sense, partitioning the data into the first 3 days, days 4-7, and days 8-11 delineates the
early adopters (the first 3 days) from later adopters (days 4-7). However, people can submit data
in all three periods, or just the first one, or even only the second one (they could have moved
from an old build, in which case they are late adopters). The best inference we can make is: if the
submission date is far removed from the buildid, then the packet is from a late adopter.

However, this doesn't provide any help towards migration: how many users moved from one build to
another, and in how many days. For this I recommend adding a "LastBuildID" field to the packet.
This field is populated if the last buildid is different from json$info$appBuildID; otherwise it is null.

This gives us an idea of how many packets have migrated from one buildid to another.

This still doesn't tell us anything about unique users, so we can add one more field,
"LastSubmissionDate" or "DaysSinceLastSubmission". This tells us how many unique users are on a
given build.

Using these two we can provide migration curves and early/new adopter analysis.
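A minimal sketch, in JavaScript, of how these two fields might be populated on a ping payload. The helper and the shape of `prev` are hypothetical (not the landed patch); the rule is the one stated above: LastBuildID is set only when the previous ping's build differs, otherwise it is absent.

```javascript
// Hypothetical sketch of populating the proposed fields on a ping payload.
// `prev` summarizes the previous ping; null for the very first ping.
function annotatePayload(payload, prev) {
  if (prev) {
    // YYYYMMDD date of the previous submission.
    payload.LastSubmissionDate = prev.submissionDate;
    // Only set when the build actually changed.
    if (prev.appBuildID !== payload.appBuildID) {
      payload.LastBuildID = prev.appBuildID;
    }
  }
  return payload;
}

let ping = annotatePayload(
  { appBuildID: "20120801" },
  { appBuildID: "20120725", submissionDate: "20120730" }
);
console.log(ping.LastBuildID); // "20120725": a build migration
```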
> another in how many days. For this I recommend: add "LastBuildID" field to
> the packet.

> This still doesn't tell us anything about unique users. So we can add one
> more field "LastSubmissionDate" or "DaysSinceLastSubmission". This tells us
> how many unique users on a given build.

Both of these are privacy-preserving. Adding them cannot infringe privacy and can only benefit analyses.

Cheers
Assignee: nobody → nfroyd
OS: Mac OS X → All
Hardware: x86 → All
Just to clarify, you want LastBuildID and LastSubmissionDate only in the first ping from a given session, correct?  So if we land this feature, the first several pings from an updated client would look like:

session ID X, ping 1: no LastBuildID, no LastSubmissionDate
session ID X, ping 2: no LastBuildID, no LastSubmissionDate
...
session ID Y, ping 1: LastBuildID present, LastSubmissionDate present
session ID Y, ping 2: no LastBuildID, no LastSubmissionDate
...
session ID Z, ping 1: LastBuildID present, LastSubmissionDate present
...
Thanks for spending time on this. My comments:

> session ID X, ping 1: no LastBuildID, no LastSubmissionDate

Correct. The feature has landed and this is the first ping subsequent to that. We have no idea of LastSubmissionDate, so this is empty.
Similarly, we have no idea of LastBuildID, so this too is empty.



> session ID X, ping 2: no LastBuildID, no LastSubmissionDate

Now, ping 2 can retrieve information for LastSubmissionDate (i.e. the date of submission (YYYYMMDD) of ping 1), so LastSubmissionDate := date of submission of ping 1.
LastBuildID is only filled if appBuildID != buildID of the previous ping; since this is the same session X, I am assuming appBuildID of ping 2 == appBuildID of ping 1.


...
> session ID Y, ping 1: LastBuildID present, LastSubmissionDate present
Yes 

> session ID Y, ping 2: no LastBuildID, no LastSubmissionDate
Same logic as (session ID X, ping 2): I guess appBuildID cannot change for the same session, but LastSubmissionDate := date of submission of session ID Y, ping 1.


...
> session ID Z, ping 1: LastBuildID present, LastSubmissionDate present

Yes.

Does that sound right?
Hello,

May I know the status of this?
I haven't touched this bug due to constraints elsewhere.  I don't think it's difficult; I've just had other priorities the last two weeks.
I am going to want privacy folks to look at this, though.  Sharing data across sessions as described in comment 3 is enough for me to want an expert to look at this.

What is this giving us that Firefox Health Report does not?
1. FHR does not carry any telemetry data.

2. The only things shared are the last buildid and last submission date. This does not necessarily link packet A to packet B unless there is exactly one packet with the indicated last buildid and last submission date. Even then, one only links the last two packets, and one already has the telemetry data anyway.
:geekboy, this looks fine to me, but maybe I'm missing something. Could you weigh in please?
Flags: needinfo?(sstamm)
Looks fine to me.  Double-checking with Tom.
Flags: needinfo?(sstamm) → needinfo?(tom)
I think that this change is consistent with the privacy statements that we make regarding Telemetry. I do not think that this behavior would violate the privacy expectations of a user who has Telemetry turned on.

However, I'd like to triple check with... just kidding: I think we're good here.
Flags: needinfo?(tom)
Sid and Tom have both signed off, removing p-r-n.
One question about sending LastSubmittedDate timestamps: we're going to run into a situation like:

session X, ping N: sent at time T
<session X shuts down, saves ping N+1 with LastSubmittedDate of T>
<saves LastSubmittedDate of T somewhere else as well>
session Y, ping 1: sent with LastSubmittedDate of T
session Y, sending saved pings: sends session X, ping N+1 with LastSubmittedDate of T

so we're winding up with two pings that both have the same LastSubmittedDate.  Is that going to cause problems for whatever analyses are being run?

The easiest way out of this is to send the first ping from session Y with whatever time session X's N+1 ping was saved at, but that's not quite right for the purposes of analysis either.
Flags: needinfo?(sguha)
This is a cleanup so that it's easier to tell when we're sending current
session pings.  And it's just a nice cleanup in general.
Attachment #690500 - Flags: review?(vdjeric)
Just what it says on the tin.
Attachment #690501 - Flags: review?(vdjeric)
We're going to write out the values for lastSubmittedDate and lastSubmittedBuildID
as a JSON object; we might as well have a generic function that takes care of
all the grotty details.
Attachment #690503 - Flags: review?(vdjeric)
...and finally, what we've all been waiting for.
Attachment #690504 - Flags: review?(vdjeric)
Of course the object destructuring syntax doesn't work when assigning to this.FOO.
Attachment #690503 - Attachment is obsolete: true
Attachment #690503 - Flags: review?(vdjeric)
Attachment #690554 - Flags: review?(vdjeric)
Attachment #690504 - Attachment is obsolete: true
Attachment #690504 - Flags: review?(vdjeric)
Attachment #690555 - Flags: review?(vdjeric)
(In reply to Nathan Froyd (:froydnj) from comment #12)
> One question about sending LastSubmittedDate timestamps: we're going to run
> into a situation like:
> 
> session X, ping N: sent at time T
> <session X shuts down, saves ping N+1 with LastSubmittedDate of T>
> <saves LastSubmittedDate of T somewhere else as well>
> session Y, ping 1: sent with LastSubmittedDate of T
> session Y, sending saved pings: sends session X, ping N+1 with
> LastSubmittedDate of T
> 
> so we're winding up with two pings that both have the same
> LastSubmittedDate.  Is that going to cause problems for whatever analyses
> are being run?
> 
> The easiest way out of this is to send the first ping from session Y with
> whatever time session X's N+1 ping was saved at, but that's not quite right
> for the purposes of analysis either.

Yes, I had realized this earlier but hadn't gotten around to commenting on it.

The uses of this data are:
1) Typical inter-arrival time of the session pings: how often is the browser being used.
2) Typical time since the last build version: the dynamics of the shift from build to build (there have been concerns about 'slow adopters').

So,

LastSubmissionDate := date of the last idle-daily ping submission
LastBuild :=  The last buildid (present if not equal to current build id)

So for 
Caveats:
I should point out two oversights in comment [1]:

1. LastSubmissionDate uses the same theory as Days Since Last Ping (see [1] and [2]). However, though the theory was good, there have been issues counting the "unique number of users" (see [3]), so getting the correct unique pings might or might not happen.
2. LastBuild can be used to tag sessions as coming from fast migrators or not.



One last request: do you think it's possible to include in every ping

TotalNumberOfSubmittedSessionsOnThisBuild

This count is inclusive of the current ping.

Why: if the last build is old, we might think this session comes from an installation with infrequent use. TotalNumberOfSubmittedSessionsOnThisBuild indicates activity on this build. This corresponds to totalPingCount of [4].

Use:
a) Only to segment/profile sessions (based on histograms/info vars) according to whether they come from actively used installations or not.
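A minimal sketch of the proposed per-build counter, under the semantics described above (reset when the build changes, inclusive of the current ping). The function name and state layout are illustrative, not from any patch.

```javascript
// Hypothetical sketch of TotalNumberOfSubmittedSessionsOnThisBuild:
// a per-build counter that resets when the build changes and is
// inclusive of the current ping. The state layout is illustrative.
function nextSessionCount(state, appBuildID) {
  if (state.buildID !== appBuildID) {
    state.buildID = appBuildID;
    state.count = 0;
  }
  state.count += 1; // inclusive of the current ping
  return state.count;
}

let counter = { buildID: null, count: 0 };
console.log(nextSessionCount(counter, "20120801")); // 1
console.log(nextSessionCount(counter, "20120801")); // 2
console.log(nextSessionCount(counter, "20120802")); // 1 (new build)
```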

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=616835
[2] https://blog.mozilla.org/metrics/2011/04/13/using-the-new-days-last-ping-metric-to-look-at-firefox-4-downloads/
[3] https://bugzilla.mozilla.org/show_bug.cgi?id=677617
[4] https://bugzilla.mozilla.org/show_bug.cgi?id=620837
Flags: needinfo?(sguha)
(In reply to Saptarshi Guha from comment #20)
> (In reply to Nathan Froyd (:froydnj) from comment #12)
> > so we're winding up with two pings that both have the same
> > LastSubmittedDate.  Is that going to cause problems for whatever analyses
> > are being run?
> 
> Yes I had realized this earlier but hadn't gotten around to commenting on it.

OK, I think your comment suggests it's OK to have two pings with the same LastSubmissionDate.

Though I'm not sure about:

> The uses of this data is to
> 1) Typical inter-arrival time of the session pings: how often is the browser
> being used
> 2) Typical time since last build version - dynamics of shift from build to
> build (there have been concerns about 'slow adopters')
> 
> So,
> 
> LastSubmissionDate := date of the last idle-daily ping submission
> LastBuild :=  The last buildid (present if not equal to current build id)
> 
> So for 
> Caveats:

It looks like this bit got cut off, so I'm not sure...

> One last request, do you think it's possible to include in every ping
> 
> TotalNumberOfSubmittedSessionsOnThisBuild
> 
> this count is inclusive of the current ping.

That's pretty easy to add.  I'll add that as another patch.
> OK, I think your comment suggests it's OK to have two pings with the same
> LastSubmissionDate.
> 

Yes; basically a day has at most one idle-daily, and it is tagged with characteristics of the last idle-daily.




> Though I'm not sure about:
> 
> > The uses of this data is to
> > 1) Typical inter-arrival time of the session pings: how often is the browser
> > being used
> > 2) Typical time since last build version - dynamics of shift from build to
> > build (there have been concerns about 'slow adopters')
> > 
> > So,

I just wanted to clarify what we can and cannot do with these two metrics.

lastBuild
----------

If we do a date subtraction, i.e. PingSubmissionDate - lastBuildConvertedtoDate, then the histograms can be profiled by a rough indicator of how fast an installation moves from build to build. If this difference is large, then the submission could have come from a slow adopter. Now, this is not entirely true, because the large difference might be an outlier in the "slow adopter"'s history...
(FHR contains this history)
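As a sketch of the date subtraction described here (assuming Nightly buildids begin with the build date as YYYYMMDD; the helper name is hypothetical):

```javascript
// Sketch of the date subtraction above: take the leading YYYYMMDD of a
// buildid (Nightly buildids start with the build date) and of a
// submission date, and compute the gap in whole days.
function daysBetween(submissionYYYYMMDD, buildid) {
  function toUTC(s) {
    return Date.UTC(+s.slice(0, 4), +s.slice(4, 6) - 1, +s.slice(6, 8));
  }
  const MS_PER_DAY = 24 * 60 * 60 * 1000;
  return (toUTC(submissionYYYYMMDD) - toUTC(buildid)) / MS_PER_DAY;
}

// A submission on 2012-07-15 against build 20120702030542: 13 days.
console.log(daysBetween("20120715", "20120702030542")); // 13
```

A large result would suggest (roughly) a slower adopter, with the outlier caveat noted above.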

lastSubmissionDate
------------------
lastSubmissionDate can be used to give a snapshot of the activity of the installation that sent this session ping. If the last submission date was a day ago, the installation was used yesterday.... Theoretically it can be used to count the unique number of session pings, but my references above discuss some unexplainable glitches in the engineering.
 

Hope that helps, and thanks again for your time on this and on TotalNumberOfSubmittedSessionsOnThisBuild.
By the way, just wanted to clarify:

TotalNumberOfSubmittedSessionsOnThisBuild is the total number of sessions submitted (:= saved_sessions + idle_daily) for that build.
Comment on attachment 690500 [details] [diff] [review]
part 1 - let the uuid always be the slug and store the reason separately

What was the original motivation for the slug being set to the ping reason in test pings? To prevent test pings from accidentally getting submitted to Telemetry?
Attachment #690500 - Flags: review?(vdjeric) → review+
Attachment #690554 - Flags: review?(vdjeric) → review+
Comment on attachment 690555 [details] [diff] [review]
part 4 - persist lastSubmittedDate and lastSubmittedBuildID to files

>+  loadLastSubmittedValues: function loadLastSubmittedValues() {
>+    let file = this.lastSubmittedValuesFile();
>+    let channel = NetUtil.newChannel(file);
>+    channel.contentType = "application/json";
>+
>+    NetUtil.asyncFetch(channel, function(stream, result) {

Couldn't we just read these two values from the most recent savedPing?

>+        let v = JSON.parse(string);

Nit: use more descriptive variable name
Attachment #690555 - Flags: review?(vdjeric)
Comment on attachment 690501 [details] [diff] [review]
part 2 - send lastSubmittedDate and lastAppBuildID in telemetry pings

>+  // The appBuildID with which we successfully sent our last ping.
>+  _lastAppBuildID: null,
..
>     function onSuccess() {
>+      if (data.slug == this._uuid) {
>+        this._lastPingDate = new Date();
>+        this._lastAppBuildID = Services.appinfo.appBuildID;
>+      }
>       this.sendPingsFromIterator(server, reason, i);
>     }
..
>+    if (this._lastAppBuildID) {
>+      payloadObj.simpleMeasurements.lastSubmittedBuildID = this._lastAppBuildID;
>+    }
>+

Wouldn't this patch result in the lastSubmittedBuildId field being set on the 2nd ping from the same session? I thought Saptarshi only wanted that field set when the previous ping was from a different build?
Attachment #690501 - Flags: review?(vdjeric)
Comment on attachment 690502 [details] [diff] [review]
part 2.5 - add tests for lastSubmittedDate and lastSubmittedBuildID

Are we implementing the "TotalNumberOfSubmittedSessionsOnThisBuild" field in this bug? 

Also, isn't it potentially de-anonymizing to include that field, since there is now a counter linking a user's sessions?

e.g. If a user restarts an old build of Firefox an unusually high number of times (let's say 200), it's now possible to link his browsing sessions to each other and figure out their order. His sessions will have the same build ID + this new field incrementing sequentially
Attachment #690502 - Flags: review?(vdjeric)
> Wouldn't this patch result in the lastSubmittedBuildId field being set on
> the 2nd ping from the same session? I thought Saptarshi only wanted that
> field set when the previous ping was from a different build?

It doesn't matter if it is present. I only suggested that to save space, but if you're okay with setting it, I don't mind.

If lastSubmittedBuildID == appBuildID, then all we know is that this session and the last session are from the same build.
Keep in mind telemetry submissions can already be fingerprinted using addon combinations, OS, arch, and graphics adapters. So it is possible to use addon combination, OS, arch, graphics adapters, and submission dates together to chain these. Our intern Bing Han gave an intern presentation using this method for Crash Reports (without using the stack info).

Also, the blocklist ping has the total pings sent on that version, so the argument carries over to the blocklist ping too: one could possibly chain those pings.

So yes, this can be used to devise algorithms to create a fingerprint, but algorithms to do that already exist with the data we have. Do keep in mind, we are tracking a collection of metrics (OS, build, arch, ...) and not a person. There is no data to tie this back to a person or even a device.

Also, none of these algorithms (even with this count) is perfect; very, very far from it. We can't create a reliable fingerprinting scheme (i.e. every algorithm will cause sessions from different installations to be identified as the same installation, and the other way around too).


> Also, isn't it potentially de-anonymizing to include that field, since there
> is now a counter linking a user's sessions?
> 
> e.g. If a user restarts an old build of Firefox an unusually high number of
> times (let's say 200), it's now possible to link his browsing sessions to
> each other and figure out their order. His sessions will have the same build
> ID + this new field incrementing sequentially
(In reply to Vladan Djeric (:vladan) from comment #24)
> What was the original motivation for the slug being set to the ping reason
> in test pings? To prevent test pings from accidentally getting submitted to
> Telemetry?

I think that's part of it.  It also makes the implementation<->test interface slightly simpler if you're always sending to a fixed URL.  (Otherwise the test has to dig out whatever the UUID is going to be--or maybe it's possible to register a catch-all HTTP handler...dunno.)

(In reply to Vladan Djeric (:vladan) from comment #26)
> >+  // The appBuildID with which we successfully sent our last ping.
> >+  _lastAppBuildID: null,
> ..
> >     function onSuccess() {
> >+      if (data.slug == this._uuid) {
> >+        this._lastPingDate = new Date();
> >+        this._lastAppBuildID = Services.appinfo.appBuildID;
> >+      }
> >       this.sendPingsFromIterator(server, reason, i);
> >     }
> ..
> >+    if (this._lastAppBuildID) {
> >+      payloadObj.simpleMeasurements.lastSubmittedBuildID = this._lastAppBuildID;
> >+    }
> >+
> 
> Wouldn't this patch result in the lastSubmittedBuildId field being set on
> the 2nd ping from the same session? I thought Saptarshi only wanted that
> field set when the previous ping was from a different build?

Mmm.  I see.  Yes, this logic needs to be twiddled a bit.

(In reply to Vladan Djeric (:vladan) from comment #25)
> Comment on attachment 690555 [details] [diff] [review]
> part 4 - persist lastSubmittedDate and lastSubmittedBuildID to files
> 
> >+  loadLastSubmittedValues: function loadLastSubmittedValues() {
> >+    let file = this.lastSubmittedValuesFile();
> >+    let channel = NetUtil.newChannel(file);
> >+    channel.contentType = "application/json";
> >+
> >+    NetUtil.asyncFetch(channel, function(stream, result) {
> 
> Couldn't we just read these two values from the most recent savedPing?

We don't know which saved ping that is: there could be multiple pings that we read in.  We could use timestamps of the ping files to determine what's the most recent, but that gets messy, IMHO (and possibly incorrect).

> >+        let v = JSON.parse(string);
> 
> Nit: use more descriptive variable name

OK.

(In reply to Vladan Djeric (:vladan) from comment #27)
> Are we implementing the "TotalNumberOfSubmittedSessionsOnThisBuild" field in
> this bug? 

Sure.  I thought that could be an easy follow-on patch, but if you want me to roll it into the current patch series, I can do that.

> Also, isn't it potentially de-anonymizing to include that field, since there
> is now a counter linking a user's sessions?

I think Saptarshi's rebuttal of this question is correct.  Also, if we wanted a monotonically increasing ID for sessions, we can already get that from the server logs.  So I don't think there's any harm in including it.  But, in the spirit of this bug, let's double-check with Tom. ;)
Flags: needinfo?(tom)
I think I got the logic correct now.
Attachment #690501 - Attachment is obsolete: true
Attachment #691953 - Flags: review?(vdjeric)
Tests needed to be updated, since we're not sending lastSubmittedBuildID anymore.
Attachment #690502 - Attachment is obsolete: true
Attachment #691954 - Flags: review?(vdjeric)
Variable renamed.  Everything else stays the same because I think there are
good reasons for keeping things as a separate file, as mentioned already.
Attachment #690555 - Attachment is obsolete: true
Attachment #691958 - Flags: review?(vdjeric)
I think that Telemetry is plenty fingerprintable as it is, so "TotalNumberOfSubmittedSessionsOnThisBuild" (or "TNOSSOTB" as I like to call it, for short) doesn't make a material difference. We could, in principle, reliably assess whether two Telemetry payloads are from the same source with or without TNOSSOTB, and our reliability only gets a *little* bit better with it. Besides, we're not trying to re-identify particular users through their Telemetry, and we're the only ones with this data, so that risk is pretty darn low.

tl;dr: little marginal risk, consistent with the spirit of the feature, doesn't violate any user's expectations. Go for it.
Flags: needinfo?(tom)
Comment on attachment 691953 [details] [diff] [review]
part 2 - send lastSubmittedDate and lastSubmittedBuildID in telemetry pings

>+  // The last Date on which we successfully sent a ping.
>+  _lastPingDate: null,
>+  // The appBuildID with which we successfully sent our last ping.
>+  _lastSubmittedBuildID: null,
.. 
>+    if (this._lastPingDate) {
>+      payloadObj.simpleMeasurements.lastSubmittedDate = this._lastPingDate.getTime();

getTime returns epoch time in milliseconds instead of a calendar date. This clashes with the "lastSubmittedDate" field name, and it causes us to include unnecessary precision. I'm aware I'm being a stickler ;)

My concerns about the counter-like behavior of the "TNOSSOTB" field arose under the assumption that we didn't already keep server logs that record submission times for individual pings.
Attachment #691953 - Flags: review?(vdjeric) → review-
Comment on attachment 691954 [details] [diff] [review]
part 2.5 - add tests for lastSubmittedDate and lastSubmittedBuildID

>--- a/toolkit/components/telemetry/tests/unit/test_TelemetryPing.js
>+++ b/toolkit/components/telemetry/tests/unit/test_TelemetryPing.js
>@@ -22,16 +22,17 @@ const PATH = "/submit/telemetry/test-ping";
> const SERVER = "http://localhost:4444";
> const IGNORE_HISTOGRAM = "test::ignore_me";
> const IGNORE_HISTOGRAM_TO_CLONE = "MEMORY_HEAP_ALLOCATED";
> const IGNORE_CLONED_HISTOGRAM = "test::ignore_me_also";
> const ADDON_NAME = "Telemetry test addon";
> const ADDON_HISTOGRAM = "addon-histogram";
> const FLASH_VERSION = "1.1.1.1";
> const SHUTDOWN_TIME = 10000;
>+const APP_BUILD_ID = "2007010102";

What's the purpose of this global var now?

By the way, we could also add a test that confirms the lastSubmittedBuildId field is not present.
Attachment #691954 - Flags: review?(vdjeric)
Comment on attachment 691958 [details] [diff] [review]
part 4 - persist lastSubmittedDate and lastSubmittedBuildID to files

>+  loadLastSubmittedValues: function loadLastSubmittedValues() {
>+    let file = this.lastSubmittedValuesFile();
>+    let channel = NetUtil.newChannel(file);
>+    channel.contentType = "application/json";
>+
>+    NetUtil.asyncFetch(channel, function(stream, result) {

Nit: rename to asyncLoadLastSubmittedValues
Attachment #691958 - Flags: review?(vdjeric) → review+
By the way, if you add tests that check values read from TelemetryLastSubmittedValues.txt (which I think would be a good idea), you will have to change your interfaces to take a callback function (see bug 815709).. Otherwise you risk creating an intermittent orange from tests checking the payload for fields that haven't yet been asynchronously fetched from disk.

Similarly, this patch also introduces the (very unlikely) possibility of "incorrect" field values in about:telemetry, but I don't think that's a big deal.
I would like to confirm my understanding of LastSubmissionDate.

Day 0: Installation had exactly one session, lasting 3 hrs. Idle-daily sent.

Day 1:
- One idle-daily is sent; no saved sessions are sent because none exist (from Day 0). (lastsubmissiondate == Day 0)
- Installation had 5 sessions on this day.

Day 2:
- One idle-daily sent (lastsubmissiondate == Day 1).
- 5 saved sessions sent (lastsubmissiondate == Day 0).

See how the last submission date for the 5 saved sessions sent on Day 2 is Day 0: that's because the saved sessions being sent on Day 2 actually occurred on Day 1.

This sounds reasonable, right? Is this how it's implemented?
That sounds reasonable and I believe this is how the code is implemented. Nathan can confirm
Flags: needinfo?(nfroyd)
I am unclear why there are no saved sessions sent on Day 1, unless you meant that the Day 0 session carried over into Day 1.  I'm also not clear on how the timing of the sessions on Day 1 and Day 2 works.  Nevertheless, here's an answer to the question that I think you're trying to ask:

The lastsubmissiondate for the 5 saved sessions on Day1 depends on when they were saved relative to the idle daily ping that you say occurred on Day1.  So if you had something like this on Day1:

- session with idle-daily
- saved-session 1
- saved-session 2
- saved-session 3
- saved-session 4
- saved-session 5

then those saved-session pings would reflect that the lastsubmissiondate was on Day1.  However, if you had this pattern instead:

- saved-session 1
- saved-session 2
- saved-session 3
- session with idle-daily
- saved-session 4
- saved-session 5

then the first three saved-session pings would have a lastsubmissiondate of Day0, while the last two saved-session pings would have a lastsubmissiondate of Day1.

Does that answer your question?
Flags: needinfo?(nfroyd)
(In reply to Nathan Froyd (:froydnj) from comment #41)
> I am unclear why there's no saved sessions sent on Day1, unless you meant

My mistake. Day 1 indeed has a saved session corresponding to the 3 hr session of Day 0.


As for the rest, this implementation does match comment 20. However, I don't think I was precise. The date of the last idle-daily for saved-sessions is the date of the last idle-daily not on the same day as the current saved-session.

So for case 1:
date of last submission  = Day 0

And for case 2:
date of last submission = Day 0


The reason for this is:

lastsubmissiondate gives a rough idea (and a very rough one) of the activeness of the installation that generated a session. For case 1, sending Day 1 would not tell us whether the last day of use was 3 days ago or 1 day ago.

Moreover, on the server, case 1 and case 2 look the same; the submission dates are at the DDMMYYYY level, so any ordering looks the same.

Cheers
Sapsi
(In reply to Saptarshi Guha from comment #42)
> As for the rest, this implementation does match comment 20. However i dont
> think i was precise. The date of the last idle-daily for saved-sessions is
> the date of the last idle-daily not on the same day as the current
> saved-session.
> 
> So for case 1:
> date of last submission  = Day 0
> 
> And for case 2:
> date of last submission = day 0
> 
> The reason for this is:
> 
> the lastsubmissiondate gives a rough idea (and very rough) of the activeness
> of the installation that generated a session. For case 1, sending the Day 1
> would not tell us anything if the last day of use was 3 days ago or 1 day
> ago.

Sorry, I'm trying to fit all the discussion we were having back into my head today.  So for:

Day 0 (the first session after an updated binary with this bug fixed has been installed)

session A: idle-daily ping
session B: no ping, saved-session
session C: no ping, saved-session

Day 1:

session D: no ping, saved-session
session E: no ping, saved-session

Day 2:

session F: no ping, saved-session
session G: idle-daily ping, saved-session

Day 3:

session H: no ping, saved session

Day 4:

session I: idle-daily, carries over to the next day

Day 5:

session I (continued): idle-daily, saved session
session J: no ping, saved-session

Day 6:

session K: no ping, saved-session

Apologies for the length of the example; I wanted to try to get all the bases covered.  Let's enumerate all the values that lastsubmissiondate (hereafter LSD) will take on:

Day 0:

session A: no LSD
session B: no LSD (saved-session of the same day as idle-daily)
session C: no LSD (likewise)

Day 1:

session D: LSD = Day 0 (from session A)
session E: LSD = Day 0 (likewise)

Day 2:

session F: LSD = Day 0 (likewise)
session G: LSD = Day 0 (likewise)

Day 3:

session H: LSD = Day 2 (from session G)

Day 4:

session I: LSD = Day 2 (likewise)

Day 5:

session I (continued): LSD = Day 2 (likewise)
session J: LSD = Day 4 (from the start of session I)

Day 6:

session K: LSD = Day 5 (from the end of session I)

Are all these correct?
Flags: needinfo?(sguha)
Hello,

Thanks for this detailed example. I guess I assumed that idle-daily would be sent every day an installation was active; however, that is not the case. As can be seen from Day 1, no idle-daily was sent despite the installation being active. Hence on Day 2, LSD is Day 0, though in some sense we would like it to be Day 1.

*Objective*: every submitted session has a DDMMYYYY (hereon called a date) that identifies the **day** of last use of that installation (last active date = LAD).

Implementation: the same for both SS and ID. Every session, S, has a start time, say t1 (DDMMYYYY), and an end time, t2.
We assume no two sessions for a profile can overlap.

Case 1.
Assume t1 and t2 both have the same DDMMYYYY.
Then the LAD is the last day a session occurred (it need not have been submitted to the server). That is, find the most recent date with a session, excluding dates equal to this S's DDMMYYYY. Thus the LAD has to be strictly less than the DDMMYYYY of S's t1.

e.g. if on Monday there were two sessions and no ping was sent, and on Tuesday there were 4 sessions, one of which was an idle-daily that was sent, then the LADs for all of Tuesday's 4 sessions would correspond to Monday.

Case 2:
Assume the DDMMYYYY of t1 is day 1 and the DDMMYYYY of t2 is the following day.
Since the session started on day 1, the LAD for this session will be strictly less than day 1. Thus the last active date is defined with respect to the start time of a session.

So for the above example


Day 0:
No LAD for any of A, B, or C, because we haven't saved any last active dates yet; these are the first sessions with the fix in place.

session A: no LAD
session B: no LAD
session C: no LAD (likewise)

Day 1:

session D: LAD = Day 0 (because there was at least one session on Day 0)
session E: LAD = Day 0 (likewise)

Day 2:

session F: LAD = Day 1 (likewise)
session G: LAD = Day 1 (likewise)

Day 3:

session H: LAD = Day 2 

Day 4:

session I: LAD = Day 3 

Day 5:

session I (continued): LAD = Day 3 (I assume session I was not sent on Day 4, but was sent on Day 5?)
session J: LAD = Day 4 (because there was a session with a start date on Day 4)

Day 6:

session K: LAD = Day 5
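The LAD rule in the two cases above can be sketched as follows. This is a hypothetical illustration only, not the actual patch code; the session records and the dayOf() helper are invented for the example:

```javascript
// Hypothetical sketch of the LAD rule discussed above -- not the
// actual TelemetryPing code. Sessions are sorted by start time and
// assumed not to overlap.
function dayOf(ms) {
  // Collapse a millisecond timestamp to a whole-day index (UTC).
  return Math.floor(ms / 86400000);
}

// sessions: array of { start: msTimestamp }, sorted by start time.
// Returns, for each session, the start day of the most recent prior
// session that did not start on the same day (null if none exists).
function lastActiveDates(sessions) {
  let lads = [];
  for (let i = 0; i < sessions.length; i++) {
    let currentDay = dayOf(sessions[i].start);
    let lad = null;
    for (let j = i - 1; j >= 0; j--) {
      if (dayOf(sessions[j].start) < currentDay) {
        lad = dayOf(sessions[j].start);
        break;
      }
    }
    lads.push(lad);
  }
  return lads;
}
```

Running this on the session start days from the example (A through K) reproduces the LAD values listed above.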

Sorry for the confusion. My misunderstanding caused this needless mess. I hope this definition is simpler to understand and implement, and that I've been precise.


Cheers
Saptarshi
Flags: needinfo?(sguha)
(In reply to Saptarshi Guha from comment #44)
> *Objective*: every submitted session has a DDMMYYYY (hereon called date)
> that identifies the **day** of last use of that installation (last active
> date=LAD)
> 
> Implementation: same for both  SS and ID. Every session,S, has a start time,
> say t1 (DDMMYYYY) and end end time t2.
> We assume no two sessions for a profile can overlap.
> 
> Case 1. 
> Let's assume t1 and t2 both have the same DDMMYYYY
> Then the LAD is the last day a session occurred (need not have been
> submitted to server). That is find the most recent date (excluding the
> current one that is equal to this S's DDMMYYYY) with a session. Thus LAD has
> to be strictly less than DDMMYYYY of S's t1.
> 
> e.g. on monday, there were two sessions and no ping was sent, and on tuesday
> 4 sessions and one was idle-daily which was sent, all the LADs for tuesdays
> 4 sessions would correspond to Monday.
> 
> Case 2:
> Let's assume DDMMYYYY of t1 is on day 1 and DDMMYYYY of t2 is the following
> day.
> Since the session started on day 1, the LAD for this session will be
> strictly less than day 1. Thus the last active date is wrt to the start time
> of a session.

I think this is easier to implement, thanks for the clarifications.  If I'm not mistaken, these cases can be summed up as:

"The LAD for a session is the start date of the latest session prior to the current session that did not start on the same day as the current session."

Do you agree?  If we're in agreement, then applying the same rules to the build id, e.g.:

"The LastBuildId for a session is the build id of the latest session prior to the current session that did not start on the same day as the current session."

is OK?  I'm a little unclear on the rules for the build id because before this we were talking about the last submitted build id, but the above description says nothing about telemetry submissions for LAD and suggests that we apply the same rules to LBID ("Implementation: same for both  SS and ID...")

Just one little nit to point out:

> Day 4:
> 
> session I: LAD = Day 3 
> 
> Day 5:
> 
> session I (continued): LAD = Day 3 ( I assume session I was not sent on
> Day4, but was sent on Day 5?) 

You could have a ping sent in on day 4 and a ping sent in on day 5 from a single session.  Even so, I believe the answer you gave above is correct.
Flags: needinfo?(sguha)
> I think this is easier to implement, thanks for the clarifications.  If I'm
> not mistaken, these cases can be summed up as:
> 
> "The LAD for a session is the start date of the latest session prior to the
> current session that did not start on the same day as the current session."
> 
> Do you agree?

Yes. Let's change 'latest session' to 'most recent session'. Same thing either way; I think it's easier to parse, but that's just me, and I'm not fixated on the language.


>  If we're in agreement, then applying the same rules to the
> build id, e.g.:
> 
> "The LastBuildId for a session is the build id of the latest session prior
> to the current session that did not start on the same day as the current
> session."
> 
> is OK?  I'm a little unclear on the rules for the build id because before
> this we were talking about the last submitted build id, but the above
> description says nothing about telemetry submissions for LAD and suggests
> that we apply the same rules to LBID ("Implementation: same for both  SS and
> ID...")
> 

True, we hadn't discussed BuildID, but your definition makes sense.


By ("Implementation: same for both  SS and ID...") I was speaking in reference to the LAD implementation for saved-session (SS) and idle-daily (ID).



> Just one little nit to point out:
> 
> > Day 4:
> > 
> > session I: LAD = Day 3 
> > 
> > Day 5:
> > 
> > session I (continued): LAD = Day 3 ( I assume session I was not sent on
> > Day4, but was sent on Day 5?) 
> 
> You could have a ping sent in on day 4 and a ping sent in on day 5 from a
> single session.  Even so, I believe the answer you gave above is correct.

You mean a single session can send two pings? Will it have the same session id? Then it will overwrite itself in HBase.
Flags: needinfo?(sguha)
Hi,

Suppose that after this lands, ten weeks later when every running nightly Firefox has this fix, what will be the value of last buildid for a completely fresh nightly install? We can assume that telemetry is turned on.

Will it be missing? Will it be missing if and only if the installation is a fresh new download of nightly with no preexisting profile?
Reworked patch, which I believe implements the algorithms we have discussed.
Attachment #691953 - Attachment is obsolete: true
Attachment #706341 - Flags: review?(vdjeric)
Attachment #691958 - Attachment is obsolete: true
Attachment #706342 - Flags: review?(vdjeric)
Attachment #691954 - Attachment is obsolete: true
Attachment #706343 - Flags: review?(vdjeric)
Comment on attachment 706342 [details] [diff] [review]
part 4 - persist lastActive{SessionDate,BuildID} to files

>+  // Whether we have written the LastActiveValues file this session.
>+  _haveWrittenLastActiveValuesFile: false,

Nit: The name is long + the "active values file" term isn't very clear. How about "_isSessionInfoPersisted" and "loadLastSessionInfo()"?

>     function onSuccess() {
>+      if (data.slug == this._uuid &&
>+          !this._haveWrittenLastActiveValuesFile) {
>+        this.saveLastActiveValues();
>+      }
>       this.sendPingsFromIterator(server, reason, i);
>     }

So this would write out the session info only on idle-daily? Why not save session info when saving the current session's ping on shutdown?

>+    NetUtil.asyncFetch(channel, (function(stream, result) {
>+      if (!Components.isSuccessCode(result)) {
>+        return;
>+      }
>+      try {
>+        let string = NetUtil.readInputStreamToString(stream,
>+                                                     stream.available(),
>+                                                     { charset: "UTF-8" });
>+        stream.close();
>+        let obj = JSON.parse(string);
>+        this._lastActiveSessionDate = new Date(obj._lastActiveSessionDate);
>+        this._lastActiveBuildID = obj._lastActiveBuildID;
>+      } catch (e) {
>+        stream.close();
>+        file.remove(true);
>+      }
>+    }).bind(this));

IIUC, this would do main-thread I/O :(

>+    this.saveObjectToFile(obj, file, /*sync=*/true, /*overwrite=*/true,
>+                          (function(success, ostream) {
>+                            this._haveWrittenLastActiveValuesFile = true;
>+                            ostream.close();
>+                          }).bind(this));
>+  },

Where is saveObjectToFile defined?
Attachment #706342 - Flags: review?(vdjeric)
Comment on attachment 706343 [details] [diff] [review]
part 5 - add tests for lastActive{SessionDate,BuildID}

- It would be nice to have tests that check that lastActiveSessionDate and lastActiveBuildID ignore any builds & sessions from "today"
- Related to my earlier comment, could we trigger the writing of the session info file by calling nsITelemetryPing's saveHistograms()?
Attachment #706343 - Flags: review?(vdjeric)
Comment on attachment 706341 [details] [diff] [review]
part 2 - send lastActive{SessionDate,BuildID} in telemetry pings

LGTM, but see next comment
Attachment #706341 - Flags: review?(vdjeric)
I re-read the comments on this bug before doing the code review, and I realized I'm a little bit fuzzy on the motivations for this bug. I apologize for digressing and possibly causing us to re-hash earlier concerns, but I want to make sure our implementation will be able to answer the questions we want answered.

Back in comment 1, Saptarshi spoke of 3 things:

1) Taras(?) wanted the dash to predict the final distribution in a histogram for a given build ID. Saptarshi answered this in the same comment: "it is not always possible and ideally, we should wait for 7 days of data". This makes sense to me.

2) There was a hypothesis that early adopters report different Telemetry numbers than later adopters and Saptarshi showed that this is sometimes the case for some histograms.

Question #2a: Are we still trying to verify this hypothesis?
Question #2b: Are we trying to track Telemetry from early adopters separately from later adopters? If so, why? 

3) Comment 1 also talked about tracking adoption & desertion trends. If I understand correctly, FHR will provide us with this data as well. 

Question #3: Why is the pairing of adoption data with Telemetry data more valuable than just the FHR adoption data alone?

---------

Two other concerns:

In comment 22, Saptarshi said:
> if we do a date subtraction i.e. PingSubmissionDate - lastBuildConvertedtoDate
> then the histograms can be profiled by some rough indicator of rough fast
> installation moves from build to build. If this differences is large, then the
> submission could have come from a slow adopter.

Question #4: Assuming we want to distinguish early adopters from late adopters in Telemetry reports, we can write a patch that provides this info directly. i.e. we can have a field "isEarlyAdopter" that is true only if the current build was installed within 3 days of its release. Would this be preferable?

Also in comment 22, Saptarshi said:
> lastSubmissionDate can be used to give a snapshot the activity of the
> installation (that sent this session ping). If last submission date was a day 
> ago, the installation was used yesterday.

Question #5: I'm certain FHR provides this information already. Why combine this info with Telemetry pings?

Thank you for your patience :)
Flags: needinfo?(sguha)
(In reply to Vladan Djeric (:vladan) from comment #56)
> I re-read the comments on this bug before doing the code review, and I
> realized I'm a little bit fuzzy on the motivations for this bug. I apologize
> for digressing and possibly causing us to re-hash earlier concerns, but I
> want to make sure our implementation will be able to answer the questions we
> want answered.
> 
> Back in comment 1, Saptarshi spoke of 3 things:
> 
> 1) Taras(?) wanted the dash to predict the final distribution in a histogram
> for a given build ID. Saptarshi answered this in the same comment: "it is
> not always possible and ideally, we should wait for 7 days of data". This
> makes sense to me.
> 
> 2) There was a hypothesis that early adopters report different Telemetry
> numbers than later adopters and Saptarshi showed that this is sometimes the
> case for some histograms.

My proxy for early adopters was looking at all sessions running with
older buildids. The way it was done was fixing a day, looking at
sessions submitted on that day, and then partitioning by Diff = current day - buildid of the session packet.
The ones with a larger Diff were considered slow adopters. This is not a fixed definition;
it just means adoption is slower with a larger Diff.
However, the problem with this definition is that an entry in cell (current day, Diff)
could well be in a different cell on another day.

With lastdistinctbuildid, if an installation sends a session on two
different days and yet does not update, then we can still see the
/last/ build id for that session (not the current one). Based on the
difference between the last build id and the current build id, we can
define the slowness of adoption. This does not mean the installation
is a slow adopter in general (that is a broad label and needs long-term
study); just that this session came from an installation that had or
had not moved rapidly from one build to the current one.



>  Question #2a: Are we still trying to verify this hypothesis?

Yes, somewhat. I wouldn't say it is high priority, but if someone
wanted to look at the distribution of X controlling for installations
that rapidly moved to their current build, I would look at X
controlling for (current build - last build, in days).


> Question #2b: Are we trying to track Telemetry from early adopters
> separately from later adopters? If so, why?  

See above. I've had questions suggesting that we should not look
within 4 days but rather 7 days, etc. My analysis of 7 days is on the
safe side, because from the release of a given buildid, about 85%
(approx.) of all submissions on that build id arrive within 7 days.
If we were to always wait at least 7 days, then adding this is not
required. Nevertheless, the initial question was never properly answered.


> 3) Comment 1 also talked about tracking adoption & desertion trends. If I
> understand correctly, FHR will provide us with this data as well.

But FHR cannot be connected to telemetry in any way at all.


> Question #3: Why is the pairing of adoption data with Telemetry data more
> valuable than just the FHR adoption data alone?

Because they are different populations. I've had numerous questions about the
Telemetry adoption rate, or even how many unique active installations are on
telemetry.


> ---------
> 
> Two other concerns:
> 
> In comment 22, Saptarshi said:
> > if we do a date subtraction i.e. PingSubmissionDate - lastBuildConvertedtoDate
> > then the histograms can be profiled by some rough indicator of rough fast
> > installation moves from build to build. If this differences is large, then the
> > submission could have come from a slow adopter.
> 
> Question #4: Assuming we want to distinguish early adopters from late adopters
> in Telemetry reports, we can write a patch that provides this info directly.
> i.e. we can have a field "isEarlyAdopter" that is true only if the current
> build was installed within 3 days of its release. Would this be preferable?

I would hesitate to convert a variable that can take positive values into just a discrete 0/1 value, since the potential discriminating power of this difference is as yet unknown.
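For what it's worth, keeping the difference continuous is cheap: since nightly build ids begin with the build date, the gap in days falls out of a simple date subtraction. A hypothetical sketch (the function names are invented for illustration):

```javascript
// Hypothetical sketch: measure build-to-build adoption as a count of
// days rather than a 0/1 "isEarlyAdopter" flag. Nightly build ids
// begin with YYYYMMDDHHMMSS.
function buildIdToUTCDay(buildId) {
  let y = Number(buildId.slice(0, 4));
  let m = Number(buildId.slice(4, 6));
  let d = Number(buildId.slice(6, 8));
  return Date.UTC(y, m - 1, d); // milliseconds at midnight UTC
}

// Days between the current build and the previously seen build.
function buildGapDays(currentBuildId, lastBuildId) {
  return (buildIdToUTCDay(currentBuildId) - buildIdToUTCDay(lastBuildId)) /
         86400000;
}
```

Analysis can then bucket or threshold this gap later, rather than baking a 3-day cutoff into the client.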


> Also in comment 22, Saptarshi said:
> > lastSubmissionDate can be used to give a snapshot the activity of the
> > installation (that sent this session ping). If last submission date was a day
> > ago, the installation was used yesterday.
> 
> Question #5: I'm certain FHR provides this information already. Why combine
> this info with Telemetry pings?

FHR is a loooong way into the future. We can't study telemetry
measurements controlling for activity when the activity data is in FHR.

Also, for idle-daily sessions (once a day):

DaysSinceLastPing := current date - lastsubmissiondate

This can be used (though the current version in the blocklist ping appears
borked) to compute unique installations. See https://bugzilla.mozilla.org/show_bug.cgi?id=616835
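As a sketch of that use (purely illustrative, not the blocklist-ping code): with at most one idle-daily ping per installation per day, bucketing a day's pings by DaysSinceLastPing gives an activity snapshot for that day:

```javascript
// Hypothetical sketch (not actual server-side code): each installation
// sends at most one idle-daily ping per day, so the pings received on a
// given day can be bucketed by DaysSinceLastPing to profile activity.
function activitySnapshot(daysSinceLastPing) {
  let snapshot = {
    uniqueActive: daysSinceLastPing.length, // one ping per installation
    usedYesterday: 0,     // DaysSinceLastPing == 1
    returningAfterGap: 0, // gap of two or more days
    firstPing: 0,         // no prior submission recorded (value missing)
  };
  for (let d of daysSinceLastPing) {
    if (d === null) {
      snapshot.firstPing++;
    } else if (d === 1) {
      snapshot.usedYesterday++;
    } else {
      snapshot.returningAfterGap++;
    }
  }
  return snapshot;
}
```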
Flags: needinfo?(sguha)
> > Question #2b: Are we trying to track Telemetry from early adopters
> > separately from later adopters? If so, why?  
> 
> [..] My analysis of 7 days is on the safe
> side, because from the release of a given buildid, about 85% (approx)
> of all submission on that build id arrive within 7 days.  If we were
> to always wait at least 7 days, then adding this is not required. 

It doesn't seem too terrible to have to wait 7 days for a build's distribution to stabilize.

> Question #3:
> > Why is the pairing of adoption data with Telemetry data more
> > valuable than just the FHR adoption data alone? 
> 
> Because they are of different populations. I've had numerous questions about
> Telemetry adoption rate
> or even how many unique active installations on telemetry. 

Here's my thinking:

1) I think the Telemetry population will be a subset of the FHR population, because FHR is opt-out (vast majority of people will have it on by default) and Telemetry is opt-in (~1% of release-channel users will participate). I don't think there will be many Telemetry users who are not FHR users. We could verify this claim with a Telemetry field "isFHRenabled" if we like.

2) Since Telemetry population is a subset of the FHR population, we can get information about the Telemetry population from FHR data simply by adding an "isTelemetryUser" flag to FHR pings. I filed bug 837292 for this. Then we would know things such as the Telemetry opt-in rate, unique active installations, the build adoption/desertion curves, etc.

3) I don't think there's great value in segregating Telemetry users into early-adopter vs late-adopter buckets. I think it would be much more meaningful to group Telemetry users by OS version, hardware profile, storage device type (SSD vs magnetic disk) and other environment details which we know affect performance measurements.

> > Question #5: I'm certain FHR
> > provides this information already. Why combine this info with
> > Telemetry pings? 
> 
> FHR is a loooong way into the future. 

FHR is already in the current Nightly & submitting reports. Do you mean it's a long way away from being on the release channel?

> We can't study telemetry
> measurements controlling for activity when activity is in FHR. 

Which Telemetry measurements are you thinking of? I can't think of any performance measurements that would benefit from controlling for installation activity.
We are still in the process of validating the incoming numbers. Until we are confident that the numbers make sense,
we will not trust the results derived from FHR.

I'm alright with none of this (i.e. no lastbuildid, nor lastsubmissiondate). That said, let's not again ask the questions that led to this bug.
Ok, I'll mark it wontfix then
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WONTFIX