Closed Bug 719167 (opened 8 years ago, closed 4 years ago)

telemetry should periodically save histograms

Categories

(Toolkit :: Telemetry, defect)

RESOLVED WORKSFORME

People

(Reporter: froydnj, Unassigned)

Once bug 707320 goes in, we should periodically save the state of histograms to disk so that we don't lose useful information in the case of a browser crash or similar.
This change is contrary to a risk resolution in the privacy review:

https://wiki.mozilla.org/Privacy/Reviews/Telemetry#Conformity_to_Private_Browsing_Mode (scroll down to the resolution box)

"Telemetry data always kept only in volatile memory or temporarily persisted to disk (see above). Telemetry collection and reporting is entirely suspended when private mode is entered, and resumed on private mode exit. See bug 661573"

Originally, nothing was persisted to disk. We changed that in bug 707320 so that data could be measured during shutdown. What is the problem we'd like to solve by persisting this data?
The problem is that the browser can crash between telemetry pings, and we then lose whatever information had accumulated since the last ping. That accumulated information is useful. Pings from saved sessions are distinguished from pings for the current session; saved-session pings without a successful shutdown may also be useful as a secondary way of measuring crashes.
I believe (though I am still trying to find the study) that most people close their browsers fairly regularly, roughly daily.

So is it your goal to identify telemetry data sets that lead to crashes?  What do we gain from this change other than "fewer lost datasets"?
I could easily believe the majority of people close their browsers daily.

I honestly don't know what the exact goals were; Taras suggested doing this in bug 707320 comment 39 and asked me to file it.  I was offering possible justifications, but maybe it's better to go to the source.  Taras, what did you have in mind?
(In reply to Sid Stamm [:geekboy] from comment #3)

> So is it your goal to identify telemetry data sets that lead to crashes? 
> What do we gain from this change other than "fewer lost datasets"?

Indeed. Robustness is the main goal here.
Makes sense.  Can we delete the persisted file after the data gets transmitted?  If we are simply making sure we don't lose data on crash, can we purge any loaded-from-disk data when we recover from a crash and do a ping?  This is the same as what I recommended in bug 707320 comment 30.
(In reply to Sid Stamm [:geekboy] from comment #6)
> Makes sense.  Can we delete the persisted file after the data gets
> transmitted?  If we are simply making sure we don't lose data on crash, can
> we purge any loaded-from-disk data when we recover from a crash and do a
> ping?  This is the same as what I recommended in bug 707320 comment 30.

Yes, we should do that in this bug.
(In reply to Sid Stamm [:geekboy] from comment #6)
> Makes sense.  Can we delete the persisted file after the data gets
> transmitted?  If we are simply making sure we don't lose data on crash, can
> we purge any loaded-from-disk data when we recover from a crash and do a
> ping?  This is the same as what I recommended in bug 707320 comment 30.

The deletion of the data file is being done in the patches already posted for bug 707320.
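
For readers following the "delete the persisted file after the data gets transmitted" request, here is a minimal Python sketch of that sequence. It is not the code from the bug 707320 patches: the function name `submit_saved_payload`, the `send` callback, and the JSON file format are all assumptions made for illustration.

```python
import json
import os
from typing import Callable, Optional


def submit_saved_payload(path: str, send: Callable[[dict], bool]) -> Optional[bool]:
    """Send a payload saved by an earlier session, then delete its file.

    Hypothetical helper: `path` and `send` are illustrative, not Firefox APIs.
    Returns None if there was nothing usable to send, otherwise what `send` returned.
    """
    if not os.path.exists(path):
        return None
    try:
        with open(path, "r", encoding="utf-8") as f:
            payload = json.load(f)
    except ValueError:
        # A truncated or corrupt file cannot be sent; purge it rather than keep it around.
        os.remove(path)
        return None
    ok = send(payload)
    if ok:
        # Purge the on-disk copy only once the data has been transmitted.
        os.remove(path)
    return ok
```

In this sketch the file survives a failed send so that a later attempt can retry; whether retrying is acceptable under the privacy review is a policy question the sketch does not settle.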
We already cover the concerns here now:
* We submit "main" pings much more frequently.
* We periodically save Telemetry session payloads to disk and submit them with reason "aborted-session" in the next session.
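
The second bullet can be illustrated with a rough Python sketch of the save-and-recover cycle. Only the reason string "aborted-session" comes from this bug; the function names, the payload layout, and the use of a single JSON file are assumptions, and the real Telemetry code may differ in every detail.

```python
import json
import os
import tempfile
from typing import Optional


def save_session_payload(path: str, histograms: dict) -> None:
    """Write the current histograms atomically, so a crash mid-write
    leaves the previous complete file in place (hypothetical layout)."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump({"histograms": histograms}, f)
        f.flush()
        os.fsync(f.fileno())
    # os.replace swaps the new file in as a single step.
    os.replace(tmp_path, path)


def mark_clean_shutdown(path: str) -> None:
    """On a normal shutdown, remove the file so the next start finds nothing."""
    if os.path.exists(path):
        os.remove(path)


def load_aborted_session(path: str) -> Optional[dict]:
    """At startup, a leftover file means the previous session did not shut
    down cleanly; return its data tagged with reason "aborted-session"."""
    if not os.path.exists(path):
        return None
    with open(path, "r", encoding="utf-8") as f:
        payload = json.load(f)
    payload["reason"] = "aborted-session"
    return payload
```

A caller would invoke `save_session_payload` from a timer at whatever interval is chosen, and check `load_aborted_session` once during startup before beginning a new session.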
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WORKSFORME