Closed Bug 1320052 Opened 8 years ago Closed 6 years ago

Stop recording both session & subsession histograms

Categories

(Toolkit :: Telemetry, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla62
Tracking Status
firefox62 --- fixed

People

(Reporter: gfritzsche, Assigned: janerik)

References

Details

(Whiteboard: [measurement:client])

Attachments

(7 files)

With bug 1286868 and bug 1320051 resolved, we shouldn't need the duplicated histograms anymore.

There is a remaining question of how to deal with Android though, we need to make sure to not break the existing behavior there.
As Android doesn't send main pings right now, we could probably just use the same single histogram storage there and skip clearing it for Android.
Assignee: nobody → jrediger
Priority: P3 → P1
Current, we store every histogram twice: subsession and full session. The only difference betwen the two are the "clearing" semantics. The full session storage is never cleared for the whole lifetime of the session, the subsession is cleared whenever we take a snapshot and call "clear".

Up to now, this difference was only meaningful on Desktop: we used to send both the "main-ping" (which sends subsession measurements) and the "saved-session" ping (which had full session measurements, for legacy reasons). Fennec sends the "saved-session" as well (so full measuements support). Subsession data is currently disabled on Fennec/Android. The code is riddled with things such as [1].

Here comes the fun: GeckoView supports subsessions.

One possible approach to fixing this is to drop the "full session" storage for histogram and only allow subsession histograms, with the caveat of never allow to clear the storage on Android/Fennec (vs Android/GV).

[1] - https://searchfox.org/mozilla-central/rev/795575287259a370d00518098472eaa5b362bfa7/toolkit/components/telemetry/Telemetry.cpp#583-587
Does the proposal from comment 1 make sense?
Flags: needinfo?(jrediger)
Flags: needinfo?(gfritzsche)
Flags: needinfo?(chutten)
That's as I understood it as well. Dexter also verified that we're never requesting data from both storage places at once [1].
For now I'm moving forward with this proposal.

[1]: https://searchfox.org/mozilla-central/rev/795575287259a370d00518098472eaa5b362bfa7/toolkit/components/telemetry/TelemetrySession.jsm#1189-1190,1224-1228
Flags: needinfo?(jrediger)
Having a single type of storage with semantics that differ based on platform sounds fine to me.
Flags: needinfo?(chutten)
Depends on: 1452552
Comment on attachment 8972821 [details]
Bug 1320052 - Unify histogram storage into a single container.

https://reviewboard.mozilla.org/r/241384/#review247254

A couple of nits.

::: commit-message-0b0b7:4
(Diff revision 1)
> +Bug 1320052 - Unify histogram storage into a single container. r?chutten
> +
> +On Desktop and GeckoView we only ever need to store histograms for a
> +subsession and clear the histograms when a snapshot is done (e.g. a main ping is build).

*built

::: toolkit/components/telemetry/TelemetryHistogram.cpp:1233
(Diff revision 1)
>    {
>      StaticMutexAutoLock locker(gTelemetryHistogramMutex);
>      MOZ_ASSERT(internal_IsHistogramEnumId(id));
>  
>      // This is not good standard behavior given that we have histogram instances
>      // covering multiple processes and two session types.

This comment requires updating.

::: toolkit/components/telemetry/nsITelemetry.idl:69
(Diff revision 1)
>     *   counts - array representing contents of the buckets in the histogram
>     *   ranges - array with calculated bucket sizes
>     *   sum - sum of the bucket contents
>     *
>     * @param aDataset DATASET_RELEASE_CHANNEL_OPTOUT or DATASET_RELEASE_CHANNEL_OPTIN.
> -   * @param aSubsession Whether to return the internally-duplicated subsession histograms
> +   * @param aClear Whether to clear out the histograms after snapshotting

Should note it only works if not Fennec

::: toolkit/components/telemetry/nsITelemetry.idl:235
(Diff revision 1)
>     * Serializes the keyed histograms from the given dataset to a JSON-style object.
>     * The returned structure looks like:
>     *   { process1: {name1: {histogramData1}, name2:{histogramData2}...}}
>     *
>     * @param aDataset DATASET_RELEASE_CHANNEL_OPTOUT or DATASET_RELEASE_CHANNEL_OPTIN.
> -   * @param aSubsession Whether to return the internally-duplicated subsession keyed histograms
> +   * @param aClear Whether to clear out the keyed histograms after snapshotting

ditto
Attachment #8972821 - Flags: review?(chutten) → review+
Comment on attachment 8972822 [details]
Bug 1320052 - Update internal snapshot callers to not deal with subsessions.

https://reviewboard.mozilla.org/r/241386/#review247260
Attachment #8972822 - Flags: review?(chutten) → review+
Comment on attachment 8972823 [details]
Bug 1320052 - Remove now obsolete full-session/subsession tests for histograms.

https://reviewboard.mozilla.org/r/241388/#review247262

Just so long as we still have coverage that clearing subsessions still works.
Attachment #8972823 - Flags: review?(chutten) → review+
Comment on attachment 8972821 [details]
Bug 1320052 - Unify histogram storage into a single container.

https://reviewboard.mozilla.org/r/241384/#review247264

::: toolkit/components/telemetry/TelemetryHistogram.cpp:888
(Diff revision 1)
> -  if (!map.ReflectIntoJS(&KeyedHistogram::ReflectKeyedHistogram, cx, obj)) {
>      return NS_ERROR_FAILURE;
>    }
>  
> -#if !defined(MOZ_WIDGET_ANDROID)
> -  if (subsession && clearSubsession) {
> +  // Never clear storage on Fennec
> +  if (GetCurrentPlatform() != SupportedPlatform::Fennec && clearSubsession) {

Fly-by: do we need this? We can enforce it from [TelemetrySession](https://searchfox.org/mozilla-central/rev/ce9ff94ffed34dc17ec0bfa406156d489eaa8ee1/toolkit/components/telemetry/TelemetrySession.jsm#1189-1190,1224,1226) as well: these files are only used on Fennec/Desktop. Fennec has UNIFIED_TELEMETRY off and GeckoView will never use that file :)
Comment on attachment 8972824 [details]
Bug 1320052 - Update external snapshot callers to not deal with subsessions.

https://reviewboard.mozilla.org/r/241390/#review247274

For the tests that ask for session histogram snapshots, I wonder if they have subsession splits during their tests that we were unintentionally protecting them against... probably not, but we should keep an eye on whether these start growing oranges.
Attachment #8972824 - Flags: review?(chutten) → review+
Comment on attachment 8972825 [details]
Bug 1320052 - Clamped count is not double-counted anymore.

https://reviewboard.mozilla.org/r/241392/#review247276
Attachment #8972825 - Flags: review?(chutten) → review+
Comment on attachment 8972821 [details]
Bug 1320052 - Unify histogram storage into a single container.

https://reviewboard.mozilla.org/r/241384/#review247254

> Should note it only works if not Fennec

We're not checking for Fennec anymore, so in theory it works on Fennec, but the JavaScript code will ensure it's not triggered.
The plan in comment 1 makes sense to me.
I'm happy to see this double storage go!
Flags: needinfo?(gfritzsche)
Comment on attachment 8973189 [details]
Bug 1320052 - Test that clearing subsession data still works.

https://reviewboard.mozilla.org/r/241680/#review247978
Attachment #8973189 - Flags: review?(chutten) → review+
Pushed by alessio.placitelli@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/b0396db0229c
Unify histogram storage into a single container. r=chutten
https://hg.mozilla.org/integration/autoland/rev/c7b8f6d55a0b
Update internal snapshot callers to not deal with subsessions. r=chutten
https://hg.mozilla.org/integration/autoland/rev/4749633bd02f
Remove now obsolete full-session/subsession tests for histograms. r=chutten
https://hg.mozilla.org/integration/autoland/rev/c95a4b0c6642
Update external snapshot callers to not deal with subsessions. r=chutten
https://hg.mozilla.org/integration/autoland/rev/7a64a23bf183
Clamped count is not double-counted anymore. r=chutten
https://hg.mozilla.org/integration/autoland/rev/07db52945b1f
Test that clearing subsession data still works. r=chutten
 Backed out 6 changesets (bug 1320052) on request by Dexter for requently failing test verify dom/base/test/browser_use_counters.js default tip 

Failure push: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=07db52945b1f8bc406eaf2bc6f790b833d06a198&selectedJob=177466737

Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=177466737&repo=autoland&lineNumber=2194

Backout: https://hg.mozilla.org/integration/autoland/rev/183c021e1ae07ccd57255dda17e6c9ef2c168cab

14:29:34     INFO -  33 INFO TEST-PASS | dom/base/test/browser_use_counters.js | page counts for PROPERTY_FILL after are correct -
14:29:34     INFO -  Buffered messages finished
14:29:34    ERROR -  34 INFO TEST-UNEXPECTED-FAIL | dom/base/test/browser_use_counters.js | document counts for PROPERTY_FILL after are correct - Got 16, expected 17
14:29:34     INFO -  Stack trace:
14:29:34     INFO -  chrome://mochikit/content/browser-test.js:test_is:1290
14:29:34     INFO -  chrome://mochitests/content/browser/dom/base/test/browser_use_counters.js:check_use_counter_img:253
14:29:34     INFO -  35 INFO TEST-PASS | dom/base/test/browser_use_counters.js | top level document counts are correct -
14:29:34     INFO -  Not taking screenshot here: see the one that was previously logged
14:29:34    ERROR -  36 INFO TEST-UNEXPECTED-FAIL | dom/base/test/browser_use_counters.js | document counts are correct -
14:29:34     INFO -  Stack trace:
14:29:34     INFO -  chrome://mochitests/content/browser/dom/base/test/browser_use_counters.js:check_use_counter_img:259
14:29:34     INFO -  37 INFO checking file_use_counter_svg_currentScale.svg as image with histogram PROPERTY_FILL
14:29:34     INFO -  GECKO(3132) | ++DOCSHELL 0C513000 == 8 [pid = 4024] [id = {63a6c7dc-b158-4334-bc8c-b903edf06a51}]
14:29:34     INFO -  GECKO(3132) | ++DOMWINDOW == 31 (05CBCC60) [pid = 4024] [serial = 101] [outer = 00000000]
14:29:34     INFO -  GECKO(3132) | ++DOMWINDOW == 32 (0C515000) [pid = 4024] [serial = 102] [outer = 05CBCC60]
14:29:34     INFO -  GECKO(3132) | ++DOMWINDOW == 33 (0BEE6400) [pid = 4024] [serial = 103] [outer = 05CBCC60]
14:29:34     INFO -  GECKO(3132) | ++DOCSHELL 0BF1C800 == 9 [pid = 4024] [id = {b8e624c1-7f79-438b-90b5-ab1bb9cc1b24}]
14:29:34     INFO -  GECKO(3132) | ++DOMWINDOW == 34 (05CBCD90) [pid = 4024] [serial = 104] [outer = 00000000]
14:29:34     INFO -  GECKO(3132) | ++DOMWINDOW == 35 (0BF1D800) [pid = 4024] [serial = 105] [outer = 05CBCD90]
14:29:35     INFO -  GECKO(3132) | --DOCSHELL 0BF1DC00 == 8 [pid = 4024] [id = {72ed4433-b232-49bf-aa6c-4ffb58e65d9e}]
Flags: needinfo?(jrediger)
(In reply to Alessio Placitelli [:Dexter] from comment #27)
> Note:
> https://treeherder.mozilla.org/#/
> jobs?repo=autoland&revision=07db52945b1f8bc406eaf2bc6f790b833d06a198&selected
> Job=177476205
> 
> |browser_remote_navigation_delay_telemetry.js| is failing as well.

Ok turns out this test was the only problem, as the failures in |dom/base/test/browser_use_counters.js| are not from this bug.
Flags: needinfo?(jrediger)
Comment on attachment 8974213 [details]
Bug 1320052 - Fix remaining caller of histogram snapshot API.

https://reviewboard.mozilla.org/r/242508/#review248428
Attachment #8974213 - Flags: review?(alessio.placitelli) → review+
Pushed by alessio.placitelli@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/640fdf7b0189
Unify histogram storage into a single container. r=chutten
https://hg.mozilla.org/integration/autoland/rev/2da7a07c369a
Update internal snapshot callers to not deal with subsessions. r=chutten
https://hg.mozilla.org/integration/autoland/rev/c84d5061ae3a
Remove now obsolete full-session/subsession tests for histograms. r=chutten
https://hg.mozilla.org/integration/autoland/rev/2901dd472949
Update external snapshot callers to not deal with subsessions. r=chutten
https://hg.mozilla.org/integration/autoland/rev/ad8fc821ed3a
Clamped count is not double-counted anymore. r=chutten
https://hg.mozilla.org/integration/autoland/rev/a8dc47e76224
Test that clearing subsession data still works. r=chutten
https://hg.mozilla.org/integration/autoland/rev/a2eccfbeb0ae
Fix remaining caller of histogram snapshot API. r=Dexter
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: