Closed Bug 1758248 Opened 2 years ago Closed 2 years ago

Document that canRecordExtended/canRecordPrereleaseData will always report `true` on non-content child processes

Tracking

()

Status:

RESOLVED FIXED

Milestone:

100 Branch

Tracking Flags:

Tracking

Status

firefox100

---

fixed

People

(Reporter: tnikkel, Assigned: perry.mcmanis)

Details

Attachments

(1 file)

Bug 1758248 - update documentation of canRecordExtended and canRecordPreleaseData. r=chutten 2 years ago Perry McManis [:perry.mcmanis] 48 bytes, text/x-phabricator-request		Details \| Review

Timothy Nikkel (:tnikkel)

Reporter

Description

•

2 years ago

•

Edited

Extended telemetry should not be recorded or sent on the release channel, according to

https://searchfox.org/mozilla-central/rev/1e13dfc1bd87c3747d6712807401c590d0211a46/toolkit/components/telemetry/core/nsITelemetry.idl#331

We init gCanRecordExtended in TelemetryHistogram::InitializeGlobalState

https://searchfox.org/mozilla-central/rev/1e13dfc1bd87c3747d6712807401c590d0211a46/toolkit/components/telemetry/core/TelemetryHistogram.cpp#2426

The only caller of that function passes true for the parent, content, gpu, and network process

https://searchfox.org/mozilla-central/rev/1e13dfc1bd87c3747d6712807401c590d0211a46/toolkit/components/telemetry/core/Telemetry.cpp#1140

Then later TelemetryImpl::SetCanRecordExtended is called which can potentially set canRecordExtended to false. Except the only callers of this that I can find are in js, in

https://searchfox.org/mozilla-central/rev/1e13dfc1bd87c3747d6712807401c590d0211a46/toolkit/components/telemetry/app/TelemetryControllerBase.jsm#111

and another caller in that file. But I don't think there is any js executed in the network or gpu process, and indeed I do not see TelemetryImpl::SetCanRecordExtended ever called in the gpu or network process.

Using about:telemetry I checked to see if a release build would submit checkerboarding histogram telemetry and it did indeed appear to show up in previously submitted pings there. Checkerboarding telemetry should be extended only see here

https://searchfox.org/mozilla-central/rev/1e13dfc1bd87c3747d6712807401c590d0211a46/gfx/layers/apz/src/AsyncPanZoomController.cpp#4821

And on the other end we do seem to receive checkerboard telemetry from the gpu process on release

https://glam.telemetry.mozilla.org/firefox/probe/checkerboard_severity/explore?aggregationLevel=version&channel=release&process=gpu&ref=95

I'm not generally familiar with the telemetry code, so someone should definitely verify this.

Timothy Nikkel (:tnikkel)

Reporter

Comment 1

•

2 years ago

Can you make sure this gets to the right people/person? Thanks.

Flags: needinfo?(chutten)

Chris H-C :chutten

Comment 2

•

2 years ago

That'd be me.

Congratulations (and my condolences) for untangling the mess of how canRecordExtended works. It's not the easiest part of the codebase to play in.

However, its value on non-parent processes shouldn't matter. All child-process actions on Telemetry pieces are sent over to the parent process where they're checked and conditionally applied. Each of the codepaths starting in TelemetryIPC check CanRecordDataset which will use the probe's definition to tell us whether it can be recorded in this build or not and compare it against the parent-process' knowledge of whether we're recording base or extended.

So, how does this corroborate with your observed behaviour of CHECKERBOARD_SEVERITY, then? Well, according to the definition in Histograms.json, CHECKERBOARD_SEVERITY should be collected in release (see bug 1539309). The forTelemetry code in APZ dates from 2016 when Telemetry (and Firefox as a whole) only had at most two processes, and should probably have been adjusted in bug 1539309. (+Cc jrmuizel for visibility: maybe you'll want a bug to fix that up or tear it out.)

So I think we're clear here and we can un-mococonf this bug. The flags are indeed incorrect in any process that isn't running TelemetryControllerBase.jsm's initialization routines, but that's a documentation and API ergonomics issue because they don't actually need to be correct in those processes to be dealt with correctly before storing and reporting data.

:tnikkel, what do you think? If we un-mococonf this bug, I'll use this to quickly get the APIs' docs updated.

Assignee: nobody → chutten

Severity: -- → S2

Status: NEW → ASSIGNED

Flags: needinfo?(chutten) → needinfo?(tnikkel)

Priority: -- → P1

Timothy Nikkel (:tnikkel)

Reporter

Comment 3

•

2 years ago

Phew! Thanks for the explanation! That helps explain some of the other weirdness I was seeing.

Yes, you can un-mococonf this bug. I will handle filling a follow up and fixing up the checkerboarding telemetry.

Flags: needinfo?(tnikkel)

Timothy Nikkel (:tnikkel)

Reporter

Comment 4

•

2 years ago

(In reply to Timothy Nikkel (:tnikkel) from comment #3)

I will handle filling a follow up and fixing up the checkerboarding telemetry.

That bug is bug 1758670.

Chris H-C :chutten

Comment 5

•

2 years ago

This is now a bug to fix API documentation : )

I should be able to get to this promptly.

Group: mozilla-employee-confidential

Summary: gpu and network processes appear to record and send extended telemetry on release channel → Document that canRecordExtended/canRecordPrereleaseData will always report `true` on non-content child processes

Chris H-C :chutten

Comment 6

•

2 years ago

Passing to Perry.

You'll be editing nsITelemetry.idl to update the comments on the attributes canRecordExtended and canRecordPrereleaseData to note that they will always return true on non-content child processes (or, in reverse, that their values should only be trusted on the parent process).

Background: nsITelemetry.idl is the backbone of the Telemetry singleton available in C++ and JS in Firefox Desktop. It has the public API for recording data, and some other stuff added over the years.

Assignee: chutten → pmcmanis

Botond Ballo [:botond]

Comment 7

•

2 years ago

(In reply to Chris H-C :chutten from comment #2)

The flags are indeed incorrect in any process that isn't running TelemetryControllerBase.jsm's initialization routines, but that's a documentation and API ergonomics issue because they don't actually need to be correct in those processes to be dealt with correctly before storing and reporting data.

Is it perhaps also a potential performance issue, if those processes do unnecessary work gated on Telemetry::CanRecordExtended() to collect data that then gets discarded?

Chris H-C :chutten

Comment 8

•

2 years ago

(In reply to Botond Ballo [:botond] from comment #7)

Is it perhaps also a potential performance issue, if those processes do unnecessary work gated on Telemetry::CanRecordExtended() to collect data that then gets discarded?

True. I'm unclear on the cost/benefit of designing and implementing a generic cross-process solution for canRecord* at this time, so I'll leave it to consumers to reach out when they discover they need it and continue work on Glean (which doesn't have this lever).

Perry McManis [:perry.mcmanis]

Assignee

Comment 9

•

2 years ago

Attached file Bug 1758248 - update documentation of canRecordExtended and canRecordPreleaseData. r=chutten — Details

Clarifying expected behavior of telemetry's canRecordExtended and canRecordPreleaseData in child processes in the documentation.

Pulsebot

Comment 10

•

2 years ago

Pushed by chutten@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/705b93332795
update documentation of canRecordExtended and canRecordPreleaseData. r=chutten DONTBUILD

Atila Butkovits

Comment 11

•

2 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/705b93332795

Status: ASSIGNED → RESOLVED

Closed: 2 years ago

status-firefox100: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 100 Branch

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Quick Search

Document that canRecordExtended/canRecordPrereleaseData will always report `true` on non-content child processes

Categories

(Toolkit :: Telemetry, defect, P1)

Tracking

()

People

(Reporter: tnikkel, Assigned: perry.mcmanis)

References

Details

Crash Data

Security

(public)

User Story

Attachments

(1 file)

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Comment 11

Attachment

General

Description

File Name

Content Type