Closed Bug 1931277 Opened 1 year ago Closed 1 year ago

Support for Configuring a Subset of Interesting Metrics to Collect

Categories

(Data Platform and Tools :: Glean: SDK, task, P1)

task

Tracking

(firefox-esr128 fixed, firefox135 fixed)

RESOLVED FIXED
Tracking Status
firefox-esr128 --- fixed
firefox135 --- fixed

People

(Reporter: mkmelin, Assigned: mkmelin)

References

Details

Attachments

(5 files, 2 obsolete files)

See https://docs.google.com/document/d/1e7QrGioYl3gTGcrpIL-0yhKZKtA7sDFqi9BnKj-UMSQ/edit?tab=t.0#heading=h.5x0d5h95i329

Provide a mechanism to allow a Glean-using product to specify a subset of “interesting” metrics while disabling all other metrics.
The primary use case for the moment is Thunderbird opting out to collect data about things is does not care about.

Attachment #9437687 - Attachment description: WIP: Bug 1931277 - Support for Configuring a Subset of Interesting Metrics to Collect. r=!travis,!chutten → Bug 1931277 - Support for Configuring a Subset of Interesting Metrics to Collect. r=TravisLong!,chutten!
Priority: -- → P1

https://github.com/mozilla/glean_parser/pull/774 was merged.
How/when do you sync glean_parser to mozilla-central/third_party/python/?

(In reply to Magnus Melin [:mkmelin] from comment #4)

https://github.com/mozilla/glean_parser/pull/774 was merged.
How/when do you sync glean_parser to mozilla-central/third_party/python/?

I can do a glean_parser release and then we hopefully pull this in within the next week (we're landing a major update anyway)

Depends on: 1933939
Pushed by mkmelin@iki.fi: https://hg.mozilla.org/integration/autoland/rev/b0c910bea119 Support for Configuring a Subset of Interesting Metrics to Collect. r=TravisLong,chutten
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED

Unfortunately there's one more problem: tests for disabled metrics do not work in tests, so tests like netwerk/test/unit/test_loadgroup_glean.js fail. Would disabling a metric mean one also have to change the test?

Is there a way to allow collection during tests? Or what solution could you think of?

Flags: needinfo?(chutten)

You could push during tests a metrics configuration that enables all metrics and pings. That would override how the binary was compiled with all of them disabled, I think.

Travis, have some ideas?

Flags: needinfo?(chutten) → needinfo?(tlong)

Might be a little tricky, given that only the "interesting" metrics are generated as enabled. While you could specify a Server Knobs config that overrides this, it would be a lengthy config as it would need to contain each metric that needed to be overridden.

I'm torn as to what the correct thing to do here is. Since these metrics are disabled and not interesting, it seems fair that testing them is also uninteresting. I'm not sure if it would be possible to skip setting "interesting" metrics only for tests, but that seems like the easiest path forward without some additional API or handling within Glean to be able to override the generated enabled state of these metrics.

I'll think on this a bit more and if I can come up with something better I'll return and post another comment, but I think either finding a way to skip these tests or to allow generation of all metrics only for testing seems like the way forward.

Flags: needinfo?(tlong)

At what stage does the disabling take place? A) simply don't collect anything, or B) collect but throw away and don't send.
We probably want something like B. Or an option to allow disabling to work that way based on some condition, like is-in-test-automation.

I don't think it'd be realistic to have separate builds only for testing.

The metrics are disabled during code generation by the glean_parser. So each instance of a metric not in the "interesting" group would be generated with enabled=false. These metrics then simply don't collect anything, instead performing an early return from the metric API functions. Metrics in test are never sent, and if they are they can be marked as "automation" to be discarded (for things like CI running tests on releases, etc). To be able to collect them but don't send them, the metrics would need to be enabled. I'm going to be on PTO, but I'm going to needinfo :janerik to think about how we might address the issue you are running into with this in the hopes that he has some ideas for us.

This might require that we have a different state for tests vs. runtime associated with the "uninteresting" metrics, and I'm not sure that's the best approach here but I would like for Glean to make this easier for you.

Flags: needinfo?(jrediger)

Honestly, for tests that are specifically designed to test Glean metrics, like netwerk/test/unit/test_loadgroup_glean.js, in my opinion these should be disabled in a configuration when the metrics are disabled. There's not much value in running tests on things that the build doesn't do at runtime.

Apart from that, I don't have a good answer either on how to deal with this. I'll talk with Travis though to brainstorm a bit further.

Flags: needinfo?(jrediger)

Is the recommendation for us to exclude tests that we know will fail when we build Thunderbird? And when new probes and tests are added, somebody will need to remember to exclude them from the Thunderbird build process?

Could some notes be added in the .ini to alert developers to the fact that TB needs to be configured to exclude new probe tests?

Or perhaps I can ask dkl to set up a herald rule so that our team gets alerted when probes and tests are added so we can choose to include the new probe or disable the new test before it starts failing?

Are there any other ideas that can help us get this in place?

Flags: needinfo?(tlong)
Flags: needinfo?(chutten)

After talking with chutten we might have an idea how to support this.
Posting here for discussion (afterwards we should probably split that out into a separate bug).

At build time we know the set of interesting metrics, which means we can also determine the set of uninteresting metrics.
In the same build step we generate a build artifact listing those.
Then for all tests where Glean is initialized, given we can positively identify those as test runs, we read this file and pass in this list of uninteresting metrics (potentially as a Server Knobs config) to re-enable them.
This means test will work because all the expected metrics are enabled.
Production builds as shipped (1) won't have this build artifact, (2) aren't test runs, and thus the metrics stay disabled.

That way no adjustement is needed as new metrics are added to Gecko, but not Thunderbird.
It's minimal changes only in FOG (no Glean changes) and likely doable relatively soon.
It won't affect developer builds for manual testing either.

Toby, as the Thunderbird person, how does that sound?
Travis, ni? to give input on that idea.

Flags: needinfo?(tlong)
Flags: needinfo?(chutten)

This would be a great solution, thanks so much for the push on this, it is very much appreciated. Is there anything that Magnus and I can do to help?

Thanks! (Looks like you forgot to ni? Travis on comment 15, I'll do that now.)

Flags: needinfo?(tlong)

That approach sounds great to me, thanks for brainstorming this while I was out. r+ from me on going with this.

Flags: needinfo?(tlong)
Attachment #9437688 - Attachment description: WIP: Bug 1931277 - c-c part of using only interesting probes → Bug 1931277 - c-c part of using only interesting probes. r=#thunderbird-reviewers
Pushed by mkmelin@iki.fi: https://hg.mozilla.org/comm-central/rev/9e934df67de2 c-c part of using only interesting probes. r=tobyp

Comment on attachment 9438276 [details] [review]
[mozilla/glean_parser] Bug 1931277 - Support for Configuring a Subset of Interesting Metrics… (#774)

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: Needed for Thunderbird Glean - bug 1954442.
  • User impact if declined: Build level change
  • Fix Landed on Version: 135
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Does not impact Firefox.
Attachment #9438276 - Flags: approval-mozilla-esr128?

For 128 uplift, not to have to uplift the new Glean version with all its dependencies, we can cherry-pick the glean_parser patch and uplift that. These are the patches I have:

https://hg.mozilla.org/try/rev/0d0a5651480796d05bd0bd8d9d3a9eb201e33bb3
https://hg.mozilla.org/try/rev/deca4fe60c79c12cfb234cd48e9a29deac9ee67f

See Also: → 1954442

Bug 1918118 is a pre-req for uplifting this.

Comment on attachment 9438276 [details] [review]
[mozilla/glean_parser] Bug 1931277 - Support for Configuring a Subset of Interesting Metrics… (#774)

These patchs on the try build don't apply cleanly to the ESR128 branch. Making other uplift request depending on it also not upliftable.

Attachment #9438276 - Flags: approval-mozilla-esr128? → approval-mozilla-esr128-

Hmm, I wonder if something changed. At least the stack rebased to yesterday's 128esr tip without a glitch:
https://treeherder.mozilla.org/jobs?repo=try&revision=30a6c19119a22d5a1ece1053f6b083b6278971a1

Attachment #9473893 - Flags: approval-mozilla-esr128?
Attachment #9473894 - Flags: approval-mozilla-esr128?

I've uploaded those patches to phabricator instead.

Attachment #9474084 - Flags: approval-mozilla-esr128?
Attachment #9474085 - Flags: approval-mozilla-esr128?
Attachment #9474084 - Attachment is obsolete: true
Attachment #9474084 - Flags: approval-mozilla-esr128?
Attachment #9474085 - Attachment is obsolete: true
Attachment #9474085 - Flags: approval-mozilla-esr128?
Attachment #9473893 - Attachment description: Bug 1931277 - [128esr] glean-parser PR: Support for Configuring a Subset of Interesting Metrics to Collect. r=TravisLong → Bug 1931277 - [128esr] glean-parser PR: Support for Configuring a Subset of Interesting Metrics to Collect.
Attachment #9473894 - Attachment description: Bug 1931277 - [128esr] Support for Configuring a Subset of Interesting Metrics to Collect. r=TravisLong,chutten → Bug 1931277 - [128esr] Support for Configuring a Subset of Interesting Metrics to Collect.
Attachment #9473894 - Flags: approval-mozilla-esr128? → approval-mozilla-esr128+
Attachment #9473893 - Flags: approval-mozilla-esr128? → approval-mozilla-esr128+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: