Move FxMS ping and metrics out of newtab
Categories
(Firefox :: Messaging System, task, P1)
Tracking
()
Tracking | Status | |
---|---|---|
firefox137 | --- | affected |
People
(Reporter: aminomancer, Assigned: pdahiya)
References
(Blocks 2 open bugs)
Details
(Whiteboard: [hnt-trainhop])
Attachments
(2 files)
Our glean stuff is currently in newtab, but we should move it to asrouter, since both directories have herald rules tagging the respective teams for review. As it stands, we won't be tagged for review when our ping/metrics change, but the newtab team will be.
- https://searchfox.org/mozilla-central/rev/9508d23ab6de8b0734be2c95303515b08da1ce28/browser/components/newtab/pings.yaml
- https://searchfox.org/mozilla-central/rev/9508d23ab6de8b0734be2c95303515b08da1ce28/browser/components/newtab/metrics.yaml
ni?chutten to let us know if it's fine to just do this, or if we need to perform additional migration steps to make sure glean data and the dictionary page do not get messed up or duplicated. Thanks!
Reporter | ||
Updated•9 months ago
|
Comment 1•9 months ago
|
||
There are a few things that come to mind when thinking about how to do this without interrupting data flow:
- Make sure the destination and source files are in the same lists in
metrics_index.py
. This ought to keep the pipeline from even really noticing the difference. - Ensure the destination and source files both already exist, and continue to exist after the definitions move, even if they're empty. The pipeline has a mechanism for recognizing new files and another mechanism for recognizing new/changed metrics and pings within those files. If those operations aren't externally synchronized, we could get into a weird situation requiring manual intervention.
- If any files need to be added, add them early. And then don't move definitions to the new files until the day after that file appears in the list in
repositories.yaml
(usually just the day after the first build that has the new files in it). - As for the reasons why we're keeping them even if they're empty, there are two of them: 1) Saves someone some effort and some waiting for when they want to add metrics/pings to the source component later, 2) If there are backouts, even from autoland, file deletions can become annoying on the pipeline.
- If any files need to be added, add them early. And then don't move definitions to the new files until the day after that file appears in the list in
- If necessary, update the tags of these metrics and pings (if they don't have definition-specific tag metadata, they inherit the file-level
$tags
param) to point to the correct product&component so that in the dictionary they're associated with other similar instrumentation. This is a nice-to-have and can happen whenever you'd like. - Test the data coming from the build using the Glean Debug Ping Viewer (via e.g.
about:glean
) and all of your automated suites. This will ensure the instrumentation didn't pick up a fault along the way. - Monitor the data coming in after the swap reaches a build. Use
client_info.app_build
to see your data coming in from buildids that have your changes, and use themoz-fx-data-shared-prod.firefox_desktop_live.messaging_system_v1
table to read "messaging-system" pings that hit the pipeline within the past 15mins (this data is not deduplicated and is otherwise unfit for general analysis, but it's good for monitoring). You should ultimately see the rate of pings per client in builds with your change reach the rate from builds without your change, but more immediately we just want to make sure that the pings come through at all and they have the metrics you expect in the columns that you expect. If there aren't any "messaging-system" pings, or there aren't the right number of them or contents in them, reach out.
That should keep the data flowing and minimize the likelihood that we'll need to bug pipeline folks to manually intercede.
So long as things haven't changed since the last time I looked into it, "messaging-system" pings' downstream analyses are limited mostly to nightly batched derivations, so we have some wriggle room to fix things before they get too wrong and won't end up suffering too much if things do go wrong and we have to wait until things are made right.
Reporter | ||
Comment 2•9 months ago
|
||
(In reply to Chris H-C :chutten from comment #1)
Thanks for the thorough instructions! This is really helpful.
Assignee | ||
Updated•9 months ago
|
Updated•2 months ago
|
Updated•2 months ago
|
Assignee | ||
Updated•12 days ago
|
Assignee | ||
Updated•12 days ago
|
Assignee | ||
Comment 3•12 days ago
|
||
Assignee | ||
Comment 4•12 days ago
|
||
Comment 6•5 days ago
|
||
Backed out for causing build bustages @ AsrouterMetrics.h
Backout link: https://hg.mozilla.org/integration/autoland/rev/fa658b870584bcf43c524a45674c6a3507cf5603
Assignee | ||
Comment 7•5 days ago
|
||
NI @chutten for feedback and help debug above build failure on autoland about AsrouterMetrics.h
API thanks!
Comment 8•5 days ago
|
||
Ohhhhhh, I know what's going on here. Florian mentioned having problems with empty files when he split the include files.
Florian, is there an easy way to support metrics files with no definitions? Turns out there is at least one legitimate use for them (this bug's use: adding a file to the pipeline before moving metrics into it to make absolutely certain the definitions aren't interrupted).
Comment 9•4 days ago
|
||
(In reply to Chris H-C :chutten from comment #8)
Florian, is there an easy way to support metrics files with no definitions?
Not really easy, the task generating the header files takes as input the metrics_yaml.cached
file, that only contains parsed metrics. The list of yaml files doesn't exist anymore at that point. It reappears later when we access metric.defined_in["filepath"]
.
If you really want to do it, you'll need to either:
- add a dummy metric to make the metric.yaml file non-empty (and teach the code in glean_parser_ext to skip it, thus writing an empty header file.)
- make the parsing code keep the list of input yaml metric files when generating the cache, and have the header output code generate empty header files for files that don't contain any metric.
Neither of these options sound great.
Turns out there is at least one legitimate use for them (this bug's use: adding a file to the pipeline before moving metrics into it to make absolutely certain the definitions aren't interrupted).
To me this sounds more like an obscure workaround for bugs than like a legitimate use case. More specifically, looking at your point '2.' in comment 1:
- "we could get into a weird situation requiring manual intervention" doesn't sound good.
- "don't move definitions to the new files until the day after that file appears in the list in repositories.yaml" I don't think we want developers to know or care for this.
- "reasons why we're keeping them even if they're empty, there are two of them: 1) Saves someone some effort and some waiting for when they want to add metrics/pings to the source component later" There should be no waiting. Either the pipeline should update automatically as part of a new build being released, or submissions received from builds with a build id newer than the most recent update of the pipeline should be queued and only processed once the pipeline has been updated.
- "If there are backouts, even from autoland, file deletions can become annoying on the pipeline." backouts should be painless. Sheriffs won't know that backing out a patch will be causing issues elsewhere, and shouldn't have to worry about it.
I understand that fixing all these issues requires significant effort, so we might need to take a temporary workaround. If so, I think putting a descriptive string in the 'empty' metrics.yaml file that says why it's there is better than leaving the file really empty, as a build error seems a reasonable behavior to me when processing an empty metric file.
Comment 10•4 days ago
|
||
The even more "wonderful" thing about this is that we already have all these issues fixed... for every project that isn't hosted in mozilla-central. Hopefully on the move to github we'll reevaluate that cost of turning probe-scraper "push mode" on for our biggest projects.
Okay, since this is temporary, let's do this.
Punam, can you add this to your patch?
diff --git a/toolkit/components/glean/moz.build b/toolkit/components/glean/moz.build
index 612ee7e1dafd1..dfe304f90e610 100644
--- a/toolkit/components/glean/moz.build
+++ b/toolkit/components/glean/moz.build
@@ -138,7 +138,11 @@ metrics_headers = sorted(
{
convert_yaml_path_to_header_name(path) + ".h"
for path in metrics_yamls
- if path != "toolkit/components/glean/metrics.yaml"
+ if path
+ not in (
+ "toolkit/components/glean/metrics.yaml",
+ "browser/components/asrouter/metrics.yaml", # TODO: Remove in bug 1899019
+ )
}
)
EXPORTS.mozilla.glean += [("!" + filename) for filename in metrics_headers]
And then the inverse for the second patch. That'll give us the temporary permission to have that empty metrics.yaml
.
Assignee | ||
Comment 11•4 days ago
|
||
Thanks for feedback and help unblock, just updated the patch and started try run , please let me know if there is any concern around landing at the end of nightly cycle.
Comment 12•4 days ago
|
||
Comment 13•3 days ago
|
||
bugherder |
Comment 14•3 days ago
|
||
Reopened for the unpublished changeset.
Updated•3 days ago
|
Description
•