Closed Bug 1749394 Opened 3 years ago Closed 3 years ago

Perma 0:09.19 toolkit/components/glean/tests/pytest/test_no_expired_metrics.py::test_no_metrics_expired TEST-UNEXPECTED-FAIL

Categories

(Toolkit :: Telemetry, defect, P1)

defect

Tracking

()

RESOLVED FIXED
98 Branch
Tracking Status
firefox-esr91 --- unaffected
firefox96 --- wontfix
firefox97 --- wontfix
firefox98 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: janerik)

References

(Regression)

Details

(Keywords: intermittent-failure, regression)

Attachments

(1 file)

Filed by: malexandru [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=363638881&repo=mozilla-central
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/GWcXN8SaTiWTkuFpKldiEQ/runs/0/artifacts/public/logs/live_backing.log


[task 2022-01-10T21:43:42.569Z]  0:08.40 toolkit/components/glean/tests/pytest/test_glean_parser_rust.py::test_numeric_expires PASSED
[task 2022-01-10T21:43:42.570Z]  0:08.40 
[task 2022-01-10T21:43:42.570Z]  0:08.40 =========================== 4 passed in 0.65 seconds ===========================
[task 2022-01-10T21:43:43.363Z]  0:09.19 Setting retcode to 1 from /builds/worker/checkouts/gecko/toolkit/components/glean/tests/pytest/test_no_expired_metrics.py
[task 2022-01-10T21:43:43.364Z]  0:09.19 /builds/worker/checkouts/gecko/toolkit/components/glean/tests/pytest/test_no_expired_metrics.py
[task 2022-01-10T21:43:43.364Z]  0:09.19 ============================= test session starts ==============================
[task 2022-01-10T21:43:43.364Z]  0:09.19 platform linux -- Python 3.6.9, pytest-3.6.2, py-1.5.4, pluggy-0.6.0 -- /builds/worker/checkouts/gecko/obj-x86_64-pc-linux-gnu/_virtualenvs/python-test/bin/python
[task 2022-01-10T21:43:43.364Z]  0:09.19 rootdir: /builds/worker/checkouts/gecko, inifile: /builds/worker/checkouts/gecko/config/mozunit/mozunit/pytest.ini
[task 2022-01-10T21:43:43.364Z]  0:09.19 collecting ... collected 1 item
[task 2022-01-10T21:43:43.364Z]  0:09.19 
[task 2022-01-10T21:43:43.364Z]  0:09.19 toolkit/components/glean/tests/pytest/test_no_expired_metrics.py::test_no_metrics_expired TEST-UNEXPECTED-FAIL
[task 2022-01-10T21:43:43.364Z]  0:09.19 
[task 2022-01-10T21:43:43.364Z]  0:09.19 =================================== FAILURES ===================================
[task 2022-01-10T21:43:43.364Z]  0:09.19 ___________________________ test_no_metrics_expired ____________________________
[task 2022-01-10T21:43:43.364Z]  0:09.19 
[task 2022-01-10T21:43:43.364Z]  0:09.19     def test_no_metrics_expired():
[task 2022-01-10T21:43:43.364Z]  0:09.19         """
[task 2022-01-10T21:43:43.364Z]  0:09.19         Of all the metrics included in this build, are any expired?
[task 2022-01-10T21:43:43.364Z]  0:09.19         If so, they must be removed or renewed.
[task 2022-01-10T21:43:43.364Z]  0:09.19 
[task 2022-01-10T21:43:43.364Z]  0:09.19         (This also checks other lints, as a treat.)
[task 2022-01-10T21:43:43.364Z]  0:09.19         """
[task 2022-01-10T21:43:43.365Z]  0:09.19         with open("browser/config/version.txt", "r") as version_file:
[task 2022-01-10T21:43:43.365Z]  0:09.19             app_version = version_file.read().strip()
[task 2022-01-10T21:43:43.365Z]  0:09.19 
[task 2022-01-10T21:43:43.365Z]  0:09.19         options = run_glean_parser.get_parser_options(app_version)
[task 2022-01-10T21:43:43.365Z]  0:09.19         metrics_paths = [Path(x) for x in metrics_yamls]
[task 2022-01-10T21:43:43.365Z]  0:09.19         all_objs = parser.parse_objects(metrics_paths, options)
[task 2022-01-10T21:43:43.365Z]  0:09.19         assert not util.report_validation_errors(all_objs)
[task 2022-01-10T21:43:43.365Z]  0:09.19 >       assert not lint.lint_metrics(all_objs.value, options)
[task 2022-01-10T21:43:43.365Z]  0:09.19 E       AssertionError: assert not [<glean_parser.lint.GlinterNit object at 0x7ff73de816d8>, <glean_parser.lint.GlinterNit object at 0x7ff73de811d0>]
[task 2022-01-10T21:43:43.365Z]  0:09.19 E        +  where [<glean_parser.lint.GlinterNit object at 0x7ff73de816d8>, <glean_parser.lint.GlinterNit object at 0x7ff73de811d0>] = <function lint_metrics at 0x7ff73e2222f0>(DictWrapper([('fog', DictWrapper([('initialization', <glean_parser.metrics.Timespan object at 0x7ff73e0fd2b0>), ('fail...metrics.Counter object at 0x7ff73de81e80>), ('uri_count', <glean_parser.metrics.Counter object at 0x7ff73de81470>)]))]), {'allow_reserved': False, 'custom_is_expired': <function get_parser_options.<locals>.<lambda> at 0x7ff73de68730>, 'custom_validate_expires': <function get_parser_options.<locals>.<lambda> at 0x7ff73de688c8>})
[task 2022-01-10T21:43:43.365Z]  0:09.19 E        +    where <function lint_metrics at 0x7ff73e2222f0> = lint.lint_metrics
[task 2022-01-10T21:43:43.365Z]  0:09.19 E        +    and   DictWrapper([('fog', DictWrapper([('initialization', <glean_parser.metrics.Timespan object at 0x7ff73e0fd2b0>), ('fail...metrics.Counter object at 0x7ff73de81e80>), ('uri_count', <glean_parser.metrics.Counter object at 0x7ff73de81470>)]))]) = <glean_parser.util.keep_value.<locals>.ValueKeepingGenerator object at 0x7ff73e0fdf28>.value
[task 2022-01-10T21:43:43.365Z]  0:09.19 
[task 2022-01-10T21:43:43.365Z]  0:09.19 toolkit/components/glean/tests/pytest/test_no_expired_metrics.py:41: AssertionError
[task 2022-01-10T21:43:43.365Z]  0:09.19 =========================== 1 failed in 0.29 seconds ===========================
[task 2022-01-10T21:43:43.365Z]  0:09.19 Sorry, Glean found some glinter nits:
[task 2022-01-10T21:43:43.365Z]  0:09.19 WARNING: EXPIRED: geckoview.validation.build_id: Metric has expired. Please consider removing it.
[task 2022-01-10T21:43:43.365Z]  0:09.19 WARNING: EXPIRED: geckoview.validation.version: Metric has expired. Please consider removing it.
[task 2022-01-10T21:43:43.365Z]  0:09.19 
[task 2022-01-10T21:43:43.365Z]  0:09.19 Please fix the above nits to continue.
[task 2022-01-10T21:43:43.365Z]  0:09.19 To disable a check, add a `no_lint` parameter with a list of check names to disable.
[task 2022-01-10T21:43:43.365Z]  0:09.19 This parameter can appear with each individual metric, or at the top-level to affect the entire file.
[task 2022-01-10T21:43:43.365Z]  0:09.19 Return code from mach python-test: 1
[task 2022-01-10T21:43:43.397Z] Creating default state directory: /builds/worker/.mozbuild
[task 2022-01-10T21:43:43.397Z] Test configuration changed. Regenerating backend.
[task 2022-01-10T21:43:43.397Z] No build detected, test metadata may be incomplete.
[taskcluster 2022-01-10 21:43:43.729Z] === Task Finished ===
[taskcluster 2022-01-10 21:43:43.729Z] Unsuccessful task run with exit code: 1 completed in 21.093 seconds
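
For readers unfamiliar with why this fails permanently after a version bump: the options dict in the log carries a `custom_is_expired` callback, and the test reads the app version from `browser/config/version.txt`. Below is a minimal sketch of how a version-based expiry check could behave. It is not the actual `run_glean_parser.get_parser_options` code. The function names, the `"98"` expiry value, and the version strings are assumptions for illustration only.

```python
def is_expired(expires, app_version):
    """Hypothetical version-based expiry check (illustrative, not the in-tree code).

    `expires` is "never", "expired", or a major version number as a string.
    `app_version` is a version string such as "98.0a1".
    """
    if expires == "never":
        return False
    if expires == "expired":
        return True
    # Compare only the major version, e.g. "98.0a1" -> 98.
    app_major = int(app_version.split(".", 1)[0])
    return int(expires) <= app_major


def expired_metrics(metrics, app_version):
    """Return the sorted names of all metrics considered expired."""
    return sorted(
        name for name, expires in metrics.items() if is_expired(expires, app_version)
    )


# Assumed example data: the real `expires` values are not shown in this bug.
metrics = {
    "geckoview.validation.build_id": "98",
    "geckoview.validation.version": "98",
    "fog.initialization": "never",
}

print(expired_metrics(metrics, "97.0a1"))  # nothing expired before the bump
print(expired_metrics(metrics, "98.0a1"))  # both validation metrics expired
```

Under these assumptions the check flips from passing to failing the moment the version file is bumped, which would explain a "perma" failure rather than a true intermittent.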

Tooru, could you please take a look?
It looks similar to the issue which was tracked in Bug 1744667.

Flags: needinfo?(arai.unmht)

Those probes were added in bug 1732928.

Flags: needinfo?(arai.unmht) → needinfo?(jrediger)
Regressed by: 1732928
Has Regression Range: --- → yes

Query: https://sql.telemetry.mozilla.org/queries/82578/source?p_appid=org_mozilla_fenix

We introduced geckoview.validation.build_id and geckoview.validation.version as mirrors of the otherwise EXTRACTed metrics.
My query shows:
Up to 98% of all metric pings from Fenix that contain the legacy build ID contain the new metric as well.
Out of these the reported value is the same as the legacy metric >= 98% of the time.
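
To make the two ratios concrete, here is a rough sketch of the comparison in Python. It is not the linked SQL query. The ping layout, the key names, and the sample values are made up for illustration.

```python
def validation_ratios(pings, legacy_key, new_key):
    """Compute (coverage, match_rate) for a mirrored metric.

    coverage:   share of pings with the legacy value that also have the new one.
    match_rate: share of those pings where both values are equal.
    Either ratio is None when its denominator is zero.
    """
    with_legacy = [p for p in pings if p.get(legacy_key) is not None]
    with_both = [p for p in with_legacy if p.get(new_key) is not None]
    matching = [p for p in with_both if p[legacy_key] == p[new_key]]

    coverage = len(with_both) / len(with_legacy) if with_legacy else None
    match_rate = len(matching) / len(with_both) if with_both else None
    return coverage, match_rate


# Hypothetical pings; keys and build IDs are placeholders, not real data.
pings = [
    {"legacy_build_id": "20220101000000", "new_build_id": "20220101000000"},
    {"legacy_build_id": "20220101000000", "new_build_id": "20211231000000"},
    {"legacy_build_id": "20220101000000"},  # new metric missing
    {"new_build_id": "20220101000000"},     # no legacy value, ignored
]

coverage, match_rate = validation_ratios(pings, "legacy_build_id", "new_build_id")
print(coverage, match_rate)
```

Pings without the legacy value are left out of both denominators, which matches how the percentages above are phrased.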

The initial goal of this new metric was to validate that the gecko integration works. This seems to be the case.
One other goal would be to figure out the missing/mismatched data.
This can probably be done with the current data sufficiently for now.

Looping in :chutten for validation.
I suggest removing the metrics and filing two follow-up bugs: one to look into the data further, and one to fully migrate these two metrics.

Flags: needinfo?(jrediger) → needinfo?(chutten)

They served their purpose of validating the Gecko integration.
In follow-ups we will do a proper migration.

Assignee: nobody → jrediger
Status: NEW → ASSIGNED
Priority: P5 → P1

(In reply to Jan-Erik Rediger [:janerik] from comment #3)

Pointing the query at org_mozilla_firefox for release data, the ratio of pings with matching values starts at 0.99 and goes up from there.

Our original goals seem mostly covered. The only thing that remains is to migrate the extracted metrics so that we report GV's build_id and version in Fenix natively. And that can happen later.

Flags: needinfo?(chutten)
Pushed by jrediger@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/01e58b5ab435 Remove now-expired validation metrics. r=chutten
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 98 Branch