Update probe scraper for Mozilla Rally
Categories
(Data Platform and Tools :: Glean Platform, task, P1)
Tracking
(Not tracked)
People
(Reporter: aagarwal, Assigned: Dexter)
Details
Attachments
(5 files)
Metrics and pings definitions for Mozilla Rally have moved to a new monorepo. Also, the same set of definitions are now being used for both the Core Add-on and the Web Platform.
As such, the fields in the probe-scraper repositories.yaml should be updated:
canonical_app_name: Rally Core Add-on and Web Platform
app_description: |
The Rally Core Add-on and Web Platform orchestrate
the installation and the lifecycle of [Rally](https://rally.mozilla.org/)
studies.
url: https://github.com/mozilla-rally/rally
branch: main
metrics_files:
- web-platform/glean/metrics.yaml
ping_files:
- web-platform/glean/pings.yaml
(rest can remain as-is, from my understanding)
Assignee | ||
Updated•3 years ago
|
Comment 1•3 years ago
|
||
Updated•3 years ago
|
Comment 2•3 years ago
|
||
Looks like this introduced an incompatible schema change: https://workflow.telemetry.mozilla.org/log?dag_id=probe_scraper&task_id=mozilla_schema_generator&execution_date=2022-05-17T00%3A00%3A00%2B00%3A00
[2022-05-18 20:31:29,051] {{pod_launcher.py:149}} INFO - --- /app/validate_schema_evolution/da26f512f0f322e06a297ffa05f2cc51d3eabc72_base-revision/rally-core.demographics.1.txt
[2022-05-18 20:31:29,051] {{pod_launcher.py:149}} INFO - +++ /app/validate_schema_evolution/97fad6adf5de1a80deb777074b36a40037fbe520_generated-schemas/rally-core.demographics.1.txt
[2022-05-18 20:31:29,051] {{pod_launcher.py:149}} INFO - @@ -46,4 +46,2 @@
[2022-05-18 20:31:29,051] {{pod_launcher.py:149}} INFO - root.metrics.labeled_boolean.user_gender.[].value BOOL
[2022-05-18 20:31:29,051] {{pod_launcher.py:149}} INFO - -root.metrics.labeled_boolean.user_income.[].key STRING
[2022-05-18 20:31:29,051] {{pod_launcher.py:149}} INFO - -root.metrics.labeled_boolean.user_income.[].value BOOL
[2022-05-18 20:31:29,052] {{pod_launcher.py:149}} INFO - root.metrics.labeled_boolean.user_origin.[].key STRING
Comment 3•3 years ago
|
||
Assignee | ||
Comment 4•3 years ago
|
||
Hey folks,
looks like the rally-core-addon repository removed the user.income
on July 2021 and started sending the user.exact_income
. The monorepo metrics.yaml is now identical to the rally-core-addon one (minus one forward compatible change).
Why are we tripping on this now? How can we fix this? Is this because the monorepo did not retain the full history on the YAML files?
Comment 5•3 years ago
|
||
Is this because the monorepo did not retain the full history on the YAML files?
This is most likely the case. It looks like you were involved in a similar fix for regrets reporter so I'd recommend doing something similar:
https://github.com/mozilla-extensions/regrets-reporter/pull/28
https://github.com/mozilla/probe-scraper/pull/380
https://bugzilla.mozilla.org/show_bug.cgi?id=1744585
Assignee | ||
Comment 6•3 years ago
|
||
Hey Ashwin, Rob,
looks like that the history for the YAML files was not ported over to the monorepo. Unfortunately, to make this whole dance work, we need to carry over the history (or, rather, the commit-by-commit changes) for the two YAML files from the rally-core addon. Any chance you could kindly do it?
Then I could re-land the scraper changes.
Comment 7•3 years ago
|
||
(In reply to Alessio Placitelli [:Dexter] from comment #6)
Hey Ashwin, Rob,
looks like that the history for the YAML files was not ported over to the monorepo. Unfortunately, to make this whole dance work, we need to carry over the history (or, rather, the commit-by-commit changes) for the two YAML files from the rally-core addon. Any chance you could kindly do it?
Then I could re-land the scraper changes.
Yes we are planning to do this, will your tooling follow renames OK (e.g. git log --follow
)? I know github has problems with them.
Assignee | ||
Comment 8•3 years ago
|
||
(In reply to Robert Helmer [:rhelmer] from comment #7)
(In reply to Alessio Placitelli [:Dexter] from comment #6)
Hey Ashwin, Rob,
looks like that the history for the YAML files was not ported over to the monorepo. Unfortunately, to make this whole dance work, we need to carry over the history (or, rather, the commit-by-commit changes) for the two YAML files from the rally-core addon. Any chance you could kindly do it?
Then I could re-land the scraper changes.
Yes we are planning to do this, will your tooling follow renames OK (e.g.
git log --follow
)? I know github has problems with them.
I'm not sure. But even something as simple as exporting the commits as patches are re-applying them in order would work for this. See comment 5.
Reporter | ||
Comment 9•3 years ago
|
||
I applied patches in this PR which is now merged: https://github.com/mozilla-rally/rally/pull/24
We should be ready to re-open the probe-scraper PR.
Comment 10•3 years ago
|
||
probe scraper only checks commits in the linear history as returned by git log -- filename
, and does not specify --follow
, so https://github.com/mozilla-rally/rally/pull/24 was probably not sufficient because it needed to use a rebase merge (instead of a merge commit) to expose old versions of the file to probe scraper.
Assignee | ||
Updated•3 years ago
|
Reporter | ||
Comment 11•3 years ago
|
||
Opened a new PR with a rebase merge: https://github.com/mozilla-rally/rally/pull/26
Reporter | ||
Comment 12•3 years ago
|
||
Closed the PR; the repo should be ready now.
Comment 13•3 years ago
|
||
Updated•3 years ago
|
Assignee | ||
Comment 14•3 years ago
|
||
Hey :relud, looks like the PR from comment 13 was merged. Did it work? Can this be closed?
Reporter | ||
Comment 15•3 years ago
|
||
FWIW I am not seeing the new "unenrollment" table in the analysis environment yet, but it's been less than 24 hours.
Comment 16•3 years ago
|
||
:Dexter looks like schema deploys worked, so yeah, this can be closed.
Ashwin: yeah, I expect, it will take until a few hours into tomorrow UTC before that is available.
Comment 17•3 years ago
|
||
Ashwin: yeah, I expect, it will take until a few hours into tomorrow UTC before that is available.
FWIW I am not seeing the new "unenrollment" table in the analysis environment yet, but it's been less than 24 hours.
Ashwin pinged me to get that PR merged before today's probe scraper run so the (intentional) issue here is that Rally schema updates do not automatically propagate, see https://mana.mozilla.org/wiki/pages/viewpage.action?pageId=124977458#Rally(akaPioneer)-Schemasupdatesdon'tautomaticallypropagatetoproduction.
I've deployed these tables to production and authorized them so you should have access to the tables via the analysis environment now.
Reporter | ||
Comment 18•3 years ago
•
|
||
I just looked in the analysis environment and I do indeed see the new "unenrollment" table.
However, I now also see an "onboarding" table that I don't think was there before. I looked through the commit history and it looks like an "onboarding" ping was introduced in this commit but subsequently removed in this commit and never re-added.
I'm guessing this is happening because that second commit where the "onboarding" ping was removed is also now being ignored by probe-scraper as per :relud's necessary patch. So the schema generation process never sees that ping get removed.
For the time being, I don't think it's a blocking issue for us to have an extraneous table (correct me if I'm wrong). But it would be good to get rid of it so as not to cause confusion. Let me know how we can proceed with that, including if I should file a new bug.
Reporter | ||
Updated•3 years ago
|
Comment 19•3 years ago
|
||
I'm guessing this is happening because that second commit where the "onboarding" ping was removed is also now being ignored by probe-scraper as per :relud's necessary patch. So the schema generation process never sees that ping get removed.
I haven't investigated this hypothesis closely but in order to remove this ping we will need to update the incompatibility allowlist in
https://github.com/mozilla/mozilla-schema-generator/commits/main/incompatibility-allowlist to allow the deletion to propagate.
For the time being, I don't think it's a blocking issue for us to have an extraneous table (correct me if I'm wrong).
This is correct.
But it would be good to get rid of it so as not to cause confusion. Let me know how we can proceed with that, including if I should file a new bug.
I'll leave :relud's NI so he can make a recommendation here.
An alternate (rally-specific) way we can hide the fact that this table exists is to exclude the authorized view from being published to the analysis project. This is similar to how we treat deletion-request pings. I'd recommend we simply delete the schemas and associated tables but this would be a low effort alternative.
Comment 20•3 years ago
|
||
(In reply to Ashwin Agarwal from comment #18)
I'm guessing this is happening because that second commit where the "onboarding" ping was removed is also now being ignored by probe-scraper as per :relud's necessary patch. So the schema generation process never sees that ping get removed.
normally we don't remove tables even when the ping is removed from glean, because we still accept data from every version of the application. If we want to delete that table anyway, we should ignore all commits where the ping exists in probe scraper and add it to the incompatibility allowlist like :whd mentioned
Reporter | ||
Comment 21•3 years ago
|
||
(In reply to Daniel Thorn [:relud] from comment #20)
normally we don't remove tables even when the ping is removed from glean, because we still accept data from every version of the application. If we want to delete that table anyway, we should ignore all commits where the ping exists in probe scraper and add it to the incompatibility allowlist like :whd mentioned
Ah ok that makes sense. I just ran some queries and it looks like the table is empty, so for us it makes sense to not have it in the analysis environment. I am okay with either deleting it or hiding it, :relud whatever you'd prefer?
Comment 22•3 years ago
|
||
Comment 24•3 years ago
|
||
Comment 25•3 years ago
|
||
The tables and views have been deleted.
Assignee | ||
Updated•3 years ago
|
Description
•