Update stable table view generation to rename url2 -> url and text2 -> text, etc.
Categories
(Data Platform and Tools :: General, enhancement, P3)
Tracking
(Not tracked)
People
(Reporter: klukas, Assigned: akomar)
References
Details
(Whiteboard: [dataplatform])
Attachments
(3 files)
See https://bugzilla.mozilla.org/show_bug.cgi?id=1737656#c13 for more detail.
Many Glean ping stable tables contain incorrect fields metrics.url, metrics.text, metrics.jwe, and metrics.labeled_rate.
We can remove these from user-facing views by updating the generate_views machinery in bigquery-etl. There's already a similar special case for cleaning fenix metrics specifically, and the logic for this would be similar.
Logic would basically be:
- If there is a field with any of the above names, remove it
- If a field metrics.url2 exists, rename it to metrics.url, same for the other 3 types
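A minimal sketch of that logic at the schema level, assuming the field names of the metrics struct are available as a plain list. The function and constant names are hypothetical and are not taken from bigquery-etl:

```python
# Hypothetical helper; names are illustrative and not taken from bigquery-etl.
SUFFIXED_TYPES = ("url", "text", "jwe", "labeled_rate")


def plan_metrics_cleanup(metric_fields):
    """Return (fields_to_drop, renames) for the fields of a `metrics` struct."""
    present = set(metric_fields)
    to_drop = []
    renames = {}
    for name in SUFFIXED_TYPES:
        suffixed = f"{name}2"
        if name in present:
            # The un-suffixed field is the incorrect one and is removed.
            to_drop.append(name)
        if suffixed in present:
            # The "2" field takes over the un-suffixed name.
            renames[suffixed] = name
    return to_drop, renames
```

For a struct with the fields string, url, url2 and text2, this would drop url and rename url2 to url and text2 to text.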
Reporter
Comment 1•3 years ago
Given that url2 and text2 fields have existed for a while now in some rally pings, we likely should implement this in two steps. In the first step, we would include the contents of metrics.url2 in both the metrics.url2 and metrics.url positions, so that the metrics can be accessed using either name. Then, we'll announce the change to rally folks and make sure they have a bit of time to adapt before removing metrics.url2, since they may have existing queries depending on that name.
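One way the first step could look, sketched as a Python helper that builds the BigQuery expression for the metrics struct. This is an illustration only: the function name is hypothetical, and it assumes the caller already knows which fields the struct contains.

```python
# Hypothetical sketch of the transitional step; not the bigquery-etl implementation.
def transitional_metrics_expr(
    metric_fields, types=("url", "text", "jwe", "labeled_rate")
):
    """Build a SQL expression exposing each `<type>2` field under both names."""
    present = set(metric_fields)
    # Only alias types whose suffixed field actually exists in this table.
    aliases = [f"metrics.{t}2 AS {t}" for t in types if f"{t}2" in present]
    if not aliases:
        return "metrics"
    # An existing un-suffixed field must be excluded so the alias can replace it.
    excluded = [t for t in types if t in present and f"{t}2" in present]
    star = "metrics.*"
    if excluded:
        star += f" EXCEPT ({', '.join(excluded)})"
    return f"(SELECT AS STRUCT {', '.join([star] + aliases)})"
```

The result is meant to be used in a view along the lines of SELECT * REPLACE (<expr> AS metrics) FROM the stable table. Because metrics.* still carries url2, queries using either name keep working during the transition.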
Assignee
Comment 2•2 years ago
> Given that url2 and text2 fields have existed for a while now in some rally pings, we likely should implement this in two steps. In the first step, we would include the contents of metrics.url2 in both the metrics.url2 and metrics.url positions, so that the metrics can be accessed using either name. Then, we'll announce the change to rally folks and make sure they have a bit of time to adapt before removing metrics.url2 since they may have existing queries depending on that name.
:whd, how are things with Rally data? Is anyone still using it? Do you think we still need to follow the original plan outlined above?
Assignee
Comment 3•2 years ago
(In reply to Arkadiusz Komarzewski [:akomar] from comment #2)
> Given that url2 and text2 fields have existed for a while now in some rally pings, we likely should implement this in two steps. In the first step, we would include the contents of metrics.url2 in both the metrics.url2 and metrics.url positions, so that the metrics can be accessed using either name. Then, we'll announce the change to rally folks and make sure they have a bit of time to adapt before removing metrics.url2 since they may have existing queries depending on that name.
>
> :whd how are things with Rally data, is anyone still using it? Do you think we still need to follow the original plan outlined above?
It looks like we should follow the original plan here, since the url metric is already being used in some applications and the url2 field exists in their respective schemas:
➜ mozilla-pipeline-schemas git:(generated-schemas) ✗ grep -r "url2" schemas
schemas/org-mozilla-ios-firefox/topsites-impression/topsites-impression.1.bq: "name": "url2",
schemas/mdn-yari/page/page.1.bq: "name": "url2",
schemas/mdn-yari/action/action.1.bq: "name": "url2",
schemas/org-mozilla-ios-firefoxbeta/topsites-impression/topsites-impression.1.bq: "name": "url2",
schemas/org-mozilla-firefox/metrics/metrics.1.bq: "name": "url2",
schemas/org-mozilla-firefox/topsites-impression/topsites-impression.1.bq: "name": "url2",
schemas/org-mozilla-firefox/cookie-banner-report-site/cookie-banner-report-site.1.bq: "name": "url2",
schemas/org-mozilla-focus-beta/cookie-banner-report-site/cookie-banner-report-site.1.bq: "name": "url2",
schemas/rally-attention-stream/youtube-video-recommendations/youtube-video-recommendations.1.bq: "name": "url2",
schemas/rally-attention-stream/youtube-ads/youtube-ads.1.bq: "name": "url2",
schemas/rally-attention-stream/user-journey/user-journey.1.bq: "name": "url2",
schemas/rally-attention-stream/youtube-video-details/youtube-video-details.1.bq: "name": "url2",
schemas/rally-attention-stream/advertisements/advertisements.1.bq: "name": "url2",
schemas/rally-attention-stream/article-contents/article-contents.1.bq: "name": "url2",
schemas/rally-attention-stream/tracking-pixel/tracking-pixel.1.bq: "name": "url2",
schemas/org-mozilla-focus-nightly/cookie-banner-report-site/cookie-banner-report-site.1.bq: "name": "url2",
schemas/org-mozilla-focus/cookie-banner-report-site/cookie-banner-report-site.1.bq: "name": "url2",
schemas/pine/metrics/metrics.1.bq: "name": "url2",
schemas/org-mozilla-fennec-aurora/metrics/metrics.1.bq: "name": "url2",
schemas/org-mozilla-fennec-aurora/topsites-impression/topsites-impression.1.bq: "name": "url2",
schemas/org-mozilla-fennec-aurora/cookie-banner-report-site/cookie-banner-report-site.1.bq: "name": "url2",
schemas/org-mozilla-klar/cookie-banner-report-site/cookie-banner-report-site.1.bq: "name": "url2",
schemas/firefox-desktop/metrics/metrics.1.bq: "name": "url2",
schemas/rally-markup-fb-pixel-hunt/fbpixelhunt-journey/fbpixelhunt-journey.1.bq: "name": "url2",
schemas/rally-markup-fb-pixel-hunt/fbpixelhunt-event/fbpixelhunt-event.1.bq: "name": "url2",
schemas/rally-markup-fb-pixel-hunt/fbpixelhunt-pixel/fbpixelhunt-pixel.1.bq: "name": "url2",
schemas/org-mozilla-firefox-beta/metrics/metrics.1.bq: "name": "url2",
schemas/org-mozilla-firefox-beta/topsites-impression/topsites-impression.1.bq: "name": "url2",
schemas/org-mozilla-firefox-beta/cookie-banner-report-site/cookie-banner-report-site.1.bq: "name": "url2",
schemas/org-mozilla-fenix/metrics/metrics.1.bq: "name": "url2",
schemas/org-mozilla-fenix/topsites-impression/topsites-impression.1.bq: "name": "url2",
schemas/org-mozilla-fenix/cookie-banner-report-site/cookie-banner-report-site.1.bq: "name": "url2",
schemas/org-mozilla-ios-fennec/topsites-impression/topsites-impression.1.bq: "name": "url2",
schemas/org-mozilla-fenix-nightly/metrics/metrics.1.bq: "name": "url2",
schemas/org-mozilla-fenix-nightly/topsites-impression/topsites-impression.1.bq: "name": "url2",
schemas/org-mozilla-fenix-nightly/cookie-banner-report-site/cookie-banner-report-site.1.bq: "name": "url2",
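For a stricter check than a substring grep, the same search could be sketched in Python. This assumes the .bq files are JSON BigQuery schemas (a list of field objects with name and optional nested fields keys), which the grep output suggests; the function names are hypothetical.

```python
import json
from pathlib import Path


def field_names(fields):
    """Yield every field name in a BigQuery JSON schema, including nested ones."""
    for field in fields:
        yield field["name"]
        yield from field_names(field.get("fields", []))


def schemas_with_field(root, target="url2"):
    """Return sorted paths of *.bq schema files under root that define target."""
    matches = []
    for path in Path(root).rglob("*.bq"):
        # Assumption: each .bq file holds a JSON list of field definitions.
        schema = json.loads(path.read_text())
        if target in field_names(schema):
            matches.append(str(path))
    return sorted(matches)
```

Unlike the grep, this matches the field name exactly, so a field whose name merely contains url2 would not be reported.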
The next steps seem to be:
- Modify the view generation logic to automatically alias the metrics.url2 field as metrics.url, and likewise for the other fields (metrics.text, metrics.jwe, and metrics.labeled_rate). Both fields will then return exactly the same values. This should let us avoid downstream impact, since the url field should not currently be in use.
- Announce the deprecation of the url2 field and allow time for anything using it to update its references to use the url (and other) fields instead.
- Remove the fields with suffix 2 from the view.
Comment 4•2 years ago
Comment 6•2 years ago
We will wait until the above PR is merged and confirm that the change is working as intended. Once we're confident everything is in order, we will move forward with step 2 of the plan outlined above.
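One possible way to confirm the aliasing, sketched as a helper that builds a BigQuery query counting rows where the two names disagree. The function name is hypothetical, the caller is assumed to pass only types that exist in the given view, and filtering on submission_timestamp with a @submission_date parameter is an assumption about the view's columns.

```python
# Hypothetical validation sketch; not part of bigquery-etl.
def alias_check_query(view, types):
    """Build a query counting rows where metrics.<t> and metrics.<t>2 differ."""
    checks = ",\n  ".join(
        # TO_JSON_STRING lets structs be compared and treats NULL as 'null'.
        f"COUNTIF(TO_JSON_STRING(metrics.{t}) != TO_JSON_STRING(metrics.{t}2))"
        f" AS {t}_mismatches"
        for t in types
    )
    return (
        f"SELECT\n  {checks}\nFROM\n  `{view}`\n"
        "WHERE\n  DATE(submission_timestamp) = @submission_date"
    )
```

If the alias works as intended, every mismatch count should be zero.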
Comment 7•2 years ago
Comment 8•2 years ago
Comment 9•2 years ago
PR opened to carry out step one of the plan outlined above:
https://github.com/mozilla/bigquery-etl/pull/4029