Closed Bug 1867770 Opened 2 years ago Closed 2 years ago

Enable new Glean App `mozillavpn_cirrus`

Categories

(Data Platform and Tools Graveyard :: Glean Platform, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: brizental, Unassigned)

Details

To be filled by the requester

Application ID*: mozillavpn_backend_cirrus
Application Canonical Name: Mozilla VPN Cirrus Sidecar (?)
Description: The sidecar Cirrus container for the Mozilla VPN server
Data-review response link:
Repository URL: https://github.com/mozilla/experimenter
Locations of metrics.yaml files (can be many):

  • experimenter/cirrus/server/telemetry/metrics.yaml

Locations of pings.yaml files (can be many):

  • experimenter/cirrus/server/telemetry/pings.yaml

Dependencies**:

  • glean-core

Retention Days***: 180 (The same as the Mozilla VPN)

To be filled by the Glean team

Application friendly name: my_app_name

The following are only required for products requiring encryption:

Document namespace: my-app-name

Please NI Operations on this bug to request the creation of encryption keys and an analysis project.

Since this is not VPN client telemetry, I'd suggest this not be added under the existing mozilla_vpn app grouping (which is currently all client telemetry), but rather under a new app grouping, and with a generic enough app grouping name to allow for future other VPN server telemetry to be added under it (e.g. mozilla_vpn_backend or mozilla_vpn_server).

I agree with generic app grouping, just want to make sure it ends with _cirrus to make sure it is compatible with the analysis

Hi Sean! Yeah, this does not need to be added to that group. I agree. Does that mean we need to change the app id to something else though? As Yashika mentioned, whatever is fine, but we need to end with _cirrus. That is the only constraint.

Flags: needinfo?(srose)

(In reply to Beatriz Rizental [:brizental] from comment #3)

Hi Sean! Yeah, this does not need to be added to that group. I agree. Does that mean we need to change the app id to something else though? As Yashika mentioned, whatever is fine, but we need to end with _cirrus. That is the only constraint.

I wasn't suggesting changing the app_id, I was suggesting using a generic app_name to encompass all VPN server telemetry, like the app_name "mozilla_vpn" encompasses the VPN client telemetry for desktop/legacy (app_id "mozillavpn"), Android (app_id "org.mozilla.firefox.vpn"), iOS (app_id "org.mozilla.ios.FirefoxVPN"), and iOS network extension (app_id "org.mozilla.ios.FirefoxVPN.network-extension") as separate "channels" of the VPN client app.

One main impact of the app_name is you get a BigQuery dataset with that name containing roll-up views for the app's pings that combine the data from all of the app's channels. So if you'd want the VPN server's Cirrus pings combined in the same view with other future VPN server Glean pings, then using a generic app_name for VPN server telemetry would make sense. If not, you're free to use a separate app_name just for the VPN server Cirrus pings (which is what Monitor has done).

Flags: needinfo?(srose)

Repository URL: https://github.com/mozilla/experimenter

While that's the repo where the Cirrus metrics/pings are defined, I think the repository URL in the probe-scraper app definition should be the VPN server's own repo (https://github.com/mozilla-services/guardian-website) and the Cirrus metrics/pings can be brought in by specifying the nimbus-cirrus library as a dependency (which is what Monitor has done for their Cirrus setup).

Ok, I see. I changed it to mozillavpn_backend_cirrus. With regards to the URL, that works. Is it an issue however, that the Mozilla VPN server repository is private? (Here it is: https://github.com/mozilla-services/guardian-website).

sorry I mean app_name should end with _cirrus

(In reply to Beatriz Rizental [:brizental] from comment #6)

Ok, I see. I changed it to mozillavpn_backend_cirrus. With regards to the URL, that works. Is it an issue however, that the Mozilla VPN server repository is private? (Here it is: https://github.com/mozilla-services/guardian-website).

It has been added as mozillavpn_cirrus and with the Experimenter repo URL, which is fine.

Probe scraper doesn't currently use Git auth so it couldn't access private repos, though if that becomes a requirement we could set up Git auth for it.

For the monitor we were using the monitor repo link- https://github.com/mozilla/probe-scraper/blob/a07c5ec772e3dc71a4adc94f992371b78eb0b8d6/repositories.yaml#L1354, can we do same for VPN? I mean adding git auth for it

(In reply to ykhurana from comment #9)

For the monitor we were using the monitor repo link- https://github.com/mozilla/probe-scraper/blob/a07c5ec772e3dc71a4adc94f992371b78eb0b8d6/repositories.yaml#L1354, can we do same for VPN? I mean adding git auth for it

Probably yes, though that'd take more work.

Though if we did the same setup as monitor_cirrus with nimbus-cirrus specified as a dependency and no metrics/ping files explicitly listed then maybe probe scraper wouldn't even try to access the specified repo, in which case Git auth still wouldn't (currently) be needed.

Though if we did the same setup as monitor_cirrus with nimbus-cirrus specified as a dependency and no metrics/ping files explicitly listed then maybe probe scraper wouldn't even try to access the specified repo, in which case Git auth still wouldn't (currently) be needed.

I am sorry Sean, I didn't realize nimbus-cirrus was a dependecy that existed. We should use that instead.

I also think it should not be an issue that the Guardian repository is private, given there is not nothign for probe scraper to crawl there.

sounds good to me using nimbus-cirrus dependency

Should I make changes in the repositories.yaml to reflect the above conversation?

Marlene, not sure I am the best equipped person to answer here. But I think yes.

cc [:srose].

Flags: needinfo?(srose)
Flags: needinfo?(mhirose)

Re the app ID change, I think it makes sense to do since mozillavpn has been used for client telemetry and it would be good to clearly differentiate the server/backend telemetry. Once complicating factor is that mozillavpn_cirrus datasets and tables have been deployed to BigQuery, though they don't appear to currently contain any data yet, so as long as no data has been received then deleting those datasets/tables should probably only be a minor hassle (though I can't remember offhand exactly what all is required to delete such Glean ingestion artifacts).

Re the repo/dependency change, I agree that also makes sense, so that future updates to the nimbus-cirrus dependency apply to the VPN server Cirrus telemetry. However, there is a minor risk that even with no metric/ping files specified probe scraper will fail due to it being a private repo, and we'd have to tweak probe scraper not to attempt access when there are no metric/ping files specified, or figure out the Git auth setup.

In summary, I agree with making the proposed changes, just be aware that's creating some more work to be done (hopefully not a lot).

Flags: needinfo?(srose)

FWIW, I was only referring to the following changes:

  1. Change app ID
  2. Use the nimbus-cirrus dep instead of manually pointing to those metrics and pings files

Pointing to the Guardian repository I am totally neutral about, we don't need to do it if it means any extra work.

yeah we haven't populated any data yet, I think it would be safe to do refactoring at the moment

(In reply to Sean Rose [:srose] from comment #15)

Re the repo/dependency change, I agree that also makes sense, so that future updates to the nimbus-cirrus dependency apply to the VPN server Cirrus telemetry. However, there is a minor risk that even with no metric/ping files specified probe scraper will fail due to it being a private repo, and we'd have to tweak probe scraper not to attempt access when there are no metric/ping files specified, or figure out the Git auth setup.

I've merged https://github.com/mozilla/probe-scraper/pull/708 to accomplish this, with a tweak to probe-scraper to not scrape repositories with no metrics/ping/tag files configured.

I believe this bug has now been fully resolved.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Product: Data Platform and Tools → Data Platform and Tools Graveyard
You need to log in before you can comment on or make changes to this bug.