Closed Bug 1811928 Opened 2 years ago Closed 2 years ago

Android Components: get metrics/pings.yaml for crash pings from mozilla-central

Categories

(Firefox for Android :: General, task)

All
Android
task

Tracking

()

RESOLVED FIXED

People

(Reporter: afranchuk, Assigned: afranchuk)

References

()

Details

Attachments

(1 file)

For bug 1810951, we'd like to add Glean-based crash pings to libcrash. Such crash pings have been recently added to mozilla-central as part of bug 1790569.

After some discussion with :chutton and :travis_, we came to the conclusion that it'd be best for the android-components build to pull the metrics/pings.yaml files from mozilla-central as part of the gradle build.

Adding a few needinfos (from git blame, etc) to determine whether this is desirable and whether there is already precedent for pulling artifacts from mozilla-central.

Flags: needinfo?(s.kaspari)
Flags: needinfo?(csadilek)

... to determine whether this is desirable and whether there is already precedent for pulling artifacts from mozilla-central.

Yes, we are doing this for [1], which is packaged as part of GeckoView [2].

[1] https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/geckoview/streaming/metrics.yaml
[2] https://searchfox.org/mozilla-central/source/mobile/android/geckoview/build.gradle#380

Pinging Jan-Erik :), who will be best to confirm if this is desirable, but technically it should be possible through GeckoView. Jan-Erik, do you have any concerns with this?

Flags: needinfo?(s.kaspari)
Flags: needinfo?(jrediger)
Flags: needinfo?(csadilek)

The setup for geckoview-streaming is a bit complex and a bit of a hack to be honest.
The Glean Gradle plugin knows how to look up a single metrics.yaml in the root of the AAR and is then able to incorporate that at build time.
The probe-scraper (the tool that consumes all metrics files and generates info for the database schemas) can handle that, because engine-gecko is a dependency and sourced from the m-c repository.

Now a similar mechanism could work for libcrash.
There's some caveats though:

  • The crash ping and crash metrics are already known as metrics of gecko (see this PR and dictionary), so we wouldn't want that same file suddenly also be reported as coming from libcrash (otherwise our tooling will complain about duplicated metrics)
    • the dependency tree will look a bit weird: libcrash won't mention its own metrics/pings.yaml, but all consumers will actually have those through gecko. You just wouldn't be able to use libcrash in apps that don't depend on gecko. That weirdness can be seen in the dictionary (notice how it's tagged gecko and Toolkit :: Crash Reporting), but that's all.
  • How the plugin handles it right now for geckoview-streaming is a hack and I'd rather like to avoid requiring changes to the plugin. We should find out if this can be done directly in libcrash's build.gradle before invoking the Glean plugin.
Flags: needinfo?(jrediger)

You just wouldn't be able to use libcrash in apps that don't depend on gecko.

Does this actually apply to all of libcrash, or could it be made to only apply to the use of the GleanCrashReporterService? Either way, I don't like the idea of this sort of hidden dependency. If I'm not mistaken the GleanCrashReporterService already relies on one such dependency: the application must be using and initializing Glean itself. Though that particular case is more reasonable. I would feel a bit better if it were called GleanGeckoCrashReporterService in that case.

I certainly don't want to pile on more hacks. I'm sure there's a way we can do it directly in build.gradle if that is preferable. In that case, would we want to also change the geckoview-streaming to do it through gradle too, or is there some extra subtlety there that I'm missing?

If I've been following along closely enough, the complexity is that I made the mistake of asking you to put your metrics|pings.yaml in gecko_metrics. This means if probe-scraper sees the same pings and the same metrics being included twice for the same application (Fenix includes gecko_metrics, libcrash will include your copy), it'll yell at us.

We might be able to move the definitions files to firefox_desktop_metrics which will allow libcrash to supply the definitions for Fenix and the component to supply them for Firefox Desktop (and no one to supply them for applications that use the redistributable portions of gecko but don't include libcrash). We shouldn't even need to coordinate this too closely, so long as we move your definitions files to firefox_desktop_metrics first.

There are subtleties here, though, so if we pursue this we should involve someone (like :relud) more knowledgeable about probe-scraper and any unsupported schema transformations that might result. They'll know whether the transformation we propose will be tricky, easy, or frustrating.

That was my mistake too, jumping the gun without fully understanding how the Fenix integration would work. It would have been more straightforward if we were able to only apply things in Gecko, but unfortunately crash reporting is a unique case since it has to do things "outside" of the application.

It does sound like moving them back to firefox_desktop would architecturally be the most appropriate.

Yes, doing it that way would also work and would solve the dependency issue, leaving us with only figuring out how to properly get that file in the a-c build.

:relud, see above? Would it be okay to move some metrics/pings from gecko to firefox_desktop, and have those files pulled into libcrash (for use in Fenix)? At this point whether those metrics are pulled in or merely replicated in libcrash are beside the point: either way if they remained in gecko we'd have a potential conflict in Fenix (as I understand it) if they have identical names.

Flags: needinfo?(dthorn)

(In reply to Alex Franchuk from comment #7)

:relud, see above? Would it be okay to move some metrics/pings from gecko to firefox_desktop, and have those files pulled into libcrash (for use in Fenix)? At this point whether those metrics are pulled in or merely replicated in libcrash are beside the point: either way if they remained in gecko we'd have a potential conflict in Fenix (as I understand it) if they have identical names.

I can actually answer here.
There's one little wrinkle here: probe-scraper needs a path to the file in the repo of lib-crash, so in https://github.com/mozilla-mobile/firefox-android. I forgot about that yesterday in my last answer.
With the file not in the repo we don't have a way to tell probe-scraper about it. probe-scraper only looks in the repository and doesn't invoke the build system.

There's no perfect solution.
Maybe the following one is okayish:

  • Have a library crash-shared in probe-scraper, that has https://github.com/mozilla/gecko-dev as the repo.
  • Make lib-crash depend on that
  • Move toolkit/components/crashes/metrics.yaml to the firefox_desktop section (should be done in the metrics_index.py, we have tooling that syncs those changes over)

probe-scraper re-uses checkous of repos, so we're not recloning gecko-dev just for that.

Flags: needinfo?(dthorn)

I was wondering how that was going to work, but wasn't familiar enough with probe-scraper so I assumed the lack of discussion meant there was some way :)

The alternative approach here, of course, would be to just have copied metrics definitions in two places. The disadvantage is naturally that all the metadata that goes along with them has to be maintained if updates occur, but the advantage is an easier way to make the rest of the tooling work. I expect over time there will be at least a few metrics that end up being exclusive to desktop/android/etc. If there's one definition file for everything, we could probably just leave comments on which are expected or not given the platform. If there are separate definitions, that property will be more obvious.

I could be swayed either way, but I always find it less error prone to avoid duplication (unless the duplicated stuff is truly static/permanent). So I think your suggestion is probably the best we'll do. Thankfully there shouldn't be too many more odd cases like this.

Depends on: 1812600

:janerik, by "Make lib-crash depend on that" do you mean depend on the gecko-dev repo? If so, I have that working.

Does adding crash-shared to probe-scraper just make it link the reports nicely? It's not clear to me how the application distinction of probe-scraper interacts with the actual pings and metric reporting. Will they be rejected if there isn't an associated library/application registered?

I have this done alongside bug 1810951 in the same commit. If it's important to have separate PRs and/or commits for them, I can separate them.

https://github.com/mozilla-mobile/firefox-android/pull/671

Assignee: nobody → afranchuk

I split the commit in two to have one just for this. https://github.com/mozilla-mobile/firefox-android/pull/686

Severity: -- → N/A

The PR has landed.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: