Closed Bug 1898345 Opened 2 years ago Closed 2 years ago

prefix metrics with "socorro."

Categories

(Socorro :: General, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: willkg, Assigned: willkg)

References

(Blocks 1 open bug)

Attachments

(2 files)

Socorro was written a long time ago in a world with different standards. We're moving to a metrics standard where the key is prefixed with the service name. This is in addition to setting the app tag.

Antenna does this wrong. The application code emits metrics like breakpad_resource.on_post, and a STATSD_NAMESPACE configuration variable is passed to the Markus DatadogBackend, which prefixes all emitted keys with socorro.collector.

The Socorro processor, webapp, and crontabber also do this wrong. The application code emits metrics like processor.es.save_processed_crash, and nothing adds the prefix.
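To make the two current behaviors concrete, here is a minimal Python sketch. It does not use Markus; FakeStatsdBackend is a hypothetical stand-in that only mimics the namespace prefixing described above.

```python
class FakeStatsdBackend:
    """Hypothetical stand-in for a statsd backend; records emitted keys."""

    def __init__(self, namespace=""):
        self.namespace = namespace
        self.emitted = []

    def incr(self, key):
        # A backend configured with a namespace prepends it to every key.
        full_key = f"{self.namespace}.{key}" if self.namespace else key
        self.emitted.append(full_key)
        return full_key


# Antenna today: short key in the code, prefix added by the backend.
antenna_backend = FakeStatsdBackend(namespace="socorro.collector")
antenna_key = antenna_backend.incr("breakpad_resource.on_post")

# Socorro processor today: no namespace, so the key goes out unprefixed.
processor_backend = FakeStatsdBackend()
processor_key = processor_backend.incr("processor.es.save_processed_crash")
```

In the first case the key that arrives downstream has the prefix but the code does not show it; in the second case the prefix is missing entirely.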

We want to fix both of these in the GCP migration so that the application emits the complete key, we don't set a statsd_namespace, and telegraf doesn't change the key.

For example, Antenna code running in GCP would emit socorro.collector.breakpad_resource.on_post.

For example, Socorro processor code running in GCP would emit socorro.processor.es.save_processed_crash.

Also, we want these changes to take effect in GCP only. In AWS, everything should keep running as it does now.
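One way to picture the GCP-only behavior is the Python sketch below. It is an illustration under assumptions, not the code from the merged PRs: key_for_environment and the cloud_provider argument are hypothetical names, and the real services may decide which environment they are in differently.

```python
def key_for_environment(key, prefix, cloud_provider):
    """Hypothetical helper: return the key the application code emits."""
    if cloud_provider.upper() == "GCP":
        # GCP: the application emits the complete key itself, with no
        # statsd namespace and no rewriting downstream.
        return f"{prefix}.{key}"
    # AWS: keep emitting the legacy key, unchanged.
    return key


gcp_antenna = key_for_environment(
    "breakpad_resource.on_post", "socorro.collector", "GCP"
)
gcp_processor = key_for_environment(
    "processor.es.save_processed_crash", "socorro", "GCP"
)
aws_processor = key_for_environment(
    "processor.es.save_processed_crash", "socorro", "AWS"
)
```

Under these assumptions, the two GCP calls produce the keys given in the examples above, and the AWS call returns the legacy key untouched.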

willkg merged PR [mozilla-services/antenna]: bug-1898345: fix metrics key prefix (#1020) in 739b4cc.

This is the Antenna fix.

After this autodeploys to stage, I'll update:

  • sentry_scrub_error panel in Socorro stage AWS and GCP dashboards
  • all the collector-related panels in the Socorro stage GCP dashboard

I updated sentry_scrub_error panels in Socorro stage AWS and GCP dashboards.

I updated all the collector-related panels in the Socorro stage GCP dashboard.

willkg merged PR [mozilla-services/socorro]: bug-1898341, bug-1898345: add "host" tag, fix metrics key prefix (#6623) in 169ca35.

After this autodeploys to stage, I'll need to:

  • update the sentry_scrub_error panels in the Socorro stage AWS and GCP dashboards
  • update all the non-collector panels in the Socorro stage GCP dashboard

On 2024-05-24, I updated the sentry_scrub_error panels in Socorro stage AWS and GCP dashboards. I also updated all the keys in the Socorro stage GCP dashboard.

This was deployed in bug 1899547, tag v2024.05.29.

Flags: needinfo?(willkg)

The Socorro part of this is deployed, but we should wait until the Antenna part is deployed before closing this out.

Flags: needinfo?(willkg)

This was deployed for Antenna in bug 1899805, tag v2024.05.30.

Flags: needinfo?(willkg)

Keys look fine for all Socorro services in the Socorro AWS prod dashboard.

Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Flags: needinfo?(willkg)
Resolution: --- → FIXED