Closed Bug 1428330 Opened 6 years ago Closed 6 years ago

Datadog performance metrics for a10n

Categories

(Localization Infrastructure and Tools :: Automation, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Pike, Assigned: Pike)

Attachments

(1 file)

43 bytes, text/x-github-pull-request
miles
: feedback+
It'd be nice to know how our elmo automation performs in general. In particular, as we're moving to a new infrastructure, let's start collecting metrics so that we can find out whether we're getting better or worse.

Talking to Miles last year, we figured that datadog would be good to use.

I'd like to start doing so on l10n-dashboard2.webapp.scl3.mozilla.com, so that we get baseline metrics.
Attached file PR to add metrics
Miles, can you help with this? I don't expect an honest code review, but could you look at the config parts in particular? Would you know of someone you'd like to look at this?

Also, for both you and Eric, what do we need to do to get this up on l10n-dashboard2.webapp.scl3.mozilla.com?
Attachment #8940192 - Flags: feedback?(miles)
I've not installed the Datadog agent, but is it all in the virtualenv, as that PR makes it look? Even if it isn't, I assume installing it at the OS level would be fine; just spin off a bug for whatever you want us to do there.
The Datadog agent itself should probably be installed on the elmo hosts directly. By default, the Datadog agent listens on localhost:8125 - we use the default in cloudops' infra.
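For illustration, here is a minimal sketch of what reporting a timing to an agent on localhost:8125 can look like, using only the Python standard library and the plain-text DogStatsD datagram format (`name:value|ms|#tags`). The metric name and tag in the usage note are hypothetical and are not taken from the attached PR, which may well use a client library instead.

```python
import socket


def format_timing(name, millis, tags=None):
    """Build a DogStatsD timer datagram: name:value|ms|#tag1,tag2"""
    datagram = f"{name}:{millis}|ms"
    if tags:
        datagram += "|#" + ",".join(tags)
    return datagram


def send_timing(name, millis, tags=None, host="localhost", port=8125):
    """Send one timer sample over UDP to the agent (fire and forget)."""
    payload = format_timing(name, millis, tags).encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

A call such as `send_timing("elmo.example.duration", 1250, ["env:staging"])` would emit one sample of 1250 ms. Because the transport is UDP, the call does not fail if no agent is listening, which keeps the instrumented code independent of the agent's availability.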
Attachment #8940192 - Flags: feedback?(miles) → feedback+
These work now, https://app.datadoghq.com/dash/integration/custom%3Aelmo?live=true&tpl_var_scope=host%3Al10n-dashboard2.webapp.scl3.mozilla.com&page=0&is_auto=false&from_ts=1516798178802&to_ts=1516812578802&tile_size=m.

I'm still wondering about the timings in ms, and whether there are tricks to set up the dashboards to report more human-readable numbers than 3M ms. Miles, do you know?
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
For calculations to be done on the metrics, you'll need to create a timeboard instead of a custom metrics display.
I've gone ahead and done that here: https://app.datadoghq.com/dash/512513/elmo-scl3-prod.

I copied your custom metrics and overlaid related timing metrics on the same graphs. If you hover over the lines, you can see which lines belong to which metrics. It looks like you got the timings in minutes - nice! If in the future you want to do more calculations, you can click the + icon to the right of a metric and choose options there.
I actually fixed this manually: I went into each metric and set its unit to milliseconds; see https://app.datadoghq.com/metric/summary?filter=end_to for examples. After that, datadog started showing human-readable numbers.

I've also created two dashboards, both named elmo-ish, similar to yours ;-)
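As a sanity check on the numbers discussed in this thread, the arithmetic behind a human-readable duration can be sketched as below. This is only an illustration of the unit conversion, not a description of how Datadog formats values internally; the function name `humanize_ms` is made up for this example.

```python
def humanize_ms(millis):
    """Render a millisecond count as hours, minutes, seconds and ms."""
    seconds, ms = divmod(int(millis), 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    parts = []
    if hours:
        parts.append(f"{hours}h")
    if minutes:
        parts.append(f"{minutes}min")
    if seconds:
        parts.append(f"{seconds}s")
    # Show the ms remainder, or "0ms" when the duration is zero.
    if ms or not parts:
        parts.append(f"{ms}ms")
    return " ".join(parts)


print(humanize_ms(3_000_000))  # 50min
```

So a raw value of 3M ms corresponds to 50 minutes, which matches the timings showing up in minutes once the metric unit is set.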