Socorro uses Datadog for metrics generated in the processor and infrastructure. Antenna uses Datadog as well and uses the Markus (https://github.com/willkg/markus) library to make generating metrics really easy. The Socorro webapp currently has no scaffolding for Datadog/statsd metrics. Further, the scaffolding I wrote for the processor is a hack. There are also the Statsd crashstorages, which are interesting but not convenient to use for anything other than crashstorage-centric metrics.

I want to switch all our metrics code to use Markus. Markus separates metrics generation code from metrics backend setup and configuration, which alleviates problems with bootstrapping and doing things at module import time. It also has a logging backend and test mocks, which make debugging and development much easier.

This bug covers setting up Markus in Socorro, thus making metrics much easier to do in the processor, the webapp, and anywhere else.
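To illustrate why separating metrics generation from backend configuration helps with import-time problems, here is a minimal standard-library sketch of the pattern. It is not Markus's code or API: every name in it (`configure`, `Metrics`, `LoggingBackend`, `RecordingBackend`, and the `processor.crash_processed` key) is hypothetical and chosen only for illustration.

```python
import logging

# Hypothetical names throughout; this is NOT Markus's actual code or API.
_backends = []  # stays empty until configure() is called


class LoggingBackend:
    """Backend that writes each metric to the Python logging system."""

    def __init__(self, logger_name="metrics"):
        self.logger = logging.getLogger(logger_name)

    def emit(self, kind, key, value):
        self.logger.info("%s|%s|%s", kind, key, value)


class RecordingBackend:
    """Backend that keeps metrics in a list so tests can assert on them."""

    def __init__(self):
        self.records = []

    def emit(self, kind, key, value):
        self.records.append((kind, key, value))


def configure(backends):
    """Set the active backends; intended to be called once at app startup."""
    _backends[:] = backends


class Metrics:
    """Thin interface that modules can create at import time."""

    def __init__(self, prefix):
        self.prefix = prefix

    def _emit(self, kind, key, value):
        # Backends are looked up when a metric is emitted, not when this
        # object is constructed, so construction never needs configuration.
        for backend in _backends:
            backend.emit(kind, f"{self.prefix}.{key}", value)

    def incr(self, key, value=1):
        self._emit("incr", key, value)

    def timing(self, key, value):
        self._emit("timing", key, value)


# Safe at module import time: no backend has to exist yet.
metrics = Metrics("processor")


def process_crash():
    # Hypothetical metric key, for illustration only.
    metrics.incr("crash_processed")
```

In this sketch, code that emits metrics only touches the `Metrics` object, while the application entry point (or a test) decides which backends receive them by calling `configure`. Metrics emitted before `configure` runs are silently dropped rather than raising errors.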
There are two problems with using Markus:

1. Markus doesn't support Python 2.7. I'll have to fix that. :(
2. Markus uses the latest Datadog Python library and Socorro uses a really old one that has a different API. We could either write our own Markus backend (it's not hard) or update Socorro to the latest Datadog Python library (I have no idea how hard this is).

Why do this now? This will make generating metrics as easy as logging things with the Python logging library. Having metrics be hard to do really hampers our ability to understand how Socorro performs in -prod.

Why Markus as opposed to something else? Markus gives us a really convenient API and testing tool. I looked hard for similar things and found nothing, which led me to write Markus. Having said that, I wrote it (bus factor) and I don't think anyone else uses it (lack of community, etc), so that poses some problems. Maybe supporting Python 2.7 alleviates that? Maybe marketing it more will help?

Anyhow, I think it's worth using in Socorro. Let us cast off these shackles and be hampered no more!
This issue covers supporting Python 2.7 in Markus: https://github.com/willkg/markus/issues/23
(In reply to Will Kahn-Greene [:willkg] ET needinfo? me from comment #1)
> 2. Markus uses the latest Datadog Python library and Socorro uses a really
> old one that has a different API. We could either write our own Markus
> backend (it's not hard) or update Socorro to the latest Datadog Python
> library (I have no idea how hard this is).

I was planning on doing the latter as part of bug 1306731 anyway. Depending on urgency we could just wait for that to land.
I fixed Markus to support Python 2.7 (that was super easy) and pushed out 1.0. Next, I'll look at whether that'll "just work" with Socorro or whether I have to update our datadog library, and what that entails.

I think I'm going to split the work here into two steps: "get Markus working with the webapp" and "get Markus working with the rest of Socorro". It's a little trickier to replace the existing statsd stuff in Socorro with Markus because we need to make sure we don't change any of the existing metrics keys. That'll definitely take a while, and I need to get Markus into the webapp first so we can do bug #1411991, which has some urgency behind it.
Adding metrics infrastructure to the webapp is in PR 4138: https://github.com/mozilla-services/socorro/pull/4138 After that lands, I'll look at adding Markus to Socorro apps. Then I'll look at whether we can remove the statsd crashstorage stuff. Those are lower priority, though, and can wait until after the change freeze.
Commit pushed to master at https://github.com/mozilla-services/socorro

https://github.com/mozilla-services/socorro/commit/f6acde5ebdf32532a9b174950274263052ef6f38
bug 1412590 - add Markus to webapp

This adds Markus to the webapp so we can generate metrics.