Closed Bug 1674406 Opened 4 years ago Closed 3 years ago

add metrics generation for eliot

Categories

(Eliot :: General, enhancement, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: willkg, Assigned: willkg)

Attachments

(5 files)

When writing Eliot, I added some rough metrics and implemented the infrastructure for generating them, but I didn't spend much time thinking about which questions I wanted the metrics to answer, or making sure I had metrics that generate the data to answer those questions.

This bug covers that work.

Assignee: nobody → willkg
Status: NEW → ASSIGNED

willkg merged PR #2267: "bug 1674406: reimplement metrics in Eliot and add new ones" in 4ba06a9.

I also redid the metrics in Eliot so that they're auto-documented. This extends a proof-of-concept I did a while back. I figure we'll test it out here and, if it works, I'll switch the Tecken webapp to do the same thing. If it doesn't work, we'll remove it.
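As an illustration of what "auto-documented" metrics could look like, here is a minimal sketch, not the code from the PR. It assumes a hypothetical registry in which every metric is declared once with a type and a description, so the same declarations can validate emitted keys and generate documentation. `MetricSpec`, `METRICS`, `check_metric`, and `render_docs` are made-up names, and the "incr" type given to the two diskcache keys is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricSpec:
    """Declaration of one metric (hypothetical structure)."""

    key: str
    kind: str  # e.g. "incr", "gauge", "histogram", "timing"
    description: str


# Hypothetical registry. The keys come from this bug; the kinds and
# descriptions are assumptions made for the example.
METRICS = {
    spec.key: spec
    for spec in [
        MetricSpec(
            "eliot.diskcache.set",
            "incr",
            "Counted each time an item is added to the disk cache.",
        ),
        MetricSpec(
            "eliot.diskcache.evict",
            "incr",
            "Counted each time an item is evicted from the disk cache.",
        ),
    ]
}


def check_metric(key, kind):
    """Return the spec for a metric, or raise if it was never declared."""
    spec = METRICS.get(key)
    if spec is None:
        raise KeyError(f"undeclared metric: {key}")
    if spec.kind != kind:
        raise ValueError(f"{key} is declared as {spec.kind}, not {kind}")
    return spec


def render_docs():
    """Build plain-text documentation from the declarations."""
    return "\n".join(
        f"{spec.key} ({spec.kind}): {spec.description}"
        for spec in sorted(METRICS.values(), key=lambda s: s.key)
    )
```

Calling `render_docs()` yields one line per declared metric, so the documentation cannot drift from the declarations as long as emitting code goes through `check_metric`.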

Brian pointed out the gauges should be counters. Then he said it'd be even better if they were histograms since we're probably going to look at max/min/mean over time for them.

Changing them to histograms now.
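To illustrate why a histogram suits the max/min/mean use case better than a gauge, here is a toy sketch. It assumes simplified in-process `Gauge` and `Histogram` classes (both hypothetical); a real metrics backend aggregates on the server side and differs in detail.

```python
from statistics import mean


class Gauge:
    """Toy gauge: remembers only the most recent value."""

    def __init__(self):
        self.value = None

    def record(self, value):
        self.value = value


class Histogram:
    """Toy histogram: keeps every sample so it can be summarized."""

    def __init__(self):
        self.samples = []

    def record(self, value):
        self.samples.append(value)

    def summary(self):
        return {
            "min": min(self.samples),
            "max": max(self.samples),
            "mean": mean(self.samples),
        }


gauge = Gauge()
histogram = Histogram()
for value in [3, 10, 5]:
    gauge.record(value)
    histogram.record(value)

# The gauge only knows the last value (5); the histogram can still
# report min 3, max 10, and mean 6 for the same samples.
```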

The thing left to do here is build a dashboard.

I started to add a dashboard.

Things we want to see:

  1. symbolication v4 vs. v5 usage
  2. how long it takes to handle a symbolication request
  3. mean and 95th percentile for how long it takes to parse SYM files
  4. cache hits vs. cache misses
  5. cache churn (adding things and removing things)
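For item 3, the two statistics can be sketched as follows. This is a minimal example that assumes the nearest-rank definition of a percentile and made-up timings; dashboard backends may use a different percentile method.

```python
from statistics import mean


def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, avoiding float rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]


# Made-up parse timings in milliseconds: 19 fast parses and one slow one.
timings_ms = list(range(1, 20)) + [500]

mean_ms = mean(timings_ms)           # 34.5, pulled up by the slow parse
p95_ms = percentile(timings_ms, 95)  # 19
```

With these samples, the single slow parse pulls the mean well above what a typical parse takes, while the 95th percentile stays at 19 ms, which is one reason to graph both.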

I've got graphs for 1, 2, and 3.

Cache hits vs. cache misses can't be graphed yet: the current code emits those metrics in the diskcache get method, but that's effectively skipped because the symbolicator_resource code checks whether the item is in the cache first. Oops. I'll need to fix that.
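The problem can be reproduced in a small sketch. This is not Eliot's code: the `DiskCache` class, the two lookup helpers, and the metric keys are hypothetical, and the second helper is just one possible way to make both outcomes count.

```python
from collections import Counter


class DiskCache:
    """Toy cache whose get() is the only place hits and misses are counted."""

    def __init__(self, metrics):
        self._data = {}
        self.metrics = metrics

    def __contains__(self, key):
        return key in self._data

    def get(self, key, default=None):
        if key in self._data:
            self.metrics["cache.hit"] += 1
            return self._data[key]
        self.metrics["cache.miss"] += 1
        return default

    def set(self, key, value):
        self._data[key] = value


def lookup_check_first(cache, key, fetch):
    """Checks membership first, so get() only ever runs on a hit."""
    if key in cache:
        return cache.get(key)
    value = fetch(key)  # miss path never touches get(): no miss counted
    cache.set(key, value)
    return value


def lookup_get_first(cache, key, fetch):
    """Calls get() unconditionally, so hits and misses are both counted."""
    value = cache.get(key)
    if value is None:
        value = fetch(key)
        cache.set(key, value)
    return value


def run(lookup):
    """Look up the same key twice and return the resulting counters."""
    metrics = Counter()
    cache = DiskCache(metrics)
    for _ in range(2):
        lookup(cache, "xul.sym", lambda key: f"parsed {key}")
    return metrics
```

Here `run(lookup_check_first)` records one hit and no misses, even though the first lookup was a miss, while `run(lookup_get_first)` records one miss and one hit.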

I thought I had a graph for 5 looking at eliot.diskcache.set vs. eliot.diskcache.evict, but I'm not seeing any evict metrics. Either the cache is so enormous that I haven't hit evictions yet, or the disk cache manager isn't set up to send metrics yet. I need to look into that.

I fixed 4 (cache hits vs. misses).

I spent a bunch of time looking at 5 (cache churn). I haven't seen any evictions, but the cache manager is emitting metrics, so it's entirely possible I haven't put enough load on the server to create an eviction.
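A small sketch shows why evict metrics can be legitimately absent under light load. It assumes a toy LRU cache bounded by item count (the `LRUCache` class is hypothetical; a real disk cache would more likely be bounded by bytes on disk) and reuses the two metric keys mentioned above.

```python
from collections import Counter, OrderedDict


class LRUCache:
    """Toy LRU cache that counts sets and evictions."""

    def __init__(self, max_items, metrics):
        self.max_items = max_items
        self.metrics = metrics
        self._data = OrderedDict()

    def set(self, key, value):
        if key in self._data:
            # Refresh recency for an existing key.
            self._data.move_to_end(key)
        self._data[key] = value
        self.metrics["eliot.diskcache.set"] += 1
        # Evict least recently used items only once over capacity.
        while len(self._data) > self.max_items:
            self._data.popitem(last=False)
            self.metrics["eliot.diskcache.evict"] += 1


metrics = Counter()
cache = LRUCache(max_items=3, metrics=metrics)

for name in ["a.sym", "b.sym", "c.sym"]:
    cache.set(name, b"...")
evictions_under_light_load = metrics["eliot.diskcache.evict"]  # 0

cache.set("d.sym", b"...")
evictions_after_overflow = metrics["eliot.diskcache.evict"]  # 1
```

Until the cache exceeds its limit, only set is ever counted, so a graph of set vs. evict shows no evictions even though the eviction path is instrumented.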

Regardless, I've got a dashboard now and I think I'm going to call this good. We can do followup bugs with specific needs.

Marking as FIXED.

Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED

Moving to Eliot product.

Component: Symbolication → General
Product: Tecken → Eliot