Track influxdb cardinality
Categories
(Cloud Services :: Operations: Metrics/Monitoring, task)
Tracking
(Not tracked)
People
(Reporter: brian, Assigned: brian)
Details
We want to be able to alert on changes in the rate of cardinality increase, and to quickly identify what measurements they're occurring on. To do this, we can collect this information from influxdb regularly and write it back as measurements there.
cardinality per db
Tracking things at the database level is straightforward. For each database foo in show databases, run:
show series cardinality on foo
and store the result as influxdb.cardinality.series database=foo cardinality=N
show measurement cardinality on foo
and store the result as influxdb.cardinality.measurements database=foo cardinality=N
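A minimal Python sketch of the per-database collection, not the code that was eventually merged. It assumes database is written as a tag and cardinality as an integer field in line protocol; query_count is a hypothetical callable that runs one InfluxQL statement and returns the count it reports.

```python
from typing import Callable, Iterable, List


def quote_ident(name: str) -> str:
    """Double-quote an InfluxQL identifier, escaping backslashes and quotes."""
    return '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'


def escape_tag(value: str) -> str:
    """Escape the characters line protocol treats specially in tag values."""
    for ch in (",", "=", " "):
        value = value.replace(ch, "\\" + ch)
    return value


def database_cardinality_lines(
    databases: Iterable[str],
    query_count: Callable[[str], int],  # hypothetical: runs a statement, returns its count
) -> List[str]:
    """Build two line protocol points per database: series and measurement cardinality."""
    lines = []
    for db in databases:
        series = query_count(f"SHOW SERIES CARDINALITY ON {quote_ident(db)}")
        measurements = query_count(f"SHOW MEASUREMENT CARDINALITY ON {quote_ident(db)}")
        tag = escape_tag(db)
        lines.append(f"influxdb.cardinality.series,database={tag} cardinality={series}i")
        lines.append(f"influxdb.cardinality.measurements,database={tag} cardinality={measurements}i")
    return lines
```

The returned lines could then be written to whichever database holds the tracking measurements.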
cardinality per measurement
Tracking series cardinality at the measurement level would also be very useful, but I haven't identified a way to do that. One thing we could potentially do is, for each tag key baz in show tag keys, run:
show tag values cardinality with key = "baz"
and for each measurement bar in the result store influxdb.cardinality.tag_values database=foo measurement=bar tag=baz cardinality=N
This doesn't let us determine the series cardinality for a specific measurement. For example, if a measurement has two tags with 10 values each, the actual series cardinality could be anywhere from 10 to 100. This would let us track changes in the tag keys and values though, so it's better than nothing.
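To make the 10 to 100 range concrete, here is a small sketch of the bounds that per-tag value counts imply. It assumes every series in the measurement carries every tag key and that each count is at least 1; the function name is made up for illustration.

```python
from math import prod
from typing import Mapping, Tuple


def series_cardinality_bounds(tag_value_counts: Mapping[str, int]) -> Tuple[int, int]:
    """Return (lower, upper) bounds on series cardinality for one measurement.

    Lower bound: every value of the largest tag must appear in some series.
    Upper bound: every combination of tag values exists as a series.
    """
    counts = list(tag_value_counts.values())
    lower = max(counts, default=1)  # a measurement with no tags has one series
    upper = prod(counts)            # prod([]) is 1, matching the no-tag case
    return lower, upper


print(series_cardinality_bounds({"host": 10, "region": 10}))  # (10, 100)
```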
I am unsure if this is performant enough to actually do. In my limited testing, sometimes these show tag values cardinality queries are reasonably fast, and other times they're quite slow.
If we do this, we should bail out with an error if the number of measurements or the number of tag keys is above some threshold, to avoid having the tracking itself cause a cardinality explosion.
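A sketch of what that bail-out could look like. The limits of 200 measurements and 50 tag keys are placeholders rather than values anyone agreed on, and TrackingLimitExceeded and plan_tag_value_queries are hypothetical names.

```python
from typing import List, Sequence


class TrackingLimitExceeded(RuntimeError):
    """Raised when tracking would itself write too many series."""


def plan_tag_value_queries(
    measurements: Sequence[str],
    tag_keys: Sequence[str],
    max_measurements: int = 200,  # placeholder threshold
    max_tag_keys: int = 50,       # placeholder threshold
) -> List[str]:
    """Return one cardinality query per tag key, or raise if over a limit."""
    if len(measurements) > max_measurements:
        raise TrackingLimitExceeded(
            f"{len(measurements)} measurements exceeds limit of {max_measurements}"
        )
    if len(tag_keys) > max_tag_keys:
        raise TrackingLimitExceeded(
            f"{len(tag_keys)} tag keys exceeds limit of {max_tag_keys}"
        )
    queries = []
    for key in tag_keys:
        # Double-quote the key, escaping backslashes and embedded quotes.
        quoted = '"' + key.replace("\\", "\\\\").replace('"', '\\"') + '"'
        queries.append(f"SHOW TAG VALUES CARDINALITY WITH KEY = {quoted}")
    return queries
```

Checking both limits before issuing any query means a database that has already exploded costs nothing extra to detect.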
Comment 1 • 5 years ago (Assignee)
I added the telegraf internal plugin by default to our projects in GCP back in December, and to our projects in AWS earlier in the week. It will take some time before we have data from it for most projects.
The internal plugin will give us a measurement of series written per telegraf. This again doesn't get us actual series cardinality over time, since one telegraf could be writing the same 1000 series every minute and another telegraf could be writing a new set of 1000 series every minute, but they would both record 1000 each time.
It should still be very helpful for seeing changes in the number of metrics written for a project though, and should help us track down cardinality explosions like the one that happened with sync-rs.
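A toy simulation of the two telegraf cases described above, only to illustrate why a per-interval count of series written is not the same as cardinality. The series names are synthetic and nothing here talks to telegraf or influxdb.

```python
from typing import List, Tuple


def simulate_writer(
    intervals: int, series_per_interval: int, rotate: bool
) -> Tuple[List[int], int]:
    """Return (series written per interval, distinct series seen overall).

    rotate=False: the writer sends the same series every interval.
    rotate=True: the writer sends a brand new set of series every interval.
    """
    seen = set()
    written = []
    for i in range(intervals):
        offset = i * series_per_interval if rotate else 0
        batch = {f"series-{offset + n}" for n in range(series_per_interval)}
        written.append(len(batch))
        seen |= batch
    return written, len(seen)


print(simulate_writer(3, 1000, rotate=False))  # ([1000, 1000, 1000], 1000)
print(simulate_writer(3, 1000, rotate=True))   # ([1000, 1000, 1000], 3000)
```

Both writers report 1000 per interval, but only the second one grows the database's series cardinality.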
Comment 2 • 5 years ago (Assignee)
Thanks to help from InfluxData support, I learned that show series cardinality supports a from clause that lets us get what we want, e.g.
> show series cardinality from "cpu"
name: cpu
count
-----
92035
So we can ignore my tag key shenanigans from the bug description and instead record that for every measurement returned by show measurements.
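A rough sketch of building that query per measurement and pulling the count out of a response. The JSON shape passed to parse_count is an assumption that mirrors the CLI output above (a series named after the measurement with a single count column); both function names are hypothetical.

```python
from typing import Any, Dict, Optional


def quote_ident(name: str) -> str:
    """Double-quote an InfluxQL identifier, escaping backslashes and quotes."""
    return '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'


def series_cardinality_query(measurement: str, database: Optional[str] = None) -> str:
    """Build SHOW SERIES CARDINALITY for one measurement, optionally with ON <db>."""
    on = f" ON {quote_ident(database)}" if database else ""
    return f"SHOW SERIES CARDINALITY{on} FROM {quote_ident(measurement)}"


def parse_count(response: Dict[str, Any]) -> int:
    """Extract the count from a response of the assumed shape."""
    series = response["results"][0]["series"][0]
    column = series["columns"].index("count")
    return int(series["values"][0][column])


# Example response in the assumed shape, using the numbers from the CLI output.
example = {
    "results": [
        {"series": [{"name": "cpu", "columns": ["count"], "values": [[92035]]}]}
    ]
}
print(series_cardinality_query("cpu"))  # SHOW SERIES CARDINALITY FROM "cpu"
print(parse_count(example))             # 92035
```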
Comment 3 • 5 years ago (Assignee)
Comment 4 • 5 years ago (Assignee)
I've picked up that PR again and think it's in good shape. Once it's reviewed I'll merge it and set up the Jenkins job.
Comment 5 • 5 years ago (Assignee)
I've merged the code. Need to make some permissions changes then write the Jenkinsfile.
Comment 6 • 5 years ago (Assignee)
Brain dump: the next steps here are something like
1a) create a new influxdb user and give them perms to read from all dbs and write to svcops
1b) document granting them read for any new dbs
1c) switch creds in jenkins to that user
2) change endpoint config in jenkins to remove path
3) write and deploy jenkinsfile, scheduling daily run
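For steps 1a and 1b, a sketch that generates the InfluxQL GRANT statements for such a user. The user name cardinality_tracker is made up, and ALL is used for svcops on the assumption that the user should be able to both read and write that database.

```python
from typing import Iterable, List


def quote_ident(name: str) -> str:
    """Double-quote an InfluxQL identifier, escaping backslashes and quotes."""
    return '"' + name.replace("\\", "\\\\").replace('"', '\\"') + '"'


def grant_statements(
    user: str, databases: Iterable[str], write_database: str = "svcops"
) -> List[str]:
    """READ on every database except write_database, then ALL on write_database."""
    statements = []
    for db in databases:
        if db == write_database:
            continue  # handled below with a broader grant
        statements.append(f"GRANT READ ON {quote_ident(db)} TO {quote_ident(user)}")
    statements.append(
        f"GRANT ALL ON {quote_ident(write_database)} TO {quote_ident(user)}"
    )
    return statements


for stmt in grant_statements("cardinality_tracker", ["foo", "svcops"]):
    print(stmt)
```

Re-running this against the current output of show databases would also cover step 1b whenever a new database appears.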
Comment 7 • 4 years ago (Assignee)
Comment 8 • 4 years ago (Assignee)
This is now deployed and there's a dashboard at https://earthangel-b40313e5.influxcloud.net/d/fwyzsaZMk/influxdb-cardinality?orgId=1
Comment 9 • 4 years ago (Assignee)