Closed
Bug 1170217
Opened 10 years ago
Closed 10 years ago
tc-client: Record statistics (or add a hook for it)
Categories
(Taskcluster :: Services, defect)
Taskcluster
Services
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: jonasfj, Assigned: jonasfj)
Details
Attachments
(2 files)
It would be nice if we had client side statistics for all API calls.
Or well, most of them. Especially when we have cross-region workers,
it's nice to know how many retries the requests takes, etc...
(also getting stats per workerType, etc. would be nice).
Each call to tc-client probably have a set of columns/tags:
- duration // time from call started to ended, all requests included
- retries // 0, if first request was successful, ...
- method // claimTask, etc...
- success // 0 or 1
- statusCode // http status code from last response
- baseUrl // https://queue.taskcluster.net/v1
- target // queue, scheduler, aws-provisioner,
The components (docker-worker, task-graph scheduler, etc.) using tc-client
probably also have a few columns/tags to add:
- component // docker-worker, queue, scheduler, aws-provisioner
- process // process of component: server, pulse-handler, expire-tasks
Some like docker-worker might have additional columns too:
- workerType
- provisionerId
### Option A:
var queue = new taskcluster.Queue({
credentials, ..., // usual config
stats: {
tags: { // all component specific columns/tags
component: 'docker-worker',
process: 'main',
workerType: '...',
provisionerId: '...'
}
drain: new base.stats.Influx(...)
}
});
queue.claimTask() // will now create a stats entry in the series ClientCalls
The downside to this approach is fairly tight coupling between
taskcluster-client and base.stats, which isn't very generic.
### Option B:
var queue = new taskcluster.Queue({
credentials, ..., // usual config
stats: function(duration, retries, method, success, baseUrl, ...) {
// logic that inserts it into a time series in influxdb
// this logic all adds component specific columns/tags
}
});
Down side of (B) is that it's involves a lot of repetition, however, it's
fairly generic.
### Option C:
This is (B) + an auxiliary method in base.stats that will make it look
like this:
var queue = new taskcluster.Queue({
credentials, ..., // usual config
stats: base.stats.createClientCallReporter({
tags: { // all component specific columns/tags
component: 'docker-worker',
process: 'main',
workerType: '...',
provisionerId: '...'
}
drain: new base.stats.Influx(...)
})
});
Now we have the convenience of (A) but as generic as (B).
--------------
Okay, so this bug ended up being my thought process, I think (C) is
the solution.
This only leaves the question, do we want and/or need to record statistics
for every API call on the client side of all/most components?
I think there is some value to this. Other thoughts?
Note, I don't see this is as important for python or go clients to implement.
Comment 1•10 years ago
|
||
thanks for raising this Jonas!
(In reply to Jonas Finnemann Jensen (:jonasfj) from comment #0)
> Note, I don't see this is as important for python or go clients to implement.
I think when the generic worker is actively managing a significant number of production tasks, it will become more interesting to have this data, so we should probably think about doing it when the go client is highly active in production. Agreed we shouldn't look at it yet for go client.
| Assignee | ||
Comment 2•10 years ago
|
||
I think option (C) is the clear way to go here.
As garndt is presumably the first consumer he gets the review.
I assume you're interested in landing this, given that docker-worker is the largest consumer.
Anyways, I appreciate any feedback/suggestions etc..
| Assignee | ||
Comment 3•10 years ago
|
||
This is the taskcluster-base side of (C).
Attachment #8617030 -
Flags: review?(garndt)
Updated•10 years ago
|
Attachment #8616949 -
Flags: review?(garndt) → review+
Comment 4•10 years ago
|
||
Comment on attachment 8617030 [details] [review]
Github PR for taskcluster-base
Looks good. There were some sections that I only glazed over because it seems that it was just moving the series definitions to series.js and generally looked like the same code afterwards
Attachment #8617030 -
Flags: review?(garndt) → review+
| Assignee | ||
Comment 5•10 years ago
|
||
Use tc-base 0.7.26
and tc-client 0.22.2
and this should work :)
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Updated•10 years ago
|
Component: TaskCluster → General
Product: Testing → Taskcluster
Target Milestone: --- → mozilla41
Version: unspecified → Trunk
Comment 6•10 years ago
|
||
Resetting Version and Target Milestone that accidentally got changed...
Target Milestone: mozilla41 → ---
Version: Trunk → unspecified
| Assignee | ||
Updated•10 years ago
|
Component: General → Client Libraries
Updated•6 years ago
|
Component: Client Libraries → Services
You need to log in
before you can comment on or make changes to this bug.
Description
•