Closed Bug 1170217 Opened 10 years ago Closed 10 years ago

tc-client: Record statistics (or add a hook for it)

Categories

(Taskcluster :: Services, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jonasfj, Assigned: jonasfj)

Details

Attachments

(2 files)

It would be nice if we had client side statistics for all API calls. Or well, most of them. Especially when we have cross-region workers, it's nice to know how many retries the requests takes, etc... (also getting stats per workerType, etc. would be nice). Each call to tc-client probably have a set of columns/tags: - duration // time from call started to ended, all requests included - retries // 0, if first request was successful, ... - method // claimTask, etc... - success // 0 or 1 - statusCode // http status code from last response - baseUrl // https://queue.taskcluster.net/v1 - target // queue, scheduler, aws-provisioner, The components (docker-worker, task-graph scheduler, etc.) using tc-client probably also have a few columns/tags to add: - component // docker-worker, queue, scheduler, aws-provisioner - process // process of component: server, pulse-handler, expire-tasks Some like docker-worker might have additional columns too: - workerType - provisionerId ### Option A: var queue = new taskcluster.Queue({ credentials, ..., // usual config stats: { tags: { // all component specific columns/tags component: 'docker-worker', process: 'main', workerType: '...', provisionerId: '...' } drain: new base.stats.Influx(...) } }); queue.claimTask() // will now create a stats entry in the series ClientCalls The downside to this approach is fairly tight coupling between taskcluster-client and base.stats, which isn't very generic. ### Option B: var queue = new taskcluster.Queue({ credentials, ..., // usual config stats: function(duration, retries, method, success, baseUrl, ...) { // logic that inserts it into a time series in influxdb // this logic all adds component specific columns/tags } }); Down side of (B) is that it's involves a lot of repetition, however, it's fairly generic. ### Option C: This is (B) + an auxiliary method in base.stats that will make it look like this: var queue = new taskcluster.Queue({ credentials, ..., // usual config stats: base.stats.createClientCallReporter({ tags: { // all component specific columns/tags component: 'docker-worker', process: 'main', workerType: '...', provisionerId: '...' } drain: new base.stats.Influx(...) }) }); Now we have the convenience of (A) but as generic as (B). -------------- Okay, so this bug ended up being my thought process, I think (C) is the solution. This only leaves the question, do we want and/or need to record statistics for every API call on the client side of all/most components? I think there is some value to this. Other thoughts? Note, I don't see this is as important for python or go clients to implement.
thanks for raising this Jonas! (In reply to Jonas Finnemann Jensen (:jonasfj) from comment #0) > Note, I don't see this is as important for python or go clients to implement. I think when the generic worker is actively managing a significant number of production tasks, it will become more interesting to have this data, so we should probably think about doing it when the go client is highly active in production. Agreed we shouldn't look at it yet for go client.
Attached file Github PR
I think option (C) is the clear way to go here. As garndt is presumably the first consumer he gets the review. I assume you're interested in landing this, given that docker-worker is the largest consumer. Anyways, I appreciate any feedback/suggestions etc..
Assignee: nobody → jopsen
Status: NEW → ASSIGNED
Attachment #8616949 - Flags: review?(garndt)
This is the taskcluster-base side of (C).
Attachment #8617030 - Flags: review?(garndt)
Attachment #8616949 - Flags: review?(garndt) → review+
Comment on attachment 8617030 [details] [review] Github PR for taskcluster-base Looks good. There were some sections that I only glazed over because it seems that it was just moving the series definitions to series.js and generally looked like the same code afterwards
Attachment #8617030 - Flags: review?(garndt) → review+
Use tc-base 0.7.26 and tc-client 0.22.2 and this should work :)
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Component: TaskCluster → General
Product: Testing → Taskcluster
Target Milestone: --- → mozilla41
Version: unspecified → Trunk
Resetting Version and Target Milestone that accidentally got changed...
Target Milestone: mozilla41 → ---
Version: Trunk → unspecified
Component: General → Client Libraries
Component: Client Libraries → Services
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: