Each worker has a number of useful stats which are not obvious or visible outside of the logs. We should add a read-only public endpoint which shows worker stats.

The fun stuff to add:
- uptime stats
- current time to shutdown
- currently running tasks (task IDs with links to the inspector)
- etc.

Details: We should not use too many resources, and/or we should put this behind a feature flag so lower-end nodes don't need to waste resources serving it. The primary concerns are memory (node can eat it very quickly at the high levels of concurrency the tasks generally need) and security (ideally this runs in a different process from the actual worker code).
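To make the list above concrete, here is a rough sketch of what the stats payload could look like. Nothing here comes from docker-worker itself: the `WorkerState` and `WorkerStats` shapes, the field names, and the `inspectorBaseUrl` parameter are all hypothetical and only illustrate the idea.

```typescript
// Hypothetical input: the minimal worker state the stats are derived from.
interface WorkerState {
  startedAtMs: number; // when the worker started (epoch milliseconds)
  shutdownAtMs: number | null; // planned shutdown time, or null if none is scheduled
  runningTaskIds: string[];
}

// Hypothetical output: the read-only payload a stats endpoint could serve.
interface WorkerStats {
  uptimeSeconds: number;
  secondsToShutdown: number | null;
  runningTasks: { taskId: string; inspectorUrl: string }[];
}

// Pure function: derives the payload from the state at a given time.
// inspectorBaseUrl is an assumed parameter; the real inspector URL scheme
// is not specified in this bug.
function buildWorkerStats(
  state: WorkerState,
  nowMs: number,
  inspectorBaseUrl: string
): WorkerStats {
  const secondsToShutdown =
    state.shutdownAtMs === null
      ? null
      : Math.max(0, Math.round((state.shutdownAtMs - nowMs) / 1000));
  return {
    uptimeSeconds: Math.round((nowMs - state.startedAtMs) / 1000),
    secondsToShutdown,
    runningTasks: state.runningTaskIds.map((taskId) => ({
      taskId,
      inspectorUrl: `${inspectorBaseUrl}/${encodeURIComponent(taskId)}`,
    })),
  };
}
```

Keeping this a pure function over a small state object means the snapshot is cheap to compute and holds no references into the worker, which fits the memory concern above.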
It might be nice to make this an artifact too. Normally, we only want to investigate this when a task behaved weirdly, i.e. an artifact public/stats/docker-worker.htm might be nice, so we can see statistics like this even after a run was resolved. This is easily more useful, and easier to implement, as an artifact created by docker-worker. In that case it could be hidden behind a task.payload feature flag.
I see that being useful too, but my intention is debugging while the worker runs (edge case), or cases where the worker is not shutting down or whatever, which an artifact does not cover.
Ahh... I see, that makes sense.
ah and disk space and some other stats which don't fuck up the worker as it is running...
Going to take an initial stab at this. My plan is to expose a JSON endpoint (which will run in a different process) which we can then build a UI for. How we expose the link for this is unclear, but for people working on TC this will be useful right away to inspect the running state of the worker.
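A minimal sketch of such a read-only JSON endpoint, using only Node's built-in `http` module. The `/stats` path, the `respondTo` and `createStatsServer` names, and the `getSnapshot` callback are assumptions made for illustration. How the separate process actually obtains the snapshot from the worker (IPC, a file, or something else) is left open here, just as it is in the plan above.

```typescript
import * as http from "http";

// Hypothetical snapshot type: any JSON-serializable object of stats.
type StatsSnapshot = Record<string, unknown>;

// Decides the response without touching the network, so the routing
// rules can be checked in isolation.
function respondTo(
  method: string,
  path: string,
  getSnapshot: () => StatsSnapshot
): { status: number; body: string } {
  if (method !== "GET") {
    // Read-only: anything other than GET is rejected.
    return { status: 405, body: JSON.stringify({ error: "read-only endpoint" }) };
  }
  if (path !== "/stats") {
    return { status: 404, body: JSON.stringify({ error: "not found" }) };
  }
  return { status: 200, body: JSON.stringify(getSnapshot()) };
}

// Wraps respondTo in an HTTP server. The caller decides where to listen
// and which process this runs in.
function createStatsServer(getSnapshot: () => StatsSnapshot): http.Server {
  return http.createServer((req, res) => {
    const path = (req.url ?? "/").split("?")[0]; // ignore any query string
    const { status, body } = respondTo(req.method ?? "GET", path, getSnapshot);
    res.writeHead(status, { "Content-Type": "application/json" });
    res.end(body);
  });
}
```

Something like `createStatsServer(() => latestSnapshot).listen(port)` would then be started in its own process, with `latestSnapshot` refreshed from whatever the worker reports, so a bug or memory spike in the stats server cannot take down task execution.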
Assignee: nobody → jlal
Component: TaskCluster → General
Product: Testing → Taskcluster
Component: Docker-Worker → Worker
Status: NEW → RESOLVED
Last Resolved: 5 months ago
Resolution: --- → WONTFIX