Closed Bug 1570160 Opened 6 years ago Closed 6 years ago

docker-worker has funny hostname in logs in GCP

Categories

(Taskcluster :: Workers, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: dustin)

Details

Jul 30 21:50:52 docker-worker.aws-provisioner.<!DOCTYPE kernel: [ 49.015051] init: docker main process ended,

Based on Brian's work, it looks like deploy/template/var/lib/cloud/scripts/per-boot/init.sh changes the logging hostname and restarts the logger. That means the syslog for a worker is split in two: entries from before that point are logged under the original hostname, and entries after it under the new one. This explains why the logs don't appear in papertrail when worker startup fails. In fact, searching for ip-10-144-55-254 finds the first half of the logs for i-05a0eea0dd561896a.

On the other hand, after the fact, it's going to be very difficult to determine the ec2 hostname for the host (since the IP is lost), whereas the instance-id is pretty easy to find (since it matches the workerId).

In general, though, we don't know that instance IDs will match the workerId. So I think the "real" fix to this, to enable finding logs after-the-fact, is either

  • tc-worker-runner logs workerGroup / workerId on startup; or
  • workers log hostname at the beginning of a task

and then not reset the hostname at all during startup.

In the interim, I'll stop resetting the hostname in GCP, but leave the hostnames as they are in AWS until one or both of the above are complete.
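For illustration, the first option could look roughly like the sketch below: a single line, emitted once at startup, that ties the worker's identity to whatever hostname the logger is using. This is written in Node for consistency with the rest of this bug and is not the tc-worker-runner implementation; `formatIdentity` and the example `workerGroup` value are made up for the sketch.

```javascript
// Hypothetical sketch: emit one identifying line at startup so that logs can
// be matched to a worker later, whatever hostname syslog happens to use.
const os = require('os');

// formatIdentity is an illustrative helper, not part of tc-worker-runner.
function formatIdentity({ workerGroup, workerId }) {
  return `worker identity: workerGroup=${workerGroup} ` +
    `workerId=${workerId} hostname=${os.hostname()}`;
}

// Example values only; a real worker would read these from its configuration.
const identityLine = formatIdentity({
  workerGroup: 'example-group',
  workerId: 'i-05a0eea0dd561896a',
});
console.log(identityLine);
```

With a line like this in the log, searching for the workerId would turn up the hostname the rest of the log is filed under.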

> tc-worker-runner logs workerGroup / workerId on startup; or

https://github.com/taskcluster/taskcluster-worker-runner/pull/17

On a rando AWS instance:

root@ip-172-31-35-60:~# node
> require('os').hostname()
'ip-172-31-35-60'

so, I think we could just log that at task startup.
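A minimal sketch of what that might look like, assuming the worker has some hook that runs when a task starts. `logTaskStart` and its `log` parameter are hypothetical names for this example, not existing docker-worker APIs; only `os.hostname()` is a real Node call.

```javascript
const os = require('os');

// Hypothetical hook: record which host a task ran on, so the matching syslog
// stream can be found after the fact. `log` defaults to console.log here; a
// real worker would pass its own task logger.
function logTaskStart(taskId, log = console.log) {
  const line = `task ${taskId} starting on host ${os.hostname()}`;
  log(line);
  return line;
}

// Example call with a placeholder task id.
logTaskStart('example-task-id');
```

Because the hostname ends up in the task log itself, it stays discoverable even when the instance ID and workerId do not match.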

Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED