Closed Bug 1374363 Opened 8 years ago Closed 8 years ago

OCC configured hardware machines reporting to papertrail under two names

Categories

(Infrastructure & Operations :: RelOps: General, task)

Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: markco, Assigned: markco)

Started in bug 1371780

(In reply to Pete Moore [:pmoore][:pete] from comment #21)
> 6) Do you know why there was a 49 minute gap between the two logs, and why
> they are logging under different system names?

(In reply to Mark Cornmesser [:markco] from comment #23)
> Currently, I am using two machines.

(In reply to Pete Moore [:pmoore][:pete] from comment #28)
> (In reply to Mark Cornmesser [:markco] from comment #23)
> > Currently, I am using two machines.
>
> System name:
> T-W864-IX-011
>
> and
>
> System name:
> T-W864-IX-011.wintest.releng.scl3.mozilla.com
>
> are two different machines? Why do they have the same unqualified name?

There are 2 separate machines, but those are 011 and 012. I am not sure why we are sometimes getting the short name and sometimes getting a name including domain.
Assignee: relops → mcornmesser
Blocks: 1358558
Short name and FQDN separation is pretty common across a bunch of our infrastructure. Often the host comes up with the short name and then reports the FQDN after configuration is complete.
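To line up entries that belong to the same host but arrive under both names, one option is to group system names by their unqualified part. This is only a minimal sketch: `short_name` and `group_by_host` are hypothetical helpers, not part of OCC or papertrail, and it assumes that everything before the first dot identifies the host and that case does not matter.

```python
from collections import defaultdict


def short_name(system_name: str) -> str:
    """Return the unqualified host name, lower-cased for comparison."""
    return system_name.split(".", 1)[0].lower()


def group_by_host(system_names):
    """Map each unqualified host name to every system name seen for it."""
    groups = defaultdict(set)
    for name in system_names:
        groups[short_name(name)].add(name)
    return dict(groups)


# The two system names quoted earlier in this bug.
names = [
    "T-W864-IX-011",
    "T-W864-IX-011.wintest.releng.scl3.mozilla.com",
]

for host, seen in sorted(group_by_host(names).items()):
    print(host, sorted(seen))
```

Both names land in the same group, which matches the explanation that a single host reports first under its short name and later under its FQDN.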
in the case of ec2 tc windows workers, we use a deliberate naming convention in order to separate logs in papertrail and make it easy to match build logs with pt event logs. the convention is:

<hostname>.<workertype>.<datacenter>.mozilla.com (where the hostname is set to the ec2 instance id)
eg: i-00015b4e5923f29a9.gecko-t-win7-32-gpu.usw2.mozilla.com

- occ sets the domain name here: https://github.com/mozilla-releng/OpenCloudConfig/blob/34c5e02c/userdata/rundsc.ps1#L761-L782
- nxlog config makes use of the fqdn here: https://github.com/mozilla-releng/OpenCloudConfig/blob/34c5e02c/userdata/Configuration/nxlog/win10.conf#L48
- papertrail makes use of the fqdn in its filters to separate logs into workertype specific groups

for ec2 machines, it's normal to get instances reporting under two names, because during the initial boot the host still has the name of the parent ami. then occ renames and reboots it and it starts reporting under the correct name.

i don't know what's happening on hardware, but i hope that explanation explains the intentional design in occ.
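for illustration, the ec2 naming convention described above could be built and parsed roughly like this. it is a sketch in python rather than the powershell that occ actually uses, and `WorkerName`, `build_fqdn` and `parse_fqdn` are hypothetical names. it assumes exactly three labels in front of `.mozilla.com`, so names with a different shape (for example the hardware fqdn quoted in this bug) are simply not matched.

```python
from typing import NamedTuple, Optional

SUFFIX = ".mozilla.com"


class WorkerName(NamedTuple):
    hostname: str    # the ec2 instance id
    workertype: str
    datacenter: str


def build_fqdn(hostname: str, workertype: str, datacenter: str) -> str:
    """Compose <hostname>.<workertype>.<datacenter>.mozilla.com."""
    return f"{hostname}.{workertype}.{datacenter}{SUFFIX}"


def parse_fqdn(fqdn: str) -> Optional[WorkerName]:
    """Split a name that follows the convention; return None otherwise."""
    if not fqdn.endswith(SUFFIX):
        return None
    parts = fqdn[: -len(SUFFIX)].split(".")
    if len(parts) != 3 or not all(parts):
        return None
    return WorkerName(*parts)


example = build_fqdn("i-00015b4e5923f29a9", "gecko-t-win7-32-gpu", "usw2")
print(example)
print(parse_fqdn(example))
print(parse_fqdn("T-W864-IX-011"))
```

a filter that groups logs by workertype only needs the second field of the parsed name. a host still reporting under its pre-rename name would not parse and would fall outside those groups until the rename and reboot complete.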
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INVALID