Make AWS node type available to graphite & build metadata

RESOLVED FIXED

Status

Product: Release Engineering
Component: Other
Opened: 3 years ago
Last modified: 2 years ago

People

(Reporter: (dormant account), Assigned: catlee)

Tracking

Firefox Tracking Flags: (Not tracked)


Attachments

(3 attachments, 1 obsolete attachment)

(Reporter)

Description

3 years ago
This will make diagnosing issues (e.g. m1.medium vs. m3.medium), tracking perf via graphite, and figuring out random speedups/slowdowns* easier.

http://glandium.org/blog/?p=3201
(Assignee)

Comment 1

3 years ago
Not really practical ATM since we assume that hostname == buildbot slave name, and I don't think we want to pre-allocate all possible instance type / slave # combinations in buildbot.

We've been discussing ways to break the hostname == buildbot slave name requirement, so maybe wait for that?

Or, is there another way we could make this data available to make data analysis possible?
(Reporter)

Comment 2

3 years ago
We could fake it in graphite. We'd still need an easy way to get it into build logs, etc. Could just set the machine-local hostname in /etc/hostname?
(Assignee)

Comment 3

3 years ago
The important things are getting the instance type into graphite, and also into the build metadata and logs. The instance id should also go into the build metadata and logs.
Assignee: nobody → catlee
OS: Windows 8.1 → All
Summary: Put AWS node type into hostname → Make AWS node type available to graphite & build metadata
(Assignee)

Comment 4

3 years ago
Created attachment 8385451 [details] [diff] [review]
get instance metadata and submit some of it to graphite

A few pieces here:
- a script to grab metadata from AWS's service and dump it into /etc/instance_metadata.json.
- an init service to make sure ^^ is run on boot
- a diamond collector to submit the instance type to graphite

I'll be reading the instance_metadata.json file into buildbot properties as well.
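For illustration, here is a minimal sketch of what such a metadata-gathering script might do (the actual instance_metadata.py from the attachment is not shown in this bug; the EC2 metadata endpoint is real, the aws_* property names match the build log in comment 19, but the helper names here are hypothetical):

```python
# Hypothetical sketch -- not the attachment's actual instance_metadata.py.
import json
import urllib.request

METADATA_BASE = "http://169.254.169.254/latest/meta-data/"

# Metadata keys to fetch, mapped to the property names that later show up
# in the build logs (see comment 19).
KEYS = {
    "ami-id": "aws_ami_id",
    "instance-id": "aws_instance_id",
    "instance-type": "aws_instance_type",
}

def fetch_key(key, base=METADATA_BASE, timeout=5):
    """Read one key from the EC2 instance metadata service."""
    with urllib.request.urlopen(base + key, timeout=timeout) as resp:
        return resp.read().decode("utf-8")

def build_metadata(fetch=fetch_key):
    """Collect all keys into a dict; `fetch` is injectable for testing."""
    return {name: fetch(key) for key, name in KEYS.items()}

def write_metadata(path="/etc/instance_metadata.json", fetch=fetch_key):
    """Dump the collected metadata to the JSON file the init service owns."""
    with open(path, "w") as f:
        json.dump(build_metadata(fetch), f)
```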
Attachment #8385451 - Flags: review?(rail)

Comment 5

3 years ago
Comment on attachment 8385451 [details] [diff] [review]
get instance metadata and submit some of it to graphite

Review of attachment 8385451 [details] [diff] [review]:
-----------------------------------------------------------------

::: modules/instance_metadata/files/InstanceMetadataCollector.conf
@@ +1,3 @@
> +enabled=True
> +interval=600
> +path=instance_metadata

No idea about the format, but it looks good. :)

::: modules/instance_metadata/files/instance_metadata.initd
@@ +26,5 @@
> +DESC="instance_metadata"
> +
> +CMD=/usr/local/bin/instance_metadata.py
> +OUTPUT=/etc/instance_metadata.json
> +PYTHON=/tools/python27/bin/python

Can you use ${packages::mozilla::python27::python} here so it doesn't bite us if we decide to upgrade?
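Diamond handles the wire protocol itself, but purely to illustrate what ends up in graphite: publishing a constant 1 under a type-specific metric path is one common way to record a string value like the instance type. A hypothetical helper (not code from the patch) formatting such a Graphite plaintext-protocol line:

```python
# Illustrative only -- not code from the attached patch or collector.
import time

def instance_type_metric(instance_type, prefix="instance_metadata", now=None):
    """Format a Graphite plaintext-protocol line marking this host as one
    instance of its type, e.g. 'instance_metadata.m3_medium 1 <ts>'.
    Dots are replaced because '.' separates Graphite path components."""
    ts = int(now if now is not None else time.time())
    path = "%s.%s" % (prefix, instance_type.replace(".", "_"))
    return "%s 1 %d" % (path, ts)
```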
(Assignee)

Comment 6

3 years ago
Created attachment 8385596 [details] [diff] [review]
get instance metadata and submit some of it to graphite

interdiff:

diff --git a/modules/instance_metadata/manifests/init.pp b/modules/instance_metadata/manifests/init.pp
index 2ef9418..871dfb8 100644
--- a/modules/instance_metadata/manifests/init.pp
+++ b/modules/instance_metadata/manifests/init.pp
@@ -31,7 +31,7 @@ class instance_metadata {
                     file {
                         "/etc/init.d/instance_metadata":
                             require => File["/usr/local/bin/instance_metadata.py"],
-                            source  => "puppet:///modules/instance_metadata/instance_metadata.initd",
+                            content => template("instance_metadata/instance_metadata.initd.erb"),
                             mode    => 0755,
                             owner   => "root",
                             notify  => Service["instance_metadata"];
diff --git a/modules/instance_metadata/files/instance_metadata.initd b/modules/instance_metadata/templates/instance_metadata.initd.erb
similarity index 95%
rename from modules/instance_metadata/files/instance_metadata.initd
rename to modules/instance_metadata/templates/instance_metadata.initd.erb
index 71484cd..0acb52b 100644
--- a/modules/instance_metadata/files/instance_metadata.initd
+++ b/modules/instance_metadata/templates/instance_metadata.initd.erb
@@ -27,7 +27,7 @@ DESC="instance_metadata"
 
 CMD=/usr/local/bin/instance_metadata.py
 OUTPUT=/etc/instance_metadata.json
-PYTHON=/tools/python27/bin/python
+PYTHON=<%= scope.lookupvar('::packages::mozilla::python27::python') %>
 
 test -x ${CMD} || exit 0
Attachment #8385451 - Attachment is obsolete: true
Attachment #8385451 - Flags: review?(rail)
Attachment #8385596 - Flags: review?(rail)
Attachment #8385596 - Flags: review?(rail) → review+
(Assignee)

Updated

3 years ago
Attachment #8385596 - Flags: checked-in+
(Assignee)

Comment 7

3 years ago
puppet patch in production

Comment 8

3 years ago
This should eventually be moved from node definitions to toplevel classes.  Do you have an idea which toplevel class it will move to?  What blocks moving it there now?
(Assignee)

Comment 9

3 years ago
I think ideally diamond is on all nodes, and instance_metadata already is.

We don't have diamond packaged for all our nodes, which is why the diamond includes are limited to specific node types.

John, do you recall why you put the diamond include at the node definition rather than in toplevel::slave somewhere?
Flags: needinfo?(jhopkins)

Comment 10

3 years ago
Ah, I missed that instance_metadata is in toplevel.  diamond is only temporarily in the node defs until it's implemented everywhere. Thanks for the explanation!
Flags: needinfo?(jhopkins)

Comment 11

3 years ago
Can this information be added to build logs by buildbot? (Should I file a separate bug for this?)
(Assignee)

Comment 12

3 years ago
Yes, I'm working on that as well.
(Assignee)

Comment 13

3 years ago
Pushed https://hg.mozilla.org/build/puppet/rev/23f1c183d846 to make sure the instance metadata is readable (it's mode 0600 root:root ATM)

Comment 14

3 years ago
Ubuntu nodes are running facter-1.7.5 now, so you should be able to get most of this data (and more) from facter now.

Comment 15

3 years ago
something here is in production
(Assignee)

Comment 16

3 years ago
Created attachment 8389952 [details] [diff] [review]
script to find and output metadata
Attachment #8389952 - Flags: review?(bhearsum)
(Assignee)

Comment 17

3 years ago
Created attachment 8389953 [details] [diff] [review]
call metadata script
Attachment #8389953 - Flags: review?(bhearsum)
Attachment #8389953 - Flags: review?(bhearsum) → review+
Attachment #8389952 - Flags: review?(bhearsum) → review+
(Assignee)

Updated

3 years ago
Attachment #8389952 - Flags: checked-in+
(Assignee)

Updated

3 years ago
Attachment #8389953 - Flags: checked-in+
(Assignee)

Updated

3 years ago
Depends on: 983742

Comment 18

3 years ago
in production:

bug 976415 - don't flunk on failure
bug 976415 - Make sure we can read the file before reading it
(Assignee)

Comment 19

3 years ago
Some builds now have the AWS instance data in the logs, e.g.

https://tbpl.mozilla.org/php/getParsedLog.php?id=36162353&tree=Mozilla-Inbound&full=1

========= Started set props: aws_ami_id aws_instance_id aws_instance_type (results: 0, elapsed: 0 secs) (at 2014-03-14 13:21:51.212797) =========
python tools/buildfarm/maintenance/get_instance_metadata.py
 in dir /builds/slave/m-in-l64-000000000000000000000/. (timeout 1200 secs)
 watching logfiles {}
 argv: ['python', 'tools/buildfarm/maintenance/get_instance_metadata.py']
 environment:
  CCACHE_HASHDIR=
  CVS_RSH=ssh
  G_BROKEN_FILENAMES=1
  HISTCONTROL=ignoredups
  HISTSIZE=1000
  HOME=/home/cltbld
  HOSTNAME=bld-linux64-ec2-314.build.releng.usw2.mozilla.com
  LANG=en_US.UTF-8
  LESSOPEN=|/usr/bin/lesspipe.sh %s
  LOGNAME=cltbld
  MAIL=/var/spool/mail/cltbld
  PATH=/usr/local/bin:/usr/lib64/ccache:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/home/cltbld/bin
  PWD=/builds/slave/m-in-l64-000000000000000000000
  SHELL=/bin/bash
  SHLVL=1
  TERM=linux
  TMOUT=86400
  USER=cltbld
  _=/tools/buildbot/bin/python
 using PTY: False
{"aws_ami_id": "ami-6eea8b5e", "aws_instance_id": "i-45a9394c", "aws_instance_type": "c3.xlarge"}
program finished with exit code 0
elapsedTime=0.014326
aws_ami_id: u'ami-6eea8b5e'
aws_instance_id: u'i-45a9394c'
aws_instance_type: u'c3.xlarge'


These are also set as properties which will be accessible via the build status json, etc.
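The property-setting step above could plausibly be wired up via an extract function (buildbot 0.8's SetProperty step accepts an extract_fn; the actual step configuration isn't shown in this bug). A hypothetical sketch, where the no-properties-on-error behavior mirrors the bug 976415 "don't flunk on failure" follow-up:

```python
# Hypothetical sketch: buildbot 0.8's SetProperty step accepts an
# extract_fn(rc, stdout, stderr) returning a dict of properties.
# The real step configuration is not shown in this bug.
import json

def extract_aws_properties(rc, stdout, stderr):
    """Parse the JSON that get_instance_metadata.py prints into build
    properties; return no properties (rather than failing the build)
    on any error -- cf. "don't flunk on failure" in bug 976415."""
    if rc != 0:
        return {}
    try:
        props = json.loads(stdout.strip())
    except ValueError:
        return {}
    return props if isinstance(props, dict) else {}
```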

Comment 20

3 years ago
Is this diamond information still useful? These classes were left in the node definitions "temporarily", and they cause instance_metadata to be installed, but now we want to run that from runner. I'll remove them for the moment in bug 1046926.
(Assignee)

Comment 21

3 years ago
I think the diamond info is still useful, yes.

How do you recommend we proceed?

Comment 22

3 years ago
I didn't end up changing the node definitions, but we should.

I think that the 'include diamond' should get moved to the appropriate buildslave toplevel classes, and the instance_metadata specific bits moved to diamond::instance_metadata.
(Assignee)

Comment 23

2 years ago
I think diamond has been killed.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
QA Contact: mshal
Resolution: --- → FIXED