Closed Bug 913602 Opened 7 years ago Closed 7 years ago

deploy slaveapi

Categories: Release Engineering :: General, defect
Platform: x86_64 Linux
Status: RESOLVED FIXED
People: (Reporter: bhearsum, Assigned: dustin)
Attachments: (3 files, 3 obsolete files)

We need at least a dev and production instance. Not sure if a staging one matters.
Let's talk about how this will work.
Dustin and I chatted about this today. Summary:
* We'll deploy in-house instead of in AWS. Dustin made the point that most of the services we'll be talking to live in-house, so it makes more sense to deploy nearer to them. He also said that netflows will be easier this way.
* We'll set up a dedicated VM for this and proxy it through the public RelEng cluster. It will be LDAP-protected and accessible only to IT, RelEng, and Sheriffs.
** Dustin, we didn't talk about it explicitly but I assume it's possible to have both a dev and production instance?
* SlaveAPI will run as a daemon on the VMs; we'll probably need an init script for it.
* Machine set-up will be handled by PuppetAgain.
* We'll need at least the following netflows (more may come up later):
** Bugzilla https (bugzilla-dev for the slaveapi dev instance)
** Slavealloc http
** Inventory https
** All IPMI interfaces 623/udp
** All PDUs 161/udp
** All slaves ssh
** All slaves icmp ping

In the near future (probably early Q4) we'll also need a flow to a MySQL DB after we implement that portion of the app.


Dustin, did I miss anything?
As far as dev and prod, yes, we can do that.  They'd be something like https://secure.pub.build.mozilla.org/{slaveapi,slaveapi-dev}, and would proxy back to the relevant ports on the relevant hosts.  Eventually, when we build a staging releng web cluster (bug 841345), that will turn into something like https://secure-dev.pub.build.allizom.org/slaveapi.

Do you need two backend VMs for the two instances, or can you set things up in puppet to run two instances on different ports on the same VM?
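For the curious, the proxy rules would look something like this, assuming Apache mod_proxy on the public cluster (backend hostnames and the port are taken from later comments in this bug, so treat them as illustrative):
  # illustrative only; the real config lives on the releng web cluster
  ProxyPass        /slaveapi/     http://slaveapi1.srv.releng.scl3.mozilla.com:8000/
  ProxyPassReverse /slaveapi/     http://slaveapi1.srv.releng.scl3.mozilla.com:8000/
  ProxyPass        /slaveapi-dev/ http://slaveapi-dev1.srv.releng.scl3.mozilla.com:8000/
  ProxyPassReverse /slaveapi-dev/ http://slaveapi-dev1.srv.releng.scl3.mozilla.com:8000/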
(In reply to Dustin J. Mitchell [:dustin] from comment #3)
> As far as dev and prod, yes, we can do that.  They'd be something like
> https://secure.pub.build.mozilla.org/{slaveapi,slaveapi-dev}, and would
> proxy back to the relevant ports on the relevant hosts.  Eventually, when we
> build a staging releng web cluster (bug 841345), that will turn into
> something like https://secure-dev.pub.build.allizom.org/slaveapi.

Sounds good.

> Do you need two backend VMs for the two instances, or can you set things up
> in puppet to run two instances on different ports on the same VM?

We can run them on the same VM if necessary, but I'd prefer to run them on separate ones if all else is mostly equal.
Depends on: 915229
We'll use two VMs.
Depends on: 915341
Ben: both hosts are ready for your puppet work.  I added node defs with toplevel::server to get them started.
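Those node defs presumably look something like this (a sketch; the hostnames are the ones used later in this bug, and the exact manifests aren't quoted here):
  node "slaveapi-dev1.srv.releng.scl3.mozilla.com" {
      include toplevel::server
  }
  node "slaveapi1.srv.releng.scl3.mozilla.com" {
      include toplevel::server
  }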
Attached patch WIP (obsolete) — Splinter Review
This should be complete in terms of the resources being defined, but I could use some advice on the secrets. Do these go in Hiera now, or should I put them in secrets.csv?

Also wondering where all of the API URLs belong. I put the Bugzilla one as a parameter to slaveapi::instance because that definitely needs to differ between dev and production. The others (slavealloc, inventory) are currently only read from, so it may make sense to put them in moco-config.pp. There are plans to have slaveapi write to slavealloc in the future, though, so we may want to parameterize that from the start.
Attachment #806006 - Flags: feedback?(dustin)
Come to think of it, we've talked about integrating the AWS scripts too, which would mean that inventory could be read/write as well (those scripts create DNS entries when we spin up new instances).
Comment on attachment 806006 [details] [diff] [review]
WIP

Review of attachment 806006 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not sure the base/instance distinction is worthwhile here, since we're running separate staging and prod hosts.  It doesn't hurt, though.

As for configuration, that should come from moco-config.pp.  For items that differ between staging and production, you can conditionalize the values right in moco-config.pp with has_aspect("staging").
  https://wiki.mozilla.org/ReleaseEngineering/PuppetAgain/Aspects
so
  if (has_aspect('staging')) {
    $slaveapi_bugzilla_url = ..
  } else {
    $slaveapi_bugzilla_url = ..
  }
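A filled-in version of that conditional, purely for illustration (comment 0 only says the dev instance talks to bugzilla-dev over https, so these URLs are assumptions):
  if (has_aspect('staging')) {
      # assumed dev endpoint; comment 0 mentions "bugzilla-dev" for the dev instance
      $slaveapi_bugzilla_url = "https://bugzilla-dev.allizom.org/"
  } else {
      $slaveapi_bugzilla_url = "https://bugzilla.mozilla.org/"
  }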

And yes, I think parameterizing everything makes sense, rather than trying to decide piecemeal which values will and will not change.

I didn't realize this had a cleartext copy of the root password.  That's a little scary.

::: modules/slaveapi/manifests/instance.pp
@@ +19,5 @@
> +    # $inventory_username =
> +    # $bugzilla_username =
> +    # $default_domain =
> +    # $ipmi_username = 
> +    # XXX: where to get secrets from?

https://wiki.mozilla.org/ReleaseEngineering/PuppetAgain/Secrets
should answer that
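i.e., something like this (the secret name here is made up for illustration):
  # assumes PuppetAgain's secret() function, per the wiki page above
  $bugzilla_password = secret("slaveapi_bugzilla_password")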

@@ +54,5 @@
> +                "pycrypto==2.6",
> +                # unused, but one of the vendor libraries (Flask) requires it
> +                "jinja2==2.7.1",
> +                "MarkupSafe==0.18",
> +                "slaveapi==1.0",

I like!

@@ +70,5 @@
> +            content => template("credentials.json.erb"),
> +            owner => $user,
> +            group => $group,
> +            notify => Exec["$title-reload-slaveapi-server"],
> +            require => Python::Virtualenv["$basedir"];

Can you add show_diff => false to these, for when bug 865799 lands?
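i.e. (a sketch based on the resource quoted above; the resource title is assumed, since the hunk doesn't show it):
  file {
      "${credentials_file}":
          content   => template("credentials.json.erb"),
          owner     => $user,
          group     => $group,
          show_diff => false,  # keep secrets out of puppet diffs once bug 865799 lands
          notify    => Exec["$title-reload-slaveapi-server"],
          require   => Python::Virtualenv["$basedir"];
  }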

@@ +88,5 @@
> +                File["${config_file}"],
> +                File["${credentials_file}"],
> +            ],
> +            unless => "/bin/sh -c 'test -e ${basedir}/signing.pid'";
> +    }

I think these are left over?

It's probably easiest to use supervisord to run these daemons - mozpool's got an example of the puppet for that.  I think you mentioned that slaveapi daemonizes itself, so writing an initscript for it and managing it with service {} is an option, too.
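If slaveapi does daemonize itself, the service side of that option is roughly (a sketch; the initscript itself would still need to be written and shipped):
  service {
      "slaveapi":
          ensure  => running,
          enable  => true,
          require => File["/etc/init.d/slaveapi"];  # hypothetical initscript path
  }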

::: modules/slaveapi/templates/credentials.json.erb
@@ +2,5 @@
> +    "inventory": "<%=@inventory_password%>",
> +    "ipmi": "<%=@ipmi_password%>",
> +    "bugzilla": "<%=@bugzilla_password%>",
> +    "ssh": {
> +        "cltbld": ["<%=Array(@cltbld_passwords).join('","')%>"],

It might be better to json-encode these arrays, so that any funny characters get quoted properly:
  "cltbld": <%= JSON.generate(Array(@cltbld_passwords)) %>
Attachment #806006 - Flags: feedback?(dustin) → feedback+
(In reply to Dustin J. Mitchell [:dustin] from comment #9)
> Comment on attachment 806006 [details] [diff] [review]
> WIP
> 
> Review of attachment 806006 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I'm not sure the base/instance distinction is worthwhile here, since we're
> running separate staging and prod hosts.  It doesn't hurt, though.

Yeah. I was looking at doing this more like Buildbot, in terms of using an init script, but it felt silly to artificially restrict us to one instance per machine.

> As for configuration, that should come from moco-config.pp.  For items that
> differ between staging and production, you can conditionalize the values
> right in moco-config.pp with has_aspect("staging").
>   https://wiki.mozilla.org/ReleaseEngineering/PuppetAgain/Aspects
> so
>   if (has_aspect('staging')) {
>     $slaveapi_bugzilla_url = ..
>   } else {
>     $slaveapi_bugzilla_url = ..
>   }

Ah, right, I think you mentioned this to me a couple of weeks ago. Thanks for the reminder.

> And yes, I think parameterizing everything makes sense, rather than trying
> to piecemeal which will and will not change.
> 
> I didn't realize this had a cleartext copy of the root password.  That's a
> little scary.

We work around this on the signing servers by requiring passphrases to be entered on start-up. I don't think we can do that here because there are just too many different passwords required. However, one thing we could look at is encrypting credentials.json and requiring a passphrase to decrypt it at start-up. This would mean that Puppet couldn't manage the running service (except for reloading), but I don't think that's the end of the world. I can file a separate bug on this if you think that's worthwhile, or perhaps it's something better discussed in the security review.


> @@ +88,5 @@
> > +                File["${config_file}"],
> > +                File["${credentials_file}"],
> > +            ],
> > +            unless => "/bin/sh -c 'test -e ${basedir}/signing.pid'";
> > +    }
> 
> I think these are left over?

Not leftover...but I need to s/signing/slaveapi/.

> It's probably easiest to use supervisord to run these daemons - mozpool's
> got an example of the puppet for that.  I think you mentioned that slaveapi
> daemonizes itself, so writing an initscript for it and managing it with
> service {} is an option, too.

What's the advantage of using supervisord over directly running the process itself?

> 
> ::: modules/slaveapi/templates/credentials.json.erb
> @@ +2,5 @@
> > +    "inventory": "<%=@inventory_password%>",
> > +    "ipmi": "<%=@ipmi_password%>",
> > +    "bugzilla": "<%=@bugzilla_password%>",
> > +    "ssh": {
> > +        "cltbld": ["<%=Array(@cltbld_passwords).join('","')%>"],
> 
> It might be better to json-encode these arrays, so that any funny characters
> get quoted properly:
>   "cltbld": <%= JSON.generate(Array(@cltbld_passwords)) %>

Oh, sweet. I didn't know about this.
Let's save the passwords question for the secreview.  If we could use an SSH key instead, for example, then the risk is quite a bit lower (since changing an SSH key is pretty easy).  We'll see what the pros say.

Supervisord has the advantage of handling stdout/stderr nicely, and potentially automatically restarting the service when it fails.  It's also just less work :)
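For reference, a minimal sketch of what a supervisord stanza for this might look like (the program name and paths are assumptions):
  [program:slaveapi]
  ; hypothetical command and log paths -- see the slaveapi module for the real ones
  command=/builds/slaveapi/bin/slaveapi-server
  autorestart=true
  redirect_stderr=true
  stdout_logfile=/builds/slaveapi/supervisor.log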
Attached patch fully tested patch (obsolete) — Splinter Review
So this seems to be working on slaveapi-dev1. I've addressed pretty much everything you brought up, but I didn't add supervisord into the mix. I admit we'll probably want it at some point, but since all output already goes to the log, I don't think it's particularly urgent.
Attachment #806006 - Attachment is obsolete: true
Attachment #807365 - Flags: review?(dustin)
Attached patch install ipmi, snmpset (obsolete) — Splinter Review
Forgot about these runtime dependencies.
Attachment #807365 - Attachment is obsolete: true
Attachment #807365 - Flags: review?(dustin)
Attachment #807394 - Flags: review?(dustin)
Comment on attachment 807394 [details] [diff] [review]
install ipmi, snmpset

Review of attachment 807394 [details] [diff] [review]:
-----------------------------------------------------------------

These are minor points, and the patch looks fine.  Since I'll be away tomorrow, if you want to land with these addressed, go right ahead.  It seems like the easy solution is to change "staging" to "dev" in this patch and add that to the list of known aspects in the wiki.

::: manifests/moco-nodes.pp
@@ +140,5 @@
>  }
>  
>  node "slaveapi-dev1.srv.releng.scl3.mozilla.com" {
> +    $aspects = [ "staging" ]
> +    include toplevel::server::slaveapi

Should we rename this host, or rename the aspect?  "Dev" and "staging" aren't the same :)

::: modules/slaveapi/manifests/base.pp
@@ +18,5 @@
> +
> +    $root = "/builds/slaveapi"
> +
> +    file {
> +        # instances are stored with locked-down perms

Is this the case here?  I don't see mode => 0700.  Given the presence of unencrypted secrets, it couldn't hurt to add the mode back!
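e.g. (a sketch with the mode restored; only the relevant bits shown):
  file {
      # instances are stored with locked-down perms
      $root:
          ensure => directory,
          mode   => 0700;
  }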
Attachment #807394 - Flags: review?(dustin) → review+
(In reply to Dustin J. Mitchell [:dustin] from comment #14)
> Comment on attachment 807394 [details] [diff] [review]
> install ipmi, snmpset
> 
> Review of attachment 807394 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> These are minor points, and the patch looks fine.  Since I'll be away
> tomorrow, if you want to land with these addressed, go right ahead.  It
> seems like the easy solution is to change "staging" to "dev" in this patch
> and add that to the list of known aspects in the wiki.
> 
> ::: manifests/moco-nodes.pp
> @@ +140,5 @@
> >  }
> >  
> >  node "slaveapi-dev1.srv.releng.scl3.mozilla.com" {
> > +    $aspects = [ "staging" ]
> > +    include toplevel::server::slaveapi
> 
> Should we rename this host, or rename the aspect?  "Dev" and "staging"
> aren't the same :)

I'll s/staging/dev/ in the patch.
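i.e. the node def becomes:
  node "slaveapi-dev1.srv.releng.scl3.mozilla.com" {
      $aspects = [ "dev" ]
      include toplevel::server::slaveapi
  }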

> ::: modules/slaveapi/manifests/base.pp
> @@ +18,5 @@
> > +
> > +    $root = "/builds/slaveapi"
> > +
> > +    file {
> > +        # instances are stored with locked-down perms
> 
> Is this the case here?  I don't see mode => 0700.  Given the presence of
> unencrypted secrets, it couldn't hurt to add the mode back!

Good point.
Attached patch patch as landedSplinter Review
Attachment #807394 - Attachment is obsolete: true
Attachment #807775 - Flags: checked-in+
Dev and prod are both up. Still need to do some doc updates.
I found a couple of small bugs and ended up pushing a version bump in https://hg.mozilla.org/build/puppet/rev/9e5c9656c417.

I also realized that slaveapi was running as root; this patch should fix that.
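The gist of the fix is roughly this (a sketch only — the real change is in attachment 807820; the command is a placeholder):
  exec {
      "start-slaveapi":
          command => "${basedir}/bin/slaveapi-server ...",  # placeholder path
          user    => $user,    # run as the unprivileged slaveapi user, not root
          group   => $group;
  }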
Attachment #807820 - Flags: review?(bugspam.Callek)
Attachment #807820 - Flags: review?(bugspam.Callek) → review+
Attachment #807820 - Flags: checked-in+
SlaveAPI is deployed now. Docs on how to use it are tracked in bug 925137 and other improvements are tracked in other bugs in Release Engineering: Tools.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
https://secure.pub.build.mozilla.org/slaveapi/ doesn't seem to work yet.  AIUI that should proxy to http://slaveapi1.srv.releng.scl3.mozilla.com:8000/ but right now that's a 404.  Please advise?

I'll also need to add some operational docs to Mana.

Are there other bits left unfinished here?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(In reply to Dustin J. Mitchell [:dustin] from comment #20)
> https://secure.pub.build.mozilla.org/slaveapi/ doesn't seem to work yet. 
> AIUI that should proxy to http://slaveapi1.srv.releng.scl3.mozilla.com:8000/
> but right now that's a 404.  Please advise?

There's nothing served from the root currently. URLs like http://slaveapi1.srv.releng.scl3.mozilla.com:8000/results seem to work. Though, https://secure.pub.build.mozilla.org/slaveapi/results doesn't... so that seems like an issue!

> Are there other bits left unfinished here?

I don't think so...
Right, well, I haven't set that up yet :)  I'll get that done, verify, and close.
Assignee: bhearsum → dustin
Ben, can you change the HTTP port to 8080?
Attached patch switch portSplinter Review
Attachment #817130 - Flags: review?(dustin)
Attachment #817130 - Flags: review?(dustin) → review+
Attachment #817130 - Flags: checked-in+
I *think* this is done...
Status: REOPENED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Component: Tools → General