Closed Bug 815758 Opened 12 years ago Closed 12 years ago

set up staging and production mozpools of pandas


(Infrastructure & Operations :: RelOps: General, task)
(Reporter: dustin, Assigned: dustin)




(2 files)

One rack (including one mobile-imaging server) for staging, backed by the staging db; the remainder for production.

Both should come from the production racks.  Let's leave the first six chassis as testing/dev for the moment, with hopes they'll eventually be available to move into a production rack.

This will require
* puppet changes to allow servers to run staging or prod
* hacks to the inventory sync so production doesn't "discover" the staging pandas

Per our meeting, the production pool will be used for both b2g and Android, at releng's discretion.  Mozpool can be used to flexibly reimage machines between those two.
Blocks: 802317
Attached patch bug815758.patch
Assignee: server-ops-releng → dustin
Attachment #685832 - Flags: review?(bugspam.Callek)
Review of attachment 685832 [details] [diff] [review]:
::: manifests/nodes.pp
@@ +73,5 @@
> +    $is_bmm_admin_host = false
> +    include toplevel::server::mozpool
> +}
> +
> +node /mobile-imaging-0(0[2-9]|10)\.p\d+\.releng\.scl1\.mozilla\.com/ {

Why use this complicated regex rather than just /mobile-imaging-\d+\.p\d+\.releng\.scl1\.mozilla\.com/ ?

Puppet node matching uses the *first* definition that matches, so -001 will match its own earlier node block and never reach this one.
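For illustration, here's a small Python sketch of that point (the hostnames and regexes are from the review above; the ordered-list model of Puppet's node matching and the block names are assumptions for this sketch, not part of the patch):

```python
import re

# The two candidate regexes from the review.
complicated = r"mobile-imaging-0(0[2-9]|10)\.p\d+\.releng\.scl1\.mozilla\.com"
simple = r"mobile-imaging-\d+\.p\d+\.releng\.scl1\.mozilla\.com"

host_001 = "mobile-imaging-001.p1.releng.scl1.mozilla.com"
host_005 = "mobile-imaging-005.p1.releng.scl1.mozilla.com"

# The complicated regex excludes -001 outright...
assert not re.fullmatch(complicated, host_001)
assert re.fullmatch(complicated, host_005)

# ...while the simple regex matches both. That is still safe in Puppet,
# because node matching stops at the first definition that matches,
# and -001 has its own earlier node block.
assert re.fullmatch(simple, host_001)
assert re.fullmatch(simple, host_005)

def first_match(hostname, definitions):
    """Toy model of Puppet's first-match node selection."""
    for pattern, name in definitions:
        if re.fullmatch(pattern, hostname):
            return name
    return None

# Definitions in manifest order: the specific -001 block first,
# then a hypothetical catch-all using the simple regex.
definitions = [
    (r"mobile-imaging-001\.p\d+\.releng\.scl1\.mozilla\.com", "node-001 block"),
    (simple, "catch-all block"),
]

print(first_match(host_001, definitions))  # node-001 block
print(first_match(host_005, definitions))  # catch-all block
```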

::: modules/mozpool/templates/config.ini.erb
@@ +3,5 @@
>  [inventory]
> +url = <%= scope.lookupvar('::mozpool::settings::inventory_url') %>
> +username = <%= scope.lookupvar('::mozpool::settings::inventory_username') %>
> +password = <%= scope.lookupvar('::mozpool::settings::inventory_password') %>

Are you sure mozpool::settings::inventory* works? I don't see it defined in settings.pp directly (it's merely included from config::secrets).
Attachment #685832 - Flags: review?(bugspam.Callek) → review+
I wasn't aware it picked the first - that's helpful.  The second of those was probably worth an r-!

Fixed and pushed.
that's landed - now to install it and add the proper config.
Attached patch bug815758.patch
Attachment #686168 - Flags: review?(jhopkins)
Comment on attachment 686168 [details] [diff] [review]

passes visual inspection
Attachment #686168 - Flags: review?(jhopkins) → review+
landed; once that's in, I'll run sync scripts on both pools and we'll be set.

Worth noting, I'm treating this as a temporary solution - we certainly don't want a whole rack of "staging pandas" long-term.  Arguably, if we decide to dedicate certain devices to non-production uses, that dedication should be done within a single mozpool cluster, the way we do for slavealloc.  Anyway, that explains why this is kind of a lame implementation.
OK, everything's synchronized.  We have two pools now.

I do *not* like this: it means the two pools have different PXE configs that we'll need to keep in sync, among other problems.

So, let's plan to use the staging pool for a month or so at most, until we're pretty confident that our tooling's not going to wreak mayhem in production, and then re-merge all of the pandas into the same pool.  As we've discussed, we can use other means to allocate pandas to particular purposes, later.
Closed: 12 years ago
Resolution: --- → FIXED
Blocks: 816237
This is being reverted in bug 818995
Component: Server Operations: RelEng → RelOps
Product: → Infrastructure & Operations