Closed Bug 1061589 Opened 10 years ago Closed 9 years ago

Rename of ffxbld_dsa to ffxbld_rsa in puppet/hiera

Categories

(Infrastructure & Operations :: RelOps: Puppet, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: pmoore, Assigned: rail)

References

Details

(Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/1397] )

Attachments

(6 files)

Private key ffxbld_dsa contains an rsa key, and as such needs to be renamed to ffxbld_rsa.

The source code changes for this happen in parent bug 1061188. Puppet (and hiera store?) also need(s) to be updated.

Two alternative roll out scenarios:

Scenario 1
==========
During a downtime, this change is landed in combination with bug 1061188 and bug 1061579.

Scenario 2
========== 
1) Puppet/hiera is/are updated to deliver ffxbld_dsa and ffxbld_rsa (as a copy)
2) Bug 1061188 is rolled out, to switch our tools over to use the new filename
3) A separate bug is landed to remove ffxbld_dsa from puppet/hiera

Please advise which of these two scenarios you prefer (or if you have a different idea).

Thanks,
Pete
Summary: Rename of ffxbld_dsa to ffxbld_rsa → Rename of ffxbld_dsa to ffxbld_rsa in puppet/hiera
I can do this.
Assignee: relops → bhearsum
Scenario 2 sounds like a much better idea, and easier to schedule.
This looks deceptively simple, but I think it's all we need once builder_ssh_key_prod_ffxbld_rsa is added to Hiera.
Attachment #8482842 - Flags: review?(dustin)
And by all we need, I mean all we need to start the transition. We need to strip out some ffxbld_dsa references after, and switch some other ones to point at ffxbld_rsa.
Comment on attachment 8482842 [details] [diff] [review]
add ffxbld_rsa key

Looks good.  Don't forget to make sure the new secret is in place before deploying :)
Attachment #8482842 - Flags: review?(dustin) → review+
I've deployed the new key to hiera:

[root@releng-puppet2.srv.releng.scl3.mozilla.com ~]# hiera builder_ssh_key_prod_ffxbld_dsa | eyaml encrypt --pkcs7-private-key /etc/hiera/keys/private_key.pem --pkcs7-public-key /etc/hiera/keys/public_key.pem --stdin -l builder_ssh_key_prod_ffxbld_rsa

then appended relevant part of output to /etc/hiera/secrets.eyaml

I then confirmed it worked, by running both of the following commands:

[root@releng-puppet2.srv.releng.scl3.mozilla.com ~]# hiera builder_ssh_key_prod_ffxbld_rsa
[root@releng-puppet2.srv.releng.scl3.mozilla.com ~]# hiera builder_ssh_key_prod_ffxbld_dsa

and confirming they produced the same output.

Will now work on landing Ben's patch on default so it gets picked up in the next reconfig shortly.
Comment on attachment 8482842 [details] [diff] [review]
add ffxbld_rsa key

Merged to production. Let the fireworks begin!

https://hg.mozilla.org/build/puppet/rev/dd308f23e11e
Bustage! Added another key ('buildmaster_ssh_key_ffxbld_rsa') now too...

[root@releng-puppet2.srv.releng.scl3.mozilla.com ~]# hiera builder_ssh_key_prod_ffxbld_dsa | eyaml encrypt --pkcs7-private-key /etc/hiera/keys/private_key.pem --pkcs7-public-key /etc/hiera/keys/public_key.pem --stdin -l buildmaster_ssh_key_ffxbld_rsa
Hmmm looks like this wasn't entirely successful.

On slaves I see the change, but not on masters, e.g.:

pmoore@Elisandra:~ $ ssh cltbld@buildbot-master04 'ls .ssh/ffx*'
.ssh/ffxbld_dsa
pmoore@Elisandra:~ $
Ah, I was just too quick, it hadn't propagated - it has now.
Status: NEW → ASSIGNED
Status: ASSIGNED → NEW
Thanks for doing this Pete. Is this bug done now?
Flags: needinfo?(pmoore)
Almost done, but not quite!

In hiera we have both keys, but we can't drop the old one yet as it is still being used. The challenge at the moment is getting spot instances and windows builders updated with the new ssh key, since spot instances are not running puppet, so only new ones are picking up the new key, and Windows builders are only getting the new key after a reboot, and since we have plenty of capacity for builders, at last check, not all had rebooted and picked up new key yet.

When all slaves have the new key, I have the patches to land to switch over configs to use the new key, and after that successfully lands, we can remove the old key from hiera.
Flags: needinfo?(pmoore)
Have you found windows builders without the new key? We just reimaged the entire iX fleet, so they should have all picked up the new key at that point.
Flags: needinfo?(pmoore)
Hey Amy,

Thanks for this! Let me take a look. I had decided to just upload the file in the case it was missing - sounds like I don't need to now! Will double check.

Pete
Flags: needinfo?(pmoore)
(In reply to Amy Rich [:arich] [:arr] from comment #16)
> Have you found windows builders without the new key? We just reimaged the
> entire iX fleet, so they should have all picked up the new key at that point.

Seems not to be everywhere yet:

cltbld@B-2008-IX-0157 ~
$ date; ls ~/.ssh/ffxbld_rsa
Thu Oct 16 02:50:41 PDT 2014
ls: /c/Users/cltbld/.ssh/ffxbld_rsa: No such file or directory

cltbld@B-2008-IX-0157 ~
$

Not to worry, I have a way to work around it:
https://bugzilla.mozilla.org/show_bug.cgi?id=1082535#c69

Thanks Amy!
Pete
Despite me assigning myself originally, Pete is clearly getting to this before me.
Assignee: bhearsum → pmoore
We figured out why the windows key wasn't being pushed out, so that's all fixed.
Thanks guys! The only open piece before landing again would be to make sure spot instances are updated - then I can land the patches again.

I'll try to get to that later today.
Status: NEW → ASSIGNED
Whiteboard: [kanban:engops:https://kanbanize.com/ctrl_board/6/416]
Whiteboard: [kanban:engops:https://kanbanize.com/ctrl_board/6/416] → [kanban:engops:https://kanbanize.com/ctrl_board/6/422]
Whiteboard: [kanban:engops:https://kanbanize.com/ctrl_board/6/422]
Whiteboard: [kanban:engops:https://kanbanize.com/ctrl_board/6/479]
So I think we are ready to delete the dsa key in hiera now!

All systems are now using the rsa key.
This should remove the old dsa key from the .ssh directory where it should no longer be referenced, and also fixed release runner to use the rsa key instead of the dsa key.

One observation: our staging ffxbld_dsa key hasn't been renamed or recreated as an rsa key - do you know if there is anything in the puppet changes that would also affect staging environments, that means we would need to also create an ffxbld_rsa key for staging too at this time?

That said, we don't really have a functioning staging environment at the moment with dedicated slaves and masters, so maybe we don't need to worry too much.

Please note I will not delete the entries in hiera (lines in releng-puppet2.srv.releng.scl3.mozilla.com:/etc/hiera/secrets.eyaml) until after this puppet change has landed and propagated.

I will also need to check if GPO is talking to hiera for retrieving secrets too, or has its own secret store, to make sure that I don't break anything there. There is a separate bug for making the GPO changes (bug 1061579).
Attachment #8515064 - Flags: review?(bhearsum)
Attachment #8515064 - Flags: feedback?(dustin)
Comment on attachment 8515064 [details] [diff] [review]
bug1061589_puppet_v1.patch

I've remembered Ben is on PTO this week...
Attachment #8515064 - Flags: review?(nthomas)
Attachment #8515064 - Flags: review?(bhearsum)
Attachment #8515064 - Flags: feedback?(dustin)
Comment on attachment 8515064 [details] [diff] [review]
bug1061589_puppet_v1.patch

Since this is very puppet I'll pass it on to rail.
Attachment #8515064 - Flags: review?(nthomas) → review?(rail)
Coincidentally, I happened to notice there is an issue on staging - we're not pushing ~/.ssh/ffxbld_rsa with puppet, but are referencing it buildbot etc, so uploads to dev-stage01 fail.
(In reply to Nick Thomas [:nthomas] from comment #26)
> Coincidentally, I happened to notice there is an issue on staging - we're
> not pushing ~/.ssh/ffxbld_rsa with puppet, but are referencing it buildbot
> etc, so uploads to dev-stage01 fail.

Ah, I feared this (see comment 23 observation). :(

I will roll out a new ffxbld_rsa key for staging too. I think this is the most reasonable solution to the problem.

Also, looks like there are new dependencies on ffxbld_dsa that have been added since patches in bug 1061188 first landed, so I will get those fixed too (e.g. https://hg.mozilla.org/build/buildbot-configs/rev/02f134ab655b).

Will cancel the review request for now, until these two tasks have been completed.
Attachment #8515064 - Flags: review?(rail)
Also, thanks to Rail, for spotting the references to ffxbld_dsa in mxr!
Whiteboard: [kanban:engops:https://kanbanize.com/ctrl_board/6/479] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/1387] [kanban:engops:https://kanbanize.com/ctrl_board/6/479]
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/1387] [kanban:engops:https://kanbanize.com/ctrl_board/6/479] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/1388] [kanban:engops:https://kanbanize.com/ctrl_board/6/479]
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/1388] [kanban:engops:https://kanbanize.com/ctrl_board/6/479] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/1390] [kanban:engops:https://kanbanize.com/ctrl_board/6/479]
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/1390] [kanban:engops:https://kanbanize.com/ctrl_board/6/479] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/1397] [kanban:engops:https://kanbanize.com/ctrl_board/6/479]
I've created a new ffxbld_rsa staging key, and added it to hiera with the name:

builder_ssh_key_staging_ffxbld_rsa

I've added it to our secrets repo too.

I will now create a puppet patch for using this in staging.
So this should hopefully fix your staging issue Jordan...

I've added the new key to hiera.

If this all goes fine, we can probably clean up hiera as there are a bunch of old/renamed keys in there that have been superseded.

Explanation of change:
When I previously created the new production ffxbld_rsa key to replace the previous production ffxbld_dsa key, I overlooked staging. To resolve this, I now also generated a new ffxbld_rsa key for staging, added it to hiera, added it to our private secrets git repo, and this puppet patch should start using it.
Attachment #8517437 - Flags: review?(jlund)
Attachment #8517437 - Flags: feedback?(dustin)
Attachment #8517437 - Flags: feedback?(dustin) → feedback+
The new key will need to be added to ffxbld@dev-stage01, and possibly other places.
(In reply to Nick Thomas [:nthomas] from comment #31)
> The new key will need to be added to ffxbld@dev-stage01, and possibly other
> places.

Ah thanks Nick, I was wondering about this. I figured this would be updating the private key on the slaves, so they could open ssh connections to <user@machines> but I didn't know which <user@machines> would need the public key added to authorized_keys on the other end of the connection for this to actually work. I then thought maybe the target machines that the slaves connect to might also get this key data from the same ssh_keys class where they were added for the slaves, but I guess this is not the case. Anyway, that clears up this mystery, many thanks!
Comment on attachment 8517437 [details] [diff] [review]
bug1061589_puppet_new-staging-key_v1.patch

Review of attachment 8517437 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm with heira key being available.

remind me, in puppet, if you don't manage a file anymore (e.g. ffxbld_dsa), that file will continue to exist on slaves until they are re-imaged correct?

I guess that's not really an issue though :)
Attachment #8517437 - Flags: review?(jlund) → review+
Whiteboard: [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/1397] [kanban:engops:https://kanbanize.com/ctrl_board/6/479] → [kanban:engops:https://mozilla.kanbanize.com/ctrl_board/6/1397]
(In reply to Jordan Lund (:jlund) from comment #33)
> Comment on attachment 8517437 [details] [diff] [review]
> bug1061589_puppet_new-staging-key_v1.patch
> 
> Review of attachment 8517437 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> lgtm with heira key being available.
> 
> remind me, in puppet, if you don't manage a file anymore (e.g. ffxbld_dsa),
> that file will continue to exist on slaves until they are re-imaged correct?
> 
> I guess that's not really an issue though :)

It is a good point - we can explicitly make sure it does not exist (i.e. gets removed by puppet). I think safest to do that so that new dependencies do not get introduced on a key that happens to exist on a slave a person looks at. Thanks Jordan, good catch!
Dustin,

Am I right in thinking that puppet will automatically delete files in ~/.ssh/ that it is not controlling, for all users? I have a vague memory that puppet intentionally deletes anything that it is not responsible for from ~/.ssh/ as a security measure, and if that is the case, I won't have to do any fancy work as per comment 34.

Thanks!
Pete
Flags: needinfo?(dustin)
In that particular directory, your memory is correct, yes.
(In reply to Nick Thomas [:nthomas] from comment #31)
> The new key will need to be added to ffxbld@dev-stage01, and possibly other
> places.

Ben has kindly updated ffxbld@dev-stage01 - I'll raise a request for getting hg updated.
Depends on: 1102252
Flags: needinfo?(dustin)
OK, so last part to do here is to remove the old keys from hiera, and to remove references from puppet to the old hiera keys, if any still exist.
Comment on attachment 8515064 [details] [diff] [review]
bug1061589_puppet_v1.patch

(In reply to Pete Moore [:pete][:pmoore] from comment #27)

> Will cancel the review request for now, until these two tasks have been
> completed.

Re-requesting review, since no longer blocked on other steps.
Attachment #8515064 - Flags: review?(rail)
Comment on attachment 8515064 [details] [diff] [review]
bug1061589_puppet_v1.patch

Review of attachment 8515064 [details] [diff] [review]:
-----------------------------------------------------------------

The patch looks good to me. Let's coordinate the ladning since release-runner may need a restart and we don't want to do that in the middle of a release job.
Attachment #8515064 - Flags: review?(rail) → review+
(In reply to Rail Aliiev [:rail] from comment #42)
> Comment on attachment 8515064 [details] [diff] [review]
> bug1061589_puppet_v1.patch
> 
> Review of attachment 8515064 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> The patch looks good to me. Let's coordinate the ladning since
> release-runner may need a restart and we don't want to do that in the middle
> of a release job.

Thanks Rail!

When would be a good time for landing it?

Thanks,
Pete
I'd say now, but https://bugzilla.mozilla.org/show_bug.cgi?id=1061188#c44 is not in production yet. Let's coordinate it when it's in production.
Thanks Rail!

Ben says he has just done a reconfig, so we might be good to go...
(In reply to Pete Moore [:pete][:pmoore] from comment #45)
> Thanks Rail!
> 
> Ben says he has just done a reconfig, so we might be good to go...

I did a reconfig for the merge, which means I transplanted the one changeset I needed. This was not included because I didn't want to take on any unnecessary risk.
IF there is no rush, let's land it after the PDX work week. We have a bunch of releases this week.
Ah thanks Ben for the clarification - sorry for jumping to conclusions, and Rail, sure no problem we can delay; there is no rush.
Assignee: pmoore → relops
Hey Rail,

I spotted this bug is still open - based on comment 47, I guess we can land now. Would you be able to take care of getting this landed such that the landing doesn't interfere with any ongoing releases?

Thanks!
Pete
Flags: needinfo?(rail)
Sure, I'll take a look at this next week.
Assignee: relops → rail
Flags: needinfo?(rail)
Attached patch mozharness.diffSplinter Review
:catlee, do we need to pin this in-tree?
Attachment #8595993 - Flags: review?(catlee)
Attached patch configs.diffSplinter Review
Attachment #8595996 - Flags: review?(bhearsum)
Attached patch puppet.diffSplinter Review
This can be landed after the previous 2 patches are in production. Additionally, release-runner should be restarted after this lands.
Attachment #8596000 - Flags: review?(bugspam.Callek)
Attachment #8595996 - Flags: review?(bhearsum) → review+
Comment on attachment 8596000 [details] [diff] [review]
puppet.diff

Review of attachment 8596000 [details] [diff] [review]:
-----------------------------------------------------------------

I presume you/ben will be taking care of release-runner and not "buildduty"
Attachment #8596000 - Flags: review?(bugspam.Callek) → review+
Thanks Rail for taking care of this! What a lot of work for a single character change to a filename! :D
Attachment #8595993 - Flags: review?(catlee) → review+
release runner restarted
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: