Closed Bug 1545456 Opened 6 years ago Closed 6 years ago

Deploy Autograph 3.1.0 and config changes to HSM stack

Categories

(Cloud Services :: Operations: Deployment Requests - DEPRECATED, task)

task
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: u581815, Assigned: u581815)

References

(Blocks 1 open bug)

Details

Please deploy Autograph 3.1.0 to staging and perform QA testing.

https://mana.mozilla.org/wiki/display/SVCOPS/Autograph#Autograph-QA

Code changes from 3.0.5: https://github.com/mozilla-services/autograph/compare/3.0.5...3.1.0

Config changes:

  • add Fenix beta and rel keys and give signingscript_fenix access to them; rename rel key to ngt for bug 1545378 (commit 8ce9e8b9a1558f9bbe87e2f82fb693ce703685d2)
  • add omni.ja signer for bug 1533818 (commit f320dbe51675c8f6f11f39ddc8a18076271776fc)
  • add firefox_ creds for mar signers for bug 1540277 (commit 097f9dd1721e7f72b1b930d5c58cee0f1e5b9f2b)

Scheduled for next Tuesday and Wednesday to hit the Fenix May 1st deadline.

We will also need content signature changes.

Stage is deployed. For QA:

a. the lambda monitor is passing w/ updated content sig. expirations and new signers

b. signed and validated the test addon (bumped the version number)

c. the kinto refresh lambda succeeded (req id: a4cc2bea-166f-4ca5-8f5f-86f94cbba6f4), but syncing the stage and stage preview collections fails with InvalidSignatureError: Invalid content signature (main-preview/hijack-blocklists), except for blocklist-preview/certificates for some reason. :leplatrem would changing intermediates break validation in about:remotesettings?

d. :bpitts or :miles can you run "./manage.py update_signatures --force" on the stage SHIELD admin host?

e. :aki can you test MAR signing in stage?

Flags: needinfo?(miles)
Flags: needinfo?(bpitts)
Flags: needinfo?(aki)
Flags: needinfo?(mathieu)
Flags: needinfo?(miles)

(In reply to Greg Guthe [:g-k] [:gguthe] from comment #3)

e. :aki can you test MAR signing in stage?

MAR signing works in stage: https://tools.taskcluster.net/groups/QS-zKsCzSPOihDLbxCQoUQ/tasks/LalRDZkAS3y-z_P7e54Raw/runs/0

Flags: needinfo?(aki)

pinged :leplatrem on IRC and we're OK to deploy prod:

leplatrem> seems ok SERVER=https://settings.stage.mozaws.net/v1/ python aws_lambda.py validate_signature

edit: actually we should hold off

leplatrem> there might be an issue indeed: https://bugzilla.mozilla.org/show_bug.cgi?id=1546657
leplatrem> and I confirm that signature verification is broken in Firefox when pointing the dev tools on stage

Flags: needinfo?(mathieu)

There are reports of signature problems on Normandy Stage in bug 1546657. I think we should delay this deploy to prod until that is resolved.

Depends on: 1546657

The relevant changes here are:

  • use new HSM intermediates (since the current one is expiring)
  • generate EEs with golang (instead of openssl) and the make-hsm-ee script https://github.com/mozilla-services/autograph/pull/275
  • construct the key, crt, and chain files for content sig signers more manually, using a handful of scripts in autograph-hiera-sops

However, the autograph-monitor and python validators pass.
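For reference, a minimal sketch of a sanity check that could be run on a hand-assembled chain file before deploying. It assumes the chain file is a concatenated PEM bundle holding three certificates (EE, intermediate, root); that layout and the helper names `split_pem_chain` and `check_chain_shape` are illustrative assumptions, not the scripts in autograph-hiera-sops. It only checks the shape of the bundle and does not verify any signatures.

```python
import base64
import re
from typing import List

# Matches each PEM certificate block in a concatenated bundle.
PEM_CERT = re.compile(
    r"-----BEGIN CERTIFICATE-----(.*?)-----END CERTIFICATE-----",
    re.DOTALL,
)


def split_pem_chain(bundle: str) -> List[bytes]:
    """Return the DER bytes of every certificate in the bundle, in file order."""
    return [
        base64.b64decode("".join(body.split()))  # drop line breaks before decoding
        for body in PEM_CERT.findall(bundle)
    ]


def check_chain_shape(bundle: str, expected: int = 3) -> List[bytes]:
    """Hypothetical check: the bundle has `expected` distinct certificates.

    The default of 3 assumes EE + intermediate + root; adjust as needed.
    """
    certs = split_pem_chain(bundle)
    if len(certs) != expected:
        raise ValueError(f"expected {expected} certificates, found {len(certs)}")
    if len(set(certs)) != len(certs):
        raise ValueError("duplicate certificate in chain")
    return certs
```

A check like this would catch a missing or doubled certificate in the bundle, but not a wrong issuer or a bad signature, so it is no substitute for the monitor or the Firefox-side verification.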

We decided to roll back to unblock QA, and that change is live in stage. I checked that the refresh lambda passed, and now some of the stage and stage preview collections sync in about:remotesettings.

:ulfr helped debug the content signature certs, so I'll work on getting that fix out.

Reverted the content signature changes for bug 1546195 (to deploy next week), so we can get the Fenix keys and other changes out.

Redeployed to stage and restarting Stage QA.

Stage QA:

a. monitor is passing using the old content signature expiration dates with the new signers: 2019/04/24 18:36:07 All signature responses passed, monitoring OK

b. AMO config is unchanged

c. the kinto refresh lambda completed successfully and stage and stage preview sync in about:remotesettings

d. :bpitts ran the normandy update signatures (mentioned in chat)

e. stage MAR signers unchanged
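If the monitor check in (a) ever needs to be scripted, a rough sketch of parsing that output line could look like the following. It assumes the success line always has the form quoted above (a `YYYY/MM/DD HH:MM:SS` timestamp followed by a message ending in "monitoring OK"); the function name `parse_monitor_line` is hypothetical, and the format of failure lines is not known from this bug, so anything else is simply treated as not OK.

```python
from datetime import datetime
from typing import Tuple


def parse_monitor_line(line: str) -> Tuple[datetime, bool]:
    """Split a monitor log line into (timestamp, passed).

    Assumes a 19-character 'YYYY/MM/DD HH:MM:SS' prefix, as in the line
    quoted in this bug. Raises ValueError if the prefix does not parse.
    """
    stamp = datetime.strptime(line[:19], "%Y/%m/%d %H:%M:%S")
    message = line[19:].strip()
    return stamp, message.endswith("monitoring OK")


stamp, passed = parse_monitor_line(
    "2019/04/24 18:36:07 All signature responses passed, monitoring OK"
)
```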

Deployed to prod. Prod QA:

a. the monitor is still failing due to the expiring content sig. intermediate (but we'll rotate that next week for bug 1546195). The new signers are present (fenix, omni.ja)

b. AMO prod doesn't talk to the HSM stack

c. the kinto refresh lambda succeeded (req id: 728e2dbe-07f5-4daf-8b58-a4c40853273c) and prod and prod preview sync in about:remotesettings (as expected since we haven't changed EEs)

d. :bpitts or :miles can you run "./manage.py update_signatures --force" on the prod SHIELD admin host?

e. not sure if we have a prod MAR signing test

Flags: needinfo?(miles)
Flags: needinfo?(bpitts)
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED

I force signed in normandy prod.

Flags: needinfo?(bpitts)

(In reply to Brian Pitts from comment #13)

I force signed in normandy prod.

Thanks!

I think we're all set here.

Status: RESOLVED → VERIFIED
Flags: needinfo?(miles)