Closed Bug 1247736 Opened 9 years ago Closed 9 years ago

Please deploy tokenserver 1.2.20

Categories

(Cloud Services :: Operations: Deployment Requests - DEPRECATED, task)

task
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: rfkelly, Unassigned)

References

Details

(Whiteboard: [qa+])

This version of tokenserver includes handling for a new "reset" event type from the FxA SQS stream, to help disconnect devices in a more timely manner after password reset:
Bug 1226094 - Notify tokenserver of password reset events in FxA
Please deploy.
For QA'ing this in stage, we'll want to reproduce the steps in Bug 1206325 to test that devices are indeed disconnected after password reset in a timely manner.
(No particular urgency here though; if you want to wait until after the dockerization work in Bug 1245385, that's a-oh-kay.)
QA Contact: kthiessen
Notice: I am in London doing Kinto work for the week of 15-19 February; my bandwidth for other tasks will be pretty limited. Happy to schedule this work sometime after next week.
Looked at the changes. If the message's event type is not recognized then it is deleted from the queue. I propose these changes:
1. rename process_account_deletions.py to process_account_events.py
   - requires puppet changes, but removes a YTF? in the future
2. if the message type is unrecognized, log it and ignore it.
   - It will go back on the queue to be processed again.
   - logging: write an event to stdout (mozlog), we'll get these into kibana
   - increment a (datadog) statsd tokenserver.account_events.ignored metric
   - this will be monitored and we'll be alerted
The reason for #2 is that servers will run different versions during a deployment. Without it, important messages will get deleted and we won't know why. If we don't care about losing events, just ignore my above comment. :)
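For illustration, proposal #2 above could be sketched roughly as follows. This is only a sketch under assumptions, not the tokenserver code: the "event" key, the process_event function, the handlers mapping, and the in-memory Counter standing in for a statsd client are all hypothetical. Only the metric name comes from the comment above.

```python
import logging
from collections import Counter

logger = logging.getLogger("tokenserver.account_events")

# Hypothetical stand-in for a statsd client. A real deployment would send
# this counter to datadog instead of keeping it in process memory.
metrics = Counter()


def process_event(event, handlers):
    """Dispatch one decoded queue message.

    Returns True if the caller may delete the message from the queue,
    False if the message should be left alone so it becomes visible
    again and can be picked up by a server that understands it.

    Assumption: `event` is a dict with an "event" key naming its type.
    """
    event_type = event.get("event")
    handler = handlers.get(event_type)
    if handler is None:
        # Unrecognized type: log it, count it, and do NOT delete it.
        logger.warning("ignoring unrecognized account event: %r", event_type)
        metrics["tokenserver.account_events.ignored"] += 1
        return False
    handler(event)
    return True
```

With a handlers mapping such as {"reset": handle_reset}, a message of an unknown type returns False and bumps the ignored counter, so a server running an older version during a mixed-version deployment would leave the message for a newer one.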
See:
* https://github.com/mozilla-services/tokenserver/pull/80/files
* https://github.com/mozilla-services/puppet-config/pull/1782
(Although I haven't tested the latter in combination with the former)
> If we don't care about losing events,
I don't care about losing these events (in fact they're already being sent by the fxa-auth-server, and being silently dropped by the current version of tokenserver).
v1.2.19 includes the renaming linked above
Summary: Please deploy tokenserver 1.2.18 → Please deploy tokenserver 1.2.19
Depends on: 1245385
Now that bug 1245385 is done, I can focus on rolling this out. Should be pretty straightforward.
Planning on rolling this out ASAP. Waiting on https://github.com/mozilla/browserid-verifier/pull/78 to merge to get some security patches from node 0.10.42. This release also updates openresty to 1.9.7.3 which includes the glibc security fix.
New tokenserver boxes rolled out with:
- tokenserver 1.2.19 with the new SQS event handling code
- browserid-verifier 0.5.1
- openresty 1.9.7.3
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
I had to roll back this deployment. About an hour ago the new account SQS event processor started crashing. It looks like it doesn't sanitize data correctly, so Python throws an exception, and systemd eventually stops restarting the service. I filed an issue against it here: https://github.com/mozilla-services/tokenserver/issues/85
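As a general illustration of the failure mode described above (not the actual fix, which is tracked in the linked issue), a queue consumer can validate each message and contain per-message errors so that one bad payload does not take down the whole process. The JSON encoding, the "event" key, and the names parse_event and run_once are assumptions made for this sketch.

```python
import json
import logging

logger = logging.getLogger("tokenserver.account_events")


def parse_event(raw_body):
    """Decode one raw message body; return a dict, or None if malformed.

    Assumption: messages are JSON objects carrying a string "event" key.
    """
    try:
        event = json.loads(raw_body)
    except (TypeError, ValueError):
        # Covers non-string bodies, invalid JSON, and undecodable bytes.
        logger.exception("could not decode account event")
        return None
    if not isinstance(event, dict) or not isinstance(event.get("event"), str):
        logger.error("malformed account event: %r", event)
        return None
    return event


def run_once(bodies, handle):
    """Process a batch of raw bodies; return how many were handled."""
    processed = 0
    for body in bodies:
        event = parse_event(body)
        if event is None:
            continue
        try:
            handle(event)
        except Exception:
            # Keep the loop alive; a crash here would leave restarts
            # up to the service manager.
            logger.exception("error handling account event; continuing")
            continue
        processed += 1
    return processed
```

The design choice is to treat every message as untrusted input: decoding and shape checks happen before dispatch, and handler failures are logged and skipped rather than propagated.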
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I've tagged 1.2.20 with a fix for the above issue.
Summary: Please deploy tokenserver 1.2.19 → Please deploy tokenserver 1.2.20
Blocks: 1252704
I think this is ready for another shot. :mostlygeek, want anything else before we deploy?
Flags: needinfo?(bwong)
Don't need anything. In my deploy queue. Planning on launching it today or tomorrow.
Flags: needinfo?(bwong)
I just rolled this out and it looks a lot better. I'm going to run a canary server in prod for a day to make sure no strange behavior pops up. After that I shall mark it verified.
Status: REOPENED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Just checked 1.2.20 in prod and no more crashes processing events.
No longer blocks: 1254734
Status: RESOLVED → VERIFIED
Note that 1.2.20 has not been fully rolled out yet. There's a mix of 1.2.17 and 1.2.20 servers in the wild right now. Once Bug 1254734 is deployed, 1.2.20 will be fully rolled out.