signing server generated a token, then couldn't find it later

RESOLVED INVALID

Status

Release Engineering
General
RESOLVED INVALID
4 years ago
7 months ago

People

(Reporter: bhearsum, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

4 years ago
We had a bunch of signing fail this morning, around the same time that redis was restarted multiple times. In looking at logs I found that a slave generated a token on signing6, and then couldn't use signing4, 5 *or* 6 to sign. It doesn't surprise me that 4 or 5 didn't work if redis was unavailable at the time, but it should've worked on 6, because it should've had the token locally already. From signing6's log:
2014-01-08 06:16:35,597 - INFO - 10.134.48.228 POST /token
2014-01-08 06:16:35,598 - INFO - request for token for slave 10.134.53.7 for 21600 seconds
2014-01-08 06:16:35,605 - INFO - 10.134.48.228 - - [2014-01-08 06:16:35] "POST /token HTTP/1.0" 200 138 0.008219

2014-01-08 06:25:13,962 - INFO - 10.134.53.7 POST /sign/gpg
2014-01-08 06:25:14,084 - INFO - couldn't find token data for key rOfiA2Y0IOBSQHxy++/6HleXGRqwFO/ogqsyUNI9BU4 locally, checking cache
2014-01-08 06:25:14,116 - INFO - not in cache; failing verify_token
2014-01-08 06:25:14,117 - INFO - 10.134.53.7 - - [2014-01-08 06:25:14] "POST /sign/gpg HTTP/1.1" 400 86 0.155135

And from the slave:
2014-01-08 09:28:26,047 - 60573158673390f6896d181698326c76d9a83f22: processing ../../dist//firefox-29.0a1.en-US.linux-x86_64.checksums on https://signing4.srv.releng.scl3.mozilla.com:9110
2014-01-08 09:28:26,573 - 60573158673390f6896d181698326c76d9a83f22: uploading for signing
2014-01-08 09:28:27,139 - 60573158673390f6896d181698326c76d9a83f22: error uploading file for signing: 400 Invalid token
2014-01-08 09:28:28,141 - 60573158673390f6896d181698326c76d9a83f22: processing ../../dist//firefox-29.0a1.en-US.linux-x86_64.checksums on https://signing5.srv.releng.scl3.mozilla.com:9110
2014-01-08 09:28:28,623 - 60573158673390f6896d181698326c76d9a83f22: uploading for signing
2014-01-08 09:28:29,233 - 60573158673390f6896d181698326c76d9a83f22: error uploading file for signing: 400 Invalid token
2014-01-08 09:28:30,234 - 60573158673390f6896d181698326c76d9a83f22: processing ../../dist//firefox-29.0a1.en-US.linux-x86_64.checksums on https://signing6.srv.releng.scl3.mozilla.com:9110
2014-01-08 09:28:30,707 - 60573158673390f6896d181698326c76d9a83f22: uploading for signing
2014-01-08 09:28:31,281 - 60573158673390f6896d181698326c76d9a83f22: error uploading file for signing: 400 Invalid token

Perhaps there's a bug that causes the local cache to get overridden? Or maybe redis just lost data?
Tokens are deleted if they're provided with an invalid nonce, and I don't see any particular logging about that.

From the look of the logging, perhaps that slave got a file signed elsewhere, then comes back to signing6 to try to get the signed file:

2014-01-08 06:24:42,670 - INFO - 10.134.53.7 GET /sign/gpg/60573158673390f6896d181698326c76d9a83f22
2014-01-08 06:24:42,670 - INFO - 10.134.53.7 - - [2014-01-08 06:24:42] "GET /sign/gpg/60573158673390f6896d181698326c76d9a83f22 HTTP/1.1" 404 96 0.000782

having gotten the file signed, it would have a nonce other than "" (the initial nonce), and thus on the first POST to /sign/gpg, signing6 would delete the local token store.

At any rate, this will be fixed by bug 945751 which eliminates all state on the signing servers.

Comment 2

4 years ago
This happened again soon after a redis restart.

https://tbpl.mozilla.org/php/getParsedLog.php?id=32821830&tree=Mozilla-B2g26-v1.2
(Reporter)

Comment 3

4 years ago
Isn't a problem now that bug 945751 is fixed, per comment #1.
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → INVALID
(Assignee)

Updated

7 months ago
Component: Tools → General
Product: Release Engineering → Release Engineering
You need to log in before you can comment on or make changes to this bug.