Closed Bug 1048990 Opened 11 years ago Closed 10 years ago

Deploy release 0.10.0 to loop-server Stage

Categories

(Cloud Services :: Operations: Deployment Requests - DEPRECATED, task)

x86_64
Linux
task
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: alexis+bugs, Unassigned)

References

Details

(Whiteboard: [qa+])

Code changes: https://github.com/mozilla-services/loop-server/compare/0.9.0...0.10.0
The configuration has changed since 0.9.0, see: https://github.com/mozilla-services/loop-server/compare/0.9.0...0.10.0#diff-9
In particular, the server should now support multiple channels. The configuration file should be updated as follows:
- For nightly/aurora: https://bugzilla.mozilla.org/show_bug.cgi?id=1033574
- For all other channels (default): https://bugzilla.mozilla.org/show_bug.cgi?id=1021955
Credentials should be:
> "credentials": {
>   "default": {
>     "apiKey": "<put your apiKey here>",
>     "apiSecret": "<put your apiSecret here>"
>   },
>   "aurora": {
>     "apiKey": "<put your apiKey here>",
>     "apiSecret": "<put your apiSecret here>"
>   },
>   "nightly": {
>     "apiKey": "<put your apiKey here>",
>     "apiSecret": "<put your apiSecret here>"
>   }
> },
James, I'll let you create the prod deployment request once this is deployed and verified on stage.
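For anyone sanity-checking their settings file, here is a minimal sketch of a per-channel credentials lookup that falls back to "default" for any channel without its own entry. The function name `credentials_for_channel`, the dummy key values, and the fallback behaviour itself are assumptions for illustration; they are not taken from the loop-server code.

```python
import json

# Dummy settings mirroring the layout quoted above (values are placeholders).
SETTINGS = json.loads("""
{
  "credentials": {
    "default": {"apiKey": "default-key", "apiSecret": "default-secret"},
    "aurora":  {"apiKey": "aurora-key",  "apiSecret": "aurora-secret"},
    "nightly": {"apiKey": "nightly-key", "apiSecret": "nightly-secret"}
  }
}
""")


def credentials_for_channel(settings, channel=None):
    """Return the credentials for `channel`, or the "default" entry
    when the channel is missing or not given (hypothetical helper)."""
    creds = settings["credentials"]
    return creds.get(channel, creds["default"])


print(credentials_for_channel(SETTINGS, "nightly")["apiKey"])
print(credentials_for_channel(SETTINGS, "release")["apiKey"])
```

With this layout, a channel such as "release" that has no entry of its own would resolve to the "default" credentials.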
Status: NEW → ASSIGNED
Whiteboard: [qa+]
Blocks: 1048264
I *think* :mostlygeek just updated Stage to 0.10.0. I see a new m3.medium instance just now: i-6bfb0b46
Code:
loop-server-svcops 0.10.0-1 x86_64 20905864
puppet-config-loop 20140729184904-1 x86_64 13881
This is instance i-6bfb0b46, an m3.medium in us-east-1a.
Standard tags are: app => loop_server, type => loop_server, env => stage, stack => loop-server-stage
But right now we are getting a 502 if we curl the server...
This should be all good now. Moving forward with basic verification of loop-server Stage. Waiting to hear back from the third party about an available test window to hit their live server... Also, see: https://github.com/mozilla-services/puppet-config/pull/781
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Verified we have one updated m3.medium instance running the following code:
loop-server-svcops 0.10.0-1 x86_64 20905864
puppet-config-loop 20140811172607-1 x86_64 14277
Noting here some changes to the all-important config file: /data/loop-server/config/settings.json
The following are new/updated entries:
...etc...
"fxaAudiences": [
  "app://loop.stage.mozaws.net",
  "http://loop.stage.mozaws.net"
],
"fxaTrustedIssuers": [
  "api-accounts.stage.mozaws.net",
  "msisdn-dev.stage.mozaws.net"
],
...etc...
"timers": {
  "connectionDuration": 10
},
"tokBox": {
  "apiUrl": "THE USUAL",
  "credentials": {
    "default": {
      "apiKey": "THE USUAL",
      "apiSecret": "THE USUAL",
      "apiUrl": "THE USUAL"
    }
  }
},
Please confirm the additions, changes, and layout. (I can supply values for "THE USUAL" if needed.)
curl https://loop.stage.mozaws.net
{"name":"mozilla-loop-server","description":"The Mozilla Loop (WebRTC App) server","version":"0.10.0","homepage":"https://github.com/mozilla-services/loop-server/","endpoint":"https://loop.stage.mozaws.net","fakeTokBox":false}
curl -I https://loop.stage.mozaws.net
HTTP/1.1 200 OK
Date: Mon, 11 Aug 2014 18:25:02 GMT
Content-Type: application/json; charset=utf-8
Content-Length: 226
Connection: keep-alive
ETag: W/"e2-2194042355"
Timestamp: 1407781502
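The same check can be scripted. Below is a rough sketch that parses the JSON body returned by the root endpoint and reports anything unexpected. The helper name `check_root_response` is hypothetical, and treating `"fakeTokBox": false` as a requirement for Stage is my assumption based on the output above.

```python
import json


def check_root_response(body, expected_version):
    """Return a list of problems found in the root endpoint's JSON body.
    An empty list means both checks passed (hypothetical helper)."""
    info = json.loads(body)
    problems = []
    if info.get("version") != expected_version:
        problems.append(
            "version is %r, expected %r" % (info.get("version"), expected_version)
        )
    # Assumption: Stage should talk to the real TokBox backend.
    if info.get("fakeTokBox") is not False:
        problems.append("fakeTokBox is not false")
    return problems


# Body copied from the curl output in this bug.
BODY = (
    '{"name":"mozilla-loop-server",'
    '"description":"The Mozilla Loop (WebRTC App) server",'
    '"version":"0.10.0",'
    '"homepage":"https://github.com/mozilla-services/loop-server/",'
    '"endpoint":"https://loop.stage.mozaws.net",'
    '"fakeTokBox":false}'
)

print(check_root_response(BODY, "0.10.0"))
```

The body is passed in as a string so the check can be run against saved curl output without touching the live server.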
The configuration looks good to me. Also, the default channel's apiUrl is optional if it is the same as tokBox.apiUrl, and the connectionDuration timer is now set to the default value.
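To illustrate the optional apiUrl: a small sketch, assuming a channel entry without its own apiUrl simply inherits the top-level tokBox.apiUrl. The helper `resolve_api_url` and the example URLs are made up for illustration and are not the server's actual code.

```python
def resolve_api_url(tokbox, channel="default"):
    """Return the channel's apiUrl if set, else the top-level tokBox.apiUrl
    (hypothetical helper; the inheritance rule is an assumption)."""
    credentials = tokbox["credentials"]
    entry = credentials.get(channel, credentials["default"])
    return entry.get("apiUrl", tokbox["apiUrl"])


# Placeholder values only; the real ones are "THE USUAL".
TOKBOX = {
    "apiUrl": "https://api.example.invalid",
    "credentials": {
        "default": {"apiKey": "k", "apiSecret": "s"},
        "nightly": {
            "apiKey": "k2",
            "apiSecret": "s2",
            "apiUrl": "https://nightly.example.invalid",
        },
    },
}

print(resolve_api_url(TOKBOX))
print(resolve_api_url(TOKBOX, "nightly"))
```

Under that assumption, dropping "apiUrl" from the "default" entry on Stage would not change which URL is used.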
:natim thanks! Did a couple of quick 'make test' runs to verify Stage. Interesting change to the loadtest Makefile: 'make test' runs a 'make build'. Not sure why this is needed, given that only QA and Dev use this and know to run 'make build' once before any calls to 'make test'. Is there some requirement to run 'make build' every time we test? Anyway, waiting to hear back from our third party...
OK, our test window is Tuesday afternoon, PDT.
The 30min test also looks good: https://loads.services.mozilla.com/run/c087f859-17d9-421e-82bd-f23a2793ae1f Going to try a bit more traffic next...
The more stress-based 30min load test failed miserably, probably because I overloaded our single m3.medium instance:
Run 3: 30min
https://loads.services.mozilla.com/run/aa5ba100-df1c-4946-878a-a3c512226a22
users = 40
duration = 1800
agents = 10
Most errors were of this type: 500 Internal Server Error
Going back to an easier 10min run to make sure everything is good. Then I will dial it up a little slower this time...
Blocks: 1052929
The following runs were all successful:
Run 4: 10min
https://loads.services.mozilla.com/run/b543bf16-f526-4631-b90a-879f5e8a1132
Users [20] Agents 5 Duration 600
Run 5: 10min
https://loads.services.mozilla.com/run/87bb391a-eddd-4199-937c-e3aec8b65e27
Users [25] Agents 5 Duration 600
Run 6: 10min
https://loads.services.mozilla.com/run/819cedf6-0d69-44aa-820a-1ba6856e9495
Users [30] Agents 5 Duration 600
Run 7: 45min
https://loads.services.mozilla.com/run/db16450e-b428-4315-8a25-9a02d5450684
users = 30
duration = 2700
agents = 5
Tests over: 15005
Successes: 15005
Failures: 0
Errors: 0
TCP Hits: 60620
Opened web sockets: 30301
Total web sockets: 30301
Bytes/websockets: 6156909
Requests / second (RPS): 22
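As a quick cross-check of the Run 7 numbers, the reported RPS is consistent with TCP hits divided by the run duration. That formula is my assumption about how the figure is derived, not something confirmed by the loads dashboard.

```python
# Figures copied from Run 7 above.
tcp_hits = 60620
duration_s = 2700
tests_over = 15005
successes = 15005

# Assumption: RPS is total TCP hits over the configured duration.
rps = tcp_hits / duration_s
success_rate = successes / tests_over

print(round(rps, 2))   # 22.45, which truncates to the reported 22
print(success_rate)    # 1.0
```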
Logs look really good after those early runs with the 500s. The only other thing I noticed was some bot/spam activity (showing up as 404s) in the /media/ephemeral0/nginx/logs/loop_server.access.log and /media/ephemeral0/circus/loop_server/loop_server.out.log logs.
Status: RESOLVED → VERIFIED