Closed Bug 1253306 Opened 8 years ago Closed 8 years ago

Please deploy loop-server 0.20.0 to Stage

Categories

(Cloud Services :: Operations: Deployment Requests - DEPRECATED, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: rhubscher, Assigned: dmaher)

Attachments

(1 file)

------------------
RELEASE NOTES
------------------
https://github.com/mozilla-services/loop-server/releases

COMPARISONS
https://github.com/mozilla-services/loop-server/compare/0.19.0...0.19.1
https://github.com/mozilla-services/loop-server/compare/0.19.1...0.19.2
https://github.com/mozilla-services/loop-server/compare/0.19.2...0.19.3
https://github.com/mozilla-services/loop-server/compare/0.19.3...0.20.0

TAGS
https://github.com/mozilla-services/loop-server/releases/tag/0.20.0
https://github.com/mozilla-services/loop-server/commit/3fce0dd9faf1b47df16249e40a69ae86b142a8b7

CHANGELOG

0.20.0 (2016-03-03)
-------------------

- Switch from statsd-node to node-statsd to handle tags. (#363)
- Remove loads v1 loadtests. (#364)
- Display pretty exception when tests raises with Mocha. (#365)
- Add a new room action to get metrics about whitelisted shared domains (#365)
- Add a new route to let the client send UX metrics in Google Analytics (#366)
- Upgrade libsodium and add support for node 0.12 (#367)
- Add a __lbhealthcheck__ endpoint (#368)


0.19.3 (2016-02-12)
-------------------

- Add a way to log the loop-client versions in use. (#362)


0.19.2 (2016-01-07)
-------------------

- Update ws module to version 1.0.1 (#360)
Assignee: nobody → jschneider
QA Contact: rpappalardo
Pipeline #20 started for Stage.
Attached image loopsvr_20k.png
Hi :natim,

We ran quite a few loads trying to get about 5K/30s on STAGE (fyi: prod is ~8K/30s).

Even at 16 very large load instances + 200 users, we still couldn't surpass that amount.

We've bumped it up to 32 load instances + 400 users and have surpassed the prod load, but we notice that 2XXs are being replaced by 4XXs which doesn't seem right.  (see attached)
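
One way to put numbers on the 2XX-to-4XX shift described above is to tally response codes by class for each run. This is only a minimal sketch: the `status_class_share` helper and the sample codes are hypothetical and are not taken from the loadtest output.

```python
from collections import Counter


def status_class_share(status_codes):
    """Return the share of responses per status class (2XX, 4XX, ...)."""
    counts = Counter("%dXX" % (code // 100) for code in status_codes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: n / total for cls, n in counts.items()}


# Hypothetical sample, for illustration only (not real loadtest data).
sample = [200] * 4 + [404] * 3 + [503]
print(status_class_share(sample))
# → {'2XX': 0.5, '4XX': 0.375, '5XX': 0.125}
```

Comparing these shares between the 16-instance and 32-instance runs would show whether the 4XX share grows with load or is constant.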

I think we need to do some test tweaking or continue to adjust our load or both.

Do you know where we're at with the additional metrics we requested from :bobm?  I believe he wired them up on his end, but I'm not sure where we can see them now.  Kibana?
Flags: needinfo?(rhubscher)
Hi,

> Do you know where we're at with the additional metrics we requested from :bobm?  I believe he wired them up on his end, but I'm not sure where we can see them now.  Kibana?

Yes, bobm wired them up, but I haven't looked at them yet. You should be able to access them here:

https://kibana.shared.us-west-2.prod.mozaws.net/index.html#/dashboard/elasticsearch/Loop%20Rooms%20Timeline

According to that page, most of the calls are made to the GET /rooms and POST /rooms/:ID endpoints.

> I think we need to do some test tweaking or continue to adjust our load or both.

What we should aim for is making sure we are CPU bound for our loadtests so that the information we get back is the maximum load we can handle with 2 and with 3 instances.

Right now our RPS is lower on stage than in production, possibly because the loadtest calls third parties more often than real production traffic does.

Until we can generate loadtests with RPS comparable to production, we can at least validate that stage handles the same load as it did with the previous version of the server (i.e. the load handled by 0.20.0 matches the load handled by 0.19.3).

Also, if the ElastiCache cluster we use differs between stage and prod, that might be the issue here.
Flags: needinfo?(rhubscher)
QA Contact: rpappalardo → chartjes
Verifying what is currently on staging

Checking configuration for https://hello.firefox.com/config.js
var loop = loop || {};

loop.config = {
  serverUrl: 'https://loop.services.mozilla.com/v0',
  feedbackApiUrl: 'https://input.mozilla.org/api/v1/feedback',
  feedbackProductName: 'Loop',
  downloadFirefoxUrl: 'https://www.mozilla.org/firefox/new/?scene=2&utm_source=hello.firefox.com&utm_medium=referral&utm_campaign=non-webrtc-browser\#download-fx',
  privacyWebsiteUrl: 'https://www.mozilla.org/privacy/firefox-hello/',
  legalWebsiteUrl: 'https://www.mozilla.org/about/legal/terms/firefox-hello/',
  marketplaceUrl: 'https://marketplace.firefox.com/iframe-install.html',
  learnMoreUrl: 'https://www.mozilla.org/hello/',
  roomsSupportUrl: 'https://support.mozilla.org/kb/group-conversations-firefox-hello-webrtc',
  guestSupportUrl: 'https://support.mozilla.org/kb/respond-firefox-hello-invitation-guest-mode',
  generalSupportUrl: 'https://support.mozilla.org/kb/respond-firefox-hello-invitation-guest-mode',
  unsupportedPlatformUrl: 'https://support.mozilla.org/kb/which-browsers-will-work-firefox-hello-video-chat',
  tilesIframeUrl: 'https://tiles.cdn.mozilla.net/iframe.html',
  tilesSupportUrl: 'https://support.mozilla.org/kb/tiles-firefox-hello',
  fxosApp: {
    name: 'Firefox Hello',
    manifestUrl: 'https://marketplace.firefox.com/app/54b83aea-1208-4605-82eb-22819f39d81d/manifest.webapp',
    rooms: false
  }
};

Checking https://loop.stage.mozaws.net
{
    "description": "The Mozilla Loop (WebRTC App) server",
    "endpoint": "https://loop.stage.mozaws.net",
    "fakeTokBox": false,
    "fxaOAuth": true,
    "homepage": "https://github.com/mozilla-services/loop-server/",
    "i18n": {
        "defaultLang": "en-US"
    },
    "name": "mozilla-loop-server",
    "version": "0.19.4"
}


Getting headers for https://loop.stage.mozaws.net

{'Content-Length': '273', 'Timestamp': '1457964771', 'Vary': 'Origin', 'Connection': 'keep-alive', 'ETag': 'W/"111-JZnR6UMKKxG8HBkBqdMFew"', 'Date': 'Mon, 14 Mar 2016 14:12:51 GMT', 'Content-Type': 'application/json; charset=utf-8'}


Checking https://loop.stage.mozaws.net/push-server-config
{
    "pushServerURI": "wss://autopush.stage.mozaws.net"
}


Checking heartbeat at https://loop.stage.mozaws.net/__heartbeat__
{
    "fxaVerifier": true,
    "provider": true,
    "push": true,
    "storage": true
}


Finished run
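
The checks above can be expressed as a small script. The following is a minimal sketch, not the actual verification tool used here: `check_deployment` is a hypothetical helper that takes the JSON bodies returned by the root URL and by `__heartbeat__` and reports anything unexpected. The sample documents are trimmed copies of the responses shown above.

```python
import json


def check_deployment(root_doc, heartbeat_doc, expected_version):
    """Return a list of problems; an empty list means everything checks out."""
    problems = []
    root = json.loads(root_doc)
    heartbeat = json.loads(heartbeat_doc)

    # The root document advertises the deployed version.
    deployed = root.get("version")
    if deployed != expected_version:
        problems.append("version is %s, expected %s" % (deployed, expected_version))

    # Every heartbeat entry should be exactly true.
    for name, healthy in sorted(heartbeat.items()):
        if healthy is not True:
            problems.append("heartbeat check failed: %s" % name)
    return problems


ROOT = '{"name": "mozilla-loop-server", "version": "0.19.4"}'
HEARTBEAT = '{"fxaVerifier": true, "provider": true, "push": true, "storage": true}'

print(check_deployment(ROOT, HEARTBEAT, "0.20.0"))
# → ['version is 0.19.4, expected 0.20.0']
```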

------------------
E2E-TESTS
------------------

Placed several calls successfully using loop-server 0.19.4 tag on STAGE
using: Firefox General Release (45.0) and Nightly (48.0a1) with 
loop.server set to https://loop.stage.mozaws.net/v0


TEST RESULTS

end-2-end test calls (shared URL) - OK
Video/audio mute/unmute - OK
messaging - OK 
Tab & window-sharing - OK
privacy & ToS links - OK
Feedback screens - OK
push notifications - OK
>     "version": "0.19.4"

It seems that the 0.20.0 release was not deployed yet, was it?
:natim, correct - 0.20.0 has not been deployed to staging yet. That's scheduled for 1100 EDT March 14. I'm just confirming what we have already in place.
Gentlepersons,

Due to the untimely demise of jp's computron, I will be performing the part of "svcops deployment automaton" in today's presentation of loop-server stage deploy theatre.
Assignee: jschneider → dmaher
Unfortunately the rpmbuild failed (see paste below).  Debugging is ongoing.

Start: rpmbuild loop-server-svcops-20160315154203-3fce0dd.src.rpm
Building target platforms: x86_64
Building for target x86_64
Executing(%prep): /bin/sh -e /var/tmp/rpm-tmp.Qprb3a
+ umask 022
+ cd /builddir/build/BUILD
+ LANG=C
+ export LANG
+ unset DISPLAY
+ cd /builddir/build/BUILD
+ rm -rf loop-server-svcops-20160315154203
+ /bin/mkdir -p loop-server-svcops-20160315154203
+ cd loop-server-svcops-20160315154203
+ /bin/tar -xf /builddir/build/SOURCES/app.tar
+ /bin/chmod -Rf a+rX,u+w,g-w,o-w .
+ exit 0
Executing(%build): /bin/sh -e /var/tmp/rpm-tmp.nZEHp6
+ umask 022
+ cd /builddir/build/BUILD
+ cd loop-server-svcops-20160315154203
+ LANG=C
+ export LANG
+ unset DISPLAY
+ npm install --production

> heapdump@0.3.5 install /builddir/build/BUILD/loop-server-svcops-20160315154203/node_modules/heapdump
> node-gyp rebuild

make: Entering directory `/builddir/build/BUILD/loop-server-svcops-20160315154203/node_modules/heapdump/build'
  CXX(target) Release/obj.target/addon/src/heapdump.o
  SOLINK_MODULE(target) Release/obj.target/addon.node
  SOLINK_MODULE(target) Release/obj.target/addon.node: Finished
  COPY Release/addon.node
make: Leaving directory `/builddir/build/BUILD/loop-server-svcops-20160315154203/node_modules/heapdump/build'
npm WARN engine boom@2.10.1: wanted: {"node":">=0.10.40"} (current: {"node":"v0.10.29","npm":"1.4.14"})
npm WARN engine cryptiles@2.0.5: wanted: {"node":">=0.10.40"} (current: {"node":"v0.10.29","npm":"1.4.14"})
npm WARN engine hoek@2.16.3: wanted: {"node":">=0.10.40"} (current: {"node":"v0.10.29","npm":"1.4.14"})

> hiredis@0.2.0 install /builddir/build/BUILD/loop-server-svcops-20160315154203/node_modules/hiredis
> node-gyp rebuild

make: Entering directory `/builddir/build/BUILD/loop-server-svcops-20160315154203/node_modules/hiredis/build'
  CC(target) Release/obj.target/hiredis/deps/hiredis/hiredis.o
  CC(target) Release/obj.target/hiredis/deps/hiredis/net.o
  CC(target) Release/obj.target/hiredis/deps/hiredis/sds.o
  CC(target) Release/obj.target/hiredis/deps/hiredis/async.o
  CC(target) Release/obj.target/hiredis/deps/hiredis/read.o
  AR(target) Release/obj.target/deps/hiredis.a
  COPY Release/hiredis.a
  CXX(target) Release/obj.target/hiredis/src/hiredis.o
  CXX(target) Release/obj.target/hiredis/src/reader.o
  SOLINK_MODULE(target) Release/obj.target/hiredis.node
  SOLINK_MODULE(target) Release/obj.target/hiredis.node: Finished
  COPY Release/hiredis.node
make: Leaving directory `/builddir/build/BUILD/loop-server-svcops-20160315154203/node_modules/hiredis/build'

> sodium@1.0.22 preinstall /builddir/build/BUILD/loop-server-svcops-20160315154203/node_modules/sodium
> make

./autogen.sh: line 13: libtoolize: command not found
make: *** [configure] Error 127
npm ERR! sodium@1.0.22 preinstall: `make`
npm ERR! Exit status 2
npm ERR!
npm ERR! Failed at the sodium@1.0.22 preinstall script.
npm ERR! This is most likely a problem with the sodium package,
npm ERR! not with npm itself.
npm ERR! Tell the author that this fails on your system:
npm ERR!     make
npm ERR! You can get their info via:
npm ERR!     npm owner ls sodium
npm ERR! There is likely additional logging output above.

npm ERR! System Linux 2.6.32-504.8.1.el6.x86_64
npm ERR! command "/usr/bin/node" "/usr/bin/npm" "install" "--production"
npm ERR! cwd /builddir/build/BUILD/loop-server-svcops-20160315154203
npm ERR! node -v v0.10.29
npm ERR! npm -v 1.4.14
npm ERR! code ELIFECYCLE
npm ERR!
npm ERR! Additional logging details can be found in:
npm ERR!     /builddir/build/BUILD/loop-server-svcops-20160315154203/npm-debug.log
npm ERR! not ok code 0
error: Bad exit status from /var/tmp/rpm-tmp.nZEHp6 (%build)


RPM build errors:
    Bad exit status from /var/tmp/rpm-tmp.nZEHp6 (%build)
ERROR: Exception(/home/jenkins/slave/workspace/loop-server-PACKAGE/BUILD/loop-server-svcops-20160315154203-3fce0dd.src.rpm) Config(cloudops-6-x86_64) 0 minutes 18 seconds
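
The `libtoolize: command not found` failure above is the kind of thing a preflight check on the build host could catch before `npm install` runs. A minimal sketch, assuming the tool list below (taken from the commands visible in the build logs in this bug) is what the sodium build needs; `missing_tools` is a hypothetical helper, not part of the packaging pipeline:

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


# Assumed list, based on the commands that appear in the build output.
REQUIRED = ["libtoolize", "autoreconf", "aclocal", "make"]

print(missing_tools(REQUIRED))
```

An empty list means every tool was found; anything else names the packages to install on the builder first.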
The failure in comment #9 was just a missing dependency on `libtool`, which is easily fixed.  Unfortunately, we're now tumbling down the dependency rabbit hole - next stop, `autoconf`:

```
> sodium@1.0.22 preinstall /builddir/build/BUILD/loop-server-svcops-20160315162650/node_modules/sodium
> make

autoreconf: Entering directory `.'
autoreconf: configure.ac: not using Gettext
autoreconf: running: aclocal --force -I m4
configure.ac:1: error: Autoconf version 2.65 or higher is required
```

WIP
Summary: Loop — Please deploy loop-server 0.20.0 to Stage → Please deploy loop-server 0.20.0 to Stage
So here's the deal: the version of autoconf available in CentOS 6 is `2.63-5.1.el6`, but some element of the NPM build process wants version 2.65 or higher.

There's an `autoconf268` package available for install, but in order to avoid conflicts with the base autoconf, all of the paths/filenames are suffixed with `268`, as in `/usr/bin/autoconf268`.  The trick is getting the build process to use the alternate binaries and libraries.
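
For reference, the version mismatch can be checked mechanically. This is a minimal sketch: `parse_version` and `meets_minimum` are hypothetical helpers, and treating the `autoconf268` package as version 2.68 is an assumption based on its name.

```python
import re


def parse_version(text):
    """Extract the leading dotted numeric version from an RPM-style string."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError("no numeric version in %r" % text)
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, required):
    """True if `installed` is at least `required`, compared numerically."""
    return parse_version(installed) >= parse_version(required)


print(meets_minimum("2.63-5.1.el6", "2.65"))  # → False
print(meets_minimum("2.68", "2.65"))          # → True (assumed autoconf268 version)
```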

WIP
Turns out that the npm sodium module (libsodium) is difficult to build on CentOS 6.x (this is going to be true of more and more things going forward).  The long-term solution is to migrate Loop to CentOS 7; however, this is *well* outside the scope of this deployment, heh.  For now, I've asked :natim to downgrade libsodium in the hopes that we can get this deployment out the door.


```
09:08:00 < phrawzty> Standard8: natim: In the meantime, would it be possible to
                     use an earlier version of libsodium - one that doesn't
                     require autoconf > 2.65 ?
09:08:19 < phrawzty> Standard8: natim: Assuming, of course, that this doesn't
                     introduce security problems, etc.
09:09:23 < natim> phrawzty: I can probably tag a 0.20.1 with the version
                  reverted, would that works for you?
09:09:51 < natim> Standard8: I would do that in the 0.20.x branch so that you
                  can still use master with node 0.12
09:10:14 < phrawzty> natim: Seems like it'd work for me
09:13:31 < natim> phrawzty: here you go:
https://github.com/mozilla-services/loop-server/releases/tag/0.20.1
```


I'm going to go ahead and try that now.
(In reply to Daniel Maher [:phrawzty] from comment #4)
> I'm going to go ahead and try that now.

The package built as expected (pipeline #74), and Jenkins was able to run the LoopServer-STAGE-Deploy job (#71); however, the instance behind the load balancer is out of the pool, and is timing out connections to port 80.  Worse yet, I can't SSH into the failing node, so something is clearly busted.
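
A quick way to tell "timing out" apart from "healthy" without SSH access is a plain TCP connect with a timeout. This is only a rough sketch; `port_open` is a hypothetical helper, and the host to probe would be the failing instance's address, which is not given in this bug.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `port_open(instance_address, 80)` and `port_open(instance_address, 22)` against the node would confirm whether both HTTP and SSH are unreachable, which points at the instance itself rather than the application.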

WIP
There's a new ticket for 0.20.1 here:
Bug 1257241 - Please deploy loop-server 0.20.1 to STAGE
May we close this in favour of bug 1257241?
Marking as wontfix unless there is a better option.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX