Closed Bug 1176487 Opened 9 years ago Closed 8 years ago

Initial setup for new prod and stage Heroku apps under the Mozilla account

Categories

(Tree Management :: Treeherder: Infrastructure, defect, P2)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: emorley, Assigned: emorley)

Attachments

(1 file, 3 obsolete files)

3.04 KB, application/x-shellscript
We currently have the "treeherder-heroku" app, from the prototype:
https://dashboard.heroku.com/apps/treeherder-heroku/resources

We need a separate app for both prod and stage. I guess we could rename the above app, but it might be nicer to start fresh.

Possible names:
* treeherder
* treeherder-stage

This will require creating the new apps, linking them with the GitHub repo, granting the appropriate people access to the new apps, setting up the environment variables, defining which buildpacks to use (since auto-detect fails for our repo due to it containing both node and python), setting up the addons etc.
Note: We'll hold off doing this until most of the other deps of bug 1176484 are complete.
Things that may be useful:
https://devcenter.heroku.com/articles/fork-app
https://devcenter.heroku.com/articles/labs-pipelines

Pipelines sounds pretty cool - eg we deploy changes to stage and then promote the same slug to prod - ensuring prod ends up identical.
(In reply to Ed Morley [:emorley] from comment #2)
> Pipelines sounds pretty cool - eg we deploy changes to stage and then
> promote the same slug to prod - ensuring prod ends up identical.

It's worth noting that if bug 1201455 goes with the solution of "add migrations to the bin/post_compile script", then we'd have to remember to run all DB-touching steps in post_compile again, any time we used the `heroku pipeline:promote` command, which is pretty sucky. As such, I think we'd have to avoid using the (Labs) Pipeline feature until such a time when Heroku add official deploy task support (have submitted a ticket requesting this, see bug 1201455 comment 1).
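
To illustrate the concern: a promotion copies the already-built stage slug to prod, so nothing in bin/post_compile runs on the prod side. Below is a rough sketch of what a manual promote would then involve. It assumes the current CLI spelling `heroku pipelines:promote` (the Labs plugin quoted above used `pipeline:promote`), and it assumes the DB step is a Django `./manage.py migrate`, which is a guess rather than what bug 1201455 settled on. It only prints the commands unless DRY_RUN=0 is set.

```shell
# Sketch only: promote stage -> prod, then re-run the DB steps by hand.
# App names are from this bug; the migrate command is an assumption.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

promote_and_migrate() {
    stage_app="$1"
    prod_app="$2"
    # Promotion reuses the existing slug, so bin/post_compile is not run again.
    run heroku pipelines:promote --app "$stage_app" || return 1
    # Hence any DB-touching step has to be repeated against prod manually.
    run heroku run --app "$prod_app" -- ./manage.py migrate
}
```

Usage would be `promote_and_migrate treeherder-stage treeherder-prod`. Having to remember the second step on every promote is exactly the part that makes this approach unattractive.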
Heroku have just launched a new feature called "Heroku Flow" which combines the Pipelines feature with some other functionality and cool UI. Looks pretty useful:
http://blog.heroku.com/archives/2015/9/3/heroku_flow_pipelines_review_apps_and_github_sync
Priority: P3 → P2
Depends on: 1176486
Blocks: 1176488
Blocks: 1176491
I've set up treeherder-prod, however treeherder-stage is already taken.

It's not under the mozilla org, and I've asked Mauro/Kendall and they said it's not registered under their personal accounts.

I've sent an email to James Lal, in case he has it from ages ago (he had a prototype named treeherder-test, since deleted, but there may be another), and have also filed a support ticket asking if it's someone with a mozilla.com email or who is a member of the mozilla org:
https://help.heroku.com/tickets/338459

Failing that we can always go with:
mozilla-treeherder-stage / mozilla-treeherder-prod
  or:
moz-treeherder-stage / moz-treeherder-prod
Assignee: nobody → emorley
The existing treeherder-stage app was owned by Cameron. It has now been deleted, and these apps have been newly created:

https://dashboard.heroku.com/apps/treeherder-stage
https://dashboard.heroku.com/apps/treeherder-prod

I've:
* given everyone else on the team access 
* locked the apps
* assigned to the stage/prod parts of the pipeline as appropriate (https://dashboard.heroku.com/pipelines/b7e33de9-2138-40ad-af57-00c6f9622109)

Left to do:
* Add addons (CloudAMQP, Deploy hooks, librato, memcachier, SSL endpoint)
* Set all the non-DB environment variables
* Once the RDS instances are up, set the DB environment variables
* Use the deploy pages to deploy an appropriate branch to each
* Adjust the number of running dynos for each worker type. Initially we'll want the workers at zero (or at least celerybeat at zero) so we don't clobber the incoming data from the DB replication.

I'll hold off on the above for the moment, since as soon as I add the addons we'll start being charged.
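
For the dyno scaling step in the list above, a small sketch of how the `heroku ps:scale` command could be built. Only "celerybeat" is named in this bug; any other process type names would have to match the repo's Procfile, which isn't shown here. The function just prints the command so it can be reviewed first.

```shell
# Sketch: build a `heroku ps:scale` command that parks the given process
# types at zero dynos. Nothing is executed; the command is only printed.
scale_to_zero_cmd() {
    app="$1"
    shift
    args=""
    for proc in "$@"; do
        args="$args $proc=0"
    done
    printf 'heroku ps:scale%s --app %s\n' "$args" "$app"
}

scale_to_zero_cmd treeherder-stage celerybeat
```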
* Buildpacks:
I've updated the existing Heroku instance to v88 of the nodejs buildpack and 724e51b5c7ed3a522abea83083aaddd6a1deee6c for the Python buildpack (they haven't tagged master for a bit, so going with a SHA for now). The new Heroku apps have been updated:

[~/src/treeherder]$ heroku buildpacks --app treeherder-stage
=== treeherder-stage Buildpack URLs
1. https://github.com/heroku/heroku-buildpack-nodejs.git#v88
2. https://github.com/heroku/heroku-buildpack-python.git#724e51b5c7ed3a522abea83083aaddd6a1deee6c
[~/src/treeherder]$ heroku buildpacks --app treeherder-prod
=== treeherder-prod Buildpack URLs
1. https://github.com/heroku/heroku-buildpack-nodejs.git#v88
2. https://github.com/heroku/heroku-buildpack-python.git#724e51b5c7ed3a522abea83083aaddd6a1deee6c
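
For reference, a sketch of the commands that would produce the ordering shown above on an app with no buildpacks configured yet. It prints the commands rather than running them, and the exact invocations used at the time are not recorded in this bug, so treat these as an assumption.

```shell
# Sketch: print the commands that would pin both buildpacks, Node.js first.
# Versions are the ones recorded above; nothing is executed.
NODE_BP='https://github.com/heroku/heroku-buildpack-nodejs.git#v88'
PYTHON_BP='https://github.com/heroku/heroku-buildpack-python.git#724e51b5c7ed3a522abea83083aaddd6a1deee6c'

buildpack_cmds() {
    app="$1"
    # On a fresh app, `set` puts this buildpack at position 1...
    printf 'heroku buildpacks:set %s --app %s\n' "$NODE_BP" "$app"
    # ...and `add` appends the next one after it.
    printf 'heroku buildpacks:add %s --app %s\n' "$PYTHON_BP" "$app"
}

for app in treeherder-stage treeherder-prod; do
    buildpack_cmds "$app"
done
```

Pinning to a tag or SHA after the `#` keeps builds reproducible until the pin is deliberately bumped.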

* Features:

=== App Features (treeherder-heroku)
[ ] http-end-to-end-continue  Send 100-continue headers to the backend
[ ] http-session-affinity     Enable session affinity for all requests
[ ] http-shard-header         Turn shard headers on
[+] log-runtime-metrics       Emit dyno resource usage information into app logs
[+] release-phase             Enable the experimental release phase [alpha]
[ ] runtime-dyno-metadata     Share dyno metadata in environment variables

I've enabled log-runtime-metrics on the new instances - for release-phase I've emailed Owen at Heroku, since the feature is still in private alpha.
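
To compare which features are switched on between apps, a small sketch that pulls the enabled names out of a listing shaped like the one above. The parsing assumes the `[+]` / `[ ]` layout shown here; the CLI output format may differ between versions.

```shell
# Sketch: read a feature listing on stdin and print the names of the
# enabled entries (the lines marked "[+]").
enabled_features() {
    awk '$1 == "[+]" { print $2 }'
}

# Possible usage (not run here):
#   heroku labs --app treeherder-heroku | enabled_features
# A generally available feature can then be switched on elsewhere with:
#   heroku labs:enable log-runtime-metrics --app treeherder-stage
```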
(In reply to Ed Morley [:emorley] from comment #7)
> I've enabled log-runtime-metrics on the new instances - for release-phase
> I've emailed Owen at Heroku, since the feature is still in private alpha.

The release phase beta has been enabled on the new apps.
No longer depends on: 1176486
Summary: Set up new prod and stage Heroku apps under the Mozilla account → Initial setup for new prod and stage Heroku apps under the Mozilla account
Add-ons added to the new apps:

  $ heroku addons -a treeherder-stage

    Add-on                                  Plan      Price
    ──────────────────────────────────────  ────────  ─────────
    cloudamqp (th-stage-cloudamqp)          bunny     $99/month
     └─ as CLOUDAMQP

    deployhooks (th-stage-deployhooks-irc)  irc       free
     └─ as DEPLOYHOOKS

    librato (th-stage-librato)              nickel    $19/month
     └─ as LIBRATO

    memcachier (th-stage-memcachier)        1000      $70/month
     └─ as MEMCACHIER

    ssl (th-stage-ssl)                      endpoint  $20/month
     └─ as SSL

  $ heroku addons -a treeherder-prod

    Add-on                                 Plan      Price
    ─────────────────────────────────────  ────────  ─────────
    cloudamqp (th-prod-cloudamqp)          bunny     $99/month
     └─ as CLOUDAMQP

    deployhooks (th-prod-deployhooks-irc)  irc       free
     └─ as DEPLOYHOOKS

    librato (th-prod-librato)              nickel    $19/month
     └─ as LIBRATO

    memcachier (th-prod-memcachier)        1000      $70/month
     └─ as MEMCACHIER

    ssl (th-prod-ssl)                      endpoint  $20/month
     └─ as SSL
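
A sketch of how the plans above could be kept as a small table, both to print the matching `heroku addons:create` commands and to total the monthly cost. Plans and prices are copied from the listings above; the resource names (th-stage-cloudamqp etc.) and the IRC settings needed by deployhooks are left out, so the printed commands are a starting point only.

```shell
# Sketch: the plans listed above as "service:plan monthly-price-in-USD" pairs.
ADDON_PLANS='cloudamqp:bunny 99
deployhooks:irc 0
librato:nickel 19
memcachier:1000 70
ssl:endpoint 20'

addon_cmds() {
    # Print one `heroku addons:create` line per plan for the given app.
    # deployhooks:irc also needs its IRC settings passed as extra options,
    # which are not recorded in this bug, so that line is incomplete as printed.
    printf '%s\n' "$ADDON_PLANS" | while read -r plan _price; do
        printf 'heroku addons:create %s --app %s\n' "$plan" "$1"
    done
}

monthly_total() {
    # Sum the price column.
    printf '%s\n' "$ADDON_PLANS" | awk '{ total += $2 } END { print total }'
}

addon_cmds treeherder-stage
echo "Per-app add-on cost: \$$(monthly_total)/month"
```

With the prices above this comes to $208/month per app, which is why the add-ons were held back until they were actually needed.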


Custom domains added (though the certs still need setting up later):

  $ heroku domains -a treeherder-stage
    === treeherder-stage Heroku Domain
    treeherder-stage.herokuapp.com

    === treeherder-stage Custom Domains
    Domain Name             DNS Target
    ──────────────────────  ──────────────────────────────
    treeherder.allizom.org  treeherder-stage.herokuapp.com

  $ heroku domains -a treeherder-prod
    === treeherder-prod Heroku Domain
    treeherder-prod.herokuapp.com

    === treeherder-prod Custom Domains
    Domain Name             DNS Target
    ──────────────────────  ─────────────────────────────
    treeherder.mozilla.org  treeherder-prod.herokuapp.com
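
A sketch of the app-to-domain mapping shown above, printing the `heroku domains:add` command for each app. The helper name is made up for illustration and nothing is executed.

```shell
# Sketch: map each app to the custom domain recorded above.
domain_for_app() {
    case "$1" in
        treeherder-stage) echo 'treeherder.allizom.org' ;;
        treeherder-prod)  echo 'treeherder.mozilla.org' ;;
        *) echo "unknown app: $1" >&2; return 1 ;;
    esac
}

for app in treeherder-stage treeherder-prod; do
    printf 'heroku domains:add %s --app %s\n' "$(domain_for_app "$app")" "$app"
done
```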


Papertrail log drains set up (need to follow the rest of http://help.papertrailapp.com/kb/hosting-services/heroku/#standalone to finish up the Papertrail side):

  $ heroku drains -a treeherder-stage | grep papertrail
    syslog+tls://REDACTED.papertrailapp.com:REDACTED (d.REDACTED)

  $ heroku drains -a treeherder-prod | grep papertrail
    syslog+tls://REDACTED.papertrailapp.com:REDACTED (d.REDACTED)
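
A sketch of how a drain like the ones above could be added. The Papertrail host and port are redacted in this bug, so they are passed in as arguments; the values in the usage line are placeholders, not the real ones.

```shell
# Sketch: build the `heroku drains:add` command for a syslog+tls drain.
# Host and port are supplied by the caller; the command is only printed.
drain_add_cmd() {
    app="$1"
    host="$2"
    port="$3"
    case "$port" in
        ''|*[!0-9]*) echo "port must be numeric: $port" >&2; return 1 ;;
    esac
    printf 'heroku drains:add syslog+tls://%s:%s --app %s\n' "$host" "$port" "$app"
}

# Placeholder values for illustration only.
drain_add_cmd treeherder-stage logsN.papertrailapp.com 12345
```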


Environment variables set up as much as possible at this point:

  $ heroku config -a treeherder-stage | egrep -v '^(LIBRATO|MEMCACHIER|CLOUDAMQP)'
    === treeherder-stage Config Vars
    AUTOCLASSIFY_JOBS:            1
    BROKER_URL:                   $CLOUDAMQP_URL
    DATABASE_URL:                 mysql://FILLME:FILLME@FILLME/treeherder
    DATABASE_URL_RO:              mysql://FILLME:FILLME@FILLME/treeherder
    NEW_RELIC_APP_NAME:           treeherder-stage
    NEW_RELIC_CONFIG_FILE:        newrelic.ini
    NEW_RELIC_LICENSE_KEY:        REDACTED
    SERVE_MINIFIED_UI:            1
    SITE_URL:                     https://treeherder.allizom.org
    TREEHERDER_ALLOWED_HOSTS:     treeherder.allizom.org
    TREEHERDER_DJANGO_SECRET_KEY: REDACTED
    TREEHERDER_REQUEST_HOST:      treeherder.allizom.org
    TREEHERDER_REQUEST_PROTOCOL:  https
    WEB_CONCURRENCY:              3

  $ heroku config -a treeherder-prod | egrep -v '^(LIBRATO|MEMCACHIER|CLOUDAMQP)'
    === treeherder-prod Config Vars
    AUTOCLASSIFY_JOBS:            1
    BROKER_URL:                   $CLOUDAMQP_URL
    DATABASE_URL:                 mysql://FILLME:FILLME@FILLME/treeherder
    DATABASE_URL_RO:              mysql://FILLME:FILLME@FILLME/treeherder
    NEW_RELIC_APP_NAME:           treeherder-prod
    NEW_RELIC_CONFIG_FILE:        newrelic.ini
    NEW_RELIC_LICENSE_KEY:        REDACTED
    SERVE_MINIFIED_UI:            1
    SITE_URL:                     https://treeherder.mozilla.org
    TREEHERDER_ALLOWED_HOSTS:     treeherder.mozilla.org
    TREEHERDER_DJANGO_SECRET_KEY: REDACTED
    TREEHERDER_REQUEST_HOST:      treeherder.mozilla.org
    TREEHERDER_REQUEST_PROTOCOL:  https
    WEB_CONCURRENCY:              3
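
The two listings above differ only in the values derived from the app name and the public hostname. A sketch that generates those per-environment values in one `heroku config:set` command follows; secrets and the DB URLs are deliberately left out, and the helper name is made up for illustration.

```shell
# Sketch: print the config:set command for the host-dependent variables.
# Variable names are the ones listed above; nothing is executed.
site_config_cmd() {
    app="$1"
    host="$2"
    vars="NEW_RELIC_APP_NAME=$app"
    vars="$vars SITE_URL=https://$host"
    vars="$vars TREEHERDER_ALLOWED_HOSTS=$host"
    vars="$vars TREEHERDER_REQUEST_HOST=$host"
    vars="$vars TREEHERDER_REQUEST_PROTOCOL=https"
    printf 'heroku config:set %s --app %s\n' "$vars" "$app"
}

site_config_cmd treeherder-stage treeherder.allizom.org
site_config_cmd treeherder-prod treeherder.mozilla.org
```

Generating both from one function keeps stage and prod from drifting apart on these values.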
Eugh, for some reason their free-tier postgres addon keeps on getting added to both apps, and reappears after I delete it. Guessing a recent bug on their side, filed:
https://help.heroku.com/tickets/343353

For papertrail, following http://help.papertrailapp.com/kb/hosting-services/heroku/#standalone I've:
* Used the output of `heroku drains` to identify the system names on https://papertrailapp.com/groups/853883 and renamed them to treeherder-prod and treeherder-stage
* Added these two systems to the Treeherder group: https://papertrailapp.com/groups/1510504
* Set up the custom searches against the Treeherder group (had to tweak from that in the guide to make it group-specific)...

  TOKEN='REDACTED'
  API_URL='https://papertrailapp.com/api/v1/searches.json'
  GROUP='search[group_id]=1510504'

  curl -G -v -H "X-Papertrail-Token: $TOKEN" -X POST $API_URL --data-urlencode 'search[name]=Platform errors' --data-urlencode 'search[query]="error code=H" OR "Error R" OR "Error L"' --data-urlencode $GROUP
  curl -G -v -H "X-Papertrail-Token: $TOKEN" -X POST $API_URL --data-urlencode 'search[name]=Deploys' --data-urlencode 'search[query]=program:(heroku/api heroku/slug) -scheduler' --data-urlencode $GROUP
  curl -G -v -H "X-Papertrail-Token: $TOKEN" -X POST $API_URL --data-urlencode 'search[name]=Scheduler jobs' --data-urlencode 'search[query]=program:scheduler' --data-urlencode $GROUP
  curl -G -v -H "X-Papertrail-Token: $TOKEN" -X POST $API_URL --data-urlencode 'search[name]=Dyno state changes' --data-urlencode 'search[query]=web (Idling OR Unidling OR Cycling OR "State changed" OR "Starting process")' --data-urlencode $GROUP
  curl -G -v -H "X-Papertrail-Token: $TOKEN" -X POST $API_URL --data-urlencode 'search[name]=Web app output' --data-urlencode 'search[query]="app/web"' --data-urlencode $GROUP
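
The same five requests could also be wrapped in a helper so the name/query pairs are easier to review. This is only a sketch: it mirrors the flags used above (minus -v), TOKEN stays a placeholder, and setting CURL=echo prints the invocations instead of sending them.

```shell
# Sketch: one helper per saved search, with the request details in one place.
CURL="${CURL:-curl}"
TOKEN="${TOKEN:-REDACTED}"
API_URL='https://papertrailapp.com/api/v1/searches.json'
GROUP='search[group_id]=1510504'

create_search() {
    # $1 = search name, $2 = search query.
    "$CURL" -G -H "X-Papertrail-Token: $TOKEN" -X POST "$API_URL" \
        --data-urlencode "search[name]=$1" \
        --data-urlencode "search[query]=$2" \
        --data-urlencode "$GROUP"
}

create_all_searches() {
    create_search 'Platform errors' '"error code=H" OR "Error R" OR "Error L"'
    create_search 'Deploys' 'program:(heroku/api heroku/slug) -scheduler'
    create_search 'Scheduler jobs' 'program:scheduler'
    create_search 'Dyno state changes' 'web (Idling OR Unidling OR Cycling OR "State changed" OR "Starting process")'
    create_search 'Web app output' '"app/web"'
}
```

For a dry run: `CURL=echo create_all_searches`.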

Logs are now visible at:
https://papertrailapp.com/systems/treeherder-stage/events
https://papertrailapp.com/systems/treeherder-prod/events
Changes made to Memcachier for both apps:
* New Relic key added.
-> However their integration was broken - filed bug 1255456 to follow up.

Changes made to CloudAMQP for both apps:
* Notification email added (treeherder mailing list)
* CPU + memory alarms enabled
* Queue alarm enabled (for >=500 messages for 300s)
-> However their New Relic integration is also broken today! Following up in bug 1255460.
For some reason setting environment variables on the new apps has stopped working (via both the CLI and the web UI), have filed:
https://help.heroku.com/tickets/343712
Attached file App setup script (obsolete)
Attached file Dump app config script (obsolete)
For usage like so:
./heroku-dump-config.sh treeherder-heroku > treeherder-heroku.txt 2>&1

...to allow easy diffing of each app.
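
A sketch of the diffing step, assuming two dumps have already been written with the script as shown above. The wrapper name is made up; it exists only because `diff` exits with 1 when the files differ, which is the expected case here.

```shell
# Sketch: compare two config dumps produced by heroku-dump-config.sh.
diff_dumps() {
    # Exit code 1 just means "files differ"; only codes above 1 are real errors.
    diff -u "$1" "$2"
    status=$?
    [ "$status" -le 1 ]
}

# Possible usage (not run here):
#   ./heroku-dump-config.sh treeherder-stage > treeherder-stage.txt 2>&1
#   ./heroku-dump-config.sh treeherder-prod > treeherder-prod.txt 2>&1
#   diff_dumps treeherder-stage.txt treeherder-prod.txt
```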
Pretty much everything that isn't blocked has been done at this point, so closing this out.

Remaining work will be added as deps of bug 1176484.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
There's since been yet more back and forth on https://help.heroku.com/tickets/343712 ("Unable to set/change environment variables") relating to release-phase quirks. If anyone wants access to the ticket, let me know. (I was under the impression that adding people to the "share" list on a ticket just gave them access without spamming them, but mdoglio said emails are sent after all, so I've stopped adding people to so many tickets.)
Attached file App setup script v2 (obsolete)
Updating the setup script for future reference:
* Added NEW_RELIC_API_KEY (different from licence key, needed by bug 1165229)
* Adjusted dummy DATABASE_URL/DATABASE_URL_RO so django-environ finds it more parsable
* Mentioned release-phase labs
* Updated buildpack SHAs to reflect changes in bug 1254961
Attachment #8729470 - Attachment is obsolete: true
For some reason the papertrail 'system' entries for the new stage/prod heroku apps disappeared, causing them not to appear in the treeherder group.

For future reference, to restore them the steps were:
1) Run `heroku drains --app treeherder-FOO`
2) In the resultant output, copy the instance id of form 'd.1234-5678....'
3) Visit https://papertrailapp.com/dashboard -> All systems
4) Use the filter field to search for that instance
5) Choose 'settings' next to that instance
6) Give it an appropriate name (eg treeherder-stage)
7) Visit the treeherder group from the main dashboard page, and edit the group membership to ensure that the new name is included
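
Steps 1 and 2 above can be sketched as a small filter that pulls the instance id out of the drains output. It assumes each line ends with the id in parentheses, as in the `heroku drains` output quoted earlier in this bug; newer CLI versions may format this differently.

```shell
# Sketch: read `heroku drains` output on stdin and print the 'd.…' token
# found inside the trailing parentheses of each line.
drain_ids() {
    sed -n 's/.*(\(d\.[^)]*\))[[:space:]]*$/\1/p'
}

# Possible usage (not run here):
#   heroku drains --app treeherder-stage | grep papertrail | drain_ids
```

The printed id is what goes into the filter field on the Papertrail dashboard in step 4.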

Stage/prod are now visible here again:
https://papertrailapp.com/groups/1510504
Attached file App setup script v3
Changes:
* Sets up the foundelasticsearch addon
* Sets a placeholder ELASTICSEARCH_URL env variable
* Sets SKIP_PREDEPLOY=1
* Uses newer buildpack versions
Attachment #8731261 - Attachment is obsolete: true
No longer blocks: 1176488
(In reply to Ed Morley [:emorley] from comment #10)
> * Set up the custom searches against the Treeherder group (had to tweak from
> that in the guide to make it group-specific)...

The saved searches don't show up properly in the "saved searches" menu in the events view; only the ones from "all systems" are listed.

Have emailed Papertrail support with:

"""
When I visit https://papertrailapp.com/systems/treeherder-stage/events and click the saved searches button on the bottom toolbar, I'm shown:

Saved Searches
    docker econnrefused
    docker-500-error
    docker-worker
    ECONNABORTED
    error retrieving secrets
    ISE errors
    no diskspace
    provisioner - alert operator
    provisioner-spot decisions
    read_pulse_jobs
    Task error - aufs
    taskcluster-monitor - ISE
    taskcluster-queue:OperationTimedOut

...which is missing the saved searches assigned to the "Treeherder" group that I've created for that system:
https://papertrailapp.com/groups/1510504

Instead the entries above are only from the global saved searches that I see under "all systems":
https://papertrailapp.com/groups/853883

However, I've found if I save a search against this exact system (rather than the "Treeherder" group or "All systems"), then it *is* shown at the top of the saved searches list.

So it seems like:
1) Either the saved searches from the Treeherder group are not being included at all, or else the saved searches menu only shows the top N saved searches. If the former, that seems like a bug, if the latter, a "show more" or similar would be ideal.
2) With #1 fixed, it would be really helpful if saved searches for any user-created groups for that system, were shown higher in the saved searches menu order than those from "all systems".
"""
Comment on attachment 8729471 [details]
Dump app config script

A newer version of this script can now be found in bug 1176484.
Attachment #8729471 - Attachment is obsolete: true