Closed
Bug 1281821
Opened 8 years ago
Closed 6 years ago
Clarify the Pulse usernames and environment variables
Categories
(Tree Management :: Treeherder: Infrastructure, defect, P2)
Tree Management
Treeherder: Infrastructure
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: emorley, Assigned: emorley)
References
Details
Attachments
(4 files)
Broken out from bug 1281810 comment 5.
We use Pulse for:
* publishing events, which people can subscribe to
* the new style jobs ingestion from eg taskcluster
At the moment we have an account for each of these on prod (one called "treeherder" and the other "treeherder-prod"), and only an account for the second on stage (aiui).
We should:
* Decide whether two usernames are necessary for each environment, or if we should use one
* Pick more appropriate usernames for each (eg "treeherder-publishing-prod" and "treeherder-ingestion-prod")
* Pick more descriptive environment variable names
...which will hopefully make this less confusing for future password rotations :-)
Assignee | ||
Comment 2•8 years ago
|
||
> We should:
> * Decide whether two usernames are necessary for each environment, or if we
> should use one
I think we can probably just use one username for prod and one for stage, and have that username be used for both the "publish to pulse" and "ingest from pulse" features.
We could then combine the `PULSE_URI` and `PULSE_DATA_INGESTION_CONFIG` environment variables into one (I think a good name for the combined one would be `PULSE_URL`, to match eg `DATABASE_URL`).
We'd then make pulse data ingestion conditional on `PULSE_DATA_INGESTION_CONFIG` being set, and not `PULSE_URL`, since the latter would default to `amqps://guest:guest@pulse.mozilla.org/` etc.
Thoughts? :-)
Flags: needinfo?(cdawson)
Comment 3•8 years ago
|
||
(In reply to Ed Morley [:emorley] from comment #2)
> We'd then make pulse data ingestion conditional on
> `PULSE_DATA_INGESTION_CONFIG` being set, and not `PULSE_URL`, since the
> latter would default to `amqps://guest:guest@pulse.mozilla.org/` etc.
I'm not quite following here. Didn't you say in the previous sentence that we'd eliminate ``PULSE_DATA_INGESTION_CONFIG`` and just use ``PULSE_URL``?
Regardless, though, I'm fine with making it the one URL and have pulse ingestion fail if it's guest, and enabled if it's SSL and uses the user. I think it'll throw errors if you try to use guest with ingestion, though I haven't tried.
So, yeah, this all SGTM for unifying our env vars... :)
Flags: needinfo?(cdawson)
Assignee | ||
Comment 4•8 years ago
|
||
> I'm not quite following here. Didn't you say in the previous sentence that we'd eliminate ``PULSE_DATA_INGESTION_CONFIG`` and just use ``PULSE_URL``?
Ah sorry I meant "make it conditional on `PULSE_DATA_INGESTION_SOURCES`" -- I think this perhaps highlights why it would be good to tweak the variable names, I keep on getting them mixed up hehe :-)
Comment 5•8 years ago
|
||
Ahh, gotcha. Yeah, it's way too easy to do at this point. :)
Comment 6•8 years ago
|
||
Talked to Ed. We will wait till we're on Heroku because it'll be easier to manage that transition. So removing this as a blocker to pulse.
Assignee | ||
Updated•8 years ago
|
Assignee: nobody → emorley
Assignee | ||
Comment 7•8 years ago
|
||
As part of this, it would be great to move away from using multi-line environment variables. They've made my SCL3->Heroku config comparison script a lot more complicated than it need be. For example a selection of the regex I have to use to clean up the SCL3 env dump:
sed -n '1h;1!H;${g;s/\n //g;p;}' "$TMPFILE" \
| sed -n '1h;1!H;${g;s/\n]/]/g;p;}' \
| sed 's/ //g' \
| sed 's/,"/, "/g' \
| sed 's/,{/, {/g' \
Comment 9•6 years ago
|
||
Comment 10•6 years ago
|
||
Once PR 3871 has been merged we'll need to do some tidy up on each Heroku env's config before deploying:
* Rename `PULSE_URI` -> `PULSE_URL`
* Remove `PULSE_DATA_INGESTION_CONFIG`
* Remove `PULSE_DATA_INGESTION_SOURCES`
* Remove `PULSE_PUSH_SOURCES`
* Add `PULSE_SOURCE_EXCHANGES="exchange/taskcluster-treeherder/v1/jobs,exchange/fxtesteng/jobs"
This means we're now down to the following configuration values:
* `PULSE_EXCHANGE_NAMESPACE`
* `PULSE_SOURCE_DESTINATIONS`
* `PULSE_SOURCE_EXCHANGES`
* `PULSE_SOURCE_PROJECTS`
* `PULSE_URL`
Only `PULSE_URL` and `PULSE_EXCHANGE_NAMESPACE` are left in our settings.py.
I've left `PULSE_EXCHANGE_NAMESPACE` there because it appears to only be used for publishing jobs in `treeherder/model/tasks.py` and IIRC that's going away at some point?
This leaves only `PULSE_URL`, should this move into `treeherder/services/pulse/connection.py`?
Comment 11•6 years ago
|
||
After some more thorough testing I realised `PULSE_PUSH_SOURCES` is still in use so I've refactored my changes such that we now have job_sources and push_sources (ingestion covers both!). In the Heroku envs we can still remove this config setting however since it's set to the default and we shouldn't have unnecessarily config overrides.
Comment 12•6 years ago
|
||
Commits pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/cc4bfa47f6ce60ee5722b65aa38ae169d7464e1f
Bug 1281821 - Use PULSE_URI instead of PULSE_DATA_INGESTION_CONFIG
https://github.com/mozilla/treeherder/commit/6e7b1efac3ce16b9c17ec64fa01d4908ade8dfb2
Bug 1281821 - Rename PULSE_URI -> PULSE_URL to match other config vars
https://github.com/mozilla/treeherder/commit/fb99e98c4f48d74ebc04c9455cc70133ab6123c8
Bug 1281821 - Split PULSE_DATA_INGESTION_SOURCES into three separate settings
Comment 13•6 years ago
|
||
Comment 14•6 years ago
|
||
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/079956945f8e8337ac5db30b0a68b7f73649cf37
Bug 1281821 - use PULSE_URI instead of PULSE_URL for publishing to Pulse
Assignee | ||
Updated•6 years ago
|
Assignee: emorley → ghickman
Status: NEW → ASSIGNED
Assignee | ||
Comment 15•6 years ago
|
||
Thank you for all the great refactoring/simplification here!
Looking through the remaining configuration, I think there's the following potential further cleanup:
* removal of destinations (handled by PR #3884)
* consolidation of PULSE_URI and PULSE_URL (bug 1342134)
* possible moving of PULSE_URL to services
* removal of the `PULSE_EXCHANGE_NAMESPACE` concept (since publishing is currently only used for job retriggering, and it would actually likely be useful to have stage/dev use the same exchange as prod, allowing testing of retriggers - since currently nothing listens to the `treeherder-staging` or `treeherder-prototype` exchanges).
* moving of all non-secrets related environment variables from Heroku environment variables to the default values in-code. eg: syncing the default `PULSE_JOB_EXCHANGES` to match what is used in production (it's currently different) and then unsetting the prod value. For some of these we can probably also just remove the env configurability entirely and have as fixed constants (since the values change so rarely, and we've never really set them differently on each of dev/stage/prod).
* apply the same refactoring to PULSE_PUSH_SOURCES as was done for the jobs config
* possibly simplifing the PULSE_JOB_PROJECTS and future PULSE_PUSH_<routing-key> environment variables into one, allowing people to specifying a single "pulse ingestion repository" filter (for the use case of solving the data ingestion backlog problem in vagrant by filtering to a subset of the repositories).
* seeing if there is a way to get pulse ingestion working with a pulse.mozilla.org guest account (which can be configured by default in Vagrant, allowing things to work out of the box)
Comment 16•6 years ago
|
||
While looking at where `PULSE_EXCHANGE_NAMESPACE` was used I also found our TreeherderPublisher was overly complicated for our usage, handled in PR #3919.
Comment 17•6 years ago
|
||
> apply the same refactoring to PULSE_PUSH_SOURCES as was done for the jobs config
I'm not sure we should even make this configurable per env. The current default mirrors what each Heroku env uses and we don't override it in Vagrant or the tests.
Should we remove the configurability of this and simply document how one can limit the push sources for local dev if necessary?
Assignee | ||
Comment 18•6 years ago
|
||
Yeah let's just remove the configurability of PULSE_PUSH_SOURCES entirely.
Comment 19•6 years ago
|
||
Comment 20•6 years ago
|
||
With PR #3926[1] I've converted PULSE_PUSH_SOURCES into a list of strings in the format `<exchange>.<routing_key>[:<routing_key>]`.
While looking at adding some test coverage to the read_pulse_jobs and read_pulse_pushes commands I noticed the only major difference between the two is how they retrieve and build their routing keys.
What about converting the job configuration back to a single key, `PULSE_JOB_SOURCES`? Rather than a JSON key it would follow the format of the pushes key and be a list of strings. With the removal of job destinations we could follow the same format I showed above since we use projects as part of the routing_key.
[1]: https://github.com/mozilla/treeherder/pull/3926
Comment 21•6 years ago
|
||
Having looked at moving PULSE_URL into the Pulse service I think we have two options:
# Settings-based
Leave it in settings so we can override it in tests/settings.py. This leaves one configuration value in the settings but it would mirror things like DATABASE_URL, etc and allows us to not set a URL in dev but set one for tests using tests/settings.py.
# Service-based
Move it into treeherder.services.pulse.connection and set a default when pulling it from the env. This would make the tests work but failures from the read_pulse_* in dev to be Connection refused rather than the more explicit "please set PULSE_URL" from Django environ.
I think I'm leaning towards the settings-based option because it simplifies local dev, what do you think Ed?
Comment 22•6 years ago
|
||
Turns out there's more to this than I realised, caught by a stale env! I've broken the discussion out into https://bugzilla.mozilla.org/show_bug.cgi?id=1484196
Comment 23•6 years ago
|
||
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/425350fb2b785320f4b497cabbbcebcc5ff4d7a0
Bug 1281821 - Convert Pulse push configuration into a list
We don't currently change the push ingestion sources in any of our envs
(prod, staging, prototype, test, dev) from their defaults. This
hardcodes those unchanging values and documents where to change them
should an engineer need to.
Assignee | ||
Updated•6 years ago
|
Assignee: ghickman → emorley
Assignee | ||
Comment 24•6 years ago
|
||
I've cleaned up the now unused environment variables from {prototype,stage,prod}, using:
{thd,ths,thp} config:unset PULSE_EXCHANGE_NAMESPACE PULSE_JOB_DESTINATIONS PULSE_JOB_EXCHANGES PULSE_PUSH_SOURCES PULSE_URI
Assignee | ||
Updated•6 years ago
|
Comment 25•6 years ago
|
||
Comment 26•6 years ago
|
||
Commit pushed to master at https://github.com/mozilla/treeherder
https://github.com/mozilla/treeherder/commit/55cf146ef176f9997f06d86d407a3a5f1ee8a441
Bug 1281821 - Docs: Correct Pulse environment variable references (#4097)
To account for the changes in #3948.
Assignee | ||
Updated•6 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•