Deploy shipitscript into GCP
Categories: Release Engineering :: General, enhancement
Tracking: Not tracked
People: Reporter: catlee; Assigned: rail
Whiteboard: [releng:q12019]
Comment 1•5 years ago
:oremj, would it be possible to create web-less instances (just like we did for the shipit worker) for the new project scriptworker/shipit?
As usual, we would need this for 3 environments:
- testing (docker tag: scriptworker/shipit-docker-testing-latest)
- staging (docker tag: scriptworker/shipit-docker-staging-latest)
- production (docker tag: scriptworker/shipit-docker-production-latest)
We will also need network access to shipit.
Comment 2•5 years ago
(In reply to Rok Garbas [:garbas] from comment #1)
> - testing (docker tag: scriptworker/shipit-docker-testing-latest)
> - staging (docker tag: scriptworker/shipit-docker-staging-latest)
> - production (docker tag: scriptworker/shipit-docker-production-latest)

I think we need to fix these tags; slashes are not allowed.
Comment 3•5 years ago
:autrilla
new tags are:
- scriptworker_shipitscript_docker_testing
- scriptworker_shipitscript_docker_staging
and those images are pushed to https://hub.docker.com/r/mozilla/release-services/tags
For now we can skip production until we figure out how to handle secrets.
Comment 4•5 years ago
I updated the images and they are ready (tested on my laptop, TM) to go. I have the secrets, just need to hand them over.
We also need to figure out how to properly configure the network, so we can connect to both ship-it APIs (until we retire v1).
- ship-it v1 is hosted by IT and requires a VPN connection (vpn_shipit for prod and vpn_shipitdev LDAP groups, I believe). ericz may have better info.
- ship-it v2 is hosted by cloudops and restricted by IP; vpn_cloudops_shipit is the LDAP group we use to add users. I'm not sure if the LDAP group is relevant in this case.
Comment 5•5 years ago
(In reply to rail@mozilla.com from comment #4)
> I updated the images and they are ready (tested on my laptop, TM) to go. I have the secrets, just need to hand them over.

Great! Is there any difference in how the image should be run on each environment, other than the secrets?

> We also need to figure out how to properly configure the network, so we can connect to both ship-it APIs (until we retire v1).
> - ship-it v1 is hosted by IT and requires a VPN connection (vpn_shipit for prod and vpn_shipitdev LDAP groups, I believe). ericz may have better info.

This might be a bit problematic; I thought we only needed to talk to v2. I imagine IT would want us to have a single static IP from which we talk to ship-it. Is that so, :ericz?
I haven't done anything like this before on GCP, but someone from my team has, and AIUI it was for applications we control, not for something run by IT.

> - ship-it v2 is hosted by cloudops and restricted by IP; vpn_cloudops_shipit is the LDAP group we use to add users. I'm not sure if the LDAP group is relevant in this case.

Talking to ship-it v2 won't be an issue since they're both in the same cluster and we won't need to cross into the internet.
Comment 6•5 years ago
(In reply to Adrian Utrilla [:autrilla] from comment #5)
> Great! Is there any difference in how the image should be run on each environment, other than the secrets?

The command line is the same (the default CMD directive). They are configured to use different configs depending on the env/secrets.
Comment 7•5 years ago
To talk to ship-it v1 I think we'd have to set it up on a public-facing load balancer with a different DNS name and then we could potentially limit it by IP address (or maybe something else but offhand I can't think of anything better).
Comment 8•5 years ago
:ericz, all our traffic from our nonprod (staging and testing) environments will come from 35.197.23.59. Could you whitelist this so we can talk to ship-it v1 from it? Let me know if you need any more information to do this.
Comment 9•5 years ago
I'm spinning off that work in new bug 1525746.
Comment 10•5 years ago
Sigh, the name won't resolve, because we use split-horizon DNS.
2019-02-15 03:04:59,899 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): ship-it-dev.allizom.org:443
Traceback (most recent call last):
File "/nix/store/sfx431rh4x09nv0sgripmn01rf6pwdb6-python3.7-urllib3-1.24.1/lib/python3.7/site-packages/urllib3/connection.py", line 159, in _new_conn
(self._dns_host, self.port), self.timeout, **extra_kw)
File "/nix/store/sfx431rh4x09nv0sgripmn01rf6pwdb6-python3.7-urllib3-1.24.1/lib/python3.7/site-packages/urllib3/util/connection.py", line 57, in create_connection
for res in socket.getaddrinfo(host, port, family, socket.SOCK_STREAM):
File "/nix/store/sh0rq55jaambzqx59g0kdk59g23vj8m6-python3-3.7.0/lib/python3.7/socket.py", line 748, in getaddrinfo
for res in _socket.getaddrinfo(host, port, family, type, proto, flags):
socket.gaierror: [Errno -2] Name or service not known
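The gaierror above is what split-horizon DNS looks like from inside the cluster: the name exists on the VPN's resolver but not on the cluster's. A minimal probe for this, assuming nothing beyond the stdlib (the helper name is mine, not part of the worker code):

```python
import socket

def resolves(host, port=443):
    """Return True if `host` resolves from this network's DNS view.

    Hypothetical probe: with split-horizon DNS the same name can
    resolve on the office VPN but raise gaierror inside the GCP
    cluster, exactly as in the traceback above.
    """
    try:
        socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
        return True
    except socket.gaierror:
        return False
```

The reserved .invalid TLD is guaranteed not to resolve anywhere, so it makes a handy negative control when debugging which DNS view a pod actually sees.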
Comment 11•5 years ago
The good thing is that the worker takes tasks from the queue. \o/
We may want to tweak the worker name a bit: it uses the hostname (which is not unique) and equals the first 22 characters of the k8s workload name (scriptworker-stage-shipitapi-app-1 -> scriptworker-stage-shi). I'm not even sure the name would be useful anyway...
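For illustration, the truncation described above can be sketched like this (the helper name is hypothetical, not the actual scriptworker code):

```python
def worker_id_from_hostname(hostname, max_len=22):
    # Hypothetical helper mirroring the observed behaviour: the worker
    # id is the (non-unique) hostname, clipped to its first 22 chars.
    return hostname[:max_len]

# The k8s workload name collapses to "scriptworker-stage-shi":
worker_id_from_hostname("scriptworker-stage-shipitapi-app-1")
```

With the stage and testing workload names sharing a long common prefix, the clipped ids can collide, which is why the name is of limited use.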
Comment 12•5 years ago
Adrian, do we autodeploy the docker images to stage/testing now? I tried to test a workaround, but it looks like scriptworker-stage-shipitapi-app-1 is still using the images from Feb 6.
Comment 13•5 years ago
We did not when you commented, but we do now. There's an up-to-date image in staging now.
Comment 14•5 years ago
Thank you!
Comment 15•5 years ago
We finally dropped ship-it v1 and don't need any special routes to MDC1/2, so we can undo the special settings.
Now I'm getting a 403 from https://api.shipit.staging.mozilla-releng.net/ when I try to run shipitscript. The idea was that they are in the same cluster, so the IP-based whitelisting should work out of the box, without any extra setup.
Adrian, can you:
1) get rid of the customization made in comment #8? We no longer need to communicate with ship-it v1. No rush with this.
2) make sure that scriptworker-stage-shipitapi-app-1 is whitelisted in either shipitapi-dev-shipitapi-app-1 or shipitapi-stage-shipitapi-app-1? I always forget which one corresponds to our staging :/ Maybe it'll resolve itself if you get rid of 1).
Probably it'd be better to align the names at some point, to get rid of this dev/stage/staging confusion with shipit.
Thank you in advance!
Comment 16•5 years ago
Regarding the 403: it's because you're trying to talk to it through the public IP. You should be able to connect to the Kubernetes service directly over HTTP (not HTTPS, since we terminate TLS at the edge).
In stage.shipitapi.nonprod.cloudops.mozgcp.net, this is http://shipitapi-stage-shipitapi-app-1.
In testing.shipitapi.nonprod.cloudops.mozgcp.net, this is http://shipitapi-testing-shipitapi-app-1.
Comment 17•5 years ago
D'oh... We enforce HTTPS in our apps, so I get a 302 redirect to HTTPS:
2019-04-23T14:56:33 INFO - 2019-04-23 14:56:33,862 - urllib3.connectionpool - DEBUG - http://shipitapi-stage-shipitapi-app-1:80 "PATCH /releases/Fennec-67.0b3-build1 HTTP/1.1" 302 345
2019-04-23T14:56:33 INFO - 2019-04-23 14:56:33,864 - urllib3.connectionpool - DEBUG - Starting new HTTPS connection (1): shipitapi-stage-shipitapi-app-1:443
Need to think about what to do...
Comment 18•5 years ago
This could be our NGINX redirecting you. If you send an X-Forwarded-Proto header set to https, that should prevent NGINX from redirecting. Not sure if that's doable. Otherwise we could expose the application directly through a Kubernetes Service instead of going through NGINX, but that's not ideal.
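The suggested trick, sketched with the stdlib (any HTTP client works; the URL reuses the service name and release path from the log above, and the approach assumes the app trusts X-Forwarded-Proto):

```python
import urllib.request

# Pretend the request already passed through a TLS-terminating proxy,
# so the server-side HTTP->HTTPS redirect is skipped. The request is
# only constructed here, not sent.
req = urllib.request.Request(
    "http://shipitapi-stage-shipitapi-app-1/releases/Fennec-67.0b3-build1",
    method="PATCH",
)
req.add_header("X-Forwarded-Proto", "https")
```

Note that spoofing this header only makes sense for trusted in-cluster traffic; a public-facing proxy must strip or overwrite it.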
Comment 19•5 years ago
Yeah, it's getting a bit hairier than I thought. :)
Rok, maybe you have some ideas?
There are a couple of issues:
1) When I use the FQDN to access the API endpoint, the requests end up hitting the public IP, which requires whitelisting the k8s replicas and defeats the idea that we should bypass public routes within the same cluster.
I wonder if the source IPs of requests coming from within the same cluster are the same as the public IP of that cluster, so we could easily whitelist it.
2) If I use the k8s names (e.g. shipitapi-stage-shipitapi-app-1), then I have to use http instead of https, but flask-talisman redirects to https in our case, and then the request times out.
I can hack the client requests and set the X-Forwarded-Proto header to "https". In this case we bypass flask-talisman, but then I hit an issue with mohawk, which verifies the auth headers but falls back to using port 443 instead of 80 for some reason. Probably it fails to properly guess the port in https://github.com/mozilla/release-services/blob/30fe29c037cb2a58d64ebdbf6dcf5b1456e14820/lib/backend_common/backend_common/auth.py#L391.
TBH, 2) sounds a bit dirty and hacky. :/
Any other alternatives?
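The port mix-up in 2) is consistent with a port guessed from the scheme: once X-Forwarded-Proto says https, a naive fallback picks 443 even though the request actually hit port 80, and a Hawk MAC (which covers host and port) no longer matches what the client signed. A sketch of that buggy shape; this is my reconstruction for illustration, not the actual auth.py code:

```python
from urllib.parse import urlparse

def guessed_port(url, forwarded_proto=None):
    # Naive fallback: when the URL carries no explicit port, derive it
    # from the effective scheme (X-Forwarded-Proto wins if present).
    parsed = urlparse(url)
    if parsed.port is not None:
        return parsed.port
    scheme = forwarded_proto or parsed.scheme
    return 443 if scheme == "https" else 80
```

Inside the cluster the request really targets port 80, but with forwarded_proto="https" this fallback yields 443, so the server computes the Hawk MAC over a different port than the client did and verification fails.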
Comment 20•5 years ago
We chatted about this today with Rok, and I think I'm going to take the second route. It will not require any special changes in either ship-it or GCP/k8s; this way we don't rely on special setup, but only on the client.
Comment 21•5 years ago
I submitted a couple PRs to address this:
https://github.com/mozilla-releng/shipitscript/pull/30
https://github.com/mozilla-releng/shipitapi/pull/15
Comment 22•5 years ago
Looks like we are ready to go with prod in bug 1547317. Let's do eet! :)
Comment 23•5 years ago
Found in triaging. We moved the shipitscript workers into GCP a while ago in bug 1581149.
I think we can close this for now. Feel free to re-open should I be wrong.