Closed Bug 1508780 Opened 7 years ago Closed 6 years ago

Please set up a pipeline to serve product-details files

Categories: Cloud Services :: Operations: Miscellaneous
Type: task
Priority: Not set
Severity: normal

Tracking: (Not tracked)

Status: RESOLVED FIXED

People

(Reporter: rail, Assigned: oremj)


The idea is to serve the static files from https://github.com/mozilla-releng/product-details/ at https://product-details.mozilla.org (eventually). I'll file a separate bug to move the DNS of the production site. We would need multiple domains, depending on the branch of the repo:

1) `staging` branch -> staging environment -> https://product-details.staging.mozilla-releng.net
2) `testing` branch -> testing environment -> https://product-details.testing.mozilla-releng.net
3) `production` branch -> production environment -> https://??? (something that we will replace in the future)

Rok or I can set up the CNAMEs for the domains above to point at a particular DNS name.

BTW, how would we trigger a sync from GitHub? A webhook, maybe?
We have two options here:

1) Build a docker image from https://hub.docker.com/_/nginx/ and deploy it like any other "dockerflowed" application.
2) Come up with a custom process to sync these files to a GCS bucket that is fronted by a GLB (https://cloud.google.com/load-balancing/docs/https/adding-a-backend-bucket-to-content-based-load-balancing).

Option 1 is a lot more flexible: we can add custom headers, redirects, rewrites, or anything else that nginx can do. Option 2 has a bit less complexity, but also less flexibility. Since option 2 needs a custom script/process that does the syncing, it might be a bit more time consuming to set up. Thoughts? If we choose option 1, I can assist with creating the Circle and Docker files.
Flags: needinfo?(rail)
:oremj let's go with number one; then we follow the same workflow as we do when deploying other services: we give you a docker image and you deploy it. Do you maybe have an example nginx configuration that I can reuse or reference as a guide, especially if there are any dockerflow-specific bits in there? We'll set up taskcluster as we do with other projects. Would that be ok?

:rail This would also be useful functionality to have in our base docker image generally, for other frontend things in the future.
Maybe another option (3) could be to not use GCP at all (I'm not sure how married we are to GCP) and use another service such as https://www.netlify.com. It's a CDN which integrates nicely with GitHub and allows us to set custom headers as well. I think the taskcluster team is already using it for their frontend.
I also looked at Google Cloud Storage buckets, and it looks like they don't support setting the "Content-Security-Policy" header, which I think should be a dealbreaker. There is a feature request for this: https://issuetracker.google.com/issues/36427250. If it were possible, I would actually vote for this option, since we already have code to push to S3 and adding another target would be a simple task for us.

netlify.com, on the other hand, does support setting custom headers (including the "Content-Security-Policy" header): https://www.netlify.com/docs/headers-and-basic-auth/#custom-headers. So if it is possible to use netlify.com, I think that would be the easiest for us, but I'm not sure about our policy for this service. A lot of frontend developers like to use that service for their deployments (we would also like to move our frontend pages to netlify; currently we push them to S3, where we cannot set the "Content-Security-Policy" header either), which might be one more incentive to look closer at it. On the other hand, this increases complexity since we would be using more services, and I would understand if we cannot use netlify.com.
Looking at the whole workflow of netlify.com, it seems that it would completely exclude CloudOps from the deployment, since we would just set everything up there and keep committing to a repository. I'm not sure this is desired (you would have less work and be able to focus on other things), but maybe you want to keep control over where/how we deploy. Maybe having the owner rights to change things (SSL cert, headers) in netlify.com would be your job, and you would only give us permission to commit to the repository you own. Does this make sense?

/me promises to stop commenting on my own comments :)
These are interesting points! There is an opportunity here to provide a lightweight self-service way of doing things alongside the traditional CloudOps model. We don't have a properly defined answer yet, but this is something we need to consider as we head into 2019. There are a few options to consider, and I'm leaning towards Firebase to stay in the current hosting ecosystem. This will require some testing, but it is part of our overall contract.
:habib I think what I said above only applies to frontend projects. In this case we don't ship any JavaScript, so those security headers are pointless. I think I overcomplicated things here.

:oremj I think the easiest way would be to sync everything to Google Cloud Storage. Here is my suggestion for how to do it and the reasoning behind it:

1. I'd use taskcluster to run a task on each commit to the "testing", "staging" and "production" branches. I would prefer taskcluster since it lets us see when things break, as opposed to running it in your CloudOps Jenkins instance (here I'm assuming it would be harder to get read-only access and nice GitHub integration there; I might be wrong).
2. To "rsync" to Google Cloud Storage we need the gcloud/gsutil tools packaged in a docker image. I would use the "blessed" google/cloud-sdk image [1], which comes with all the tools we need.
3. Because we use taskcluster, we can simply store the GCS API key in taskcluster secrets. The model the taskcluster GitHub integration follows is that a task automatically gets assigned scopes per branch, e.g. a task triggered on the testing branch would automatically get the scopes associated with repo:github.com/mozilla-releng/product-details:branch:testing.
4. Once we fetch the API key (which only has the scopes to upload to one bucket), we can then use "gsutil rsync" [2].

Do you want me to implement this? For this I would either need those API keys, or you could create those taskcluster secrets for me.

[1] https://hub.docker.com/r/google/cloud-sdk/
[2] https://cloud.google.com/storage/docs/interoperability#using_the_gsutil_command_line
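
(For illustration only, here is a minimal Python sketch of the sync step proposed in points 2-4, assuming it runs inside the google/cloud-sdk image and that the service-account key has already been fetched from a taskcluster secret into a file. The bucket names and environment variables are placeholders, not the real configuration.)

```python
#!/usr/bin/env python3
"""Illustrative sync step: mirror the checkout into a per-branch GCS bucket.

Assumes the google/cloud-sdk image (gcloud + gsutil on PATH) and that the
service-account key was already written to the file named by GCS_KEY_FILE.
Bucket names below are placeholders.
"""
import os
import subprocess

# Placeholder branch -> bucket mapping; the real bucket names would differ.
BUCKETS = {
    "testing": "gs://product-details-testing",
    "staging": "gs://product-details-staging",
    "production": "gs://product-details-production",
}


def sync(branch: str, key_file: str, src_dir: str = "public/") -> None:
    # Authenticate gcloud/gsutil with the per-branch service account key.
    subprocess.run(
        ["gcloud", "auth", "activate-service-account", "--key-file", key_file],
        check=True,
    )
    # Mirror the repo contents into the bucket: -r recurses, -d deletes
    # objects that no longer exist in the source directory.
    subprocess.run(
        ["gsutil", "rsync", "-r", "-d", src_dir, BUCKETS[branch]],
        check=True,
    )


if __name__ == "__main__":
    sync(os.environ["BRANCH"], os.environ["GCS_KEY_FILE"])
```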
Flags: needinfo?(oremj)
Flags: needinfo?(oremj)
Habib is out this week. I'll chat with him early next week and we can figure out a strategy for this.
Flags: needinfo?(oremj)
Oops, I missed the NI somehow. I think we are on track with this issue.
Flags: needinfo?(rail)
Can we move all files meant to be public to a subdirectory? For example, public/, so we can add CI or other configuration files without needing to exclude them when syncing to their final location.
Flags: needinfo?(oremj) → needinfo?(rail)
:oremj sure, I will try to do this soon (today or tomorrow). I will use a public/ folder.
Flags: needinfo?(rail)
:oremj I implemented this [1], but I'm still testing whether it works correctly. I should finish this soon. Just to make sure: this is going to be done via a system where the releng team (and also CI duty) can investigate what went wrong and, in some cases, rerun it. I suggest taskcluster since the releng and CI teams already know how to work with it.

[1] https://github.com/mozilla/release-services/pull/1778

We've decided that Firebase is not going to be flexible enough for our static hosting needs, so we are going to deploy this to S3 + CloudFront.

Talked about in ops/rel meeting today:

We need to change the format of https://github.com/mozilla-releng/product-details/tree/master/public to match what is currently at https://product-details.mozilla.org/

The buckets and CloudFront distributions are ready to go. I'll need access to taskcluster or CircleCI (admin on the project) to configure the AWS credentials.

Flags: needinfo?(rgarbas)

:oremj I created dummy secrets for each branch.

As members of the releng group we get all the scopes for repo:github.com/mozilla-releng repos, but we don't have the scopes to grant them to others; for that we need to ask the taskcluster team.

Should we give permission (scopes, in taskcluster terms) to a group? I assume you have some LDAP group for CloudOps admins, right? With that info we can then request a permission change to allow that LDAP group to change those secrets.

Flags: needinfo?(rgarbas) → needinfo?(oremj)

Our group should all have assume:mozilla-group:team_services_ops. Does that work?

Also, can you grant me admin access on https://github.com/mozilla-releng/product-details as well?

Flags: needinfo?(rgarbas)

I see I now have access to the GitHub repo, so it looks like I just need access to the secrets now.

:oremj I submitted Bug 1527571 to get the whole CloudOps team the appropriate taskcluster scopes.

Depends on: 1527571
Flags: needinfo?(rgarbas)

Is it alright to commit files to this repo (Taskcluster or CircleCI configs)?

Flags: needinfo?(rgarbas)
Flags: needinfo?(rail)

:oremj yes, feel free to create a taskcluster configuration (let's not use CircleCI) at the root of the repository. Let's design this configuration on the master branch, and then we can copy it over to the other branches.

Flags: needinfo?(rgarbas)
Flags: needinfo?(rail)
Flags: needinfo?(oremj)
Flags: needinfo?(oremj)

Still waiting on those permissions. Do you know how long it typically takes to get those pushed through? Maybe we need to ping someone?

Flags: needinfo?(oremj) → needinfo?(rail)

Sorry it took so long. We should be clear now.

Flags: needinfo?(rail)

I also need permissions to create a role for the repo. Example error:

You are not authorized to perform the requested action. Please sign in and try again, or verify your scopes in the Credentials Manager.
Client ID mozilla-auth0/ad|Mozilla-LDAP|oremj does not have sufficient scopes and are missing the following scopes:

auth:create-role:repo:github.com/mozilla-releng/product-details:branch:testing

Flags: needinfo?(rgarbas)
Flags: needinfo?(rail)

Jeremy, according to https://bugzilla.mozilla.org/show_bug.cgi?id=1527571#c3 we want to use declarative roles. What roles do we need to create? Sorry, I'm a bit out of date on the context of this bug.

Flags: needinfo?(rail)

:oremj while we wait for ci-config / ci-admin to work, tell me which roles you want to create and I (or rail) will create them for you.

Flags: needinfo?(rgarbas)

I'm new to taskcluster, so I might be mistaken, but I think we need a role for each branch of the repo, with each role having access to its related secret.

:oremj you are correct about how the taskcluster roles/secrets should be set up. I just created the roles for each branch and gave them the scope to read the per-branch secrets.
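
(For illustration, a minimal sketch of what those per-branch roles could look like if created through the taskcluster Python client; the secret paths and descriptions are placeholders, not the actual secrets that were created.)

```python
#!/usr/bin/env python3
"""Illustrative creation of per-branch roles via the taskcluster Python client.

The secret paths below are placeholders; the real roles/secrets live in the
mozilla-releng taskcluster instance. Requires credentials carrying the
auth:create-role:* scopes (taken from TASKCLUSTER_* environment variables).
"""
import taskcluster

auth = taskcluster.Auth(taskcluster.optionsFromEnvironment())

for branch in ("testing", "staging", "production"):
    role_id = f"repo:github.com/mozilla-releng/product-details:branch:{branch}"
    auth.createRole(role_id, {
        "description": f"Deploy product-details from the {branch} branch",
        # Each role can only read the secret for its own branch (placeholder path).
        "scopes": [f"secrets:get:project/releng/product-details/{branch}"],
    })
```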

Please make the following DNS changes:

CNAME product-details.testing.mozilla-releng.net to productdetails-testing.dev.mozaws.net
CNAME product-details.staging.mozilla-releng.net to productdetails-staging.stage.mozaws.net

Flags: needinfo?(rgarbas)
Flags: needinfo?(rail)

:oremj https://github.com/mozilla-releng/build-cloud-tools/pull/374

Are you able to also apply the Terraform changes, or should I apply them?

Flags: needinfo?(rgarbas)
Flags: needinfo?(rail)

We have a successful first diff of product-details between V1 and V2:

https://gist.github.com/garbas/b008d13a3128efa5d5ca2d94a3a1f58a

We already discussed all the changes with :rail and they are expected.

I will keep updating this diff daily (or on request) for a week or two, just so we can see how a few cycles of betas are processed.

I don't have access to apply changes to the releng zone.

Jake applied the changes yesterday and they should be live by now. I tried to connect to https://product-details.testing.mozilla-releng.net/ and https://product-details.staging.mozilla-releng.net/, but Firefox complains and refuses to connect:

An error occurred during a connection to product-details.testing.mozilla-releng.net. Cannot communicate securely with peer: no common encryption algorithm(s). Error code: SSL_ERROR_NO_CYPHER_OVERLAP

Jeremy, any idea why this happens?

Flags: needinfo?(oremj)

Needed a small fix to the CloudFront config. That should be fixed now; it's still propagating, but it is already working for me.

Flags: needinfo?(oremj)

Sweet! Thank you.

:oremj can we turn on index listing on those buckets? This is currently possible with product-details.mozilla.org.

Flags: needinfo?(oremj)

Also, another thing: can we enable HSTS headers for these domains (and in the future also for production)? I suppose we would need to set up a Lambda function to add these headers.

(In reply to Rok Garbas [:garbas] from comment #37)

> :oremj can we turn on index listing on those buckets? This is currently possible with product-details.mozilla.org.

There is no way to turn on index listing on S3 buckets. You'd have to generate an index.html file if you want this behavior.

(In reply to Rok Garbas [:garbas] from comment #38)

> Also, another thing: can we enable HSTS headers for these domains (and in the future also for production)? I suppose we would need to set up a Lambda function to add these headers.

I'll look into setting HSTS headers.
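
(One common way to do this with S3 behind CloudFront is a Lambda@Edge function on the response trigger. The sketch below is illustrative only and is not necessarily the change that was ultimately deployed.)

```python
"""Illustrative Lambda@Edge handler that adds HSTS to CloudFront responses.

Attach to the distribution as a viewer-response (or origin-response) trigger.
A sketch of one possible approach, not the actual change that shipped.
"""


def handler(event, context):
    # CloudFront hands us the response it is about to send to the viewer.
    response = event["Records"][0]["cf"]["response"]
    # Header names are lowercase keys; values are lists of {key, value} dicts.
    response["headers"]["strict-transport-security"] = [{
        "key": "Strict-Transport-Security",
        "value": "max-age=31536000",
    }]
    return response
```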

Note: mozilla-django-product-details relies on the file listing pages to find all the files to update locally, as well as Last-Modified headers to determine which files should be updated. This library is in use by a number of our websites including www.mozilla.org, MDN, and SUMO.

See bug 1226677 from when product-details.m.o was first implemented.

:oremj oh, I thought there was an option for index listing. I will generate an index.html in each folder.
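
(A minimal sketch of what generating those per-folder index.html files could look like; the public/ layout and markup are assumptions, not the actual implementation.)

```python
#!/usr/bin/env python3
"""Illustrative generation of per-folder index.html files.

S3 has no directory listing, so each folder under public/ gets a static
index.html linking to its entries. Paths and markup are placeholders.
"""
import os
from html import escape


def write_indexes(root: str = "public") -> None:
    for dirpath, dirnames, filenames in os.walk(root):
        # Link sub-directories (with trailing slash) and files, skipping the
        # index.html we are about to (re)generate.
        names = [d + "/" for d in sorted(dirnames)] + sorted(
            f for f in filenames if f != "index.html"
        )
        rows = "\n".join(
            f'<li><a href="{escape(n)}">{escape(n)}</a></li>' for n in names
        )
        rel = os.path.relpath(dirpath, root)
        title = "/" if rel == "." else f"/{rel}/"
        with open(os.path.join(dirpath, "index.html"), "w") as fh:
            fh.write(
                f"<!doctype html>\n<title>Index of {escape(title)}</title>\n"
                f"<ul>\n{rows}\n</ul>\n"
            )


if __name__ == "__main__":
    write_indexes()
```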

I think we are going to need to move the deployment from taskcluster to CircleCI. Our team is much more familiar with CircleCI, and I'm running into problems with secrets in taskcluster: it has no protection against printing secrets, whereas CircleCI will strip them from the output.

Turns out turning on CircleCI for mozilla-releng requires some extra steps, so we might as well stay the course with taskcluster.

https://product-details.staging.mozilla-releng.net and https://product-details.testing.mozilla-releng.net/ are syncing. And I've set strict-transport-security: max-age=31536000 on all the objects.

Flags: needinfo?(oremj)

Let me know when I should proceed with the production set up.

The CNAME record for the product-details.mozilla-releng.net domain was updated in https://github.com/mozilla-releng/build-cloud-tools/pull/375 and is now pointing to productdetails-prod.prod.mozaws.net.

:oremj is it configured correctly?

Flags: needinfo?(oremj)

Looks like everything is up and running \o/

Paul, can you see if the new product-details site (https://product-details.mozilla-releng.net/) works fine with your automation? We are going to switch the DNS entries so it serves as https://product-details.mozilla.org/.

Flags: needinfo?(oremj) → needinfo?(pmac)

I just tested against product-details.mozilla-releng.net and everything worked perfectly! Nice work, all! And thanks for the ping.

Flags: needinfo?(pmac)

\o/ thanks for the quick turnaround!

Paul, BTW, in case you need something better than scraping, you can use https://github.com/mozilla-releng/product-details/tree/production to track the changes behind that web site.

Oh funny. Rail: we're mostly using our scripts to scrape and keep another git repo updated:

https://github.com/mozilla/product-details-json

So perhaps we can ditch our thing altogether and just use your repo?

Other sites still use the scraping, but bedrock (www.m.o) uses the above repo, which is kept up to date with a periodic Jenkins job.

AFAIK https://github.com/mozilla/product-details-json has some extra tests in place to sanity-check the data. I wonder if we can integrate these repos and simplify our lives.

It does indeed. Those tests are in place because the files we mostly use on the site (firefox_versions.json and firefox_primary_builds.json) depend on each other but are updated independently. So we ran into race conditions where an update would get a new firefox_versions.json file, for example, but no builds would yet be available for the new versions in firefox_primary_builds.json, so pages like www.mozilla.org/firefox/all/ could be blank for a bit. The checks in that repo simply make sure builds are available for all versions in firefox_versions.json.

https://github.com/mozilla/product-details-json/blob/master/update-product-details.py#L39-L63

But I could move those checks elsewhere, or perhaps your new system doesn't have this issue? In any case, I'd definitely be interested in getting together to discuss how best to consolidate our efforts on this.
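
(For reference, a minimal sketch of the kind of consistency check described above; it is not the code linked from update-product-details.py, and the exact keys checked there may differ.)

```python
"""Illustrative consistency check: refuse an update if firefox_versions.json
lists versions that have no builds in firefox_primary_builds.json."""
import json


def versions_have_builds(versions_path: str, builds_path: str) -> bool:
    with open(versions_path) as fh:
        versions = json.load(fh)
    with open(builds_path) as fh:
        builds = json.load(fh)  # {locale: {version: {platform: ...}}}

    # Every version that at least one locale has builds for.
    available = {v for per_locale in builds.values() for v in per_locale}

    # Version keys assumed for illustration; the real script may check others.
    wanted = {
        versions[key]
        for key in ("LATEST_FIREFOX_VERSION", "LATEST_FIREFOX_DEVEL_VERSION")
        if versions.get(key)
    }
    missing = wanted - available
    if missing:
        print("refusing update, no builds for:", sorted(missing))
    return not missing
```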

:pmac I moved those two checks into the product-details generation code. Would you be able to review them?

https://github.com/mozilla/release-services/pull/1987

Woooooo, I like this bug :D

I think we can close this. Please reopen if anything is missing.

Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED