Automate biweekly runs of passwordmgr-related-realms-updater
Categories
(Cloud Services :: General, task)
Tracking
(Not tracked)
People
(Reporter: brian, Assigned: cvalaas)
Attachments
(1 file: 193.03 KB, image/png)
The team behind the https://github.com/mozilla/passwordmgr-remote-settings-updater tool needs it to run biweekly (once every two weeks).
There is a docker image for it available at https://hub.docker.com/repository/docker/mozilla/passwordmgr-related-realms-updater
This could be implemented similarly to the existing https://github.com/mozilla-services/cloudops-infra/tree/master/projects/ccadb2onecrl job.
Previous work on this was tracked in https://jira.mozilla.com/browse/SE-1735
Sven is handing this off to Chris, who should be able to get help from teammates on Services SRE in setting this up. In particular, I'm cc'ing Wei, since I know he can help with provisioning credentials and setting permissions in Remote Settings.
Comment 1 • 4 years ago
I already provisioned credentials for this, but the permissions need to be updated, since the scope has been expanded. The username is related-realms-publisher, and the password is in hiera-sops/app/kinto.{stage,prod}.yaml. I've filed https://github.com/mozilla-services/remote-settings-permissions/pull/235 to expand the permissions of the bot user to both of the collections the script is supposed to update. (Originally we planned to use that bot user only for one of the collections, hence the name.)
The Jenkins job should be pretty similar to the ccadb2onecrl one that Brian linked above. One difference is that we want to run the Docker image twice in this case, once for stage and once for prod, so there will be an additional stage in the Jenkinsfile.
Comment 2 • 4 years ago
One more thing – the ticket description says "biweekly runs". In my opinion, we could make that daily, or maybe Monday to Friday. The script checks whether there were any changes, and only updates the data if actually needed. I don't see any advantage in running this only every two weeks.
Tim, I think you requested to run this every fortnight. Do you have any concerns about running this daily instead?
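To illustrate the "only updates the data if actually needed" behaviour mentioned in this comment, here is a minimal Python sketch. It is not the updater's actual code (the real tool is a Node script); the record shape and the names `diff_records` and `needs_update` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical record shape: a dict with a unique "id" key plus arbitrary data.
Record = Dict[str, object]


def diff_records(
    current: List[Record], desired: List[Record]
) -> Tuple[List[Record], List[object]]:
    """Return (records to create or update, ids to delete)."""
    current_by_id = {record["id"]: record for record in current}
    desired_by_id = {record["id"]: record for record in desired}

    # A record needs writing if it is new or differs from the stored copy.
    upserts = [
        record
        for record_id, record in desired_by_id.items()
        if current_by_id.get(record_id) != record
    ]
    # A stored record needs deleting if it no longer appears in the source.
    deletes = [
        record_id for record_id in current_by_id if record_id not in desired_by_id
    ]
    return upserts, deletes


def needs_update(current: List[Record], desired: List[Record]) -> bool:
    """True only when at least one write or delete would be performed."""
    upserts, deletes = diff_records(current, desired)
    return bool(upserts or deletes)
```

With a check like this, running the job daily costs little: on days without upstream changes, `needs_update` returns False and nothing is written.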
Comment 3 • 4 years ago
Sven, my only concern about running this job daily is that the timeline forces Dimi or me to review the collection the day the review request comes in. I didn't test the script's behavior when there is an outstanding review request on Remote Settings, so I'm not sure whether Remote Settings will create a duplicate request or update the currently requested review.
Other than that, I don't have any concerns about running this job daily. Let me know if you need any other information from me!
Comment 4 (Assignee) • 4 years ago
Hi folks!
What are the values for FX_REMOTE_SETTINGS_WRITER_SERVER[1] for stage and prod? And are they secrets?
thanks!
[1] https://github.com/mozilla/passwordmgr-remote-settings-updater/blob/main/README.md
Comment 5 • 4 years ago
Hey :cvalaas, FX_REMOTE_SETTINGS_WRITER_SERVER should be "https://settings-writer.prod.mozaws.net/v1" for prod, and "https://settings-writer.stage.mozaws.net/v1" for stage. These are not secrets as they are seen in the Remote Settings documentation.
Let me know if you need more information from me, thanks!
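As a rough sketch of how the two server values could be wired into the "run the image once for stage and once for prod" plan, the Python below builds one `docker run` command per environment. Assumptions: the helper names are hypothetical, the image tag is the one that appears in the Jenkins log later in this thread, and the credentials the script also needs are deliberately left out. The actual job is defined in a Jenkinsfile, not in Python.

```python
from typing import Dict, List

# Writer endpoints as given in the comment above (not secrets).
WRITER_SERVERS: Dict[str, str] = {
    "stage": "https://settings-writer.stage.mozaws.net/v1",
    "prod": "https://settings-writer.prod.mozaws.net/v1",
}

# Image name and tag as seen in the Jenkins log in this thread.
IMAGE = "mozilla/passwordmgr-related-realms-updater:v0.0.1"


def docker_run_command(environment: str, image: str = IMAGE) -> List[str]:
    """Build the argv for one run of the updater against one environment."""
    server = WRITER_SERVERS[environment]  # KeyError for unknown environments
    return [
        "docker",
        "run",
        "--rm",
        "-e",
        f"FX_REMOTE_SETTINGS_WRITER_SERVER={server}",
        image,
    ]


def all_commands() -> List[List[str]]:
    """One command per environment, stage first so problems surface there."""
    return [docker_run_command(env) for env in ("stage", "prod")]
```

Each list could be passed to `subprocess.run(command, check=True)`; keeping the command as a list rather than a single string avoids shell quoting problems.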
Comment 6 • 4 years ago
Hi :cvalaas, do you think this work will be done by the end of the week? I didn't realize that QA was starting their testing next week and I need to have this data in Remote Settings so that they can test Bug 1686071. Just trying to keep my QA contact in the loop!
Thanks for the help!
Comment 7 (Assignee) • 4 years ago
:tgiles possibly!
Unfortunately, I don't know what I don't know (and this is my first time through this process), but seems doable!
Sven mentioned you in the PR (https://github.com/mozilla-services/cloudops-infra/pull/3214) I made for this:
> It looks like the Docker image still has the name passwordmgr-related-realms-updater. It would probably make sense to coordinate with @TGiles to get that renamed to be in line with the project name here.
Are you able to rename the docker image to passwordmgr-remote-settings-updater ?
Comment 8 • 4 years ago
:cvalaas, thanks for the transparency, I appreciate it! :)
I'm not able to see the actual PR unfortunately, so thank you for quoting the relevant information! I don't know why I don't have access to that org, but I probably don't need access in the long run.
I'm looking into renaming the docker image and will keep you posted. I'm surprised the CircleCI configuration doesn't handle this naming; I know I didn't have to specify a name before.
Keeping NI open so I don't forget to follow up
Comment 9 (Assignee) • 4 years ago
I can't see this project (https://github.com/mozilla/passwordmgr-remote-settings-updater) in Circle-CI, so maybe it's controlled in that UI somewhere?
Comment 10 (Assignee) • 4 years ago
Looks like it is set in the Circle CI UI (the DOCKERHUB_REPO variable): https://app.circleci.com/settings/project/github/mozilla/passwordmgr-remote-settings-updater/environment-variables
I can try changing that and see what happens, although I don't know if any other steps are needed (like on the Dockerhub side).
Comment 11 • 4 years ago
I wish I could be more help here, but I'm not too familiar with our Docker and CircleCI setup; I leaned on Sven's help to get that part of the repository set up. Hopefully changing the DOCKERHUB_REPO variable and rebuilding will be all that needs to happen. Fingers crossed, at least, hah.
Comment 12 (Reporter) • 4 years ago
> like on the Dockerhub side
Yes, you'd have to create a new repo with the new name on the dockerhub side and grant the appropriate permissions. So good to learn how to do that, but possibly a bad rabbit hole to go down right now if the initial run of this is time-sensitive.
Comment 13 • 4 years ago
Yeah, the initial run is relatively time-sensitive. I need to make sure I have the data that this job generates by end of day Friday, so I can review and merge it into Remote Settings so QA isn't crunched when testing this next week.
Please let me know if there's anything else I can do to help out!
Comment 14 (Assignee) • 4 years ago
I've set up the Jenkins job and done a few test runs. After solving some issues of my own making, I've hit this error:
[...]
2021-06-22 22:06:37,216 INFO Running: docker pull mozilla/passwordmgr-related-realms-updater:v0.0.1
2021-06-22 22:06:43,092 INFO Running: docker create --name 54f81ab5-59b6-4185-8eac-b236578bb773 mozilla/passwordmgr-related-realms-updater:v0.0.1
2021-06-22 22:06:43,915 INFO Running: docker cp 54f81ab5-59b6-4185-8eac-b236578bb773:/app/version.json - | tar xO
2021-06-22 22:06:43,968 ERROR Error: No such container:path: 54f81ab5-59b6-4185-8eac-b236578bb773:/app/version.json
tar: This does not look like a tar archive
tar: Exiting with failure status due to previous errors
2021-06-22 22:06:43,969 INFO Running: docker rm 54f81ab5-59b6-4185-8eac-b236578bb773
[...]
Obviously it's looking for a version.json file which doesn't exist in the image.
I see that the Circle CI config (https://github.com/mozilla/passwordmgr-remote-settings-updater/blob/507a47e889fe8dd4a356cda3e08aec15774b65ce/.circleci/config.yml#L23) is supposed to create that file, so I'm guessing that maybe the Dockerfile (https://github.com/mozilla/passwordmgr-remote-settings-updater/blob/main/Dockerfile) needs a COPY ./version.json /app/version.json line in it?
Comment 15 • 4 years ago
I think you are right that the file needs to be explicitly copied, and we should also provide a placeholder version.json file inside the repo so the image can be built locally. However, the error message looks like something is trying to extract version.json as a tar archive. I don't understand why this is happening, and it will fail even when the file exists.
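For illustration, a placeholder version.json for local builds could be generated along these lines. This is a sketch under assumptions: the four keys follow the Dockerflow convention commonly used for Mozilla services, but this thread does not confirm which keys the CircleCI step actually writes, and the placeholder values and function names are hypothetical.

```python
import json
from typing import Dict, Optional


def placeholder_version() -> Dict[str, str]:
    """Stand-in values for local builds; CI would overwrite the file."""
    return {
        # Key names assumed from the Dockerflow version.json convention.
        "source": "https://github.com/mozilla/passwordmgr-remote-settings-updater",
        "version": "dev",
        "commit": "unknown",
        "build": "local",
    }


def render_version_json(info: Optional[Dict[str, str]] = None) -> str:
    """Serialize the version info as stable, human-readable JSON."""
    data = info if info is not None else placeholder_version()
    return json.dumps(data, indent=2, sort_keys=True) + "\n"


def write_version_json(path: str = "version.json") -> None:
    """Write the placeholder so that a Dockerfile COPY of the file succeeds."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(render_version_json())
```

Checking a file like this into the repository means a local `docker build` finds something to copy even when CI has not generated the real file.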
Comment 16 • 4 years ago
I've gone ahead and created a quick PR for adding the COPY step and the version.json. I didn't want to push to main without making sure this is what needs to happen. Docker on my Windows machine is acting up, so I can't verify the changes right now; I'm going to get Docker set up on my Mac and see if I can verify some of the changes at least.
Comment 17 (Assignee) • 4 years ago
Just looked at the docker cp help. It extracts the files as a tar archive, so that explains the piping thru tar.
I think we can push the PR to main, do another release, and try Jenkins again.
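The tar behaviour described here can be reproduced without Docker. The sketch below uses Python's standard `tarfile` module: it builds an in-memory tar stream like the one `docker cp <container>:<path> -` writes to stdout, reads the single file back out, and shows that empty input (what the pipe carries when the file is missing) is rejected as not being a tar archive. The helper names are hypothetical.

```python
import io
import tarfile


def make_tar_stream(name: str, payload: bytes) -> bytes:
    """Build an uncompressed tar archive holding one file, in memory."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


def read_single_file(stream: bytes) -> bytes:
    """Return the contents of the first member, similar to `tar xO`.

    Raises tarfile.ReadError when the input is not a tar archive.
    """
    with tarfile.open(fileobj=io.BytesIO(stream), mode="r:") as archive:
        member = archive.next()
        if member is None:
            raise ValueError("archive contains no members")
        extracted = archive.extractfile(member)
        if extracted is None:
            raise ValueError("first member is not a regular file")
        return extracted.read()
```

Seen this way, the "This does not look like a tar archive" message in the earlier log is a follow-on symptom of the missing file rather than a separate problem.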
Comment 18 (Assignee) • 4 years ago
Jenkins was able to get the image running. The command (node /app/update-script.js) ran for about 10 minutes before exiting (this was running against stage). No output was seen.
Part of the Jenkins logs:
[...]
$ docker top 27960c46c463e76e23276b85f244489f45623d47905b6fb9677544ac69e37e6c -eo pid,comm
ERROR: The container started but didn't run the expected command. Please double check your ENTRYPOINT does execute the command passed as docker run argument, as required by official docker images (see https://github.com/docker-library/official-images#consistency for entrypoint consistency requirements).
Alternatively you can force image entrypoint to be disabled by adding option --entrypoint=''.
[...]
[Pipeline] sh
- node /app/update-script.js
wrapper script does not seem to be touching the log file in /home/jenkins/slave/workspace/pipelines/utils/passwordmgr-remote-settings-updater@tmp/durable-37614c23
(JENKINS-48300: if on an extremely laggy filesystem, consider -Dorg.jenkinsci.plugins.durabletask.BourneShellScript.HEARTBEAT_CHECK_INTERVAL=86400)
$ docker stop --time=1 27960c46c463e76e23276b85f244489f45623d47905b6fb9677544ac69e37e6c
[...]
Should there be output? How long should the script take?
Comment 19 • 4 years ago
Strange. Thanks for the update. Yeah, there should be console.log output during the process, and it shouldn't take 10 minutes to run; maybe a minute or so. I'm debugging the script now and will keep you posted!
Comment 20 (Assignee) • 4 years ago
Found out Jenkins doesn't like ENTRYPOINTs. So after some quick changes we got this working.
It's currently set to run on the 1st and the 15th of each month at 17:00 (UTC, I'd presume).
I think this can be closed?
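As a small illustration of what a "1st and 15th at 17:00" schedule means in practice, the Python sketch below lists upcoming run times. It assumes the times are UTC (the comment above only presumes this) and that the Jenkins trigger is equivalent to a cron expression such as `0 17 1,15 * *`; the actual trigger definition is not shown in this thread, and `next_runs` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Sequence


def next_runs(
    after: datetime,
    count: int,
    days: Sequence[int] = (1, 15),
    hour: int = 17,
) -> List[datetime]:
    """Return the next `count` scheduled run times strictly after `after`.

    `after` must be timezone-aware; results are expressed in UTC.
    """
    runs: List[datetime] = []
    # Start at the scheduled hour on the same UTC day, then walk forward daily.
    candidate = after.astimezone(timezone.utc).replace(
        hour=hour, minute=0, second=0, microsecond=0
    )
    while len(runs) < count:
        if candidate.day in days and candidate > after:
            runs.append(candidate)
        candidate += timedelta(days=1)
    return runs
```

The gap between runs is 14 days within a month but 14 to 17 days across a month boundary, so this schedule is roughly, not strictly, biweekly.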
Comment 21 • 4 years ago
Sounds good to me. Thanks for all the help Chris!
Comment 22 • 3 years ago
Hey :cvalaas, are you able to see when this job runs? I haven't seen any updates from the "passwordmgr-related-realms-updater" account since we initially resolved this. I'm not sure if there's an issue in the update script or an environment issue or what, but I think a good first step is being able to determine if the Jenkins job is running as expected.
Comment 23 (Assignee) • 3 years ago
Hello, I'm out for a month or so(?) probably, so I'm CC'ing :thealy to get this assigned to someone else.
Comment 24 • 3 years ago
I'll probably be taking ownership of the job soon, so I took a quick look. The job is running every two weeks, but it's failing every time. It should notify cvalaas in Slack when it fails, but it looks like the Jenkins/Slack integration is broken as well.
The reason for the failure appears to be the branch configuration in Jenkins. The job was configured to use the branch /refs/heads/master of cloudops-infra. I changed the branch name to refs/heads/master, without the leading slash, and now it seems to be working fine.
I'll try to get the Slack notifications fixed as well.
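A guard against this kind of misconfiguration could look like the Python sketch below, which normalizes a branch setting before it is used. This is a hypothetical helper for illustration only; the fix described above was made by hand in the Jenkins job configuration.

```python
def normalize_ref(ref: str) -> str:
    """Return a git ref in the `refs/heads/<branch>` form.

    Strips surrounding whitespace and leading slashes, and expands a bare
    branch name such as `master` to a full ref.
    """
    cleaned = ref.strip().lstrip("/")
    if not cleaned:
        raise ValueError("branch or ref must not be empty")
    if not cleaned.startswith("refs/"):
        cleaned = f"refs/heads/{cleaned}"
    return cleaned
```

For example, `normalize_ref("/refs/heads/master")` yields `refs/heads/master`, the value that made the scheduled runs work again.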
Comment 25 • 3 years ago
Tom, I think this is fixed for the time being. I'll be in touch about taking ownership of the Jenkins job.
Comment 26 • 3 years ago
The job seems to be working fine from my end, just to confirm. I received the review requests from the automated "passwordmgr-related-realms-updater" account; I guess we'll see in two weeks whether the job is green. Thanks all for the help!
Comment 27 • 3 years ago
For what it's worth, here is a history of the Jenkins runs of the job. Runs #8 and #9 were successful, but they were manually triggered and did not run from the master branch of the cloudops-infra repo. All scheduled runs after that failed because of the branch misconfiguration. Today, the manually triggered runs succeeded (I accidentally triggered the job twice).