./d0cker build-hgmo fails with cert error for s3-us-west-2.amazonaws.com



Developer Services
2 years ago


(Reporter: Pike, Unassigned)




(1 attachment)



2 years ago
Created attachment 8767156 [details]
more of the log

I'm trying to create the test environment for version-control-tools, and I've gotten as far as
./d0cker build-hgmo
I'm only interested in hgmo, so I'm not running build-all; the error looks unrelated to that anyway. Snippet below, fuller log in the attachment.

ERROR hgmaster> TASK: [kafka-broker | install system packages] ******************************** 
ERROR hgmaster> changed: [localhost] => (item=java-1.8.0-openjdk-headless,tar)
ERROR hgmaster> TASK: [kafka-broker | download ZooKeeper and Kafka] *************************** 
ERROR hgmaster> failed: [localhost] => (item={'path': 'zookeeper-3.4.6.tar.gz', 'sha256': '01b3938547cd620dc4c93efe07c0360411f4a66962a70500b163b59014046994'}) => {"failed": true, "item": {"path": "zookeeper-3.4.6.tar.gz", "sha256": "01b3938547cd620dc4c93efe07c0360411f4a66962a70500b163b59014046994"}}
ERROR hgmaster> msg: failed to create temporary content file: ('The read operation timed out',)
ERROR hgmaster> failed: [localhost] => (item={'path': 'kafka_2.10-', 'sha256': '3ba1967ee88c7f364964c8a8fdf6f5075dcf7572f8c9eb74f0285b308363ecab'}) => {"failed": true, "item": {"path": "kafka_2.10-", "sha256": "3ba1967ee88c7f364964c8a8fdf6f5075dcf7572f8c9eb74f0285b308363ecab"}}
ERROR hgmaster> msg: Failed to validate the SSL certificate for s3-us-west-2.amazonaws.com:443. Make sure your managed systems have a valid CA certificate installed.  If the website serving the url uses SNI you need python >= 2.7.9 on your managed machine.  You can use validate_certs=False if you do not need to confirm the server\s identity but this is unsafe and not recommended Paths checked for this platform: /etc/ssl/certs, /etc/pki/ca-trust/extracted/pem, /etc/pki/tls/certs, /usr/share/ca-certificates/cacert.org, /etc/ansible
ERROR hgmaster> FATAL: all hosts have already failed -- aborting
ERROR hgmaster> PLAY RECAP ******************************************************************** 
ERROR hgmaster>            to retry, use: --limit @/root/docker-hgmaster.retry
ERROR hgmaster> localhost                  : ok=19   changed=9    unreachable=0    failed=1

Comment 1

2 years ago
Is there a command to just build hgweb, for example?

'cause I can just screw over security via

diff --git a/ansible/roles/kafka-broker/tasks/main.yml b/ansible/roles/kafka-broker/tasks/main.yml
--- a/ansible/roles/kafka-broker/tasks/main.yml
+++ b/ansible/roles/kafka-broker/tasks/main.yml
@@ -13,6 +13,7 @@
   get_url: url=https://s3-us-west-2.amazonaws.com/moz-packages/{{ item.path }}
            dest=/var/tmp/{{ item.path }}
            sha256sum={{ item.sha256 }}
+           validate_certs=no
   with_items:
     - { path: zookeeper-3.4.6.tar.gz, sha256: 01b3938547cd620dc4c93efe07c0360411f4a66962a70500b163b59014046994 }
     - { path: kafka_2.10-, sha256: 3ba1967ee88c7f364964c8a8fdf6f5075dcf7572f8c9eb74f0285b308363ecab }

but then my network connection doesn't survive Kafka being downloaded in parallel, so every time I try, one of the builds dies, and none of the ones that succeeded are kept.

And then, rinse, repeat, curse life.

Comment 2

2 years ago
I'd say the ca-certificates package on your machine (assuming it is Linux) is out of date.

Disabling cert verification isn't the worst thing in the world since we do SHA-256 verification. I'd accept that patch.
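
For anyone who wants to double-check a download by hand after turning off cert validation, here is a minimal sketch of the same kind of SHA-256 check. The function names `sha256_of` and `verify_sha256` are made up for illustration and are not part of version-control-tools; the path and digest in the usage comment come from the task shown in the report.

```shell
#!/bin/sh
# Print the SHA-256 hex digest of a file, using whichever tool is available.
sha256_of() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | awk '{print $1}'
    else
        # macOS ships shasum rather than sha256sum
        shasum -a 256 "$1" | awk '{print $1}'
    fi
}

# Succeed only if the file's digest matches the expected hex string.
verify_sha256() {
    [ "$(sha256_of "$1")" = "$2" ]
}

# Usage:
# verify_sha256 /var/tmp/zookeeper-3.4.6.tar.gz \
#     01b3938547cd620dc4c93efe07c0360411f4a66962a70500b163b59014046994 \
#     && echo "digest OK" || echo "digest MISMATCH"
```

A matching digest means the bytes are the expected ones regardless of who served them, which is why skipping cert validation is tolerable here.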

Comment 3

2 years ago
I'm on a Mac, so it's hard for me to figure out at which point the cert actually fails. Many succeed, I think. I'm not sure which layer of the stack each of the references to S3 comes from. The boot2docker VM would be one candidate.

Comment 4

2 years ago
This failure is occurring inside a CentOS container. That container likely has an out-of-date ca-certificates package. Unfortunately, we don't run `yum update` every time we build Docker containers. So if the base image is old, it could have out-of-date packages.

Try nuking the ansible-centos6 and ansible-centos7 docker images and anything derived from them (notably hgweb, hgmaster, and hgrb) and try building again. ansible-centos* containers do run `yum update` when built. Since that's used as a base image for the other containers, you should be set.
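
A rough sketch of that cleanup, assuming the images show up in `docker images` under exactly the repository names mentioned above (that is a guess; check the actual listing and adjust `STALE_IMAGES`). The helper names `filter_stale` and `nuke_stale_images` are hypothetical.

```shell
#!/bin/sh
# Assumed repository names; adjust to what `docker images` actually shows.
STALE_IMAGES='ansible-centos6 ansible-centos7 hgweb hgmaster hgrb'

# Read "repository:tag" lines on stdin and print the ones whose
# repository name is listed in STALE_IMAGES.
filter_stale() {
    while IFS= read -r ref; do
        repo=${ref%:*}    # strip the trailing ":tag"
        for name in $STALE_IMAGES; do
            if [ "$repo" = "$name" ]; then
                printf '%s\n' "$ref"
            fi
        done
    done
}

# Remove every matching image. Not called automatically; run it by hand.
nuke_stale_images() {
    docker images --format '{{.Repository}}:{{.Tag}}' | filter_stale |
    while IFS= read -r ref; do
        docker rmi "$ref"
    done
}
```

After `nuke_stale_images`, rerun `./d0cker build-hgmo`. `docker rmi` may refuse to remove an image that still has containers or derived images, so stopped containers may need removing first and a second pass may be needed.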