Closed Bug 1448383 Opened 7 years ago Closed 7 years ago

[MDC2] Provision all linux moonshot nodes in MDC2 chassis

Categories

(Infrastructure & Operations :: RelOps: General, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dividehex, Assigned: dhouse)

Attachments

(2 files, 1 obsolete file)

We will need to mass pxeboot all 100 moonshot nodes on the MDC2 moonshot chassis. This work is dependent on bug 1444469 being completed. We cannot PXE boot nodes until then. Moonshot node status is tracked in this spreadsheet: https://docs.google.com/spreadsheets/d/1IPTmppvqDw0PQV-O1LgXLJg_7TC-H_IAAnSxcur8c7I/edit#gid=1941875751
:dhouse, I'm assigning this to you since I did it for MDC1 last time. You should do this work in MDC2 for the sake of experience. Also note that chassis 14 should not actually go into production, since we are going to let netops use it to test switch configs. Therefore, you should skip those nodes entirely; otherwise they will be placed into production immediately.
Assignee: jwatkins → dhouse
To check if some part of bug 1444469 is complete (and if we need other things changed), I tried a pxeboot of mdc2 chassis 8 cartridge 1: the PXE boot times out when trying to reach out over ipv4 and ipv6.
Depends on: 1451090
I've successfully run the network boot and ubuntu installer on nodes on every mdc2 moonshot chassis. However, they are all sticking on the preseed screen. PUPPET_PASS is not getting carried through to the preseed_to_puppet.sh script, as logged in /var/log/syslog:
```
+ for word in '$(</proc/cmdline)'
+ case $word in
+ for word in '$(</proc/cmdline)'
+ case $word in
+ for word in '$(</proc/cmdline)'
+ case $word in
+ for word in '$(</proc/cmdline)'
+ case $word in
+ for word in '$(</proc/cmdline)'
+ case $word in
+ for word in '$(</proc/cmdline)'
+ case $word in
+ '[' -z '' ']'
+ echo 'PUPPET_PASS was not set; aborting setup.'
```
The preseed log (/var/lib/preseed/log) shows the last executed step as the last of the preseed script's steps. So it is completing the preseed:
```
[...]
d-i finish-install/reboot_in_progress note
d-i preseed/late_command string sed -i 's/PermitRootLogin prohibit-password/PermitRootLogin yes/g' /target/etc/ssh/sshd_config; wget http://repos/repos/kickstart/preseed_to_puppet.sh -O /target/tmp/preseed_to_puppet.sh; chmod 755 /target/tmp/preseed_to_puppet.sh; in-target /tmp/preseed_to_puppet.sh
```
The file /tmp/preseed_to_puppet.sh exists and contains the script pulled from the repos url.
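The syslog trace above is consistent with a loop that splits the kernel command line into words and looks for a `PUPPET_PASS=` word. A minimal sketch of that pattern follows. It is not the actual preseed_to_puppet.sh; the function name `get_cmdline_param` and the optional file argument (so it can be tried against a sample file instead of /proc/cmdline) are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: illustrates the word-by-word scan seen in the trace,
# not the real preseed_to_puppet.sh.
# get_cmdline_param NAME [FILE]
#   Prints the value of NAME=value found in FILE (default /proc/cmdline).
#   Returns 1 if NAME is not present.
get_cmdline_param() {
    _name=$1
    _file=${2:-/proc/cmdline}
    # The kernel command line is a single line; read it whole.
    read -r _line <"$_file" || [ -n "$_line" ] || return 1
    set -f    # no pathname expansion while splitting into words
    for _word in $_line; do
        case $_word in
            "$_name"=*)
                set +f
                # Strip everything up to and including the first '='.
                printf '%s\n' "${_word#*=}"
                return 0
                ;;
        esac
    done
    set +f
    return 1
}
```

Running `get_cmdline_param PUPPET_PASS || echo 'PUPPET_PASS was not set'` on an installed node would show whether the parameter reached the kernel command line at all, which separates a boot-entry problem from a script problem.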
The command line was too long. This works:
```
setparams 'Install Ubuntu 16.04.0 LTS x86_64 on moonshot'
linuxefi images/Ubuntu-16.04.0-x86_64-server/linux ro auto=true url=http:/\
/repos/repos/kickstart/ubuntu_16.04_x64_moonshot.preseed priority=critical int\
erface=auto PUPPET_PASS=password initrdefi images/Ubuntu-16.04.0-x86_64-server\
/initrd.gz
```
It failed when I put PUPPET_PASS=password at the end.
The preseed_to_puppet.sh script is failing on connection to the proxy to pull the puppetize.sh from hg.m.o:
```
[...]
failed: Connection timed out. Retrying
--2018-04-13 13:26:04-- (try: 8) http://hg.mozilla.org/build/puppet/raw-file/default/modules/puppet/files/puppetize.sh
Connecting to proxy.dmz.mdc1.mozilla.com (proxy.dmz.mdc1.mozilla.com|10.48.74.17|:3128... failed: Connection timed out.
```
Blocks: 1454256
When trying the wget for the puppetize.sh (matching the commands in preseed_to_puppet.sh):
1. the mdc1 proxy (proxy.dmz.mdc1.mozilla.com) fails
2. the mdc2 proxy works
3. wget complains about the certificate, "cannot verify hg.mozilla.org's certificate [...] Unable to locally verify [...]": I needed to use wget's --no-check-certificate
I'm using this for the mdc2 moonshots. There may be a better way to use the local datacenter proxy, but this seems simplest for mdc2.
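The approach described here (try the mdc2 proxy first, with fewer retries than wget's default of 20) could be sketched as below. This is an illustration, not the attached patch: the function name `fetch_via_proxies`, the `FETCH` override, and the mdc2 proxy hostname are assumptions (the mdc2 name is modeled on the mdc1 name in the wget log; port 3128 comes from that log). The sketch deliberately does not pass `--no-check-certificate`, in line with the later requirement in this bug to use https and verify the certificate.

```shell
#!/bin/sh
# Sketch only: proxy fallback with a small retry count.
# Proxies are tried in order; the mdc2 hostname is an assumption.
PROXIES="proxy.dmz.mdc2.mozilla.com:3128 proxy.dmz.mdc1.mozilla.com:3128"

# fetch_via_proxies URL DEST
#   Tries each proxy in turn and stops at the first successful download.
#   FETCH may name a different command or function (used for testing);
#   it defaults to wget.
fetch_via_proxies() {
    _url=$1
    _dest=$2
    for _proxy in $PROXIES; do
        # --tries and --timeout keep a dead proxy from stalling the install;
        # -e passes wgetrc-style settings for this invocation only.
        if "${FETCH:-wget}" --tries=2 --timeout=30 \
            -e use_proxy=yes \
            -e "http_proxy=http://$_proxy" \
            -e "https_proxy=http://$_proxy" \
            -O "$_dest" "$_url"; then
            echo "fetched via $_proxy"
            return 0
        fi
        echo "proxy $_proxy failed, trying next" >&2
    done
    return 1
}
```

A call such as `fetch_via_proxies https://hg.mozilla.org/build/puppet/raw-file/default/modules/puppet/files/puppetize.sh /tmp/puppetize.sh` would then fall through to the next proxy after two failed attempts instead of twenty.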
Attachment #8968331 - Flags: review?(jwatkins)
Comment on attachment 8968331
mdc2 proxy first for preseed. fewer retries (wget default is 20)

Review of attachment 8968331:
-----------------------------------------------------------------

I think this is good enough. It would be more optimal to manage this with puppet and to be able to set the proxy by the datacenter location of the puppet master. And the need for --no-check-certificate indicates it is being redirected to https, which we should probably use to begin with. Either way we still would need the --no-check-certificate.

So r+ and shipit
Attachment #8968331 - Flags: review?(jwatkins) → review+
Attachment #8968344 - Flags: review?(jwatkins)
Attachment #8968344 - Flags: review?(dcrisan)
Attachment #8968344 - Flags: review?(jwatkins) → review+
Attachment #8968344 - Flags: review?(dcrisan) → review+
(In reply to Jake Watkins [:dividehex] from comment #8)
> Comment on attachment 8968331
> mdc2 proxy first for preseed. fewer retries (wget default is 20)
>
> Review of attachment 8968331:
> -----------------------------------------------------------------
>
> I think this is good enough. It would be more optimal to manage this with
> puppet and to be able to set the proxy by the datacenter location of the
> puppet master. And the need for --no-check-certificate indicates it is
> being redirected to https which we should probably use to begin with.
> Either way we still would need the --no-check-certificate.
>
> So r+ and shipit

To be clear: you MUST use https:// when connecting to hg.mozilla.org and you MUST verify the certificate.
(In reply to Kendall Libby [:fubar] from comment #11)
> (In reply to Jake Watkins [:dividehex] from comment #8)
> > Comment on attachment 8968331
> > mdc2 proxy first for preseed. fewer retries (wget default is 20)
> >
> > Review of attachment 8968331:
> > -----------------------------------------------------------------
> >
> > I think this is good enough. It would be more optimal to manage this with
> > puppet and to be able to set the proxy by the datacenter location of the
> > puppet master. And the need for --no-check-certificate indicates it is
> > being redirected to https which we should probably use to begin with.
> > Either way we still would need the --no-check-certificate.
> >
> > So r+ and shipit
>
> To be clear: you MUST use https:// when connecting to hg.mozilla.org and you
> MUST verify the certificate.

I'm not sure why this wasn't using https yet. I'll patch it.
I tested this manually from t-linux64-ms-302 and the mdc2 proxy works with https (same port).
Attachment #8968331 - Attachment is obsolete: true
Attachment #8969274 - Flags: review?(jwatkins)
Attachment #8969274 - Flags: review?(dcrisan)
Blocks: 1455301
Comment on attachment 8969274
use https for wget of preseed_to_puppet.sh

lgtm
Attachment #8969274 - Flags: review?(dcrisan) → review+
Attachment #8969274 - Flags: review?(jwatkins) → review+
Pushed by dhouse@mozilla.com:
https://hg.mozilla.org/build/puppet/rev/3806b61e0fdd
use https for wget preseed_to_puppet. r=dividehex
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
I've kickstarted the nodes on moon-chassis-{10..11} and I'm starting the ones on 12..14 and verifying that they are taking jobs.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I confirmed that the new nodes are found in the taskcluster ui as taking jobs:
```
[david@george releng]$ UP=0; DOWN=0; for c in {8..14}; do
    nstart=$[(c-1)*45-15]; echo $c" "$nstart
    for i in {1..15}; do
        I=$[nstart+i]
        if ! (( $c % 7 )) && (( $i > 9 )); then break; fi
        wget -q -O - https://queue.taskcluster.net/v1/provisioners/releng-hardware/worker-types/gecko-t-linux-talos/workers/mdc2/t-linux64-ms-${I} >/dev/null && UP=$[UP+1] || DOWN=$[DOWN+1]
    done
done >> state.$(date +"%H:%M:%S").log; echo "$DOWN down, $UP up"
5 down, 94 up
```
I'm checking on the 5 down; I think these are loaner/test machines from before.
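The hostname arithmetic in that check can be pulled out on its own to see which nodes it covers. The sketch below reproduces only the numbering (no network calls); the function name `chassis_hosts` is made up for illustration, and the offsets are taken directly from the command above.

```shell
#!/bin/sh
# Sketch only: lists the hostnames the check above queries for one chassis.
# chassis_hosts C
#   Each chassis occupies a block of 45 host numbers; the Linux nodes are
#   the 15 numbers starting at (C-1)*45-15+1, as in the nstart expression.
chassis_hosts() {
    _c=$1
    _start=$(( (_c - 1) * 45 - 15 ))
    _count=15
    # The check stops after 9 nodes on chassis numbers divisible by 7
    # (chassis 14 in the 8..14 range).
    if [ $(( _c % 7 )) -eq 0 ]; then _count=9; fi
    _i=1
    while [ "$_i" -le "$_count" ]; do
        printf 't-linux64-ms-%d\n' $(( _start + _i ))
        _i=$(( _i + 1 ))
    done
}
```

For chassis 8 this yields t-linux64-ms-301 through t-linux64-ms-315, and across chassis 8 to 14 it yields 99 hostnames in total, which matches the 5 down plus 94 up reported above.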
Status: REOPENED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED