Closed
Bug 1448383
Opened 7 years ago
Closed 7 years ago
[MDC2] Provision all linux moonshot nodes in MDC2 chassis
Categories
(Infrastructure & Operations :: RelOps: General, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: dividehex, Assigned: dhouse)
Attachments
(2 files, 1 obsolete file)
521 bytes, patch | dividehex: review+, dragrom: review+, dhouse: checked-in+
1.21 KB, patch | dragrom: review+, dividehex: review+, dhouse: checked-in+
We will need to mass PXE boot all 100 moonshot nodes in the MDC2 moonshot chassis. This work depends on bug 1444469 being completed; we cannot PXE boot nodes until then.
Moonshot node status is tracked in this spreadsheet: https://docs.google.com/spreadsheets/d/1IPTmppvqDw0PQV-O1LgXLJg_7TC-H_IAAnSxcur8c7I/edit#gid=1941875751
Comment 1•7 years ago (Reporter)
:dhouse, I'm assigning this to you since I did it for MDC1 last time. You should do this work in MDC2 for the sake of experience.
Also note that chassis 14 should not actually go into production, since we are going to let netops use it to test switch configs. Therefore, you should skip those nodes entirely; otherwise they will be placed into production immediately.
Assignee: jwatkins → dhouse
To check if some part of bug 1444469 is complete (and if we need other things changed), I tried a pxeboot of mdc2 chassis 8 cartridge 1: the PXE boot times out when trying to reach out over ipv4 and ipv6.
I've successfully run the network boot and Ubuntu installer on nodes on every mdc2 moonshot chassis. However, they are all getting stuck on the preseed screen. PUPPET_PASS is not being carried through to the preseed_to_puppet.sh script, as logged in /var/log/syslog:
```
+ for word in '$(</proc/cmdline)'
+ case $word in
+ for word in '$(</proc/cmdline)'
+ case $word in
+ for word in '$(</proc/cmdline)'
+ case $word in
+ for word in '$(</proc/cmdline)'
+ case $word in
+ for word in '$(</proc/cmdline)'
+ case $word in
+ for word in '$(</proc/cmdline)'
+ case $word in
+ '[' -z '' ']'
+ echo 'PUPPET_PASS was not set; aborting setup.'
```
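For reference, the trace above looks like a loop over the words of /proc/cmdline that picks out a `PUPPET_PASS=...` argument. A minimal sketch of that pattern follows. It is reconstructed from the trace only, not copied from preseed_to_puppet.sh, and the function name `extract_puppet_pass` is hypothetical:

```shell
# Hypothetical helper, reconstructed from the syslog trace above.
# It is NOT the real preseed_to_puppet.sh code.
# Takes a kernel command line as one string and prints the value of
# the last PUPPET_PASS=... word, or nothing if no such word exists.
extract_puppet_pass() {
    cmdline="$1"
    pass=""
    # Deliberately unquoted so the string is split into words on whitespace.
    for word in $cmdline; do
        case $word in
            PUPPET_PASS=*) pass="${word#PUPPET_PASS=}" ;;
        esac
    done
    printf '%s' "$pass"
}
```

On a booted node this could be called as `extract_puppet_pass "$(cat /proc/cmdline)"`. An empty result corresponds to the `'[' -z '' ']'` branch in the log, where setup aborts.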
The preseed log (/var/lib/preseed/log) shows that the last step executed was the final step of the preseed script, so the preseed itself is completing:
```
[...]
d-i finish-install/reboot_in_progress note
d-i preseed/late_command string sed -i 's/PermitRootLogin prohibit-password/PermitRootLogin yes/g' /target/etc/ssh/sshd_config; wget http://repos/repos/kickstart/preseed_to_puppet.sh -O /target/tmp/preseed_to_puppet.sh; chmod 755 /target/tmp/preseed_to_puppet.sh; in-target /tmp/preseed_to_puppet.sh
```
The file /tmp/preseed_to_puppet.sh exists and contains the script pulled from the repos url.
The command line was too long.
This works:
```
setparams 'Install Ubuntu 16.04.0 LTS x86_64 on moonshot'
linuxefi images/Ubuntu-16.04.0-x86_64-server/linux ro auto=true url=http://repos/repos/kickstart/ubuntu_16.04_x64_moonshot.preseed priority=critical interface=auto PUPPET_PASS=password
initrdefi images/Ubuntu-16.04.0-x86_64-server/initrd.gz
```
It failed when I put PUPPET_PASS=password at the end.
The preseed_to_puppet.sh script is failing to connect to the proxy when pulling puppetize.sh from hg.m.o:
```
[...]
failed: Connection timed out.
Retrying
--2018-04-13 13:26:04-- (try: 8) http://hg.mozilla.org/build/puppet/raw-file/default/modules/puppet/files/puppetize.sh
Connecting to proxy.dmz.mdc1.mozilla.com (proxy.dmz.mdc1.mozilla.com|10.48.74.17|:3128...
failed: Connection timed out.
```
When trying the wget for puppetize.sh (matching the commands in preseed_to_puppet.sh):
1. the mdc1 proxy fails (connection timed out)
2. the mdc2 proxy works
3. wget complains about the certificate, "cannot verify hg.mozilla.org's certificate [...] Unable to locally verify [...]", so I needed to use wget's --no-check-certificate
I'm using this for the mdc2 moonshots. There may be a better way to use the local datacenter proxy, but this seems simplest for mdc2.
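A rough sketch of the "try the local proxy first, with fewer retries" idea, for illustration only. This is not the attached patch. The function names, the `FETCH` override, and the retry and timeout values are assumptions. The sketch fetches over https and leaves certificate verification on, so it does not use --no-check-certificate:

```shell
# Hypothetical sketch, not the attached patch.
# Download $2 to $3 through the proxy given in $1, using a small
# retry count instead of wget's default of 20.
fetch_via_proxy() {
    https_proxy="$1" http_proxy="$1" \
        wget --tries=3 --timeout=30 -q -O "$3" "$2"
}

# Try each proxy in the order given and stop at the first success.
# Prints the proxy that worked; returns non-zero if all of them fail.
# FETCH may name a replacement download command (useful for testing).
fetch_with_fallback() {
    url="$1"
    dest="$2"
    shift 2
    for proxy in "$@"; do
        if "${FETCH:-fetch_via_proxy}" "$proxy" "$url" "$dest"; then
            echo "$proxy"
            return 0
        fi
    done
    return 1
}
```

Usage would be along the lines of `fetch_with_fallback https://hg.mozilla.org/build/puppet/raw-file/default/modules/puppet/files/puppetize.sh /tmp/puppetize.sh "$MDC2_PROXY" "$MDC1_PROXY"`, where both variables are placeholders for proxy URLs such as `http://proxy.dmz.mdc1.mozilla.com:3128`. The mdc2 proxy hostname is not given in this bug, so it is left as a variable here.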
Attachment #8968331 - Flags: review?(jwatkins)
Comment 8•7 years ago (Reporter)
Comment on attachment 8968331 [details] [diff] [review]
mdc2 proxy first for preseed. fewer retries (wget default is 20)
Review of attachment 8968331 [details] [diff] [review]:
-----------------------------------------------------------------
I think this is good enough. It would be better to manage this with puppet and to set the proxy based on the datacenter location of the puppet master. Also, the need for --no-check-certificate indicates the request is being redirected to https, which we should probably use to begin with. Either way, we would still need --no-check-certificate.
So r+ and shipit
Attachment #8968331 - Flags: review?(jwatkins) → review+
Attachment #8968344 - Flags: review?(jwatkins)
Attachment #8968344 - Flags: review?(dcrisan)
Updated•7 years ago (Reporter)
Attachment #8968344 - Flags: review?(jwatkins) → review+
Comment 10•7 years ago (Assignee)
Comment on attachment 8968344 [details] [diff] [review]
add mdc2 moonshots to node configs
remote: https://hg.mozilla.org/build/puppet/rev/ffb46f7b80adccf468ec2943b49c89b755423860
Travis passed. Pushed to production:
remote: https://hg.mozilla.org/build/puppet/rev/7ebbb03206499f05b6ba28c7683a57d33ef4f621
Attachment #8968344 - Flags: checked-in+
Updated•7 years ago
Attachment #8968344 - Flags: review?(dcrisan) → review+
Comment 11•7 years ago
(In reply to Jake Watkins [:dividehex] from comment #8)
> Comment on attachment 8968331 [details] [diff] [review]
> mdc2 proxy first for preseed. fewer retries (wget default is 20)
>
> Review of attachment 8968331 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> I think this is good enough. It would be more optimal to manage this with
> puppet and to be able to set the proxy be the datacenter location of the
> puppet master. And the need for --no-check-certificate indicates it is
> being redirected to https which we should probably use to begin with.
> Either way we still would need the --no-check-certificate.
>
> So r+ and shipit
To be clear: you MUST use https:// when connecting to hg.mozilla.org and you MUST verify the certificate.
Comment 12•7 years ago (Assignee)
(In reply to Kendall Libby [:fubar] from comment #11)
> (In reply to Jake Watkins [:dividehex] from comment #8)
> > Comment on attachment 8968331 [details] [diff] [review]
> > mdc2 proxy first for preseed. fewer retries (wget default is 20)
> >
> > Review of attachment 8968331 [details] [diff] [review]:
> > -----------------------------------------------------------------
> >
> > I think this is good enough. It would be more optimal to manage this with
> > puppet and to be able to set the proxy be the datacenter location of the
> > puppet master. And the need for --no-check-certificate indicates it is
> > being redirected to https which we should probably use to begin with.
> > Either way we still would need the --no-check-certificate.
> >
> > So r+ and shipit
>
> To be clear: you MUST use https:// when connecting to hg.mozilla.org and you
> MUST verify the certificate.
I'm not sure why this wasn't using https yet. I'll patch it.
Comment 13•7 years ago (Assignee)
I tested this manually from t-linux64-ms-302 and the mdc2 proxy works with https (same port).
Attachment #8968331 - Attachment is obsolete: true
Attachment #8969274 - Flags: review?(jwatkins)
Attachment #8969274 - Flags: review?(dcrisan)
Comment 14•7 years ago
Comment on attachment 8969274 [details] [diff] [review]
use https for wget of preseed_to_puppet.sh
lgtm
Attachment #8969274 - Flags: review?(dcrisan) → review+
Updated•7 years ago (Reporter)
Attachment #8969274 - Flags: review?(jwatkins) → review+
Comment 15•7 years ago
Pushed by dhouse@mozilla.com:
https://hg.mozilla.org/build/puppet/rev/3806b61e0fdd
use https for wget preseed_to_puppet. r=dividehex
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Comment 16•7 years ago (Assignee)
Comment on attachment 8969274 [details] [diff] [review]
use https for wget of preseed_to_puppet.sh
remote: https://hg.mozilla.org/build/puppet/rev/3806b61e0fdd1cbda99584bcec2bfa735e81523d
remote: https://hg.mozilla.org/build/puppet/rev/07bf2b3efe1e761536ff913cffd7babcec2e2104
Attachment #8969274 - Flags: checked-in+
Comment 17•7 years ago (Assignee)
I've kickstarted the nodes on moon-chassis-{10..11} and I'm starting the ones on 12..14 and verifying that they are taking jobs.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 18•7 years ago (Assignee)
I confirmed that the new nodes are found in the taskcluster ui as taking jobs:
```
[david@george releng]$ UP=0; DOWN=0; for c in {8..14}; do nstart=$[(c-1)*45-15]; echo $c" "$nstart; for i in {1..15}; do I=$[nstart+i]; if ! (( $c % 7 )) && (( $i > 9 )); then break; fi; wget -q -O - https://queue.taskcluster.net/v1/provisioners/releng-hardware/worker-types/gecko-t-linux-talos/workers/mdc2/t-linux64-ms-${I} >/dev/null && UP=$[UP+1] || DOWN=$[DOWN+1]; done;done >> state.$(date +"%H:%M:%S").log; echo "$DOWN down, $UP up"
5 down, 94 up
```
I'm checking on the 5 down; I think these are loaner/test machines from before.
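For anyone re-running the check, here is a more readable sketch of the same one-liner, split into two functions. The function names are made up for illustration; the node numbering and the queue URL are taken from the command above. As in the original, chassis numbers divisible by 7 (only chassis 14 in this range) are checked for 9 nodes instead of 15, which gives 99 nodes in total.

```shell
# Print the t-linux64-ms-NNN numbers covered by the one-liner above.
# Function names are hypothetical; the arithmetic mirrors the original.
mdc2_node_numbers() {
    c=8
    while [ "$c" -le 14 ]; do
        nstart=$(( (c - 1) * 45 - 15 ))
        # The original loop breaks after node 9 when c is divisible by 7.
        if [ $(( c % 7 )) -eq 0 ]; then count=9; else count=15; fi
        i=1
        while [ "$i" -le "$count" ]; do
            echo $(( nstart + i ))
            i=$(( i + 1 ))
        done
        c=$(( c + 1 ))
    done
}

# Query the taskcluster queue for each node and count successes and
# failures. This needs network access, so it is defined but not run here.
check_mdc2_workers() {
    up=0
    down=0
    base="https://queue.taskcluster.net/v1/provisioners/releng-hardware/worker-types/gecko-t-linux-talos/workers/mdc2"
    for n in $(mdc2_node_numbers); do
        if wget -q -O /dev/null "${base}/t-linux64-ms-${n}"; then
            up=$(( up + 1 ))
        else
            down=$(( down + 1 ))
        fi
    done
    echo "$down down, $up up"
}
```

Running `check_mdc2_workers` prints a summary in the same "N down, M up" form as the output above.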
Status: REOPENED → RESOLVED
Closed: 7 years ago → 7 years ago
Resolution: --- → FIXED