1002624 - Setup Mac Mini for testing DeployStudio implementation for qa.scl3.mozilla.com

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Description

•

11 years ago

We need a new Mac Mini set-up, which we can use to test the DeployStudio installation as done by bug 997230. This is only a temporary solution and necessary for verification before we want to apply it to real and existing machines. It doesn't matter which system is installed, it will be overwritten anyway. The only thing AFAIC is that the machine has to be located in the QA VLAN. Thanks

Vinh Hua [:vinh]

Assignee

Updated

•

11 years ago

colo-trip: --- → scl3

Dustin J. Mitchell [:dustin] (he/him)

Comment 1

•

11 years ago

You could steal https://inventory.mozilla.org/systems/show/4993/ and just switch the VLAN and DNS/DHCP. Just mark it "temporary" as described in the notes.

Vinh Hua [:vinh]

Assignee

Comment 2

•

11 years ago

I've set up a temp mac mini for your test. https://inventory.mozilla.org/en-US/systems/show/4788/ qa-deploystudio1.qa.scl3.mozilla.com 10.22.73.46

Status: NEW → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED

Dustin J. Mitchell [:dustin] (he/him)

Comment 3

•

11 years ago

Vinh, did deploystudio actually work to do this install? From irc it sounded like there were problems.

Vinh Hua [:vinh]

Assignee

Comment 4

•

11 years ago

Dustin, I went ahead and set up a spare mac mini that we have in storage.

Dustin J. Mitchell [:dustin] (he/him)

Comment 5

•

11 years ago

Sorry to just be getting back to this. I tried to do a DS install of this host, and it went away and has not come back - I can't ping it. From what I can see, it's not on a PDU, so I think I'm helpless. Can you see what state it's in, and maybe try netbooting it? Keep in mind it's not a working mac that we need -- it's a working install of DeployStudio. So if you can give me any info on what might be wrong with the DS server in that VLAN, that'd be a lot more helpful than getting this test mac back online.

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Vinh Hua [:vinh]

Assignee

Comment 6

•

11 years ago

:dustin - I've tried netbooting but getting the following error: unknown host - The address 'tester1.local' is not valid or the host is unreachable

Jake Watkins [:dividehex]

Comment 7

•

11 years ago

(In reply to Vinh Hua [:vinh] from comment #6)
> :dustin - I've tried netbooting but getting the following error:
>
> unknown host - The address 'tester1.local' is not valid or the host is
> unreachable

This is from the netboot image not being configured or set up properly. I've started building a new image but it will take a couple hours to complete.

Jake Watkins [:dividehex]

Comment 8

•

11 years ago

The image has been rebuilt and set to default. See https://bugzilla.mozilla.org/show_bug.cgi?id=997230#c18 :vinh, please go ahead and try this again when you have time. Ping me if you run into problems. At the very least, it should log in to the DS server, mount the repo and display the workflow list.

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 9

•

11 years ago

Dustin, I need an OS X host for testing my proxy patch. What is the status here? Can we get this finalized?

Blocks: 997721

Dustin J. Mitchell [:dustin] (he/him)

Comment 10

•

11 years ago

Status as I have it is in comment 8.

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Updated

•

11 years ago

Flags: needinfo?(vhua)

Jake Watkins [:dividehex]

Comment 11

•

11 years ago

Last state I know of: The netboot image worked but instead of auto-selecting the deploystudio.qa server, it required a manual selection. It provided a dropdown to select either deploystudio.qa or tester1.local. My suspicion is that the ds runtime is auto-discovering tester1.local even though I unchecked the bonjour settings during the netboot creation. I would suggest either disabling any zeroconf services on deploystudio.qa and/or make sure all the names on deploystudio.qa match its fqdn. (there might also be another host with the name tester1.qa) And by names, I mean 'computername' 'hostname' and 'fqdn' as reported and set by scutil. Other than that, nothing should be stopping you from testing the workflows. It will just require manual intervention on the client side for the time being.
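The name check suggested above could be scripted roughly as follows. This is only a sketch: `check_names` is a hypothetical helper, not something that exists on the DS server, and it takes the values as arguments so it can be tried anywhere. LocalHostName is left out on the assumption that Bonjour local names cannot contain dots and so never equal an FQDN.

```shell
#!/bin/sh
# Hypothetical helper: report which scutil-managed names differ from the
# expected FQDN. Arguments: expected FQDN, ComputerName, HostName.
# LocalHostName is skipped (assumed to never contain dots).
check_names() {
    expected=$1
    computer_name=$2
    host_name=$3
    rc=0
    if [ "$computer_name" != "$expected" ]; then
        echo "ComputerName mismatch: $computer_name"
        rc=1
    fi
    if [ "$host_name" != "$expected" ]; then
        echo "HostName mismatch: $host_name"
        rc=1
    fi
    return $rc
}

# On the server itself this would be called as (not run here):
#   check_names "$fqdn" "$(scutil --get ComputerName)" "$(scutil --get HostName)"
```

The function prints one line per mismatching name and returns non-zero if any name differs, so it can be used in an `if` or a monitoring check.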

Vinh Hua [:vinh]

Assignee

Comment 12

•

11 years ago

Attached image photo-2.JPG

Henrik - In the meantime, which image do you want me to select?

Flags: needinfo?(vhua)

Dustin J. Mitchell [:dustin] (he/him)

Comment 13

•

11 years ago

Well, the purpose of that mac mini is to get deploystudio working, so no sense installing an image on it.

deploystudio1:~ root# scutil --get ComputerName
deploystudio1.qa.scl3.mozilla.com
deploystudio1:~ root# scutil --get LocalHostName
deploystudio1qascl3mozillacom
deploystudio1:~ root# scutil --get HostName
deploystudio1.qa

so I suspect that there's some other system on this VLAN that's running bonjour. Henrik, do you know what 'tester1' might be?

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 14

•

11 years ago

Dustin, not sure I understand. If we want to test that DeployStudio is working, we might have to also test that installing an image on this machine works. Or am I wrong? Maybe we should talk about that tool, given that it's a bit hard for me to understand right now. Also I would like to have a Mini I can use for testing puppet. Ideally we would use exactly this mini. So if something screws up we can easily re-image the Mini.

(In reply to Dustin J. Mitchell [:dustin] from comment #13)
> deploystudio1:~ root# scutil --get ComputerName
> deploystudio1.qa.scl3.mozilla.com
> deploystudio1:~ root# scutil --get LocalHostName
> deploystudio1qascl3mozillacom
> deploystudio1:~ root# scutil --get HostName
> deploystudio1.qa
>
> so I suspect that there's some other system on this VLAN that's running
> bonjour. Henrik, do you know what 'tester1' might be?

Where did you get this from? I don't see any reference for 'tester1' in the output above.

Dustin J. Mitchell [:dustin] (he/him)

Comment 15

•

11 years ago

I don't want to start using that mini for testing puppet until we're sure deploystudio works. If deploystudio is requiring user interaction, then it doesn't work yet. If we install an image and start playing with puppet, then Vinh's just going to reboot it out from under you next time he's in scl3 and tries deploystudio. As for tester1, yes that was exactly my observation. I think that means that some *other* host on the VLAN is identifying itself as 'tester1.local' via Bonjour, and causing the deploystudio runtime to prompt. And for the record, this is pretty much the normal level of annoyance and frustration that deploystudio brings.

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 16

•

11 years ago

Vinh, I talked with Dustin and we agreed that we can share the box. Given that I'm in Europe I would like to use the box for testing puppetagain. Later in the day you could use it for the final testing of deployserver. Would that work for you? The only requirement would be to leave the mini up and running. So I would suggest that we get OS X 10.7.5 installed on that box.

(In reply to Dustin J. Mitchell [:dustin] from comment #15)
> As for tester1, yes that was exactly my observation. I think that means
> that some *other* host on the VLAN is identifying itself as 'tester1.local'
> via Bonjour, and causing the deploystudio runtime to prompt.

Where did you get this information from? I cannot find it.

Vinh Hua [:vinh]

Assignee

Comment 17

•

11 years ago

:whimboo - Just so there's no confusion, you want me to install OS X 10.7.5 on the mini?

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 18

•

11 years ago

Yes, via deploystudio if possible. Thanks.

Dustin J. Mitchell [:dustin] (he/him)

Comment 19

•

11 years ago

So it turns out that `dig @224.0.0.251 -p 5353` allows you to query mDNS. I used that to try to find tester1.local, with no luck. I also used it to do a reverse lookup of every IP in the VLAN, and while it turned up a bunch of minis, none gave the name 'tester1.local'. I wonder if tester1 is actually the netboot image?

As a side-note, deploystudio1 is kicking errors like this every 10s, and has a high load average:

May 28 10:21:16 deploystudio1.qa collabd[240]: [CSConnectionPool.m:196 fa7d000 +9998ms] Could not open a connection to Postgres. Please make sure it is running and has the correct access.
May 28 10:21:16 deploystudio1.qa collabd[240]: [CSXCWorkSchedulerService.m:196 fa7d000 +0ms] Failed to open DB connection, retrying in 10s: [CSDatabaseError] Connection to DB failed

I stopped collabd:

deploystudio1:~ root# serveradmin stop collabd
collabd:state = "STOPPED"

On Jake's advice, I disabled Bonjour on the DS server:

deploystudio1:~ root# launchctl unload -w /System/Library/LaunchDaemons/com.apple.mDNSResponder.plist
deploystudio1:~ root# launchctl unload -w /System/Library/LaunchDaemons/com.apple.mDNSResponderHelper.plist
deploystudio1:~ root# ps aux | grep mDNS
root 22029 0.0 0.0 2432784 604 s000 S+ 10:27AM 0:00.00 grep mDNS

So, vinh, let's see if DS works now, and if it doesn't, just get 10.7.5 on there for whimboo's work overnight. Then we can try again in the morning.
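For reference, the per-IP reverse lookup over mDNS described in this comment might look something like the sketch below. It is not the exact command that was run: `mdns_sweep_cmds` is a hypothetical helper, the address range is an assumption, and the function only prints the `dig` invocations so they can be reviewed before being piped to a shell.

```shell
#!/bin/sh
# Print (not run) one mDNS reverse-lookup command per address.
# Arguments: first three octets, first host number, last host number.
mdns_sweep_cmds() {
    prefix=$1
    i=$2
    last=$3
    while [ "$i" -le "$last" ]; do
        # -x asks for the PTR record; +time/+tries keep unanswered queries short
        echo "dig @224.0.0.251 -p 5353 -x $prefix.$i +short +time=1 +tries=1"
        i=$((i + 1))
    done
}

# Example (the range is an assumption, not taken from this bug):
#   mdns_sweep_cmds 10.22.73 1 254 | sh
```

Printing the commands first keeps the sweep easy to audit; the forward lookup for a single name would simply be the same `dig` prefix followed by `tester1.local`.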

Vinh Hua [:vinh]

Assignee

Comment 20

•

11 years ago

DS did not work. Now I'm at the DS runtime screen with only "http://tester1.local:60080" as the drop down option. I do not recall what the other path address is to enter manually. Should I try to install 10.7.5 via DVD install disc?

Jake Watkins [:dividehex]

Comment 21

•

11 years ago

(In reply to Vinh Hua [:vinh] from comment #20)
> DS did not work. Now I'm at the DS runtime screen with only
> "http://tester1.local:60080" as the drop down option. I do not recall what
> the other path address is to enter manually.
>
> Should I try to install 10.7.5 via DVD install disc?

This really makes me believe there is another netinstall(netboot) service running somewhere on qa or dhcp is being fwd'd to one.

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 22

•

11 years ago

(In reply to Vinh Hua [:vinh] from comment #20)
> Should I try to install 10.7.5 via DVD install disc?

If it would be possible, that would help me a lot. But if you think the above problems can be solved before the weekend, you don't have to do it. I will be back on Monday.

Jake Watkins [:dividehex]

Comment 23

•

11 years ago

So I think we got DS working. Turning off the Netinstall service proved the mini was netbooting to the correct server, but I don't think it was getting the correct image. I moved the old NB images to root to ensure DSR-2001 was the only image being served. I also had to re-enable bonjour since the DS service was having a fit not being able to publish to it. According to :vinh, the mini now boots right into the workflow list. I don't think there is a 10.7.5 image though. So it might be best to install 10.7.5 with media and then capture an image and add it to a new workflow.

Dustin J. Mitchell [:dustin] (he/him)

Comment 24

•

11 years ago

Once that's captured and in a workflow, I can set up the puppety automation on that workflow.

Vinh Hua [:vinh]

Assignee

Comment 25

•

11 years ago

I'm in the process of installing 10.7.5 now.

Vinh Hua [:vinh]

Assignee

Comment 26

•

11 years ago

10.7.5 was installed. Currently capturing the image, titled "osx-10.7.5_05_29_14".

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 27

•

11 years ago

Great news all! I just want to add that for our purpose we need net boot images for 10.6, 10.7, 10.8, and 10.9. All those systems are still used for testing. But that can be done later. It would be good to see 10.7.5 working first. Thanks.

Dustin J. Mitchell [:dustin] (he/him)

Comment 28

•

11 years ago

So for my reference, DS is now working, and this bug has wandered into "capture 10.7.5". That failed overnight, but vinh will try again. We haven't really discussed: why not just use the 10.7.2 that we already have an image for? If there's a good reason for that (enough for vinh and relops to spend precious time on it), please open a new bug and close this one. If there's not a good reason, then let's just use 10.7.2 (and still close this bug).

Vinh Hua [:vinh]

Assignee

Comment 29

•

11 years ago

The mini has frozen again during the image creation process; twice.

Vinh Hua [:vinh]

Assignee

Comment 30

•

11 years ago

per irc with Jake, looks like the image creation completed. What's the next course of action?

<vinh> Hey Jake I came back to check on the image and the mini froze again.
14:01 <vinh> So I tried attempt #3 and it says filename already exists.
14:02 <vinh> I'm hoping attempt #2 finished successfully before the mini froze?
14:30 <dividehex> i think #2 succeeded
14:30 <dividehex> i see files for them

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 31

•

11 years ago

10.7.2 would be fine for now. Finally we should have the latest versions for each version of OS X, right?

Dustin J. Mitchell [:dustin] (he/him)

Comment 32

•

11 years ago

Perhaps, but we won't get to "finally" by trying to do everything at once :)

So, it sounds like we now have a 10.7.5 snapshot, but it's not yet set up for puppet deployments, nor tested. However, that means that the mini in question can probably have 10.7.2 installed on it using the existing workflow.

In the DS Admin, I created a "10.7.2 PuppetAgain" group and moved the mini into it. I then set Automation -> Start workflow automatically for that group to "Restore bld-lion-r5-puppetagain". Of the words in that name, "lion" and "puppetagain" are the important bits.

I don't seem to be able to connect to the test mini with either VNC or SSH, so I can't netboot it, but at this point I *believe* that it will "just work" if Vinh netboots it. "Just work" should mean that it comes up, puppetizes, and can be logged into as root. I've verified the deploypass is correct and checked the inventory/DNS/DHCP info for the host.

Henrik, please add a node definition for it ASAP, or it won't successfully run puppet (in which case you'll need to use the kickstart root password if you need to get in). And then let's close this long, winding bug :)

Vinh Hua [:vinh]

Assignee

Comment 33

•

11 years ago

I've netbooted the mini but the workflow did not start automatically. After manually selecting "restore bld-lion-r5-puppetagain", RAID creation failed because the mini only has one 500gb hard drive. I am swamped with the SCL1 move today so will not be able to install the 2nd hard drive.

Dustin J. Mitchell [:dustin] (he/him)

Comment 34

•

11 years ago

OK, definitely don't add a new HDD. What is the configuration of the other minis in this VLAN? We can make a new workflow for their hardware configuration.

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 35

•

11 years ago

Ok, I have added this host to the qa nodes in qa.scl3.mozilla.com: https://hg.mozilla.org/qa/puppet/rev/948a0d168128

(In reply to Dustin J. Mitchell [:dustin] from comment #34)
> OK, definitely don't add a new HDD. What is the configuration of the other
> minis in this VLAN? We can make a new workflow for their hardware
> configuration.

None of the Minis are using RAID. All have a single HDD. And there is actually no need for a RAID system.

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 36

•

11 years ago

We have to make some progress here given that I cannot test anything related to our puppetagain configuration without any node being available. So what can we do in the short term here?

Flags: needinfo?(vhua)

Flags: needinfo?(jwatkins)

Vinh Hua [:vinh]

Assignee

Comment 37

•

11 years ago

I believe jwatkins and dustin can best answer. If you need another mini racked and configured, I can jump on it.

Flags: needinfo?(vhua)

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 38

•

11 years ago

Vinh, if we can get this machine setup so it is reachable and has 10.7.5 installed it would be totally fine. We can surely delay testing for the deploystudio.

Vinh Hua [:vinh]

Assignee

Comment 39

•

11 years ago

:whimboo - The mini is now running 10.9.3

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 40

•

11 years ago

Dustin, what would I have to do to prepare the mini for puppet? I installed puppet and facter. But when I try to run the agent it reports an error because "this master is not a CA". What does it mean, and how can I get it working? I would like to test the latest changes for our QA org in Puppetagain. Thanks.

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 41

•

11 years ago

Well, I may have to add this node to the qa config in my environment. Will test this in a bit.

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 42

•

11 years ago

Actually this host is already part of the qa nodes manifest file. Could this be that we have problems with the certificate given that the machine has been reinstalled?

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 43

•

11 years ago

Sorry, but I accidentally shut down this box. Now I'm not able to re-connect to it. Vinh, can you please bring it back online? Thanks.

Flags: needinfo?(jwatkins)

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Updated

•

11 years ago

Flags: needinfo?(vhua)

Vinh Hua [:vinh]

Assignee

Comment 44

•

11 years ago

Mini should be on now.

Flags: needinfo?(vhua)

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 45

•

11 years ago

Thanks Vinh! So I did a run of puppetize.sh on that box, which seemed to be successful. But the puppet command right after failed again:

$ sudo ./puppetize.sh
Password:
Contacting puppet server puppet
deploypass:
23 Jun 15:14:45 ntpdate[288]: no server suitable for synchronization found
Certificate request for qa-deploystudio1.qa.scl3.mozilla.com
Certificates are ready; run puppet now.
qa-deploystudio1:~ mozauto$ sudo puppet agent --test
Error: Could not request certificate: Error 400 on SERVER: this master is not a CA
Exiting; failed to retrieve certificate and waitforcert is disabled

Dustin, do you have an idea what's missing here?

Dustin J. Mitchell [:dustin] (he/him)

Comment 46

•

11 years ago

The default ssldir is wrong on OS X, so you need to pass --ssldir=/var/lib/puppet/ssl. That's in puppetize.sh if you want to just copy/paste it.

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 47

•

11 years ago

Damn. Now that you say it, I remember. Could we add this to the following wiki page? https://wiki.mozilla.org/ReleaseEngineering/PuppetAgain/HowTo/Re-issue_Certificates_for_a_host Or somewhere better? You may have a suggestion here.

Dustin J. Mitchell [:dustin] (he/him)

Comment 48

•

11 years ago

Sure, feel free, but note that it's only required on the first install. Once puppet runs, it updates puppet.conf to contain the correct path, and everything is happy.
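The first-run behaviour described in the last few comments could be captured in a small wrapper along these lines. This is a sketch under assumptions: `needs_ssldir_flag` and `agent_cmd` are hypothetical helpers, the puppet.conf location has to be supplied by the caller, and the check assumes the setting appears in the file as an `ssldir = ...` line once puppet has run. The function only prints the command instead of running it.

```shell
#!/bin/sh
# Returns 0 (true) when the given puppet.conf has no ssldir setting yet,
# i.e. puppet has presumably not completed a first run on this host.
needs_ssldir_flag() {
    conf=$1
    if [ -f "$conf" ] && grep -q '^[[:space:]]*ssldir[[:space:]]*=' "$conf"; then
        return 1
    fi
    return 0
}

# Print the agent command to use; the ssldir path is the one given in this bug.
agent_cmd() {
    if needs_ssldir_flag "$1"; then
        echo "puppet agent --test --ssldir=/var/lib/puppet/ssl"
    else
        echo "puppet agent --test"
    fi
}

# Example (the config path is an assumption): agent_cmd /etc/puppet/puppet.conf
```

Once puppet.conf carries the correct path, the plain `puppet agent --test` form should be enough.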

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 49

•

11 years ago

Ok, I got all sorted out with the help from Dustin on IRC. I have added all the necessary hiera secrets for root and the builder user. Puppet agent works now and can successfully initiate the system with the current state of QA slaves. Moving bug over to QA infrastructure for continued testing of puppet and then the deployserver.

Assignee: server-ops-dcops → nobody

Status: REOPENED → ASSIGNED

Component: Server Operations: DCOps → Infrastructure

Product: mozilla.org → Mozilla QA

QA Contact: dmoore

Version: other → unspecified

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 50

•

11 years ago

(In reply to Dustin J. Mitchell [:dustin] from comment #48)
> Sure, feel free, but note that it's only required on the first install.
> Once puppet runs, it updates puppet.conf to contain the correct path, and
> everything is happy.

Right. So actually I updated the following page under 'puppetize.sh' to mention that option for the first puppet agent call. https://wiki.mozilla.org/ReleaseEngineering/PuppetAgain/Puppetization_Process#puppetize.sh

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Updated

•

11 years ago

Depends on: 1029614

Dustin J. Mitchell [:dustin] (he/him)

Comment 51

•

11 years ago

I created a new workflow named "Restore bld-r5-lion-puppetagain without raid" which should work on a non-RAID host like qa-puppetagain1. However, I didn't try imaging the host. When you get a chance, try netbooting it again (bless --netboot --server bsdp://10.22.73.45; reboot) and let's see what happens.

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 52

•

11 years ago

Thanks Dustin. I may not do this in the next few days, given that I want to finish the usual puppet flow first. That is more important for us, and I don't want to risk losing this testing Mac. It's too valuable for me at the moment.

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 53

•

11 years ago

Cannot actually test all the bits until proxies are set. Moving dependency to blocking.

No longer blocks: 997721

Depends on: 997721

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Updated

•

11 years ago

Depends on: 1008880, 1008879

Dustin J. Mitchell [:dustin] (he/him)

Comment 54

•

11 years ago

(In reply to Dustin J. Mitchell [:dustin] from comment #51)
> I created a new workflow named "Restore bld-r5-lion-puppetagain without
> raid" which should work on a non-RAID host like qa-puppetagain1. However, I
> didn't try imaging the host. When you get a chance, try netbooting it again
> (bless --netboot --server bsdp://10.22.73.45; reboot) and let's see what
> happens.

Henrik -- up to you.

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 55

•

11 years ago

Ok, I triggered the imaging process with the command given above. Let's see how it works, and if the box comes back.

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 56

•

11 years ago

The box hasn't come back so far. Dustin or Vinh, can one of you please have a look? I would need this box tomorrow for testing Flash. Thanks.

Dustin J. Mitchell [:dustin] (he/him)

Comment 57

•

11 years ago

You can remote into the box that's being installed from the DeployStudio console. Generally the error will be apparent. However, VNC's not working for me (either with Chicken, which rejects the password, or Screen Sharing, which gives me a black screen), so I can't do this. I assume you've adjusted VNC settings based on our irc conversation, so I'll leave the remoting to you.

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 58

•

11 years ago

Sorry, but I only changed the screen resolution on that host. Sadly you have to reboot for VNC to work. Now the screen is visible. Sadly I cannot find any way to control that host, and there are no logs available for it either. Maybe Vinh should have a look at the hardware itself, and reboot if possible.

Flags: needinfo?(vhua)

Vinh Hua [:vinh]

Assignee

Comment 59

•

11 years ago

The mini is getting "cannot find a valid disk to partition" error during netboot.

Flags: needinfo?(vhua)

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 60

•

11 years ago

I think that's something for Dustin then. Looks like the partition setup is still not correct. Dustin, just out of interest, where is this code all located?

Dustin J. Mitchell [:dustin] (he/him)

Comment 61

•

11 years ago

It's not code, it's clicky-pointy stuff. It's in the workflow defined in the DeployStudio admin console. Since it's getting an error about partitioning, then the client node is booting to the DS netboot image. So I'm not sure why it ended up at a gray screen (which was how vinh found it, and I'm guessing how it looks now after he re-tried a netboot). It didn't boot back to the original OS, either. So, we need to change the target disk for the partitioning step in the workflow. Your guess is as good as mine as to what it should be. IIRC, there's an option that just selects the "first" disk, which would probably work. Try making that change? I'm on my Linux laptop today and can't connect to the deploystudio box. Once that's done, we'll need another manual touch in scl3 to try netbooting it again.

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 62

•

11 years ago

(In reply to Dustin J. Mitchell [:dustin] from comment #61)
> work. Try making that change? I'm on my Linux laptop today and can't
> connect to the deploystudio box. Once that's done, we'll need another
> manual touch in scl3 to try netbooting it again.

With Remmina you can connect to this box just fine. I would have to deep dive into all that first, which would take a while to get confident with. :/ Maybe you can have a try?

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 63

•

11 years ago

Wait. Is that the Restore a master on a volume step? If yes, there was user selection active for the target volume! So this might have been the reason. I changed it now to first volume as you said. So Vinh has to reboot the machine again?

Dustin J. Mitchell [:dustin] (he/him)

Comment 64

•

11 years ago

Maybe -- I think it was from the partition step that failed, but you seem to have found a smoking gun. You and I are both at about the same level of deep-diving on deploystudio.. and I don't know what Remmina is, but vinagre didn't work. So yes, with that change, let's try a new netboot.

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 65

•

11 years ago

(In reply to Dustin J. Mitchell [:dustin] from comment #64)
> You and I are both at about the same level of deep-diving on deploystudio..
> and I don't know what Remmina is, but vinagre didn't work.

FYI: http://sourceforge.net/projects/remmina/

> So yes, with that change, let's try a new netboot.

Vinh, can you please try again? It would be great to get the Mini back for testing. :)

Flags: needinfo?(vhua)

Vinh Hua [:vinh]

Assignee

Comment 66

•

11 years ago

Still seeing the same valid disk partition error.

Flags: needinfo?(vhua)

Dustin J. Mitchell [:dustin] (he/him)

Comment 67

•

11 years ago

per IRC, vinh booted this back to the on-disk OS, and it should be usable in that state in the interim. We'll have to figure out what's wrong with the DS workflow next week.

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 68

•

11 years ago

Yeah, that would be good. Please be aware that I will be away from 07/19 to 08/03.

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 69

•

10 years ago

Something went bad with this mini again. It is not reachable via screen sharing nor SSH. Vinh, can you please reboot it?

Flags: needinfo?(vhua)

Vinh Hua [:vinh]

Assignee

Comment 70

•

10 years ago

The host was hung so I had to reboot it. However it is auto netbooting into Deploy Studio.

Flags: needinfo?(vhua)

Jake Watkins [:dividehex]

Comment 71

•

10 years ago

(In reply to Vinh Hua [:vinh] from comment #70)
> The host was hung so I had to reboot it. However it is auto netbooting into
> Deploy Studio.

Auto netbooting is an indication that the previous Deploy Studio workflow did not finish. One of the last steps in the finalize.sh step of the workflow is to bless the disk to boot.
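To make the two boot targets concrete: the netboot form below is the one quoted in comment 51, while the on-disk form is an assumption about what the final workflow step amounts to (the actual contents of finalize.sh are not shown in this bug). `bless_cmd` is a hypothetical helper that only prints the command so it can be checked before running it as root.

```shell
#!/bin/sh
# Print the bless command for the requested boot target.
#   bless_cmd netboot SERVER_IP    -> boot from the netboot (BSDP) server
#   bless_cmd disk MOUNT_POINT     -> boot from the local volume mounted there
bless_cmd() {
    case $1 in
        netboot) echo "bless --netboot --server bsdp://$2" ;;
        disk)    echo "bless --mount \"$2\" --setBoot" ;;
        *)       echo "usage: bless_cmd netboot SERVER_IP | disk MOUNT_POINT" >&2
                 return 2 ;;
    esac
}

# Example (the volume name is an assumption):
#   bless_cmd disk "/Volumes/Macintosh HD"
```

If a workflow dies before its last step, running the `disk` form by hand from the runtime should, in principle, stop the repeated netboots.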

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 72

•

10 years ago

This host is still not working, Vinh. I cannot connect via ssh. Just to be clear, I'm talking about 10.22.73.46. I need this up given that we want to upgrade the 10.7 box in mozmill-ci to 10.10 soon.

Flags: needinfo?(vhua)

Vinh Hua [:vinh]

Assignee

Comment 73

•

10 years ago

:dustin - Would you be able to help out with what Jake suggested in comment 71?

Flags: needinfo?(vhua)

Dustin J. Mitchell [:dustin] (he/him)

Comment 74

•

10 years ago

I can't get the mouse to work via VNC, so I can't even login, let alone look at the workflows. We know what the issue is (comment 66), just not how to fix it. I assume that's why it's still blessed for the netboot. I suspect that just blessing the disk would fix the problem with repeated netboots. I'm not sure how best to do that, but if there's a "Startup Items" control panel item in the deploystudio runtime, that would do the trick. By the way, 10.10 doesn't work with DeployStudio yet -- Jake's working on that in another bug.

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Comment 75

•

10 years ago

Ok, so please wait. Let's recap what this bug was for. The original request was to set up a testing machine for deploystudio. Meanwhile it has turned into a near catch-all for deploystudio issues. I would say we close this bug given that the test mini exists, and follow up with anything else on bug 997230, which is about deploystudio itself.

Status: ASSIGNED → RESOLVED

Closed: 11 years ago → 10 years ago

Resolution: --- → FIXED

Henrik Skupin [:whimboo][⌚️UTC+2] (away 02/17 - 02/21)

Reporter

Updated

•

10 years ago

Assignee: nobody → vhua

BMO Automation

Updated

•

7 years ago

Product: Mozilla QA → Mozilla QA Graveyard