Closed Bug 386074 Opened 17 years ago Closed 17 years ago

Build cycle times are up across the board after moving to NetApp-backed storage

Categories

(Infrastructure & Operations :: RelOps: General, task)

Type: task
Priority: Not set
Severity: major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: coop, Assigned: justin)

References

Details

(Whiteboard: Waiting on NetApp shelf)

We've seen increases -- sometimes quite large -- in build cycle times since the VMs have been migrated to ESX3 hosts backed by the NetApp. Things were meant to be getting _faster_. :/ 

Here are some sample cycle times:

Machine: before migration vs. after
patrocles-vm:  1 hr vs. 6 hr
fx-linux-tbox: 10 min vs. 30 min
fx-win32-tbox: 30 min vs. 90 min

Is there anything in the new host configuration that would be slowing things down like this? Any tweakable params on the NetApps we can work with?
Blocks: 385972
This shouldn't have anything to do with the netapp as it's running at about 15% cpu load.  Preed said he thought this had to do with poor distribution on the ESX hosts - thoughts preed?
Severity: critical → major
It's not clear to me that the move was supposed to improve speed as much as it was to grow the storage and the ability to move VMs around.  

We are using software iscsi and, if that's the constraint, we could experiment with a host using a QLogic iSCSI HBA.  I was on the phone with vmware yesterday going over the settings and there wasn't anything to tweak on that end.  

In any event, fx-win32-tbox is moving to bm-vmware06.
Severity: major → critical
Severity: critical → major
The move is done - can you report back what the new cycle times are and verify that builds are working as expected?
I'm not sure what this tells us, but since I looked them up, here are the times for fx-win32-tbox to do the nightly clobber, and to do the no-checkin depend build closest to the nightly:

      clobber  depend
6/26     1:41     :57
6/25     1:42     :57
6/24     1:40     :56
6/23     1:39     :58
6/22     1:29     :46
6/21     1:22     :36
6/20     1:21     :37
6/19     1:20     :36

The question, though, is whether that means that last week's migrations added a flat 20 minutes to the build time, or that it increased clobbers by 25% and depends by almost 100% (see bug 381247 for an example of how changing storage could make a much bigger difference for depend builds).
For comparison, these are "no-change depend"/clobber times (in minutes). "Before" is June 18th, "After" is June 27th.

App-Branch           Before/After    Windows    Linux     Mac (control)

Firefox-Trunk        Before           35/80     11/44     31/55
                     After           100/?      25/68     31/55

Thunderbird-Trunk    Before           38/130    10/43     20/51
                     After            90/160    25/64     21/51
No longer blocks: 385972
We moved fx-win32-tbox onto an empty VM host today, and the cycle time is still crappy (100 minutes for a depend build, 150 minutes for a clobber build). Host loading doesn't seem to be a factor in this case.
I talked to kev who's worked with NetApps before:

[8:39pm] kev: you running iscsi or nfs?
[8:54pm] coop: using software iscsi right now, according to mrz
[8:56pm] kev: we had a problem with network controllers in a cluster config
[8:57pm] kev: basically one of them would synflood the network
[8:57pm] kev: ended up replacing the controller on three boxes
[8:57pm] kev: very annoying
[9:05pm] coop: sadly, i don't know enough about how it's setup to be very helpful in debugging the problem
[9:06pm] kev: yeah, it took a box that from all appearances was fine, and completely hosed performance

Not sure whether that's pertinent in this case.

I'm unsure as to next steps here. Is there any provision for going *back* to local storage, even if only for one host? How about changing the iscsi as mrz suggests in comment #2?
I talked to Kev - he'll give me more info on the netapp issue in the morning.

Comment #2 involves some money and downtime on an ESX host, so if we go that route, we should keep one free.

None of the ESX hosts have a lot of local storage (mostly 72GB), and moving back to local storage would eliminate VMotion.  For comparison testing I could manually move one.
Assignee: server-ops → mrz
Just because I can't resist ringing the patrocles bell one more time, cf's table for 1.8:

App-Branch        Before/After         Windows    Linux     Mac (control)

Firefox-1.8         Before             24/55      13/59     19/88
                     After             38/72      34/81     18/87

Thunderbird-1.8     Before            58/134       9/23     13/90
                     After           377/424      34/54     12/90

If you're looking for one to move where you can't possibly do any harm, patrocles with the 550% increase in depend build time seems like a good choice.
I'd be very surprised if this is a netapp issue for a few reasons:

1)  We tested local storage vs netapp iscsi for sequential, non-sequential, and random reads/writes, and all compared within 5% of local storage (a rough sequential sanity check is sketched at the end of this comment).
2)  We have many other apps that are as fast or faster on iscsi
3)  The netapp cpu is 20% or less
4)  Preed tested builds on the netapp before the migration and purchase and didn't mention any issue about huge slowdowns

If we were having synflood issues, we'd see these issues on many other apps.

I would look more to vmware - I'll work tomorrow with mrz/vmware/netapp to try to see what the issue is.
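For the record, a rough sequential sanity check of the kind described in 1) can be run from inside a guest like this (path and size are illustrative, a just-written file may be read back from cache, and random-I/O comparisons need a different tool):

# sequential write of 1 GB to the disk under test, then flush to disk
dd if=/dev/zero of=/builds/ddtest bs=1M count=1024 && sync

# sequential read of the same file (may be partly served from page cache)
dd if=/builds/ddtest of=/dev/null bs=1M

# clean up
rm /builds/ddtest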
As of now, we're moving fx-win32-tbox from the Netapp to local storage, so there will be ~1 hour of downtime.

We'll see if build times improve for this VM. We put this on a machine by itself and it didn't get any faster, so we're trying to isolate if it's iSCSI or not.
Just had another IRC convo with kev about the NetApp:

[11:59am] kev: so yeah, there's two data stores defined for all the VM disks
[12:00pm] kev: how many VM disks are on them?
[12:17pm] coop: i.e. how many VMs do we have, total? or individual disks within those VMs?
[12:17pm] kev: individual disks
[12:18pm] kev: mrz mentioned 32 machines
[12:18pm] kev: but that can still be high, especially if they're paging
[12:18pm] kev: the datastore is essentially on LUN
[12:18pm] kev: one, even
[12:18pm] coop: we have more disks than machines, rest assured
[12:18pm] coop: although i don't know precisely how many
[12:18pm] kev: so, if you have 32 virtual disks, all of those requests go through the one device on the filer
[12:19pm] coop: ugh
[12:19pm] kev: so you get i/o request queuing, and bad things happen
[12:19pm] coop: older VMs will only have one disk because they were cloned from original desktop machines
[12:19pm] coop: new VMs based on ref images could have up to 3 disks each
[12:20pm] kev: a re-allocation may be required 
[12:20pm] coop: let me grab an approximate number from the email i sent out a few days ago
[12:20pm] kev: apparently there are two datastores using all the disk
[12:20pm] kev: which will make things entertaining
[12:21pm] kev: the other thing is to get a good idea of how much swap the vms are using
[12:21pm] kev: because swap/paging file usage will just compound the problem
[12:23pm] coop: i'm estimating 40-45 disks used by active VMs
[12:24pm] kev: so 20 per datastore potentially. which isn't that high, but I'm guessing I/O utilization is relatively high when they're building
[12:24pm] coop: yeah, crazy amounts of file access, especially for clobber builds
[12:24pm] coop: which interestingly almost all occur at the same time every night 
[12:25pm] kev: that'll do it
[12:25pm] kev: other thing is to verify that all the network links are gbit and are set to autonegotiate
[12:26pm] kev: there's also some fun settings you have to use to make sure interfaces serving linux boxes are tweaked or bad things can also happen
[12:27pm] coop: k
[12:28pm] kev: I sent mrz the guide
[12:28pm] coop: k, i'm also gonna post most of this transcript in the bug for IT, if that's alright by you
[12:28pm] kev: http://www.netapp.com/news/techontap/3248.pdf
[12:28pm] kev: soitenly
[12:30pm] coop: when you say re-allocation, what are you referring to? consolidating disks on the newer VMs?
[12:32pm] kev: assuming they are actually set up in one VMFS Datastore, by reallocation I mean creating new stores and spreading the virtual disks across them
[12:32pm] coop: ah, k
I'm pretty sure I've said this before in the bug, but i/o ops on the heads are not the bottleneck here.  I've run far more intensive stats than the above, and the issue is not there.

Also, setting network interfaces to autoneg is in fact a *bad* thing, especially on the netapp.  You should force 1gb full duplex, and you need correct flow control settings - something autoneg usually doesn't get right.
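For the Linux boxes, a minimal sketch of what forcing 1gb full duplex plus explicit flow control looks like with ethtool (eth0 is a stand-in for whichever interface carries storage traffic; the filer and switch ends need matching settings applied with their own tools):

# disable autonegotiation and force gigabit full duplex (hypothetical interface name)
ethtool -s eth0 speed 1000 duplex full autoneg off

# set rx/tx flow control (pause frames) explicitly instead of leaving it to autoneg
ethtool -A eth0 autoneg off rx on tx on

# verify what actually took effect
ethtool eth0
ethtool -a eth0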
I agree with justin - at the end of the day, regardless of how many LUNs, traffic is going out the same interface to the same netapp to the same set of disks (which the netapp has virtualized a LUN out of).  

In comparison, our install is small so I'm confident that 32 VMs on two datastores (19/13) isn't over capacity.
Here's some more data. I just ran iostat on tb-linux-tbox during the 'cvs co' portion of its latest run. The high %util is troubling:

[cltbld@tb-linux-tbox ~]$ iostat -d -x 2 10 | grep -v sda
Linux 2.6.9-42.ELsmp (tb-linux-tbox.build.mozilla.org)  06/29/2007

Device:    rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await  svctm  %util
sdb          0.35 377.64 61.12 25.57 3058.54 3227.32  1529.27  1613.66    72.51     5.32   61.37   8.44  73.14
sdb          0.00   0.00 41.29  0.00  334.33    0.00   167.16     0.00     8.10     0.98   23.47  23.75  98.06
sdb          0.00   0.00 53.96  0.00  431.68    0.00   215.84     0.00     8.00     0.98   18.35  18.17  98.07
sdb          0.00  12.63 60.10  1.01  480.81  109.09   240.40    54.55     9.65     1.00   16.02  16.31  99.70
sdb          0.00   0.00  9.45  0.00   75.62    0.00    37.81     0.00     8.00     1.00  107.32 105.32  99.55
sdb          0.00   5.97 51.74  1.00  413.93   55.72   206.97    27.86     8.91     1.02   18.83  18.73  98.76
sdb          0.00  38.69 62.81  8.54  502.51  377.89   251.26   188.94    12.34     1.02   14.70  13.97  99.70
sdb          0.00   0.00 73.76  0.00  594.06    0.00   297.03     0.00     8.05     0.98   13.32  13.28  97.92
sdb          0.00  19.00 58.00  1.50  464.00  164.00   232.00    82.00    10.55     1.04   17.31  16.71  99.45
sdb          0.00   0.00 67.34  0.00  538.69    0.00   269.35     0.00     8.00     1.00   13.86  14.81  99.75

Also posted here: http://pastebin.mozilla.org/111373
I should perhaps also note that the lowest I've seen %util go during other portions of the build is 58%.
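If it's useful to watch for saturation over a whole build, a quick way to flag the bad intervals (assuming sdb is still the NetApp-backed disk, as in the output above):

# stream extended stats every 2 seconds and print only samples where %util (last column) exceeds 90
iostat -d -x 2 | awk '/^sdb/ && $NF+0 > 90'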
Good info - how do those numbers compare to the build running on local disk?
I have fx-win32-tbox-vmware02 online at 10.2.71.254.

vmware02 is using a separate, untagged ethernet interface for iSCSI traffic.

Can someone start off the build process and time it on this machine?
(In reply to comment #19)

Build is running now, reporting to 
 http://tinderbox.mozilla.org/showbuilds.cgi?tree=MozillaExperimental
as 
 WINNT 5.2 fx-win32-tbox Dep VM testing

First build is a clobber, then depends like its sibling. Note that bsmedberg turned objdirs on today, but it hasn't impacted build time - still 1 hour for a clobber.

[build: config is to not release builds, symbols, or update info from this instance. Nor is it updating its configs from CVS - ie manual clobbers only]
Build times look better - I'll let it run till tomorrow. Build team to discuss at the build meeting and let me know if the times are more acceptable.
I wonder if the problem is with the tagging itself, or at least VMWare's implementation of 802.1q (we have none of these issues with RHEL).

bm-vmware02 has four physical interfaces, two of which are on the storage network (untagged, no 802.1q).  I want to move back to the other two interfaces but without any Vlan tagging.  If the numbers are on par then great, if they're longer than it appears separate physical interfaces is better.
Times look normal, no?  I'm still waiting on vmware but if I can move vms off a host for 15 minutes, I can change the storage interfaces.  When I did so on bm-vmware02, it needed a reboot before vmkping worked.
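For anyone repeating this on another host, a rough sketch of the ESX 3 service-console checks involved (the filer address is a placeholder):

# list vswitches/portgroups to see which VLAN tag, if any, the storage portgroup carries
esxcfg-vswitch -l

# list the vmkernel NICs used for iSCSI
esxcfg-vmknic -l

# confirm the vmkernel interface can reach the filer after the change
vmkping <filer-iscsi-ip>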
To summarize the data (fx-win32-tbox):
                                                     Depend        Clobber     

before migration (local storage, shared host)           35            140
 
after migration (netapp, shared host)                  100              ?

make devs happier (local storage, host to self)         16             60

clone, mrz tests round 1 (netapp, host to self)         42              ?  

clone, mrz tests round 2 (netapp, host to self)         39             83
In order to debug this further I need to move the original fx-win32-tbox off local disk and back to iSCSI, and re-verify that dropping the VLAN tag on the storage ports (both service console and vmkernel) makes the numbers more acceptable.

When can I do this?
I have some things back from netapp too - I'll talk with mrz and we'll formulate a plan for the tests/migrations we need to do.
Adding bug 378440 to this bug; it tracks file system corruption on cerberus-vm; cf also points out that using the newer SCSI driver may improve build times.
Depends on: 378440
For tracking - netapp says this could be a perf issue.  We'll need to migrate the LUNs to type netapp and we'll need the build team to move the guest partitions as outlined below...

http://now.netapp.com/Knowledgebase/solutionarea.asp?id=kb24492
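For reference, the usual quick check for the guest-partition half of this can be done from inside a Linux guest (sdb as a stand-in for the iSCSI-backed disk; the KB article above is the authoritative procedure):

# show partition start sectors; the old DOS default of 63 is misaligned against the
# filer's 4 KB blocks, while starts divisible by 8 (64, 128, ...) are aligned
fdisk -lu /dev/sdb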
To clarify, these are the current issues we're tracking here:

1. Problem with Netapp LUN settings; I don't have all the details, but there's something with block alignment that's not turned on or something, we'll be fixing that with tomorrow's 12-hour downtime.

2. Storage network connections: mrz did some testing and found out that the VLAN tagging perf in ESX isn't great, I guess. So he's solving it at the switch level. He also found better performance on the machines with multiple network ports and an external four-port network card. It's unclear to me if we're going to outfit all the machines with these new cards.

3. The SCSI driver issue that Nick detailed. Nick was going to clone cerberus-vm and give this a try. In some cases, we may *have* to upgrade the SCSI drivers, because the drivers on the current new ref platform seem to have the corruption problems we've seen before. mrz found an updated driver from VMware that we'll have to use.
> 2. Storage network connections: mrz did some testing and found out that the
> VLAN tagging perf in ESX isn't great, I guess. So he's solving it at the switch
> level. He also found better performance on the machines with multiple network
> ports and an external four-port network card. It's unclear to me if we're going
> to outfit all the machines with these new cards.

I haven't been able to test a machine with the two onboard NICs to see if non-tagged storage interfaces are better or not.  The test I did doesn't really match how the other boxes are configured.  

I want to move fx-win32-tbox back to iSCSI on vmware06 and try with non-tagged interfaces.  
(In reply to comment #30)

> I haven't been able to test a machine with the two onboard NICs to see if
> non-tagged storage interfaces are better or not.  The test I did doesn't really
> match how the other boxes are configured.  
> 
> I want to move fx-win32-tbox back to iSCSI on vmware06 and try with non-tagged
> interfaces.  

Can we do this with a clone of fx-win32-tbox (with different configs) or do we have to use that VM?
I suppose I could clone it, but the test will only be similar since I don't think there are any empty ESX hosts.  We can talk about it in person Tuesday.
Whiteboard: Waiting on NetApp shelf
Demo shelf is up and I posted numbers for fx-win32-tbox-fcal.  This was on an unloaded ESX host and on an unloaded NAS shelf.

I want to simulate a loaded shelf and have cloned fx-win32-tbox-fcal 5 times.  I talked to preed yesterday about needing someone to show me what needs to change on each clone such that each reports to the MozillaExperimental page under a different name, or doesn't report at all (I'm only tracking fx-win32-tbox-fcal's times).

Info sent - waiting on resolution.
Assignee: mrz → justin
Closing as we have a new architecture and will be deploying in the next few weeks.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations