Closed Bug 1129534 Opened 10 years ago Closed 9 years ago

Autophone - Replace autophone.qa.mtv2.mozilla.com

Categories

(Testing Graveyard :: Autophone, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bc, Assigned: bc)

Attachments

(8 files)

autophone.qa.mtv2.mozilla.com is a mac mini which hosts the Autophone tests in the QA lab. It is a single point of failure and is not the preferred platform for running the Autophone framework. Repeated USB disconnections, which require a reboot of the host to recover, along with a recent scare where the machine appeared to be failing, make this a higher priority than we had originally thought. I would like to replace the mac mini with a rack mounted linux server. Also, due to the limitation of 15 usb connections via adb, it would be good for scaling and failover if we were able to replace the mac mini with 2 linux servers. I understand that we have recently decommissioned a number of linux servers when we moved the linux builds to the cloud. Though they may be out of warranty, it may be simpler to reuse them and reserve a couple for spares/parts. Marc, do you see any problems with replacing the mac mini with two rack mounted linux servers in the Autophone rack? Laura, do you know if we could reuse the decommissioned linux servers and if we could get 2 plus some spares?
Van just replied that we do have some decommissioned servers. As for specs, they do not need to be extremely powerful. The current mac mini is a mid 2010 model, with Intel Core 2 Duo, 4G of RAM and a 300G disk. The cpu is currently at 25-35% load with a separate process for each worker/device. RAM usage is currently at 1.6G and 4G RAM is sufficient for normal runs, though I have found that more RAM is necessary when running under debug mode, when the multiple loggers consume more RAM. I would say 4G is minimal, 6-8G or more would be optimal. Autophone stores builds in a cache so they can be reused between devices without needing to redownload them. The storage requirements are higher now that we have switched to split apks and because of the size of the tests.zip files. The requirement is about 1G per cached build. The mac mini has 300G of storage and I've recently had to reduce the cache size to 100 builds to keep the storage under a reasonable limit. 300G storage would be the minimum I think. Of course the machines will also need USB ports which can be connected to the powered hubs.
I see no problem with doing this. We have plenty of space in the racks still.
we have some DL120 G7 that should work (recently decommissioned by RelEng). i can bring it to mtv2 sometime next week after scavenging the parts.
disk - currently 250GB but I know we have some old 1TB/1.5TB drives that would work.
CPU - not 100% sure but should be a quad core Xeon-E3* processor
memory - 8GB
let me know if this will work.
The processor and ram should be more than sufficient. 250G would work for caching up to about 100 builds for a single day which is what the current mac mini is using: 102 builds: total: 297Gi used: 150Gi free: 147Gi. 1/1.5TB drives would be good to allow more builds to be cached for longer periods. I'm a little concerned considering the history of the drives how long they will last. Do you think the existing 250G drives are any more or less reliable than the possible 1/1.5TB drives? If we could get two servers that would be good. It would allow us to expand up to 30 devices and allow for some redundancy should one server fail. Do we need to keep any spare parts, extra disks around for maintenance?
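The sizing in the comment above can be checked with a little arithmetic. This is only a rough sketch: it assumes cache usage scales linearly with the number of builds, it takes the reported sample (102 builds using 150Gi) as representative, and it ignores the difference between G and Gi. The function names are hypothetical and are not part of Autophone.

```python
# Sample reported in the comment above: 102 cached builds use 150Gi.
SAMPLE_BUILDS = 102
SAMPLE_USED_GI = 150


def gi_per_build(builds=SAMPLE_BUILDS, used_gi=SAMPLE_USED_GI):
    """Average cache footprint of one build, in Gi."""
    return used_gi / builds


def builds_that_fit(disk_gi, builds=SAMPLE_BUILDS, used_gi=SAMPLE_USED_GI):
    """Whole number of builds that fit on a disk of disk_gi.

    Uses integer arithmetic and assumes the entire disk is available
    to the cache, which is optimistic for a real host.
    """
    return disk_gi * builds // used_gi


if __name__ == "__main__":
    print("average per build: %.2f Gi" % gi_per_build())
    for disk in (250, 1000, 1500):
        print("%5d Gi disk: about %d builds" % (disk, builds_that_fit(disk)))
```

Under these assumptions a 250G disk comfortably holds one day's worth of roughly 100 builds, and the larger drives leave room to keep builds for several days.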
> I'm a little concerned considering the history of the drives how long they will last. Do you think the existing 250G drives are any more or less reliable than the possible 1/1.5TB drives?
these drives are all pretty old and i'm not sure how hard they've been hammered in the past, so there's really no data to determine their reliability. how does this sound for a spare pool (subject to change as i need to validate the number of spares we have is sufficient):
1 extra dl120 g7
5 extra 1tb/1.5tb drives
6 extra memory dimms
:bc, i can give you guys 3x dl120g7s. you can use 2 for your config and 1 as a spare.
Van, that sounds good to me.
Van, I don't think the current usb hubs will support more than 15 devices total. Do you think we should take this opportunity to purchase rack mounted usb hubs? I found several possibilities where the hub was rack mountable with over 15 ports. The prices were on the order of $300 each. If we got one for each of the servers, it might help in organizing the server+hub+devices. Do you have a preferred vendor/manufacturer we could count on for a quality and reliable product?
Flags: needinfo?(vle)
i've honestly never looked into rack mounted USB hubs. last time we spoke about this, we were looking for a solution that not only charges your USB devices but also syncs them, is that still a requirement? i think i may have found something we can try out. let me know which one will work and we'll probably order it through service-now as i don't believe any of our vendors deal with USB hubs. http://www.hubgear.com/
:bc, sorry i misconstrued what you were saying. i think the best solution is to go through service now.
Van: ok, i was just trying to get some guidance on what would be the best solution. I'd like to try to get this right and I'm a bit out of my comfort zone here. Funnily, one of the links from that page was to one of the usb 2.0 16 port rack mount hubs I was looking at before. But the more I think about it, having so many ports on a usb 2.0 hub is probably not the best approach since they will all share the bandwidth of the host's port. So we would be splitting the 480Mbps among all of the connected devices. The DL120 G7 has 6 external usb 2.0 ports, but I can't find anything about how many controllers it contains. It does have pci express expansion slots available though. I wonder what you think about getting a usb 3.0 card and a 16 port usb 3.0 hub. I found a usb 3.0 16 port usb hub <http://www.hubgear.com/usb3-16u1.php>. It supposedly has Multi Transaction Translators (Multi-TT) per hub. Would that work to provide the full 480Mbps to each port with a usb 2.0 device? Reading http://en.wikipedia.org/wiki/USB_3.0 worries me a little about radio interference. The wifi situation in the QA lab is flaky enough as it stands. Do we know if the wifi in the lab runs at 2.4 GHz or 5 GHz? Unless we are pretty sure the usb 3.0 approach will work and not hork the wifi network in the lab, I'd like to go slow and only do one server and make sure it all works before we buy the additional card and hub.
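The bandwidth concern above can be put into numbers. The sketch below is a back-of-the-envelope estimate only: it uses the nominal USB 2.0 high-speed signaling rate of 480Mbps, assumes every attached device is transferring at the same time, and ignores protocol overhead, so real throughput would be lower. The function name is hypothetical.

```python
USB2_HIGH_SPEED_MBPS = 480  # nominal signaling rate, not achievable throughput


def nominal_share_mbps(devices, upstream_mbps=USB2_HIGH_SPEED_MBPS):
    """Even split of one upstream port's nominal bandwidth across devices."""
    if devices < 1:
        raise ValueError("need at least one device")
    return upstream_mbps / devices


if __name__ == "__main__":
    for n in (1, 8, 15, 16):
        print("%2d devices on one upstream port: %.1f Mbps each (nominal)"
              % (n, nominal_share_mbps(n)))
```

With 16 devices behind a single host port the nominal share drops to 30Mbps each, which is the motivation for spreading devices over more than one controller.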
we have both 2.4ghz and 5ghz running in the lab. i would recommend against using usb 3.0 if it's going to cause interference as your lab (3 contiguous racks) isn't large enough to move things away from each other and it might cause you more headaches with random issues. we're also looking at a pretty high density of usb 3.0 ports (i'm not sure of the spectrum of interference), and it may or may not skew your tests. you can look into purchasing 2 internal pci-e USB 2.0 cards (1 regular and 1 low profile) to give you more ports.
:bc, will these rack mount USB shelves work for you? they seem to be pretty robust with their options. http://www.digi.com/products/usb/anywhereusb#specs
Flags: needinfo?(vle)
I don't know if that would work.
"Compatible with bulk or interrupt type USB devices; Isochronous devices not supported"
"Windows® 8, Windows 7, Windows Vista®, Windows Server® 2012 R2, Windows Server 2008 R2, Windows Server 2003, Windows XP®, Windows XP Embedded"
From http://www.digi.com/learningcenter/stories/create-pc-free-access-control-systems
"Third, USB Over IP software drivers are also loaded on the same host. This software acts as a COM port redirector – it makes the host “think” that the remote AnywhereUSB/5 devices are locally attached. All processing happens at the remote host, resulting in fewer points of failure and less risk of tampering."
This makes me think the Windows requirement is actually real. I'm not clear as to how we would be able to see the devices via adb on Linux. The 2 port box is almost $300 and the 14 port is over $1800. I think rather than trying to rush the choice of the next-gen usb solution, can we just stick with the existing usb2 ports on the servers for now and connect them to the existing hubs? That will get us off the mac mini and give us some breathing room to figure out the best usb solution.
Rather than obtaining a hub which would connect to a single usb port on the server, do you think we could add additional usb 2.0 cards to the servers? If we could get each card with enough ports, and use the existing 4 usb ports on the back of the server, we would have a sufficient number of ports to support up to the adb limit of 15 devices, or at least very close to it, and the bandwidth would be spread across 3 controllers instead of just one.
From http://www8.hp.com/h20195/v2/getpdf.aspx/c04128298.pdf?ver=18
1 PCIe x16 Gen2 (x16 speed) (full-length, full-height)
1 PCIe x8 Gen2 (x4 speed) (half-length low-profile)
Flags: needinfo?(vle)
:jbarnell, are we still planning to move forward with using decommissioned hardware for autophone? we last discussed with ctalbert but i believe he's gone now. :bc that will work. do you need help ordering the USB HBAs if we're still good to reuse the hardware?
Flags: needinfo?(vle)
(In reply to Van Le [:van] from comment #16)
> :jbarnell, are we still planning to move forward with using decommissioned hardware for autophone? we last discussed with ctalbert but i believe he's gone now.
He is still here until the end of the month. I think mcote and/or jgriffin will be the deciders in the future.
> :bc that will work. do you need help ordering the USB HBAs if we're still good to reuse the hardware?
Yes please. I'm not familiar with specifying the appropriate cards that will work in these types of rack-mounted 1u servers. If at all possible it would be great to get at least 11 additional ports out of the two cards which would allow us to have all 15 usb cables attached to the rear of the server. If the two usb ports on the front of the server won't screw up the cabling, we could reduce that from 11 to 9. If we can't get to the limit of 15 ports, the closer we can get the better.
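The port counts in the reply above follow from simple subtraction. A minimal sketch, assuming the figures quoted in this bug (a limit of 15 devices via adb, 4 usb ports on the rear of the server and 2 on the front); the function name is hypothetical.

```python
ADB_DEVICE_LIMIT = 15  # limit cited earlier in this bug
REAR_PORTS = 4         # existing usb ports on the back of the server
FRONT_PORTS = 2        # existing usb ports on the front of the server


def additional_ports_needed(use_front_ports=False,
                            limit=ADB_DEVICE_LIMIT,
                            rear=REAR_PORTS,
                            front=FRONT_PORTS):
    """Ports the add-in cards must supply to reach the device limit."""
    built_in = rear + (front if use_front_ports else 0)
    return max(limit - built_in, 0)


if __name__ == "__main__":
    print("rear ports only:", additional_ports_needed())
    print("rear + front ports:", additional_ports_needed(use_front_ports=True))
```

Using only the rear ports leaves 11 ports for the two cards to provide; also using the front ports reduces that to 9.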
emailed our vendor inquiring about compatible PCI-e USB cards. i'll let you know if they have any options available.
Flags: needinfo?(vle)
our vendor doesn't have any sources available for pci-e USB cards. are there any you're interested in that you'd like to purchase to test? i can help you order and install but i don't have the free cycles to research what works for your applications.
Van I have had no luck with this request. Can we use a USB Hub and connect it to the server? Rich
Van, I'm ok with just reusing the current usb hubs and purchasing additional hubs. Can we fit two more 8 port hubs into the space available?
space is fine, we don't have any more USB ports on the mac mini unless you plan to daisy chain the hubs.
Flags: needinfo?(vle)
Depends on: 1172950
Depends on: 1155876
Depends on: 1229091
Depends on: 1229139
Attached file inventory
The servers are arranged in the rack with
autophone-1.qa.mtv2.mozilla.com
autophone-3.qa.mtv2.mozilla.com
autophone-2.qa.mtv2.mozilla.com
The attachment gives the locations of each device. More work needs to be done to help manage 3 instances. I'll attach patches for the autophone .ini and the test manifest .ini files after I get home. I'll write everything up after I've had a break.
production: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&filter-tier=1&filter-tier=2&filter-tier=3&filter-searchStr=autophone&fromchange=df444117c7be&exclusion_profile=false
including a try run https://treeherder.mozilla.org/#/jobs?repo=try&revision=54e37c4d7477&exclusion_profile=false&filter-tier=1&filter-tier=2&filter-tier=3
try: -b o -p android-api-9,android-api-11 -u autophone-s1s2,autophone-webapp,autophone-mochitest-dom-media -t none
Note I also changed the device names to remove the old jdq39, kot49h and jss15q. Device names are simply the devicetype-number. I also updated the database on phonedash. Note this is running without a service and will not automatically reboot if a device disconnects. I'll get the upstart scripts written asap and turn on automatic rebooting.
Attachment #8709051 - Attachment mime type: application/x-sh → text/plain
This runs the autophone server under my id. Originally I had intended to run this under the separate autophone id; however, when I enabled usb debugging for the devices on the hosts I was logged in using my id. It turns out that the "Always allow" USB debugging permission is not only host specific, it is tied to the user id of the adb request as well. In order to change the autophone server to run under the autophone id we need someone in the lab who can check the "always allow" checkbox on each affected device. For now, we'll just run under mine.
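One way to see which devices would need the checkbox pressed again is to look at the state column printed by `adb devices` when run as the other user: devices that have not authorized the requesting host and user are listed as `unauthorized`. The sketch below only parses that text output; it is not part of Autophone, the function names are hypothetical, and the serial numbers in the sample are made up.

```python
def parse_adb_devices(output):
    """Map serial -> state from the text printed by `adb devices`."""
    devices = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip blanks, the header line and adb daemon status messages.
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices[parts[0]] = parts[1]
    return devices


def unauthorized_serials(output):
    """Serials that still need USB debugging approved on the device."""
    return sorted(serial for serial, state in parse_adb_devices(output).items()
                  if state == "unauthorized")


if __name__ == "__main__":
    # Made-up sample; real output would come from running `adb devices`.
    sample = (
        "List of devices attached\n"
        "0123456789abcdef\tdevice\n"
        "fedcba9876543210\tunauthorized\n"
    )
    print(unauthorized_serials(sample))
```

Running the check as the autophone user before switching the service over would show how many devices someone in the lab has to approve by hand.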
They are deployed. Marking fixed.
Assignee: nobody → bob
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Product: Testing → Testing Graveyard