Closed Bug 999930 Opened 11 years ago Closed 11 years ago

put tegras that were on loan back onto a foopy and into production

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jlund, Unassigned)

References

Details

Attachments

(3 files)

These tegras were on loan and need to be configured back into production. Based on:

    import json
    import operator
    from collections import Counter

    devs = json.load(open('/Users/jlund/Downloads/devices.json'))
    for k, v in sorted(Counter([devs[d]['foopy'] for d in devs]).iteritems(),
                       key=operator.itemgetter(1)):
        print k, v

I get (foopies grouped by device count; "None" is the bucket of devices with no foopy assigned):

     5: foopy31
     6: foopy106
     7: foopy32, foopy30
     8: foopy28, foopy109
     9: foopy111, foopy113
    10: None, foopy115, foopy114, foopy117, foopy112, foopy45
    11: foopy83, foopy67, foopy46, foopy89
    12: foopy29, foopy90, foopy95, foopy97, foopy96, foopy79, foopy76, foopy75, foopy74, foopy81, foopy68, foopy69, foopy60, foopy61, foopy62, foopy55, foopy54, foopy53, foopy47, foopy48, foopy82, foopy125, foopy120, foopy121, foopy123, foopy88
    13: foopy27, foopy99, foopy98, foopy91, foopy93, foopy92, foopy94, foopy102, foopy103, foopy100, foopy101, foopy104, foopy105, foopy78, foopy73, foopy72, foopy71, foopy70, foopy77, foopy86, foopy87, foopy84, foopy85, foopy119, foopy80, foopy116, foopy63, foopy64, foopy65, foopy66, foopy57, foopy56, foopy51, foopy50, foopy52, foopy59, foopy58, foopy42, foopy43, foopy40, foopy41, foopy44, foopy49, foopy124, foopy126, foopy122, foopy128, foopy39
    14: foopy108
    15: foopy110
    17: foopy107

I would assume I should distribute these across the foopies that have the fewest devices, but I'll wait to see what callek thinks. I think all I have to do is make the tegra dirs on the foopy (https://bugzilla.mozilla.org/show_bug.cgi?id=971859#c4), add the tools patch (something like https://hg.mozilla.org/build/tools/rev/bfd44dcd7f79), and then update the tools revision across the foopies.
Blocks: tegra-287
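For reference, a Python 3 sketch of the same count that also surfaces the least-loaded candidates directly. The devices.json path and its layout (device name -> dict with a 'foopy' key) are assumptions based on comment 0:

    import json
    from collections import Counter

    with open('devices.json') as f:   # path assumed; point at your local checkout
        devs = json.load(f)

    counts = Counter(dev.get('foopy') for dev in devs.values())
    unassigned = counts.pop(None, 0)  # devices with no foopy yet
    print('devices with no foopy:', unassigned)

    # least-loaded foopies first: candidate homes for the returning tegras
    for foopy, n in sorted(counts.items(), key=lambda kv: kv[1])[:5]:
        print(foopy, n)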
arg, I just saw: buildfarm/mobile/devices_per_foopy.py. I guess I should have used that! callek: needinfo WRT ^ https://bugzilla.mozilla.org/show_bug.cgi?id=999930#c0
Flags: needinfo?(bugspam.Callek)
c#0 sounds accurate, just points of order:
* You need to create the tegra dirs as root
* You need to chmod/chown -R those dirs to cltbld
* You need to choose foopies that are in scl3
Beyond that your plan sounds perfect. (A sketch of the first two points follows below.)
Flags: needinfo?(bugspam.Callek)
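A minimal sketch of those first two points of order, run as root on the chosen scl3 foopy. The device names and the 0775 mode are examples, /builds as the base path follows later comments in this bug, and a cltbld group matching the cltbld user is an assumption:

    import os
    import pwd
    import grp

    uid = pwd.getpwnam('cltbld').pw_uid
    gid = grp.getgrnam('cltbld').gr_gid        # assumes a matching cltbld group

    for tegra in ('tegra-275', 'tegra-287'):   # example device names
        path = os.path.join('/builds', tegra)
        os.makedirs(path, exist_ok=True)       # created as root
        os.chmod(path, 0o775)                  # assumed mode
        os.chown(path, uid, gid)               # dir is new and empty, so no -R walk needed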
As per comments 0 and 2 of this bug, foopy31.tegra.releng.scl3.mozilla.com:
- is in scl3
- holds the fewest tegras ATM

I will hold off mkdir'ing the tegra dirs on this foopy until this patch lands.

General question to anyone: once landed, what's the current procedure for deploying? Should I just update the local tools repo on foopy31, or do we keep all our foopies in sync?
Attachment #8411350 - Flags: review?(pmoore)
going to add tegra-205 to this list. Possibly more to come.
Summary: give tegra-275 and tegra-287 a foopy and put back in production → put tegras that were on loan back onto a foopy
Blocks: tegra-205
tegra-204 has foopy125 named to it in devices.json; however, foopy125 does not have a tegra-204 dir in /builds/. This might not have been created after the loan (https://bugzilla.mozilla.org/show_bug.cgi?id=755739#c12). This bug can track tegra-204 as well.

Steps needed:
1) make sure the tools repo rev on foopy125 has a devices.json that states tegra-204 has foopy125 attached to it (a quick check for this is sketched below)
2) create the tegra-204 dir (e.g.: https://bugzilla.mozilla.org/show_bug.cgi?id=999930#c2)
3) power cycle the tegra just to be safe
Depends on: tegra-204
Blocks: tegra-204
No longer depends on: tegra-204
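A quick check for step 1 above, run on foopy125 itself; the location of the foopy's local tools checkout is an assumption for illustration:

    import json

    # assumed path of the local tools checkout on the foopy
    with open('/builds/tools/buildfarm/mobile/devices.json') as f:
        devs = json.load(f)

    foopy = devs['tegra-204'].get('foopy')
    assert foopy == 'foopy125', 'stale checkout? devices.json says %r' % foopy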
tegra-203 needs a home (foopy) too. It is listed as on loan in devices.json with no foopy named to it, but it is no longer on loan: https://bugzilla.mozilla.org/show_bug.cgi?id=749637
Blocks: tegra-203
Comment on attachment 8411350 [details] [diff] [review]
140427-999930_add_tegras_back_to_foopies.patch

I am going to be adding more tegras to this one patch.
Attachment #8411350 - Flags: review?(pmoore) → review-
=== status update ===

[I have added dep bugs that are directly related to this (994916 and 928122)]

In devices.json, foopy is not listed (an audit sketch for this group follows below):
- tegra-203
- tegra-205
- tegra-275
- tegra-287
- tegra-198
- tegra-202

In devices.json, foopy is listed, but the tegra dir does not exist on the specified foopy (foopy125):
- tegra-204

van thinks these are still on loan. They are in scl1. They might not be in devices.json at all, so no foopy is listed and no tegra dir exists:
- tegra-161
- tegra-123
- tegra-122

For 122, 123, and 161, I see no history of them being on loan (slavealloc, problem tracking bug, or bugzilla). But as van said, these are listed in inventory as mtv1 and are not on their racks. I did find 'Bug 994916 - add missing pandas/tegras to buildbot configs', so they should be good to go from the buildbot end.

Rollout plan for getting these into production:

Tegra chunk (A) -> tegra-161, tegra-123, tegra-122:
1) find out where/who has these
2) get them re-imaged
3) step 1 from Tegra chunk B
4) step 2 from Tegra chunk B

Tegra chunk (B) -> tegra-203, tegra-205, tegra-275, tegra-287, tegra-198:
1) ensure they are in devices.json with a foopy listed
2) ensure tegra dirs exist with the right privs on the listed foopy

Unknowns: callek: do you know how we can find out where these tegras are? How do we get them back in scl3?

devices.json patch incoming.
Depends on: 928122, 994916
Flags: needinfo?(bugspam.Callek)
Summary: put tegras that were on loan back onto a foopy → put tegras that were on loan back onto a foopy and into production
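A sketch of the audit behind the "foopy is not listed" group above, on the same devices.json layout assumed in comment 0:

    import json

    with open('devices.json') as f:
        devs = json.load(f)

    orphans = sorted(name for name, dev in devs.items()
                     if name.startswith('tegra-') and not dev.get('foopy'))
    print('\n'.join(orphans))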
[correction] > van thinks these are still on loan. they are in scl1. they might not be in > devices.json at all, foopy therefore not listed, tegra dir does not exist -> > - tegra-161 > - tegra-123 > - tegra-122 s/scl1/mtv1
[correction] > Tegra chunk (B) -> tegra-203, tegra-205, tegra-275, tegra-287, tegra-198 Tegra chunk (B) -> tegra-203, tegra-205, tegra-275, tegra-287, tegra-198, tegra-202 /me can't type today
pete, for reference see: https://bugzilla.mozilla.org/show_bug.cgi?id=999930#c0 and https://bugzilla.mozilla.org/show_bug.cgi?id=999930#c8

I am:
- from the 3 in chunk A: adding the missing tegra from this list, ensuring that all 3 do not have a foopy yet, and updating their comments
- from the 6 in chunk B: adding 4 tegras to foopy31 and 2 tegras to foopy32. This would result in both foopy31 and foopy32 having 9 tegras (still lower than the average of 12 per foopy).

After r+ and landing this, I will create the tegra dirs and update the tools repo on the respective foopies.
Attachment #8412275 - Flags: review?(pmoore)
Looking at this now...
Comment on attachment 8412275 [details] [diff] [review]
140424-999930_add_tegras_back_to_foopies.patch

Review of attachment 8412275 [details] [diff] [review]:
-----------------------------------------------------------------

Great (and systematic) work Jordan.

I spotted in slavealloc that tegra-287 and tegra-275 are currently marked as being in the mtv1 data center, but should be in scl3: https://secure.pub.build.mozilla.org/slavealloc/ui/#slaves

I thought that the data is synced here periodically from inventory, so I checked inventory too, to see if it was also incorrect there. But alas, the data is correct there:
https://inventory.mozilla.org/en-US/systems/show/4781/
https://inventory.mozilla.org/en-US/systems/show/4769/

I'll ping :dustin to see if he knows why they might be out of sync. In any case it would be a good idea to mark them as scl3 in slavealloc too.

Once you create the directories on the foopies, it would be good to check /builds/watcher.log on each foopy to verify that each tegra gets picked up and reported as ok. Then of course check that they get seen by the buildbot master (either http://buildbot-master88.srv.releng.scl3.mozilla.com:8201/buildslaves?no_builders=1 or http://buildbot-master99.srv.releng.scl3.mozilla.com:8201/buildslaves?no_builders=1 I believe), and that a buildbot slave process gets started on the foopy for each of them (e.g. on the foopies: ps -ef | grep buildb[o]t). A rough version of these checks is sketched below.

If you see that the watcher is happy with them (you can also look at /builds/tegra-XXX/watcher.log for per-tegra output), that the buildbot slave is running on the foopy, and that the slaves connect to the masters, we should be good to go. I'm not sure how long it takes on average before one picks up a job, so just checking that they are connected is probably enough.

Looking forward to seeing you next week!
Pete

::: buildfarm/mobile/devices.json
@@ +458,5 @@
>      "pdu": "pdu3.r601-11.tegra.releng.scl3.mozilla.com",
>      "pduid": ".AA10"
>  },
>  "tegra-122": {
> +     "_comment": "ToDo: Bug 999930 - add to scl3 from mtv1, re-image, add foopy",

Is it worth including in the comment string that the devices are currently "Missing in Action"?
Attachment #8412275 - Flags: review?(pmoore) → review+
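A rough sketch of those post-deploy checks, run on the foopy itself. The log paths follow the review comment above; the device names and the buildbot process match pattern are assumptions:

    import os
    import subprocess

    for tegra in ('tegra-275', 'tegra-287'):   # the newly re-homed devices
        log = '/builds/%s/watcher.log' % tegra
        if os.path.exists(log):
            with open(log) as f:
                lines = f.readlines()
            print(tegra, 'watcher tail:', lines[-1].strip() if lines else '(empty)')
        else:
            print(tegra, 'no watcher.log yet')

        # rough equivalent of: ps -ef | grep buildb[o]t, scoped to one tegra
        ps = subprocess.run(['pgrep', '-f', 'buildbot.*%s' % tegra],
                            stdout=subprocess.PIPE)
        print(tegra, 'buildbot process running:', bool(ps.stdout.strip()))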
Hey Dustin, see comment 13 above - could this be a sync issue between inventory and slavealloc? (Or is the sync process I'm thinking of between inventory and mozpool, and there is no sync between inventory and slavealloc?) Thanks! Pete
Flags: needinfo?(dustin)
There's no automatic sync from inventory to slavealloc. There is for mozpool.
cheers pete, I'll let you know how I get on. thanks dustin - clearing your needinfo since that answers the question. I can update slavealloc to reflect inventory accordingly.
Flags: needinfo?(dustin)
Passes the json lint test - won't let that bite me again.
Attachment #8413014 - Flags: review?(aki)
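For reference, the lint step in its simplest form: json.load() raises an exception on malformed JSON, so a clean run means the patch parses (equivalent to: python -m json.tool devices.json > /dev/null):

    import json
    import sys

    path = sys.argv[1] if len(sys.argv) > 1 else 'devices.json'
    with open(path) as f:
        json.load(f)   # raises ValueError on any syntax error
    print('%s: json ok' % path)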
Attachment #8413014 - Flags: review?(aki) → review+
Comment on attachment 8413014 [details] [diff] [review]
140425-999930_add_tegras_back_to_foopies.patch

Pushed: https://hg.mozilla.org/build/tools/rev/e114be31d781

Slave health errors should be fixed.
Attachment #8413014 - Flags: checked-in+
Comment on attachment 8412275 [details] [diff] [review]
140424-999930_add_tegras_back_to_foopies.patch

Added 'Missing In Action' to the respective tegra comments. This has been checked in and deployed on foopy31: http://hg.mozilla.org/build/tools/rev/aab7a911d221
Attachment #8412275 - Flags: checked-in+
=== update ===

tegra-202, tegra-203, tegra-205, tegra-275: back in production and greening. Closing those tracking bugs for now.

tegra-287 threw a warning on its first job, robocop-2 (see problem tracking bug tegra-287).

tegra-198 gives errors about SUTAgent (see problem tracking bug tegra-198).
I am going to close this now, as the role of this bug was to add the tegras we had on hand back into the prod pool via devices.json. Tegras that never took jobs will be tracked in their respective bugs.
Status: NEW → RESOLVED
Closed: 11 years ago
Flags: needinfo?(bugspam.Callek)
Resolution: --- → FIXED
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard