Closed Bug 1131269 Opened 9 years ago Closed 9 years ago

use SETA data to disable unneeded tests

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

x86
macOS
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: kmoir, Assigned: kmoir)

References

Details

Attachments

(17 files, 14 obsolete files)

5.21 KB, patch
catlee
: review+
Details | Diff | Splinter Review
59.19 KB, patch
catlee
: review+
Callek
: checked-in-
Details | Diff | Splinter Review
5.19 KB, patch
kmoir
: checked-in+
Details | Diff | Splinter Review
877 bytes, patch
Callek
: review+
kmoir
: checked-in+
Details | Diff | Splinter Review
975 bytes, patch
kmoir
: checked-in+
Details | Diff | Splinter Review
734 bytes, patch
Details | Diff | Splinter Review
5.56 KB, patch
catlee
: review+
kmoir
: checked-in+
Details | Diff | Splinter Review
1.41 KB, patch
catlee
: review+
Details | Diff | Splinter Review
72.32 KB, patch
catlee
: review+
Details | Diff | Splinter Review
131.04 KB, patch
catlee
: review+
kmoir
: checked-in+
Details | Diff | Splinter Review
178.86 KB, patch
jmaher
: review+
kmoir
: checked-in+
Details | Diff | Splinter Review
83.06 KB, patch
jmaher
: review+
kmoir
: checked-in+
Details | Diff | Splinter Review
109.99 KB, patch
coop
: review+
Details | Diff | Splinter Review
142.86 KB, patch
jmaher
: review+
kmoir
: checked-in+
Details | Diff | Splinter Review
114.88 KB, patch
coop
: review+
Details | Diff | Splinter Review
114.85 KB, patch
kmoir
: checked-in+
Details | Diff | Splinter Review
1.34 KB, patch
coop
: review+
Details | Diff | Splinter Review
jmaher has data that describes which 10.8 tests need to run less frequently because his SETA project. 

He suggested implementing his findings for this might relieve some of the load on 10.8 test machines and allow us to move more quickly to 10.10 given delays in getting new hardware to add this to the pool.

(bug to implement these scheduling changes in our configs is in bug 1127527)
jmaher: can you provide the list for 10.8 debug?
Flags: needinfo?(jmaher)
the main data:
http://alertmanager.allizom.org/seta.html

What is useful here is that we need to run:
Mn X M(bc1 dt2 oth) R(R)

In general on debug we seem to need to run:
mochitest browser-chrome*
mochitest devtools*
xpcshell

If we want to get to the next tier, probably Reftest,Crashtest, and Jittests


the theory of some coverage is better than none should apply here until we fully retire 10.8 :)
Flags: needinfo?(jmaher)
jmaher: Okay thanks.  So in this case we are looking at disabling these specified 10.8 tests completely to allow us to deploy 10.10, not running them on a periodic basis as in the intent in the other SETA bug 1127527?
Flags: needinfo?(jmaher)
kmoir- I see a few options:

1) we run a limited set of tests on osx 10.8 overall (as identified here: http://alertmanager.allizom.org/seta.html), and in addition run a full set of 10.10 jobs.  This would increase the load by about 20% on the existing infrastructure (just in job volume, end to end time might add another bit of time)

2) we run a limited set of 10.8 and 10.10 jobs.  Probably the jobs identified by seta for 10.8, and a superset of 10.6+10.8 jobs identified by seta for 10.10.  This would be default and we would schedule the non default jobs as needed (highly coalesced)

3) we don't run debug 10.10 tests at all, only opt.  Given 1 month of data, we could then make data driven decision about which opt jobs to run, etc.  For 10.8 we would run a reduced set (either seta recommendations or some additional ones)

I view all 3 options as equal in weight and probably something to stand up for 4-8 weeks until we have more data and are ready to switch stuff off.  There are probably other options as well!  Maybe we only run 10.8 on aurora/beta/release/esr until 10.10 rolls through the uplift cycle :)
Flags: needinfo?(jmaher)
I talked to catlee about this this afternoon because I was stuck with my current approach which was not working. (modifying builderNames in decide_and_remove_changes to reflect current priority) 

He suggested that I take one of two approaches by modifying buildbotcustom/misc.py
1) 
-on bbconfig side add priorities
-add skipcount here from branch config
-map priority, change test_builders to a dictionary
-test_builders to a dictionary keys are priorities values are builder names
-schedulers are created
-pgo and unittest builders need to be updated to use dictionary
-create a new scheduler for every level in this structure

2)
-leave test builders alone
-loop through
-look at priority at another value i.e. platform config
-group together one with the same skip count
-create schedulers

Initially the data from seta could be implemented statically into the configs.  Later on when we are sure it is working correctly, we could parse the seta data upon reconfig and have it sit on the masters to be loaded by the configs
Since we are going to be turning off 10.8 debug soon in bug 1126493, I updated the bug description.  Perhaps disabling tests on win 7 would be more useful since it seems to have long wait times.
Summary: use SETA data to disable unneeded tests on debug for 10.8 → use SETA data to disable unneeded tests on Windows 7
Attached patch bug1131269bbconfigs.patch (obsolete) — Splinter Review
this is just a proof of concept patch
I'm just testing with 10.10 opt because I had a machine of that type attached to my master for other testing
The intent is that the real patches will be on another platform where we have SETA data showing tests can be disabled.
Attached patch bug1131269bbcustom.patch (obsolete) — Splinter Review
This is a proof of concept patch for a similar approach to the ones catlee suggested in our meeting on Friday
It runs p1 jobs every time
p1 jobs are specified in config.py
The jobs that are not in the p1 list are removed from the test_builders and run at the less often as specified in skipcount
They are removed because we run the p1 jobs every time and if they are not removed the p1 test run twice because we the patch calls the scheduler twice
I had to append "_p1" to the  scheduler_name otherwise there were duplicate scheduler names.  Not sure if this is kosher.
This is really just implemented for opt tests now, will fix that

Anyways, feedback on whether this approach is on the right path is appreciated
Attachment #8575466 - Flags: feedback?(catlee)
Comment on attachment 8575466 [details] [diff] [review]
bug1131269bbcustom.patch

Review of attachment 8575466 [details] [diff] [review]:
-----------------------------------------------------------------

Good start!

One thing I'm worried about is when we have multiple levels of priorities here - the code will get ugly.

What if we tried to make everything use skipcounts/timeouts? something like this:

test_suites = branch_config['platforms'][platform][slave_platform][unittest_suites]
# Mapping of skipcount/timeout to list of test suites
# with that skip count
suites_by_skipconfig = defaultdict(list)
for test in test_suites:
    skipcount = test_suites.get('skipcount', 0)
    skiptimeout = test_suites.get('skiptimeout', 0)

    suites_by_skipconfig[skipcount, skiptimeout].append(test)

# Create a new Scheduler for every skip config
for (skipcount, skiptimeout), test_builders in suites_by_skipconfig.items():
    scheduler_class = Scheduler
    extra_args = {}
    if skipcount > 0:
        scheduler_class = EveryNthScheduler
        extra_args['n'] = skipcount
        extra_args['idleTimeout'] = skiptimeout

    branchObjects['schedulers'].append(scheduler_class(
        name=scheduler_name,
        branch=scheduler_branch,
        builderNames=test_builders,
        treeStableTimer=None,
        **extra_args
    ))


Then you can set skipcount=0 for all the tests in the p1_{opt,debug}_tests definition in buildbot-configs.

::: misc.py
@@ +2726,5 @@
> +                                            branch=scheduler_branch,
> +                                            builderNames=p1_tests,
> +                                            treeStableTimer=None,
> +                                            **extra_args
> +                                        ))                                        

did you mean this block to be only for 'opt' tests?
Attachment #8575466 - Flags: feedback?(catlee) → feedback+
Attached patch bug1131269mar13.patch (obsolete) — Splinter Review
This the state of the patch that from your earlier suggestions. It doesn't work because the builder names are in the wrong format when passed to the scheduler

Which leads me to some questions
Currently, test_builders are defined like this
 ['Rev4 MacOSX Snow Leopard 10.6 mozilla-b2g37_v2_2 opt test mochitest-1', 'Rev4 MacOSX Snow Leopard 10.6 mozilla-b2g37_v2_2 opt test mochitest-2', ...]

The test_suites from this new line
test_suites = branch_config['platforms'][platform][slave_platform][unittest_suites]

are defined like this

tests_suites[('mochitest', {'blob_upload': True, 'extra_args': ['--mochitest-suite', 'plain-chunked'], 'totalChunks': 5, 'use_mozharness': True, 'script_path': 'scripts/desktop_unittest.py', 'script_maxtime': 7200}), ('mochitest-other', {'use_mozharness': True, 'extra_args': ['--mochitest-suite', 'chrome,a11y,plugins'], 'script_path': 'scripts/desktop_unittest.py', 'blob_upload': True, 'script_maxtime': 7200}), ('reftest', {'use_mozharness': True, 'extra_args': ['--reftest-suite', 'reftest'], 'script_path': 'scripts/desktop_unittest.py', 'blob_upload': True, 'script_maxtime': 7200}), ('jsreftest', {'use_mozharness': True, 'extra_args': ['--reftest-suite', 'jsreftest'], 'script_path': 'scripts/desktop_unittest.py', 'blob_upload': True, 'script_maxtime': 7200}), ('crashtest', {'use_mozharness': True, 'extra_args': ['--reftest-suite', 'crashtest'], 'script_path': 'scripts/desktop_unittest.py', 'blob_upload': True, 'script_maxtime': 7200}), ('xpcshell', {'use_mozharness': True, 'extra_args': ['--xpcshell-suite', 'xpcshell'], 'script_path': 'scripts/desktop_unittest.py', 'blob_upload': True, 'script_maxtime': 7200}), ('cppunit', {'use_mozharness': True, 'extra_args': ['--cppunittest-suite', 'cppunittest'], 'script_path': 'scripts/desktop_unittest.py', 'blob_upload': True, 'script_maxtime': 7200}), ('jittest', {'use_mozharness': True, 'extra_args': ['--jittest-suite', 'jittest'], 'script_path': 'scripts/desktop_unittest.py', 'script_maxtime': 7200}), ('mochitest-gl', {'use_mozharness': True, 'extra_args': ['--mochitest-suite', 'mochitest-gl'], 'script_path': 'scripts/desktop_unittest.py', 'blob_upload': True, 'script_maxtime': 12000})]

so when you append a test to 
suites_by_skipconfig[skipcount, skiptimeout].append(test) it is in this format not the list of test builder names the scheduler expects

How did you anticipate defining tests in our configs so they have scheduling defined with the granularity of we need.  Right now it's 
BRANCHES['mozilla-inbound']['platforms']['macosx64']['yosemite']['p1_opt_tests'] = ['Rev5 MacOSX Yosemite 10.10 mozilla-inbound opt test xpcshell', 'Rev5 MacOSX Yosemite 10.10 mozilla-inbound opt test reftest', 'Rev5 MacOSX Yosemite 10.10 mozilla-inbound opt test mochitest-other', 'Rev5 MacOSX Yosemite 10.10 mozilla-inbound opt test mochitest-browser-chrome-1']
Flags: needinfo?(catlee)
if I can provide the data in any other format, do speak up!
There are a few things we could do in the configs to specify this. Initially I was thinking we could modify the test_suites configs directly, so you'd have something like this:

('mochitest', {'blob_upload': True, 'extra_args': ['--mochitest-suite', 'plain-chunked'], 'totalChunks': 5, 'use_mozharness': True, 'script_path': 'scripts/desktop_unittest.py', 'script_maxtime': 7200, 'skipcount': 5})

that could be tricky to maintain in our current layout, especially if we have different priorities per branch.

how about if we add a new config item in the platform configs, like

BRANCHES['mozilla-inbound']['platforms']['macosx64']['yosemite']['skipconfig']

and this is a dict of test type, suitename to the skip count, skip timeout. e.g.
{('opt', 'mochitest'): (5, 1800)}

Then in generateBranchObjects we look up the test in that dict and set the skip parameters accordingly, defaulting to no skipping if we can't find anything.
Flags: needinfo?(catlee)
Attached patch bug1131269mar26.patch (obsolete) — Splinter Review
These are the latest patches I have but the scheduling is not working as expected so I'm doing some further testing to see the cause of that.

The tests are defined in buildbot-configs/mozilla-tests/config.py as follows as an example

BRANCHES['mozilla-inbound']['platforms']['macosx64']['yosemite']['skipconfig'] = {('debug', 'mochitest-3'): (5, 1800),
                                                                                  ('debug', 'mochitest-4'): (5, 1800),
                                                                                  ('debug', 'mochitest-browser-chrome-1'): (5, 1800),
                                                                                  ('opt', 'mochitest-1'): (5, 1800),
                                                                                  ('opt', 'mochitest-other'): (5, 1800)
                                                                                 }
The problem with scheduling is that if I invoke five sendchanges, the following behaviour occurs:
-no jobs are scheduled for the jobs defined with skiptimeout of 5 and skiptimeout of 1800 for the first four senchanges
-when the fifth sendchange is invoked, there are five jobs listed as pending for the tests that have a skipcount of 5 and a skiptimeout of 1800
-the desired behavior is to have only one pending job for the types of tests we want to run on a less frequent basis as specified in skipcount and skiptimeout

so I'm debugging that problem now
I think that's expected and wanted - the jobs should be coalesced together in a single run once a slave picks them up.

If you were only to get a single pending job, it would make regression hunting much more difficult.
Okay thanks for the explanation.  It looks like it coalesced the 5 pending jobs into  2 actual jobs each test type. I would have expected it to run only one.
I think that probably comes from here:
https://github.com/mozilla/build-buildbotcustom/blob/master/misc.py#L614

which is referenced here:
https://github.com/mozilla/build-buildbotcustom/blob/master/misc.py#L672

If that's the case, then we should bump the builderMergeLimits to at least 5 (or whatever our skipcount is) for these builders.
Okay thanks. Perhaps I'll adjust those a bit later.  I just used those skipcounts as an example.
Attachment #8583822 - Flags: feedback?(catlee)
Comment on attachment 8583822 [details] [diff] [review]
bug1131269mar26.patch

Review of attachment 8583822 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm!

need to handle the try case, and possibly the PGO case as well?

::: misc.py
@@ +2708,5 @@
>                                  extra_args['chooserFunc'] = tryChooser
>                                  extra_args['prettyNames'] = prettyNames
>                                  extra_args['unittestSuites'] = unittestSuites
>                                  extra_args['buildersWithSetsMap'] = builders_with_sets_mapping
>                                  extra_args['buildbotBranch'] = branch

we'll need to create schedulers for try here now that the logic for creating schedulers has moved into the 'else' block below.

@@ +2710,5 @@
>                                  extra_args['unittestSuites'] = unittestSuites
>                                  extra_args['buildersWithSetsMap'] = builders_with_sets_mapping
>                                  extra_args['buildbotBranch'] = branch
>                              else:
> +                                scheduler_class = Scheduler                               

probably don't need this any more since you're setting the class down below
Attachment #8583822 - Flags: feedback?(catlee) → feedback+
Attached patch bug1131269mar31.patch (obsolete) — Splinter Review
still have to changes for add pgo builders
Attached patch bug1131269bbconfigs.patch (obsolete) — Splinter Review
patch to specify SkipConfig definitions in a standalone file
-Right now this is generated from a script that parses the seta data
-long term plan is to have it generated upon reconfig
-I removed mountain lion definitions since that platform is on it's way out
for other platforms, I replaced it with the SETA data on m-i except for yosemite which lacks seta data and ubuntu32_vm which is a todo
Attached patch bug1131269bbconfigs-2.patch (obsolete) — Splinter Review
Attachment #8586382 - Attachment is obsolete: true
:jmaher where is the seta data for the pgo tests 
http://alertmanager.allizom.org/data/setadetails/?date=2015-03-30&buildbot=1&branch=mozilla-inbound&inactive=1
Should I assume it to be the same as the unittest data?
Flags: needinfo?(jmaher)
ack, I hadn't looked at that!  We can have pgo in the database, but for now we had planned on ignoring it as it was run so infrequent.  

this is something we can do- it might take a bit of time to massage and update our tooling, maybe a 2 days.  We can get an api with preliminary data in it pretty quickly as in a few hours of hacking tomorrow.  Making it fit into the other queries and the UI on the webpage will take a bit of hacking.  Also backfilling the data will take a few hours, but it is very doable.
Flags: needinfo?(jmaher)
:kmoir, would you like the pgo data in the same query, or a different query?
Flags: needinfo?(kmoir)
If pgo is so infrequent, is it worth running reducing the tests we run?  The patches for unittests look good so far we could deploy them first and look at pgo later.  IF we do decide it's worth it probably a different query would be useful or an option on the existing query
Flags: needinfo?(kmoir)
we will get the data ready regardless.  My thoughts on pgo were also around the long build times (especially windows) and if we were backfilling, that would be a long wait.  Assuming this works and is fairly painless for sheriffs and developers, we could investigate pgo and have more conservative recommendations there.  We also don't have android/b2g stuff, that is something we wanted to tackle after getting the desktop stuff done up.
hold off on pgo until we determine it's worth it and have data
you actually need 

the
scheduler_class = Scheduler

the review mentioned needed to be removed otherwise checkconfig fails on talos
Attachment #8586377 - Attachment is obsolete: true
Attachment #8586927 - Attachment is obsolete: true
Attachment #8587001 - Flags: review?(catlee)
Attachment #8587441 - Flags: review?(catlee)
Comment on attachment 8587001 [details] [diff] [review]
bug1131269apr1.patch

Review of attachment 8587001 [details] [diff] [review]:
-----------------------------------------------------------------

::: misc.py
@@ +2734,5 @@
> +                                        test_name = test.split()[-1]
> +                                        if (test_type, test_name) in branch_config['platforms'][platform][slave_platform]['skipconfig']:
> +                                            skipcount, skiptimeout = branch_config['platforms'][platform][slave_platform]['skipconfig'][test_type, test_name]                                
> +                                    suites_by_skipconfig[skipcount, skiptimeout].append(test)
> +                                

style nit: trailing whitespace

@@ +2738,5 @@
> +                                
> +                                # Create a new Scheduler for every skip config
> +                                for (skipcount, skiptimeout), test_builders in suites_by_skipconfig.items():
> +                                    scheduler_class = Scheduler
> +                                    s_name=scheduler_name

style nit: should have spaces around '=' here

@@ +2745,5 @@
> +                                    if skipcount > 0:
> +                                        scheduler_class = EveryNthScheduler
> +                                        extra_args['n'] = skipcount
> +                                        extra_args['idleTimeout'] = skiptimeout
> +                                        s_name=scheduler_name + "-" + str(skipcount) + "-"  + str(skiptimeout)

and here
Attachment #8587001 - Flags: review?(catlee) → review+
Attachment #8587441 - Flags: review?(catlee) → review+
patch with style nits and whitespace fixed
Attachment #8575462 - Attachment is obsolete: true
Attachment #8577254 - Attachment is obsolete: true
Attachment #8575466 - Attachment is obsolete: true
Attachment #8583822 - Attachment is obsolete: true
Attachment #8587441 - Flags: checked-in+
Attachment #8587504 - Flags: checked-in+
Summary: use SETA data to disable unneeded tests on Windows 7 → use SETA data to disable unneeded tests
The remaining two parts of this bug are
1) To parse the SETA data upon reconfig and update config_seta.py with the latest SETA data
2) To adjust config_seta.py so the builderMergeLimits can be greater than 3
https://github.com/mozilla/build-buildbotcustom/blob/master/misc.py#L614

Catlee, for adjusting the merge limit, was your intent when we discussed this before to raise the overall default limit or just limit it to a list of builders that we know need to run less often according to SETA data?
Flags: needinfo?(catlee)
My intent was to modify builderMergeLimits in the same place in the code where we're processing the skipcount settings. Would it make sense to add an entry in builderMergeLimits for each builder we've got a skipcount set for?
Flags: needinfo?(catlee)
(In reply to Justin Wood (:Callek) from comment #35)
> Backed out with http://hg.mozilla.org/build/buildbot-configs/rev/8d4058f55d28
> 
> Due to https://travis-ci.org/mozilla/build-buildbot-configs/builds/57342142

...

Presumably we need to make sure its symlinked in the master dir (via setup-master) and create/use a fabric action to create the symlink on all existing masters.
Just landed config_seta.py by itself so I can add a run a custom fabric action to add the link on all masters later
custom fabric action to enable link to config_seta.py
Attachment #8589228 - Flags: review?(bugspam.Callek)
Attached patch bug1131269setupmaster.patch (obsolete) — Splinter Review
add new symlink when setting up master for config_seta.py
Attachment #8589245 - Flags: review?(bugspam.Callek)
Attachment #8589228 - Flags: review?(bugspam.Callek) → review+
Attachment #8589245 - Flags: review?(bugspam.Callek) → review+
Actually this fixed the failing build and I setup a tests dev-master and verified it had the correct link
Attachment #8589245 - Attachment is obsolete: true
Attachment #8589228 - Flags: checked-in+
Attachment #8589273 - Flags: checked-in+
have to add new config file to scheduler too
I looked at treeherder this morning and it looks like the tests are running at the lower intervals as expected on m-i and fx-team as specified in config_seta.py
remove excess spaces
Attachment #8590864 - Flags: review?(catlee)
Attachment #8590864 - Flags: review?(catlee) → review+
Attachment #8590864 - Flags: checked-in+
override merge limit for specific test with the value of skipcount
Attachment #8591031 - Flags: review?(catlee)
Attachment #8591031 - Flags: review?(catlee) → review+
jmaher, Are there other branches that this should be enabled on besides m-i and fx-team?  
I assume this is not something that will ride the trains but rather we will enable on a per branch basis?
I'm just trying to define the scope for my Q2 goal.  Since it seems to be working now, my next steps are to
1) enable skipcount = 5 for the tests from the seta data now that we can override merge limits
2) Finish code that consumes SETA data and updates seta_config.py dynamically
Flags: needinfo?(jmaher)
we should have a good way to see the impact this has by mid next week!

Future work could be:
* android
* b2g
* adding in other jobs (such as hidden ones)
* multi tiered jobs (i.e. some run every 5th build, other run every 10th builds)
* expanding to other branches, maybe m-c (which would include pgo work)

all of these above mentioned items are ideas, it would require more analysis and work before moving on them.  Maybe the Q2 goals is finished your two items, see how this works for 1 month, then figure out if we want to tackle any other items in late Q2 if time permits or schedule them for Q3.

Thanks for making this happen!
Flags: needinfo?(jmaher)
Adjust frequency that seta tests run from 3 to 5 now that the builderMergeLimits issue has been fixed.  This was originally requested in the early seta meetings https://etherpad.mozilla.org/SETA-mtg-notes-20150211
Attachment #8591700 - Flags: review?(catlee)
Assignee: nobody → kmoir
Attachment #8591700 - Flags: review?(catlee) → review+
Attached patch setaapr17.patchSplinter Review
Changes to seta config based on today's data.  I have been reworking the script that generates so the order changed quite a bit. Also, there is enough data on yosemite to include it now.  Due to a bug in my script, linux64 tests were not included originally, this rectifies it.
Attachment #8594109 - Flags: review?(catlee)
Comment on attachment 8594109 [details] [diff] [review]
setaapr17.patch

Review of attachment 8594109 [details] [diff] [review]:
-----------------------------------------------------------------

\o/
Attachment #8594109 - Flags: review?(catlee) → review+
Attached patch bug1131269generateseta.patch (obsolete) — Splinter Review
This is the script I use to currently generate the seta data.  I'm not sure of the best way to run it upon reconfig, while committing and updated copy of config_seta.py to the repo.
Attachment #8594155 - Flags: feedback?(catlee)
Attachment #8594109 - Flags: checked-in+
Talked to coop about this in our 1x1 today, he suggested since the impact of SETA test reduction has been offset by more tests being enabled we should only run tests on every 10th run and with a longer timeout. I'll prepare patches.
as a data point, the first full day of linux64 + osx10.10 using SETA was yesterday (april 22), and we had the lowest jobs/push for any given day since march 1.  With that said, it is only ~20% lower.  I think this comes on the heels of e10s jobs being added and extra chunks being added.  Also we removed the forced coalescing of debug jobs on some platforms and replaced it with SETA.

It is too early to tell the data, but I am leaning towards between 18-25% reduction in jobs prior to SETA.  Adding android will help a lot, so will adjusting the skip count and timeout.  Looking forward to seeing the changes in the near future!
patch to run seta identified tests on only every 10 commits and 3x the timeout
Attachment #8596565 - Flags: review?(jmaher)
Comment on attachment 8596565 [details] [diff] [review]
bug1131269seta.patch

Review of attachment 8596565 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm!  thanks for doing this.
Attachment #8596565 - Flags: review?(jmaher) → review+
Attachment #8596565 - Flags: checked-in+
Attached patch bug1131269generateseta-2.patch (obsolete) — Splinter Review
Attachment #8594155 - Attachment is obsolete: true
Attachment #8594155 - Flags: feedback?(catlee)
today's seta data
also, the timeout was incorrectly configured for mobile
Attached patch bug1131269generate.patch (obsolete) — Splinter Review
Attachment #8597444 - Attachment is obsolete: true
Attachment #8599465 - Flags: review?(jmaher)
Comment on attachment 8599467 [details] [diff] [review]
bug1131269generate.patch

This is the script I use to currently generate the seta data.  I'm not sure of the best way to run it upon reconfig, while committing and updated copy of config_seta.py and config_seta_mobile.py toto the repo.
Attachment #8599467 - Flags: feedback?(catlee)
Comment on attachment 8599465 [details] [diff] [review]
bug1131269setaapr29.patch

Review of attachment 8599465 [details] [diff] [review]:
-----------------------------------------------------------------

these adjustments look fine!
Attachment #8599465 - Flags: review?(jmaher) → review+
Attachment #8599465 - Flags: checked-in+
Comment on attachment 8599467 [details] [diff] [review]
bug1131269generate.patch

Review of attachment 8599467 [details] [diff] [review]:
-----------------------------------------------------------------

Overall, I'm unsure of having code that generates python code in this manner. Would it make sense to have your code pull down the SETA data as json, and make the appropriate modifications to the BRANCHES structure?

::: mozilla-tests/generate_seta.py
@@ +9,5 @@
> +skipconfig_defaults = "10, 5400"
> +# todo: should get platform names from PLATFORMS in config.py
> +
> +today = date.today()
> +today = str(today.year) + "-" + str(today.month) + "-" + str(today.day)

could use today.strftime("%Y-%m-%d")?

@@ +26,5 @@
> +
> +def get_seta_platforms(branch, platform_filter):
> +
> +    url = "http://alertmanager.allizom.org/data/setadetails/?date=" + today + "&buildbot=1&branch=" + branch + "&inactive=1"
> +    response = ""

no need for this if you're re-raising exceptions in all the cases below

@@ +31,5 @@
> +    try:
> +        response = urllib2.urlopen(url)
> +    except urllib2.HTTPError, e:
> +            print('HTTPError = ' + str(e.code))
> +            raise

you've got some inconsistent indentation levels here and below

@@ +84,5 @@
> +    try:
> +        f = open(filename, 'a')
> +    except IOError as e:
> +        print "I/O error({0}): {1}".format(e.errno, e.strerror)
> +        raise

this kind of file access can generally be rewritten as:

with open(filename, 'a') as f:
    for sp in test_dict:
        ...
        f.write(format(....))

this will automatically close 'f' when leaving the 'with' block

is there a reason you need the custom IOError output?
Attachment #8599467 - Flags: feedback?(catlee)
Attached patch bug1131269generate-2.patch (obsolete) — Splinter Review
I updated the script with the feedback you mentioned.  The way that the script generates the files is to 
generate_seta.py desktop 
-> creates config_seta.py
generate_seta.py mobile
-> creates config_seta_mobile.py

regarding this comment, I'm not sure what you mean.  

>>Overall, I'm unsure of having code that generates python code in this manner. >>Would it make sense to have your code pull down the SETA data as json, and >>make the appropriate modifications to the BRANCHES structure?

It does pull down the SETA data as json and creates new files with the BRANCHES data that are loaded by config.py and mobile_config.py
Attachment #8599467 - Attachment is obsolete: true
Flags: needinfo?(catlee)
Attached patch bug1131269rework.patch (obsolete) — Splinter Review
patch to load seta config data dynamically for mobile and desktop
Attachment #8601574 - Attachment is obsolete: true
Attachment #8603328 - Flags: review?(catlee)
updated bitrotten patch to update upon reconfig
update seta data manually until other patches are approved
there weren't any changes to the mobile data when I ran the script
Attachment #8612307 - Flags: review?(jmaher)
Comment on attachment 8612307 [details] [diff] [review]
bug1131269dataupdate.patch

Review of attachment 8612307 [details] [diff] [review]:
-----------------------------------------------------------------

thanks!
Attachment #8612307 - Flags: review?(jmaher) → review+
Attachment #8612307 - Flags: checked-in+
Attachment #8610569 - Flags: review?(coop)
Attachment #8603328 - Flags: review?(catlee)
Attachment #8603328 - Attachment is obsolete: true
Comment on attachment 8610569 [details] [diff] [review]
bug1131269rework-2.patch

Review of attachment 8610569 [details] [diff] [review]:
-----------------------------------------------------------------

Some nits, but otherwise good.

::: mozilla-tests/config_seta.py
@@ +9,5 @@
> +skipconfig_defaults = (10, 5400)
> +# todo: should get platform names from PLATFORMS in config.py
> +
> +today = date.today()
> +today = today.strftime("%Y-%m-%d")

You should be able to chain these:

date.today().strftime("%Y-%m-%d")

@@ +55,5 @@
> +            platforms.append(platform)
> +    return(platforms)
> +
> +
> +def sort_android_tests(plat, slave_plat, tests):

Do we gain anything by using shorter variable names here? I'd prefer the full name for "platform" and "slave_platform," and "tests_by_slave_platform" before.

@@ +58,5 @@
> +
> +def sort_android_tests(plat, slave_plat, tests):
> +
> +    #create a dictionary that maps slave platform to tests
> +    #initialize the dictionary of tests per platform

Switch to docstrings and delete the blank line at the start of each function.

@@ +90,5 @@
> +def define_configs(branch, platforms, BRANCHES):
> +
> +    for p in platforms:
> +        tests = []
> +        for l in c['jobtypes'][today]:

"l" as a variable name is confusing. I thought it was 1 when I first scanned it in bugzilla. If it represents a job, let's use "j" or better still, "job."
Attachment #8610569 - Flags: review?(coop) → review+
Flags: needinfo?(catlee)
Attached patch bug1131269rework-3.patch (obsolete) — Splinter Review
patch updated with review comments
Attachment #8614286 - Flags: checked-in+
Attachment #8614286 - Flags: checked-in+ → checked-in-
variable was initialized in the wrong place 

diff compared to previous patch is
< @@ -1,794 +1,115 @@
---
> @@ -1,794 +1,116 @@
898a899
> +        test_config = {}
904c905
< +            test_config = {}

running tests in staging now, seems to be going fine
Attachment #8614286 - Attachment is obsolete: true
Comment on attachment 8615471 [details] [diff] [review]
bug1131269rework-4.patch

testing that ran on my master last night ran the jobs as expected
Attachment #8615471 - Flags: review?(coop)
Comment on attachment 8615471 [details] [diff] [review]
bug1131269rework-4.patch

Review of attachment 8615471 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with nits addressed.

::: mozilla-tests/config_seta.py
@@ +76,5 @@
> +def print_configs(branch, plat, test_dict, BRANCHES):
> +
> +    test_config = {}
> +    for sp in test_dict:
> +        test_config = {}

Now you're defining this twice. Is the inner scope sufficient?

@@ +82,5 @@
> +            test = t.split()[-1]
> +            test_type = t.split()[-3]
> +            test_config[test_type, test] = skipconfig_defaults
> +            BRANCHES[branch]['platforms'][plat][str(sp)]['skipconfig'] = test_config
> +            

Remove whitespace.
Attachment #8615471 - Flags: review?(coop) → review+
patch with feedback from review

It looks like this is working on m-i and fx-team after I ran a reconfig this morning
Attachment #8616054 - Flags: checked-in+
noticed my dev-master was reporting errors because Joel enabled 4.3 data as I requested in bug 1155362 

this will fix it
Attachment #8616144 - Flags: review?(coop)
Comment on attachment 8616144 [details] [diff] [review]
bug1131269android4-3.patch

I landed this because if this is a reconfig over the weekend without this patch there will be key errors on the masters
Attachment #8616144 - Flags: review?(coop) → review+
Fixed
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard
You need to log in before you can comment on or make changes to this bug.