Closed Bug 1280474 Opened 8 years ago Closed 7 years ago

Run firefox build tests on TaskCluster Windows worker types

Categories

(Release Engineering :: General, defect)

x86
Windows 7
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: grenade, Assigned: grenade)

References

Details

Attachments

(2 files, 2 obsolete files)

      No description provided.
Depends on: 1280476
Depends on: 1280952
Depends on: 1280953
Depends on: 1281771
No longer blocks: 1285576
Attached patch wip (obsolete) — Splinter Review
dustin: i'm hoping you can give me some pointers. i have @payload_builder('generic-worker') and @worker_setup_function('generic-worker') methods. i'm not sure where the decision for which implementation to use, lives. can you point me in the right direction?
Flags: needinfo?(dustin)
Attachment #8775932 - Attachment is patch: true
Comment on attachment 8775932 [details] [diff] [review]
wip

Review of attachment 8775932 [details] [diff] [review]:
-----------------------------------------------------------------

The worker_setup_function in make_task_description.py is called based on test['worker-implementation'].  In https://hg.mozilla.org/users/dmitchell_mozilla.com/mozilla-central/file/7c7b271f567c/taskcluster/taskgraph/transforms/builds/make_task_description.py I extended that to take a test['run']['using'] selector so that I could indicate how to run the task differently for Windows and Linux, since they do not really inter-map.

The payload_builder in make_task.py is called based on the value of task['worker-implementation'].

I'm working on some of the same problems with builds as you are here with tests.  I don't have great solutions yet -- maybe test['run'] is a list keyed on the platform?? -- so I invite you to explore the design space a little and maybe you'll come up with something I can use for builds!

::: taskcluster/taskgraph/transforms/make_task.py
@@ +147,5 @@
> +        # worker features that should be enabled
> +        Required('relengapi-proxy', default=False): bool,
> +        Required('allow-ptrace', default=False): bool,
> +        Required('loopback-video', default=False): bool,
> +        Required('loopback-audio', default=False): bool,

I don't think generic-worker supports any? most? of these things, right?

@@ +284,5 @@
>          payload['capabilities'] = capabilities
>  
>  
> +@payload_builder('generic-worker')
> +def build_generic_worker_payload(config, task, task_def):

You can copy this, and the schema def, from https://hg.mozilla.org/users/dmitchell_mozilla.com/mozilla-central/file/7c7b271f567c/taskcluster/taskgraph/transforms/make_task.py#l131

Note that I didn't implement caches, though that should be easy enough to add.

::: taskcluster/taskgraph/transforms/tests/make_task_description.py
@@ +109,5 @@
>  @worker_setup_function("docker-engine")
>  @worker_setup_function("docker-worker")
>  def docker_worker_setup(config, test, taskdesc):
> +
> +    ARTIFACTS = [

best for this to be lower-cased since it's not a "constant" anymore
Flags: needinfo?(dustin)
This is not complete, but may be handy for try pushes.
Comment on attachment 8782231 [details] [diff] [review]
more wip - update desktop yml for windows

as a note many of these test types will not run on AWS win7 VMs- also some need the expensive VMs with dedicated GPU's- we should file a bug to get that instance type available to taskcluster and figure out how to by-test-platform the instance type :)
Depends on: 1300135
Depends on: 1300901
Depends on: 1265537
there is no longer a dependency for windows tests on simple package names. eg: the tests can correctly determine the full package name including version. see: https://hg.mozilla.org/try/rev/fc0ffe8510563a28f97baaa2624dfe6249867c0d#l10.128
No longer depends on: 1265537
Summary: Run win32 firefox build tests on TaskCluster 32 bit Windows 7 worker types → Run firefox build tests on TaskCluster Windows worker types
Depends on: 1303305
Comment on attachment 8791958 [details]
Bug 1280474 - enable tc win cpp;

https://reviewboard.mozilla.org/r/79228/#review77796

Just some drive-by thoughts...feel free to ignore for now.

::: taskcluster/ci/desktop-test/test-sets.yml:77
(Diff revision 1)
> +    - mochitest-gpu
> +    - mochitest-jetpack
> +    - mochitest-media
> +    - mochitest-webgl
> +    - reftest
> +    - reftest-no-accel

Some of these tests are not currently running on "Windows 7 VM"; maybe we shouldn't be trying to run them yet? (mochitest-gpu, mochitest-webgl, reftest, reftest-no-accel)

::: taskcluster/taskgraph/transforms/tests/make_task_description.py:306
(Diff revision 1)
> +    worker['max-run-time'] = test['max-run-time']
> +
> +    worker['artifacts'] = artifacts
> +
> +    # assemble the command line
> +    mh_command = ['c:\\mozilla-build\\python\\python.exe', '-u', normpath(mozharness['script'])]

Could we unify the mozharness command construction for docker and generic? Just break it out into a helper function so the code isn't duplicated.
Comment on attachment 8791958 [details]
Bug 1280474 - enable tc win cpp;

https://reviewboard.mozilla.org/r/79228/#review77796

> Some of these tests are not currently running on "Windows 7 VM"; maybe we shouldn't be trying to run them yet? (mochitest-gpu, mochitest-webgl, reftest, reftest-no-accel)

this patch enables them as tier 2 on staging (allizom) only, they will not be very visible there. my hope was to move them to production individually (as they go green) and to later move them up the tier (as we develop confidence in their validity)

> Could we unify the mozharness command construction for docker and generic? Just break it out into a helper function so the code isn't duplicated.

they're actually pretty different. generic-worker takes a list of commands whereas docker-worker takes a single command. as such d-w uses batch files to wrap the commands that need to be run, whereas g-w just expects the actual commands. i suppose the helper function could take an argument for the type of command it's generating (d-w vs g-w) but that's sorta the problem being addressed by the worker_setup_functions, i think.
Status: NEW → ASSIGNED
Comment on attachment 8791958 [details]
Bug 1280474 - enable tc win cpp;

https://reviewboard.mozilla.org/r/79228/#review77796

> this patch enables them as tier 2 on staging (allizom) only, they will not be very visible there. my hope was to move them to production individually (as they go green) and to later move them up the tier (as we develop confidence in their validity)

Best not to confuse *treeherder* staging with tiers.  If something's known to fail, it's probably best to run it only on-demand on try, at which point tiers don't matter, but it's fine to report to production treeherder.  If a number of people are collaborating to quash the oranges, then it's OK to land it into mozilla-central, preferably still configured to run only on try.  Once a suite is pretty much all green, you can turn it on in integration and release branches at tier 2 to increase confidence before bumping to tier 1.  But in any case, reporting to production treeherder.

> they're actually pretty different. generic-worker takes a list of commands whereas docker-worker takes a single command. as such d-w uses batch files to wrap the commands that need to be run, whereas g-w just expects the actual commands. i suppose the helper function could take an argument for the type of command it's generating (d-w vs g-w) but that's sorta the problem being addressed by the worker_setup_functions, i think.

This will eventually be handled with job descriptions pending bug 1302192, so best not to worry too much right now.
Comment on attachment 8791958 [details]
Bug 1280474 - enable tc win cpp;

https://reviewboard.mozilla.org/r/79228/#review77862

I really like the form of this.  I'm worried about the platform names, but maybe that's the way it has to be (in which case some comments in desktop_test.py would help).  I'm worried about the proliferation of checks for the "win" prefix -- I think it's OK in the YAML, but would like to limit it and be explicit in the Python.  And I'm worried about workerType names..

::: taskcluster/ci/desktop-test/test-platforms.yml:40
(Diff revision 1)
> +win64/debug:
> +    build-platform: win64/debug
> +    test-set: windows-tests
> +win64/opt:
> +    build-platform: win64/opt
> +    test-set: windows-tests

Should these test platforms be more descriptive (e.g., "winxp", "win7")?

::: taskcluster/taskgraph/transforms/tests/all_kinds.py:30
(Diff revision 1)
>      """Set the worker implementation based on the test platform."""
>      for test in tests:
> -        # this is simple for now, but soon will not be!
> +        if test['test-platform'].startswith('win'):
> +            test['worker-implementation'] = 'generic-worker'
> +        else:
> -        test['worker-implementation'] = 'docker-worker'
> +            test['worker-implementation'] = 'docker-worker'

I would like to limit the proliferation of the "platform.startswith" logic.  What do you think about adding a preliminary transform that sets something like test['worker-platform'] to 'win', 'linux', .., and then subsequent transforms can use that value?  I'd like Wander's input on this too..

::: taskcluster/taskgraph/transforms/tests/desktop_test.py:42
(Diff revision 1)
>          'linux64-asan/opt': 'linux64/asan',
>          'linux64-pgo/opt': 'linux64/pgo',
> +        'win32/opt': 'windows7-32-vm/opt',
> +        'win32/debug': 'windows7-32-vm/debug',
> +        'win64/opt': 'windows10-64/opt',
> +        'win64/debug': 'windows10-64/debug',

Following on my earlier comment about platforms -- ideally these would be the same and not need to be translated.  I suppose the difference here is that the strings on the left are necessary for try syntax, and the strings on the right are how the existing jobs are reported to treeherder?

::: taskcluster/taskgraph/transforms/tests/desktop_test.py:101
(Diff revision 1)
> -            assert test['instance-size'] != 'legacy',\
> +                assert test['instance-size'] != 'legacy',\
> -                   'Software GL layers on a legacy instance is disallowed (bug 1296086).'
> +                       'Software GL layers on a legacy instance is disallowed (bug 1296086).'
>  
> -            # This should be set always once bug 1296086 is resolved.
> +                # This should be set always once bug 1296086 is resolved.
> -            test['mozharness'].setdefault('extra-options', [])\
> +                test['mozharness'].setdefault('extra-options', [])\
> -                              .append("--allow-software-gl-layers")
> +                                  .append("--allow-software-gl-layers")

In the win case here, it'd be good to check that allow-software-gl-layers is false (I'm assuming it's not supported..)

::: taskcluster/taskgraph/transforms/tests/make_task_description.py
(Diff revision 1)
>  
>      if 'actions' in mozharness:
>          env['MOZHARNESS_ACTIONS'] = ' '.join(mozharness['actions'])
>  
> -    if config.params['project'] == 'try':
> -        env['TRY_COMMIT_MSG'] = config.params['message']

Where did this go?  This is how we pass try syntax into mozharness..

::: taskcluster/taskgraph/transforms/tests/make_task_description.py:283
(Diff revision 1)
> +        }
> +    ]
> +    mozharness = test['mozharness']
> +
> +    test_platform = taskdesc['attributes']['test_platform']
> +    target = 'firefox-{}.en-US.{}'.format(get_firefox_version(), test_platform)

I'd really like to have this be done the same way across all platforms.  Is gbrown committed to this same approach to implementing this?

::: taskcluster/taskgraph/transforms/tests/make_task_description.py:295
(Diff revision 1)
> +        '<build>', 'public/build/mozharness.zip')
> +
> +    taskdesc['worker-type'] = {
> +        'win32': 'aws-provisioner-v1/gecko-{}-t-win7-32'.format(config.params['level']),
> +        'win64': 'aws-provisioner-v1/gecko-{}-t-win10-64'.format(config.params['level'])
> +    }[test_platform]

This table should probably be broken out into a top-level constant in the file.  Also, note that test workerTypes are not divided by level, which gets us a nice bit of sharing between them.
Attachment #8791958 - Flags: review?(dustin) → review-
Wander, I mentioned you in the review above :)
Flags: needinfo?(wcosta)
(In reply to Dustin J. Mitchell [:dustin] from comment #10)
> > +    test_platform = taskdesc['attributes']['test_platform']
> > +    target = 'firefox-{}.en-US.{}'.format(get_firefox_version(), test_platform)
> 
> I'd really like to have this be done the same way across all platforms.  Is
> gbrown committed to this same approach to implementing this?

I'm fairly happy with this and don't have a better suggestion.
Attachment #8782231 - Attachment is obsolete: true
See Also: → 1303455
(In reply to Dustin J. Mitchell [:dustin] from comment #10)

> 
> ::: taskcluster/taskgraph/transforms/tests/all_kinds.py:30
> (Diff revision 1)
> >      """Set the worker implementation based on the test platform."""
> >      for test in tests:
> > -        # this is simple for now, but soon will not be!
> > +        if test['test-platform'].startswith('win'):
> > +            test['worker-implementation'] = 'generic-worker'
> > +        else:
> > -        test['worker-implementation'] = 'docker-worker'
> > +            test['worker-implementation'] = 'docker-worker'
> 
> I would like to limit the proliferation of the "platform.startswith" logic. 
> What do you think about adding a preliminary transform that sets something
> like test['worker-platform'] to 'win', 'linux', .., and then subsequent
> transforms can use that value?  I'd like Wander's input on this too..
> 

Yeah, I have this logic in Mac too. Whatever solution we come with, I would like to see an easy spot for people adding new platforms be able to see without much external help.
Flags: needinfo?(wcosta)
"worker-platform-family" might be better, actually :/
Attachment #8791958 - Flags: review?(dustin)
Depends on: 1303974
Comment on attachment 8791958 [details]
Bug 1280474 - enable tc win cpp;

https://reviewboard.mozilla.org/r/79228/#review77862

> Should these test platforms be more descriptive (e.g., "winxp", "win7")?

swapped for full test platform names as per irc discussion

> Following on my earlier comment about platforms -- ideally these would be the same and not need to be translated.  I suppose the difference here is that the strings on the left are necessary for try syntax, and the strings on the right are how the existing jobs are reported to treeherder?

done. this logic is no longer required for windows and the linux asan/pgo was left unmodified

> In the win case here, it'd be good to check that allow-software-gl-layers is false (I'm assuming it's not supported..)

we don't set it for windows. there is a [default setting in the schema](https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/transforms/tests/test_description.py#92) which sets it to true for all platforms which was apparently an [intentional decision](https://bugzilla.mozilla.org/show_bug.cgi?id=1296086#c0) so we're just opting to override that here.

> Where did this go?  This is how we pass try syntax into mozharness..

this was an accidental deletion. restored now

> This table should probably be broken out into a top-level constant in the file.  Also, note that test workerTypes are not divided by level, which gets us a nice bit of sharing between them.

done
Depends on: 1304033
Depends on: 1304034
Depends on: 1304039
Depends on: 1304040
Comment on attachment 8791958 [details]
Bug 1280474 - enable tc win cpp;

https://reviewboard.mozilla.org/r/79228/#review78496

Looks great!

::: taskcluster/ci/desktop-test/test-platforms.yml:29
(Diff revision 6)
> +# win32 vm
> +windows7-32-vm/debug:
> +    build-platform: win32/debug
> +    test-set: windows-vm-tests
> +windows7-32-vm/opt:
> +    build-platform: win32/opt
> +    test-set: windows-vm-tests
> +
> +# win32 gpu
> +windows7-32/debug:
> +    build-platform: win32/debug
> +    test-set: windows-gpu-tests
> +windows7-32/opt:
> +    build-platform: win32/opt
> +    test-set: windows-gpu-tests
> +
> +# win64 vm
> +windows10-64-vm/debug:
> +    build-platform: win64/debug
> +    test-set: windows-vm-tests
> +windows10-64-vm/opt:
> +    build-platform: win64/opt
> +    test-set: windows-vm-tests
> +
> +# win64 gpu
> +windows10-64/debug:
> +    build-platform: win64/debug
> +    test-set: windows-gpu-tests
> +windows10-64/opt:
> +    build-platform: win64/opt
> +    test-set: windows-gpu-tests
> +

so good..

::: taskcluster/taskgraph/transforms/tests/desktop_test.py:43
(Diff revision 6)
> -        'linux64-pgo/opt': 'linux64/pgo',
> +        'linux64-pgo/opt': 'linux64/pgo'
>      }
>      for test in tests:
>          build_platform = test['build-platform']
> -        test['treeherder-machine-platform'] = translation.get(build_platform, build_platform)
> +        test_platform = test['test-platform']
> +        test['treeherder-machine-platform'] = translation.get(build_platform, test_platform)

very nice!

::: taskcluster/taskgraph/transforms/tests/make_task_description.py:41
(Diff revision 6)
> +    'windows7-32': 'aws-provisioner-v1/gecko-g-win7-32',
> +    'windows10-64': 'aws-provisioner-v1/gecko-g-win10-64',

Change to `gecko-t-win7-32-gpu`, etc. per irc conversation
Attachment #8791958 - Flags: review?(dustin) → review+
Depends on: 1304430
Depends on: 1304433
Depends on: 1304435
Depends on: 1304438
See Also: → 1304176
turned out our problems were with a leftover `"virtualenv_modules": ['pywin32'],` in mozconfig that should have been deprecated.
No longer depends on: 1306421
Comment on attachment 8791958 [details]
Bug 1280474 - enable tc win cpp;

https://reviewboard.mozilla.org/r/79228/#review86310

continued thumbs-up :)
got a couple more intermittent failures, so falling back to only enabling cppunit which is the only test which has been consistently green.
Attachment #8775932 - Attachment is obsolete: true
Depends on: 1311966
Depends on: 1311978
Depends on: 1312383
Depends on: 1319453
Depends on: 1321776
Depends on: 1321782
Depends on: 1321784
when more of the blocking bugs are fixed, we can get this ready for review.
right now we are just waiting on signing the binaries or disabling MOZ_UPDATER (bug 1323732).  Once that is done, we have greened up all the VM based tests.

What still remains are the hardware tests- do we have a plan for those?  I see a few choices:
1) run hardware tests via BBB
2) get a small pool of machines and run the 6 jobs on hardware (we would need to install the machines properly and get the puppet/etc. scripts setup to not do buildbot, but tc worker)

Given there is no more work to do on greening up tests for windows 7, I am going to hand over the reigns for getting win7 tests running to the taskcluster/releng teams to handle the signing.
#1 is the correct option for hardware tests (and talos).
No longer depends on: 1321784
Depends on: 1351272
Depends on: 1352200
Depends on: 1351273
Summary of known issues:
* x7 - bug 1352200: test that needs disabling
* x8 - bug 1304040: not signed binaries
* c3 - bug 1340786: investigating
* cl - probably related to bug 1270962

jmaher: what should we do about the clipboard job?

Jobs that I believe we can enable today:
* e10s debug reftests
* a11y
* gpu

Jobs that we can't yet enable:
* chrome jobs (waiting on c3 and cl issues mentioned above)
* xpcshell jobs (waiting on issues mentioned above)

What's involved to making this jobs tier 1 besides greening the jobs up? (and making the builds tier1)
Depends on: 1351307
Thanks for the great summary Armen.

the cl (clipboard) job historically doesn't run on VM because the VM has a clipboard service which we were not able to disable and that interferes with the OS level one causing problems.  If we can show that cl runs 40 times in opt/debug e10s/non-e10s without problem (or very rare non repeating intermittent) I would be happy to switch it to vm.
OK. There's nothing we can do for the clipboard job atm. Punting it to bug 1354219 until we have a HW+TC solution (same issue as running talos on in-house machines).

I think that's it for the π team investigation.
Depends on: 1356240
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
Component: General Automation → General
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: