Closed
Bug 1337903
Opened 8 years ago
Closed 8 years ago
Pull python-tests out of 'make check' on OSX
Categories
(Firefox Build System :: General, defect)
Firefox Build System
General
Tracking
(firefox55 fixed)
RESOLVED
FIXED
mozilla55
Tracking | Status | |
---|---|---|
firefox55 | --- | fixed |
People
(Reporter: ahal, Assigned: dustin)
References
(Blocks 1 open bug)
Details
Attachments
(4 files)
Bug 1320194 and bug 1003417 are close to landing which means we'll be able to run python-tests in their own taskcluster tasks on Linux.
There still isn't a great story for OSX and Windows though, as the taskcluster support isn't quite there. I think OSX is going to be the higher priority due to the data centre migration (?), so this bug will only focus on that for now.
We'll either need to get taskcluster run_task support working on OSX, or else stand up some buildbot jobs that do it. Obviously the former would be preferable, but I'm not sure what dependencies (other than implementing the in-tree configs) it has.
Comment 1•8 years ago
|
||
Ahal, can you follow up with dustin about what TaskCluster support needs to be added?
Flags: needinfo?(ahalberstadt)
Reporter | ||
Comment 2•8 years ago
|
||
On OSX taskcluster uses something called "taskcluster-worker" instead of "docker-worker". From the taskcluster service side of things, we should be able to use taskcluster-worker tasks right now, though for the moment caching doesn't work. Greg Arndt said this is mostly just because it hasn't been prioritized yet.
So the good news is that we should be able to get it working only with in-tree changes to the taskcluster configs side (sans caches).
I don't think anyone is fully aware of what will be needed, but I did some digging and found we'll at least need to do the following:
1) Go through the files in this directory:
https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/transforms/job
Then pull out "docker-worker" specific logic/configs, and share the rest with a "taskcluster-worker" run_using entry point.
2) Add a "taskcluster-worker" payload builder here:
https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/transforms/task.py#430
I started working on a patch and I think I mostly know how to do 1). But I gave up once I realized that I needed to do 2) as well. There may be more steps beyond 2). Overall I think this is pretty do-able, though it probably requires support from the taskcluster team.
If I were working on this, I'd estimate 2 weeks to working prototype and a month to complete.
Reporter | ||
Updated•8 years ago
|
Flags: needinfo?(ahalberstadt)
Assignee | ||
Comment 3•8 years ago
|
||
tc-worker is super-flexible, so we are building its payloads on a per-engine basis. The native engine is the one we use on OS X (since it runs tasks natively, not in containers or VMs). Happily, that payload builder already exists
https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/transforms/task.py#253
https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/transforms/task.py#591
so, I think (2) is largely done for you. And you're right about (1). I think you'd need to add
@run_job_using("native-engine", "run-task", schema=run_task_schema)
similar to
https://dxr.mozilla.org/mozilla-central/source/taskcluster/taskgraph/transforms/job/run_task.py#42
and then add a new alternative for mozbase, using `implementation: native-engine`:
https://dxr.mozilla.org/mozilla-central/source/taskcluster/ci/source-test/python-tests.yml#37
Assignee | ||
Updated•8 years ago
|
Assignee: nobody → dustin
Assignee | ||
Comment 4•8 years ago
|
||
Assignee | ||
Comment 5•8 years ago
|
||
Assignee | ||
Comment 6•8 years ago
|
||
So run-task is fairly linux-specific. In particular, it does lots of chowny stuff on the assumption that it's running as root, which it is not on a mac. So that's pretty annoying. It also has logic to fetch hg fingerprints from the secrets service, but we don't have a taskcluster proxy on OS X, so it doesn't have access to that service. I think I will just rely on the fallback behavior for that, for the interim.
Assignee | ||
Comment 7•8 years ago
|
||
Assignee | ||
Comment 8•8 years ago
|
||
Assignee | ||
Comment 9•8 years ago
|
||
Assignee | ||
Comment 10•8 years ago
|
||
Assignee | ||
Comment 11•8 years ago
|
||
Assignee | ||
Comment 12•8 years ago
|
||
Assignee | ||
Comment 13•8 years ago
|
||
Comment hidden (mozreview-request) |
Assignee | ||
Comment 15•8 years ago
|
||
Assignee | ||
Comment 16•8 years ago
|
||
I think I have the tests running. They are "succeeding" in that they exit with status 0, but seem to not be actually doing anything.
https://public-artifacts.taskcluster.net/Urcrfsx9RzCbnnlreuoV5w/0/public/logs/live_backing.log
[task 2017-03-06T20:30:09.163251Z] Installed /Users/cltbld/checkouts/gecko/obj-x86_64-apple-darwin14.5.0/_virtualenv/lib/python2.7/site-packages/blessings-1.6-py2.7.egg
[task 2017-03-06T20:30:09.163298Z] Finished processing dependencies for mozlog==3.4
[task 2017-03-06T20:30:09.163331Z] Error in atexit._run_exitfuncs:
[task 2017-03-06T20:30:09.163364Z] Traceback (most recent call last):
[task 2017-03-06T20:30:09.163425Z] File "/tools/python27/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
[task 2017-03-06T20:30:09.163458Z] func(*targs, **kargs)
[task 2017-03-06T20:30:09.163525Z] File "/tools/python27/lib/python2.7/multiprocessing/util.py", line 284, in _exit_function
[task 2017-03-06T20:30:09.163557Z] info('process shutting down')
[task 2017-03-06T20:30:09.163592Z] TypeError: 'NoneType' object is not callable
[task 2017-03-06T20:30:09.163618Z] Error in sys.exitfunc:
[task 2017-03-06T20:30:09.163651Z] Traceback (most recent call last):
[task 2017-03-06T20:30:09.163713Z] File "/tools/python27/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
[task 2017-03-06T20:30:09.163743Z] func(*targs, **kargs)
[task 2017-03-06T20:30:09.163813Z] File "/tools/python27/lib/python2.7/multiprocessing/util.py", line 284, in _exit_function
[task 2017-03-06T20:30:09.163845Z] info('process shutting down')
[task 2017-03-06T20:30:09.163886Z] TypeError: 'NoneType' object is not callable
[task 2017-03-06T20:30:09.163898Z]
[task 2017-03-06T20:30:09.261528Z] cat: backend.TestManifestBackend.in: No such file or directory
[task 2017-03-06T20:30:09.264755Z] Build configuration changed. Regenerating backend.
[task 2017-03-06T20:30:09.651903Z] No build detected, test metadata may be incomplete.
[task 2017-03-06T20:30:09.677458Z] 0:05.72 Return code from mach python-test: 0
so maybe that's what's missing? At any rate this seems a bit out of my expertise at this point.
And, in fact, I don't see anything called "backend.TestManifestBackend.in"
Flags: needinfo?(dave.hunt)
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Updated•8 years ago
|
Attachment #8844141 -
Flags: review?(ahalberstadt)
Assignee | ||
Comment 19•8 years ago
|
||
(I'd like to get :gps's review on the run-task changes, since he wrote it, but he's away for a few more weeks)
Comment 20•8 years ago
|
||
mozreview-review |
Comment on attachment 8844125 [details]
Bug 1337903: install robustclone on workers
https://reviewboard.mozilla.org/r/117672/#review119386
just a nit in commit message, lgtm
::: commit-message-0d94f:3
(Diff revision 1)
> +Bug 1337903: install robustclone on workers; r?wcosta r?grenade
> +
> +This installes the extension everywhere, unconditionally. Because we do
nit: "installs"
Attachment #8844125 -
Flags: review?(wcosta) → review+
Reporter | ||
Comment 21•8 years ago
|
||
Dave, just clarifying your needinfo here. I told Dustin to needinfo you because it looked like there was a problem setting up mozlog (due to the fact we now setup.py install it), and was wondering if you had seen a traceback like that before. But on closer inspection it looks like the error is happening after mozlog is installed.. so I guess just confirm you haven't that traceback before.
The 'cat: backend.TestManifestBackend.in: No such file or directory' message is harmless, and actually expected in this case. We should probably file a bug to suppress it, as it does add confusion.
The real problem must be those tracebacks that look like they are coming out of the 'virtualenv install' step from the build. Though, I can't be sure and have no idea where specifically they are originating.. Mike, have you seen anything like those tracebacks in comment 16 before?
Flags: needinfo?(mshal)
Reporter | ||
Comment 22•8 years ago
|
||
Another possibility is that those tracebacks are also harmless, and you are simply trying to run tests that require a build to get detected. If that's the case, then everything is working as expected (minus cryptic tracebacks in the log).
Assignee | ||
Comment 23•8 years ago
|
||
I guess that wouldn't quite be "as expected" since there's no build in this case, and there should be one?
Comment 24•8 years ago
|
||
mozreview-review |
Comment on attachment 8844125 [details]
Bug 1337903: install robustclone on workers
https://reviewboard.mozilla.org/r/117672/#review119542
Attachment #8844125 -
Flags: review?(rthijssen) → review+
Comment 25•8 years ago
|
||
> Dave, just clarifying your needinfo here. I told Dustin to needinfo you because it looked like there was a problem setting up mozlog (due to the fact we now setup.py install it), and was wondering if you had seen a traceback like that before. But on closer inspection it looks like the error is happening after mozlog is installed.. so I guess just confirm you haven't that traceback before.
Thanks for the clarification. I can confirm that I've not seen that traceback before, and as you say it doesn't appear to be from mozlog.
Flags: needinfo?(dave.hunt)
Reporter | ||
Comment 26•8 years ago
|
||
(In reply to Dustin J. Mitchell [:dustin] from comment #23)
> I guess that wouldn't quite be "as expected" since there's no build in this
> case, and there should be one?
No, there shouldn't be a build. Actually, just noticed that you are running the mozbase tests. Unless you modified their subsuites in the manifests to include OSX, e.g:
https://dxr.mozilla.org/mozilla-central/source/testing/mozbase/mozcrash/tests/manifest.ini
Then, they won't get picked up. Try adding `|| os == "mac"` there and see if it gets picked up.
Comment 27•8 years ago
|
||
(In reply to Andrew Halberstadt [:ahal] from comment #21)
> The real problem must be those tracebacks that look like they are coming out
> of the 'virtualenv install' step from the build. Though, I can't be sure and
> have no idea where specifically they are originating.. Mike, have you seen
> anything like those tracebacks in comment 16 before?
I have not seen that before, though it looks like somehow the info function in multiprocessing/util.py is getting overridden with None. I don't know how that would happen without some intentional overwriting of sys.modules or something.
Why is blessings being downloaded as part of the virtualenv setup in the first place? It's already in the tree as python/blessings. We don't want to rely on external servers for things like that.
Flags: needinfo?(mshal)
Reporter | ||
Comment 28•8 years ago
|
||
mozreview-review |
Comment on attachment 8844140 [details]
Bug 1337903: add and use a job_try_name attribute;
https://reviewboard.mozilla.org/r/117676/#review119748
Great cleanup, thanks!
::: commit-message-76bf7:19
(Diff revision 1)
> +The tasks remain un-selectable in try, as before (they always run, but can be
> +optimized away by when.files-changed).
They actually can be selected with -j.. It's just undocumented and not supported by 'mach try', but we should fix both those things.
::: taskcluster/ci/source-test/mozlint.yml:1
(Diff revision 1)
> -mozlint-eslint/opt:
> +mozlint-eslint:
Thanks for getting 'opt' out of the task name.. that was bugging me :)
::: taskcluster/taskgraph/transforms/source_test.py:34
(Diff revision 1)
> + # The platform on which this task runs. This will be used to set up attributes
> + # (for try selection) and treeherder metadata (for display). If given as a list,
> + # the job will be "split" into multiple tasks, one with each platform.
> + Required('platform'): Any(basestring, [basestring]),
I wonder if we should even support a string here.. Could just require a list which makes it more obvious how to add more.. but guess it's not a big deal.
Attachment #8844140 -
Flags: review?(ahalberstadt) → review+
Reporter | ||
Comment 29•8 years ago
|
||
mozreview-review |
Comment on attachment 8844141 [details]
Bug 1337903: add support for OS X in run-task;
https://reviewboard.mozilla.org/r/117678/#review119760
Thanks so much for doing this looks great! Just some minor comments and questions. Agreed it would have been ideal for gps to review this, but not worth delaying this for several more weeks.
::: taskcluster/ci/source-test/python-tests.yml:57
(Diff revision 1)
>
> mozbase:
> description: testing/mozbase unit tests
> platform:
> - linux64/opt
> + - macosx64/opt
I'd be in favour of leaving the changes in this file for a follow-up bug. We can deal with getting the subsuites right and investigate those tracebacks there as well.
That said it's useful to include this here for now for testing purposes.
::: taskcluster/docker/recipes/run-task:105
(Diff revision 1)
> + # Obtain certificate fingerprints. Without this, the checkout will use the fingerprint
> + # on the system, which is (hopefully) managed some other way
So how is it managed for native-worker?
::: taskcluster/docker/recipes/run-task:188
(Diff revision 1)
> + # expand ~ in some paths
> + if args.vcs_checkout:
> + args.vcs_checkout = os.path.expanduser(args.vcs_checkout)
> + if args.tools_checkout:
> + args.tools_checkout = os.path.expanduser(args.tools_checkout)
I almost want to say use a for loop with 'getattr', but their formats are all just different enough that I won't.
::: taskcluster/taskgraph/transforms/job/mach.py:33
(Diff revision 1)
> run = job['run']
>
> # defer to the run_task implementation
> - run['command'] = 'cd /home/worker/checkouts/gecko && ./mach ' + run['mach']
> + run['command'] = 'cd ~/checkouts/gecko && ./mach ' + run['mach']
> run['checkout'] = True
> del run['mach']
This could be factored out to a 'run_mach' function.
::: taskcluster/taskgraph/transforms/job/run_task.py:79
(Diff revision 1)
> + if run['checkout']:
> + support_vcs_checkout(config, job, taskdesc)
> +
> + if run['requires-build']:
> + add_build_dependency(config, job, taskdesc)
A lot of common code here as well.
Attachment #8844141 -
Flags: review?(ahalberstadt) → review-
Assignee | ||
Comment 30•8 years ago
|
||
mozreview-review-reply |
Comment on attachment 8844140 [details]
Bug 1337903: add and use a job_try_name attribute;
https://reviewboard.mozilla.org/r/117676/#review119748
> They actually can be selected with -j.. It's just undocumented and not supported by 'mach try', but we should fix both those things.
Oh, interesting -- this change would make them selectable with `-j linux64`, which is probably not what you want. Currently `-j` selects jobs by looking for tasks with a kind in JOB_KINDS and with `build_platform` matching the `-j` option. Should I set `build_platform` to mozbase for these tasks? That seems wrong (the build platform is not, in fact, mozbase).
Assignee | ||
Comment 31•8 years ago
|
||
mozreview-review |
Comment on attachment 8844141 [details]
Bug 1337903: add support for OS X in run-task;
https://reviewboard.mozilla.org/r/117678/#review120086
::: taskcluster/ci/source-test/python-tests.yml:57
(Diff revision 1)
>
> mozbase:
> description: testing/mozbase unit tests
> platform:
> - linux64/opt
> + - macosx64/opt
If I'm understanding correctly, you're saying to omit this additional platform, since the resulting tests aren't ready yet, and landing the rest?
::: taskcluster/docker/recipes/run-task:105
(Diff revision 1)
> + # Obtain certificate fingerprints. Without this, the checkout will use the fingerprint
> + # on the system, which is (hopefully) managed some other way
With Puppet. I'm not sure why I wrote "(hopefully)"!
::: taskcluster/taskgraph/transforms/job/mach.py:33
(Diff revision 1)
> run = job['run']
>
> # defer to the run_task implementation
> - run['command'] = 'cd /home/worker/checkouts/gecko && ./mach ' + run['mach']
> + run['command'] = 'cd ~/checkouts/gecko && ./mach ' + run['mach']
> run['checkout'] = True
> del run['mach']
Oh, good point, I didn't see that they had converged to be almost identical!
Reporter | ||
Comment 32•8 years ago
|
||
(In reply to Dustin J. Mitchell [:dustin] from comment #30)
> Oh, interesting -- this change would make them selectable with `-j linux64`,
> which is probably not what you want. Currently `-j` selects jobs by looking
> for tasks with a kind in JOB_KINDS and with `build_platform` matching the
> `-j` option. Should I set `build_platform` to mozbase for these tasks?
> That seems wrong (the build platform is not, in fact, mozbase).
Yeah let's not do that, it sounds like maybe -j needs to be re-worked. Practically no one uses -j, so maybe we can just leave it as is for now and file a follow-up to make it look at the 'label' (or name or whatever it ends up as) instead. Or if the fix is simple, you could throw it in here too.
Assignee | ||
Comment 33•8 years ago
|
||
I think I came up with a fix for that.
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Assignee | ||
Comment 37•8 years ago
|
||
I filed bug 1345554 for mozreview's confusion over which of those has and has not been reviewed.
Reporter | ||
Comment 38•8 years ago
|
||
mozreview-review |
Comment on attachment 8844141 [details]
Bug 1337903: add support for OS X in run-task;
https://reviewboard.mozilla.org/r/117678/#review120186
Thanks, looks good!
::: taskcluster/ci/source-test/python-tests.yml:57
(Diff revisions 1 - 2)
>
> mozbase:
> description: testing/mozbase unit tests
> platform:
> - linux64/opt
> - - macosx64/opt
> + # macosx64/opt -- not ready yet
This will be removed before landing right? Otherwise I could picture this deterring people from working on it under ths assumption there are mysterious blockers :)
Attachment #8844141 -
Flags: review?(ahalberstadt) → review+
Reporter | ||
Comment 39•8 years ago
|
||
mozreview-review-reply |
Comment on attachment 8844141 [details]
Bug 1337903: add support for OS X in run-task;
https://reviewboard.mozilla.org/r/117678/#review120186
> This will be removed before landing right? Otherwise I could picture this deterring people from working on it under ths assumption there are mysterious blockers :)
Sorry, guess I should have clarified my last issue. By useful for "testing" purposes, I just mean while we were debugging this patch.. Please uncommit this file.
Reporter | ||
Comment 40•8 years ago
|
||
mozreview-review |
Comment on attachment 8844140 [details]
Bug 1337903: add and use a job_try_name attribute;
https://reviewboard.mozilla.org/r/117676/#review120262
::: taskcluster/taskgraph/try_option_syntax.py:573
(Diff revision 2)
> - elif attr('kind') in JOB_KINDS:
> + elif attr('kind') in JOB_KINDS_MATCHING_BUILD_PLATFORM:
> # This will add 'job' tasks to the target set even if no try syntax was specified.
> if not self.jobs:
> return True
> if attr('build_platform') in self.jobs:
> return True
Would it be hard to get 'toolchain' and 'android-stuff' to also use job_try_name and set it to the build_platform via transform? Then you could remove this block.
Consider this optional.
Reporter | ||
Comment 41•8 years ago
|
||
mozreview-review |
Comment on attachment 8845003 [details]
Bug 1337903: handle platforms for source-test distinctly;
https://reviewboard.mozilla.org/r/118250/#review120266
In case it wasn't clear, I properly reviewed the 2nd changeset as well.
Attachment #8845003 -
Flags: review?(ahalberstadt) → review+
Assignee | ||
Comment 42•8 years ago
|
||
(In reply to Andrew Halberstadt [:ahal] from comment #39)
> Sorry, guess I should have clarified my last issue. By useful for "testing"
> purposes, I just mean while we were debugging this patch.. Please uncommit
> this file.
To be clear, you're asking me to remove all hunks in this file, and change the commit message from "add, and use, support for OS X in run-task" to just "add support for OS X in run-task"? Then we'll actually use it in a followup, based on comment 29?
> Would it be hard to get 'toolchain' and 'android-stuff' to also use
> job_try_name and set it to the build_platform via transform? Then you could
> remove this block.
>
> Consider this optional.
I looked at it briefly, and it wasn't obvious what try names were expected. For toolchain, I think I can ask glandium. 'android-stuff' is sort of a graveyard for unloved tasks, so I could probably change that with no deep concern. I made a follow-on: bug 1345863.
Assignee | ||
Updated•8 years ago
|
Keywords: leave-open
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Reporter | ||
Comment 46•8 years ago
|
||
(In reply to Dustin J. Mitchell [:dustin] from comment #42)
> To be clear, you're asking me to remove all hunks in this file, and change
> the commit message from "add, and use, support for OS X in run-task" to just
> "add support for OS X in run-task"? Then we'll actually use it in a
> followup, based on comment 29?
Yes. I'll file the bug to get mozbase pulled out on OSX. Just felt it would be cleaner to tackle separately as there are several other non-taskcluster related things that will need to be changed.
> I looked at it briefly, and it wasn't obvious what try names were expected.
> For toolchain, I think I can ask glandium. 'android-stuff' is sort of a
> graveyard for unloved tasks, so I could probably change that with no deep
> concern. I made a follow-on: bug 1345863.
Couldn't you just set it to 'build-platform' and then it would have the same behaviour as now? Or is it more complicated than that? Either way, it's not a huge priority and I'm happy with the follow-up on file. Thanks!
Reporter | ||
Comment 47•8 years ago
|
||
Initially I had filed this as a meta bug, but now it contains the necessary precursor work. So moving bug 1342230 from dependency to blocked by.
Comment 48•8 years ago
|
||
Pushed by dmitchell@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/52d4c68eddd2
handle platforms for source-test distinctly; r=ahal
https://hg.mozilla.org/integration/autoland/rev/f725dc76b398
add and use a job_try_name attribute; r=ahal
https://hg.mozilla.org/integration/autoland/rev/774a218c2c4c
add support for OS X in run-task; r=ahal
Comment 49•8 years ago
|
||
bugherder |
Assignee | ||
Comment 50•8 years ago
|
||
Ah, I was leaving open, but bug 1345874 is the follow-on.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Updated•8 years ago
|
Updated•7 years ago
|
Product: Core → Firefox Build System
You need to log in
before you can comment on or make changes to this bug.
Description
•