Closed Bug 1371992 Opened 7 years ago Closed 7 years ago

Intermittent mozmake.EXE: *** [check] Error 1 after testsuite-targets.mk:269: recipe for target 'check' failed

Categories

(Firefox Build System :: General, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(firefox56 fixed, firefox57 fixed)

RESOLVED FIXED
mozilla57

People

(Reporter: aryx, Assigned: ted)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell fixed:other])

Attachments

(1 file)

https://treeherder.mozilla.org/logviewer.html#?job_id=106093705&repo=autoland

16:42:18     INFO - TEST-PASS | z:\task_1497110818\build\src\python\mozbuild\mozbuild\test\configure\test_toolchain_configure.py | WindowsToolchainTest.test_cannot_cross
16:42:18     INFO - TEST-PASS | z:\task_1497110818\build\src\python\mozbuild\mozbuild\test\configure\test_toolchain_configure.py | WindowsToolchainTest.test_clang
16:42:18     INFO - TEST-PASS | z:\task_1497110818\build\src\python\mozbuild\mozbuild\test\configure\test_toolchain_configure.py | WindowsToolchainTest.test_clang_cl
16:42:18     INFO - TEST-PASS | z:\task_1497110818\build\src\python\mozbuild\mozbuild\test\configure\test_toolchain_configure.py | WindowsToolchainTest.test_gcc
16:42:18     INFO - TEST-PASS | z:\task_1497110818\build\src\python\mozbuild\mozbuild\test\configure\test_toolchain_configure.py | WindowsToolchainTest.test_msvc
16:42:18     INFO - TEST-PASS | z:\task_1497110818\build\src\python\mozbuild\mozbuild\test\configure\test_toolchain_configure.py | WindowsToolchainTest.test_overridden_unsupported_clang
16:42:18     INFO - TEST-PASS | z:\task_1497110818\build\src\python\mozbuild\mozbuild\test\configure\test_toolchain_configure.py | WindowsToolchainTest.test_overridden_unsupported_gcc
16:42:18     INFO - TEST-PASS | z:\task_1497110818\build\src\python\mozbuild\mozbuild\test\configure\test_toolchain_configure.py | WindowsToolchainTest.test_unsupported_msvc
16:42:18     INFO - Return code from mach python-test: 1
16:42:18     INFO - 1
16:42:18     INFO - z:/task_1497110818/build/src/testing/testsuite-targets.mk:269: recipe for target 'check' failed
16:42:18     INFO - mozmake.EXE: *** [check] Error 1
This picked up in frequency on August 18th on Win32 and Win64, for both opt and debug builds.

:gps, I see you listed as the triage owner for the build config component. Could you or another build peer look into this and either classify it correctly or work on a fix?
Flags: needinfo?(gps)
Whiteboard: [stockwell needswork]
From looking at a few logs, these all seem to be failures in the mozlint Python tests while shutting down multiprocessing:

17:01:17     INFO - ..\python\mozlint\test\test_types.py::test_no_filter FAILED
17:01:17     INFO - ================================== FAILURES ===================================
17:01:17     INFO - _______________________________ test_no_filter ________________________________
17:01:17     INFO - lint = <mozlint.roller.LintRoller object at 0x0333E770>
17:01:17     INFO - lintdir = 'z:\\build\\build\\src\\python\\mozlint\\test\\linters'
17:01:17     INFO - files = ['z:\\build\\build\\src\\python\\mozlint\\test\\files\\foobar.js', 'z:\\build\\build\\src\\python\\mozlint\\test\\files\\foobar.py', 'z:\\build\\build\\src\\python\\mozlint\\test\\files\\no_foobar.js']
17:01:17     INFO -     def test_no_filter(lint, lintdir, files):
17:01:17     INFO -         lint.read(os.path.join(lintdir, 'explicit_path.yml'))
17:01:17     INFO - >       result = lint.roll(files)
17:01:17     INFO - ..\python\mozlint\test\test_types.py:45:
17:01:17     INFO - _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
17:01:17     INFO - ..\python\mozlint\mozlint\roller.py:159: in roll
17:01:17     INFO -     m.shutdown()
17:01:17     INFO - c:\mozilla-build\python\Lib\multiprocessing\util.py:207: in __call__
17:01:17     INFO -     res = self._callback(*self._args, **self._kwargs)
17:01:17     INFO - c:\mozilla-build\python\Lib\multiprocessing\managers.py:625: in _finalize_manager
17:01:17     INFO -     process.terminate()
17:01:17     INFO - c:\mozilla-build\python\Lib\multiprocessing\process.py:137: in terminate
17:01:17     INFO -     self._popen.terminate()
17:01:17     INFO - _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
17:01:17     INFO - self = <multiprocessing.forking.Popen object at 0x03331F90>
17:01:17     INFO -     def terminate(self):
17:01:17     INFO -         if self.returncode is None:
17:01:17     INFO -             try:
17:01:17     INFO - >               _subprocess.TerminateProcess(int(self._handle), TERMINATE)
17:01:17     INFO - E               WindowsError: [Error 5] Access is denied
17:01:17     INFO - c:\mozilla-build\python\Lib\multiprocessing\forking.py:312: WindowsError
17:01:17     INFO - ===================== 1 failed, 4 passed in 37.74 seconds =====================
17:01:17     INFO - Setting retcode to 1 from z:\build\build\src\python\mozlint\test\test_types.py

A few random thoughts while poking around docs looking into this:
1) The docs for multiprocessing.Manager's shutdown() method say "This is only available if start() has been used to start the server process." I don't actually see a call to start() in the mozlint code. Is that an issue?
   https://docs.python.org/2/library/multiprocessing.html#multiprocessing.managers.BaseManager.shutdown
   https://dxr.mozilla.org/mozilla-central/rev/1867d7931c0a70ab90edf4aa84876525773a7139/python/mozlint/mozlint/roller.py#128
2) We seem to be creating that Manager purely to get a Queue out of it, but we could instead simply create a multiprocessing.Queue directly. Am I missing anything there?
   https://docs.python.org/2/library/multiprocessing.html#pipes-and-queues
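
As a rough illustration of point 2, here is a minimal sketch of collecting worker results through a plain multiprocessing.Queue with no Manager process involved. It is not the mozlint code: `fake_lint_worker` and `collect_with_plain_queue` are hypothetical names, and the "issue count" is a stand-in for real lint output. On point 1, the Python docs describe multiprocessing.Manager() as returning an already started SyncManager, so the missing explicit start() call is probably not the problem by itself.

```python
import multiprocessing


def fake_lint_worker(paths, results):
    """Hypothetical worker: report one (path, issue_count) tuple per file."""
    for path in paths:
        # Stand-in for real linting work: the "issue count" is simply the
        # length of the path, so the example has something to report.
        results.put((path, len(path)))


def collect_with_plain_queue(chunks):
    """Run one Process per chunk and gather results through a plain Queue."""
    results = multiprocessing.Queue()
    workers = [
        multiprocessing.Process(target=fake_lint_worker, args=(chunk, results))
        for chunk in chunks
    ]
    for worker in workers:
        worker.start()

    # Drain the queue before joining: a child cannot exit until the items
    # it queued have been flushed to the underlying pipe.
    expected = sum(len(chunk) for chunk in chunks)
    collected = [results.get() for _ in range(expected)]

    for worker in workers:
        worker.join()
    return sorted(collected)
```

On Windows, the call to `collect_with_plain_queue(...)` has to sit under an `if __name__ == "__main__":` guard, because child processes re-import the main module. One caveat that may explain the Manager: a plain multiprocessing.Queue can be passed to Process objects as above, but it cannot be pickled as an argument to multiprocessing.Pool tasks, which is the usual reason for reaching for Manager().Queue().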
:ted, could you help find someone to look at this? We are seeing many failures here.
Flags: needinfo?(gps) → needinfo?(ted)
I ran `mach lint` locally on my Windows machine and it seems to work fine with this change, including raising an exception in the wpt linter because I don't have node installed (the mach command exited fine, though). I also ran `mach python-test python/mozlint` and all the tests passed locally. We'll see what try has to say.
Assignee: nobody → ted
Flags: needinfo?(ted)
Comment on attachment 8902264 [details]
bug 1371992 - make mozlint's LintRoller use concurrent.futures.

https://reviewboard.mozilla.org/r/173802/#review179102

Thanks for doing this, this looks much better! Works for me locally too, and the SIGINT handling also seems to be improved.
Attachment #8902264 - Flags: review?(ahalberstadt) → review+
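
For readers unfamiliar with concurrent.futures, the general shape of such a change can be sketched as follows. This is an assumption-laden illustration, not the reviewed patch: `run_linter` and `roll` here are hypothetical stand-ins, and the real LintRoller does considerably more. The point is only that the executor's own shutdown waits for workers to finish, instead of a Manager finalizer calling terminate() on a process as in the traceback above.

```python
import concurrent.futures


def run_linter(name, paths):
    """Hypothetical lint job: return (linter name, number of paths checked)."""
    return name, len(paths)


def roll(jobs, executor_cls=concurrent.futures.ProcessPoolExecutor, max_workers=2):
    """Submit every (name, paths) job and merge the results as they finish."""
    results = {}
    # Leaving the with-block calls executor.shutdown(wait=True), so workers
    # are allowed to finish on their own rather than being terminated.
    with executor_cls(max_workers=max_workers) as executor:
        futures = [executor.submit(run_linter, name, paths) for name, paths in jobs]
        for future in concurrent.futures.as_completed(futures):
            # result() re-raises any exception raised inside the worker.
            name, count = future.result()
            results[name] = count
    return results
```

The `executor_cls` parameter is only there to make the sketch easy to exercise: passing `concurrent.futures.ThreadPoolExecutor` runs the same code without spawning processes. With the default ProcessPoolExecutor on Windows, the call to `roll(...)` needs to be under an `if __name__ == "__main__":` guard.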
I ran 7 each of Win32 and Win64 debug builds on that try push and they're all green, so it's at least not completely broken. I triggered a few more just to check, but this failure mode is just infrequent enough that it's probably hard to catch on try. (I wish we had standalone Python test jobs on Windows!)
Pushed by tmielczarek@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f9885a560f3c
make mozlint's LintRoller use concurrent.futures. r=ahal
https://hg.mozilla.org/mozilla-central/rev/f9885a560f3c
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
https://hg.mozilla.org/releases/mozilla-beta/rev/6664fab5dea9
Whiteboard: [stockwell needswork] → [stockwell fixed]
Whiteboard: [stockwell fixed] → [stockwell fixed:other]
Judging by the OrangeFactor link for this bug, the patch appears to have fixed the failure:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1371992&startday=2017-08-28&endday=2017-09-05&tree=all

There are no occurrences since August 29th, which is when this merged to central.
Product: Core → Firefox Build System
Blocks: 1474129