https://treeherder.mozilla.org/logviewer.html#?job_id=98866334&repo=autoland
08:01:59 INFO - mozmake.EXE: Leaving directory 'z:/build/build/src/obj-firefox/dom/plugins/test/testplugin/secondplugin'
08:01:59 INFO - dtlscon.c
08:01:59 INFO - z:/build/build/src/sccache2/sccache.exe z:/build/build/src/vs2015u3/VC/bin/amd64_x86/cl.exe -Foprfile.obj -c -DNDEBUG=1 -DTRIMMED=1 -D_NSPR_BUILD_ -DWIN32 -DXP_PC -D_PR_GLOBAL_THREADS_ONLY -DWIN95 -UWINNT -D_X86_ -Iz:/build/build/src/config/external/nspr/pr -Iz:/build/build/src/obj-firefox/config/external/nspr/pr -Iz:/build/build/src/config/external/nspr -Iz:/build/build/src/nsprpub/pr/include -Iz:/build/build/src/nsprpub/pr/include/private -Iz:/build/build/src/obj-firefox/dist/include -Iz:/build/build/src/obj-firefox/dist/include/nspr -Iz:/build/build/src/obj-firefox/dist/include/nss -MD -FI z:/build/build/src/obj-firefox/mozilla-config.h -DMOZILLA_CLIENT -deps.deps/prfile.obj.pp -TC -nologo -wd4091 -D_HAS_EXCEPTIONS=0 -W3 -Gy -Zc:inline -utf-8 -arch:SSE2 -Gw -wd4244 -wd4267 -we4553 -Z7 -O1 -Oi -Oy- z:/build/build/src/nsprpub/pr/src/io/prfile.c
08:01:59 INFO - z:/build/build/src/sccache2/sccache.exe z:/build/build/src/vs2015u3/VC/bin/amd64_x86/cl.exe -Fos_asinh.obj -c -Iz:/build/build/src/obj-firefox/dist/stl_wrappers -DNDEBUG=1 -DTRIMMED=1 -DMOZ_HAS_MOZGLUE -Iz:/build/build/src/modules/fdlibm/src -Iz:/build/build/src/obj-firefox/modules/fdlibm/src -Iz:/build/build/src/obj-firefox/dist/include -Iz:/build/build/src/obj-firefox/dist/include/nspr -Iz:/build/build/src/obj-firefox/dist/include/nss -MD -FI z:/build/build/src/obj-firefox/mozilla-config.h -DMOZILLA_CLIENT -deps.deps/s_asinh.obj.pp -TP -nologo -wd5026 -wd5027 -Zc:sizedDealloc- -wd4091 -wd4577 -D_HAS_EXCEPTIONS=0 -W3 -Gy -Zc:inline -utf-8 -arch:SSE2 -Gw -wd4251 -wd4244 -wd4267 -wd4800 -wd4595 -we4553 -GR- -Z7 -O1 -Oi -Oy- -WX -wd4018 -wd4146 -wd4305 -wd4723 -wd4756 z:/build/build/src/modules/fdlibm/src/s_asinh.cpp
08:01:59 INFO - z:\build\build\src\js\src\ctypes\libffi\msvcc.sh: line 235: cl: command not found
08:01:59 INFO - z:/build/build/src/config/rules.mk:1055: recipe for target 'win32.obj' failed
08:01:59 INFO - mozmake.EXE: *** [win32.obj] Error 127
08:01:59 INFO - mozmake.EXE: Leaving directory 'z:/build/build/src/obj-firefox/config/external/ffi'
08:01:59 INFO - z:/build/build/src/config/recurse.mk:73: recipe for target 'config/external/ffi/target' failed
2 years ago
Summary: Intermittent-infra → Intermittent-infra mozmake.EXE: *** [win32.obj] Error 127 OR [win64.obj] Error 127
Since this bug is about taskcluster jobs running on taskcluster instances where (insert some handwaving here) something starts deleting things like entire directories or essential programs like cl.exe in the middle of a build, let's let taskcluster have it.
Component: Mozharness → General
Product: Release Engineering → Taskcluster
Version: unspecified → Trunk
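For anyone triaging similar logs: the failure signature named in the new summary can be picked out mechanically. Exit status 127 is what a shell conventionally returns for "command not found", which matches the `cl: command not found` line in the log. The following is only a minimal sketch; the names `SIGNATURE` and `find_signature` are hypothetical and not part of any Mozilla tooling.

```python
import re

# Matches make failure lines of the form quoted in the bug summary, e.g.
#   mozmake.EXE: *** [win32.obj] Error 127
# Only the two targets named in the summary (win32.obj, win64.obj) are covered.
SIGNATURE = re.compile(r"mozmake\.EXE: \*\*\* \[(win32|win64)\.obj\] Error 127")


def find_signature(log_text):
    """Return the object files whose recipes failed with exit status 127."""
    return [match.group(1) + ".obj" for match in SIGNATURE.finditer(log_text)]
```

Calling `find_signature` on the text of a build log returns an empty list when the signature is absent, so it can be used directly as a truth test.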
Pete, do you have some background on this?
Assignee: nobody → pmoore
just a thought: very early in the build-on-taskcluster-windows experiment, we had weird problems with builds failing with strange race conditions that we didn't properly understand. we could make the builds succeed by adding -j1 to the mach command. as i understand it, this forced mach to do everything sequentially rather than in parallel. the downside was that builds using -j1 would take upwards of 4 hours to complete. one day we discovered, through much trial and error, that wrapping calls to mach with bash.exe (instead of python.exe), magically made all the race conditions go away without the use of -j1 so building in parallel just worked. we didn't try to understand why, we were just very happy for our good fortune. i notice that on may 3rd, changes landed that got rid of the magic bash hack (see https://hg.mozilla.org/mozilla-central/rev/843439b1f0d5#l1.22). i also note that this bug was opened 10 days later. i don't know if this is a coincidence.
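To make the workaround above concrete, here is a minimal sketch of choosing how to launch mach: through bash.exe when running under MozillaBuild, and through the Python interpreter otherwise. This is not the actual mozharness code. The function name `mach_command` is hypothetical, and the `msys/bin/bash.exe` location inside the MozillaBuild directory is an assumption.

```python
import os
import sys


def mach_command(src_dir, environ=None):
    """Return the argv prefix used to launch mach from src_dir.

    Hypothetical helper for illustration only.
    """
    environ = os.environ if environ is None else environ
    mach_path = os.path.join(src_dir, "mach")
    if "MOZILLABUILD" in environ:
        # Assumption: bash.exe lives under msys/bin in the MozillaBuild install.
        # Launching mach through bash is the workaround discussed in this bug.
        bash = os.path.join(environ["MOZILLABUILD"], "msys", "bin", "bash.exe")
        return [bash, mach_path]
    # Outside MozillaBuild, run mach with the current Python interpreter.
    return [sys.executable, mach_path]
```

A caller would append the mach arguments, for example `mach_command(src_dir) + ["build"]`. With this approach there is no need to add `-j1`, which was the slow alternative.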
I'm going to push a backout of that patch to see what happens.
Pushed by firstname.lastname@example.org: https://hg.mozilla.org/integration/mozilla-inbound/rev/54163bd59f7b Add back the hack invoking mach via bash to see if it makes the TC build machines happy again. r=pmoore
And so we're clear, here's a rough list of the problems that may have been caused by reverting that hack back in early May: https://docs.google.com/spreadsheets/d/1T5SL6jflnRByIVfNt4-MNiXQje4Ov6qMWiqs42L0hcg/edit#gid=0
I did 10 runs of every TC Windows build job on the push in comment 13 and saw not a single instance of the failures covered by lines 2-7 in the spreadsheet. I think we have a winner! Greg, do you want to investigate this more for a root cause or should we close the bug out when it merges around?
I'd love to investigate the root cause. But my understanding is grenade and others burned hours on this esoteric workaround as part of early TC work. I'm not inclined to spend several hours to reach the same head-scratching conclusion. The workaround - hacky and mysterious as it is - works for me.
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
arr has asked me to annotate the code-base pointing to this bug, to warn future refactor authors
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
reopen is only to get review-board to accept the commit. can be closed again on merge.
Comment on attachment 8876650 [details]
Bug 1364651 - annotate mach bash hack;
https://reviewboard.mozilla.org/r/147998/#review152514
::: testing/mozharness/mozharness/mozilla/building/buildbase.py:1626
(Diff revision 1)
> self.copyfile(
> buildprops,
> os.path.join(dirs['abs_work_dir'], 'buildprops.json'))
>
> if 'MOZILLABUILD' in os.environ:
> + # here be dragons. see bug 1364651
NIT: let's put more actual info into the comment than a scary pointer at a bug: "We found many issues with intermittent build failures when not invoking mach via bash. See bug 1364651 before considering changing" or some such.
All in all though, +1 to commenting and I won't block on a second round of review.
Attachment #8876650 - Flags: review?(bugspam.Callek) → review+
(In reply to Justin Wood (:Callek) from comment #24)
> NIT: lets put more actual info into the comment than a scary pointer at a
> bug:
>
> "We found many issues with intermittent build failures when not invoking
> mach via bash. See bug 1364651 before considering changing"
>
> or some such.
FWIW I (personally) prefer the succinctness and immediate danger of "here be dragons". A wordy comment can be more readily overlooked.
Pushed by email@example.com: https://hg.mozilla.org/integration/autoland/rev/71a661bc483c annotate mach bash hack; r=Callek
Status: REOPENED → RESOLVED
Closed: 2 years ago → 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56