Closed Bug 1269809 Opened 9 years ago Closed 9 years ago

mspdb140.dll is missing from your computer

Categories

(Release Engineering :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: pmoore, Unassigned)

Attachments

(6 files)

When trying to build Firefox opt/debug 32/64 bit builds in https://treeherder.mozilla.org/#/jobs?repo=try&revision=a8c9b9a790db9f22cb354d3739d12fcf45de8778 we are getting a pop-up window on the Windows worker which says:

"The program can't start because mspdb140.dll is missing from your computer. Try reinstalling the program to fix this problem."

I'm able to locate mspdb140.dll (both 32 bit and 64 bit versions):

X:\Task_1462293303\build\src\vs2015u1\VC\bin\amd64\mspdb140.dll
X:\Task_1462293303\build\src\vs2015u1\VC\bin\mspdb140.dll

These should both be in the path, so I'm not sure why they are not getting found.

This is the tail of the log file, at the point the build hangs with this popup:

16:56:35 INFO - mozmake.EXE[5]: Entering directory 'x:/Task_1462292895/build/src/obj-firefox/layout/style'
16:56:35 INFO - c:/mozilla-build/python/python2.7.EXE x:/Task_1462292895/build/src/sccache/sccache.py x:/Task_1462292895/build/src/vs2015u1/VC/bin/amd64_x86/cl.EXE -E -nologo -DNDEBUG=1 -DTRIMMED=1 -DWIN32_LEAN_AND_MEAN -D_WIN32 -DWIN32 -D_CRT_RAND_S -DCERT_CHAIN_PARA_HAS_EXTRA_FIELDS -DOS_WIN=1 -D_UNICODE -DCHROMIUM_BUILD -DU_STATIC_IMPLEMENTATION -DUNICODE -D_WINDOWS -D_SECURE_ATL -DCOMPILER_MSVC -DSTATIC_EXPORTABLE_JS_API -DMOZILLA_INTERNAL_API -DIMPL_LIBXUL -DA11Y_LOG=1 -DACCESSIBILITY=1 -DBUILD_CTYPES=1 -DCROSS_COMPILE='' -DD_INO=d_ino -DE10S_TESTING_ONLY=1 -DEARLY_BETA_OR_EARLIER=1 -DENABLE_INTL_API=1 -DENABLE_MARIONETTE=1 -DENABLE_SYSTEM_EXTENSION_DIRS=1 -DENABLE_TESTS=1 -DEXPOSE_INTL_API=1 -DFIREFOX_VERSION=48.0a1 -DFORCE_PR_LOG=1 -DGTEST_HAS_RTTI=0 -DHAVE_FORCEINLINE=1 -DHAVE_INTTYPES_H=1 -DHAVE_IO_H=1 -DHAVE_ISATTY=1 -DHAVE_LOCALECONV=1 -DHAVE_MALLOC_H=1 -DHAVE_SEH_EXCEPTIONS=1 -DHAVE_STDINT_H=1 -DHAVE_UINT64_T=1 -DJS_DEFAULT_JITREPORT_GRANULARITY=3 -DMALLOC_H='<malloc.h>' -DMALLOC_USABLE_SIZE_CONST_PTR=const -DMOZILLA_OFFICIAL=1 -DMOZILLA_UAVERSION='"48.0"' -DMOZILLA_VERSION='"48.0a1"' -DMOZILLA_VERSION_U=48.0a1 -DMOZ_ACTIVITIES=1 -DMOZ_APP_UA_NAME='""' 
-DMOZ_APP_UA_VERSION='"48.0a1"' -DMOZ_B2G_OS_NAME='""' -DMOZ_B2G_VERSION='"1.0.0"' -DMOZ_BUILD_APP=browser -DMOZ_CONTENT_SANDBOX=1 -DMOZ_CRASHREPORTER=1 -DMOZ_CRASHREPORTER_ENABLE_PERCENT=100 -DMOZ_CRASHREPORTER_INJECTOR=1 -DMOZ_DATA_REPORTING=1 -DMOZ_DEBUG_SYMBOLS=1 -DMOZ_DIRECTSHOW=1 -DMOZ_DISTRIBUTION_ID='"org.mozilla"' -DMOZ_DLL_SUFFIX='".dll"' -DMOZ_EME=1 -DMOZ_ENABLE_PROFILER_SPS=1 -DMOZ_ENABLE_SIGNMAR=1 -DMOZ_ENABLE_SKIA=1 -DMOZ_FEEDS=1 -DMOZ_FFVPX=1 -DMOZ_FMP4=1 -DMOZ_GAMEPAD=1 -DMOZ_GMP_SANDBOX=1 -DMOZ_INSTRUMENT_EVENT_LOOP=1 -DMOZ_JSDOWNLOADS=1 -DMOZ_LIBAV_FFT=1 -DMOZ_LOGGING=1 -DMOZ_MACBUNDLE_ID=org.mozilla.nightly -DMOZ_MAINTENANCE_SERVICE=1 -DMOZ_MEMORY=1 -DMOZ_MEMORY_WINDOWS=1 -DMOZ_MSVC_STL_WRAP_RAISE=1 -DMOZ_PAY=1 -DMOZ_PEERCONNECTION=1 -DMOZ_PERMISSIONS=1 -DMOZ_PHOENIX=1 -DMOZ_PLACES=1 -DMOZ_PROFILING=1 -DMOZ_RAW=1 -DMOZ_REPLACE_MALLOC=1 -DMOZ_RUST_MP4PARSE=1 -DMOZ_SAFE_BROWSING=1 -DMOZ_SAMPLE_TYPE_FLOAT32=1 -DMOZ_SANDBOX=1 -DMOZ_SCTP=1 -DMOZ_SECUREELEMENT=1 -DMOZ_SERVICES_CLOUDSYNC=1 -DMOZ_SERVICES_COMMON=1 -DMOZ_SERVICES_CRYPTO=1 -DMOZ_SERVICES_HEALTHREPORT=1 -DMOZ_SERVICES_SYNC=1 -DMOZ_SOCIAL=1 -DMOZ_SRTP=1 -DMOZ_STACKWALKING=1 -DMOZ_STATIC_JS=1 -DMOZ_TELEMETRY_ON_BY_DEFAULT=1 -DMOZ_TELEMETRY_REPORTING=1 -DMOZ_TREE_CAIRO=1 -DMOZ_TREE_PIXMAN=1 -DMOZ_UPDATER=1 -DMOZ_UPDATE_CHANNEL=default -DMOZ_URL_CLASSIFIER=1 -DMOZ_USER_DIR='"Mozilla"' -DMOZ_VERIFY_MAR_SIGNATURE=1 -DMOZ_VORBIS=1 -DMOZ_VPX_ERROR_CONCEALMENT=1 -DMOZ_VPX_NO_MEM_REPORTING=1 -DMOZ_VTUNE=1 -DMOZ_WEBGL_CONFORMANT=1 -DMOZ_WEBM_ENCODER=1 -DMOZ_WEBRTC=1 -DMOZ_WEBRTC_ASSERT_ALWAYS=1 -DMOZ_WEBRTC_SIGNALING=1 -DMOZ_WEBSPEECH=1 -DMOZ_WEBSPEECH_TEST_BACKEND=1 -DMOZ_WINSDK_MAXVER=0x0A000000 -DMOZ_WINSDK_TARGETVER=0x06030000 -DMOZ_WMF=1 -DMOZ_XUL=1 -DMSVC_HAS_DIA_SDK=1 -DNIGHTLY_BUILD=1 -DNOMINMAX=1 -DNO_NSPR_10_SUPPORT=1 -DNS_ENABLE_TSF=1 -DNS_PRINTING=1 -DNS_PRINT_PREVIEW=1 -DSTATIC_JS_API=1 -DSTDC_HEADERS=1 -DTARGET_XPCOM_ABI='"x86-msvc"' -DUSE_SKIA=1 -DUSE_SKIA_GPU=1 
-DU_STATIC_IMPLEMENTATION=1 -DU_USING_ICU_NAMESPACE=0 -DVPX_X86_ASM=1 -DWIN32=1 -DWIN32_LEAN_AND_MEAN=1 -DWINVER=0x502 -DXP_WIN=1 -DXP_WIN32=1 -DX_DISPLAY_MISSING=1 -D_CRT_NONSTDC_NO_WARNINGS=1 -D_CRT_SECURE_NO_WARNINGS=1 -D_USE_MATH_DEFINES=1 -D_VARIADIC_MAX=10 -D_WIN32_IE=0x0603 -D_WIN32_WINNT=0x502 -D_WINDOWS=1 -D_X86_=1 -DAB_CD=en-US \ 16:56:35 INFO - x:/Task_1462292895/build/src/layout/style/PythonCSSProps.h | \ 16:56:35 INFO - PYTHONDONTWRITEBYTECODE=1 x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe \ 16:56:35 INFO - x:/Task_1462292895/build/src/layout/style/GenerateCSSPropsGenerated.py \ 16:56:35 INFO - x:/Task_1462292895/build/src/layout/style/nsCSSPropsGenerated.inc.in > nsCSSPropsGenerated.inc 16:56:35 INFO - mozmake.EXE[5]: Entering directory 'x:/Task_1462292895/build/src/obj-firefox/dom/base' 16:56:35 INFO - PropertyUseCounterMap.inc 16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe -m mozbuild.action.file_generate x:/Task_1462292895/build/src/dom/base/gen-usecounters.py property_map PropertyUseCounterMap.inc .deps/PropertyUseCounterMap.inc.pp x:/Task_1462292895/build/src/dom/base/UseCounters.conf 16:56:35 INFO - nsStyleStructList.h 16:56:35 INFO - mozmake.EXE[5]: Entering directory 'x:/Task_1462292895/build/src/obj-firefox/xpcom/tests' 16:56:35 INFO - mozmake.EXE[5]: Nothing to be done for 'export'. 
16:56:35 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/xpcom/tests' 16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe -m mozbuild.action.file_generate x:/Task_1462292895/build/src/layout/style/generate-stylestructlist.py main nsStyleStructList.h .deps/nsStyleStructList.h.pp 16:56:35 INFO - UseCounterList.h 16:56:35 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/intl/locale' 16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe -m mozbuild.action.file_generate x:/Task_1462292895/build/src/dom/base/gen-usecounters.py use_counter_list UseCounterList.h .deps/UseCounterList.h.pp x:/Task_1462292895/build/src/dom/base/UseCounters.conf 16:56:35 INFO - PythonCSSProps.h 16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe x:/Task_1462292895/build/src/config/nsinstall.py -t -m 644 'nsStyleStructList.h' '../../dist/include' 16:56:35 INFO - mozmake.EXE[5]: Entering directory 'x:/Task_1462292895/build/src/obj-firefox/accessible/xpcom' 16:56:35 INFO - xpcAccEvents.cpp 16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe -m mozbuild.action.file_generate x:/Task_1462292895/build/src/accessible/xpcom/AccEventGen.py gen_cpp_file xpcAccEvents.cpp .deps/xpcAccEvents.cpp.pp x:/Task_1462292895/build/src/accessible/xpcom/AccEvents.conf 16:56:35 INFO - xpcAccEvents.h 16:56:35 INFO - mozmake.EXE[5]: Entering directory 'x:/Task_1462292895/build/src/obj-firefox/toolkit/components/telemetry' 16:56:35 INFO - TelemetryHistogramData.inc 16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe -m mozbuild.action.file_generate x:/Task_1462292895/build/src/toolkit/components/telemetry/gen-histogram-data.py main TelemetryHistogramData.inc .deps/TelemetryHistogramData.inc.pp x:/Task_1462292895/build/src/toolkit/components/telemetry/Histograms.json 
x:/Task_1462292895/build/src/dom/base/UseCounters.conf x:/Task_1462292895/build/src/dom/base/nsDeprecatedOperationList.h 16:56:35 INFO - TelemetryHistogramEnums.h 16:56:35 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/layout/style' 16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe -m mozbuild.action.file_generate x:/Task_1462292895/build/src/accessible/xpcom/AccEventGen.py gen_header_file xpcAccEvents.h .deps/xpcAccEvents.h.pp x:/Task_1462292895/build/src/accessible/xpcom/AccEvents.conf 16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe x:/Task_1462292895/build/src/config/nsinstall.py -t -m 644 'UseCounterList.h' '../../dist/include/mozilla/dom' 16:56:35 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/netwerk/dns' 16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe -m mozbuild.action.file_generate x:/Task_1462292895/build/src/toolkit/components/telemetry/gen-histogram-enum.py main TelemetryHistogramEnums.h .deps/TelemetryHistogramEnums.h.pp x:/Task_1462292895/build/src/toolkit/components/telemetry/Histograms.json x:/Task_1462292895/build/src/dom/base/UseCounters.conf x:/Task_1462292895/build/src/dom/base/nsDeprecatedOperationList.h 16:56:35 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/dom/base' 16:56:35 INFO - mozmake.EXE[5]: Entering directory 'x:/Task_1462292895/build/src/obj-firefox/toolkit/xre' 16:56:35 INFO - mozmake.EXE[5]: Nothing to be done for 'export'. 16:56:35 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/toolkit/xre' 16:56:35 INFO - mozmake.EXE[5]: Entering directory 'x:/Task_1462292895/build/src/obj-firefox/xpcom/tests/component_no_aslr' 16:56:35 INFO - mozmake.EXE[5]: Nothing to be done for 'export'. 
16:56:35 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/xpcom/tests/component_no_aslr' 16:56:35 INFO - mozmake.EXE[5]: Entering directory 'x:/Task_1462292895/build/src/obj-firefox/intl/locale/windows' 16:56:35 INFO - wincharset.properties.h 16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe -m mozbuild.action.file_generate x:/Task_1462292895/build/src/intl/locale/props2arrays.py main wincharset.properties.h .deps/wincharset.properties.h.pp x:/Task_1462292895/build/src/intl/locale/windows/wincharset.properties 16:56:35 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/intl/locale/windows' 16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe x:/Task_1462292895/build/src/config/nsinstall.py -t -m 644 'xpcAccEvents.h' '../../dist/include' 16:56:35 INFO - mozmake.EXE[5]: Entering directory 'x:/Task_1462292895/build/src/obj-firefox/toolkit/xre/test/win' 16:56:35 INFO - mozmake.EXE[5]: Nothing to be done for 'export'. 16:56:35 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/toolkit/xre/test/win' 16:56:35 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/accessible/xpcom' 16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe x:/Task_1462292895/build/src/config/nsinstall.py -t -m 644 'TelemetryHistogramEnums.h' '../../../dist/include/mozilla' 16:56:36 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/toolkit/components/telemetry' 16:56:37 INFO - mozmake.EXE[5]: Nothing to be done for 'export'. 16:56:37 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/dom/bindings' 16:56:37 INFO - mozmake.EXE[5]: Entering directory 'x:/Task_1462292895/build/src/obj-firefox/dom/bindings/test' 16:56:37 INFO - mozmake.EXE[5]: Nothing to be done for 'export'. 
16:56:37 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/dom/bindings/test'
16:56:44 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/ipc/ipdl'

The task itself is running:

python x:\Task_1462291293\build\src\testing\mozharness\scripts\fx_desktop_build.py --config builds\releng_base_windows_32_builds.py --disable-mock --no-setup-mock --no-checkout-sources --no-clone-tools --no-clobber --no-update --no-upload-files --no-sendchange --log-level=debug --work-dir=x:\Task_1462291293\build --no-action=generate-build-stats --branch=try --build-pool=taskcluster

Here is an example failed run: https://public-artifacts.taskcluster.net/W-aUQu9LQdKDBI6ys_Lycg/0/public/logs/all_commands.log

At the point of the failure, I'm not sure why the pertinent subdirectory of the tooltool binary download of vs2015u1 has disappeared from the PATH. I presume this is the reason it is not being found. I can provide more information if there are any questions!

Please note the AMI was set up according to this configuration: https://github.com/MozRelOps/OpenCloudConfig/blob/9f3f63341081903202237e2c49391b747797b446/userdata/Manifest/win2012.json

The installation steps outlined in this JSON configuration are applied on top of a fresh Windows Server 2012 R2 base image. The task steps can be seen here: https://queue.taskcluster.net/v1/task/W-aUQu9LQdKDBI6ys_Lycg
This is a popup on the computer, not in the log provided. One thing I do see in the log is an sccache failure (bug 1187257). We should ensure we have |set NO_CACHE=1| in the environment to disable configure caching and sccache.
Screenshot of the pop up
Are we sure vcvarsall.bat is executed properly before the build? I see it in the task steps above, so I guess so, but I believe I saw this issue last week and was able to work around it by running "%VCINSTALLDIR%\..\vcvarsall.bat" prior to the build.
vcvarsall.bat shouldn't be relevant to tooltool-based VS2015 builds, as vcvarsall.bat is provided by Microsoft and looks for the files in the standard locations (Program Files). What's likely happening is we're attempting to spawn a process from outside the main client.mk context. This process doesn't inherit PATH which is set when sourcing the mozconfig. Lemme look at the log in more detail...
That pop-up could have been generated by any program, including something during configure. You'll need to build with -j1 with remote desktop enabled and see if you can track down which process is spawning the pop-up. You may be able to identify the process via process monitor: I /think/ the .exe will still be alive as long as the pop-up is visible.
Also, the `hg clone` of mozilla-central at the top of the job is not necessary. Use `hg share` to cut several minutes from the build. Also, you should update to VS2015u2, as that is what mozilla-central is now using.
(In reply to Gregory Szorc [:gps] from comment #5)
> That pop-up could have been generated by any program, including something
> during configure.
>
> You'll need to build with -j1 with remote desktop enabled and see if you can
> track down which process is spawning the pop-up. You may be able to identify
> the process via process monitor: I /think/ the .exe will still be alive as
> long as the pop-up is visible.

Hi guys,

Many thanks for your quick feedback! I should have added, I'm pretty sure the first pop up comes immediately after "Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/ipc/ipdl'".

16:56:37 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/dom/bindings/test'
16:56:44 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/ipc/ipdl'
18:16:44 INFO - Automation Error: mozprocess timed out after 4800 seconds running ['c:\\mozilla-build\\python\\python.exe', 'mach', '--log-no-times', 'build', '-v']

I happened to notice I had a hung build, logged in, and saw the pop up. You will see the timestamp jump here (from 16:56:44 to 18:16:44, where it times out). There have been a few runs, and on other ones I got there before the timeout, clicked "OK", and the build continued happily. I guess this is therefore unrelated to the sccache matters which show up later in the build logs.

Also, thanks for the `hg share` tip. We'll implement that too.

To pick up VS2015u2, is it just a case of rebasing against the latest mozilla-central to get a newer tooltool manifest? We can do that too.

Thanks!
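The timestamp jump described in this comment can be found mechanically. The following is a minimal sketch, not part of the build tooling: it assumes each log line starts with an `HH:MM:SS` timestamp, as in the excerpt above, and the function name `largest_gap` is made up for illustration.

```python
from datetime import datetime

# Tail of the log quoted in the comment above (commands shortened with "...").
LOG_TAIL = [
    "16:56:37 INFO - mozmake.EXE[5]: Leaving directory '.../dom/bindings/test'",
    "16:56:44 INFO - mozmake.EXE[5]: Leaving directory '.../ipc/ipdl'",
    "18:16:44 INFO - Automation Error: mozprocess timed out after 4800 seconds ...",
]


def largest_gap(lines):
    """Return (gap_in_seconds, line_before, line_after) for the biggest jump
    between consecutive timestamped lines. Lines without a leading HH:MM:SS
    timestamp are skipped."""
    best = (0, None, None)
    prev_time, prev_line = None, None
    for line in lines:
        try:
            stamp = datetime.strptime(line[:8], "%H:%M:%S")
        except ValueError:
            continue  # continuation line or other text without a timestamp
        if prev_time is not None:
            gap = (stamp - prev_time).total_seconds()
            if gap < 0:
                gap += 24 * 60 * 60  # assume the log crossed midnight once
            if gap > best[0]:
                best = (int(gap), prev_line, line)
        prev_time, prev_line = stamp, line
    return best


gap, before, after = largest_gap(LOG_TAIL)
print(gap)     # size of the hang in seconds
print(before)  # last line logged before the build stalled
```

On the excerpt above the largest gap equals the 4800 second mozprocess timeout, and the line before it is the `ipc/ipdl` one, which is consistent with the build stalling on the dialog right after that point.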
:gps if it is ok with you, maybe the best is for me and you and grenade to meet (joel welcome too) and we can go through some of this together, to explain how the AMI is set up, what we've done so far etc. It should be mostly transparent from the AMI setup config: https://github.com/MozRelOps/OpenCloudConfig/blob/9f3f63341081903202237e2c49391b747797b446/userdata/Manifest/win2012.json Plus the task definitions, e.g. from the try pushes above. However, it might also be useful to talk it through to explain some of the reasoning behind it. If you have a convenient time tomorrow (Wed 4 May) which works with European timezones too, that would be awesome!
As a note, I got to the ipdl line and it stopped outputting data in my logs. I set NO_CACHE=1 and this fixed the sccache issue in the logs. I assume this is the same spot where we got hung up, and it could be related to mspdb140.dll. pmoore, can you work on ensuring we s/hg clone/hg share/ to speed up replication of the data? Would it be possible to get -j1 into the build process? (Yes, this will slow it down, but it will make error finding much easier.) :gps, for adding -j1, is there a convenient place to put this? I assume this won't be located in mozharness code, but possibly in a mozconfig ac_add_options?
Run `mach build -j1`. If it stopped after the IPDL line, that's possibly the transition between the mostly Python pre-build and compilation. It may get that pop-up on the first invocation of the compiler. Although we should invoke the compiler during configure. So it might be some other random executable.
(In reply to Gregory Szorc [:gps] from comment #6)
> Also, the `hg clone` of mozilla-central at the top of the job is not
> necessary. Use `hg share` to cut several minutes from the build.
>
> Also, you should update to VS2015u2, as that is what mozilla-central is now
> using.

The challenge here is that each task runs as a different (temporary) user that is not in the Administrator group. We do this so that when a task has completed, that user and all its resources can be deleted, and we can have relatively high confidence that the OS is in a clean state, since in theory the temporary task user cannot affect any global state, only its own resources.

If we used hg share as part of the task definition, the shared hg repository would be created by one temporary user, and a subsequent task user could not update it later because it would lack write permissions. If we made the directory writable by all task users, in theory a malicious task could alter the contents, and thus introduce malicious content into a subsequent build.

If there is a way to only inherit objects from the share (i.e. treat it as read-only) but write new objects into a localised repo, this could work. One disadvantage is that the store would slowly become stale. However, workers don't live very long, so this isn't really a problem: only up to around 4 days of commits could be missing.

Another option might be to get the worker itself to manage the shares, rather than the task. This way we might be able to find a way to operate a shared repository safely, such that it could be updated with new objects as each task runs. This would need some thought and design, as at the moment the worker doesn't understand anything about vcs shares; it just executes task definitions.

I need to read up more on hg share to see if it really can be used as a read-only store. If that is the case, it sounds like we'd make significant savings.
I think the reason that bb slaves don't barf on the mspdb140.dll is that we disable jit debuggers there: https://hg.mozilla.org/build/puppet/file/tip/modules/tweaks/manifests/disablejit.pp

We also add a few extra hacks: https://hg.mozilla.org/build/puppet/file/tip/modules/tweaks/manifests/disable_desktop_interruption.pp

Windows 2012 (used by tc windows workertypes) has a new JIT debugger (RyuJIT) so the hacks for disabling JIT are different. I'm experimenting with using this:

Set-ItemProperty -Path HKLM:\Software\Microsoft\.NETFramework -Name useLegacyJit -Type DWord -Value 1

since the keys that we would normally delete already don't exist. I'm hoping the useLegacyJit setting will trigger the legacy JIT, which will detect the absence of the debug registry keys/flags and prevent that dialog box from appearing.
Would we still detect the problem if we disable jit? In other words, would this hide the problem, rather than fix it?
(In reply to Gregory Szorc [:gps] from comment #10)
> Run `mach build -j1`.
>
> If it stopped after the IPDL line, that's possibly the transition between
> the mostly Python pre-build and compilation. It may get that pop-up on the
> first invocation of the compiler. Although we should invoke the compiler
> during configure. So it might be some other random executable.

Running with -j1 caused the problem to disappear entirely. I'm not quite sure why. Removing the -j1 made it reappear, but I noticed this time I could get some more info from the interactive dialogue:

===================================================
Problem signature:
Problem Event Name: APPCRASH
Application Name: cl.EXE
Application Version: 19.0.23918.0
Application Timestamp: 56eb9318
Fault Module Name: mspdb140.dll
Fault Module Version: 6.3.9600.18194
Fault Module Timestamp: 56951674
Exception Code: c0000135
Exception Offset: 00000000000ecdd0
OS Version: 6.3.9600.2.0.0.272.7
Locale ID: 1033
Additional Information 1: ac05
Additional Information 2: ac0507478d1c5bd693cfc4fe3987e900
Additional Information 3: ac05
Additional Information 4: ac0507478d1c5bd693cfc4fe3987e900

Read our privacy statement online: http://go.microsoft.com/fwlink/?linkid=280262
If the online privacy statement is not available, please read our privacy statement offline: C:\Windows\system32\en-US\erofflps.txt
===========================

I'm not sure if this helps diagnosis in any way. After clicking "OK" on the original warning, another warning popped up to say cl.EXE had crashed, and this was the information it had....
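For what it's worth, the exception code in that problem signature, c0000135, is the NTSTATUS value STATUS_DLL_NOT_FOUND, so the crash report is saying the same thing as the original pop-up rather than pointing at a second problem. Below is a small sketch for pulling the fields out of a pasted signature. The helper names and the two-entry lookup table are illustrative only, and the table is deliberately not a complete NTSTATUS list.

```python
# A few fields from the problem signature quoted in the comment above.
SIGNATURE = """\
Problem Event Name: APPCRASH
Application Name: cl.EXE
Application Version: 19.0.23918.0
Fault Module Name: mspdb140.dll
Exception Code: c0000135
"""

# Deliberately tiny lookup table: only codes whose meaning is well known.
KNOWN_NTSTATUS = {
    0xC0000135: "STATUS_DLL_NOT_FOUND",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}


def parse_signature(text):
    """Turn 'Key: Value' lines into a dict, ignoring lines without a colon."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def describe_exception(fields):
    """Map the hexadecimal 'Exception Code' field to a symbolic name."""
    code = int(fields["Exception Code"], 16)
    return KNOWN_NTSTATUS.get(code, "unknown (0x%08X)" % code)


fields = parse_signature(SIGNATURE)
print(fields["Application Name"], fields["Fault Module Name"])
print(describe_exception(fields))
```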
(In reply to Rob Thijssen (:grenade - GMT) from comment #12) > I'm experimenting with using this: > Set-ItemProperty -Path HKLM:\Software\Microsoft\.NETFramework -Name > useLegacyJit -Type DWord -Value 1 Did this fix it for you Rob? Thanks!
Flags: needinfo?(rthijssen)
Hi Greg, Does comment 14 provide any useful information for you about the source? Thanks!
Flags: needinfo?(gps)
Now that we have proven we can get a green build from a taskcluster job: https://treeherder.mozilla.org/#/jobs?repo=try&revision=c22048d7f1243989f6d327af560e90dbb55069ca getting a job to complete in a similar time window is at the top of our list. That means removing -j1 and solving this bug.

As :gps stated in comment 10, this could be the first time we are using cl.exe (possibly the configure step doesn't have the flags needed to depend on mspdb140.dll, or we have the proper environment there as it is not spawned in a different process/thread?).

In comment 12, :grenade mentioned disabling jit debuggers, and that we need to do the same on our new AMIs.

I think the next steps here are to look at disabling the jit debuggers and, if that doesn't work, to work with a build hacker to figure out why mspdb140.dll is not being found when we invoke cl.exe during the build proper.
We need to know which cl.exe invocation is failing. That could be... difficult if this only reproduces in -j>1. You can probably use https://technet.microsoft.com/en-us/sysinternals/processexplorer.aspx to look at the process tree and see what target mozmake.exe is building.
Flags: needinfo?(gps)
Thanks, I'll take a look.
Attached image mspdb140.dll-error.png
Process explorer shows the command which throws the error as: x:\Task_1463424522\build\src\vs2015u2\VC\bin\amd64_x86\cl.EXE "-?" Which was spawned several layers up by this command: x:/Task_1463424522/build/src/mozmake.EXE -C x:/Task_1463424522/build/src/security/nss/lib/nss/../libpkix/pkix/params export "CC= x:/Task_1463424522/build/src/vs2015u2/VC/bin/amd64_x86/cl.EXE" SOURCE_MD_DIR=x:/Task_1463424522/build/src/obj-firefox/dist SOURCE_MDHEADERS_DIR=x:/Task_1463424522/build/src/obj-firefox/dist/include/nspr DIST=x:/Task_1463424522/build/src/obj-firefox/dist NSPR_INCLUDE_DIR=x:/Task_1463424522/build/src/obj-firefox/dist/include/nspr NSPR_LIB_DIR=x:/Task_1463424522/build/src/obj-firefox/dist/lib MOZILLA_CLIENT=1 NO_MDUPDATE=1 NSS_ENABLE_ECC=1 SQLITE_LIB_NAME=nss3 SQLITE_INCLUDE_DIR=x:/Task_1463424522/build/src/obj-firefox/dist/include topsrcdir=x:/Task_1463424522/build/src "BUILD=x:/Task_1463424522/build/src/obj-firefox/security/$(subst $(topsrcdir)/security/,,$(CURDIR))" BUILD_TREE=$(BUILD) OBJDIR=$(BUILD) DEPENDENCIES=$(BUILD)/.deps SINGLE_SHLIB_DIR=$(BUILD) SOURCE_XP_DIR=x:/Task_1463424522/build/src/obj-firefox/dist BUILD_OPT=1 OPT_CODE_SIZE=1 NS_USE_GCC= OS_TARGET=WIN95 NSS_SSL_ENABLE_ZLIB= PROGRAMS= CHECKLOC= FREEBL_NO_DEPEND=0 NSS_NO_PKCS11_BYPASS=1 PUBLIC_EXPORT_DIR=x:/Task_1463424522/build/src/obj-firefox/dist/include/$(MODULE) SOURCE_XPHEADERS_DIR=$(SOURCE_XP_DIR)/include/$(MODULE) "MODULE_INCLUDES=$(addprefix -I$(SOURCE_XP_DIR)/include/,$(REQUIRES))" "MAKE_OBJDIR=$(INSTALL) -D $(OBJDIR)" "TARGETS=$(LIBRARY) $(SHARED_LIBRARY) $(PROGRAM)" NSS_ENABLE_WERROR=0 PYTHON=x:/Task_1463424522/build/src/obj-firefox/_virtualenv/Scripts/python.exe NSINSTALL_PY=x:/Task_1463424522/build/src/config/nsinstall.py "NSINSTALL=$(PYTHON) $(NSINSTALL_PY)" "INSTALL=$(NSINSTALL) -t" PRIVATE_EXPORTS=
Flags: needinfo?(rthijssen) → needinfo?(gps)
It's also worth noting that I have only ever seen this error triggered in 32 bit builds. 64 bit builds sail past every time.
I think comment #22 is spot on. NSS's build system appears to be unsetting/overwriting environment variables we rely on (although I'm not sure where). We didn't run into this before because files were in expected locations. Since Visual Studio isn't installed in the TC environment, NSS fails. I'd try echoing PATH and LIB before it makes that cl.exe call and verify that our custom paths are there. If not, this is either a) not passing the variables to NSS's build system b) the NSS build system trampling on our custom values. Either way is a bug.
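The "echo PATH and verify" check suggested here can also be done against a captured PATH string. This is a rough sketch under stated assumptions: `dirs_providing` is a made-up helper, it only looks at PATH entries (the Windows loader also checks other places first, such as the directory of the executable), and it has to run on the machine where the directories actually exist.

```python
import os


def dirs_providing(filename, path_value, sep=";"):
    """Return the PATH entries, in order, that contain `filename`.

    `path_value` is a raw PATH string such as one copied from a build log
    or from Process Explorer; `sep` is ';' because that is the separator
    Windows uses.
    """
    hits = []
    for entry in path_value.split(sep):
        entry = entry.strip()
        if entry and os.path.isfile(os.path.join(entry, filename)):
            hits.append(entry)
    return hits


if __name__ == "__main__":
    # Check the PATH of the current process for the DLL from this bug.
    found = dirs_providing("mspdb140.dll", os.environ.get("PATH", ""))
    print(found if found else "mspdb140.dll is not in any PATH directory")
```

An empty result inside the NSS submake but a non-empty one in the top-level build would support the theory that the variables are being dropped or overwritten somewhere in between.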
Flags: needinfo?(gps)
For me (on a slightly different windows AMI to :grenade) it is happening a couple of lines further down in the same make file. I can see it is here, because the mozmake call is: c:/Users/Task_1463653946/build/src/mozmake.EXE -C c:/Users/Task_1463653946/build/src/security/nss/lib/crmf export "CC= c:/Users/Task_1463653946/build/src/vs2015u2/VC/bin/amd64_x86/cl.EXE" SOURCE_MD_DIR=c:/Users/Task_1463653946/build/src/obj-firefox/dist SOURCE_MDHEADERS_DIR=c:/Users/Task_1463653946/build/src/obj-firefox/dist/include/nspr DIST=c:/Users/Task_1463653946/build/src/obj-firefox/dist NSPR_INCLUDE_DIR=c:/Users/Task_1463653946/build/src/obj-firefox/dist/include/nspr NSPR_LIB_DIR=c:/Users/Task_1463653946/build/src/obj-firefox/dist/lib MOZILLA_CLIENT=1 NO_MDUPDATE=1 NSS_ENABLE_ECC=1 SQLITE_LIB_NAME=nss3 SQLITE_INCLUDE_DIR=c:/Users/Task_1463653946/build/src/obj-firefox/dist/include topsrcdir=c:/Users/Task_1463653946/build/src "BUILD=c:/Users/Task_1463653946/build/src/obj-firefox/security/$(subst $(topsrcdir)/security/,,$(CURDIR))" BUILD_TREE=$(BUILD) OBJDIR=$(BUILD) DEPENDENCIES=$(BUILD)/.deps SINGLE_SHLIB_DIR=$(BUILD) SOURCE_XP_DIR=c:/Users/Task_1463653946/build/src/obj-firefox/dist BUILD_OPT=1 OPT_CODE_SIZE=1 NS_USE_GCC= OS_TARGET=WIN95 NSS_SSL_ENABLE_ZLIB= PROGRAMS= CHECKLOC= FREEBL_NO_DEPEND=0 NSS_NO_PKCS11_BYPASS=1 PUBLIC_EXPORT_DIR=c:/Users/Task_1463653946/build/src/obj-firefox/dist/include/$(MODULE) SOURCE_XPHEADERS_DIR=$(SOURCE_XP_DIR)/include/$(MODULE) "MODULE_INCLUDES=$(addprefix -I$(SOURCE_XP_DIR)/include/,$(REQUIRES))" "MAKE_OBJDIR=$(INSTALL) -D $(OBJDIR)" "TARGETS=$(LIBRARY) $(SHARED_LIBRARY) $(PROGRAM)" NSS_ENABLE_WERROR=0 PYTHON=c:/Users/Task_1463653946/build/src/obj-firefox/_virtualenv/Scripts/python.exe NSINSTALL_PY=c:/Users/Task_1463653946/build/src/config/nsinstall.py "NSINSTALL=$(PYTHON) $(NSINSTALL_PY)" "INSTALL=$(NSINSTALL) -t" PRIVATE_EXPORTS= And this is calling a sh.exe process which is running the following temporary file: $ cat 
'C:/Users/TEMP~1.WIN/AppData/Local/Temp/make3604-2.sh' c:/Users/Task_1463653946/build/src/vs2015u2/VC/bin/amd64_x86/cl.EXE 2>&1 | sed -ne 's|.* \([0-9]\+\.[0-9]\+\.[0-9]\+\(\.[0-9]\+\)\?\).*|\1|p' This cc+sed expression matches only this makefile line in the build system, so it must be this one: https://hg.mozilla.org/try/file/e524199f3299/security/nss/coreconf/WIN32.mk#l43 The pertinent env vars of the cl.EXE process are: LIB=c:\Users\Task_1463653946\build\src\vs2015u2\VC\lib;c:\Users\Task_1463653946\build\src\vs2015u2\VC\atlmfc\lib;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\lib\ucrt\x86;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\lib\um\x86;c:\Users\Task_1463653946\build\src\vs2015u2\DIASDK\lib PATH=c:\Users\Task_1463653946\build\src\vs2015u2\VC\redist\x86\Microsoft.VC140.CRT;c:\Users\Task_1463653946\build\src\vs2015u2\VC\redist\x64\Microsoft.VC140.CRT;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\Redist\ucrt\DLLs\x86;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\Redist\ucrt\DLLs\x64;c:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\amd64_x86;c:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\amd64;c:\Users\Task_1463653946\build\src\vs2015u2\VC\bin;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\bin\x86;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\bin\x64;c:\Users\Task_1463653946\build\src\vs2015u2\DIASDK\bin;c:\mozilla-build\nsis-3.0b1;c:\mozilla-build\python;C:\mozilla-build\msys\local\bin;c:\mozilla-build\7zip;c:\mozilla-build\info-zip;c:\mozilla-build\python\Scripts;c:\mozilla-build\yasm;C:\mozilla-build\msys\bin;c:\Windows\system32;c:\mozilla-build\upx391w;c:\mozilla-build\python\lib\site-packages\pywin32_system32;c:\mozilla-build\python\lib\site-packages\pywin32_system32;c:\mozilla-build\python\lib\site-packages\pywin32_system32 
INCLUDE=c:\Users\Task_1463653946\build\src\vs2015u2\VC\include;c:\Users\Task_1463653946\build\src\vs2015u2\VC\atlmfc\include;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\Include\ucrt;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\Include\shared;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\Include\um;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\Include\winrt;c:\Users\Task_1463653946\build\src\vs2015u2\DIASDK\include The last lines of the log before the freeze occurs (due to the interactive dialog) are: 11:43:16 INFO - mozmake.EXE[5]: Leaving directory 'c:/Users/Task_1463653946/build/src/obj-firefox/toolkit/xre/test/win' 11:43:16 INFO - c:/Users/Task_1463653946/build/src/obj-firefox/_virtualenv/Scripts/python.exe c:/Users/Task_1463653946/build/src/config/nsinstall.py -t -m 644 'xpcAccEvents.h' '../../dist/include' 11:43:16 INFO - mozmake.EXE[5]: Leaving directory 'c:/Users/Task_1463653946/build/src/obj-firefox/intl/locale/windows' 11:43:16 INFO - c:/Users/Task_1463653946/build/src/obj-firefox/_virtualenv/Scripts/python.exe c:/Users/Task_1463653946/build/src/config/nsinstall.py -t -m 644 'TelemetryHistogramEnums.h' '../../../dist/include/mozilla' 11:43:16 INFO - mozmake.EXE[5]: Leaving directory 'c:/Users/Task_1463653946/build/src/obj-firefox/accessible/xpcom' 11:43:17 INFO - mozmake.EXE[5]: Leaving directory 'c:/Users/Task_1463653946/build/src/obj-firefox/toolkit/components/telemetry' 11:43:20 INFO - mozmake.EXE[5]: Nothing to be done for 'export'. 11:43:20 INFO - mozmake.EXE[5]: Leaving directory 'c:/Users/Task_1463653946/build/src/obj-firefox/dom/bindings' 11:43:20 INFO - mozmake.EXE[5]: Entering directory 'c:/Users/Task_1463653946/build/src/obj-firefox/dom/bindings/test' 11:43:20 INFO - mozmake.EXE[5]: Nothing to be done for 'export'. 
11:43:20 INFO - mozmake.EXE[5]: Leaving directory 'c:/Users/Task_1463653946/build/src/obj-firefox/dom/bindings/test'
11:43:28 INFO - mozmake.EXE[5]: Leaving directory 'c:/Users/Task_1463653946/build/src/obj-firefox/ipc/ipdl'

I have no explanation for why the extra debug lines that I added are not included in the log: https://hg.mozilla.org/try/file/e524199f3299/security/nss/coreconf/WIN32.mk#l35

However, it does not matter too much since I could grab the env vars from ProcessExplorer.

This is my try push, for reference: https://treeherder.mozilla.org/#/jobs?repo=try&revision=e524199f3299ae789f0fe82045a7dccc5f14e6f7

Aside from this missing dll problem, I'm also wondering if the call should be:

c:/Users/Task_1463653946/build/src/vs2015u2/VC/bin/amd64_x86/cl.EXE -v

rather than:

c:/Users/Task_1463653946/build/src/vs2015u2/VC/bin/amd64_x86/cl.EXE

although this wouldn't change the fact that it can't find mspdb140.dll. My suggestion is just because -v is used in other places (e.g. https://hg.mozilla.org/try/file/e524199f3299/nsprpub/configure.in#l1892)
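To make it clearer what the `cl.EXE 2>&1 | sed -ne ...` pipeline quoted in this comment is for: it runs the compiler with no arguments and keeps only the version number from the banner. Here is a rough Python equivalent of the sed expression, as a sketch. The sample banner text is an illustrative assumption about what cl.EXE prints, not output captured from these builds.

```python
import re

# Same pattern as the sed expression: greedy ".* " up to a space, then three
# dot-separated numbers with an optional fourth, then the rest of the line.
VERSION_RE = re.compile(r".* ([0-9]+\.[0-9]+\.[0-9]+(?:\.[0-9]+)?).*")


def extract_versions(output):
    """Mimic `sed -ne 's|...|\\1|p'`: for every line that matches, emit only
    the captured version string; lines that do not match are dropped."""
    found = []
    for line in output.splitlines():
        match = VERSION_RE.match(line)
        if match:
            found.append(match.group(1))
    return found


# Illustrative banner only; the real text comes from running cl.EXE.
SAMPLE_BANNER = (
    "Microsoft (R) C/C++ Optimizing Compiler Version 19.00.23918 for x86\n"
    "Copyright (C) Microsoft Corporation.  All rights reserved.\n"
    "\n"
    "usage: cl [ option... ] filename... [ /link linkoption... ]\n"
)

print(extract_versions(SAMPLE_BANNER))
```

The point for this bug is that the version probe starts cl.EXE itself, so it hits the same DLL loading as a real compile and can raise the dialog even though nothing is being compiled yet.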
Flags: needinfo?(gps)
FWIW, the DLLs exist here: C:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\amd64\mspdb140.dll C:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\mspdb140.dll This build is a win32 opt build. I'm assuming C:\Users\Task_1463653946\build\src\vs2015u2\VC\bin is missing from one or more of PATH/LIB/LIBPATH/INCLUDE. I didn't explicitly state it in comment 24, but LIBPATH isn't set at all in the cl.EXE process.
We add those dirs to PATH in this mozconfig: https://dxr.mozilla.org/mozilla-central/rev/c67dc1f9fab86d4f2cf3224307809c44fe3ce820/build/win32/mozconfig.vs2015-win64#10 Maybe something is going awry with the way we spawn submakes to build NSS. (PATH is used as the equivalent of LD_LIBRARY_PATH on Windows, FYI.)
(In reply to Pete Moore [:pmoore][:pete] from comment #24) > I have no explanation for why these lines extra debug lines that I added are > not included in the log: > https://hg.mozilla.org/try/file/e524199f3299/security/nss/coreconf/WIN32. > mk#l35 Looking again, they are in the log, just not at the point where the build hangs... e.g. 11:42:44 INFO - ../../coreconf/WIN32.mk:35: in WIN32.mk LIB is c:\Users\Task_1463653946\build\src\vs2015u2\VC\lib;c:\Users\Task_1463653946\build\src\vs2015u2\VC\atlmfc\lib;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\lib\ucrt\x86;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\lib\um\x86;c:\Users\Task_1463653946\build\src\vs2015u2\DIASDK\lib 11:42:44 INFO - ../../coreconf/WIN32.mk:36: in WIN32.mk PATH is c:\Users\Task_1463653946\build\src\vs2015u2\VC\redist\x86\Microsoft.VC140.CRT;c:\Users\Task_1463653946\build\src\vs2015u2\VC\redist\x64\Microsoft.VC140.CRT;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\Redist\ucrt\DLLs\x86;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\Redist\ucrt\DLLs\x64;c:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\amd64_x86;c:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\amd64;c:\Users\Task_1463653946\build\src\vs2015u2\VC\bin;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\bin\x86;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\bin\x64;c:\Users\Task_1463653946\build\src\vs2015u2\DIASDK\bin;c:\mozilla-build\nsis-3.0b1;c:\mozilla-build\python;C:\mozilla-build\msys\local\bin;c:\mozilla-build\7zip;c:\mozilla-build\info-zip;c:\mozilla-build\python\Scripts;c:\mozilla-build\yasm;C:\mozilla-build\msys\bin;c:\Windows\system32;c:\mozilla-build\upx391w;c:\mozilla-build\python\lib\site-packages\pywin32_system32;c:\mozilla-build\python\lib\site-packages\pywin32_system32;c:\mozilla-build\python\lib\site-packages\pywin32_system32 These match exactly the results of ProcessExplorer in comment 24 - just wanted to confirm the logging worked...
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #26) > We add those dirs to PATH in this mozconfig: > https://dxr.mozilla.org/mozilla-central/rev/ > c67dc1f9fab86d4f2cf3224307809c44fe3ce820/build/win32/mozconfig.vs2015- > win64#10 > > Maybe something is going awry with the way we spawn submakes to build NSS. > (PATH is used as the equivalent of LD_LIBRARY_PATH on Windows, FYI.) so the PATH does contain the dirs. The complete PATH of the failing process is: c:\Users\Task_1463653946\build\src\vs2015u2\VC\redist\x86\Microsoft.VC140.CRT c:\Users\Task_1463653946\build\src\vs2015u2\VC\redist\x64\Microsoft.VC140.CRT c:\Users\Task_1463653946\build\src\vs2015u2\SDK\Redist\ucrt\DLLs\x86 c:\Users\Task_1463653946\build\src\vs2015u2\SDK\Redist\ucrt\DLLs\x64 c:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\amd64_x86 c:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\amd64 c:\Users\Task_1463653946\build\src\vs2015u2\VC\bin c:\Users\Task_1463653946\build\src\vs2015u2\SDK\bin\x86 c:\Users\Task_1463653946\build\src\vs2015u2\SDK\bin\x64 c:\Users\Task_1463653946\build\src\vs2015u2\DIASDK\bin c:\mozilla-build\nsis-3.0b1 c:\mozilla-build\python C:\mozilla-build\msys\local\bin c:\mozilla-build\7zip c:\mozilla-build\info-zip c:\mozilla-build\python\Scripts c:\mozilla-build\yasm C:\mozilla-build\msys\bin c:\Windows\system32 c:\mozilla-build\upx391w c:\mozilla-build\python\lib\site-packages\pywin32_system32 c:\mozilla-build\python\lib\site-packages\pywin32_system32 c:\mozilla-build\python\lib\site-packages\pywin32_system32 The 32/64 bit versions exist here: C:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\amd64\mspdb140.dll C:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\mspdb140.dll And the error message that appears as an interactive dialogue as the user Task_1463653946 says: "The program can't start because mspdb140.dll is missing from your computer. Try reinstalling the program to fix this problem." 
It should pick up "C:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\amd64\mspdb140.dll" since this is the first one to appear in the PATH. Is the problem instead that "C:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\amd64\mspdb140.dll" is the 64 bit version of mspdb140.dll and the firefox desktop 32 bit opt build needs the version "C:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\mspdb140.dll" instead? If this is the case, I'm surprised the error is not something like "mspdb140.dll has wrong architecture - expecting 32 bit, but found 64 bit" etc. Any ideas?
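One way to check which copy of a DLL a PATH-based lookup would reach first, and which architecture that copy is, is sketched below. It is a minimal sketch, not the Windows loader: it walks only the PATH entries (the real loader also consults other locations, such as the executable's own directory, before PATH), and the helper names `pe_machine` and `find_on_path` are hypothetical. The machine values 0x014C (x86) and 0x8664 (x64) come from the PE/COFF file header.

```python
import os
import struct

# "Machine" field values from the PE/COFF file header.
MACHINE_NAMES = {0x014C: "x86", 0x8664: "x64"}


def pe_machine(path):
    """Return 'x86', 'x64', or a hex string for the PE file at `path`."""
    with open(path, "rb") as f:
        dos_header = f.read(0x40)
        if len(dos_header) < 0x40 or dos_header[:2] != b"MZ":
            raise ValueError("not a PE file: %s" % path)
        # Offset 0x3C of the DOS header stores the offset of the PE signature.
        (pe_offset,) = struct.unpack_from("<I", dos_header, 0x3C)
        f.seek(pe_offset)
        pe_header = f.read(6)
    if len(pe_header) < 6 or pe_header[:4] != b"PE\0\0":
        raise ValueError("missing PE signature: %s" % path)
    # The 16-bit Machine field immediately follows the 4-byte signature.
    (machine,) = struct.unpack_from("<H", pe_header, 4)
    return MACHINE_NAMES.get(machine, hex(machine))


def find_on_path(name, path_value, sep=";"):
    """Yield each PATH entry that contains `name`, in search order."""
    for directory in path_value.split(sep):
        candidate = os.path.join(directory, name)
        if directory and os.path.isfile(candidate):
            yield candidate
```

On a worker this could be run as `for hit in find_on_path("mspdb140.dll", os.environ["PATH"]): print(hit, pe_machine(hit))`, and the result compared with `pe_machine` of the cl.EXE being launched.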
Flags: needinfo?(ted)
(In reply to Pete Moore [:pmoore][:pete] from comment #14) > =================================================== > > Problem signature: > Problem Event Name: APPCRASH > Application Name: cl.EXE > Application Version: 19.0.23918.0 > Application Timestamp: 56eb9318 > Fault Module Name: mspdb140.dll > Fault Module Version: 6.3.9600.18194 > Fault Module Timestamp: 56951674 > Exception Code: c0000135 > Exception Offset: 00000000000ecdd0 > OS Version: 6.3.9600.2.0.0.272.7 > Locale ID: 1033 > Additional Information 1: ac05 > Additional Information 2: ac0507478d1c5bd693cfc4fe3987e900 > Additional Information 3: ac05 > Additional Information 4: ac0507478d1c5bd693cfc4fe3987e900 > > Read our privacy statement online: > http://go.microsoft.com/fwlink/?linkid=280262 > > If the online privacy statement is not available, please read our privacy > statement offline: > C:\Windows\system32\en-US\erofflps.txt > > =========================== Revisiting this problem signature, the module name is already cited as mspdb140.dll, and as an application crash. This makes me think that the OS successfully loaded the DLL, and then crashed, rather than was not able to load the DLL. This might also suggest it was using the 64 bit version instead of the 32 bit version, rather than not being able to find a version at all.
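For reference, the Exception Code in the signature is an NTSTATUS value and can be decoded. c0000135 is STATUS_DLL_NOT_FOUND, which would point at a DLL that could not be located rather than one that loaded and then crashed; a wrong-architecture image is more commonly reported as c000007b (STATUS_INVALID_IMAGE_FORMAT). The lookup below is a small sketch covering only three codes, and the helper name `describe_exception_code` is hypothetical.

```python
# A few well-known NTSTATUS values; this table is deliberately incomplete.
NTSTATUS_NAMES = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC000007B: "STATUS_INVALID_IMAGE_FORMAT",
    0xC0000135: "STATUS_DLL_NOT_FOUND",
}


def describe_exception_code(code_text):
    """Map a hex exception code string (e.g. 'c0000135') to its name."""
    code = int(code_text, 16)
    return NTSTATUS_NAMES.get(code, "unknown (0x%08X)" % code)
```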
Could the problem be that vs2015u2/VC/bin/amd64_x86/mspdb140.dll is missing from the tooltool artifact "55814aaabcd4aa51fe85918ec02a8c29bc067d41ee79ddcfd628daaba5a06d4241a73a51bf5a8bc69cc762b52551009f44b05e65682c45b4684c17fb2d017c2c" referenced in browser/config/tooltool-manifests/win32/releng.manifest? These versions exist in the tooltool binary: vs2015u2/VC/bin/amd64/mspdb140.dll vs2015u2/VC/bin/mspdb140.dll but this version doesn't: vs2015u2/VC/bin/amd64_x86/mspdb140.dll However, the "amd64_x86" directory exists in PATH, earlier than the other two PATH entries, which makes me wonder if it is finding the amd64 version rather than the amd64_x86 version, which perhaps it needs instead?
I'm sticking an extra -? in the cl command to see if that fixes it. https://hg.mozilla.org/try/rev/7ecd7df1f1f2c0141408a7a406c2acd1315f06b1 Since line 1.6 in this diff executes without problems, I'm hoping 1.13 will.
NOPE! gps, over to you! :)
Dependency walker reports problems for vs2015u2\vc\bin\amd64_x86\CL.EXE but not for its vs2015u2\vc\bin\amd64\MSPDB140.DLL dependency.... This screenshot shows the vs2015u2\vc\bin\amd64\MSPDB140.DLL dependencies...
All dependency problems occur instead under c:\windows\system32\SHELL32.DLL (so MSPDB140.DLL seems to be ok). These are the missing dependencies of c:\windows\system32\SHELL32.DLL: API-MS-WIN-CORE-KERNEL32-PRIVATE-L1-1-1.DLL API-MS-WIN-CORE-PRIVATEPROFILE-L1-1-1.DLL API-MS-WIN-SERVICE-PRIVATE-L1-1-1.DLL API-MS-WIN-CORE-SHUTDOWN-L1-1-1.DLL EXT-MS-WIN-NTUSER-UICONTEXT-EXT-L1-1-0.DLL IESHIMS.DLL MFPLAT.DLL SETTINGSYNCPOLICY.DLL WLANAPI.DLL
This is the .dwi I was able to generate from http://www.dependencywalker.com/ for vs2015u2/VC/bin/amd64_x86/cl.EXE on a live running worker (https://tools.taskcluster.net/task-inspector/#QDUOOFTjTIOR1fcnRxhm4A/0).
Note, when I ran the dependency walker, I set the PATH correctly to the same value I had in the hanging process, and removed the default module searches so that it only used the PATH I had set.
This is a "mini" dump from ProcessExplorer for the cl.exe process. I can also attach a "full" dump if required, but it is ~8MB so I'll hold off for now unless someone needs it.
Looking again at https://bugzilla.mozilla.org/attachment.cgi?id=8754912 it seems the red colour suggests a warning: http://www.dependencywalker.com/help/html/hidr_module_list_view.htm I wonder if this is because the parent modules are 64 bit, and the PATH is set to find the 32 bit dependencies first. I can have a dig...
A naïve try push where I've slightly adjusted the PATH ordering to favour x86 dirs over amd64 dirs in the 32 bit builds: https://treeherder.mozilla.org/#/jobs?repo=try&revision=cfaf1c48cc8da9656f2913e34e5ac9f5461ef942 It is currently running, let's see how it turns out....
(In reply to Pete Moore [:pmoore][:pete] from comment #39) > A naïve try push where I've slightly adjusted the PATH ordering to favour > x86 dirs over amd64 dirs in the 32 bit builds: > > https://treeherder.mozilla.org/#/ > jobs?repo=try&revision=cfaf1c48cc8da9656f2913e34e5ac9f5461ef942 > > It is currently running, let's see how it turns out.... No, it didn't help. :(
(In reply to Pete Moore [:pmoore][:pete] from comment #30) > Could the problem be that > > vs2015u2/VC/bin/amd64_x86/mspdb140.dll is missing from the tooltool artifact On my Windows 10 desktop at work, I have the following mspdb140.dll files: /c/Program Files (x86)/Microsoft Visual Studio 14.0/Common7/IDE/mspdb140.dll /c/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/amd64/mspdb140.dll /c/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/mspdb140.dll
I attempted to reproduce this on my Windows 10 desktop and didn't have much luck. I started a generic MozillaBuild shell with the vs2015u2.zip archive extracted in the topsrcdir (read: no vcvarsall.bat). Perhaps mspdb140.dll is still getting picked up from my VS2015 install. Perhaps I should try with a fresh VM.
Flags: needinfo?(gps)
(In reply to Gregory Szorc [:gps] from comment #42) > I attempted to reproduce this on my Windows 10 desktop and didn't have much > luck. I started a generic MozillaBuild shell with the vs2015u2.zip archive > extracted in the topsrcdir (read: no vcvarsall.bat). Perhaps mspdb140.dll is > still getting picked up from my VS2015 install. Perhaps I should try with a > fresh VM. We can consistently reproduce on the taskcluster workers - I can provide credentials for logging into a box where ProcessExplorer and dependencywalker are installed, where the warning window is popped up, and the process is hung. That might be easier than reproducing locally. If you like we can step through it together in a live vidyo call, and share a screen with the RDP session.
So, I've tested and seen today that the issue doesn't occur on a mach build that isn't wrapped by mozharness: https://treeherder.mozilla.org/#/jobs?repo=try&revision=1a2a64024f329f8074532c975f950085323841b5 This at least narrows the problem to 32 bit builds that are triggered with a mozharness process parenting the mach process.
Flags: needinfo?(gps)
oh, mozharness is problematic! Possibly we can make a call to shift our build to using mach instead? That might not solve it for all build types, but it would get us closer to builds in automation matching local builds.
I'd like to try to replicate the firefox build that you are running after starting the MozillaBuild shell on one of the TC builders to see if we don't get the mspdb errors when not running under Mozharness (our builds all start with a mozharness call like: python -u %WORKSPACE%\build\src\testing\mozharness\scripts\fx_desktop_build.py --config %MOZHARNESS_CONFIG% %MH_CUSTOM_BUILD_VARIANT_FLAGS% --branch=%MH_BRANCH% --build-pool=%MH_BUILD_POOL% --skip-buildbot-actions --work-dir=%WORKSPACE%\build --clone-tools --build), but I suspect that is more build system oriented than what you guys are running when you replicate. Can you let us know so we can try it out?
Flags: needinfo?(gps)
We shouldn't need to drop mozharness, just adjust the mach call to not be wrapped by python. I have a test running here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=7f6045e17a3c8efc9f96cfd03adae8a24fdcbacd https://hg.mozilla.org/try/rev/7f6045e17a3c8efc9f96cfd03adae8a24fdcbacd#l8.21 I think it has already passed the point where we normally get the mspdb140 error. I could do with some help with tidying up the bash wrapper though. My 'bash.exe' call won't work in a lot of places. I just wanted to validate that this would fix it.
well, pretty lousy fix actually. seems very slow compared to running under python (without -j1)
(In reply to Rob Thijssen (:grenade - GMT) from comment #43) > I'd like to try to replicate the firefox build that you are running after > starting the MozillaBuild shell on one of the TC builders to see if we don't > get the mspdb errors when not running under Mozharness (our builds all start > with a mozharness call like: python -u > %WORKSPACE%\build\src\testing\mozharness\scripts\fx_desktop_build.py > --config %MOZHARNESS_CONFIG% %MH_CUSTOM_BUILD_VARIANT_FLAGS% > --branch=%MH_BRANCH% --build-pool=%MH_BUILD_POOL% --skip-buildbot-actions > --work-dir=%WORKSPACE%\build --clone-tools --build), The bareword "python" here is somewhat scary. I think it should use the full path to a Python executable, preferably something with "2.7" in the path, as "python" is reserved for the most modern version of Python, which will some day be Python 3.
(In reply to Gregory Szorc [:gps] from comment #49) > The bareword "python" here is somewhat scary. I think it should use the full > path to a Python executable, preferably something with "2.7" in the path, as > "python" is reserved for the most modern version of Python, which will some > day be Python 3. Understood. We can easily fix that. For clarity, these builders use python from mozilla-build and have no other python installed. Everything that goes onto the builders is visible in the manifest at https://github.com/MozRelOps/OpenCloudConfig/blob/master/userdata/Manifest/win2012.json. C:\Users\Administrator>python --version Python 2.7.11 C:\Users\Administrator>where python C:\mozilla-build\python\python.exe
So bizarrely, I just wrote a powershell script to run against a completely clean win2012r2 ec2 instance, that installs the prerequisites, configures the environment, and then runs a win32 opt release build, to demonstrate the problem. However, for whatever reason, it did not exhibit the problem. The only real difference I see between this example, and a real live worker, is that in this example I run the build as Administrator (whereas normally the generic worker would spawn a new user to run the task). I'll see if I can adapt the powershell script to run the task as a different user, and see if that reproduces the problem. We can still go ahead with our meeting, as I'm able to set up a test case to demonstrate the problem on a real worker - the reason I created the powershell was just to have a single script that can run against a clean instance, to demonstrate the problem, without needing a worker and everything else installed.
We (gps, ted, grenade, chmanchester, jlund) had a meeting on Friday 27 May (2016) and determined that the problem with the failing cl.exe process is that it is getting an msys style path rather than a win32 style PATH, e.g. c/Users/....:c/....:....:... rather than C:/Users/....;C:/....;.....;.... or C:\Users\....;C:\....;....;.... - and this breaks the cl.exe process. We were not able to establish why this doesn't happen when run with -j1. We looked into msys source code to try to get a better idea, but did not reach a conclusion. :grenade mentioned that when running mach from inside mozharness by invoking bash rather than python, the problem seems to disappear. This may be a temporary solution. Strangely, the builds can take twice as long just because of the different invocation style, but I do not know why. I feared it may be related to the tasks running using a temporary local profile, so adapted the generic worker to create a local profile for each task user. This unfortunately did not resolve the problem and therefore is not the cause. I've arranged a meeting with glandium to discuss, as gps mentioned he has some expertise in this area. I confirmed that succeeding cl.exe processes indeed do have a win32 style PATH, as expected.
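The two PATH styles described above can be told apart mechanically, which may help when logging the environment of a failing process. The following is a minimal sketch under stated assumptions: it only recognises drive-rooted entries (it does not handle msys entries such as /usr/bin), and the function name `path_style` and both regular expressions are hypothetical, not taken from msys or the build system.

```python
import re

# win32 style: "C:\dir" or "C:/dir", entries separated by ';'.
WIN32_ENTRY = re.compile(r"^[A-Za-z]:[\\/]")
# msys style: "c/dir" or "/c/dir", entries separated by ':'.
MSYS_ENTRY = re.compile(r"^/?[A-Za-z]/")


def path_style(value):
    """Classify a PATH string as 'win32', 'msys', or 'unknown'."""
    win_entries = [e for e in value.split(";") if e]
    if win_entries and all(WIN32_ENTRY.match(e) for e in win_entries):
        return "win32"
    msys_entries = [e for e in value.split(":") if e]
    if ";" not in value and msys_entries and all(
        MSYS_ENTRY.match(e) for e in msys_entries
    ):
        return "msys"
    return "unknown"
```

A debug hook could log `path_style(os.environ["PATH"])` just before a compiler process is spawned, to see which invocations receive the msys form.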
:grenade has a workaround for this problem at the moment, which is to call mozharness from msys bash. We don't know why this works around the issue, but it seems to. This is still an issue, but removing it as a blocking bug for bug 1244750 since we have a workaround in place.
No longer blocks: 1244750
See Also: → 1279167
Flags: needinfo?(ted)
See Also: → 1244750
See https://bugzilla.mozilla.org/show_bug.cgi?id=1280325#c2 - it may be that bug 1280325 solves this problem, we'll need a try push with the workaround (calling mozharness via bash) disabled to see if it has or not.
Try push to see if we can now call mozharness natively: https://treeherder.allizom.org/#/jobs?repo=try&selectedJob=23058969
Hey Rob, Did we ever find a proper solution for this, or are we still building with -j1 ? I can't remember if/how this got resolved! Thanks!
Flags: needinfo?(rthijssen)
This issue may not be reproducible since bug 1295937 landed, since this was occurring in the NSS build and we're no longer using that build system.
what ted said
Flags: needinfo?(rthijssen)
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → INCOMPLETE
Component: General Automation → General