Bug 1269809: mspdb140.dll is missing from your computer
Status: RESOLVED INCOMPLETE (opened 9 years ago, closed 9 years ago)
Category: Release Engineering :: General (defect)
Tracking: Not tracked
Reporter: pmoore; Assignee: Unassigned
Attachments: 6 files
When trying to build Firefox opt/debug 32/64 bit builds in https://treeherder.mozilla.org/#/jobs?repo=try&revision=a8c9b9a790db9f22cb354d3739d12fcf45de8778 we are getting a pop-up window on the Windows worker which says:
"The program can't start because mspdb140.dll is missing from your computer. Try reinstalling the program to fix this problem."
I'm able to locate mspdb140.dll (both 32 bit and 64 bit versions):
X:\Task_1462293303\build\src\vs2015u1\VC\bin\amd64\mspdb140.dll
X:\Task_1462293303\build\src\vs2015u1\VC\bin\mspdb140.dll
These should both be in the PATH, so I'm not sure why they are not being found.
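One way to test the "should both be in the PATH" assumption is to list which PATH entries actually contain the DLL. The following is a minimal sketch, not part of the build system: the function name `dirs_containing` and the injectable `exists` parameter are made up for illustration, and the sketch only covers the PATH portion of the Windows DLL search (the loader also checks the executable's own directory and the system directories first).

```python
import ntpath
import os


def dirs_containing(filename, path_value, exists=os.path.isfile):
    """Return the PATH entries (in order) that contain `filename`.

    `path_value` is a Windows-style PATH string (entries separated by ';').
    `exists` is injectable so the logic can be exercised on any host.
    """
    hits = []
    for entry in path_value.split(";"):
        entry = entry.strip().strip('"')
        if not entry:
            continue  # skip empty entries produced by ';;' or a trailing ';'
        candidate = ntpath.join(entry, filename)
        if exists(candidate):
            hits.append(entry)
    return hits


if __name__ == "__main__":
    # On the worker, run this inside the same shell that launches the build.
    for directory in dirs_containing("mspdb140.dll", os.environ.get("PATH", "")):
        print(directory)
```

If this prints nothing when run from the build's own environment, the directories listed above were not on the PATH at that point.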
This is the tail of the log file, at the point the build hangs with this popup:
16:56:35 INFO - mozmake.EXE[5]: Entering directory 'x:/Task_1462292895/build/src/obj-firefox/layout/style'
16:56:35 INFO - c:/mozilla-build/python/python2.7.EXE x:/Task_1462292895/build/src/sccache/sccache.py x:/Task_1462292895/build/src/vs2015u1/VC/bin/amd64_x86/cl.EXE -E -nologo -DNDEBUG=1 -DTRIMMED=1 -DWIN32_LEAN_AND_MEAN -D_WIN32 -DWIN32 -D_CRT_RAND_S -DCERT_CHAIN_PARA_HAS_EXTRA_FIELDS -DOS_WIN=1 -D_UNICODE -DCHROMIUM_BUILD -DU_STATIC_IMPLEMENTATION -DUNICODE -D_WINDOWS -D_SECURE_ATL -DCOMPILER_MSVC -DSTATIC_EXPORTABLE_JS_API -DMOZILLA_INTERNAL_API -DIMPL_LIBXUL -DA11Y_LOG=1 -DACCESSIBILITY=1 -DBUILD_CTYPES=1 -DCROSS_COMPILE='' -DD_INO=d_ino -DE10S_TESTING_ONLY=1 -DEARLY_BETA_OR_EARLIER=1 -DENABLE_INTL_API=1 -DENABLE_MARIONETTE=1 -DENABLE_SYSTEM_EXTENSION_DIRS=1 -DENABLE_TESTS=1 -DEXPOSE_INTL_API=1 -DFIREFOX_VERSION=48.0a1 -DFORCE_PR_LOG=1 -DGTEST_HAS_RTTI=0 -DHAVE_FORCEINLINE=1 -DHAVE_INTTYPES_H=1 -DHAVE_IO_H=1 -DHAVE_ISATTY=1 -DHAVE_LOCALECONV=1 -DHAVE_MALLOC_H=1 -DHAVE_SEH_EXCEPTIONS=1 -DHAVE_STDINT_H=1 -DHAVE_UINT64_T=1 -DJS_DEFAULT_JITREPORT_GRANULARITY=3 -DMALLOC_H='<malloc.h>' -DMALLOC_USABLE_SIZE_CONST_PTR=const -DMOZILLA_OFFICIAL=1 -DMOZILLA_UAVERSION='"48.0"' -DMOZILLA_VERSION='"48.0a1"' -DMOZILLA_VERSION_U=48.0a1 -DMOZ_ACTIVITIES=1 -DMOZ_APP_UA_NAME='""' -DMOZ_APP_UA_VERSION='"48.0a1"' -DMOZ_B2G_OS_NAME='""' -DMOZ_B2G_VERSION='"1.0.0"' -DMOZ_BUILD_APP=browser -DMOZ_CONTENT_SANDBOX=1 -DMOZ_CRASHREPORTER=1 -DMOZ_CRASHREPORTER_ENABLE_PERCENT=100 -DMOZ_CRASHREPORTER_INJECTOR=1 -DMOZ_DATA_REPORTING=1 -DMOZ_DEBUG_SYMBOLS=1 -DMOZ_DIRECTSHOW=1 -DMOZ_DISTRIBUTION_ID='"org.mozilla"' -DMOZ_DLL_SUFFIX='".dll"' -DMOZ_EME=1 -DMOZ_ENABLE_PROFILER_SPS=1 -DMOZ_ENABLE_SIGNMAR=1 -DMOZ_ENABLE_SKIA=1 -DMOZ_FEEDS=1 -DMOZ_FFVPX=1 -DMOZ_FMP4=1 -DMOZ_GAMEPAD=1 -DMOZ_GMP_SANDBOX=1 -DMOZ_INSTRUMENT_EVENT_LOOP=1 -DMOZ_JSDOWNLOADS=1 -DMOZ_LIBAV_FFT=1 -DMOZ_LOGGING=1 -DMOZ_MACBUNDLE_ID=org.mozilla.nightly -DMOZ_MAINTENANCE_SERVICE=1 -DMOZ_MEMORY=1 -DMOZ_MEMORY_WINDOWS=1 -DMOZ_MSVC_STL_WRAP_RAISE=1 -DMOZ_PAY=1 -DMOZ_PEERCONNECTION=1 -DMOZ_PERMISSIONS=1 -DMOZ_PHOENIX=1 -DMOZ_PLACES=1 
-DMOZ_PROFILING=1 -DMOZ_RAW=1 -DMOZ_REPLACE_MALLOC=1 -DMOZ_RUST_MP4PARSE=1 -DMOZ_SAFE_BROWSING=1 -DMOZ_SAMPLE_TYPE_FLOAT32=1 -DMOZ_SANDBOX=1 -DMOZ_SCTP=1 -DMOZ_SECUREELEMENT=1 -DMOZ_SERVICES_CLOUDSYNC=1 -DMOZ_SERVICES_COMMON=1 -DMOZ_SERVICES_CRYPTO=1 -DMOZ_SERVICES_HEALTHREPORT=1 -DMOZ_SERVICES_SYNC=1 -DMOZ_SOCIAL=1 -DMOZ_SRTP=1 -DMOZ_STACKWALKING=1 -DMOZ_STATIC_JS=1 -DMOZ_TELEMETRY_ON_BY_DEFAULT=1 -DMOZ_TELEMETRY_REPORTING=1 -DMOZ_TREE_CAIRO=1 -DMOZ_TREE_PIXMAN=1 -DMOZ_UPDATER=1 -DMOZ_UPDATE_CHANNEL=default -DMOZ_URL_CLASSIFIER=1 -DMOZ_USER_DIR='"Mozilla"' -DMOZ_VERIFY_MAR_SIGNATURE=1 -DMOZ_VORBIS=1 -DMOZ_VPX_ERROR_CONCEALMENT=1 -DMOZ_VPX_NO_MEM_REPORTING=1 -DMOZ_VTUNE=1 -DMOZ_WEBGL_CONFORMANT=1 -DMOZ_WEBM_ENCODER=1 -DMOZ_WEBRTC=1 -DMOZ_WEBRTC_ASSERT_ALWAYS=1 -DMOZ_WEBRTC_SIGNALING=1 -DMOZ_WEBSPEECH=1 -DMOZ_WEBSPEECH_TEST_BACKEND=1 -DMOZ_WINSDK_MAXVER=0x0A000000 -DMOZ_WINSDK_TARGETVER=0x06030000 -DMOZ_WMF=1 -DMOZ_XUL=1 -DMSVC_HAS_DIA_SDK=1 -DNIGHTLY_BUILD=1 -DNOMINMAX=1 -DNO_NSPR_10_SUPPORT=1 -DNS_ENABLE_TSF=1 -DNS_PRINTING=1 -DNS_PRINT_PREVIEW=1 -DSTATIC_JS_API=1 -DSTDC_HEADERS=1 -DTARGET_XPCOM_ABI='"x86-msvc"' -DUSE_SKIA=1 -DUSE_SKIA_GPU=1 -DU_STATIC_IMPLEMENTATION=1 -DU_USING_ICU_NAMESPACE=0 -DVPX_X86_ASM=1 -DWIN32=1 -DWIN32_LEAN_AND_MEAN=1 -DWINVER=0x502 -DXP_WIN=1 -DXP_WIN32=1 -DX_DISPLAY_MISSING=1 -D_CRT_NONSTDC_NO_WARNINGS=1 -D_CRT_SECURE_NO_WARNINGS=1 -D_USE_MATH_DEFINES=1 -D_VARIADIC_MAX=10 -D_WIN32_IE=0x0603 -D_WIN32_WINNT=0x502 -D_WINDOWS=1 -D_X86_=1 -DAB_CD=en-US \
16:56:35 INFO - x:/Task_1462292895/build/src/layout/style/PythonCSSProps.h | \
16:56:35 INFO - PYTHONDONTWRITEBYTECODE=1 x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe \
16:56:35 INFO - x:/Task_1462292895/build/src/layout/style/GenerateCSSPropsGenerated.py \
16:56:35 INFO - x:/Task_1462292895/build/src/layout/style/nsCSSPropsGenerated.inc.in > nsCSSPropsGenerated.inc
16:56:35 INFO - mozmake.EXE[5]: Entering directory 'x:/Task_1462292895/build/src/obj-firefox/dom/base'
16:56:35 INFO - PropertyUseCounterMap.inc
16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe -m mozbuild.action.file_generate x:/Task_1462292895/build/src/dom/base/gen-usecounters.py property_map PropertyUseCounterMap.inc .deps/PropertyUseCounterMap.inc.pp x:/Task_1462292895/build/src/dom/base/UseCounters.conf
16:56:35 INFO - nsStyleStructList.h
16:56:35 INFO - mozmake.EXE[5]: Entering directory 'x:/Task_1462292895/build/src/obj-firefox/xpcom/tests'
16:56:35 INFO - mozmake.EXE[5]: Nothing to be done for 'export'.
16:56:35 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/xpcom/tests'
16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe -m mozbuild.action.file_generate x:/Task_1462292895/build/src/layout/style/generate-stylestructlist.py main nsStyleStructList.h .deps/nsStyleStructList.h.pp
16:56:35 INFO - UseCounterList.h
16:56:35 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/intl/locale'
16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe -m mozbuild.action.file_generate x:/Task_1462292895/build/src/dom/base/gen-usecounters.py use_counter_list UseCounterList.h .deps/UseCounterList.h.pp x:/Task_1462292895/build/src/dom/base/UseCounters.conf
16:56:35 INFO - PythonCSSProps.h
16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe x:/Task_1462292895/build/src/config/nsinstall.py -t -m 644 'nsStyleStructList.h' '../../dist/include'
16:56:35 INFO - mozmake.EXE[5]: Entering directory 'x:/Task_1462292895/build/src/obj-firefox/accessible/xpcom'
16:56:35 INFO - xpcAccEvents.cpp
16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe -m mozbuild.action.file_generate x:/Task_1462292895/build/src/accessible/xpcom/AccEventGen.py gen_cpp_file xpcAccEvents.cpp .deps/xpcAccEvents.cpp.pp x:/Task_1462292895/build/src/accessible/xpcom/AccEvents.conf
16:56:35 INFO - xpcAccEvents.h
16:56:35 INFO - mozmake.EXE[5]: Entering directory 'x:/Task_1462292895/build/src/obj-firefox/toolkit/components/telemetry'
16:56:35 INFO - TelemetryHistogramData.inc
16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe -m mozbuild.action.file_generate x:/Task_1462292895/build/src/toolkit/components/telemetry/gen-histogram-data.py main TelemetryHistogramData.inc .deps/TelemetryHistogramData.inc.pp x:/Task_1462292895/build/src/toolkit/components/telemetry/Histograms.json x:/Task_1462292895/build/src/dom/base/UseCounters.conf x:/Task_1462292895/build/src/dom/base/nsDeprecatedOperationList.h
16:56:35 INFO - TelemetryHistogramEnums.h
16:56:35 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/layout/style'
16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe -m mozbuild.action.file_generate x:/Task_1462292895/build/src/accessible/xpcom/AccEventGen.py gen_header_file xpcAccEvents.h .deps/xpcAccEvents.h.pp x:/Task_1462292895/build/src/accessible/xpcom/AccEvents.conf
16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe x:/Task_1462292895/build/src/config/nsinstall.py -t -m 644 'UseCounterList.h' '../../dist/include/mozilla/dom'
16:56:35 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/netwerk/dns'
16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe -m mozbuild.action.file_generate x:/Task_1462292895/build/src/toolkit/components/telemetry/gen-histogram-enum.py main TelemetryHistogramEnums.h .deps/TelemetryHistogramEnums.h.pp x:/Task_1462292895/build/src/toolkit/components/telemetry/Histograms.json x:/Task_1462292895/build/src/dom/base/UseCounters.conf x:/Task_1462292895/build/src/dom/base/nsDeprecatedOperationList.h
16:56:35 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/dom/base'
16:56:35 INFO - mozmake.EXE[5]: Entering directory 'x:/Task_1462292895/build/src/obj-firefox/toolkit/xre'
16:56:35 INFO - mozmake.EXE[5]: Nothing to be done for 'export'.
16:56:35 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/toolkit/xre'
16:56:35 INFO - mozmake.EXE[5]: Entering directory 'x:/Task_1462292895/build/src/obj-firefox/xpcom/tests/component_no_aslr'
16:56:35 INFO - mozmake.EXE[5]: Nothing to be done for 'export'.
16:56:35 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/xpcom/tests/component_no_aslr'
16:56:35 INFO - mozmake.EXE[5]: Entering directory 'x:/Task_1462292895/build/src/obj-firefox/intl/locale/windows'
16:56:35 INFO - wincharset.properties.h
16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe -m mozbuild.action.file_generate x:/Task_1462292895/build/src/intl/locale/props2arrays.py main wincharset.properties.h .deps/wincharset.properties.h.pp x:/Task_1462292895/build/src/intl/locale/windows/wincharset.properties
16:56:35 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/intl/locale/windows'
16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe x:/Task_1462292895/build/src/config/nsinstall.py -t -m 644 'xpcAccEvents.h' '../../dist/include'
16:56:35 INFO - mozmake.EXE[5]: Entering directory 'x:/Task_1462292895/build/src/obj-firefox/toolkit/xre/test/win'
16:56:35 INFO - mozmake.EXE[5]: Nothing to be done for 'export'.
16:56:35 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/toolkit/xre/test/win'
16:56:35 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/accessible/xpcom'
16:56:35 INFO - x:/Task_1462292895/build/src/obj-firefox/_virtualenv/Scripts/python.exe x:/Task_1462292895/build/src/config/nsinstall.py -t -m 644 'TelemetryHistogramEnums.h' '../../../dist/include/mozilla'
16:56:36 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/toolkit/components/telemetry'
16:56:37 INFO - mozmake.EXE[5]: Nothing to be done for 'export'.
16:56:37 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/dom/bindings'
16:56:37 INFO - mozmake.EXE[5]: Entering directory 'x:/Task_1462292895/build/src/obj-firefox/dom/bindings/test'
16:56:37 INFO - mozmake.EXE[5]: Nothing to be done for 'export'.
16:56:37 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/dom/bindings/test'
16:56:44 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/ipc/ipdl'
The task itself is running:
python x:\Task_1462291293\build\src\testing\mozharness\scripts\fx_desktop_build.py --config builds\releng_base_windows_32_builds.py --disable-mock --no-setup-mock --no-checkout-sources --no-clone-tools --no-clobber --no-update --no-upload-files --no-sendchange --log-level=debug --work-dir=x:\Task_1462291293\build --no-action=generate-build-stats --branch=try --build-pool=taskcluster
Here is an example failed run:
https://public-artifacts.taskcluster.net/W-aUQu9LQdKDBI6ys_Lycg/0/public/logs/all_commands.log
At the point of the failure, I'm not sure why the pertinent subdirectory of the tooltool binary download of vs2015u1 has disappeared from the PATH. I presume this is the reason it is not being found.
More information can be provided if there are any questions!
Please note the AMI was set up according to this configuration:
https://github.com/MozRelOps/OpenCloudConfig/blob/9f3f63341081903202237e2c49391b747797b446/userdata/Manifest/win2012.json
The installation steps outlined in this JSON configuration are applied on top of a fresh Windows Server 2012 R2 base image.
The task steps can be seen here:
https://queue.taskcluster.net/v1/task/W-aUQu9LQdKDBI6ys_Lycg
Comment 1•9 years ago
This is a pop-up on the computer; it does not appear in the log provided. One thing I do see in the log is an sccache failure (bug 1187257). We should ensure we have |set NO_CACHE=1| in the environment to disable configure caching and sccache.
Comment 2•9 years ago (Reporter)
Screenshot of the pop up
Comment 3•9 years ago
Are we sure vcvarsall.bat is executed properly before the build? I see it in the task steps above, so I guess so, but I believe I saw this issue last week and was able to work around it by running "%VCINSTALLDIR%\..\vcvarsall.bat" prior to the build.
Comment 4•9 years ago
vcvarsall.bat shouldn't be relevant to tooltool-based VS2015 builds, as vcvarsall.bat is provided by Microsoft and looks for the files in the standard locations (Program Files).
What's likely happening is we're attempting to spawn a process from outside the main client.mk context. This process doesn't inherit PATH which is set when sourcing the mozconfig.
Lemme look at the log in more detail...
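The inheritance point above can be illustrated with a small sketch: a child process only sees the variables in the environment it is handed. Everything here is hypothetical (the variable name `DEMO_TOOLCHAIN_DIR` and its value are made up); it does not reproduce how client.mk or the mozconfig set PATH.

```python
import os
import subprocess
import sys

# Child command: report whether a marker variable is visible to it.
CHILD = [
    sys.executable,
    "-c",
    "import os; print(os.environ.get('DEMO_TOOLCHAIN_DIR', 'MISSING'))",
]


def run_child(env):
    """Run the child with the given environment and return what it printed."""
    result = subprocess.run(CHILD, env=env, capture_output=True, text=True, check=True)
    return result.stdout.strip()


parent_env = dict(os.environ)
parent_env["DEMO_TOOLCHAIN_DIR"] = "vs2015u1/VC/bin"  # hypothetical marker value

# Environment passed through unchanged: the child sees the marker.
inherited = run_child(parent_env)

# Environment rebuilt without the marker: the child no longer sees it.
stripped_env = {k: v for k, v in parent_env.items() if k != "DEMO_TOOLCHAIN_DIR"}
stripped = run_child(stripped_env)

print(inherited, stripped)
```

The same mechanism would apply to PATH: if some layer of the build launches a process with an environment that was assembled before or without the mozconfig's PATH changes, that process cannot locate the toolchain DLLs.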
Comment 5•9 years ago
That pop-up could have been generated by any program, including something during configure.
You'll need to build with -j1 with remote desktop enabled and see if you can track down which process is spawning the pop-up. You may be able to identify the process via process monitor: I /think/ the .exe will still be alive as long as the pop-up is visible.
Comment 6•9 years ago
Also, the `hg clone` of mozilla-central at the top of the job is not necessary. Use `hg share` to cut several minutes from the build.
Also, you should update to VS2015u2, as that is what mozilla-central is now using.
Comment 7•9 years ago (Reporter)
(In reply to Gregory Szorc [:gps] from comment #5)
> That pop-up could have been generated by any program, including something
> during configure.
>
> You'll need to build with -j1 with remote desktop enabled and see if you can
> track down which process is spawning the pop-up. You may be able to identify
> the process via process monitor: I /think/ the .exe will still be alive as
> long as the pop-up is visible.
Hi guys,
Many thanks for your quick feedback!
I should have added, I'm pretty sure the first pop up comes immediately after "Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/ipc/ipdl'".
16:56:37 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/dom/bindings/test'
16:56:44 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/ipc/ipdl'
18:16:44 INFO - Automation Error: mozprocess timed out after 4800 seconds running ['c:\\mozilla-build\\python\\python.exe', 'mach', '--log-no-times', 'build', '-v']
I happened to notice I had a hung build, and then logged in, and saw the pop up.
You will see the timestamp jump here (from 16:56:44 -> 18:16:44, where it times out). There have been a few runs, and on other ones, I got there before the timeout, clicked "OK" and then it continued happily.
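A hang like this can be located mechanically by looking for the largest gap between consecutive log timestamps. The sketch below assumes every relevant line starts with `HH:MM:SS INFO -`, as in the excerpt above; the function name `largest_gap` is illustrative.

```python
import re
from datetime import datetime, timedelta

STAMP = re.compile(r"^(\d{2}:\d{2}:\d{2})\s+INFO\s+-")


def largest_gap(lines):
    """Return (gap_seconds, line_before, line_after) for the longest pause.

    Lines without a leading timestamp are ignored. A negative difference is
    treated as a single midnight rollover.
    """
    best = (0, None, None)
    prev_time, prev_line = None, None
    for line in lines:
        match = STAMP.match(line)
        if not match:
            continue
        stamp = datetime.strptime(match.group(1), "%H:%M:%S")
        if prev_time is not None:
            delta = stamp - prev_time
            if delta < timedelta(0):
                delta += timedelta(days=1)
            gap = int(delta.total_seconds())
            if gap > best[0]:
                best = (gap, prev_line, line)
        prev_time, prev_line = stamp, line
    return best


sample = [
    "16:56:37 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/dom/bindings/test'",
    "16:56:44 INFO - mozmake.EXE[5]: Leaving directory 'x:/Task_1462292895/build/src/obj-firefox/ipc/ipdl'",
    "18:16:44 INFO - Automation Error: mozprocess timed out after 4800 seconds",
]

gap, before, after = largest_gap(sample)
print(gap)     # 4800
print(before)  # the ipc/ipdl line
```

The 4800-second gap matches the mozprocess timeout, which is consistent with the build producing no output at all while the dialog was open.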
I guess this is therefore unrelated to the sccache matters which show up later in the build logs.
Also, thanks for the `hg share` tip. We'll implement that too.
To pick up VS2015u2, is it just a case of rebasing against the latest mozilla-central to get a newer tooltool manifest? We can do that too.
Thanks!
Comment 8•9 years ago (Reporter)
:gps if it is ok with you, maybe the best is for me and you and grenade to meet (joel welcome too) and we can go through some of this together, to explain how the AMI is set up, what we've done so far etc.
It should be mostly transparent from the AMI setup config:
https://github.com/MozRelOps/OpenCloudConfig/blob/9f3f63341081903202237e2c49391b747797b446/userdata/Manifest/win2012.json
Plus the task definitions, e.g. from the try pushes above. However, it might also be useful to talk it through to explain some of the reasoning behind it. If you have a convenient time tomorrow (Wed 4 May) which works with European timezones too, that would be awesome!
Comment 9•9 years ago
As a note, I got to the ipdl line and it stopped outputting data in my logs. I did set NO_CACHE=1, and this fixed the sccache issue in the logs.
I assume this is the same spot where we got hung up, and it could be related to mspdb140.dll.
pmoore, can you work on ensuring we s/hg clone/hg share/ to speed up replication of the data? Would it be possible to get -j1 into the build process? (Yes, this will slow it down, but it will make error finding much easier.)
:gps, for adding -j1, is there a convenient place to put this? I assume this won't be located in mozharness code, but possibly a mozconfig ac_add_options?
Comment 10•9 years ago
Run `mach build -j1`.
If it stopped after the IPDL line, that's possibly the transition between the mostly Python pre-build and compilation. It may get that pop-up on the first invocation of the compiler. Although we should invoke the compiler during configure. So it might be some other random executable.
Comment 11•9 years ago (Reporter)
(In reply to Gregory Szorc [:gps] from comment #6)
> Also, the `hg clone` of mozilla-central at the top of the job is not
> necessary. Use `hg share` to cut several minutes from the build.
>
> Also, you should update to VS2015u2, as that is what mozilla-central is now
> using.
The challenge here is that each task runs as a different (temporary) user, not in the Administrator group. We do this so that when a task has completed, that user and all its resources can be deleted, and we can have relatively high confidence that the OS is in a clean state, since in theory the temporary task user cannot affect any global state, only its own resources.
If hg share were used as part of the task definition, the shared hg repository would be created by one temporary user, and a subsequent task user could not update it because it would lack write permission. If we made the directory writable by all task users, a malicious task could in theory alter the contents, and thus introduce malicious content into a subsequent build.
If there is a way to only inherit objects from the share but write new objects into a localised repo (i.e. treating the share as read-only), this could work. One disadvantage is that the store would slowly become stale, but workers don't live very long, so this isn't really a problem: at most around 4 days of commits could be missing.
Another option might be to get the worker itself, rather than the task, to manage the shares. This way we might be able to find a way to operate a shared repository safely, such that it could be updated with new objects as each task runs. This would need some thought and design, as at the moment the worker doesn't understand anything about vcs shares; it just executes task definitions.
I need to read up more on hg share to see if it really can be used as a read-only store - if that is the case, sounds like we'd make significant savings.
Comment 12•9 years ago
I think the reason that bb slaves don't barf on the pdb140 dll is that we disable jit debuggers there:
https://hg.mozilla.org/build/puppet/file/tip/modules/tweaks/manifests/disablejit.pp
We also add a few extra hacks:
https://hg.mozilla.org/build/puppet/file/tip/modules/tweaks/manifests/disable_desktop_interruption.pp
Windows 2012 (used by tc windows workertypes) has a new JIT debugger (RyuJIT) so the hacks for disabling JIT are different.
I'm experimenting with using this:
Set-ItemProperty -Path HKLM:\Software\Microsoft\.NETFramework -Name useLegacyJit -Type DWord -Value 1
I'm trying this because the keys that we would normally delete don't exist in the first place. I'm hoping the useLegacyJit setting will trigger the legacy JIT, which will detect the absence of the debug registry keys/flags and prevent that dialog box from appearing.
Comment 13•9 years ago (Reporter)
Would we still detect the problem if we disable jit? In other words, would this hide the problem, rather than fix it?
Comment 14•9 years ago (Reporter)
(In reply to Gregory Szorc [:gps] from comment #10)
> Run `mach build -j1`.
>
> If it stopped after the IPDL line, that's possibly the transition between
> the mostly Python pre-build and compilation. It may get that pop-up on the
> first invocation of the compiler. Although we should invoke the compiler
> during configure. So it might be some other random executable.
Running with -j1 caused the problem to disappear entirely. I'm not quite sure why.
Removing the -j1 made it reappear, but I noticed this time I could get some more info from the interactive dialogue:
===================================================
Problem signature:
Problem Event Name: APPCRASH
Application Name: cl.EXE
Application Version: 19.0.23918.0
Application Timestamp: 56eb9318
Fault Module Name: mspdb140.dll
Fault Module Version: 6.3.9600.18194
Fault Module Timestamp: 56951674
Exception Code: c0000135
Exception Offset: 00000000000ecdd0
OS Version: 6.3.9600.2.0.0.272.7
Locale ID: 1033
Additional Information 1: ac05
Additional Information 2: ac0507478d1c5bd693cfc4fe3987e900
Additional Information 3: ac05
Additional Information 4: ac0507478d1c5bd693cfc4fe3987e900
Read our privacy statement online:
http://go.microsoft.com/fwlink/?linkid=280262
If the online privacy statement is not available, please read our privacy statement offline:
C:\Windows\system32\en-US\erofflps.txt
===========================
I'm not sure if this helps diagnosis in any way. After clicking "OK" on the original warning, another warning popped up to say cl.EXE had crashed, and this was the information it had....
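The exception code in the problem signature can be decoded as an NTSTATUS value, which may help here: c0000135 is STATUS_DLL_NOT_FOUND, so the crash report is consistent with the original "missing from your computer" dialog. The sketch below splits the value into its standard fields; the lookup table is deliberately partial and contains only the value seen in this report.

```python
# Partial lookup table: only the value seen in this report is listed.
KNOWN_NTSTATUS = {
    0xC0000135: "STATUS_DLL_NOT_FOUND",
}

SEVERITY = {0: "success", 1: "informational", 2: "warning", 3: "error"}


def decode_ntstatus(text):
    """Split a hex NTSTATUS string into severity, facility and code fields.

    Layout: bits 31-30 severity, bits 27-16 facility, bits 15-0 code.
    """
    value = int(text, 16)
    return {
        "value": value,
        "severity": SEVERITY[(value >> 30) & 0x3],
        "facility": (value >> 16) & 0xFFF,
        "code": value & 0xFFFF,
        "name": KNOWN_NTSTATUS.get(value, "unknown"),
    }


info = decode_ntstatus("c0000135")
print(info["severity"], info["name"])  # error STATUS_DLL_NOT_FOUND
```

In other words, the second dialog does not describe a separate failure inside mspdb140.dll; it reports that a required DLL could not be loaded by cl.EXE.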
Comment 15•9 years ago (Reporter)
(In reply to Rob Thijssen (:grenade - GMT) from comment #12)
> I'm experimenting with using this:
> Set-ItemProperty -Path HKLM:\Software\Microsoft\.NETFramework -Name
> useLegacyJit -Type DWord -Value 1
Did this fix it for you Rob?
Thanks!
Flags: needinfo?(rthijssen)
Comment 16•9 years ago (Reporter)
Hi Greg,
Does comment 14 provide any useful information for you about the source?
Thanks!
Flags: needinfo?(gps)
Comment 17•9 years ago
Now that we have proven we can get a green build from a taskcluster job:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=c22048d7f1243989f6d327af560e90dbb55069ca
getting a job to complete in a similar time window is at the top of our list. That means removing -j1 and solving this bug.
As :gps stated in comment 10, this could be the first time we are using cl.exe (possibly the configure step doesn't use the flags that make cl.exe depend on mspdb140.dll, or it has the proper environment because it is not spawned in a different process/thread?).
In comment 12, :grenade mentioned disabling JIT debuggers, and that we need to do the same on our new AMIs.
I think the next steps here are to look at disabling the JIT debuggers and, if that doesn't work, to work with a build hacker to figure out why mspdb140.dll is not being found when we invoke cl.exe during the build proper.
Comment 18•9 years ago
We need to know which cl.exe invocation is failing. That could be... difficult if this only reproduces in -j>1. You can probably use https://technet.microsoft.com/en-us/sysinternals/processexplorer.aspx to look at the process tree and see what target mozmake.exe is building.
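The manual Process Explorer walk described above amounts to following parent links from the failing cl.exe until a mozmake.exe ancestor is reached. A minimal sketch of that walk over a process snapshot follows; the snapshot, its PIDs and its command lines are entirely hypothetical, and how the snapshot is obtained is left out.

```python
def ancestry(pid, table):
    """Yield (pid, cmdline) from `pid` up towards the root of the tree.

    `table` maps pid -> (parent_pid, cmdline). The seen-set guards against
    loops caused by PID reuse in a stale snapshot.
    """
    seen = set()
    while pid in table and pid not in seen:
        seen.add(pid)
        parent, cmdline = table[pid]
        yield pid, cmdline
        pid = parent


def first_make_ancestor(pid, table):
    """Return the command line of the nearest mozmake ancestor, if any."""
    for _pid, cmdline in ancestry(pid, table):
        if cmdline.lower().startswith("mozmake"):
            return cmdline
    return None


# Hypothetical snapshot; PIDs and command lines are made up for illustration.
snapshot = {
    100: (1, "python mach build -v"),
    200: (100, "mozmake.EXE -C some/dir export"),
    300: (200, "sh.exe make-tmp.sh"),
    400: (300, "cl.EXE"),
}

print(first_make_ancestor(400, snapshot))  # mozmake.EXE -C some/dir export
```

The `-C` directory on the mozmake command line found this way identifies which part of the tree was being built when the compiler was launched.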
Flags: needinfo?(gps)
Comment 19•9 years ago (Reporter)
Thanks, I'll take a look.
Comment 20•9 years ago
Process explorer shows the command which throws the error as:
x:\Task_1463424522\build\src\vs2015u2\VC\bin\amd64_x86\cl.EXE "-?"
Which was spawned several layers up by this command:
x:/Task_1463424522/build/src/mozmake.EXE -C x:/Task_1463424522/build/src/security/nss/lib/nss/../libpkix/pkix/params export "CC= x:/Task_1463424522/build/src/vs2015u2/VC/bin/amd64_x86/cl.EXE" SOURCE_MD_DIR=x:/Task_1463424522/build/src/obj-firefox/dist SOURCE_MDHEADERS_DIR=x:/Task_1463424522/build/src/obj-firefox/dist/include/nspr DIST=x:/Task_1463424522/build/src/obj-firefox/dist NSPR_INCLUDE_DIR=x:/Task_1463424522/build/src/obj-firefox/dist/include/nspr NSPR_LIB_DIR=x:/Task_1463424522/build/src/obj-firefox/dist/lib MOZILLA_CLIENT=1 NO_MDUPDATE=1 NSS_ENABLE_ECC=1 SQLITE_LIB_NAME=nss3 SQLITE_INCLUDE_DIR=x:/Task_1463424522/build/src/obj-firefox/dist/include topsrcdir=x:/Task_1463424522/build/src "BUILD=x:/Task_1463424522/build/src/obj-firefox/security/$(subst $(topsrcdir)/security/,,$(CURDIR))" BUILD_TREE=$(BUILD) OBJDIR=$(BUILD) DEPENDENCIES=$(BUILD)/.deps SINGLE_SHLIB_DIR=$(BUILD) SOURCE_XP_DIR=x:/Task_1463424522/build/src/obj-firefox/dist BUILD_OPT=1 OPT_CODE_SIZE=1 NS_USE_GCC= OS_TARGET=WIN95 NSS_SSL_ENABLE_ZLIB= PROGRAMS= CHECKLOC= FREEBL_NO_DEPEND=0 NSS_NO_PKCS11_BYPASS=1 PUBLIC_EXPORT_DIR=x:/Task_1463424522/build/src/obj-firefox/dist/include/$(MODULE) SOURCE_XPHEADERS_DIR=$(SOURCE_XP_DIR)/include/$(MODULE) "MODULE_INCLUDES=$(addprefix -I$(SOURCE_XP_DIR)/include/,$(REQUIRES))" "MAKE_OBJDIR=$(INSTALL) -D $(OBJDIR)" "TARGETS=$(LIBRARY) $(SHARED_LIBRARY) $(PROGRAM)" NSS_ENABLE_WERROR=0 PYTHON=x:/Task_1463424522/build/src/obj-firefox/_virtualenv/Scripts/python.exe NSINSTALL_PY=x:/Task_1463424522/build/src/config/nsinstall.py "NSINSTALL=$(PYTHON) $(NSINSTALL_PY)" "INSTALL=$(NSINSTALL) -t" PRIVATE_EXPORTS=
Flags: needinfo?(rthijssen) → needinfo?(gps)
Comment 21•9 years ago
It's also worth noting that I have only ever seen this error triggered in 32-bit builds. 64-bit builds sail past every time.
Comment 22•9 years ago (Reporter)
Could this be:
https://hg.mozilla.org/mozilla-central/file/a884b96685aa/security/nss/lib/libpkix/pkix/params/Makefile
hitting one of these(?):
https://hg.mozilla.org/mozilla-central/file/a884b96685aa/security/nss/coreconf/WIN32.mk#l35
https://hg.mozilla.org/mozilla-central/file/a884b96685aa/security/nss/coreconf/Werror.mk#l29
Comment 23•9 years ago
I think comment #22 is spot on. NSS's build system appears to be unsetting/overwriting environment variables we rely on (although I'm not sure where). We didn't run into this before because files were in expected locations. Since Visual Studio isn't installed in the TC environment, NSS fails.
I'd try echoing PATH and LIB before it makes that cl.exe call and verify that our custom paths are there. If they are not, then either a) we are not passing the variables to NSS's build system, or b) the NSS build system is trampling on our custom values. Either way, it is a bug.
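Once PATH has been echoed, the verification step can be done mechanically rather than by eye. The sketch below reports which expected toolchain directories are absent from a PATH string, comparing with Windows rules (case-insensitive, `/` and `\` equivalent). The function name `missing_from_path` is illustrative, and the list of expected directories has to be supplied by whoever runs it.

```python
import ntpath


def missing_from_path(expected_dirs, path_value):
    """Return the expected directories that do not appear in `path_value`.

    Uses ntpath so that the comparison follows Windows rules on any host.
    """
    def norm(path):
        return ntpath.normcase(ntpath.normpath(path.strip().strip('"')))

    present = {norm(entry) for entry in path_value.split(";") if entry.strip()}
    return [d for d in expected_dirs if norm(d) not in present]


# Hypothetical values for illustration only.
echoed_path = r"C:\Windows\system32;x:\src\vs2015u2\VC\bin\amd64_x86;X:/src/vs2015u2/VC/bin/"
expected = [r"x:\src\vs2015u2\VC\bin", r"x:\src\vs2015u2\VC\bin\amd64"]

print(missing_from_path(expected, echoed_path))
```

An empty result would suggest the variables are reaching NSS's build system intact, pointing the investigation elsewhere.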
Flags: needinfo?(gps)
Comment 24•9 years ago (Reporter)
For me (on a slightly different windows AMI to :grenade) it is happening a couple of lines further down in the same make file.
I can see it is here, because the mozmake call is:
c:/Users/Task_1463653946/build/src/mozmake.EXE -C c:/Users/Task_1463653946/build/src/security/nss/lib/crmf export "CC= c:/Users/Task_1463653946/build/src/vs2015u2/VC/bin/amd64_x86/cl.EXE" SOURCE_MD_DIR=c:/Users/Task_1463653946/build/src/obj-firefox/dist SOURCE_MDHEADERS_DIR=c:/Users/Task_1463653946/build/src/obj-firefox/dist/include/nspr DIST=c:/Users/Task_1463653946/build/src/obj-firefox/dist NSPR_INCLUDE_DIR=c:/Users/Task_1463653946/build/src/obj-firefox/dist/include/nspr NSPR_LIB_DIR=c:/Users/Task_1463653946/build/src/obj-firefox/dist/lib MOZILLA_CLIENT=1 NO_MDUPDATE=1 NSS_ENABLE_ECC=1 SQLITE_LIB_NAME=nss3 SQLITE_INCLUDE_DIR=c:/Users/Task_1463653946/build/src/obj-firefox/dist/include topsrcdir=c:/Users/Task_1463653946/build/src "BUILD=c:/Users/Task_1463653946/build/src/obj-firefox/security/$(subst $(topsrcdir)/security/,,$(CURDIR))" BUILD_TREE=$(BUILD) OBJDIR=$(BUILD) DEPENDENCIES=$(BUILD)/.deps SINGLE_SHLIB_DIR=$(BUILD) SOURCE_XP_DIR=c:/Users/Task_1463653946/build/src/obj-firefox/dist BUILD_OPT=1 OPT_CODE_SIZE=1 NS_USE_GCC= OS_TARGET=WIN95 NSS_SSL_ENABLE_ZLIB= PROGRAMS= CHECKLOC= FREEBL_NO_DEPEND=0 NSS_NO_PKCS11_BYPASS=1 PUBLIC_EXPORT_DIR=c:/Users/Task_1463653946/build/src/obj-firefox/dist/include/$(MODULE) SOURCE_XPHEADERS_DIR=$(SOURCE_XP_DIR)/include/$(MODULE) "MODULE_INCLUDES=$(addprefix -I$(SOURCE_XP_DIR)/include/,$(REQUIRES))" "MAKE_OBJDIR=$(INSTALL) -D $(OBJDIR)" "TARGETS=$(LIBRARY) $(SHARED_LIBRARY) $(PROGRAM)" NSS_ENABLE_WERROR=0 PYTHON=c:/Users/Task_1463653946/build/src/obj-firefox/_virtualenv/Scripts/python.exe NSINSTALL_PY=c:/Users/Task_1463653946/build/src/config/nsinstall.py "NSINSTALL=$(PYTHON) $(NSINSTALL_PY)" "INSTALL=$(NSINSTALL) -t" PRIVATE_EXPORTS=
And this is calling a sh.exe process which is running the following temporary file:
$ cat 'C:/Users/TEMP~1.WIN/AppData/Local/Temp/make3604-2.sh'
c:/Users/Task_1463653946/build/src/vs2015u2/VC/bin/amd64_x86/cl.EXE 2>&1 | sed -ne 's|.* \([0-9]\+\.[0-9]\+\.[0-9]\+\(\.[0-9]\+\)\?\).*|\1|p'
This cc+sed expression matches only this makefile line in the build system, so it must be this one:
https://hg.mozilla.org/try/file/e524199f3299/security/nss/coreconf/WIN32.mk#l43
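For readers less used to sed, the expression in that temporary script extracts a dotted version number (three or four components) that is preceded by a space in the compiler banner. Here is a sketch of the same pattern in Python; the banner string is an illustrative assumption, not output copied from this bug.

```python
import re

# Same pattern as the sed expression, written in Python regex syntax.
VERSION = re.compile(r".* (\d+\.\d+\.\d+(\.\d+)?).*")


def extract_version(banner_line):
    """Return the dotted version found in `banner_line`, or None.

    Mirrors `sed -ne 's|...|\\1|p'`: lines that do not match produce nothing.
    """
    match = VERSION.match(banner_line)
    return match.group(1) if match else None


# Illustrative banner text (an assumption, not taken from the logs here).
banner = "Microsoft (R) C/C++ Optimizing Compiler Version 19.00.23918 for x86"

print(extract_version(banner))  # 19.00.23918
```

This shows that the makefile line only needs the banner text, yet it still has to launch cl.EXE to get it, which is why a DLL loading failure surfaces at this point.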
The pertinent env vars of the cl.EXE process are:
LIB=c:\Users\Task_1463653946\build\src\vs2015u2\VC\lib;c:\Users\Task_1463653946\build\src\vs2015u2\VC\atlmfc\lib;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\lib\ucrt\x86;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\lib\um\x86;c:\Users\Task_1463653946\build\src\vs2015u2\DIASDK\lib
PATH=c:\Users\Task_1463653946\build\src\vs2015u2\VC\redist\x86\Microsoft.VC140.CRT;c:\Users\Task_1463653946\build\src\vs2015u2\VC\redist\x64\Microsoft.VC140.CRT;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\Redist\ucrt\DLLs\x86;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\Redist\ucrt\DLLs\x64;c:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\amd64_x86;c:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\amd64;c:\Users\Task_1463653946\build\src\vs2015u2\VC\bin;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\bin\x86;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\bin\x64;c:\Users\Task_1463653946\build\src\vs2015u2\DIASDK\bin;c:\mozilla-build\nsis-3.0b1;c:\mozilla-build\python;C:\mozilla-build\msys\local\bin;c:\mozilla-build\7zip;c:\mozilla-build\info-zip;c:\mozilla-build\python\Scripts;c:\mozilla-build\yasm;C:\mozilla-build\msys\bin;c:\Windows\system32;c:\mozilla-build\upx391w;c:\mozilla-build\python\lib\site-packages\pywin32_system32;c:\mozilla-build\python\lib\site-packages\pywin32_system32;c:\mozilla-build\python\lib\site-packages\pywin32_system32
INCLUDE=c:\Users\Task_1463653946\build\src\vs2015u2\VC\include;c:\Users\Task_1463653946\build\src\vs2015u2\VC\atlmfc\include;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\Include\ucrt;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\Include\shared;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\Include\um;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\Include\winrt;c:\Users\Task_1463653946\build\src\vs2015u2\DIASDK\include
The last lines of the log before the freeze occurs (due to the interactive dialog) are:
11:43:16 INFO - mozmake.EXE[5]: Leaving directory 'c:/Users/Task_1463653946/build/src/obj-firefox/toolkit/xre/test/win'
11:43:16 INFO - c:/Users/Task_1463653946/build/src/obj-firefox/_virtualenv/Scripts/python.exe c:/Users/Task_1463653946/build/src/config/nsinstall.py -t -m 644 'xpcAccEvents.h' '../../dist/include'
11:43:16 INFO - mozmake.EXE[5]: Leaving directory 'c:/Users/Task_1463653946/build/src/obj-firefox/intl/locale/windows'
11:43:16 INFO - c:/Users/Task_1463653946/build/src/obj-firefox/_virtualenv/Scripts/python.exe c:/Users/Task_1463653946/build/src/config/nsinstall.py -t -m 644 'TelemetryHistogramEnums.h' '../../../dist/include/mozilla'
11:43:16 INFO - mozmake.EXE[5]: Leaving directory 'c:/Users/Task_1463653946/build/src/obj-firefox/accessible/xpcom'
11:43:17 INFO - mozmake.EXE[5]: Leaving directory 'c:/Users/Task_1463653946/build/src/obj-firefox/toolkit/components/telemetry'
11:43:20 INFO - mozmake.EXE[5]: Nothing to be done for 'export'.
11:43:20 INFO - mozmake.EXE[5]: Leaving directory 'c:/Users/Task_1463653946/build/src/obj-firefox/dom/bindings'
11:43:20 INFO - mozmake.EXE[5]: Entering directory 'c:/Users/Task_1463653946/build/src/obj-firefox/dom/bindings/test'
11:43:20 INFO - mozmake.EXE[5]: Nothing to be done for 'export'.
11:43:20 INFO - mozmake.EXE[5]: Leaving directory 'c:/Users/Task_1463653946/build/src/obj-firefox/dom/bindings/test'
11:43:28 INFO - mozmake.EXE[5]: Leaving directory 'c:/Users/Task_1463653946/build/src/obj-firefox/ipc/ipdl'
I have no explanation for why the extra debug lines that I added are not included in the log:
https://hg.mozilla.org/try/file/e524199f3299/security/nss/coreconf/WIN32.mk#l35
However, it does not matter too much since I could grab the env vars from ProcessExplorer.
This is my try push, for reference: https://treeherder.mozilla.org/#/jobs?repo=try&revision=e524199f3299ae789f0fe82045a7dccc5f14e6f7
Aside from this missing dll problem, I'm also wondering if the call should be:
c:/Users/Task_1463653946/build/src/vs2015u2/VC/bin/amd64_x86/cl.EXE -v
rather than:
c:/Users/Task_1463653946/build/src/vs2015u2/VC/bin/amd64_x86/cl.EXE
although this wouldn't change the fact it can't find mspdb140.dll. My suggestion is just because -v is used in other places (e.g. https://hg.mozilla.org/try/file/e524199f3299/nsprpub/configure.in#l1892)
Flags: needinfo?(gps)
| Reporter | ||
Comment 25•9 years ago
|
||
FWIW, the DLLs exist here:
C:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\amd64\mspdb140.dll
C:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\mspdb140.dll
This build is a win32 opt build.
I'm assuming C:\Users\Task_1463653946\build\src\vs2015u2\VC\bin is missing from one or more of PATH/LIB/LIBPATH/INCLUDE.
I didn't explicitly state it in comment 24, but LIBPATH isn't set at all in the cl.EXE process.
Comment 26•9 years ago
|
||
We add those dirs to PATH in this mozconfig:
https://dxr.mozilla.org/mozilla-central/rev/c67dc1f9fab86d4f2cf3224307809c44fe3ce820/build/win32/mozconfig.vs2015-win64#10
Maybe something is going awry with the way we spawn submakes to build NSS. (PATH is used as the equivalent of LD_LIBRARY_PATH on Windows, FYI.)
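To make the PATH point concrete, here is a minimal sketch that lists, in search order, every PATH directory containing a given DLL. It only approximates the PATH portion of the Windows loader's search (the loader also checks the application directory and system directories first), and the function name find_on_path is made up for illustration.

```python
import os


def find_on_path(filename, path_value, sep=";"):
    """Return every match for `filename` in the PATH directories, in search order."""
    hits = []
    for directory in path_value.split(sep):
        if not directory:
            continue  # skip empty entries such as a trailing separator
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            hits.append(candidate)
    return hits
```

Calling find_on_path("mspdb140.dll", os.environ["PATH"]) from inside the failing environment would show which copy, if any, comes first.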
| Reporter | ||
Comment 27•9 years ago
|
||
(In reply to Pete Moore [:pmoore][:pete] from comment #24)
> I have no explanation for why these lines extra debug lines that I added are
> not included in the log:
> https://hg.mozilla.org/try/file/e524199f3299/security/nss/coreconf/WIN32.
> mk#l35
Looking again, they are in the log, just not at the point where the build hangs...
e.g.
11:42:44 INFO - ../../coreconf/WIN32.mk:35: in WIN32.mk LIB is c:\Users\Task_1463653946\build\src\vs2015u2\VC\lib;c:\Users\Task_1463653946\build\src\vs2015u2\VC\atlmfc\lib;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\lib\ucrt\x86;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\lib\um\x86;c:\Users\Task_1463653946\build\src\vs2015u2\DIASDK\lib
11:42:44 INFO - ../../coreconf/WIN32.mk:36: in WIN32.mk PATH is c:\Users\Task_1463653946\build\src\vs2015u2\VC\redist\x86\Microsoft.VC140.CRT;c:\Users\Task_1463653946\build\src\vs2015u2\VC\redist\x64\Microsoft.VC140.CRT;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\Redist\ucrt\DLLs\x86;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\Redist\ucrt\DLLs\x64;c:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\amd64_x86;c:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\amd64;c:\Users\Task_1463653946\build\src\vs2015u2\VC\bin;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\bin\x86;c:\Users\Task_1463653946\build\src\vs2015u2\SDK\bin\x64;c:\Users\Task_1463653946\build\src\vs2015u2\DIASDK\bin;c:\mozilla-build\nsis-3.0b1;c:\mozilla-build\python;C:\mozilla-build\msys\local\bin;c:\mozilla-build\7zip;c:\mozilla-build\info-zip;c:\mozilla-build\python\Scripts;c:\mozilla-build\yasm;C:\mozilla-build\msys\bin;c:\Windows\system32;c:\mozilla-build\upx391w;c:\mozilla-build\python\lib\site-packages\pywin32_system32;c:\mozilla-build\python\lib\site-packages\pywin32_system32;c:\mozilla-build\python\lib\site-packages\pywin32_system32
These match exactly the results of ProcessExplorer in comment 24 - just wanted to confirm the logging worked...
| Reporter | ||
Comment 28•9 years ago
|
||
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #26)
> We add those dirs to PATH in this mozconfig:
> https://dxr.mozilla.org/mozilla-central/rev/
> c67dc1f9fab86d4f2cf3224307809c44fe3ce820/build/win32/mozconfig.vs2015-
> win64#10
>
> Maybe something is going awry with the way we spawn submakes to build NSS.
> (PATH is used as the equivalent of LD_LIBRARY_PATH on Windows, FYI.)
So the PATH does contain the dirs. The complete PATH of the failing process is:
c:\Users\Task_1463653946\build\src\vs2015u2\VC\redist\x86\Microsoft.VC140.CRT
c:\Users\Task_1463653946\build\src\vs2015u2\VC\redist\x64\Microsoft.VC140.CRT
c:\Users\Task_1463653946\build\src\vs2015u2\SDK\Redist\ucrt\DLLs\x86
c:\Users\Task_1463653946\build\src\vs2015u2\SDK\Redist\ucrt\DLLs\x64
c:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\amd64_x86
c:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\amd64
c:\Users\Task_1463653946\build\src\vs2015u2\VC\bin
c:\Users\Task_1463653946\build\src\vs2015u2\SDK\bin\x86
c:\Users\Task_1463653946\build\src\vs2015u2\SDK\bin\x64
c:\Users\Task_1463653946\build\src\vs2015u2\DIASDK\bin
c:\mozilla-build\nsis-3.0b1
c:\mozilla-build\python
C:\mozilla-build\msys\local\bin
c:\mozilla-build\7zip
c:\mozilla-build\info-zip
c:\mozilla-build\python\Scripts
c:\mozilla-build\yasm
C:\mozilla-build\msys\bin
c:\Windows\system32
c:\mozilla-build\upx391w
c:\mozilla-build\python\lib\site-packages\pywin32_system32
c:\mozilla-build\python\lib\site-packages\pywin32_system32
c:\mozilla-build\python\lib\site-packages\pywin32_system32
The 32/64 bit versions exist here:
C:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\amd64\mspdb140.dll
C:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\mspdb140.dll
And the error message that appears as an interactive dialogue as the user Task_1463653946 says:
"The program can't start because mspdb140.dll is missing from your computer. Try reinstalling the program to fix this problem."
It should pick up "C:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\amd64\mspdb140.dll" since this is the first one to appear in the PATH.
Is the problem instead that "C:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\amd64\mspdb140.dll" is the 64 bit version of mspdb140.dll and the firefox desktop 32 bit opt build needs the version "C:\Users\Task_1463653946\build\src\vs2015u2\VC\bin\mspdb140.dll" instead?
If this is the case, I'm surprised the error is not something like "mspdb140.dll has wrong architecture - expecting 32 bit, but found 64 bit" etc.
Any ideas?
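One way to check the 32 bit versus 64 bit question directly is to read the Machine field from each DLL's PE header. The sketch below assumes the standard PE layout (the offset of the PE signature is stored at 0x3C, and the Machine field follows the signature); the helper names are illustrative, not part of any build tooling.

```python
import struct

# Machine values from the PE/COFF format; only the two relevant here are listed.
MACHINE_NAMES = {0x014C: "x86 (32 bit)", 0x8664: "x64 (64 bit)"}


def pe_machine(data):
    """Return the COFF Machine field of a PE image given its raw bytes."""
    if data[:2] != b"MZ":
        raise ValueError("not a PE file: missing MZ header")
    # The DOS header stores the offset of the PE signature at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE file: missing PE signature")
    # The 16-bit Machine field immediately follows the 4-byte signature.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return machine


def describe_dll(path):
    """Return a human-readable architecture name for the DLL at `path`."""
    with open(path, "rb") as handle:
        return MACHINE_NAMES.get(pe_machine(handle.read()), "unknown")
```

Running describe_dll against both mspdb140.dll copies listed above would confirm which one is the 32 bit build.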
Flags: needinfo?(ted)
| Reporter | ||
Comment 29•9 years ago
|
||
(In reply to Pete Moore [:pmoore][:pete] from comment #14)
> ===================================================
>
> Problem signature:
> Problem Event Name: APPCRASH
> Application Name: cl.EXE
> Application Version: 19.0.23918.0
> Application Timestamp: 56eb9318
> Fault Module Name: mspdb140.dll
> Fault Module Version: 6.3.9600.18194
> Fault Module Timestamp: 56951674
> Exception Code: c0000135
> Exception Offset: 00000000000ecdd0
> OS Version: 6.3.9600.2.0.0.272.7
> Locale ID: 1033
> Additional Information 1: ac05
> Additional Information 2: ac0507478d1c5bd693cfc4fe3987e900
> Additional Information 3: ac05
> Additional Information 4: ac0507478d1c5bd693cfc4fe3987e900
>
> Read our privacy statement online:
> http://go.microsoft.com/fwlink/?linkid=280262
>
> If the online privacy statement is not available, please read our privacy
> statement offline:
> C:\Windows\system32\en-US\erofflps.txt
>
> ===========================
Revisiting this problem signature, the fault module is named as mspdb140.dll and the event is reported as an application crash. This makes me think that the OS successfully loaded the DLL and then crashed, rather than failing to load it. This might also suggest it was using the 64 bit version instead of the 32 bit version, rather than not being able to find a version at all.
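For reference, the Exception Code in the quoted signature, c0000135, is the NTSTATUS value STATUS_DLL_NOT_FOUND, which would be consistent with the pop-up text rather than with a crash inside an already loaded DLL. A minimal sketch for decoding such codes follows; the lookup table is a small hand-picked subset and the function name is illustrative only.

```python
# A tiny, hand-picked subset of NTSTATUS values; not an exhaustive table.
KNOWN_STATUS = {
    0xC0000135: "STATUS_DLL_NOT_FOUND",
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}


def decode_ntstatus(text):
    """Split a hex NTSTATUS string into its severity, facility and code fields."""
    value = int(text, 16)
    return {
        "severity": value >> 30,            # top two bits; 3 means error
        "facility": (value >> 16) & 0xFFF,  # 12-bit facility field
        "code": value & 0xFFFF,             # low 16 bits
        "name": KNOWN_STATUS.get(value, "unknown"),
    }
```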
| Reporter | ||
Comment 30•9 years ago
|
||
Could the problem be that
vs2015u2/VC/bin/amd64_x86/mspdb140.dll is missing from the tooltool artifact "55814aaabcd4aa51fe85918ec02a8c29bc067d41ee79ddcfd628daaba5a06d4241a73a51bf5a8bc69cc762b52551009f44b05e65682c45b4684c17fb2d017c2c" referenced in browser/config/tooltool-manifests/win32/releng.manifest?
These versions exist in the tooltool binary:
vs2015u2/VC/bin/amd64/mspdb140.dll
vs2015u2/VC/bin/mspdb140.dll
but this version doesn't:
vs2015u2/VC/bin/amd64_x86/mspdb140.dll
However, the "amd64_x86" directory exists in PATH, earlier than the other two PATH entries, which makes me wonder if it is finding the amd64 version rather than the amd64_x86 version, which perhaps it needs instead?
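The check of which directories in the archive carry the DLL can be scripted. This is a rough sketch that assumes the tooltool artifact is an ordinary zip file that has been downloaded locally; the function name dirs_containing and the local file name are hypothetical.

```python
import zipfile


def dirs_containing(archive_path, filename):
    """Return the sorted archive directories that hold a member named `filename`."""
    suffix = "/" + filename.lower()
    with zipfile.ZipFile(archive_path) as archive:
        return sorted(
            name.rsplit("/", 1)[0]
            for name in archive.namelist()
            if name.lower().endswith(suffix)
        )
```

For example, dirs_containing("vs2015u2.zip", "mspdb140.dll") would list each directory with a copy, making a missing amd64_x86 entry easy to spot.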
| Reporter | ||
Comment 31•9 years ago
|
||
I'm sticking an extra -? in the cl command to see if that fixes it.
https://hg.mozilla.org/try/rev/7ecd7df1f1f2c0141408a7a406c2acd1315f06b1
Since line 1.6 in this diff executes without problems, I'm hoping 1.13 will.
| Reporter | ||
Comment 32•9 years ago
|
||
NOPE!
gps, over to you! :)
| Reporter | ||
Comment 33•9 years ago
|
||
Dependency walker reports problems for vs2015u2\vc\bin\amd64_x86\CL.EXE but not for its vs2015u2\vc\bin\amd64\MSPDB140.DLL dependency....
This screenshot shows the vs2015u2\vc\bin\amd64\MSPDB140.DLL dependencies...
| Reporter | ||
Comment 34•9 years ago
|
||
All dependency problems occur instead under c:\windows\system32\SHELL32.DLL (so MSPDB140.DLL seems to be ok). These are the missing dependencies of c:\windows\system32\SHELL32.DLL:
API-MS-WIN-CORE-KERNEL32-PRIVATE-L1-1-1.DLL
API-MS-WIN-CORE-PRIVATEPROFILE-L1-1-1.DLL
API-MS-WIN-SERVICE-PRIVATE-L1-1-1.DLL
API-MS-WIN-CORE-SHUTDOWN-L1-1-1.DLL
EXT-MS-WIN-NTUSER-UICONTEXT-EXT-L1-1-0.DLL
IESHIMS.DLL
MFPLAT.DLL
SETTINGSYNCPOLICY.DLL
WLANAPI.DLL
| Reporter | ||
Comment 35•9 years ago
|
||
This is the .dwi I was able to generate from http://www.dependencywalker.com/ for vs2015u2/VC/bin/amd64_x86/cl.EXE on a live running worker (https://tools.taskcluster.net/task-inspector/#QDUOOFTjTIOR1fcnRxhm4A/0).
| Reporter | ||
Comment 36•9 years ago
|
||
Note, when I ran the dependency walker, I set the PATH correctly to the same value I had in the hanging process, and removed the default module searches so that it only used the PATH I had set.
| Reporter | ||
Comment 37•9 years ago
|
||
This is a "mini" dump from ProcessExplorer for the cl.exe process. I can also attach a "full" dump if required, but it is ~8MB so I'll hold off for now unless someone needs it.
| Reporter | ||
Comment 38•9 years ago
|
||
Looking again at https://bugzilla.mozilla.org/attachment.cgi?id=8754912 it seems the red colour suggests a warning: http://www.dependencywalker.com/help/html/hidr_module_list_view.htm
I wonder if this is because the parent modules are 64 bit, and the PATH is set to find the 32 bit dependencies first. I can have a dig...
| Reporter | ||
Comment 39•9 years ago
|
||
A naïve try push where I've slightly adjusted the PATH ordering to favour x86 dirs over amd64 dirs in the 32 bit builds:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=cfaf1c48cc8da9656f2913e34e5ac9f5461ef942
It is currently running, let's see how it turns out....
| Reporter | ||
Comment 40•9 years ago
|
||
(In reply to Pete Moore [:pmoore][:pete] from comment #39)
> A naïve try push where I've slightly adjusted the PATH ordering to favour
> x86 dirs over amd64 dirs in the 32 bit builds:
>
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=cfaf1c48cc8da9656f2913e34e5ac9f5461ef942
>
> It is currently running, let's see how it turns out....
No, it didn't help. :(
Comment 41•9 years ago
|
||
(In reply to Pete Moore [:pmoore][:pete] from comment #30)
> Could the problem be that
>
> vs2015u2/VC/bin/amd64_x86/mspdb140.dll is missing from the tooltool artifact
On my Windows 10 desktop at work, I have the following mspdb140.dll files:
/c/Program Files (x86)/Microsoft Visual Studio 14.0/Common7/IDE/mspdb140.dll
/c/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/amd64/mspdb140.dll
/c/Program Files (x86)/Microsoft Visual Studio 14.0/VC/bin/mspdb140.dll
Comment 42•9 years ago
|
||
I attempted to reproduce this on my Windows 10 desktop and didn't have much luck. I started a generic MozillaBuild shell with the vs2015u2.zip archive extracted in the topsrcdir (read: no vcvarsall.bat). Perhaps mspdb140.dll is still getting picked up from my VS2015 install. Perhaps I should try with a fresh VM.
Flags: needinfo?(gps)
| Reporter | ||
Comment 43•9 years ago
|
||
(In reply to Gregory Szorc [:gps] from comment #42)
> I attempted to reproduce this on my Windows 10 desktop and didn't have much
> luck. I started a generic MozillaBuild shell with the vs2015u2.zip archive
> extracted in the topsrcdir (read: no vcvarsall.bat). Perhaps mspdb140.dll is
> still getting picked up from my VS2015 install. Perhaps I should try with a
> fresh VM.
We can consistently reproduce on the taskcluster workers - I can provide credentials for logging into a box where ProcessExplorer and dependencywalker are installed, where the warning window is popped up, and the process is hung. That might be easier than reproducing locally. If you like we can step through it together in a live vidyo call, and share a screen with the RDP session.
Comment 44•9 years ago
|
||
So, I've tested and seen today that the issue doesn't occur on a mach build that isn't wrapped by mozharness: https://treeherder.mozilla.org/#/jobs?repo=try&revision=1a2a64024f329f8074532c975f950085323841b5
This at least narrows the problem to 32 bit builds that are triggered with a mozharness process parenting the mach process.
Flags: needinfo?(gps)
Comment 45•9 years ago
|
||
oh, mozharness is problematic! Possibly we can make a call to shift our build to using mach instead? That might not solve it for all build types, but it would get us closer to builds in automation matching local builds.
Comment 46•9 years ago
|
||
I'd like to try to replicate the firefox build that you are running after starting the MozillaBuild shell on one of the TC builders to see if we don't get the mspdb errors when not running under Mozharness (our builds all start with a mozharness call like: python -u %WORKSPACE%\build\src\testing\mozharness\scripts\fx_desktop_build.py --config %MOZHARNESS_CONFIG% %MH_CUSTOM_BUILD_VARIANT_FLAGS% --branch=%MH_BRANCH% --build-pool=%MH_BUILD_POOL% --skip-buildbot-actions --work-dir=%WORKSPACE%\build --clone-tools --build), but I suspect that is more build system oriented than what you guys are running when you replicate. Can you let us know so we can try it out?
Flags: needinfo?(gps)
Comment 47•9 years ago
|
||
We shouldn't need to drop mozharness, just adjust the mach call to not be wrapped by python. I have a test running here:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=7f6045e17a3c8efc9f96cfd03adae8a24fdcbacd
https://hg.mozilla.org/try/rev/7f6045e17a3c8efc9f96cfd03adae8a24fdcbacd#l8.21
I think it has already passed the point where we normally get the mspdb140 error. I could do with some help with tidying up the bash wrapper though. My 'bash.exe' call won't work in a lot of places. I just wanted to validate that this would fix it.
Comment 48•9 years ago
|
||
well, pretty lousy fix actually. seems very slow compared to running under python (without -j1)
Comment 49•9 years ago
|
||
(In reply to Rob Thijssen (:grenade - GMT) from comment #43)
> I'd like to try to replicate the firefox build that you are running after
> starting the MozillaBuild shell on one of the TC builders to see if we don't
> get the mspdb errors when not running under Mozharness (our builds all start
> with a mozharness call like: python -u
> %WORKSPACE%\build\src\testing\mozharness\scripts\fx_desktop_build.py
> --config %MOZHARNESS_CONFIG% %MH_CUSTOM_BUILD_VARIANT_FLAGS%
> --branch=%MH_BRANCH% --build-pool=%MH_BUILD_POOL% --skip-buildbot-actions
> --work-dir=%WORKSPACE%\build --clone-tools --build),
The bareword "python" here is somewhat scary. I think it should use the full path to a Python executable, preferably something with "2.7" in the path, as "python" is reserved for the most modern version of Python, which will some day be Python 3.
Comment 50•9 years ago
|
||
(In reply to Gregory Szorc [:gps] from comment #49)
> The bareword "python" here is somewhat scary. I think it should use the full
> path to a Python executable, preferably something with "2.7" in the path, as
> "python" is reserved for the most modern version of Python, which will some
> day be Python 3.
Understood. We can easily fix that. For clarity, these builders use python from mozilla-build and have no other python installed. Everything that goes onto the builders is visible in the manifest at https://github.com/MozRelOps/OpenCloudConfig/blob/master/userdata/Manifest/win2012.json.
C:\Users\Administrator>python --version
Python 2.7.11
C:\Users\Administrator>where python
C:\mozilla-build\python\python.exe
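The same two checks can be done from a script, so that a job could log or pin the full interpreter path instead of relying on the bareword. A minimal sketch, written for Python 3.7+ (shutil.which and capture_output are not available on Python 2.7); the function name resolve_python is illustrative.

```python
import shutil
import subprocess


def resolve_python(name="python"):
    """Return (full path, version string) for the interpreter that `name` resolves to."""
    full_path = shutil.which(name)
    if full_path is None:
        raise RuntimeError("no %r found on PATH" % name)
    result = subprocess.run([full_path, "--version"], capture_output=True, text=True)
    # Python 2.7 prints its version to stderr; newer versions use stdout.
    return full_path, (result.stdout or result.stderr).strip()
```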
| Reporter | ||
Comment 51•9 years ago
|
||
So bizarrely, I just wrote a powershell script to run against a completely clean win2012r2 ec2 instance, that installs the prerequisites, configures the environment, and then runs a win32 opt release build, to demonstrate the problem. However, for whatever reason, it did not exhibit the problem.
The only real difference I see between this example, and a real live worker, is that in this example I run the build as Administrator (whereas normally the generic worker would spawn a new user to run the task).
I'll see if I can adapt the powershell script to run the task as a different user, and see if that reproduces the problem.
We can still go ahead with our meeting, as I'm able to set up a test case to demonstrate the problem on a real worker - the reason I created the powershell was just to have a single script that can run against a clean instance, to demonstrate the problem, without needing a worker and everything else installed.
| Reporter | ||
Comment 52•9 years ago
|
||
We (gps, ted, grenade, chmanchester, jlund) had a meeting on Friday 27 May (2016) and determined that the problem with the failing cl.exe process is that it is getting an msys style path rather than a win32 style PATH, e.g. c/Users/....:c/....:....:... rather than C:/Users/....;C:/....;.....;.... or C:\Users\....;C:\....;....;.... - and this breaks the cl.exe process.
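To illustrate the difference between the two PATH styles, here is a minimal sketch that converts an msys style value into a win32 style one. It is only an illustration of the format mismatch, not what msys itself does internally; the function names are made up, and an entry such as "a/b" is ambiguous (it is treated as drive A here).

```python
import re


def msys_entry_to_win32(entry):
    """Convert one entry such as '/c/Users/x' or 'c/Users/x' to a drive-letter path."""
    match = re.match(r"^/?([A-Za-z])(/.*)?$", entry)
    if match is None:
        return entry  # not a drive-style entry (e.g. '/usr/bin'); leave untouched
    drive, rest = match.group(1), match.group(2) or "/"
    return drive.upper() + ":" + rest.replace("/", "\\")


def msys_path_to_win32(value):
    """Convert a ':'-separated msys PATH into a ';'-separated win32 PATH."""
    return ";".join(msys_entry_to_win32(e) for e in value.split(":") if e)
```

A process that expects ';' separators and drive letters, as cl.exe does, cannot find its DLLs when handed the unconverted form.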
We were not able to establish why this doesn't happen when run with -j1.
We looked into msys source code to try to get a better idea, but did not reach a conclusion.
:grenade mentioned that when running mach from inside mozharness by invoking bash rather than python, the problem seems to disappear. This may be a temporary solution. Strangely, the builds can take twice as long just because of the different invocation style, but I do not know why.
I feared it may be related to the tasks running using a temporary local profile, so I adapted the generic worker to create a local profile for each task user. This unfortunately did not resolve the problem, so the temporary profile is not the cause.
I've arranged a meeting with glandium to discuss, as gps mentioned he has some expertise in this area.
I confirmed that succeeding cl.exe processes indeed do have a win32 style PATH, as expected.
| Reporter | ||
Comment 53•9 years ago
|
||
:grenade has a workaround for this problem at the moment, which is to call mozharness from msys bash. We don't know why this works around the issue, but it seems to. This is still an issue, but removing it as a blocking bug for bug 1244750 since we have a workaround in place.
No longer blocks: 1244750
Updated•9 years ago
|
Flags: needinfo?(ted)
| Reporter | ||
Comment 54•9 years ago
|
||
See https://bugzilla.mozilla.org/show_bug.cgi?id=1280325#c2 - it may be that bug 1280325 solves this problem. We'll need a try push with the workaround (calling mozharness via bash) disabled to see whether it has.
| Reporter | ||
Comment 55•9 years ago
|
||
Try push to see if we can now call mozharness natively: https://treeherder.allizom.org/#/jobs?repo=try&selectedJob=23058969
| Reporter | ||
Comment 56•9 years ago
|
||
Hey Rob,
Did we ever find a proper solution for this, or are we still building with -j1? I can't remember if/how this got resolved!
Thanks!
Flags: needinfo?(rthijssen)
Comment 57•9 years ago
|
||
This issue may not be reproducible since bug 1295937 landed, since this was occurring in the NSS build and we're no longer using that build system.
Updated•9 years ago
|
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → INCOMPLETE
| Assignee | ||
Updated•7 years ago
|
Component: General Automation → General