Closed Bug 1110760 Opened 5 years ago Closed 5 years ago

Startup crash in TppWaiterpThread on Windows Vista 64-bit

Categories

(Toolkit :: Startup and Profile System, defect, critical)

37 Branch
All
Windows NT
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla37
Tracking Status
firefox34 --- unaffected
firefox35 + fixed
firefox36 + fixed
firefox37 + fixed

People

(Reporter: bogdan_maris, Assigned: bobowen)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(4 files, 11 obsolete files)

5.72 KB, patch
glandium
: review+
Details | Diff | Splinter Review
2.09 KB, patch
aklotz
: review+
Details | Diff | Splinter Review
21.36 KB, patch
aklotz
: review+
Details | Diff | Splinter Review
1.74 KB, patch
gps
: review+
philip.chee
: feedback+
Callek
: checkin+
Details | Diff | Splinter Review
This bug was filed from the Socorro interface and is 
report bp-41bd126d-0368-4438-9233-5ba9f2141212.
=============================================================

latest Nightly crashes every time I start it.

More reports:
bp-92e06978-4b06-499f-81c4-e0fd42141212	12/12/2014
bp-5d82148a-657f-4074-a4dd-730aa2141212	12/12/2014
bp-08b0172c-d083-4491-a445-f725d2141212	12/12/2014

Reproducible almost only on Windows Vista 99.77%
https://crash-stats.mozilla.com/report/list?signature=TppWaiterpThread&product=Firefox&query_type=contains&range_unit=weeks&process_type=any&hang_type=any&date=2014-12-12+13%3A00%3A00&range_value=1#tab-sigsummary
The stacks here don't provide much information. The appears to be in a brand-new thread, but it's hard to tell who created that thread or what it's supposed to be doing.

In each of these crashes there is a thread like this:

0 	ntdll.dll 	ZwWaitForMultipleObjects 	
1 	kernel32.dll 	BaseDllOpenMappingTarget 	
2 	user32.dll 	IsInsideMessagePumpHook 	
3 	xul.dll 	mozilla::widget::WinUtils::WaitForMessage(unsigned long) 	widget/windows/WinUtils.cpp
4 	xul.dll 	base::MessagePumpForUI::DoRunLoop() 	ipc/chromium/src/base/message_pump_win.cc

This is unexpected for a number of reasons: I can't see why we'd have a UI message loop on a non-main thread, and the PumpHook/BaseDllOpenMappingTarget combination appears fishy to me.

Can you reproduce this on any other Vista machines or just this one?

dmajor does anything about this look familiar, or do you have suggestions on how to debug further?
Flags: needinfo?(dmajor)
Flags: needinfo?(bogdan.maris)
I can reproduce on two different machines but with similar specs and only in Vista 64bit, somehow the on the 32bit version of Vista, nightly starts just fine.
Flags: needinfo?(bogdan.maris)
Which nightly did this start in?
Flags: needinfo?(bogdan.maris)
I had to go manually through tinderbox builds (mozregression did not help much) and found this regression range:

Good:
BuildID: 20141130030213
https://hg.mozilla.org/mozilla-central/rev/df3fc7cb7e80

Bad: 
BuildID: 20141130083146
https://hg.mozilla.org/mozilla-central/rev/74861ffc991f
Flags: needinfo?(bogdan.maris)
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=df3fc7cb7e80&tochange=74861ffc991f

I will fairly confidently blame bug 928044 for this.
Assignee: nobody → bobowencode
Flags: needinfo?(dmajor)
Keywords: regression
Summary: crash in TppWaiterpThread → Startup crash in TppWaiterpThread on Windows Vista
* bobowen ... for the first time ever ... downloads Vista ISOs
We seem to be missing part of the Chromium sandbox.

On 64-bit Vista and earlier it creates a program called wow_helper.exe
This is used as part of the dll patching for the function interceptions, when running 32-bit under Wow64.

I've tried dropping in the Chrome version of this program as it hasn't changed for over a year.
This means that the wow_helper process starts, but the patching fails for some reason and it still crashes.
Slightly worryingly, Chrome doesn't appear to start either, it appears to be for the same reason.
Although, 64-bit Vista seems to be particularly flaky on VirtualBox, I couldn't install VS2010 and it took a few attempts to install WinDbg.

I'll pull in the wow_helper code so I can compile a debug version, to see if I can work out what is going wrong there.

Ben - how urgently do we need a fix for this?
I could disable the sandbox on Vista in these circumstances, if we want a quick fix while I'm looking into this.

The other bad news is that this isn't isolated to the content sandbox, so the sandboxed GMP process will fail to start as well ... probably crashing the main process into the bargain.
Status: NEW → ASSIGNED
Flags: needinfo?(benjamin)
(In reply to Bogdan Maris, QA [:bogdan_maris] from comment #2)
> I can reproduce on two different machines but with similar specs and only in
> Vista 64bit, somehow the on the 32bit version of Vista, nightly starts just
> fine.

This makes sense, the 32-bit version won't need to use this wow_helper.

Out of interest would you mind testing Chrome on your 64-bit Vista machine?
Flags: needinfo?(bogdan.maris)
Vista64 users are between 0.5% and 1% of our total userbase, so we can't let this regression ride the trains beyond nightly.
Flags: needinfo?(benjamin)
(In reply to Bob Owen (:bobowen) from comment #8)
> (In reply to Bogdan Maris, QA [:bogdan_maris] from comment #2)
> > I can reproduce on two different machines but with similar specs and only in
> > Vista 64bit, somehow the on the 32bit version of Vista, nightly starts just
> > fine.
> 
> This makes sense, the 32-bit version won't need to use this wow_helper.
> 
> Out of interest would you mind testing Chrome on your 64-bit Vista machine?

I just tested Chrome on both machines with Vista 64-bit, works just fine.
Flags: needinfo?(bogdan.maris)
Chrome version: 39.0.2171.95 m (latest)
(In reply to Bogdan Maris, QA [:bogdan_maris] from comment #10)
> (In reply to Bob Owen (:bobowen) from comment #8)
> > (In reply to Bogdan Maris, QA [:bogdan_maris] from comment #2)
> > > I can reproduce on two different machines but with similar specs and only in
> > > Vista 64bit, somehow the on the 32bit version of Vista, nightly starts just
> > > fine.
> > 
> > This makes sense, the 32-bit version won't need to use this wow_helper.
> > 
> > Out of interest would you mind testing Chrome on your 64-bit Vista machine?
> 
> I just tested Chrome on both machines with Vista 64-bit, works just fine.

Thanks, would you mind testing something else.

Could you copy wow_helper.exe from:
C:\Program Files (x86)\Google\Chrome\Application\

to

C:\Program Files (x86)\Nightly\

and then retry starting Firefox.
Thanks.
Lost my needinfo for comment 12 in a bugzilla clash.
Flags: needinfo?(bogdan.maris)
(In reply to Bob Owen (:bobowen) from comment #12)
> (In reply to Bogdan Maris, QA [:bogdan_maris] from comment #10)
> > (In reply to Bob Owen (:bobowen) from comment #8)
> > > (In reply to Bogdan Maris, QA [:bogdan_maris] from comment #2)
> > > > I can reproduce on two different machines but with similar specs and only in
> > > > Vista 64bit, somehow the on the 32bit version of Vista, nightly starts just
> > > > fine.
> > > 
> > > This makes sense, the 32-bit version won't need to use this wow_helper.
> > > 
> > > Out of interest would you mind testing Chrome on your 64-bit Vista machine?
> > 
> > I just tested Chrome on both machines with Vista 64-bit, works just fine.
> 
> Thanks, would you mind testing something else.
> 
> Could you copy wow_helper.exe from:
> C:\Program Files (x86)\Google\Chrome\Application\
> 
> to
> 
> C:\Program Files (x86)\Nightly\
> 
> and then retry starting Firefox.
> Thanks.

Sure thing, just did that on both machines and Nightly starts with no problem. \o/
Flags: needinfo?(bogdan.maris)
(In reply to Bogdan Maris, QA [:bogdan_maris] from comment #14)

> Sure thing, just did that on both machines and Nightly starts with no
> problem. \o/

Excellent, thanks.
This failed on my VM, but then Chrome failed to start as well.

I've pulled in the wow_helper code, I just need to get it building as x64 and packaged into the installer.

I'll probably need your help to test this once I've got it building.
(In reply to Bob Owen (:bobowen) from comment #15)
> (In reply to Bogdan Maris, QA [:bogdan_maris] from comment #14)
> 
> > Sure thing, just did that on both machines and Nightly starts with no
> > problem. \o/
> 
> Excellent, thanks.
> This failed on my VM, but then Chrome failed to start as well.
> 
> I've pulled in the wow_helper code, I just need to get it building as x64
> and packaged into the installer.
> 
> I'll probably need your help to test this once I've got it building.

Sure thing, please just needinfo me and I (or my colleagues) will get this checked out ASAP.
Please, can you test this installer from my try push:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/bobowencode@gmail.com-246fb9d508ea/try-win32/firefox-37.0a1.en-US.win32.installer.exe

Can you delete the wow_helper.exe files that you copied, just to make sure it is using the one from the build.
It looks significantly smaller that the Chromium one, so I'm not sure I've got all the build parameters correct yet.


Maire - just adding you to the CC list, so you can track this.
This is the issue that means the sandboxed process doesn't start and crashes Firefox on Vista 64-bit.
Flags: needinfo?(bogdan.maris)
(In reply to Bob Owen (:bobowen) from comment #17)
> Please, can you test this installer from my try push:
> http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/bobowencode@gmail.
> com-246fb9d508ea/try-win32/firefox-37.0a1.en-US.win32.installer.exe
> 
> Can you delete the wow_helper.exe files that you copied, just to make sure
> it is using the one from the build.
> It looks significantly smaller that the Chromium one, so I'm not sure I've
> got all the build parameters correct yet.

I still get the crash using the build from above...

bp-ba3ef421-69ed-421f-b907-b84592141216
bp-8f25e0b3-f802-47bc-a657-25b752141216
Flags: needinfo?(bogdan.maris)
(In reply to Bogdan Maris, QA [:bogdan_maris] from comment #18)
> (In reply to Bob Owen (:bobowen) from comment #17)
> > Please, can you test this installer from my try push:
> > http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/bobowencode@gmail.
> > com-246fb9d508ea/try-win32/firefox-37.0a1.en-US.win32.installer.exe
> > 
> > Can you delete the wow_helper.exe files that you copied, just to make sure
> > it is using the one from the build.
> > It looks significantly smaller that the Chromium one, so I'm not sure I've
> > got all the build parameters correct yet.
> 
> I still get the crash using the build from above...
> 
> bp-ba3ef421-69ed-421f-b907-b84592141216
> bp-8f25e0b3-f802-47bc-a657-25b752141216

Thanks ... looks like I'm going to have to set up a different VM or dual boot to Vista to test this. :(
I've managed to get this working locally with a try push running:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b604bb9b8b77

I used MSBuild in the end because that is recommended over devenv for command line building.
It also meant I could pass various parameters to the build to get it to build in our obj dir and create the exe directly in the dist\bin dir.

The vcxproj file is a hacked version that was converted from their vcproj file.
So, I've kept that out of the chromium source directory.

I'm sure I'm doing all sorts of things wrong here, so just asking for feedback at the moment.
Attachment #8538602 - Flags: feedback?(mh+mozilla)
Comment on attachment 8538602 [details] [diff] [review]
Import, build and package Chromium Sandbox wow_helper.

Review of attachment 8538602 [details] [diff] [review]:
-----------------------------------------------------------------

This should be hidden behind a configure test that checks whether msbuild can be run.

::: browser/installer/package-manifest.in
@@ +780,5 @@
>  ; For process sandboxing
>  #if defined(MOZ_SANDBOX)
>  #if defined(XP_WIN)
>  @BINPATH@/@DLL_PREFIX@sandboxbroker@DLL_SUFFIX@
> +#if defined(_X86_)

I doubt we have such a macro defined for package-manifest.in.

::: build/win32/mozconfig.vs2013-win64
@@ +11,5 @@
>  export LIBPATH=/c/Program\ Files\ \(x86\)/Windows\ Kits/8.1/Lib/winv6.3/um/x86:${_VSPATH}/vc/lib:${_VSPATH}/vc/atlmfc/lib:/c/tools/sdks/dx10/lib
>  export LIB=/c/Program\ Files\ \(x86\)/Windows\ Kits/8.1/Lib/winv6.3/um/x86:${_VSPATH}/vc/lib:${_VSPATH}/vc/atlmfc/lib:/c/tools/sdks/dx10/lib
>  
>  ## paths: win8.1 sdk x86 (32-bit) tools, msvc (64-bit compiling 32-bit) build toolchain, moz tools  ##
> +export PATH="/c/Program Files (x86)/Windows Kits/8.1/bin/x86:${_VSPATH}/Common7/IDE:${_VSPATH}/VC/BIN/amd64_x86:${_VSPATH}/VC/BIN/amd64:${_VSPATH}/Common7/Tools:${_VSPATH}/VC/VCPackages:/c/mozilla-build/moztools:/c/Program Files (x86)/MSBuild/12.0/Bin:${PATH}"

12.0? That seems weird.

::: security/sandbox/win/wow_helper/Makefile.in
@@ +7,5 @@
> +else
> +config_prop = Release
> +endif
> +
> +builddir = `pwd -W`

$(CURDIR)

@@ +8,5 @@
> +config_prop = Release
> +endif
> +
> +builddir = `pwd -W`
> +top_builddir = $(builddir)/$(DEPTH)

$(abspath $(DEPTH))

@@ +10,5 @@
> +
> +builddir = `pwd -W`
> +top_builddir = $(builddir)/$(DEPTH)
> +
> +libs::

misc::

@@ +11,5 @@
> +builddir = `pwd -W`
> +top_builddir = $(builddir)/$(DEPTH)
> +
> +libs::
> +	(MSBuild.exe -property:"Configuration=$(config_prop);Platform=x64;MozBuildDir=$(builddir);MozTopBuildDir=$(top_builddir)" $(topsrcdir)/security/sandbox/win/wow_helper/wow_helper.vcxproj)

I think we could kind of live with something like this, but, if we can manage to have it a little better, that would be awesome: can you try to make msbuild only handle the compilation part, so that we can link ourselves (just worry about the former, I'll come up with something that works for the latter if that works) ?
Attachment #8538602 - Flags: feedback?(mh+mozilla)
I've split this out into different patches.
This just imports the code and adds the converted and modified version of their vcproj file.
Attachment #8538602 - Attachment is obsolete: true
Attachment #8539195 - Flags: review?(mh+mozilla)
I've addressed your original feedback even though you are going to write better versions of these build files.

Try push with these two patches:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9d9a838f3730

(In reply to Mike Hommey [:glandium] from comment #21)

> This should be hidden behind a configure test that checks whether msbuild
> can be run.

Added check for msbuild.exe.
I had to add the .exe or it doesn't find it locally, something to do with PATHEXT env var I think.
Hopefully this will work on the build servers.

> ::: build/win32/mozconfig.vs2013-win64
> > +export PATH="/c/Program Files (x86)/Windows Kits/8.1/bin/x86:${_VSPATH}/Common7/IDE:${_VSPATH}/VC/BIN/amd64_x86:${_VSPATH}/VC/BIN/amd64:${_VSPATH}/Common7/Tools:${_VSPATH}/VC/VCPackages:/c/mozilla-build/moztools:/c/Program Files (x86)/MSBuild/12.0/Bin:${PATH}"
> 
> 12.0? That seems weird.

I've removed this now as it looks like we don't need it, but the try push might prove otherwise.
 
> > +builddir = `pwd -W`
> 
> $(CURDIR)
 
> > +top_builddir = $(builddir)/$(DEPTH)
> 
> $(abspath $(DEPTH))

I moved these directly into the msbuild command.

> > +libs::
> 
> misc::

Changed, although I know from IRC you have a better solution.
 
> > +	(MSBuild.exe -property:"Configuration=$(config_prop);Platform=x64;MozBuildDir=$(builddir);MozTopBuildDir=$(top_builddir)" $(topsrcdir)/security/sandbox/win/wow_helper/wow_helper.vcxproj)
> 
> I think we could kind of live with something like this, but, if we can
> manage to have it a little better, that would be awesome: can you try to
> make msbuild only handle the compilation part, so that we can link ourselves
> (just worry about the former, I'll come up with something that works for the
> latter if that works) ?

This just compiles now, the target had to be ClCompile not just Compile.
This is just the package part, we need the linking adding back in before this will work, so I left it out of the try push.

Just realised that we should add a check for MSBUILD as well.
I think that might mean we would need to AC_DEFINE it in configure.in.
Looks like we do need the msbuild path adding for the build servers.

New try push:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=7d3c4acd1957

This time it finds msbuild and actually compiles. :)
Attachment #8539209 - Attachment is obsolete: true
(In reply to Mike Hommey [:glandium] from comment #21)
> Comment on attachment 8538602 [details] [diff] [review]

> > +libs::
> > +	(MSBuild.exe -property:"Configuration=$(config_prop);Platform=x64;MozBuildDir=$(builddir);MozTopBuildDir=$(top_builddir)" $(topsrcdir)/security/sandbox/win/wow_helper/wow_helper.vcxproj)
> 
> I think we could kind of live with something like this, but, if we can
> manage to have it a little better, that would be awesome: can you try to
> make msbuild only handle the compilation part, so that we can link ourselves
> (just worry about the former, I'll come up with something that works for the
> latter if that works) ?

The absence of wow_helper will already cause the GMP process to crash Firefox in the current Live release. So, we're going to want to get this into Firefox 35, if we can.

So, given your comment here, I've created a full build and package patch which addresses your other original comments.
I wonder if you would be happy to r+ the import one and this one and then we can pick up the build improvements in a follow-up patch?
Attachment #8540282 - Flags: review?(mh+mozilla)
Try push with the "Import" and "Build and Package" patches:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=7ab3becd2223
Comment on attachment 8539195 [details] [diff] [review]
Import Chromium Sandbox wow_helper code and add a custom vcxproj for it.

Review of attachment 8539195 [details] [diff] [review]:
-----------------------------------------------------------------

Please find someone to review to code itself.

::: security/sandbox/win/wow_helper/wow_helper.vcxproj
@@ +97,5 @@
> +  </ItemDefinitionGroup>
> +  <ItemGroup>
> +    <ClCompile Include="..\..\chromium\sandbox\win\wow_helper\service64_resolver.cc" />
> +    <ClCompile Include="..\..\chromium\sandbox\win\wow_helper\target_code.cc" />
> +    <ClCompile Include="..\..\chromium\sandbox\win\wow_helper\wow_helper.cc" />

Why not put this file next to the others and avoid those long error-prone paths?
Attachment #8539195 - Flags: review?(mh+mozilla) → feedback+
Attachment #8539210 - Attachment is obsolete: true
Attachment #8539279 - Attachment is obsolete: true
Comment on attachment 8540282 [details] [diff] [review]
Build and Package Chromium Sandbox wow_helper.

Review of attachment 8540282 [details] [diff] [review]:
-----------------------------------------------------------------

::: configure.in
@@ +460,5 @@
>          _WIN32_MSVC=1
>          AC_CHECK_PROGS(MIDL, midl)
> +        AC_CHECK_PROGS(MSBUILD, msbuild.exe)
> +        if test -n "$MSBUILD"; then
> +            AC_DEFINE(MSBUILD)

This is overkill. You should just add something like:

if CONFIG['MSBUILD'] and CONFIG['CPU_ARCH'] == 'x86':
    DEFINES['WOWHELPER'] = True

in browser/installer/Makefile.in and #ifdef WOWHELPER in package-manifest.in.

::: security/sandbox/win/wow_helper/Makefile.in
@@ +10,5 @@
> +
> +misc::
> +	# All our tools are set up to build x86, but wow_helper needs to be x64, so we
> +	# use MSBuild and a vcxproj.
> +	($(MSBUILD) -target:Build -property:"Configuration=$(config_prop);Platform=x64;MozBuildDir=$(CURDIR);MozTopBuildDir=$(abspath $(DEPTH))" $(topsrcdir)/security/sandbox/win/wow_helper/wow_helper.vcxproj)

Since your reason to go with this version is to backport to 35, you have to know that misc doesn't exist in 35. So you'll need something different for 35 anyways. Might as well make it right from the start, which actually has a greater chance to be backportable without a different patch. Give me a moment to find the right incantation.
Attachment #8540282 - Flags: review?(mh+mozilla) → feedback-
For whoever is around close to the holidays. In the end, I found a not too awful way to make this work without msbuild. This built properly locally, I've pushed it to try:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=2b3776ec64d0
Bob, can you check the resulting build works as expected?
Attachment #8540282 - Attachment is obsolete: true
Attachment #8541136 - Flags: review?(ted)
Attachment #8541136 - Flags: review?(mshal)
Attachment #8541136 - Flags: review?(gps)
(In reply to Bob Owen (:bobowen) from comment #26)
> The absence of wow_helper will already cause the GMP process to crash
> Firefox in the current Live release. So, we're going to want to get this
> into Firefox 35, if we can.

You should update the tracking flags, btw. They currently only say that 36 is unaffected.
As always, last minute changes break things :)
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=9b90ef76cfc8
Attachment #8541136 - Attachment is obsolete: true
Attachment #8541136 - Flags: review?(ted)
Attachment #8541136 - Flags: review?(mshal)
Attachment #8541136 - Flags: review?(gps)
Attachment #8541145 - Flags: review?(ted)
Attachment #8541145 - Flags: review?(mshal)
Attachment #8541145 - Flags: review?(gps)
Comment on attachment 8541145 [details] [diff] [review]
Build and Package Chromium Sandbox wow_helper

Can't have nice things. My guess is that sccache messes up with it.
Attachment #8541145 - Flags: review?(ted)
Attachment #8541145 - Flags: review?(mshal)
Attachment #8541145 - Flags: review?(gps)
Also excluding non MSVC.
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=01be4d403167
Attachment #8541145 - Attachment is obsolete: true
Attachment #8541175 - Flags: review?(ted)
Attachment #8541175 - Flags: review?(mshal)
Attachment #8541175 - Flags: review?(gps)
Comment on attachment 8541175 [details] [diff] [review]
Build and Package Chromium Sandbox wow_helper

... and paths are different on automation...
Attachment #8541175 - Flags: review?(ted)
Attachment #8541175 - Flags: review?(mshal)
Attachment #8541175 - Flags: review?(gps)
Attachment #8541175 - Attachment is obsolete: true
Comment on attachment 8541221 [details] [diff] [review]
Build and Package Chromium Sandbox wow_helper

Review of attachment 8541221 [details] [diff] [review]:
-----------------------------------------------------------------

I could push back and tell you to write some configure gunk to do this right. But meh.
Attachment #8541221 - Flags: review?(ted)
Attachment #8541221 - Flags: review?(mshal)
Attachment #8541221 - Flags: review?(gps)
Attachment #8541221 - Flags: review+
I wish I could tell why that x86_amd64 cl hangs on automation and not locally... and why it doesn't hang when using msbuild...
By the way, before Christmas I tried applying my original patch to Beta and it didn't seem to require an older version of the Chromium code, which I thought it might.

All I had to do was a bit of rebasing and to change it from misc:: back to libs:: to get it to work.


I've not changed the flags yet, because I can't seem to get anything to attempt to start the GMP Process using Fx34 on my Vista 64 bit VM.
This is a version that (finally) builds. It has a change that gps r+'ed on irc, and a move of the LIB definition under the config.mk include because config.mk includes .mozconfig.mk, which contains the definition of LIB from the mozconfig on automation.

Bob, please check the build there:
https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-990609fd2ce5/try-win32/
Attachment #8541221 - Attachment is obsolete: true
Attachment #8542373 - Flags: review+
Nightly doesn't start with this version of wow_helper.

It does if I copy in the Chrome version still.

The Nightly version is much smaller compared to the Chrome one 26KB vs 72KB.
I wonder if some of the compile options they have in the vcxproj are important for it to work correctly.
Flags: needinfo?(mh+mozilla)
The two main differences that could explain the difference are the different optimization level and the fact that the VC project builds with the static CRT. The former shouldn't make a difference, and depending what the helper does, the latter shouldn't make a difference either.

You could try a build with USE_STATIC_LIBS = True added to the moz.build. I don't have time to do this before going away for the holidays.

If that still fails, I'd advise figuring out how this thing actually works and why it fails.
Flags: needinfo?(mh+mozilla)
Come to think of it, it has to be USE_STATIC_LIBS, because without that we'd need to ship the 64-bits CRT.
(In reply to Mike Hommey [:glandium] (out from Dec 31 to Jan 5 incl.) from comment #43)
> Come to think of it, it has to be USE_STATIC_LIBS, because without that we'd
> need to ship the 64-bits CRT.

OK, thanks.
We have people visiting again later, but I've kicked off a build with your patches to re-produce locally and then I'll try with USE_STATIC_LIBS.
USE_STATIC_LIBS didn't help, I'm not sure how much time I'll have before the 5th Jan.
Updating tracking and status as per comment 31.  Since this was just brought to relman's attention and it's too late for landing to 35 without enough time to shake out any potential regressions, marking wontfix there.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #9)
> Vista64 users are between 0.5% and 1% of our total userbase, so we can't let
> this regression ride the trains beyond nightly.

Is Vista64 still crashing on startup? We should consider stop-gaps to prevent that from happening on Aurora if the proper fix won't be ready in time.
(In reply to Lukas Blakk [:lsblakk] use ?needinfo from comment #46)
> Updating tracking and status as per comment 31.  Since this was just brought
> to relman's attention and it's too late for landing to 35 without enough
> time to shake out any potential regressions, marking wontfix there.

I've not managed to get the GMP Process to start on my Vista 64bit VM for Fx35, so I couldn't be sure that it was affected in reality.

I think the main reason is that my Web Cam doesn't seem to set up correctly on the VM.

jesup - is there a way I can force the GMP Process to attempt to start?
Flags: needinfo?(rjesup)
(In reply to David Major [:dmajor] (UTC+13) from comment #47)
> (In reply to Benjamin Smedberg  [:bsmedberg] from comment #9)
> > Vista64 users are between 0.5% and 1% of our total userbase, so we can't let
> > this regression ride the trains beyond nightly.
> 
> Is Vista64 still crashing on startup? We should consider stop-gaps to
> prevent that from happening on Aurora if the proper fix won't be ready in
> time.

This will only happen if e10s moves to Aurora, which I believe isn't planned until Fx39.

If we can't get this fixed before e10s does move to Aurora, then I have a patch that will turn the content sandbox off for 32-bit Gecko on 64-bit Vista or earlier.
> I've not managed to get the GMP Process to start on my Vista 64bit VM for
> Fx35, so I couldn't be sure that it was affected in reality.
> 
> I think the main reason is that my Web Cam doesn't seem to set up correctly
> on the VM.
> 
> jesup - is there a way I can force the GMP Process to attempt to start?

Easy:
Modify https://mozilla.github.io/webrtc-landing/pc_test.html to have fake: true for both mozGetUserMedia() calls.  Then it will use fake video/audio instead of trying to grab a webcam.
Flags: needinfo?(rjesup)
(In reply to Randell Jesup [:jesup] from comment #50)

> Easy:
> Modify https://mozilla.github.io/webrtc-landing/pc_test.html to have fake:
> true for both mozGetUserMedia() calls.  Then it will use fake video/audio
> instead of trying to grab a webcam.

Worked a treat ... and unfortunately it does crash Fx34 and Fx35.
Summary: Startup crash in TppWaiterpThread on Windows Vista → Startup crash in TppWaiterpThread on Windows Vista 64-bit
tl;dr: the reason this fails with my patch and worked with Bob's is because the whole thing relies on luck. I don't know how much Bob's VC project differs from chromium's, but if chromium's is similar, their binary may break at any time in the future.

First things first, there's something of note that would help future debugging with wow_helper: it exists with a status code depending on where it failed. So looking with e.g. procmon allows to see that with the working wow_helper, its exit code was 0 but with the broken one, it was 103. Which points to https://hg.mozilla.org/try/file/2d36400e8e65/security/sandbox/chromium/sandbox/win/wow_helper/wow_helper.cc#l46.

That got me started, but I originally thought that meant some permission problem, as WriteProcessMemory documentation suggests:
   If the function fails, the return value is 0 (zero). To get extended error information, call GetLastError. The function fails if the requested write operation crosses into an area of the process that is inaccessible.

But since that got me nowhere, I changed that return 103 to return ::GetLastError(), which gave me 87, which is... invalid arguments.

I didn't have a debugger, on the failing machine, but considering the few arguments, and the surrounding code, I figured this could very well be something weird with https://hg.mozilla.org/try/file/2d36400e8e65/security/sandbox/chromium/sandbox/win/wow_helper/wow_helper.cc#l37 and verifying my object files, indeed, TargetEnd was *before* TargetNtMapViewOfSection, so the calculated size was close to 2^64.

Now, as to why this happened to work with Bob's patch: it turns out the VC project enabled whole program optimization, which kind of does like pgo, which is that the compiler doesn't actually compile stuff, but the linker does. And it turns out for some random reason, that puts the two Target* functions in the right order, but this is a very fragile setup.

What this tells is that:
- PGO builds (which nightlies are) may have ended up working by chance, but may not have.
- PGO builds may actually fail to build at all, because we're not changing LD. This worked without the LD change because in the normal case where the compiler compiles, the x86 linker is actually happy to cross-link a x64 executable from the x64 objects. It doesn't know how to cross-compile, though.
- This is fragile both with our build system and likely chromium's.

Now, let's try to come up with the proper fix...
Only different from previous patch is:
+USE_STATIC_LIBS = True
+
+NO_PGO = True

Carrying over r+
Attachment #8542373 - Attachment is obsolete: true
Attachment #8545073 - Flags: review+
This makes the linker put the symbols in the order that wow_helper.cc assumes.
Who can review this?
The wow_helper code without the custom vcxproj file.
Attachment #8539195 - Attachment is obsolete: true
Attachment #8545083 - Flags: review?(mh+mozilla)
Comment on attachment 8545081 [details] [diff] [review]
Increase the chances of the wow_helper target code symbols being in the assumed order

Review of attachment 8545081 [details] [diff] [review]:
-----------------------------------------------------------------

See comment 52 and https://code.google.com/p/chromium/issues/detail?id=446764
Attachment #8545081 - Flags: review?(aklotz)
Attachment #8545083 - Flags: review?(mh+mozilla)
Attachment #8545083 - Flags: review?(aklotz)
Comment on attachment 8545081 [details] [diff] [review]
Increase the chances of the wow_helper target code symbols being in the assumed order

Review of attachment 8545081 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM
Attachment #8545081 - Flags: review?(aklotz) → review+
Comment on attachment 8545083 [details] [diff] [review]
Import Chromium Sandbox wow_helper code.

Review of attachment 8545083 [details] [diff] [review]:
-----------------------------------------------------------------

Fun!

rs=aklotz
Attachment #8545083 - Flags: review?(aklotz) → review+
> +NO_PGO = True

For the sake of future archaeology could you add a comment explaining why NO_PGO?
Comment on attachment 8545083 [details] [diff] [review]
Import Chromium Sandbox wow_helper code.

I certainly think we would want to uplift this to Aurora.
Beta and Release are more questionable given the timescales.

Approval Request Comment
[Feature/regressing bug #]:
Ultimately, this was missed from the original import of the Windows Chromium sandbox code (bug 922756), because it only affects Vista 64-bit (possibly WinXP 64-bit as well).

[User impact if declined]:
Anything that tries to start the GMP Process on Vista 64-bit will crash firefox.

[Describe test coverage new/current, TBPL]:
We don't currently have any automated testing on Vista 64-bit.

[Risks and why]:
Low: these patches introduce a new separate wow_helper.exe file that is only used on Vista 64-bit or earlier, when starting a sandboxed process.
At the moment the process fails to start completely and crashes the main firefox process as well.

[String/UUID change made/needed]:
None.
Attachment #8545083 - Flags: approval-mozilla-release?
Attachment #8545083 - Flags: approval-mozilla-beta?
Attachment #8545083 - Flags: approval-mozilla-aurora?
Comment on attachment 8545081 [details] [diff] [review]
Increase the chances of the wow_helper target code symbols being in the assumed order

See comment 63
Attachment #8545081 - Flags: approval-mozilla-release?
Attachment #8545081 - Flags: approval-mozilla-beta?
Attachment #8545081 - Flags: approval-mozilla-aurora?
Comment on attachment 8545073 [details] [diff] [review]
Build and Package Chromium Sandbox wow_helper

See comment 63.

This is the only patch that doesn't apply cleanly on Beta, but only because of some minor context changes.
Attachment #8545073 - Flags: approval-mozilla-release?
Attachment #8545073 - Flags: approval-mozilla-beta?
Attachment #8545073 - Flags: approval-mozilla-aurora?
Comment on attachment 8545083 [details] [diff] [review]
Import Chromium Sandbox wow_helper code.

OK, we don't support vista 64 bit in an official capacity but since this is a serious startup crasher for those users and this code only affects those same users we are probably erring on the riskier but better for the users side of things if we take this into our 35.0 build 3 that we have to do today for bug 1119189 anyway.  So let's get this in pronto.
Attachment #8545083 - Flags: approval-mozilla-release?
Attachment #8545083 - Flags: approval-mozilla-release+
Attachment #8545083 - Flags: approval-mozilla-beta?
Attachment #8545083 - Flags: approval-mozilla-aurora?
Attachment #8545083 - Flags: approval-mozilla-aurora+
Comment on attachment 8545081 [details] [diff] [review]
Increase the chances of the wow_helper target code symbols being in the assumed order

btw, not approving for mozilla-beta because it's merge day on monday and mozilla-beta has already been merged to mozilla-release.
Attachment #8545081 - Flags: approval-mozilla-release?
Attachment #8545081 - Flags: approval-mozilla-release+
Attachment #8545081 - Flags: approval-mozilla-beta?
Attachment #8545081 - Flags: approval-mozilla-aurora?
Attachment #8545081 - Flags: approval-mozilla-aurora+
Attachment #8545073 - Flags: approval-mozilla-release?
Attachment #8545073 - Flags: approval-mozilla-release+
Attachment #8545073 - Flags: approval-mozilla-beta?
Attachment #8545073 - Flags: approval-mozilla-aurora?
Attachment #8545073 - Flags: approval-mozilla-aurora+
This is not required on m-c, since we dropped support for Windows SDK < v8.1, but we need this on aurora, beta and release, more specifically for seamonkey and possibly thunderbird, because they're currently failing to build.

Pending validation from ewong that this does, in fact, fix the issue for seamonkey.
Attachment #8546565 - Flags: review?(gps)
Attachment #8546565 - Flags: feedback?(ewong)
Reproduced the original crash with yesterday's Nightly, on Windows Vista x64 (two separate machines) after enabling e10s and restarting the browser. The issue no longer reproduces with the same scenario and the latest Firefox 37 Nightly (BuildID=20150109030224). Latest 36 Aurora and Firefox 35 RC build 3 also start with no issues (not failing before anyway since this was an e10s issue).

One interesting thing I noticed was that the size of the "wow_helper.exe" file differs between 35 and Aurora/Nightly:
- Firefox 35      - ~72 kb
- Firefox 36 & 37 - ~124 kb

I'm not sure if this difference in size has any significance. Let us know if this is all expected, then we can mark this issue as Verified.
The difference probably comes from aurora and nightlies being built with different flags. Also, beta may still be building with MSVC 2010.
Comment on attachment 8546565 [details] [diff] [review]
Followup to avoid build failure with Windows SDK v7.0 and v7.0A

Review of attachment 8546565 [details] [diff] [review]:
-----------------------------------------------------------------

Oy.
Attachment #8546565 - Flags: review?(gps) → review+
Comment on attachment 8546565 [details] [diff] [review]
Followup to avoid build failure with Windows SDK v7.0 and v7.0A

Approval Request Comment
[Feature/regressing bug #]: This is a followup for what landed in this bug on aurora/beta/release, and broke seamonkey, and maybe thunderbird.
[User impact if declined]: Seamonkey can't ship.
[Describe test coverage new/current, TBPL]: Waiting for ewong's feedback whether it works or not, so please do not land before that happens, but considering the timing, I'm requesting a? soon. Note the patch shouldn't land on m-c, where we don't support WinSDK 7 anymore.
[Risks and why]: Low. The patch is not changing much of what was added in this bug. Technically, the patch is not needed for Firefox release and beta, which build with winsdk 8, so it /could/ land on a separate release branches for comm-central products.
[String/UUID change made/needed]: None
Attachment #8546565 - Flags: approval-mozilla-release?
Attachment #8546565 - Flags: approval-mozilla-beta?
Attachment #8546565 - Flags: approval-mozilla-aurora?
Comment on attachment 8546565 [details] [diff] [review]
Followup to avoid build failure with Windows SDK v7.0 and v7.0A

> Pending validation from ewong that this does, in fact, fix the issue
> for seamonkey.
Taking over from ewong (he's in bed sound asleep at the moment). I can confirm that with this patch I can build SeaMonkey comm-release. Without this patch the build fails when trying to build the wow_helper.exe.
Attachment #8546565 - Flags: feedback?(ewong) → feedback+
Attachment #8546565 - Flags: approval-mozilla-beta?
Attachment #8546565 - Flags: approval-mozilla-beta-
Attachment #8546565 - Flags: approval-mozilla-aurora?
Attachment #8546565 - Flags: approval-mozilla-aurora+
Comment on attachment 8546565 [details] [diff] [review]
Followup to avoid build failure with Windows SDK v7.0 and v7.0A

Approving this to land on release branch on a relbranch (as well as default so it's present for any potential dot release) so that SM/TB builds can occur off that relbranch.
Attachment #8546565 - Flags: approval-mozilla-release? → approval-mozilla-release+
Comment on attachment 8546565 [details] [diff] [review]
Followup to avoid build failure with Windows SDK v7.0 and v7.0A

Review of attachment 8546565 [details] [diff] [review]:
-----------------------------------------------------------------

To be clear, this missed uplifts, so I'll be landing on -release today, and on -beta (instead of aurora) sometime after the merges are done.

This is explicitly not landing on aurora/trunk due to not supporting the relevant SDK version.

(I got an ack in IRC from Sledru and lsblakk on the above plan matching everyone's assumptions)
(In reply to Lukas Blakk [:lsblakk] use ?needinfo from comment #76)
> Comment on attachment 8546565 [details] [diff] [review]
> Followup to avoid build failure with Windows SDK v7.0 and v7.0A
> 
> Approving this to land on release branch on a relbranch (as well as default
> so it's present for any potential dot release) so that SM/TB builds can
> occur off that relbranch.

Landed on m-r:

 https://hg.mozilla.org/releases/mozilla-release/rev/b2c4309cea44
 https://hg.mozilla.org/releases/mozilla-release/rev/601f9d329a5f

That is RELBRANCH: SEA_2_32_RELBRANCH (branched off GECKO350_2015010823_RELBRANCH  which was used for Firefox Build 3)

And on default
Comment on attachment 8546565 [details] [diff] [review]
Followup to avoid build failure with Windows SDK v7.0 and v7.0A

https://hg.mozilla.org/releases/mozilla-beta/rev/fe217a0d2e9a
Attachment #8546565 - Flags: checkin+
For anyone who needs some info.

I'm on Vista 32bit.
Downloaded Beta 38.0  to test.
Apart from a couple of addons (obviously not configured to work with version 38 as yet) I'm not experiencing anything wrong.

No crash on startup.
All three pop mail accounts downloaded ok.
Sent mail ok.

I cleared the error console, restarted from shortcut icon on desktop and the following displayed upon startup. 
Not sure if anything is relevant, so included everything.

----------------------------------

Could not read chrome manifest 'file:///C:/Program%20Files/Mozilla%20Thunderbird/chrome.manifest'.

Could not read chrome manifest 'file:///C:/Program%20Files/Mozilla%20Thunderbird/extensions/%7B972ce4c6-7e08-4474-a285-3208198ce6fd%7D/chrome.manifest'.

Timestamp: 7/4/15 12:46:15 PM
Warning: Unknown property 'text-emphasis'.  Declaration dropped.
Source File: resource://gre-resources/html.css
Line: 806, Column: 17
Source Code:
    text-emphasis: none;


Timestamp: 7/4/15 12:46:15 PM
Warning: Error in parsing value for 'unicode-bidi'.  Declaration dropped.
Source File: resource://gre-resources/html.css
Line: 820, Column: 18
Source Code:
    unicode-bidi: isolate;

Timestamp: 7/4/15 12:46:15 PM
Error: SyntaxError: const declaration not directly within block
Source File: jar:file:///C:/Users/XXXXX/AppData/Roaming/Thunderbird/Profiles/yba3h802.default/extensions/%7B3e17310d-82e8-4a43-bd2f-7c3055bfe589%7D.xpi!/components/vcf.js
Line: 153, Column: 1
Source Code:
 const NSGetFactory = XPCOMUtils.generateNSGetFactory([vcfMod

Contract ID '@mozilla.org/commandlinehandler/general-startup;1?type=myVCF' was registered as a command line handler for entry 'm-VCF', but could not be created.

Timestamp: 7/4/15 12:46:15 PM
Error: SyntaxError: const declaration not directly within block
Source File: jar:file:///C:/Users/XXXXX/AppData/Roaming/Thunderbird/Profiles/yba3h802.default/extensions/%7B3e17310d-82e8-4a43-bd2f-7c3055bfe589%7D.xpi!/components/openAB.js
Line: 151, Column: 1
Source Code:
 const NSGetFactory = XPCOMUtils.generateNSGetFactory([openAb

Contract ID '@mozilla.org/commandlinehandler/general-startup;1?type=myopenAB' was registered as a command line handler for entry 'm-openAB', but could not be created.

Timestamp: 7/4/15 12:46:16 PM
Warning: Failed to load overlay from chrome://global/content/charsetOverlay.xul.
Source File: chrome://messenger/content/messenger.xul
Line: 0

Timestamp: 7/4/15 12:46:17 PM
Warning: JavaScript 1.6's for-each-in loops are deprecated; consider using ES6 for-of instead
Source File: chrome://messenger/content/about-support/init.js
Line: 91, Column: 6
Source Code:
  for each (let [key, value] in Iterator(opt_attributes))

Timestamp: 7/4/15 12:46:17 PM
Warning: JavaScript 1.6's for-each-in loops are deprecated; consider using ES6 for-of instead
Source File: chrome://messenger/content/about-support/accounts.js
Line: 86, Column: 8
Source Code:
    for each (let [, tds] in Iterator(outgoingTDs.slice(1)))


Timestamp: 7/4/15 12:46:18 PM
Warning: JavaScript 1.6's for-each-in loops are deprecated; consider using ES6 for-of instead
Source File: chrome://messenger/content/about-support/prefs.js
Line: 115, Column: 22
Source Code:
                      for each (prefName in prefNames)

Timestamp: 7/4/15 12:46:19 PM
Warning: This site makes use of a SHA-1 Certificate; it's recommended you use certificates with signature algorithms that use hash functions stronger than SHA-1.
Source File: https://addons.cdn.mozilla.net/user-media/addon_icons/0/956-32.png?modified=1411668541
Line: 0

Timestamp: 7/4/15 12:46:19 PM
Warning: This site makes use of a SHA-1 Certificate; it's recommended you use certificates with signature algorithms that use hash functions stronger than SHA-1.
Source File: https://addons.cdn.mozilla.net/user-media/addon_icons/5/5373-32.png?modified=1352658732
Line: 0

Timestamp: 7/4/15 12:47:15 PM
Error: path is null
Source File: resource://gre/modules/osfile/ospath_win.jsm
Line: 220

Timestamp: 7/4/15 12:48:54 PM
Warning: Failed to load overlay from chrome://global/content/charsetOverlay.xul.
Source File: chrome://messenger/content/messengercompose/messengercompose.xul
Line: 0

Timestamp: 7/4/15 12:48:54 PM
Warning: window.controllers is deprecated. Do not use it for UA detection.
Source File: chrome://editor/content/ComposerCommands.js
Line: 198
You need to log in before you can comment on or make changes to this bug.