Closed Bug 1372958 Opened 4 years ago Closed 4 years ago

MinGW Debug Builds cause SIGTRAP and do not run

Categories

(Core :: JavaScript Engine: JIT, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla58
Tracking Status
firefox58 --- fixed

People

(Reporter: tjr, Assigned: tjr)

References

(Blocks 1 open bug)

Details

(Whiteboard: [tor])

Attachments

(2 files)

Adding the log Tom has been linking to as it expires within 24 hours.
Attachment #8877732 - Attachment mime type: text/x-log → text/plain
Based on discussion with :dmajor in #jsapi, this is due to stack alignment checks.

On Windows, our C++ compilers only provide incoming stack alignment of 4-bytes. Due to the code in (https://dxr.mozilla.org/mozilla-central/source/js/src/jit/x86/Assembler-x86.h#130), we use the __GNUC__ macro to decide alignment.

The actual fault comes from here: https://searchfox.org/mozilla-central/rev/05c4c3bc0cfb9b0fc66bdfc8c47cac674e45f151/js/src/jit/x86/Trampoline-x86.cpp#58

The alignment check is primarily used by WASM currently, but it is best to fix this and not ignore the debug check.

David, did you have any update or bug number for this?
Blocks: 850548
Flags: needinfo?(dmajor)
Priority: -- → P3
(In reply to Ted Campbell [:tcampbell] from comment #2)
> David, did you have any update or bug number for this?

I was originally looking into bug 1330608 comment 108 but then ran into this. The msvcrt problem mentioned in that comment is only an issue on Win7. On later Windows we successfully get past the DLL loading steps and later hit this stack alignment assert.

Now that we've narrowed it down to the ifdef __GNUC__, I was going to let tjr take it from here when he's back from PTO.
Flags: needinfo?(dmajor)
Try run: https://treeherder.mozilla.org/#/jobs?repo=try&revision=d47ae1add292f2eec7c3d4f2a5a5317af1ebd757

Still crashes; will need to investigate more (and learn how to), just don't want to lose the try link.
We're calling DWriteFontFace::GetDesignGlyphAdvances with bad parameters:

https://dxr.mozilla.org/mozilla-central/rev/5eba13f5b3a6ad80decdd8c7b30bff5fa477844f/gfx/thebes/gfxDWriteFonts.cpp#633

0:000:x86> dds @esp L6
00656a7c  0443cf4f xul!XRE_GetBootstrap+0x10330eb <-- return address (I don't have symbols)
00656a80  14995b90                                <-- this (looks fine, vtable checks out)
00656a84  41400000                                <-- glyphCount (!!!)
00656a88  3f800000                                <-- glyphIndices
00656a8c  00000000                                <-- glyphAdvances
00656a90  00000000                                <-- isSideways

For some reason the compiler tried to pass these on the floating point stack?

0443cf40 d9e8            fld1
0443cf42 d95c2408        fstp    dword ptr [esp+8]
0443cf46 d95c2404        fstp    dword ptr [esp+4]
0443cf4a 891c24          mov     dword ptr [esp],ebx
0443cf4d ffd6            call    esi

Does GetDesignGlyphAdvances have the right calling convention in the header?
> For some reason the compiler tried to pass these on the floating point stack?
> 
> 0443cf40 d9e8            fld1
> 0443cf42 d95c2408        fstp    dword ptr [esp+8]
> 0443cf46 d95c2404        fstp    dword ptr [esp+4]
> 0443cf4a 891c24          mov     dword ptr [esp],ebx
> 0443cf4d ffd6            call    esi

I misunderstood this sequence of instructions. Rather: it's trying to push floating point numbers onto the normal stack. It's as if it thinks the first parameter of GetDesignGlyphAdvances is a float, but it should look like: https://msdn.microsoft.com/en-us/library/windows/desktop/hh780411(v=vs.85).aspx
On closer look at the caller, I think we're actually executing the code that wants to call GetGdiCompatibleGlyphAdvances: https://dxr.mozilla.org/mozilla-central/rev/5eba13f5b3a6ad80decdd8c7b30bff5fa477844f/gfx/thebes/gfxDWriteFonts.cpp#638

But we got the wrong vtable offset and ended up in GetDesignGlyphAdvances.
This is what the first few entries of the vtable ought to look like, do the headers match this?

0:000:x86> dds poi ebx L20
6a6d2184  6a77b290 dwrite!DWriteFontFace::QueryInterface
6a6d2188  6a7930b0 dwrite!DWriteFontFace::AddRef
6a6d218c  6a793050 dwrite!DWriteFontFace::Release
6a6d2190  6a7e0c30 dwrite!DWriteFontFace::GetType
6a6d2194  6a8230c0 dwrite!DWriteFontFace::GetFiles
6a6d2198  6a7e0bb0 dwrite!DWriteFontFace::GetIndex
6a6d219c  6a7d4580 dwrite!DWriteFontFace::GetSimulations
6a6d21a0  6a7d4560 dwrite!DWriteFontFace::IsSymbolFont
6a6d21a4  6a7d7240 dwrite!DWriteFontFace::GetMetrics
6a6d21a8  6a7e0b90 dwrite!DWriteFontFace::GetGlyphCount
6a6d21ac  6a79d700 dwrite!DWriteFontFace::GetDesignGlyphMetrics
6a6d21b0  6a7c52f0 dwrite!DWriteFontFace::GetGlyphIndicesW
6a6d21b4  6a7dcc20 dwrite!DWriteFontFace::TryGetFontTable
6a6d21b8  6a7de3f0 dwrite!DWriteFontFace::ReleaseFontTable
6a6d21bc  6a779db0 dwrite!DWriteFontFace::GetGlyphRunOutline
6a6d21c0  6a8236c0 dwrite!DWriteFontFace::GetRecommendedRenderingMode
6a6d21c4  6a7cb340 dwrite!DWriteFontFace::GetGdiCompatibleMetrics
6a6d21c8  6a7e22d0 dwrite!DWriteFontFace::GetGdiCompatibleGlyphMetrics
6a6d21cc  6a78d360 dwrite!DWriteFontFace::GetMetrics
6a6d21d0  6a7cb3d0 dwrite!DWriteFontFace::GetGdiCompatibleMetrics
6a6d21d4  6a822ff0 dwrite!DWriteFontFace::GetCaretMetrics
6a6d21d8  6a7dffe0 dwrite!DWriteFontFace::GetUnicodeRanges
6a6d21dc  6a7dd800 dwrite!DWriteFontFace::IsMonospacedFont
6a6d21e0  6a77a870 dwrite!DWriteFontFace::GetDesignGlyphAdvances
6a6d21e4  6a7e7600 dwrite!DWriteFontFace::GetGdiCompatibleGlyphAdvances
(In reply to David Major [:dmajor] from comment #8)
> This is what the first few entries of the vtable ought to look like, do the
> headers match this?

Looking at both
https://sourceforge.net/p/mingw-w64/mingw-w64/ci/master/tree/mingw-w64-headers/include/dwrite_1.h#l52
and 
https://sourceforge.net/p/mingw-w64/mingw-w64/ci/master/tree/mingw-w64-headers/include/dwrite_2.h#l369

They both seem to have all of those functions, in that order. (Although MinGW omits the W from GetGlyphIndicesW)
Flags: needinfo?(jacek)
IDWriteFontFace1 ought to have two GetMetrics methods, one inherited from IDWriteFontFace:

virtual void GetMetrics(
  [out] DWRITE_FONT_METRICS * fontFaceMetrics
) = 0;

And another one specific to the derived class:

virtual void GetMetrics(
  [out] DWRITE_FONT_METRICS1 * fontMetrics
) = 0;

This line in the derived interface re-declares the one from the base, it's missing a "1":
https://sourceforge.net/p/mingw-w64/mingw-w64/ci/master/tree/mingw-w64-headers/include/dwrite_1.h#l135
Agreed, it's broken in mingw-w64. Great catch, thanks.
Flags: needinfo?(jacek)
New try run with the DWrite fix: https://treeherder.mozilla.org/#/jobs?repo=try&revision=dfd989deea7e0dba6d1b1884aa66bb284be0ea21

Still crashes. What is needed to set up symbols and such for these builds?
(In reply to Tom Ritter [:tjr] from comment #13)
> Still crashes. What is needed to set up symbols and such for these builds?

Breakpad doesn't currently have any support for mingw's DWARF-in-PE scheme, but assuming these binaries are built with -g (which I see in CFLAGS/CXXFLAGS in the build log) and we're not stripping them during packaging, then they ought to be debuggable with the mingw gdb.
Comment on attachment 8919989 [details]
Bug 1372958 Stack alignment on Windows is 4 bytes, not 16

https://reviewboard.mozilla.org/r/190930/#review197456

This looks fine to me but someone from JS should have a say too.
Attachment #8919989 - Flags: review?(dmajor) → review?(luke)
Comment on attachment 8919989 [details]
Bug 1372958 Stack alignment on Windows is 4 bytes, not 16

https://reviewboard.mozilla.org/r/190932/#review197642

Makes sense, thanks!
Attachment #8919989 - Flags: review?(luke) → review+
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/6780969991ed
Stack alignment on Windows is 4 bytes, not 16 r=luke
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/6780969991ed
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
You need to log in before you can comment on or make changes to this bug.