Closed Bug 1018402 Opened 6 years ago Closed 3 years ago

VP8VideoTrackEncoder.FrameEncode | SEH exception with code 0xc0000005 thrown in the test body. @ (null):-1 when GTest is enabled on a VS2013 Debug build

Categories

(Core :: Audio/Video: Recording, defect, P4)

x86_64
Windows 8.1
defect

Tracking

()

RESOLVED FIXED

People

(Reporter: briansmith, Assigned: dmajor)

References

Details

(Whiteboard: [Start at comment 17])

Attachments

(6 files, 2 obsolete files)

Marked security-sensitive since I don't know if this is indicative of a security problem.

Here's my .mozconfig:
. $topsrcdir/browser/config/mozconfig
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/../ff-dbg
ac_add_options --enable-debug 
ac_add_options --disable-crashreporter
ac_add_options --enable-logging
ac_add_options --enable-tests
ac_add_options --enable-gtest
ac_add_options --disable-optimize
ac_add_options --enable-warnings-as-errors
Attached patch Disable crashing test (obsolete) — Splinter Review
Chris, could you please look at this and tell me if it is OK to temporarily disable this test while we look for a solution? This is the *only* GTest that fails on Windows, and we can't turn on GTests for Windows without doing something about this test. Some of the mozilla::pkix (certificate verification) tests are GTests and so we'd like to get GTest on Windows running soon.

I am not familiar with who knows what in this module, so if you're not an appropriate reviewer for this, please let me know who is.

Thanks!
Attachment #8431839 - Flags: review?(cpearce)
Can you provide more details about the crash? 0xc0000005 is an access violation. Does the test print a stack, or can you trap the crash in a debugger and get a stack there?
Comment on attachment 8431839 [details] [diff] [review]
Disable crashing test

Review of attachment 8431839 [details] [diff] [review]:
-----------------------------------------------------------------

I definitely want gtest working on Windows.

C.J. can you review this or pass the review onto the right person? Thanks!
Attachment #8431839 - Flags: review?(cpearce) → review?(cku)
The stack trace was generated by running this:
GTEST_CATCH_EXCEPTIONS=0 GTEST_FILTER="VP8VideoTrackEncoder.FrameEncode" mach gtest
This is the patch I'm using to get GTests running on Windows.
Comment on attachment 8431839 [details] [diff] [review]
Disable crashing test

Review of attachment 8431839 [details] [diff] [review]:
-----------------------------------------------------------------

Please disable it. We will figure out root cause and enable it again after.
Attachment #8431839 - Flags: review?(cku) → review+
Assignee: nobody → bechen
I found that the test does *not* crash when compiled with VS2010 on my system, using this (similar) mozconfig:

. $topsrcdir/browser/config/mozconfig
mk_add_options MOZ_OBJDIR=@TOPSRCDIR@/../ff-dbg-vs2010
ac_add_options --enable-debug 
ac_add_options --disable-crashreporter
ac_add_options --enable-logging
ac_add_options --enable-tests
ac_add_options --enable-gtest
ac_add_options --disable-optimize
export XMOZ_PSEUDO_DERECURSE=no-pymake

Therefore, I changed my patch to only disable the test when compiled with VS2013. I believe this means that this bug is less security-sensitive than before.

Thanks for the review.
Attachment #8431839 - Attachment is obsolete: true
Attachment #8434596 - Flags: review+
No longer blocks: 883339
Comment on attachment 8434596 [details] [diff] [review]
Disable VP8VideoTrackEncoder.FrameEncode only when compiled with MS VS2013 [v2]

Review of attachment 8434596 [details] [diff] [review]:
-----------------------------------------------------------------

https://hg.mozilla.org/integration/mozilla-inbound/rev/7673b3a07499
Attachment #8434596 - Flags: checkin+
I tried to reproduce this crash with the mozconfig from comment 9, but |mach gtest| failed with "couldn't load XPCOM" because I don't have a dependentlibs.list.gtest under dist/bin. Do I need to have some other patches applied or something?
Flags: needinfo?(brian)
(In reply to David Major [:dmajor] (UTC+12) from comment #12)
> I tried to reproduce this crash with the mozconfig from comment 9, but |mach
> gtest| failed with "couldn't load XPCOM" because I don't have a
> dependentlibs.list.gtest under dist/bin. Do I need to have some other
> patches applied or something?

Yes, you need one of the patches in bug 883339.
Flags: needinfo?(brian)
Thanks for the tip. Somehow VS2013 is messing up these struct offsets:

http://hg.mozilla.org/mozilla-central/file/7146e89a7b83/media/libvpx/vp8/encoder/x86/quantize_ssse3.asm#l50

2013:
   50 5a2ddefb 8b8778090000    mov     eax,dword ptr [edi+978h]
   51 5a2ddf01 8b8f78090000    mov     ecx,dword ptr [edi+978h]
   52 5a2ddf07 8b9778090000    mov     edx,dword ptr [edi+978h]

2010:
   50 1011707b 8b4704          mov     eax,dword ptr [edi+4]
   51 1011707e 8b4f1c          mov     ecx,dword ptr [edi+1Ch]
   52 10117081 8b570c          mov     edx,dword ptr [edi+0Ch]

VS2013's objdir/media/libvpx/vp8_asm_enc_offsets.asm is full of things like
vp8_block_coeff                          EQU  2424
vp8_block_zbin                           EQU  2424
vp8_block_round                          EQU  2424
...
The build uses a program host_obj_int_extract.exe to parse vp8_asm_enc_offsets.obj and generate the .asm file with the offsets. I am guessing something changed in the .obj format produced by VS2013. The actual .exe seems fine (running it on 2010's obj file produces the correct results).

This probably doesn't need to be security-sensitive.
Blocks: VC12
Group: core-security
I wonder if this also relates to Bug 939551.
I've found the difference in the obj files. Under VS2010, the various variables (_vp8_block_coeff, etc.) are placed into a single .rdata section, with a single corresponding .debug$S section. Under VS2012, we get one .rdata and one .debug$S for each of the symbols.

The extraction program assumes there will only be one .rdata section, so it stores a single pointer for that section [1]. That's how we get the wrong values in comment 14.

On my machine, there are 91 sections in VS2012's obj file, but I can easily see how a slight tweak could push us over the 96 limit seen in bug 939551 and bug 934984.

Can someone who speaks COFF chime in on whether it's normal to have multiple .rdata sections? Is VS2012 doing something wrong, or is the extraction program wrong in its assumptions?

[1] http://dxr.mozilla.org/mozilla-central/search?tree=mozilla-central&q=sectionrawdata_ptr
I only speak ELF, but I guess this is similar to what GCC/clang do with -fdata-sections, which allows to fine-grain-remove unused variable at link time.

I guess we're talking about the vp8_asm_*_offsets stuff, right? I can see two ways out:
- modifying the obj_int_extract tool to handle that
- building those objects with flags that remove that feature if there is one.

Also, since chromium is built with MSVC2013 (aiui), I'd check what they do (maybe they already fixed obj_int_extract.c).
(In reply to David Major [:dmajor] from comment #17)
Er, s/VS2012/VS2013/ -- aka MSVC12, I always get confused by those version numbers.

(In reply to Mike Hommey [:glandium] from comment #19)
Hmm, I wonder what their build's .obj file looks like. Must be only one (non-COMDAT) .rdata if that code works.
I wonder if we're just not giving too much CFLAGS to the command line to build those objects. I remember we've had problems with LTCG there (because we were giving the LTCG flags that made the objects contain AST instead of compiled code). That could very well be something similar.
In fact, the whole business around adding the offset sources to SOURCES is fishy. (and I should know, I reviewed that)
 5:15.91 s:/vs2013/objxfre/_virtualenv/Scripts/python.exe -m mozbuild.action.cl  cl -Fovp8_asm_enc_offsets.obj -c  -DHAVE_CONFIG_H='vpx_config.h' -DNO_NSPR_10_SUPPORT -Is:/vs2013/media/libvpx -I.  -I../../dist/include  -Is:/vs2013/objxfre/dist/include/nspr -Is:/vs2013/objxfre/dist/include/nss  -I/s/vs2013/objxfre/dist/include -I/s/vs2013/modules/zlib/src     -MD            -FI ../../dist/include/mozilla-config.h -DMOZILLA_CLIENT  -TC -nologo -D_HAS_EXCEPTIONS=0 -W3 -Gy -FS -Gw -wd4244 -wd4819 -we4553  -DNDEBUG -DTRIMMED -Zi -UDEBUG -DNDEBUG -O1 -Oi -Oy   -Fdgenerated.pdb   s:/vs2013/media/libvpx/vp8/encoder/vp8_asm_enc_offsets.c
-Gw is likely the flag that causes the behavior you describe in comment 17.
http://msdn.microsoft.com/en-us/library/dn305952.aspx

Try adding something like:
   vpx_scale_asm_offsets.$(OBJ_SUFFIX) vp8_asm_enc_offsets.$(OBJ_SUFFIX): COMPILE_CFLAGS=
in Makefile.in
If that fails because of includes, try COMPILE_CFLAGS=$(INCLUDES)
Bug 931687 added -Gw (and it's something new to 2013)
Yep, the -Gw flag was the cause!

I tried with COMPILE_CFLAGS=$(INCLUDES) and the extractor was able to understand the .obj file, but a few of the structure offsets were slightly off. I presume it's due to flag differences. Maybe we should just remove -Gw but otherwise keep the flags intact.
Attached patch Disable -Gw for the offset files (obsolete) — Splinter Review
Does this need to go under some kind of _MSC_VER >= 1800 condition? How would I express that in the makefile?
Attachment #8446373 - Flags: review?(mh+mozilla)
Comment on attachment 8446373 [details] [diff] [review]
Disable -Gw for the offset files

Review of attachment 8446373 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/libvpx/Makefile.in
@@ +110,5 @@
>  # It was generated solely so it could be parsed by obj_int_extract.
>  OBJS := $(filter-out vp8_asm_enc_offsets.$(OBJ_SUFFIX),$(OBJS))
>  
> +# obj_int_extract doesn't understand files produced with the -Gw optimization.
> +vpx_scale_asm_offsets.$(OBJ_SUFFIX) vp8_asm_enc_offsets.$(OBJ_SUFFIX): COMPILE_CFLAGS +=-Gw-

I guess this just works on MSVC because it ignores flags it doesn't know, but this is likely to screw up clang-cl. I'd rather go with something like COMPILE_CFLAGS := $(filter-out -Gw,$(COMPILE_CFLAGS))
Attachment #8446373 - Flags: review?(mh+mozilla) → review-
Sure, I guess that's easier than compiler checks.
Attachment #8446373 - Attachment is obsolete: true
Attachment #8446385 - Flags: review?(mh+mozilla)
Assignee: bechen → dmajor
Attachment #8446385 - Flags: review?(mh+mozilla) → review+
sorry had to back this out since this caused bustages on windows xp pgo builds like https://tbpl.mozilla.org/php/getParsedLog.php?id=42606579&tree=Mozilla-Inbound
>"Too many sections"

That would be bug #939551
Wow. Not sure how that could happen given that our builders are still on VS2010. I'll take a look Monday (NZ) if nobody else beats me to it.
(In reply to David Major [:dmajor] from comment #34)
> Wow. Not sure how that could happen given that our builders are still on
> VS2010. I'll take a look Monday (NZ) if nobody else beats me to it.

It's not impossible that the COMPILE_CFLAGS fiddling is doing something wrong with PGO.
c:/builds/moz2_slave/m-in-w32-pgo-00000000000000000/build/obj-firefox/_virtualenv/Scripts/python.exe -m mozbuild.action.cl  cl -Fovp8_asm_enc_offsets.obj -c -DHAVE_CONFIG_H='vpx_config.h' -DNO_NSPR_10_SUPPORT -Ic:/builds/moz2_slave/m-in-w32-pgo-00000000000000000/build/media/libvpx -I. -I../../dist/include -Ic:/builds/moz2_slave/m-in-w32-pgo-00000000000000000/build/obj-firefox/dist/include/nspr -Ic:/builds/moz2_slave/m-in-w32-pgo-00000000000000000/build/obj-firefox/dist/include/nss -I/c/builds/moz2_slave/m-in-w32-pgo-00000000000000000/build/obj-firefox/dist/include -I/c/builds/moz2_slave/m-in-w32-pgo-00000000000000000/build/modules/zlib/src -MD -FI ../../dist/include/mozilla-config.h -DMOZILLA_CLIENT -TC -nologo -D_HAS_EXCEPTIONS=0 -W3 -Gy -wd4244 -wd4819 -we4553 -DNDEBUG -DTRIMMED -Zi -UDEBUG -DNDEBUG -GL -O1 -Oi -Oy- -Fdgenerated.pdb   c:/builds/moz2_slave/m-in-w32-pgo-00000000000000000/build/media/libvpx/vp8/encoder/vp8_asm_enc_offsets.c

That's the command line that was used to build the obj. -GL is there while it's not supposed to...
-Gw is in OS_CFLAGS, try fiddling OS_CFLAGS instead of COMPILE_CFLAGS.

If you want to try PGO builds:
https://wiki.mozilla.org/ReleaseEngineering/TryChooser#What_if_I_want_PGO_for_my_build
(In reply to Mike Hommey [:glandium] from comment #37)
> -Gw is in OS_CFLAGS, try fiddling OS_CFLAGS instead of COMPILE_CFLAGS.

Err, this is not going to do any good, since OS_CFLAGS is also where the PGO flags end up.
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #11)
> https://hg.mozilla.org/mozilla-central/rev/7673b3a07499

I uplifted my patch to disable the test to Mozilla-Beta 31 as part of my work for fixing bug 1031542:
https://hg.mozilla.org/releases/mozilla-beta/rev/019bb4a42fb7
This also fails: COMPILE_CFLAGS := $(COMPILE_CFLAGS)

I guess it's because COMPILE_CFLAGS is now a simply-expanded variable?
(In reply to Brian Smith (:briansmith, was :bsmith; NEEDINFO? for response) from comment #39)
> I uplifted my patch to disable the test to Mozilla-Beta 31 as part of my
> work for fixing bug 1031542:
> https://hg.mozilla.org/releases/mozilla-beta/rev/019bb4a42fb7

What's the plan for Aurora32?
Would this be too much of a hack? $(filter-out -Gw $(PROFILE_GEN_CFLAGS),$(COMPILE_CFLAGS))

Otherwise I'm out of ideas, any better suggestions?
Flags: needinfo?(mh+mozilla)
Flags: needinfo?(benjamin)
This also breaks Mochi-1 at content/media/test/test_mediarecorder_record_gum_video_timeslice.html
...and M3 at dom/media/tests/identity/test_peerConnection_peerIdentity.html
For the time being we should just remove -Gw to get this going on nightly builds. We can re-add it later after we figure out the build magic.
Flags: needinfo?(benjamin)
:m_kato, do you want to take a shot at fixing this? Otherwise we'll need to back out bug 931687 until we have a solution here.
Blocks: 931687
Flags: needinfo?(m_kato)
Whiteboard: [Start at comment 17]
(In reply to David Major [:dmajor] from comment #46)
> :m_kato, do you want to take a shot at fixing this? Otherwise we'll need to
> back out bug 931687 until we have a solution here.

I think that we should backout -Gw option from configure.in.
Flags: needinfo?(m_kato)
Ok, I backed out the -Gw change until we can figure this out: https://hg.mozilla.org/integration/mozilla-inbound/rev/382fcf193cba
Blocks: 1126662
Duplicate of this bug: 983890
Component: Audio/Video → Audio/Video: Recording
Rank: 45
Priority: -- → P4
Depends on: 1289989
I can't find the obj_int_extract code in our tree anymore. I am going to guess that we took a libvpx update and the current version no longer takes this approach of parsing struct offsets out the .obj files? Ralph, can you confirm?

If this is true then we should be able to re-land the -Gw flag changes.
Flags: needinfo?(giles)
That's correct. Also, I thought we didn't support VS2013 anymore?
Flags: needinfo?(giles)
Bug 931687 has re-landed the -Gw flag, bug 1289989 has re-enabled the failing test, and the world didn't explode. Let's call this fixed!

> Also, I thought we didn't support VS2013 anymore?
At the time, VS2013 was the latest compiler. The underlying problem would have been present in VS2015 as well, if we hadn't taken that vpx update.
Status: NEW → RESOLVED
Closed: 3 years ago
Flags: needinfo?(mh+mozilla)
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.