Open
Bug 952048
Opened 11 years ago
Updated 3 years ago
SIGBUS on OpenBSD since libvpx 1.3.0 update
Categories
(Core :: WebRTC: Audio/Video, defect)
Tracking
NEW
| | Tracking | Status |
|---|---|---|
| firefox27 | --- | unaffected |
| firefox28 | --- | affected |
| firefox29 | --- | affected |
| backlog | parking-lot | |
People
(Reporter: gaston, Unassigned)
Probably fallout from bug 918550: right now nightly & aurora SIGBUS at startup after showing the main window on OpenBSD/amd64, with the following backtrace:
#0 0x000017c04503e1b6 in vp9_half_horiz_variance8x_h_sse2 () from /home/landry/firefox/libxul.so.1.0
(gdb) bt
#0 0x000017c04503e1b6 in vp9_half_horiz_variance8x_h_sse2 () from /home/landry/firefox/libxul.so.1.0
#1 0x000017c0450aae7f in vp9_half_horiz_variance8x_h_sse2 () from /home/landry/firefox/libxul.so.1.0
#2 0x000017c045003a4d in vp9_half_horiz_variance8x_h_sse2 () from /home/landry/firefox/libxul.so.1.0
#3 0x000017c044ffcfec in vp9_half_horiz_variance8x_h_sse2 () from /home/landry/firefox/libxul.so.1.0
#4 0x000017c04501f84d in vp9_half_horiz_variance8x_h_sse2 () from /home/landry/firefox/libxul.so.1.0
#5 0x000017c044ff82b1 in vp9_half_horiz_variance8x_h_sse2 () from /home/landry/firefox/libxul.so.1.0
#6 0x000017c04502e0af in vp9_half_horiz_variance8x_h_sse2 () from /home/landry/firefox/libxul.so.1.0
#7 0x000017c0450b6d59 in std::vector<short, std::allocator<short> >::_M_insert_aux () from /home/landry/firefox/libxul.so.1.0
#8 0x000017c043d28876 in imgLoader::SupportImageWithMimeType () from /home/landry/firefox/libxul.so.1.0
#9 0x000017c044d7617c in XRE_StartupTimelineRecord () from /home/landry/firefox/libxul.so.1.0
| Reporter | ||
Comment 1•11 years ago
Jan, this is still broken and is preventing work on trunk for me - any idea ?
Looking for that identifier in mxr, i only see a declaration, no definition...
http://mxr.mozilla.org/mozilla-central/ident?i=vp9_half_horiz_variance8x_h_sse2
Flags: needinfo?(j)
Comment 2•11 years ago
(In reply to Landry Breuil (:gaston) from comment #1)
> Looking for that identifier in mxr, i only see a declaration, no
> definition...
Try http://mxr.mozilla.org/mozilla-central/source/media/libvpx/vp9/encoder/x86/vp9_variance_impl_sse2.asm#631
Comment 3•11 years ago
I believe the definition is https://mxr.mozilla.org/mozilla-central/source/media/libvpx/vp9/encoder/x86/vp9_variance_impl_sse2.asm#631. SIGBUS is weird. Can that be an illegal instruction, or is it only a memory access error? Alignment problem? Does your CPU support sse2? Can you verify your toolchain is assembling this file correctly?
As a workaround, try adding this to your mozconfig:
ac_add_options --disable-webm --disable-webrtc
or maybe:
#define HAVE_SSE2 0
in media/libvpx/config_x86-linux-gcc.* etc.
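One of the hypotheses above is an alignment problem. SSE2 instructions with aligned-operand forms, such as movdqa, fault when their memory operand is not on a 16-byte boundary. Whether OpenBSD reports that fault as SIGBUS is an assumption here, not something confirmed in this bug. A minimal Python sketch of the check one could apply to a faulting operand address read out of gdb; the helper name is_aligned and the sample addresses are hypothetical:

```python
def is_aligned(address: int, boundary: int = 16) -> bool:
    """Return True if address is a multiple of boundary (a power of two)."""
    if boundary <= 0 or boundary & (boundary - 1):
        raise ValueError("boundary must be a positive power of two")
    return address & (boundary - 1) == 0


# Hypothetical operand addresses, not taken from the crash above.
for addr in (0x7F0000001000, 0x7F0000001008):
    state = "16-byte aligned" if is_aligned(addr) else "misaligned for movdqa"
    print(hex(addr), state)
```

If the operand address at the faulting instruction turned out to be misaligned, that would point at the caller's buffer rather than at the assembler output.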
Flags: needinfo?(j)
| Reporter | ||
Comment 4•11 years ago
The cpu supports SSE2, the arch is amd64, and my builds are done with ac_add_options --enable-gstreamer --disable-webrtc - and i'm using yasm 1.2/clang 3.3 on OpenBSD.
A full build log is at http://buildbot.rhaalovely.net/builders/mozilla-central-amd64/builds/995/steps/build/logs/stdio
That file is built with:
vp9_variance_impl_sse2.o
yasm -o vp9_variance_impl_sse2.o -f elf64 -rnasm -pnasm -DPIC -I. -I/var/buildslave-mozilla/mozilla-central-amd64/build/media/libvpx/ -I/var/buildslave-mozilla/mozilla-central-amd64/build/media/libvpx/vpx_ports/ -g dwarf2 /var/buildslave-mozilla/mozilla-central-amd64/build/media/libvpx/vp9/encoder/x86/vp9_variance_impl_sse2.asm
I, of course, would rather see that fixed instead of having to disable webm or sse2...
| Reporter | ||
Comment 5•11 years ago
Tried aurora, broken too the same way (not surprising since libvpx 1.3 is there too now)
status-firefox27: --- → unaffected
status-firefox28: --- → affected
status-firefox29: --- → affected
| Reporter | ||
Comment 6•11 years ago
Note that the crash only happens when loading web content (for example about:support). Interestingly, i can load http://mozilla.github.io/webrtc-landing/gum_test.html without a crash, as well as about: and about:about. google.fr segfaults.
Looking at the thread list in gdb when it crashes on about:support, two of them are in libvpx:
Thread 2 (process 31150):
#0 0x0000077a41f35d56 in vp9_half_horiz_variance8x_h_sse2 () from /home/landry/firefox/libxul.so.1.0
#1 0x0000077a41fa2a1f in vp9_half_horiz_variance8x_h_sse2 () from /home/landry/firefox/libxul.so.1.0
#2 0x0000077a41efc51d in vp9_half_horiz_variance8x_h_sse2 () from /home/landry/firefox/libxul.so.1.0
#3 0x0000077a41ef8d9c in vp9_half_horiz_variance8x_h_sse2 () from /home/landry/firefox/libxul.so.1.0
#4 0x0000077a41f1b6ed in vp9_half_horiz_variance8x_h_sse2 () from /home/landry/firefox/libxul.so.1.0
#5 0x0000077a41ef4611 in vp9_half_horiz_variance8x_h_sse2 () from /home/landry/firefox/libxul.so.1.0
#6 0x0000077a41f25c4f in vp9_half_horiz_variance8x_h_sse2 () from /home/landry/firefox/libxul.so.1.0
#7 0x0000077a41faeb79 in std::_Rb_tree<void const*, void const*, std::_Identity<void const*>, std::less<void const*>, std::allocator<void const*> >::_M_erase () from /home/landry/firefox/libxul.so.1.0
#8 0x0000077a40ce2e31 in imgLoader::SupportImageWithMimeType () from /home/landry/firefox/libxul.so.1.0
#9 0x0000077a41c813bc in XRE_StartupTimelineRecord () from /home/landry/firefox/libxul.so.1.0
#10 0x0000077a41c6426e in XRE_StartupTimelineRecord () from /home/landry/firefox/libxul.so.1.0
#11 0x0000077a40b6047d in std::vector<std::string, std::allocator<std::string> >::vector () from /home/landry/firefox/libxul.so.1.0
#12 0x0000077a40b6050a in std::vector<std::string, std::allocator<std::string> >::vector () from /home/landry/firefox/libxul.so.1.0
#13 0x0000077a4083d065 in NS_InvokeByIndex () from /home/landry/firefox/libxul.so.1.0
#14 0x0000077a4083ccdc in NS_InvokeByIndex () from /home/landry/firefox/libxul.so.1.0
#15 0x0000077a407c7210 in NS_NewLocalFile () from /home/landry/firefox/libxul.so.1.0
#16 0x0000077a407d8f86 in XRE_AddJarManifestLocation () from /home/landry/firefox/libxul.so.1.0
#17 0x0000077a40785115 in ?? () from /home/landry/firefox/libxul.so.1.0
#18 0x0000077a4098364f in std::vector<std::string, std::allocator<std::string> >::vector () from /home/landry/firefox/libxul.so.1.0
#19 0x0000077a4095bd4d in std::_Rb_tree<int, std::pair<int const, std::string>, std::_Select1st<std::pair<int const, std::string> >, std::less<int>, std::allocator<std::pair<int const, std::string> > >::_M_erase () from /home/landry/firefox/libxul.so.1.0
#20 0x0000077a411fc6fb in std::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::~basic_stringbuf () from /home/landry/firefox/libxul.so.1.0
#21 0x0000077a41c4262e in XRE_StartupTimelineRecord () from /home/landry/firefox/libxul.so.1.0
#22 0x0000077a41c099f7 in XRE_InitCommandLine () from /home/landry/firefox/libxul.so.1.0
#23 0x0000077a41c09bc2 in XRE_InitCommandLine () from /home/landry/firefox/libxul.so.1.0
#24 0x0000077a41c0a04e in XRE_main () from /home/landry/firefox/libxul.so.1.0
#25 0x0000077830003eb7 in __register_frame_info () from /home/landry/firefox/firefox
#26 0x0000077830003821 in _start () from /home/landry/firefox/firefox
#27 0x0000000000000000 in ?? ()
Thread 1 (thread 1031150):
#0 0x0000077a41f35d56 in vp9_half_horiz_variance8x_h_sse2 () from /home/landry/firefox/libxul.so.1.0
#1 0x0000077a41fa2a1f in vp9_half_horiz_variance8x_h_sse2 () from /home/landry/firefox/libxul.so.1.0
#2 0x0000077a41efc51d in vp9_half_horiz_variance8x_h_sse2 () from /home/landry/firefox/libxul.so.1.0
#3 0x0000077a41ef8d9c in vp9_half_horiz_variance8x_h_sse2 () from /home/landry/firefox/libxul.so.1.0
#4 0x0000077a41f1b6ed in vp9_half_horiz_variance8x_h_sse2 () from /home/landry/firefox/libxul.so.1.0
#5 0x0000077a41ef4611 in vp9_half_horiz_variance8x_h_sse2 () from /home/landry/firefox/libxul.so.1.0
#6 0x0000077a41f25c4f in vp9_half_horiz_variance8x_h_sse2 () from /home/landry/firefox/libxul.so.1.0
#7 0x0000077a41faeb79 in std::_Rb_tree<void const*, void const*, std::_Identity<void const*>, std::less<void const*>, std::allocator<void const*> >::_M_erase () from /home/landry/firefox/libxul.so.1.0
#8 0x0000077a40ce2e31 in imgLoader::SupportImageWithMimeType () from /home/landry/firefox/libxul.so.1.0
#9 0x0000077a41c813bc in XRE_StartupTimelineRecord () from /home/landry/firefox/libxul.so.1.0
#10 0x0000077a41c6426e in XRE_StartupTimelineRecord () from /home/landry/firefox/libxul.so.1.0
#11 0x0000077a40b6047d in std::vector<std::string, std::allocator<std::string> >::vector () from /home/landry/firefox/libxul.so.1.0
#12 0x0000077a40b6050a in std::vector<std::string, std::allocator<std::string> >::vector () from /home/landry/firefox/libxul.so.1.0
#13 0x0000077a4083d065 in NS_InvokeByIndex () from /home/landry/firefox/libxul.so.1.0
#14 0x0000077a4083ccdc in NS_InvokeByIndex () from /home/landry/firefox/libxul.so.1.0
#15 0x0000077a407c7210 in NS_NewLocalFile () from /home/landry/firefox/libxul.so.1.0
#16 0x0000077a407d8f86 in XRE_AddJarManifestLocation () from /home/landry/firefox/libxul.so.1.0
#17 0x0000077a40785115 in ?? () from /home/landry/firefox/libxul.so.1.0
#18 0x0000077a4098364f in std::vector<std::string, std::allocator<std::string> >::vector () from /home/landry/firefox/libxul.so.1.0
#19 0x0000077a4095bd4d in std::_Rb_tree<int, std::pair<int const, std::string>, std::_Select1st<std::pair<int const, std::string> >, std::less<int>, std::allocator<std::pair<int const, std::string> > >::_M_erase () from /home/landry/firefox/libxul.so.1.0
#20 0x0000077a411fc6fb in std::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::~basic_stringbuf () from /home/landry/firefox/libxul.so.1.0
#21 0x0000077a41c4262e in XRE_StartupTimelineRecord () from /home/landry/firefox/libxul.so.1.0
| Reporter | ||
Comment 7•11 years ago
maps.google.fr and www.youtube.com load fine, so something is fishy - maybe the crash is only triggered when some specific mimetype is accessed, given that SupportImageWithMimeType is in the trace ?
| Reporter | ||
Comment 8•11 years ago
Hmmm, now thinking about what was committed to libvpx in the past, maybe 785638 & 774598 need revisiting for vp9 ?
Jan, does trunk run fine for you on freebsd without crashes ?
| Reporter | ||
Comment 9•11 years ago
Looking more closely at https://hg.mozilla.org/mozilla-central/rev/f4f8faa3771c#l358.24 - this might be AVX support - is that a cpu flag ? According to http://en.wikipedia.org/wiki/Advanced_Vector_Extensions#Operating_system_support we (openbsd) don't have support for AVX.
But still, that doesn't look used anywhere in libvpx's code.... so i might be on the wrong track.
Comment 10•11 years ago
FWIW, those stacks are almost certainly wrong. In fact, AFAICT, nothing ever calls vp9_half_horiz_variance8x_h_sse2(). Thanks to the RTCD macro magic, it's hard to be certain... but I bet you could delete the code entirely with no ill effects. It was probably simply copied over from the corresponding VP8 code (vp8_half_horiz_variance8x_h_sse2(), which _is_ called from media/libvpx/vp8/common/x86/variance_sse2.c).
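A quick sanity check supports the suspicion that the symbol names are wrong: the seven frames attributed to vp9_half_horiz_variance8x_h_sse2 in the first backtrace would all have to lie inside one function. A small Python sketch using only the addresses from that backtrace (the names FRAMES and address_span are illustrative):

```python
# Frame addresses #0-#6 from the first backtrace, all labelled
# vp9_half_horiz_variance8x_h_sse2 by gdb.
FRAMES = [
    0x000017C04503E1B6,
    0x000017C0450AAE7F,
    0x000017C045003A4D,
    0x000017C044FFCFEC,
    0x000017C04501F84D,
    0x000017C044FF82B1,
    0x000017C04502E0AF,
]


def address_span(addresses):
    """Distance in bytes between the lowest and highest address."""
    return max(addresses) - min(addresses)


span = address_span(FRAMES)
print(f"span: {span} bytes ({span / 1024:.0f} KiB)")
```

The span comes out at roughly 715 KiB, far larger than a single hand-written assembly routine. That is consistent with gdb labelling unrelated code with the nearest preceding symbol it knows about, rather than showing the real call chain.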
Comment 11•11 years ago
libvpx has a standalone decoder. Why not try to crash there based on gdb output and build config?
(In reply to Landry Breuil (:gaston) from comment #8)
> Hmmm, now thinking about what was committed to libvpx in the past, maybe
> 785638 & 774598 need revisiting for vp9 ?
That shouldn't matter unless you disable *.asm code in the port or forget to apply port-specific fixes.
(In reply to Landry Breuil (:gaston) from comment #8)
> Jan, does trunk run fine for you on freebsd without crashes ?
It does, no issues viewing VP9 samples on my amd64 box or within 32bit jail. As both PkgSrc and FreeBSD ports now have 1.3.0 using --with-system-libvpx works, too.
| Reporter | ||
Comment 12•11 years ago
(In reply to Jan Beich from comment #11)
> It does, no issues viewing VP9 samples on my amd64 box or within 32bit jail.
> As both PkgSrc and FreeBSD ports now have 1.3.0 using --with-system-libvpx
> works, too.
Do you mean it works for you, both with system libvpx and bundled one ?
| Reporter | ||
Comment 13•11 years ago
Fwiw, a build of trunk on powerpc runs fine - of course, since it doesn't have all this asm goo.
Comment 14•11 years ago
Jan, are you interested in trying to fix this? It would be nice to get it resolved.
Flags: needinfo?(jbeich)
| Reporter | ||
Comment 15•11 years ago
_I_ am interested in fixing this before 28 hits beta, but i have no idea what could be the root cause of the SIGBUS, besides the whole libvpx update... i can only test diffs, or provide logs, but gdb is unusable for me.
Comment 16•11 years ago
(In reply to Ralph Giles (:rillian) from comment #14)
> Jan, are you interested in trying to fix this? It would be nice to get it
> resolved.
No, I don't use OpenBSD to try hunting for clues like:
- testing with different toolchains (recent gcc/binutils, clang -no-integrated-as) and on i386 with sse2
- crafting a VP9 sample or emulating mozilla cflags/environment to try crashing vpxdec and/or ffmpeg
- gdb backtrace (with locals) for default -O0 -g non-debug build
- bisecting upstream libvpx commit history, --with-system-libvpx may be faster
Flags: needinfo?(jbeich)
| Reporter | ||
Comment 17•11 years ago
technically, libvpx 1.3.0 hasn't even been released, only tagged in hg - so i'll have to wrap up my own system libvpx..
| Reporter | ||
Comment 18•11 years ago
I'm trying to disable sse2/sse3/ssse3/sse4.1/avx in libvpx by setting the various *SSE* values to zero in media/libvpx/vpx_config_x86_64-linux-gcc.{h,asm}, but the corresponding asm/c files still seem to be built - is this the correct way to disable those optimisations ?
Comment 19•11 years ago
(In reply to Landry Breuil (:gaston) from comment #17)
> technically, libvpx 1.3.0 hasn't even been released, only tagged in hg
http://webm.googlecode.com/files/libvpx-v1.3.0.tar.bz2
They finally posted a tarball based on the 1.3.0 tag a couple of days ago.
| Reporter | ||
Comment 20•11 years ago
So i tried building with HAVE_SSE2/HAVE_SSE3/HAVE_SSE4_1/HAVE_SSSE3/HAVE_AVX set to 0, but libxul linking fails :
: In function `vp8_loop_filter_row_normal':
/home/landry/src/m-c/media/libvpx/vp8/common/loopfilter.c:229: undefined reference to `vp8_loop_filter_mbv_sse2'
Grr.
| Reporter | ||
Comment 21•11 years ago
Interestingly, an unpatched build from last night's tip with --enable-pulseaudio --enable-gstreamer (and thus --enable-webrtc implied) on amd64 doesn't seem to segfault like it used to, and displays about:support fine (and gmaps, and gmail and google's homepage...) My previous segfaulting builds were with --disable-pulseaudio --disable-webrtc.
A build of aurora with --enable-gstreamer --disable-webrtc (and --disable-pulseaudio implied) works on some pages, then SIGBUSes on about:support, but the trace doesn't show libvpx.
(gdb) bt
#0 0x00000bbc75363a56 in std::vector<void*, std::allocator<void*> >::_M_fill_insert () from /home/landry/firefox/libxul.so.1.0
#1 0x0000000000000000 in ?? ()
So i'm wondering if all this could be linked to webrtc being enabled or not - and has simply shown up more since the libvpx update ?
Comment 22•11 years ago
(In reply to Landry Breuil (:gaston) from comment #20)
> So i tried building with HAVE_SSE2/HAVE_SSE3/HAVE_SSE4_1/HAVE_SSSE3/HAVE_AVX
> set to 0, but libxul linking fails :
libvpx supports doing so at runtime. Try VPX_SIMD_CAPS_MASK=0xfb to disable only SSE2 or VPX_SIMD_CAPS=0 for everything.
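The value 0xfb follows from how the mask is used: it is ANDed with the detected capability bits, so clearing one bit hides one instruction set. A minimal Python sketch, assuming the HAS_* flag values from libvpx's vpx_ports/x86.h (MMX 0x01, SSE 0x02, SSE2 0x04, SSE3 0x08, SSSE3 0x10, SSE4_1 0x20, AVX 0x40, AVX2 0x80); the helper caps_mask_without is hypothetical and not part of libvpx:

```python
# Assumed capability bits, mirroring the HAS_* defines in vpx_ports/x86.h.
HAS = {
    "MMX": 0x01,
    "SSE": 0x02,
    "SSE2": 0x04,
    "SSE3": 0x08,
    "SSSE3": 0x10,
    "SSE4_1": 0x20,
    "AVX": 0x40,
    "AVX2": 0x80,
}
ALL_CAPS = 0xFF


def caps_mask_without(*names):
    """Return a mask with the bits for the named instruction sets cleared."""
    mask = ALL_CAPS
    for name in names:
        mask &= ~HAS[name]
    return mask & ALL_CAPS


print(hex(caps_mask_without("SSE2")))
```

Under those assumptions, clearing only the SSE2 bit yields 0xfb, and clearing every bit yields 0, which hides all SIMD code paths.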
(In reply to Landry Breuil (:gaston) from comment #21)
> but the trace doesnt show libvpx.
Couldn't -fomit-frame-pointer corrupt the stack ? It's added by default for non-debug builds.
| Reporter | ||
Comment 23•11 years ago
(In reply to Jan Beich from comment #22)
> (In reply to Landry Breuil (:gaston) from comment #20)
> > So i tried building with HAVE_SSE2/HAVE_SSE3/HAVE_SSE4_1/HAVE_SSSE3/HAVE_AVX
> > set to 0, but libxul linking fails :
>
> libvpx supports doing so at runtime. Try VPX_SIMD_CAPS_MASK=0xfb to disable
> only SSE2 or VPX_SIMD_CAPS=0 for everything.
Thanks for the tip - i think this rules out optimizations in libvpx, since
$ ~/firefox-aurora/firefox -no-remote -P Aurora
Bus error (core dumped)
$ VPX_SIMD_CAPS_MASK=0 ~/firefox-aurora/firefox -no-remote -P Aurora
Bus error (core dumped)
| Reporter | ||
Comment 24•11 years ago
Fwiw, i've done some testing with fx 28.0b3 built within our ports infrastructure, and i'm writing this comment from it - browsed a bit, saw no crash. That build still has --disable-webrtc (actually: --disable-webrtc --enable-gstreamer --with-system-zlib=/usr --with-system-libevent=/usr/ --with-system-bz2=/usr/local --with-system-nspr --with-system-nss --enable-official-branding --enable-gio --disable-gconf --disable-necko-wifi --disable-optimize --disable-tests --disable-updater --disable-dbus --enable-application=browser --prefix=/usr/local --sysconfdir=/etc --mandir=/usr/local/man --infodir=/usr/local/info --localstatedir=/var --disable-silent-rules) so i don't get what was wrong at the time, and why it's not breaking the same way. I'll retest aurora and central (the latter, once bug 973310 is fixed)
| Reporter | ||
Comment 25•11 years ago
Interestingly, trunk still crashes after browsing some patches, and this time i've seen it crash with webrtc enabled (that is --enable-gstreamer --enable-pulseaudio --cache-file=/dev/null were the only configure args)
Updated•10 years ago
Component: Audio/Video → WebRTC: Audio/Video
Comment 26•10 years ago
Landry - is this still a problem? Libvpx has been updated since the last report. Thanks!
backlog: --- → parking-lot
Flags: needinfo?(landry)
| Reporter | ||
Comment 27•10 years ago
I still have some local patches working around sse build config issues (ie https://bugzilla.mozilla.org/show_bug.cgi?id=1122745) and i haven't been able to get back to this. I also need to figure out if the issue was only with bundled libvpx and not present with systemwide libvpx (1.4.0 on OpenBSD nowadays)
Flags: needinfo?(landry)
Updated•3 years ago
Severity: normal → S3