Closed Bug 963460 Opened 11 years ago Closed 9 years ago

Firefox crashes with invalid pointer during shutdown when run under xvfb [@ vp9_half_horiz_variance8x_h_sse2]

Categories

(Core :: Audio/Video: Playback, defect)

28 Branch
x86_64
Linux
defect
Not set
critical

Tracking

()

RESOLVED FIXED
Tracking Status
firefox26 --- wontfix
firefox27 - wontfix
firefox28 --- affected
firefox29 --- affected

People

(Reporter: whimboo, Assigned: derf)

References

Details

(Keywords: crash, sec-other)

Crash Data

Attachments

(2 files)

Name=Firefox
Version=29.0a1
BuildID=20140117053553
SourceRepository=http://hg.mozilla.org/mozilla-central
SourceStamp=6e102b9c89b9

Today I tried to run a debug Firefox build with Mozmill under xvfb, and during a restart of Firefox it crashes every time:

*** Error in `/mozilla/code/mozmill/firefox/firefox': free(): invalid pointer: 0x00000000033c8110 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x80996)[0x7f57e94eb996]
/mozilla/code/mozmill/firefox/libxul.so(+0x203a419)[0x7f57e4f6a419]
/mozilla/code/mozmill/firefox/libxul.so(+0x2039b09)[0x7f57e4f69b09]
/mozilla/code/mozmill/firefox/libxul.so(+0x204ecf9)[0x7f57e4f7ecf9]
/mozilla/code/mozmill/firefox/libxul.so(+0x2039ca6)[0x7f57e4f69ca6]
/usr/lib/x86_64-linux-gnu/libX11.so.6(XCloseDisplay+0xa2)[0x7f57dfc714f2]
/usr/lib/x86_64-linux-gnu/libgdk-x11-2.0.so.0(+0x4cbbe)[0x7f57e0b5ebbe]
/usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0(g_object_unref+0x1aa)[0x7f57e238db4a]
/mozilla/code/mozmill/firefox/libxul.so(+0x1e1c0f6)[0x7f57e4d4c0f6]
/mozilla/code/mozmill/firefox/libxul.so(+0x1e24a06)[0x7f57e4d54a06]
/mozilla/code/mozmill/firefox/libxul.so(XRE_main+0xce)[0x7f57e4d54b5d]
/mozilla/code/mozmill/firefox/firefox[0x403ba1]
/mozilla/code/mozmill/firefox/firefox[0x403cc7]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f57e948cde5]
/mozilla/code/mozmill/firefox/firefox[0x4031a9]

The stack as retrieved via gdb is:

#0 0x00007f57e952c85d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007f57e952c701 in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:137
#2 0x00007f57e4d55f83 in ?? () from /mozilla/code/mozmill/firefox/libxul.so
#3 0x00007f57e4d60640 in ?? () from /mozilla/code/mozmill/firefox/libxul.so
#4 <signal handler called>
#5 0x00007f57e94a1f77 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#6 0x00007f57e94a55e8 in __GI_abort () at abort.c:90
#7 0x00007f57e94df4fb in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7f57e95f3240 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:199
#8 0x00007f57e94eb996 in malloc_printerr (ptr=0x33c8110, str=0x7f57e95ef205 "free(): invalid pointer", action=3) at malloc.c:4923
#9 _int_free (av=<optimized out>, p=0x33c8100, have_lock=0) at malloc.c:3779
#10 0x00007f57e4f6a419 in vp9_half_horiz_variance8x_h_sse2 () from /mozilla/code/mozmill/firefox/libxul.so
#11 0x00007f57e4f69b09 in vp9_half_horiz_variance8x_h_sse2 () from /mozilla/code/mozmill/firefox/libxul.so
#12 0x00007f57e4f7ecf9 in vp9_half_horiz_variance8x_h_sse2 () from /mozilla/code/mozmill/firefox/libxul.so
#13 0x00007f57e4f69ca6 in vp9_half_horiz_variance8x_h_sse2 () from /mozilla/code/mozmill/firefox/libxul.so
#14 0x00007f57dfc714f2 in XCloseDisplay () from /usr/lib/x86_64-linux-gnu/libX11.so.6
#15 0x00007f57e0b5ebbe in ?? () from /usr/lib/x86_64-linux-gnu/libgdk-x11-2.0.so.0
#16 0x00007f57e238db4a in g_object_unref () from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#17 0x00007f57e4d4c0f6 in ?? () from /mozilla/code/mozmill/firefox/libxul.so
#18 0x00007f57e4d54a06 in ?? () from /mozilla/code/mozmill/firefox/libxul.so
#19 0x00007f57e4d54b5d in XRE_main () from /mozilla/code/mozmill/firefox/libxul.so
#20 0x0000000000403ba1 in _start ()

So it looks like this comes from vp9_half_horiz_variance8x_h_sse2().
I quickly checked and this goes back at least to the latest tinderbox build on mozilla-release:

Version=26.0.1
BuildID=20140122193406
SourceRepository=http://hg.mozilla.org/releases/mozilla-release
SourceStamp=f17cd4055b55

#8 0x00007f7b8fad6996 in malloc_printerr (ptr=0x3bdaca0, str=0x7f7b8fbde328 "double free or corruption (!prev)", action=3) at malloc.c:4923

Not sure if this is a critical security bug, but for safety closing down as such for now.
Group: core-security
Crash Signature: [@ vp9_half_horiz_variance8x_h_sse2]
See bug 952048 comment 10. In particular, see the analysis that this function is never actually called anywhere, and thus that this stack is either a) bogus, or b) the result of memory corruption elsewhere. As a follow up to that comment, I actually prepared a local patch which removed the function entirely. Everything still built, verifying that it is never called.
Does it mean we need a fix for bug 952048 first before we can continue on that one?
I don't know. I don't really understand what's going on here. The stacks are different above the leaf, but that may not mean much.
This patch should remove that function. I'm very curious what happens if you try to reproduce with this applied.
Flags: needinfo?(hskupin)
Timothy, can you please trigger a try build for it? I would need that to be able to test the patch. I can't build myself at the moment. Thanks.
Flags: needinfo?(hskupin)
Pushed. You can watch <https://tbpl.mozilla.org/?tree=Try&rev=88395514c4ca> to see when the builds complete.
Looks like you were right with your assumption. The crash is still around and even shows the same stack as before:

#9 _int_free (av=<optimized out>, p=0x4b487b0, have_lock=0) at malloc.c:3779
#10 0x00007f4e0b7201c9 in vp9_half_horiz_variance8x_h_sse2 () from /mozilla/code/mozmill/firefox/libxul.so
#11 0x00007f4e0b71f8b9 in vp9_half_horiz_variance8x_h_sse2 () from /mozilla/code/mozmill/firefox/libxul.so

Here are the steps if you want to reproduce:

1. Clone https://github.com/mozilla/mozmill
2. Create a virtualenv
3. Run ./setup_development.sh
4. Run mozmill -b %path_to_firefox% -m mutt/mutt/tests/js/manifest.ini

It shouldn't take that long and you will see the crash.
Is this an exploitable crash? What security rating would you suggest?
Waiting for a security rating here before making a tracking call.
(In reply to Henrik Skupin (:whimboo) from comment #8)
> Looks like you were right with your assumption. The crash is still around
> and even shows the same stack as before:
>
> #9 _int_free (av=<optimized out>, p=0x4b487b0, have_lock=0) at malloc.c:3779
> #10 0x00007f4e0b7201c9 in vp9_half_horiz_variance8x_h_sse2 () from
> /mozilla/code/mozmill/firefox/libxul.so

I don't understand. If you are testing with my patch applied, that function does not exist in the build. Where is the debug symbol for it coming from?
I have tested it again and it comes from the libxul.so:

#0 0x00007f2a1129e85d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1 0x00007f2a1129e701 in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:137
#2 0x00007f2a0cad1e4b in ?? () from /mozilla/code/mozmill/firefox/libxul.so
#3 0x00007f2a0cadc504 in ?? () from /mozilla/code/mozmill/firefox/libxul.so
#4 <signal handler called>
#5 0x00007f2a11213f77 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#6 0x00007f2a112175e8 in __GI_abort () at abort.c:90
#7 0x00007f2a112514fb in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x7f2a11365240 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/unix/sysv/linux/libc_fatal.c:199
#8 0x00007f2a1125d996 in malloc_printerr (ptr=0x45b03e0, str=0x7f2a11365328 "double free or corruption (!prev)", action=3) at malloc.c:4923
#9 _int_free (av=<optimized out>, p=0x45b03d0, have_lock=0) at malloc.c:3779
#10 0x00007f2a0cce61c9 in vp9_half_horiz_variance8x_h_sse2 () from /mozilla/code/mozmill/firefox/libxul.so
#11 0x00007f2a0cce58b9 in vp9_half_horiz_variance8x_h_sse2 () from /mozilla/code/mozmill/firefox/libxul.so
#12 0x00007f2a0ccfaaa9 in vp9_half_horiz_variance8x_h_sse2 () from /mozilla/code/mozmill/firefox/libxul.so
#13 0x00007f2a0cce5a56 in vp9_half_horiz_variance8x_h_sse2 () from /mozilla/code/mozmill/firefox/libxul.so
#14 0x00007f2a079e94f2 in XCloseDisplay () from /usr/lib/x86_64-linux-gnu/libX11.so.6
#15 0x00007f2a088d6bbe in ?? () from /usr/lib/x86_64-linux-gnu/libgdk-x11-2.0.so.0
#16 0x00007f2a0a105b4a in g_object_unref () from /usr/lib/x86_64-linux-gnu/libgobject-2.0.so.0
#17 0x00007f2a0cac7fbe in ?? () from /mozilla/code/mozmill/firefox/libxul.so
#18 0x00007f2a0cad090a in ?? () from /mozilla/code/mozmill/firefox/libxul.so
#19 0x00007f2a0cad0a25 in XRE_main () from /mozilla/code/mozmill/firefox/libxul.so
#20 0x0000000000403ba1 in _start ()
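One way to spot a backtrace like this as suspect is to count how often a single symbol is attributed to consecutive frames at unrelated addresses. The following is only a rough sketch: the regular expression and the helper names (parse_frames, repeated_symbols) are illustrative assumptions, not part of gdb or any Mozilla tooling, and the sample text is an excerpt of the frames quoted in this bug.

```python
import re
from collections import Counter

# Matches gdb 'bt' lines such as:
#   "#10 0x00007f2a0cce61c9 in some_function () from /path/lib.so"
#   "#9 _int_free (av=<optimized out>, ...) at malloc.c:3779"
# Lines like "#4 <signal handler called>" do not match and are skipped.
FRAME_RE = re.compile(r"^#(\d+)\s+(?:(0x[0-9a-f]+) in )?(\S+) \(")


def parse_frames(bt_text):
    """Return a list of (frame_number, address_or_None, symbol_name)."""
    frames = []
    for line in bt_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            number, addr, name = match.groups()
            frames.append((int(number), int(addr, 16) if addr else None, name))
    return frames


def repeated_symbols(frames):
    """Return {symbol: count} for named symbols that appear in more than one frame."""
    counts = Counter(name for _, _, name in frames if name != "??")
    return {name: n for name, n in counts.items() if n > 1}


# Excerpt of the backtrace quoted in this bug.
SAMPLE_BT = """\
#9 _int_free (av=<optimized out>, p=0x45b03d0, have_lock=0) at malloc.c:3779
#10 0x00007f2a0cce61c9 in vp9_half_horiz_variance8x_h_sse2 () from /mozilla/code/mozmill/firefox/libxul.so
#11 0x00007f2a0cce58b9 in vp9_half_horiz_variance8x_h_sse2 () from /mozilla/code/mozmill/firefox/libxul.so
#12 0x00007f2a0ccfaaa9 in vp9_half_horiz_variance8x_h_sse2 () from /mozilla/code/mozmill/firefox/libxul.so
#13 0x00007f2a0cce5a56 in vp9_half_horiz_variance8x_h_sse2 () from /mozilla/code/mozmill/firefox/libxul.so
#14 0x00007f2a079e94f2 in XCloseDisplay () from /usr/lib/x86_64-linux-gnu/libX11.so.6
"""

frames = parse_frames(SAMPLE_BT)
print(repeated_symbols(frames))
```

A non-recursive leaf routine showing up four times in a row is a hint that the debugger is falling back to some nearby exported name rather than reporting the real callers.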
$ wget http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/tterriberry@mozilla.com-88395514c4ca/try-linux64/firefox-29.0a1.en-US.linux-x86_64.tar.bz2
$ tar xvf firefox-29.0a1.en-US.linux-x86_64.tar.bz2
$ nm firefox/libxul.so | grep vp9_half_horiz_variance
0000000001a777a6 T vp9_half_horiz_variance8x_h_sse2
0000000001a777d9 t vp9_half_horiz_variance8x_h_sse2.half_horiz_variance8x_h_1

So the patch (or try push) didn't work. I notice the valgrind build failed with duplicate opus symbols. I wonder if we have another spurious copy in the build somewhere.
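Because several similarly named functions are involved here, an exact-name check on the nm output is less error-prone than a substring grep. This is a minimal sketch that assumes the default three-column nm text format (address, type letter, name); parse_nm and has_symbol are hypothetical helper names, and the sample lines are the two quoted above.

```python
def parse_nm(output):
    """Parse default `nm` text output into (address_or_None, type, name) tuples."""
    entries = []
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 3:
            addr, sym_type, name = parts
            entries.append((int(addr, 16), sym_type, name))
        elif len(parts) == 2:
            # Undefined symbols are printed without an address, e.g. "U malloc".
            sym_type, name = parts
            entries.append((None, sym_type, name))
    return entries


def has_symbol(entries, name):
    """True only if a symbol with exactly this name is present."""
    return any(entry_name == name for _, _, entry_name in entries)


# The two lines of nm output quoted in this bug.
NM_SAMPLE = """\
0000000001a777a6 T vp9_half_horiz_variance8x_h_sse2
0000000001a777d9 t vp9_half_horiz_variance8x_h_sse2.half_horiz_variance8x_h_1
"""

entries = parse_nm(NM_SAMPLE)
print(has_symbol(entries, "vp9_half_horiz_variance8x_h_sse2"))       # the symbol this bug is about
print(has_symbol(entries, "vp9_half_horiz_vert_variance8x_h_sse2"))  # the similarly named _vert_ variant
```

In nm's output an uppercase type letter such as T marks a global text symbol and lowercase t a local one, which is why the two sample lines differ in the second column.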
Derf's patch removes vp9_half_horiz_*vert*_variance8x_h_sse2. This bug (and bug 952048) is complaining about vp9_half_horiz_variance8x_h_sse2 (no _vert_). One mystery explained. With this patch I see no matching symbol in libxul.so on my local build. One mystery solved.
I cannot reproduce with or without the patches using the steps from comment #8. I get a couple of tests failing with exceptions, but no crash report. whimboo, can you please test again with my try build?
Ralph, I will test once the builds are available. One thing though: I forgot to add an important part to the last step, which is how to run mozmill. You have to use 'DISPLAY=:99.0 mozmill ...' to actually use the xvfb display. Without it no crash will happen. Sorry for that.
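For anyone scripting the reproduction, the mozmill invocation and the display setting can be put together roughly as follows. This is a sketch under assumptions: an Xvfb server is taken to be already running on display 99, the helper names mozmill_command and xvfb_env are made up for illustration, and the Firefox path is a placeholder.

```python
import os


def mozmill_command(firefox_path, manifest="mutt/mutt/tests/js/manifest.ini"):
    """Build the mozmill argument list from the reproduction steps in this bug."""
    return ["mozmill", "-b", firefox_path, "-m", manifest]


def xvfb_env(display=":99.0", base_env=None):
    """Copy the environment and point DISPLAY at the (assumed running) Xvfb server."""
    env = dict(os.environ if base_env is None else base_env)
    env["DISPLAY"] = display
    return env


# Placeholder path; substitute the location of the Firefox binary under test.
cmd = mozmill_command("/path/to/firefox/firefox")
env = xvfb_env()
print("DISPLAY=" + env["DISPLAY"], " ".join(cmd))
```

The command and environment could then be handed to subprocess.run(cmd, env=env) from inside the mozmill checkout with its virtualenv activated.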
So I tested again with the new Linux 64bit debug build from Ralph but the problem still exists :(

[..]
#10 0x00007fc769c5c1c9 in vp9_half_horiz_variance8x_h_sse2 () from /mozilla/code/mozmill/firefox/libxul.so
#11 0x00007fc769c5b8b9 in vp9_half_horiz_variance8x_h_sse2 () from /mozilla/code/mozmill/firefox/libxul.so
#12 0x00007fc769c70aa9 in vp9_half_horiz_variance8x_h_sse2 () from /mozilla/code/mozmill/firefox/libxul.so
#13 0x00007fc769c5ba56 in vp9_half_horiz_variance8x_h_sse2 () from /mozilla/code/mozmill/firefox/libxul.so
[..]
Ok, I get a crash (with or without the patches) if I prepend 'xvfb-run' to the mozmill invocation. I don't get symbols in the stack trace though. How do I hook up debug symbols from my build or the try build output? I do get nm output from the linux64 opt build from the try push. There's no vp9_half_horiz_variance8x_h_sse2. The linux64 debug build appears to be stripped. (?!?)
I used the linux64 debug build from try and when it crashed I attached gdb as the instructions are showing. With 'bt' I get the stacks as listed above.
To clarify: the linux64 debug build from my patched try push crashes, but the linux64 opt build (with symbols) does not.

Attaching to today's master after the crash (local debug build, without the patched out vp9 sse functions) I get a different backtrace, this time from the cairo code:

#0 0x000000321f0bca2d in nanosleep () at ../sysdeps/unix/syscall-template.S:81
#1 0x000000321f0bc8c4 in __sleep (seconds=0) at ../sysdeps/unix/sysv/linux/sleep.c:137
#2 0x00007f4acebd93d5 in ah_crap_handler (signum=6) at /home/giles/mozilla/firefox/toolkit/xre/nsSigHandlers.cpp:88
#3 0x00007f4acebe6272 in nsProfileLock::FatalSignalHandler (signo=6, info=0x7ffffbc8d870, context=0x7ffffbc8d740) at /home/giles/mozilla/firefox/profile/dirserviceprovider/src/nsProfileLock.cpp:190
#4 <signal handler called>
#5 0x000000321f035c59 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#6 0x000000321f037368 in __GI_abort () at abort.c:89
#7 0x000000321f075da4 in __libc_message (do_abort=do_abort@entry=2, fmt=fmt@entry=0x321f17c648 "*** Error in `%s': %s: 0x%s ***\n") at ../sysdeps/posix/libc_fatal.c:175
#8 0x000000321f07bbc7 in malloc_printerr (action=<optimized out>, str=0x321f179dda "corrupted double-linked list", ptr=<optimized out>) at malloc.c:4930
#9 0x000000321f07d123 in _int_free (av=0x321f3b8760 <main_arena>, p=0x2e9cd20, have_lock=0) at malloc.c:3938
#10 0x00007f4acf1d0970 in _cairo_xlib_visual_info_destroy (info=0x2e9cdd0) at /home/giles/mozilla/firefox/gfx/cairo/cairo/src/cairo-xlib-visual.c:186
#11 0x00007f4acf1c61ea in _cairo_xlib_screen_destroy (info=0x1ce43e0) at /home/giles/mozilla/firefox/gfx/cairo/cairo/src/cairo-xlib-screen.c:289
#12 0x00007f4acf1c4c74 in _cairo_xlib_display_destroy (abstract_display=0x1ce4270) at /home/giles/mozilla/firefox/gfx/cairo/cairo/src/cairo-xlib-display.c:120
#13 0x00007f4acf1ed3c9 in INT_cairo_device_destroy (device=0x1ce4270) at /home/giles/mozilla/firefox/gfx/cairo/cairo/src/cairo-device.c:339
#14 0x00007f4acf1c4fc4 in _cairo_xlib_close_display (dpy=0x63f230, codes=0x1ce4358) at /home/giles/mozilla/firefox/gfx/cairo/cairo/src/cairo-xlib-display.c:237
#15 0x000000322301fd22 in XCloseDisplay (dpy=0x63f230) at ClDisplay.c:65
#16 0x0000003e3d051086 in gdk_display_x11_finalize () from /lib64/libgdk-x11-2.0.so.0
#17 0x0000003f9ee14fcb in g_object_unref () from /lib64/libgobject-2.0.so.0
#18 0x00007f4acebcd26b in MOZ_gdk_display_close (display=0x64a020) at /home/giles/mozilla/firefox/toolkit/xre/nsAppRunner.cpp:2693
#19 0x00007f4acebd1514 in XREMain::XRE_main (this=0x7ffffbc8e3d0, argc=3, argv=0x7ffffbc8f898, aAppData=0x7ffffbc8e580) at /home/giles/mozilla/firefox/toolkit/xre/nsAppRunner.cpp:4205
#20 0x00007f4acebd163a in XRE_main (argc=3, argv=0x7ffffbc8f898, aAppData=0x7ffffbc8e580, aFlags=0) at /home/giles/mozilla/firefox/toolkit/xre/nsAppRunner.cpp:4368
#21 0x0000000000403d3f in do_main (argc=3, argv=0x7ffffbc8f898, xreDirectory=0x60d8c0) at /home/giles/mozilla/firefox/browser/app/nsBrowserApp.cpp:280
#22 0x0000000000404133 in main (argc=3, argv=0x7ffffbc8f898) at /home/giles/mozilla/firefox/browser/app/nsBrowserApp.cpp:648
(In reply to Henrik Skupin (:whimboo) from comment #20)
> I used the linux64 debug build from try and when it crashed I attached gdb
> as the instructions are showing. With 'bt' I get the stacks as listed above.

I see. This output doesn't get flushed on my terminal until after the 300 second sleep finishes. If I attach while it's waiting I get the backtrace with vp9_half_vert_variance8x_h_sse2. Since the binary has no symbols, I think gdb is getting them from a local or system library and then lying about the source. That explains why they still show up, and perhaps why the stack makes no sense.

$ gdb /home/giles/mozilla/mozmill/firefox/firefox 32673
GNU gdb (GDB) Fedora 7.6.50.20130731-19.fc20
...
Reading symbols from /home/giles/mozilla/mozmill/firefox/firefox...Missing separate debuginfo for /home/giles/mozilla/mozmill/firefox/firefox
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/0c/5b070c1e4b656d4578bb6f99bb87ad2bb830f5.debug
(no debugging symbols found)...done.
Attaching to program: /home/giles/mozilla/mozmill/firefox/firefox, process 32673
Reading symbols from /lib64/libpthread.so.0...Reading symbols from /usr/lib/debug/lib64/libpthread-2.18.so.debug...done.
done.
...
Reading symbols from /home/giles/mozilla/mozmill/firefox/libxul.so...Missing separate debuginfo for /home/giles/mozilla/mozmill/firefox/libxul.so
Try: yum --enablerepo='*debug*' install /usr/lib/debug/.build-id/cd/c4f8fbeccfc83973ec70eafd70ce8694e0aeb0.debug
(no debugging symbols found)...done.
...

But I can't find a load line for libvpx or libxul either way.
I keep getting confused because of all the similar function names. I see vp9_half_vert_variance8x_h_sse2 in the bt after removing vp9_half_horiz_variance8x_h_sse2 and vp9_half_horiz_vert_variance8x_h_sse2. So I'm not seeing a symbol we've removed from the build. But whimboo is in comment #18. That I do not understand.

After asking around: 'nm' shows symbols from the "Normal Symbol Table" by default. The debug build from tbpl has been stripped, so there is no normal symbol table. There is still a "Dynamic Symbol Table" which lists vp9_half_vert_variance8x_h_sse2. This table can be dumped with 'nm -D' or 'objdump -T'. Presumably this is where gdb is looking up the name when generating the backtrace.

The symbol is 'NOTYPE GLOBAL DEFAULT' with zero length. jld suggested it's just the last label before an unexported function, which is where the PC actually is. I don't know why the symbol is on the export list in the first place. Notably the offset is a long way from the symbol address:

#11 0x00007f1635a50c79 in vp9_half_vert_variance8x_h_sse2 () from /home/giles/mozilla/mozmill/firefox/libxul.so
(gdb) p $pc
$2 = (void (*)()) 0x7f1635a50c79 <vp9_half_vert_variance8x_h_sse2+271627>

Another valid hypothesis.
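The "last label before the PC" idea can be illustrated with a small lookup that maps an address to the nearest symbol at or below it and reports the offset. This is a sketch of the general technique only, not gdb's actual implementation; the symbol table, the addresses, and the 0x1000 threshold are made-up values for illustration, while the 271627 offset is the one gdb printed above.

```python
import bisect

# Hypothetical cut-off: offsets beyond this are treated as "probably not
# really inside the named function". Real functions can be larger.
SUSPICIOUS_OFFSET = 0x1000


def resolve(pc, symbols):
    """Return (name, offset) for the nearest symbol at or below pc, or None.

    symbols must be a list of (address, name) tuples sorted by address.
    """
    addresses = [addr for addr, _ in symbols]
    index = bisect.bisect_right(addresses, pc) - 1
    if index < 0:
        return None  # pc lies below every known symbol
    addr, name = symbols[index]
    return name, pc - addr


def is_suspicious(offset, threshold=SUSPICIOUS_OFFSET):
    """Flag offsets that are implausibly far from the symbol's start."""
    return offset > threshold


# Made-up dynamic symbol table: only two exported labels are visible.
symbols = [(0x1000, "exported_function"), (0x2000, "exported_label")]

# A PC far past the last exported label still gets attributed to it.
name, offset = resolve(0x2000 + 271627, symbols)
print(name, offset, is_suspicious(offset))
```

With only exported names available, every address in the unexported code that follows a label resolves to that label, which would account for four different frames all carrying the same name.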
Still looking for a sec-rating here - is this an exploitable issue that would impact users?
Flags: needinfo?(abillings)
I'm not sure why you're asking me when I asked the same question in comment 9. I'm unable to rate it based on the information given.
Flags: needinfo?(abillings)
(In reply to Al Billings [:abillings] from comment #25) > I'm not sure why you're asking me when I asked the same question in comment > 9. I'm unable to rate it based on the information given. Who were you asking in comment 9? I look to you for this, but if you don't know, please NI? the person you're expecting an answer from so the bug doesn't sit around without the necessary info. This bug isn't assigned to anyone either, so there's not much to go on for driving.
Flags: needinfo?(abillings)
Unlike many people, I actually do read my bugmail. You don't need to needinfo? me to get my attention as I go through bugs I'm cc'd on. I was asking anyone reading this bug or commenting in it. I'm setting needinfo? for Dan since he should weigh in on the rating.
Flags: needinfo?(dveditz)
Flags: needinfo?(abillings)
Assignee: nobody → tterribe
Flags: needinfo?(dveditz)
I'm going to mark this sec-other until we have any evidence that this is a problem outside of xvfb.
Keywords: sec-other
Group: core-security → media-core-security
vp9_half_horiz_variance8x_h_sse2 doesn't appear to exist in trunk... (there is a vp8 version) Still relevant? close?
Component: Audio/Video → Audio/Video: Playback
Flags: needinfo?(hskupin)
Flags: needinfo?(giles)
It was removed in the update to libvpx 1.4.0 in bug 1151175, so the patches are obsolete. Moreover, in the last 28 days, there's only a single crash report with this signature, with buildid 20141106201515 (two years old). I think we can close.
Status: NEW → RESOLVED
Closed: 9 years ago
Flags: needinfo?(giles)
Resolution: --- → FIXED
Flags: needinfo?(hskupin)
Group: media-core-security → core-security-release
Group: core-security-release
Depends on: 1151175