1193855 - Frequent hangs/crashes in due to apparent threading error (in Ubuntu 15.04, began with 40.0)

Reporter

Description

•

10 years ago

Attached file ff_gdb_safecrash_backtrace — Details

User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.155 Safari/537.36

Steps to reproduce:

This behavior occurs in normal use and in safe mode and does not appear to be related to any particular activity in the browser. The delay before crashing has been as little as a few seconds or as much as 15+ minutes. This is new behavior following update to version 40.0. OS is Ubuntu 15.04.

Actual results:

See also this automated crash report (it's actually quite difficult in this crash/resume state to make a report, but I did get this one out: https://crash-stats.mozilla.org/report/index/9faaf25e-1bd0-4ef1-a10c-ad6772150812 ).

Browser hangs and fades to gray. It is not responsive to any inputs. On some of the hangs, there also seems to be some odd effect on UI functionality across the desktop (launcher icons not responding until the Firefox process is killed). Further, on restart Firefox suddenly detected that it was not the default browser, and the search bar handler changed from Google to Yahoo.

When starting Firefox (safe mode) in debug mode through gdb, this is the terminal readout. I have also attached a backtrace from gdb.

(gdb) run
Starting program: /usr/lib/firefox/firefox --safe-mode
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

(process:10383): GLib-CRITICAL **: g_slice_set_config: assertion 'sys_page_size == 0' failed
warning: Corrupted shared library list: 0x7fffe8054800 != 0x7ffff6b93800
[New Thread 0x7fffc79fd700 (LWP 10448)]
[Thread 0x7fffc79fd700 (LWP 10448) exited]
[New Thread 0x7fffde1ed700 (LWP 10393)]
[New Thread 0x7fffb22ff700 (LWP 10559)]
[[Firefox keeps running fine. A bunch of other new threads truncated here...]]
[New Thread 0x7ffff7f71700 (LWP 10394)]
[New Thread 0x7fffde9ee700 (LWP 10392)]
[New Thread 0x7fffe9417700 (LWP 10391)]

Program received signal SIGPIPE, Broken pipe.
[Switching to Thread 0x7fffde1ed700 (LWP 10393)]
0x00007ffff7bcb2ef in __libc_send (fd=120, buf=buf@entry=0x7fffbe6d0000, n=n@entry=53, flags=flags@entry=0) at ../sysdeps/unix/sysv/linux/x86_64/send.c:31
31 ../sysdeps/unix/sysv/linux/x86_64/send.c: No such file or directory.

Expected results:

The browser should not have crashed...?

Adam Colligan

Reporter

Updated

•

10 years ago

Crash Signature: 9faaf25e-1bd0-4ef1-a10c-ad6772150812

Adam Colligan

Reporter

Updated

•

10 years ago

Severity: normal → critical

OS: Unspecified → Linux

Hardware: Unspecified → x86_64

Adam Colligan

Reporter

Updated

•

10 years ago

Summary: Firefox hangs at frequent random intervals (possibly on SIGPIPE) after update to 40.0 in Ubuntu 15.04 → Frequent hangs/crashes in due to apparent threading error (in Ubuntu 15.04, began with 40.0)

Adam Colligan

Reporter

Updated

•

10 years ago

Keywords: 64bit, crash, hang, main-thread-io, regression

Adam Colligan

Reporter

Comment 1

•

10 years ago

I have now re-created this in beta (41.0b1). I have also tried nightly and am getting crashes, but it is difficult for me to tell if it's the same issue or not, as it is handling the back-end processes differently I believe, and I have not been able to make gdb play as nicely with it.

Adam Colligan

Reporter

Comment 2

•

10 years ago

My best guess at the moment is that the regression is somewhere in this diff (from Launchpad): https://launchpadlibrarian.net/213839557/firefox_39.0%2Bbuild5-0ubuntu0.15.04.1_40.0%2Bbuild4-0ubuntu0.15.04.1.diff.gz

Nickolay_Ponomarev

Comment 3

•

10 years ago

> it's actually quite difficult in this crash/resume state to make a report, but I did get this one out

How? This should work: https://developer.mozilla.org/en-US/docs/How_to_Report_a_Hung_Firefox#Linux_and_Mac
(I'd like to determine if Firefox hangs and only crashes after you force it to by sending a signal, or if it crashes on its own.)

Try crashing again; if the stack always looks similar, it may be related to OMTC, which was enabled on Linux in v40 (bug 994541, https://mozillagfx.wordpress.com/2015/05/19/off-main-thread-compositing-on-linux/ ).
I'm not sure which pref can be used to test with it disabled (perhaps layers.offmainthreadcomposition.enabled?)

Also, please confirm that you use a build from mozilla.org, not the distro-provided build.

If the above suggestions do not help in identifying the cause of the problem, use this to obtain the regression range: http://mozilla.github.io/mozregression/

> Program received signal SIGPIPE, Broken pipe.

That's not necessarily a crash. Per http://krijnhoetmer.nl/irc-logs/developers/20141126#l-1917 :

> <grobinson> seth: Does Firefox crash often when you're debugging it on Mac? I keep getting SIGPIPE
> <seth> grobinson: so that's actually not a crash; we use SIGPIPE in our IO code
> <seth> grobinson: i have a trick to fix that; just a sec
> <seth> grobinson: (unfortunately masking the singal in .lldbinit does not work for some reason)
> <seth> grobinson: OK, so i set a breakpoint on the function do_main in nsBrowserApp.cpp
> <seth> grobinson: the action type for the breakpoint is "debugger command"
> <seth> grobinson: and the command is "process handle SIGPIPE -n true -p true -s false"
> <seth> grobinson: breakpoints are saved in the project, so if you set this once, you'll solve the problem forever. it'll disable breaking on SIGPIPE every time you run firefox
> <seth> (there's nothing special about do_main, i was just trying to run the breakpoint as early as possible)
> * bz solves this problem by not using lldb
> <bz> and "handle SIGPIPE noprint nostop pass" in gdb
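The gdb command quoted at the end of the log can be passed on the command line so that it takes effect before the program starts. The following is a minimal sketch, assuming gdb is on the PATH and reusing the binary path and --safe-mode flag from comment 0; the helper name gdb_command is hypothetical and only builds the argument list, it does not launch anything.

```python
import shlex


def gdb_command(binary, *program_args):
    """Build a gdb invocation that passes SIGPIPE through without stopping.

    Hypothetical helper for illustration; it only assembles an argv list.
    """
    return [
        "gdb",
        # -ex runs a gdb command at startup, before the program is started.
        "-ex", "handle SIGPIPE noprint nostop pass",
        # Everything after --args is the program to debug and its arguments.
        "--args", binary, *program_args,
    ]


if __name__ == "__main__":
    cmd = gdb_command("/usr/lib/firefox/firefox", "--safe-mode")
    # Print a shell-quoted line that can be pasted into a terminal.
    print(" ".join(shlex.quote(part) for part in cmd))
```

With SIGPIPE handled this way, a stop inside gdb is more likely to reflect the actual hang than the debugger intercepting a signal the browser expects to receive.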

Flags: needinfo?(adam)

Nickolay_Ponomarev

Updated

•

10 years ago

Keywords: 64bit, crash, main-thread-io

Adam Colligan

Reporter

Comment 4

•

10 years ago

> This should work:
> https://developer.mozilla.org/en-US/docs/How_to_Report_a_Hung_Firefox#Linux_and_Mac
> (I'd like to determine if Firefox hangs and only crashes after you force it
> to by sending a signal, or if it crashes on its own.
> Try crashing again [...]

Thanks for that (and thanks very much for taking an interest in the bug in general!). Here are a couple more manually-triggered crash reports:
https://crash-stats.mozilla.com/report/index/c2e63486-bbf8-4d70-9648-f3f262150816
https://crash-stats.mozilla.com/report/index/26b5c1ea-631a-4a39-a999-3cdfc2150816

> if the stack always looks similar, it may be related to
> OMTC, which was enabled on Linux in v40 (bug 994541,
> https://mozillagfx.wordpress.com/2015/05/19/off-main-thread-compositing-on-linux/ )
> I'm not sure which pref can be used to test with it disabled (perhaps
> layers.offmainthreadcomposition.enabled?)

I did set that to "false", and that session proceeded to crash as usual within a few minutes. Here's a crash report from there:
https://crash-stats.mozilla.com/report/index/25928c42-16be-411a-9d6e-65b612150816
Even after a restart, I still got another crash with that setting on "false":
https://crash-stats.mozilla.com/report/index/bp-0f6443fd-fb64-4a6f-b42b-baafc2150816

> Also, please confirm that you use a build from mozilla.org, not the
> distro-provided build.

I do use the distro-provided builds normally (https://launchpad.net/firefox), which is where the reports are coming from. The beta I tested was also from an Ubuntu PPA here (https://launchpad.net/~mozillateam/+archive/ubuntu/firefox-next) and the nightly here (https://launchpad.net/~ubuntu-mozilla-daily/+archive/ubuntu/ppa). Those Firefox builds hosted on Launchpad state that their bugs are tracked here at Bugzilla, though.

I downloaded firefox from the Mozilla.org site, which gave me a .tar.bz that seems able to run straight from the folder when extracted. The build ID I saw was identical to the one I run, though: 20150807094836, and all of my configs looked the same. I'm not sure if this is expected behavior or if it just means that I wasn't actually running the version I downloaded but rather just launching the files that my system already associates with that program command. (Running a totally different build from a folder doesn't seem like a problem with things like the Tor Browser Bundle, though -- sorry for my ignorance about how it works).

I tried specifying a download for the en-US 64-bit linux version, and that .tar.bz has the same MD5 as the one auto-generated for me. But when I launched firefox from *that* extracted folder (in a different random-ish location on my machine), this time with no already-running instances, it opened with a "checking your add-ons" screen. It froze there, however. Here's that crash:
https://crash-stats.mozilla.com/report/index/b6af6b1e-dd6f-499d-8bb3-4bca52150816
After running it for a time, it went as usual:
https://crash-stats.mozilla.com/report/index/3406f2da-604a-44e2-8dc6-118eb2150816

> If above suggestions will not help in identifying the cause of the problem,
> to obtain the regression range use this:
> http://mozilla.github.io/mozregression/

I did start making a run at that the other night, but it's quite difficult work for a couple of reasons. (1) It could be anywhere between 39.0.5 and 40.0.x, which I believe covers quite a few nightlies; and (2) the time to crash and actions to cause a crash are weirdly indeterminate, so testing each nightly in the sequence takes a long time, and it's hard to tell when the "good" build has been found. I will get back on it if necessary, but my hope has been that the traces themselves might provide more direct insight.

> > Program received signal SIGPIPE, Broken pipe.
> That's not necessarily a crash. Per
> http://krijnhoetmer.nl/irc-logs/developers/20141126#l-1917 :
> > <grobinson> seth: Does Firefox crash often when you're debugging it on Mac? I keep getting SIGPIPE
> > <seth> grobinson: so that's actually not a crash; we use SIGPIPE in our IO code
> > <seth> grobinson: i have a trick to fix that; just a sec
> > <seth> grobinson: (unfortunately masking the singal in .lldbinit does not work for some reason)
> > <seth> grobinson: OK, so i set a breakpoint on the function do_main in nsBrowserApp.cpp
> > <seth> grobinson: the action type for the breakpoint is "debugger command"
> > <seth> grobinson: and the command is "process handle SIGPIPE -n true -p true -s false"
> > <seth> grobinson: breakpoints are saved in the project, so if you set this once, you'll solve the problem forever. it'll disable breaking on SIGPIPE every time you run firefox
> > <seth> (there's nothing special about do_main, i was just trying to run the breakpoint as early as possible)
> > * bz solves this problem by not using lldb
> > <bz> and "handle SIGPIPE noprint nostop pass" in gdb

It's possible, then, that those lines were an artifact of my having opened that instance of firefox through gdb?

Two other things I can add that may not be meaningful:

(1) So far, since this started, I have noticed one sort of use that does not seem to lead to a crash/hang/whatever it is. Sometimes the crash happens right on opening, before I've even been able to finish typing a URL or choose a bookmark (or even answer whether I want the session restored). But assuming that doesn't happen, I sometimes go to www.haxball.com , which is a multiplayer flash game I play through the Pipelight plugin. If I have done / do nothing else in the browser but play that one game, I don't recall it having crashed even after fairly extended periods (maybe 30-60+ minutes). For whatever that's worth. Note that crashes in other contexts can happen during actions as mundane as mousing over a link, so I don't know how much I'd read into it.

(2) Since I have been mostly using Chrome in the meantime, I have noticed a few similar (but maybe not identical, and much more rare) crashes with Chrome. I don't have debugging symbols for gdb, and I haven't been able to run the nacl-gdb version (probably because of sandboxing), but when I attach gdb to a Chrome process post-crash, I get something like this as a trace:

#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x00007f5801088b58 in ?? ()
#2 0x000002018acf5020 in ?? ()
#3 0x0000020189097150 in ?? ()
#4 0x00007ffc0afa18ee in ?? ()
#5 0x0000000000000000 in ?? ()

Again, maybe/probably nothing, but I'm throwing darts at whatever.
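To put a rough number on the bisection effort described in this comment, here is a back-of-the-envelope sketch. It assumes a plain binary search over the candidate nightlies; the count of 42 candidates and the 15-minute observation window per build are illustrative assumptions (the window echoes the "15+ minutes" from comment 0), and the function names are hypothetical.

```python
def bisection_steps(candidate_builds):
    """Worst-case number of builds to test in a binary search over the range."""
    if candidate_builds < 1:
        raise ValueError("need at least one candidate build")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, without float rounding.
    return (candidate_builds - 1).bit_length()


def worst_case_minutes(candidate_builds, minutes_per_build):
    """Total observation time if every tested build needs the full window."""
    return bisection_steps(candidate_builds) * minutes_per_build


if __name__ == "__main__":
    # Both numbers are assumptions for illustration, not taken from the bug.
    builds, window = 42, 15
    print(bisection_steps(builds), "builds,", worst_case_minutes(builds, window), "minutes")
```

The number of steps grows only logarithmically with the range; the harder part, as the comment notes, is that a build that has not hung within the window is not proven good.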

Flags: needinfo?(adam)

Nickolay_Ponomarev

Comment 5

•

10 years ago

>that session proceeded to crash as usual within a few minutes

I believe you're using the verb "crash" to refer to the "hangs and fades to gray" state you've described, not the death of the firefox process and/or appearance of the Crash reporter window. Is that correct?
The common terms here in b.m.o for the former cases is "hangs" and the latter cases "crashes".

>Those Firefox builds hosted on Launchpad state that their bugs are tracked here at Bugzilla, though.

Where do they state that? I believe the Ubuntu mozilla team uses their own tracker.

>It's possible, then, that those lines were an artifact of my having opened that instance of firefox through gdb?

That's my guess. The IRC log I pasted has instructions on disabling this behavior in gdb (and/or just try continuing, I think it's "c" in gdb).

>I downloaded firefox from the Mozilla.org site

I couldn't figure out what you were trying to describe in this paragraph.

https://support.mozilla.org/en-US/kb/install-firefox-linux#firefox:linux:fx40 (starting from "2. Open a Terminal and go to your home directory: cd ~ ") has the instructions to install a build from tar.gz. Note that you must ensure no firefox processes are running, and run "path/to/firefox", not just "firefox".

(BTW, The Ubuntu builds, I believe, clearly identify themselves as such in the About dialog. At least the one that came with my ubuntu install does.)

Your settings are stored in a "profile" and are reused across different versions of Firefox. It might be a good idea to test with a clean profile too: https://support.mozilla.org/en-US/kb/profile-manager-create-and-remove-firefox-profiles

A sure-fire way to run an isolated Firefox instance of a specific version is:
path/to/firefox -no-remote -profile /absolute/path/to/an/empty/dir

> http://mozilla.github.io/mozregression/

Did you succeed in reproducing the problem at least in one build launched with mozregression? It downloads/runs the mozilla.org builds.
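The isolated-instance command in this comment can be scripted so that every test run gets a fresh, empty profile. This is a minimal sketch: the flags are the ones quoted in the comment, "path/to/firefox" remains the same placeholder, and the helper name isolated_firefox_command is hypothetical. It only prints the command; it does not start the browser.

```python
import tempfile


def isolated_firefox_command(firefox_path, profile_dir):
    """argv for a Firefox run that does not attach to a running instance
    and uses profile_dir as its profile (flags as quoted in the comment)."""
    return [firefox_path, "-no-remote", "-profile", profile_dir]


if __name__ == "__main__":
    # mkdtemp creates a new empty directory and returns its absolute path.
    profile = tempfile.mkdtemp(prefix="ff-clean-profile-")
    # "path/to/firefox" is a placeholder; substitute the extracted build's binary.
    print(isolated_firefox_command("path/to/firefox", profile))
```

Using a new directory per run avoids carrying settings from one test into the next, which matters when comparing builds or pref changes.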

Flags: needinfo?(adam)

Nickolay_Ponomarev

Comment 6

•

10 years ago

And perhaps, someone from the graphics team can chime in with further suggestions.

Short summary (for details see comment 0): Firefox on Linux "hangs and fades to gray" starting with version 40. Force-crashing it in this state always has this stack (e.g. bp-3406f2da-604a-44e2-8dc6-118eb2150816):

0 libpthread-2.21.so libpthread-2.21.so@0xcda0
1 libnspr4.so PR_WaitCondVar
2 libxul.so mozilla::Monitor::Wait
3 libxul.so mozilla::ipc::MessageChannel::WaitForSyncNotify
4 libxul.so mozilla::ipc::MessageChannel::Send
5 libxul.so mozilla::layers::PLayerTransactionChild::SendUpdate
6 libxul.so mozilla::layers::ShadowLayerForwarder::EndTransaction
7 libxul.so mozilla::layers::ClientLayerManager::ForwardTransaction
8 libxul.so mozilla::layers::ClientLayerManager::EndTransaction
9 libxul.so nsDisplayList::PaintRoot
10 libxul.so nsLayoutUtils::PaintFrame
11 libxul.so PresShell::Paint
...

I guessed it could be related to OMTC, but setting layers.offmainthreadcomposition.enabled=false didn't have any effect.
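Checking whether two force-crash reports "always have this stack" can be done mechanically by comparing their leading frames. The sketch below assumes frames are given as whitespace-separated "index module function" triples, as in the summary in this comment, and that function names contain no spaces; the names HANG_STACK, frame_functions and stacks_match are hypothetical.

```python
# First frames of the stack quoted in this comment, one triple per frame.
HANG_STACK = (
    "0 libpthread-2.21.so libpthread-2.21.so@0xcda0 "
    "1 libnspr4.so PR_WaitCondVar "
    "2 libxul.so mozilla::Monitor::Wait "
    "3 libxul.so mozilla::ipc::MessageChannel::WaitForSyncNotify "
    "4 libxul.so mozilla::ipc::MessageChannel::Send"
)


def frame_functions(stack_text):
    """Return the function name of each (index, module, function) triple."""
    tokens = stack_text.split()
    # The function name is the third token of every complete triple.
    return [tokens[i + 2] for i in range(0, len(tokens) - 2, 3)]


def stacks_match(stack_a, stack_b, depth=5):
    """True if both stacks share the same top `depth` function names."""
    return frame_functions(stack_a)[:depth] == frame_functions(stack_b)[:depth]


if __name__ == "__main__":
    print(frame_functions(HANG_STACK))
    print(stacks_match(HANG_STACK, HANG_STACK))
```

Comparing only the top few frames is a deliberate simplification: deeper frames depend on what the page was doing when the paint was triggered.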

Flags: needinfo?(nical.bugzilla)

Nicolas Silva [:nical]

Comment 7

•

10 years ago

This stack indicates that the main thread sent a synchronous transaction to the compositor thread, and is waiting for the compositor to receive the transaction. In the crash bp-3406f2da-604a-44e2-8dc6-118eb2150816 we can see that the compositor is indeed busy compositing (perhaps hung).

It's the OpenGL compositor, so it'd be interesting to know if the problem also occurs with hardware acceleration turned off (which is the default configuration on Linux). If the problem only happens with hardware acceleration, then this bug should block bug 594876, and is probably a duplicate of some other hangs reported with gl layers on Linux (I don't have the bug numbers handy).
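The blocking pattern described in this comment can be illustrated with a toy model. This is not Gecko's IPC code: it is a sketch, assuming a sender that posts a message and waits for an acknowledgement from a receiver thread that may be stuck. The timeout exists only so the demonstration terminates; all names (sync_send, compositor_loop, demo) are hypothetical.

```python
import queue
import threading


def sync_send(inbox, message, timeout):
    """Post a message and block until the receiver acknowledges it.

    Returns False if no acknowledgement arrived within `timeout` seconds,
    which in this toy model stands in for the main thread being hung.
    """
    done = threading.Event()
    inbox.put((message, done))
    return done.wait(timeout)


def compositor_loop(inbox, can_progress):
    """Toy receiver that cannot acknowledge anything until it can make progress."""
    while True:
        message, done = inbox.get()
        can_progress.wait()  # stands in for a compositor stuck in a frame
        done.set()
        if message == "shutdown":
            return


def demo():
    inbox = queue.Queue()
    can_progress = threading.Event()
    worker = threading.Thread(
        target=compositor_loop, args=(inbox, can_progress), daemon=True
    )
    worker.start()
    # The receiver is stuck, so the synchronous send is never acknowledged.
    acked_while_stuck = sync_send(inbox, "update", timeout=0.2)
    # Once the receiver can progress again, sends complete normally.
    can_progress.set()
    acked_after_unblock = sync_send(inbox, "shutdown", timeout=2.0)
    worker.join(timeout=2.0)
    return acked_while_stuck, acked_after_unblock


if __name__ == "__main__":
    print(demo())
```

Without a timeout on the sender's wait, the first send would block for as long as the receiver stays stuck, which matches the shape of the stack in comment 6: the main thread waiting on a condition variable inside a synchronous send.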

Flags: needinfo?(nical.bugzilla)

Comment hidden (obsolete)

I see a similar crash on Fedora, Firefox 40, debug (custom) build, Gtk2, safe mode, OMTC enabled (default clean profile). Firefox crashes right after start:

Program received signal SIG38, Real-time event 38.
0x00007ffff6e5db61 in clone () from /lib64/libc.so.6
#0 0x00007ffff6e5db61 in clone () at /lib64/libc.so.6
#1 0x00007ffff7bc60d1 in create_thread () at /lib64/libpthread.so.0
#2 0x00007ffff7bc7c95 in pthread_create@@GLIBC_2.2.5 () at /lib64/libpthread.so.0
#3 0x00007ffff3c39311 in (anonymous namespace)::CreateThread(size_t, bool, PlatformThread::Delegate*, PlatformThreadHandle*) (stack_size=0, joinable=joinable@entry=true, delegate=delegate@entry=0x7fffd05bd4c0, thread_handle=thread_handle@entry=0x7fffd05bd4d0) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/ipc/chromium/src/base/platform_thread_posix.cc:144
#4 0x00007ffff3c39c8e in PlatformThread::Create(unsigned long, PlatformThread::Delegate*, unsigned long*) (stack_size=<optimized out>, delegate=delegate@entry=0x7fffd05bd4c0, thread_handle=thread_handle@entry=0x7fffd05bd4d0) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/ipc/chromium/src/base/platform_thread_posix.cc:156
#5 0x00007ffff3c3ab96 in base::Thread::StartWithOptions(base::Thread::Options const&) (this=0x7fffd05bd4c0, options=...) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/ipc/chromium/src/base/thread.cc:92
#6 0x00007ffff3c3abf5 in base::Thread::Start() (this=<optimized out>) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/ipc/chromium/src/base/thread.cc:81
#7 0x00007ffff3fb0018 in mozilla::layers::ImageBridgeChild::StartUpOnThread(base::Thread*) (aThread=<optimized out>) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/gfx/layers/ipc/ImageBridgeChild.cpp:612
#8 0x00007ffff3fcdba2 in gfxPlatform::Init() () at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/gfx/thebes/gfxPlatform.cpp:502
#9 0x00007ffff3fcdef0 in gfxPlatform::GetPlatform() () at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/gfx/thebes/gfxPlatform.cpp:416
#10 0x00007ffff4ab2556 in nsRefreshDriver::ChooseTimer() const () at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/layout/base/nsRefreshDriver.cpp:866
#11 0x00007ffff4ab2556 in nsRefreshDriver::ChooseTimer() const (this=<optimized out>) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/layout/base/nsRefreshDriver.cpp:1000
#12 0x00007ffff4ab26b0 in nsRefreshDriver::EnsureTimerStarted(nsRefreshDriver::EnsureTimerStartedFlags) (this=0x7fffd17a6000, aFlags=nsRefreshDriver::eNone) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/layout/base/nsRefreshDriver.cpp:1207
#13 0x00007ffff4ab2755 in nsRefreshDriver::MostRecentRefresh() const (this=0x7fffd17a6000) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/layout/base/nsRefreshDriver.cpp:1096
#14 0x00007ffff4b2711b in nsPresContext::Init(nsDeviceContext*) (this=0x7fffd034b000, aDeviceContext=<optimized out>) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/layout/base/nsPresContext.cpp:1029
#15 0x00007ffff4b097e7 in nsDocumentViewer::InitInternal(nsIWidget*, nsISupports*, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&, bool, bool, bool) (this=0x7fffd17da2c0, aParentWidget=0x7fffd0462b80, aState=aState@entry=0x0, aBounds=..., aDoCreation=aDoCreation@entry=true, aNeedMakeCX=aNeedMakeCX@entry=true, aForceSetNewDocument=true) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/layout/base/nsDocumentViewer.cpp:814
#16 0x00007ffff4b09b08 in nsDocumentViewer::Init(nsIWidget*, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&) (this=<optimized out>, aParentWidget=<optimized out>, aBounds=...) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/layout/base/nsDocumentViewer.cpp:626
#17 0x00007ffff4d3d0f1 in nsDocShell::SetupNewViewer(nsIContentViewer*) (this=this@entry=0x7fffd034a000, aNewViewer=aNewViewer@entry=0x7fffd17da2c0) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/docshell/base/nsDocShell.cpp:9294
#18 0x00007ffff4d4319a in nsDocShell::Embed(nsIContentViewer*, char const*, nsISupports*) (this=this@entry=0x7fffd034a000, aContentViewer=0x7fffd17da2c0, aCommand=aCommand@entry=0x7ffff56b559e "", aExtraInfo=aExtraInfo@entry=0x0) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/docshell/base/nsDocShell.cpp:7199
#19 0x00007ffff4d41ed0 in nsDocShell::CreateAboutBlankContentViewer(nsIPrincipal*, nsIURI*, bool) (this=0x7fffd034a000, aPrincipal=<optimized out>, aBaseURI=0x0, aTryToSaveOldPresentation=<optimized out>) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/docshell/base/nsDocShell.cpp:8014
#20 0x00007ffff4d796ed in nsWebShellWindow::Initialize(nsIXULWindow*, nsIXULWindow*, nsIURI*, int, int, bool, nsITabParent*, nsWidgetInitData&) (this=0x7fffd04fc100, aParent=<optimized out>, aOpener=aOpener@entry=0x0, aUrl=aUrl@entry=0x7fffd02b72e0, aInitialWidth=aInitialWidth@entry=100, aInitialHeight=aInitialHeight@entry=100, aIsHiddenWindow=true, aOpeningTab=0x0, widgetInitData=...) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/xpfe/appshell/nsWebShellWindow.cpp:216
#21 0x00007ffff4d79bb4 in nsAppShellService::JustCreateTopWindow(nsIXULWindow*, nsIURI*, unsigned int, int, int, bool, nsITabParent*, nsWebShellWindow**) (this=this@entry=0x7fffd03f31c0, aParent=aParent@entry=0x0, aUrl=0x7fffd02b72e0, aChromeMask=aChromeMask@entry=4094, aInitialWidth=aInitialWidth@entry=100, aInitialHeight=aInitialHeight@entry=100, aIsHiddenWindow=true, aOpeningTab=0x0, aResult=0x7fffffffc6f0) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/xpfe/appshell/nsAppShellService.cpp:620
#22 0x00007ffff4d79fea in nsAppShellService::CreateHiddenWindowHelper(bool) (this=0x7fffd03f31c0, aIsPrivate=<optimized out>) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/xpfe/appshell/nsAppShellService.cpp:138
#23 0x00007ffff4e64dec in nsAppStartup::CreateHiddenWindow() (this=<optimized out>) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/toolkit/components/startup/nsAppStartup.cpp:244
#24 0x00007ffff4e94ea3 in XREMain::XRE_mainRun() (this=this@entry=0x7fffffffc928) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/toolkit/xre/nsAppRunner.cpp:4012
#25 0x00007ffff4e95295 in XREMain::XRE_main(int, char**, nsXREAppData const*) (this=this@entry=0x7fffffffc928, argc=argc@entry=5, argv=argv@entry=0x7fffffffde28, aAppData=aAppData@entry=0x7fffffffcb30) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/toolkit/xre/nsAppRunner.cpp:4168
#26 0x00007ffff4e954d2 in XRE_main(int, char**, nsXREAppData const*, uint32_t) (argc=5, argv=0x7fffffffde28, aAppData=0x7fffffffcb30, aFlags=<optimized out>) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/toolkit/xre/nsAppRunner.cpp:4257
#27 0x0000000000404190 in do_main(int, char**, nsIFile*) (argc=argc@entry=5, argv=argv@entry=0x7fffffffde28, xreDirectory=0x7ffff6b5f6c0) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/browser/app/nsBrowserApp.cpp:214
#28 0x00000000004039c1 in main(int, char**) (argc=5, argv=0x7fffffffde28) at /home/komat/CVS/firefox/firefox-40.0-bez-debug-optimalizace-gtk2/mozilla-release/browser/app/nsBrowserApp.cpp:478

Andrew Comminos [:acomminos]

Comment 9

•

10 years ago

(In reply to Nicolas Silva [:nical] from comment #7)
> This stack indicates that the main thread sent a synchronous transaction to
> the compositor thread, and is waiting for the compositor to receive the
> transaction. in the crash bp-3406f2da-604a-44e2-8dc6-118eb2150816 we can see
> that the compositor is indeed busy compositing (perhaps hung). It's the
> OpenGL compositor, so it'd be interesting to know if the problem also occurs
> with hardware acceleration turned off (which is the default configuration on
> Linux). If the problem only happens with hardware acceleration, then this
> bug should block bug 594876, and is probably a duplicate of some other hangs
> reported with gl layers on Linux (I don't have the bug numbers handy).

The crash bp-3406f2da-604a-44e2-8dc6-118eb2150816 leads me to believe this could be related to XInitThreads pain, mainly with drivers that don't expect a thread-safe X11 environment (and end up deadlocking themselves somehow). It may be worth looking into this crash report with more rigor, since it's occurring in the open-source drivers.

Either way, this may make a good case for killing XInitThreads in gecko (as mentioned in bug 1189132).

Adam Colligan

Reporter

Comment 10

•

10 years ago

Thank you all for your time so far in having a look at this.

(In reply to Nickolay_Ponomarev from comment #5)
> I believe you're using the verb "crash" to refer to the "hangs and fades to
> gray" state you've described, not the death of the firefox process and/or
> appearance of the Crash reporter window. Is that correct?
> The common terms here in b.m.o for the former cases is "hangs" and the
> latter cases "crashes".

Got it. There may have been instances where the process died on its own and the crash handler appeared, but only one or two (that was my earlier reference to why I had found it difficult to report). In general, I have to kill the process (sometimes possible through the OS GUI, sometimes requiring sending a kill signal because other OS elements have become unresponsive to the mouse at the same time as the Firefox hang).

> >Those Firefox builds hosted on Launchpad state that their bugs are tracked here at Bugzilla, though.
> Where do they state that? I believe the Ubuntu mozilla team uses their own
> tracker.

That's here for the main package (https://bugs.launchpad.net/firefox ). I see on re-reading it that while it is true that the primary message of that page is to say that bugs are tracked over here, it is also the case that there are some separate bug pages linked below that, which do exist on Launchpad. I guess I don't know quite enough about the relationship between the teams or the difference between the builds; I can try copying or linking the bug over to that side if you think that would be the thing to do.

> >It's possible, then, that those lines were an artifact of my having opened that instance of firefox through gdb?
> That's my guess. The IRC log I pasted has instructions on disabling this
> behavior in gdb (and/or just try continuing, I think it's "c" in gdb).

For now, I have stopped bothering running most test sessions through gdb. So far, I do not think I have seen a hang in safe mode again in this limited testing. So it's even more plausible that this is a problem that does not come up (or does not come up the same way) in safe mode, but it only appeared to in that test because a similar hang condition resulted from the interaction of the session with gdb.

> A sure-fire way to run an isolated Firefox instance of a specific version is:
> path/to/firefox -no-remote -profile /absolute/path/to/an/empty/dir

I had been launching by clicking the application in the directory through Nautilus, but that may have been just issuing a shell command to launch whatever the "firefox" path was; I'm not sure. Thanks for pointing me in this direction, though. I have started doing some runs with the Mozilla builds and an empty profile folder, but I haven't been able to put enough time in quite yet to tell if there's a difference.

> > http://mozilla.github.io/mozregression/
> Did you succeed in reproducing the problem at least in one build launched
> with mozregression? It downloads/runs the mozilla.org builds.

I have not reproduced it there yet, although I'm not sure I started at quite the right point in the nightlies, as I thought I was armed only with the version number (40.0). I need to find a bigger block of time to try to bisect.

________
New Updates:

I am working on a grid/tree of behavior based on the build, whether it's in safe mode, whether my normal profile or an empty one is attached, etc. Again, it's slow going because there is an uncertain time to hang, but hopefully I'll be able to find some more time.

I can report this odd condition when using the standard Ubuntu release build of firefox from the PPA:

- In safe mode, using my normal profile, I have not yet caused a hang (have tried for 10-15 mins or so).
- Not in safe mode, but using an empty profile, I have also not yet caused a hang (again tried 10-15 mins).
- Not in safe mode, using my normal profile, but with ALL add-ons manually disabled, I have gotten the hang very quickly several times (e.g., https://crash-stats.mozilla.com/report/index/bp-80eeb501-fd28-4c81-8aa5-ac4562150817 ).

I'm currently testing with a normal profile, non-safe-mode, add-ons disabled manually, and HW accel also disabled. My next steps in that area may be Pipelight, since it's still active in safe mode.

Assuming that I cannot easily recreate the bug in nightlies or with an empty profile, I may just "reset firefox", though I'm loath to do that given the amount of individual customization I've put into some extensions (the settings, not the code itself).

In the meantime, is there a good summary somewhere of differences between normal and safe mode that are *not* the disabling of extensions? If I'm continuing to get the hangs with all extensions disabled but not getting them in safe mode, and all the remaining differences between the modes are things that show up in about:config... maybe I can use a config file diff to make my normal mode more and more like safe mode until it stops hanging, then see where the culprit was?
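The config-diff idea at the end of this comment could be tried by comparing the prefs.js of the normal profile against that of a profile that does not hang (for example the empty one). The sketch below assumes the usual one-pref-per-line form user_pref("name", value); and that the relevant differences are visible in prefs at all, which this thread does not establish for safe mode. The helper names are hypothetical.

```python
import re

# Matches lines such as: user_pref("layers.acceleration.force-enabled", true);
PREF_LINE = re.compile(r'^user_pref\("([^"]+)",\s*(.*)\);\s*$')


def parse_prefs(text):
    """Map pref name to its raw value text; lines that do not match are ignored."""
    prefs = {}
    for line in text.splitlines():
        match = PREF_LINE.match(line.strip())
        if match:
            prefs[match.group(1)] = match.group(2)
    return prefs


def diff_prefs(normal, reference):
    """Return {name: (normal_value, reference_value)} for prefs that differ.

    A value of None means the pref is absent from that profile.
    """
    names = sorted(set(normal) | set(reference))
    return {
        name: (normal.get(name), reference.get(name))
        for name in names
        if normal.get(name) != reference.get(name)
    }


if __name__ == "__main__":
    # Inline sample text stands in for the contents of two prefs.js files.
    normal = parse_prefs(
        'user_pref("layers.acceleration.force-enabled", true);\n'
        'user_pref("browser.startup.page", 3);\n'
    )
    clean = parse_prefs('user_pref("browser.startup.page", 3);\n')
    for name, (mine, theirs) in diff_prefs(normal, clean).items():
        print(name, mine, theirs)
```

Flipping the differing prefs one group at a time in a copy of the normal profile would then narrow down which setting is involved, subject to the same indeterminate time to hang.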

Adam Colligan

Reporter

Comment 11

•

10 years ago

Here is a crash of the Mozilla build (not the Ubuntu build). In this session, I did not specify an empty profile, but I also did not carry over any extensions (which were all deactivated in my main profile anyway). This session had hardware accel ENABLED: https://crash-stats.mozilla.com/report/index/7f99e53d-5034-4b23-be7e-f3c982150817 . So I went back and tried using the profile that the Mozilla version had created by default in an empty folder. I let it totally replace the prefs.js again through a session, and I just made one change: enable hardware acceleration. I re-opened it with the same profile and, lo and behold, I got the hang: https://crash-stats.mozilla.com/report/index/bp-891b58ac-45b4-464a-9cc9-0b5052150817 . I assume this is decently good evidence that HW accel may be at the heart of this -- not sure if that's conclusive enough that I should change the title of this bug?

Flags: needinfo?(adam)

Adam Colligan

Reporter

Comment 12

•

10 years ago

Looks like I may have spoken too soon there -- irritating thing about the indeterminate time to hang. Here is what I think is the usual hang with layers.acceleration.force-enabled set to FALSE. Context: Ubuntu build, normal profile, some of my add-ons re-enabled. Report: https://crash-stats.mozilla.com/report/index/bp-7e7c1e7b-a5e7-47c3-b025-b69b42150817 . The option media.hardware-video-decoding.enabled was still set to "true", though -- I don't know if that would make a difference or not. I'll start running with both at "false" from now on.

Wayne Mery (:wsmwk)

Updated

•

10 years ago

Crash Signature: 9faaf25e-1bd0-4ef1-a10c-ad6772150812 → [@ libpthread-2.21.so@0xcda0 ]

Keywords: crash

Abe - QA (:Abe_LV)

Updated

•

10 years ago

Component: Untriaged → General

Nicolas Silva [:nical]

Comment 13

•

10 years ago

(In reply to Andrew Comminos [:acomminos] from comment #9)
> Either way, this may make a good case for killing XInitThreads in gecko (as
> mentioned in bug 1189132).

Interesting. Could you open a new bug about killing XInitThreads with a summary of the issues and possible solutions? All I remember is that before we added XInitThreads, OMTC would simply crash at startup with some assertion inside x11 or gtk or something like that. But that was more than three years ago.

Blocks: ogl-linux-beta

Andrew Comminos [:acomminos]

Comment 14

•

10 years ago

(In reply to Nicolas Silva [:nical] from comment #13)
> (In reply to Andrew Comminos [:acomminos] from comment #9)
> > Either way, this may make a good case for killing XInitThreads in gecko (as
> > mentioned in bug 1189132).
>
> Interesting. Could you open a new bug about killing XInitThreads with a
> summary of the issues and possible solutions? All I remember is that before
> we added XInitThreads, OMTC would simply crash at startup with some assertion
> inside x11 or gtk or something like that. But that was more than three years
> ago.

Unfortunately, my patches in bug 1195359 that replaced usage of XInitThreads with separate display connections for the widget code and the compositor only 'fixed' the hang in bug 1189132 by not flushing an XUnmapWindow call when the compositor is shut down; the hang still occurs if we flush the X11 client queue on the widget's display connection. It's likely not worth it to make the switch considering the hangs still occur.

Abe - QA (:Abe_LV)

Comment 15

•

10 years ago

Although I could not reproduce this issue on my end, given the number of comments from developers and the bug it blocks (594876), I am changing the status from Unconfirmed to New.

Status: UNCONFIRMED → NEW

Ever confirmed: true

Wayne Mery (:wsmwk)

Updated

•

9 years ago

Component: General → Graphics

Product: Firefox → Core

Milan Sreckovic [:milan] (needinfo for best results)

Updated

•

9 years ago

Priority: -- → P3

Whiteboard: [gfx-noted]

Sylvestre Ledru [:Sylvestre]

Comment 16

•

7 years ago

Closing because no crashes have been reported for 12 weeks.

Status: NEW → RESOLVED

Closed: 7 years ago

Resolution: --- → WONTFIX

Sylvestre Ledru [:Sylvestre]

Comment 17

•

7 years ago

Closing because no crashes have been reported for 12 weeks.