Closed Bug 494522 Opened 15 years ago Closed 15 years ago

Firefox 3.5pre frequent hang upon closing FF, with some message on STDOUT/STDERR about "bRENDER"

Categories

(Firefox :: General, defect)

x86
Linux
defect
Not set
critical

Tracking

()

RESOLVED INVALID

People

(Reporter: rickstockton, Unassigned)

Details

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1pre) Gecko/20090522 Shiretoko/3.5pre
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1pre) Gecko/20090522 Shiretoko/3.5pre

Started happening following upgrade to Mandriva 2009.1 (latest: Kernel 2.6.29.1+, GTK+ 2.16.1). All extensions are disabled.

This hang on Close happens with both the menu-based close (File-->Quit) and the Titlebar's "Close Window" decoration (the "X"). The GUI vanishes, but frequently (60-80% of the time) firefox-bin is left running. It appears to consume no CPU at all, so this is not a "churning away" type of hang. The task responds instantly to SIGHUP, SIGTERM, and SIGKILL, and sending one of those has been my work-around. (I didn't try SIGQUIT, but will SWAG that it works too.) Interestingly, menuitem File-->Restart never suffers the hang (at least, I've not seen it do so).

Now for the "bRENDER" part: To help isolate this, I modified my run-mozilla.sh to echo the environment WITHOUT being in debug mode. Here's what I get:

[rick@localhost firefox]$ ./firefox       
+ moz_libdir=/usr/local/lib/firefox-3.5pre
+ found=0                                 
+ progname=./firefox                      
++ dirname ./firefox                      
+ curdir=.                                
++ basename ./firefox                     
+ progbase=firefox                        
+ run_moz=./run-mozilla.sh                
+ test -x ./run-mozilla.sh                
+ dist_bin=.                              
+ found=1                                 
+ '[' 1 = 0 ']'                           
+ script_args=                            
+ debugging=0                             
+ MOZILLA_BIN=firefox-bin                 
+ '[' linux-gnu = beos ']'                
+ pass_arg_count=0                        
+ '[' 0 -gt 0 ']'                         
+ '[' 0 = 1 ']'                           
+ ./run-mozilla.sh ./firefox-bin          
MOZILLA_FIVE_HOME=.                       
  LD_LIBRARY_PATH=.:./plugins:.           
DISPLAY=:0.0                              
DYLD_LIBRARY_PATH=.:.                     
     LIBRARY_PATH=.:./components:.        
       SHLIB_PATH=.:.                     
          LIBPATH=.:.                     
       ADDON_PATH=.                       
      MOZ_PROGRAM=./firefox-bin           
      MOZ_TOOLKIT=
        moz_debug=0
     moz_debugger=

and then, when it hangs during a close attempt, I get this as the next line (starting in column one):

bRENDER

Just once, it cleared itself after several minutes by adding on this:

bRENDER+ exitcode=130
+ exit 130

But all the other times, I've had to kill it off with a pkill, leading (of course) to something like this:

bRENDERHangup

(that's the case for sending SIGHUP).
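
For reading the two endings above: under the usual shell convention, an exit status above 128 means "128 plus a signal number", so exitcode=130 would correspond to signal 2 (SIGINT), and "Hangup" is the shell's report for SIGHUP (status 129). A minimal sketch of that decoding follows; the function name signal_from_status is made up for illustration and is not part of run-mozilla.sh.

```shell
# Print the signal name implied by a shell exit status above 128.
# Assumes the common convention: status = 128 + signal number.
signal_from_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # kill -l <number> prints the name of that signal.
        kill -l "$((status - 128))"
    else
        # Not a "killed by signal" status.
        return 1
    fi
}
```

For example, "signal_from_status 130" names the signal behind the one self-resolved exit. This only decodes the number; it says nothing about who sent the signal.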

I tried using MXR to search for this "bRENDER", but I didn't find it. Maybe it's coming from a system-provided Cairo module? The OS upgrade provides libcairo2 at rev level 1.8.6-3mdv2009.1

Reproducible: Sometimes

Steps to Reproduce:
1) Perform any "quit Firefox" or "Close Window" action. Note the addition of a new line "bRENDER" to STDOUT/STDERR when the hang occurs.

2) restart Firefox, and see the "Firefox is already running...." message.

3) Verify firefox-bin is hung with "ps -ef |grep firefox" (and see that the CPU seconds don't increase, even over several minutes).

4) Work around by "pkill -1 firefox". It goes away.
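
Steps 3 and 4 can be roughly automated. The following is a minimal sketch, assuming a Linux /proc filesystem; the function names (cpu_ticks, is_idle, hup_if_idle) and the default 5-second sampling interval are made up for illustration. It checks that a PID's CPU tick count does not move over the interval and only then sends SIGHUP.

```shell
# Sum of user + system CPU ticks for a PID, read from Linux /proc.
cpu_ticks() {
    stat=$(cat "/proc/$1/stat" 2>/dev/null) || return 2
    # Drop everything up to the closing paren of the command name,
    # so names containing spaces cannot shift the fields.
    rest=${stat##*\) }
    set -- $rest
    # utime and stime are fields 14 and 15 of the full line,
    # i.e. positions 12 and 13 once pid and (comm) are removed.
    echo "$(( ${12} + ${13} ))"
}

# Succeed if the PID used no CPU ticks over the sampling interval.
is_idle() {
    before=$(cpu_ticks "$1") || return 2
    sleep "${2:-5}"
    after=$(cpu_ticks "$1") || return 2
    [ "$before" -eq "$after" ]
}

# Send SIGHUP only when the process exists and looks idle.
hup_if_idle() {
    if is_idle "$1" "${2:-5}"; then
        kill -HUP "$1"
    else
        return 1
    fi
}
```

One possible use after the GUI has vanished would be to loop over the output of "pgrep -x firefox-bin" and call hup_if_idle on each PID. An idle process is not necessarily a hung one, so this is only safe to run once the window is already gone.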


Expected Results:  
Firefox should exit cleanly on every close. Instead, a successful close occurs only about 1/3 of the time. BTW, I got that self-resolution with RC=130 only once. Every other time, I must resort to "kill" or "pkill".

No matter what signal I send, I can't provoke a Talkback Crash. I'm drafting this as "critical"-- there's no dataloss, but it's nasty to run a kill script after the GUI disappears properly. (And you have to invoke the script by hand, unless you do some REAL magic.)

It might just be Mandriva. But, if Firefox is depending on OLD versions of GTK, or Cairo, or Kernel, or whatever-- we'll probably need to either include the dependent code right in the product, or fix the incompatibility breakage.

It could also be "just another rendering bug", there's lots of recent work going on there.
If someone can advise me how to provoke a useful crash by sending a signal, or how to otherwise "debug" my shutdown behavior, I'm all ears. I've seen old doco which implies that Cairo is pretty much built-in, so I'm guessing that the issue lies elsewhere (maybe between Cairo and X11 and the NVidia video driver). BTW, I've got, per Mandriva 2009.1,

"a set of X.org packages most similar to the upstream 'X.org 7.4' release, particularly including version 1.6.1 of the X.org X server."

plus NVidia proprietary driver, beta version 185.18.10.

But first thing, I suppose, is a stack trace. How can I provoke one?
i'd probably use grep -lr bRENDER /

that'd give you hints.
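
A somewhat narrower variant of that search, as a sketch only: the function names search_tree and mapped_files are made up for illustration, and mapped_files assumes Linux /proc. The -s flag keeps grep quiet about unreadable files and dangling links, and -F treats the pattern as a literal string.

```shell
# List files under the given directories that contain a fixed string.
# -r recurse, -l names only, -s hide unreadable-file errors, -F literal.
search_tree() {
    pattern=$1
    shift
    grep -rlsF -- "$pattern" "$@"
}

# List the distinct files mapped into a running process (Linux /proc).
# Column 6 of /proc/PID/maps is the backing path, when there is one.
mapped_files() {
    awk '$6 ~ /^\// { print $6 }' "/proc/$1/maps" | sort -u
}

# Possible usage (not run here):
#   search_tree bRENDER /usr/lib /usr/local/lib
#   mapped_files "$pid" | xargs grep -lsF RENDER
```

Restricting the search to the libraries actually mapped into the hung process should cut the noise considerably compared with searching from /. It may also be worth searching for a shorter fragment of the text, in case the message is assembled from more than one string.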

there is an x11 logging proxy which i'd probably try to find (i need to write myself a note so i don't have to hunt for it), that should at least give you some idea what's going on.

you could run firefox w/ --sync, in which case if you had symbols (build firefox yourself or get debugging symbols from your vendor) you could use gdb to find out what it was doing.
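
For the stack trace itself, one option is to attach gdb to the already-hung process rather than trying to provoke a crash. A minimal sketch, assuming gdb is installed and you are allowed to attach to the process; the function name dump_stacks is made up for illustration, and the output is only really readable with debugging symbols available.

```shell
# Attach gdb to a running PID and print every thread's stack.
# In batch mode gdb exits after the commands, releasing the process.
dump_stacks() {
    pid=$1
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "no such process (or no permission): $pid" >&2
        return 1
    fi
    if ! command -v gdb >/dev/null 2>&1; then
        echo "gdb not found in PATH" >&2
        return 1
    fi
    gdb -p "$pid" -batch -ex 'thread apply all bt'
}
```

Run it against the PID of the lingering firefox-bin after the window has disappeared, and redirect the output to a file for attaching to the bug.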

if you don't absolutely need nvidia drivers and it's 100% reproducible, then you should try not using the nvidia drivers to see if it happens.
Summary: frequent hang upon closing FF, with some message on STDOUT/STDERR about "bRENDER" → Firefox 3.5pre frequent hang upon closing FF, with some message on STDOUT/STDERR about "bRENDER"
Invalid; problem not in Mozilla-written code.

I've been very busy and away from further analysis on this bug, but it no longer happens to me. ever. :))

Fix was NOT within Firefox, the problem was definitely within the Mandriva Distro. It's now fixed via a Mandriva update, *probably* a new X11 Server package which I installed (along with a long list of other 'official update' RPMs) about 3 weeks ago.

Today, just to be absolutely sure, I backed down to the same nightly with which I opened this bug (2009-05-22). That old Build COULD NOT be made to provoke the problem in my up-to-date Mandriva 2009.1 "Spring" installation.

NVidia was not to blame. I strongly suspect that the Resolution was provided within a Mandriva-built "Official Update" to X11 Server, although I haven't tried to identify the specific fix within that newer RPM. (Build error; backport from newer X11 'fix' code; or Mandriva-written fix being sent upstream? I didn't research this.)

Problem has vanished. There are several reasons why no one else saw it:
(1) Distro is MUCH less widely used than others-- Ubuntu, Fedora, SUSE, etc.

(2) Distro is increasingly used in 64-bit version, maybe that never even exposed the problem.

(3) Official Update package (probably X11 Server) fixing the problem was created and released after release of Mandriva 2009.1 "Spring". Not much window of opportunity.

(4) And most of all: Mandriva provides their own builds of FF, TB, etc. as Mandriva RPMs, currently back at 3.0 level. Hardly ANY Mandriva users are running with Mozilla-built nightlies and RCs.

I never did succeed with a search for "bRENDER", but my patience was perhaps too limited. (The positive hit may have been lost within gazillions of messages about invalid links.) But more likely, I'll SWAG that the "b" was written out separately and before the "RENDER" string. But your suggestion to attack via gdb taught me a lot, and I eventually would have tracked it down that way. Thanks for the help!
Status: UNCONFIRMED → RESOLVED
Closed: 15 years ago
Resolution: --- → INVALID