Bug 738709 - Firefox 14 alpha1 hangs on this page www.cyscape.com/showbrow.asp
Status: RESOLVED FIXED
Whiteboard: [testday-20120323] [qa+]
Keywords: regression
Product: Core
Classification: Components
Component: Widget
Version: 14 Branch
Platform: x86 Mac OS X
Importance: -- normal
Target Milestone: mozilla14
Assigned To: Mike Hommey [:glandium]
URL: http://www.cyscape.com/showbrow.asp
Duplicates: 741268 741603
Blocks: 737084
Reported: 2012-03-23 10:54 PDT by Bader Zaidan
Modified: 2012-06-21 08:31 PDT

Attachments
a diff of the source when using FF and IE UAs (26.24 KB, text/plain)
2012-03-29 11:52 PDT, Alex Keybl [:akeybl]
no flags
Hang Report from 10.7 (279.34 KB, text/plain)
2012-03-29 16:37 PDT, Marcia Knous [:marcia - use ni]
no flags
hang backtrace (4.13 KB, text/plain)
2012-03-30 08:58 PDT, Benoit Girard (:BenWa)
no flags
"call (void) pss()" results (Java stack) (9.41 KB, text/plain)
2012-03-30 09:07 PDT, Benoit Girard (:BenWa)
no flags
Don't call pthread_atfork from jemalloc on OSX, the zone allocator already registers functions for pre/post-fork (1.07 KB, patch)
2012-04-04 00:05 PDT, Mike Hommey [:glandium]
justin.lebar+bug: review+

Description Bader Zaidan 2012-03-23 10:54:34 PDT
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:14.0) Gecko/20120323 Firefox/14.0a1
Build ID: 20120323031214

Steps to reproduce:

Visited www.cyscape.com/showbrow.asp

Actual results:

Firefox 14a1 hangs on this page under Mac OS X Lion and the CPU is at 99%.


Expected results:

The page should have displayed the results of a browser test.
Comment 1 Bader Zaidan 2012-03-23 11:00:02 PDT
This bug is only present in Firefox Nightly 14 alpha 1.

The CPU is at 99% and at 89 degrees Celsius.
After I force-close Firefox, I still have to stop the process in Activity Monitor.
Comment 2 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-03-23 11:02:13 PDT
Juan was able to reproduce this on Nightly on Mac OSX Lion.
I was unable to reproduce this on Windows 7 and Fedora 16.
It would appear this bug is restricted to Nightly Mac builds.
Comment 3 juan becerra [:juanb] 2012-03-23 11:06:36 PDT
I was able to reproduce this on the nightly (Fx14), but disabling the Java Applet Plugin allows the page to finish loading, and things then behave just as in Fx10, 11 or 12.
Comment 4 Bader Zaidan 2012-03-23 11:20:49 PDT
Further tests show that it only hangs under Mac OS X Lion in Firefox Nightly 14 alpha 1 when Java is enabled.
If you change the browser version or disable Java, it will not hang.
Comment 5 Alex Keybl [:akeybl] 2012-03-23 11:22:45 PDT
(In reply to LightWaveX from comment #4)
> Further tests show that it only hangs under Mac OS X Lion in Firefox
> Nightly 14 alpha 1 when Java is enabled.
> If you change the browser version or disable Java, it will not hang.

Sending over to Core::Widget - if changing the UA has an effect, can we diff the source we get depending on the UA to see what specifically changes?
Comment 6 Bader Zaidan 2012-03-23 13:19:36 PDT
First Bad: 2012-03-23, Last Good: 2012-03-22

Pushlog: http://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2012-03-22&enddate=2012-03-23
Comment 7 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-03-23 13:20:45 PDT
Thanks LightWaveX
Comment 8 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-03-23 13:42:05 PDT
Based on the changeset:
http://hg.mozilla.org/mozilla-central/rev/ab2ff3b5611f

Appears to be a regression from:
Bug 736752 - Compartment mismatch in JetPack 'test-content-proxy.testTypedArrays', r=bholley
Comment 9 Bader Zaidan 2012-03-24 02:22:58 PDT
> if changing the UA has an effect, can we diff
> the source we get depending on the UA to see what specifically changes?

I attempted to change the useragent but the 'User Agent Switcher' addon is not supported.
Comment 10 Bader Zaidan 2012-03-24 02:33:07 PDT
I successfully installed UA switcher.

If you set the UA to Internet Explorer 6, 7 or 8, the page does not redirect after the test like it is supposed to, but the browser does not hang, crash or display errors.

If you set the user agent to iPhone 3.0, the hang occurs.
Comment 11 Alex Keybl [:akeybl] 2012-03-29 11:50:19 PDT
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #8)
> Based on the changeset:
> http://hg.mozilla.org/mozilla-central/rev/ab2ff3b5611f
> 
> Appears to be a regression from:
> Bug 736752 - Compartment mismatch in JetPack
> 'test-content-proxy.testTypedArrays', r=bholley

roc - do you agree with Anthony's hypothesis here?
Comment 12 Alex Keybl [:akeybl] 2012-03-29 11:52:07 PDT
Created attachment 610629 [details]
a diff of the source when using FF and IE UAs
Comment 13 Robert O'Callahan (:roc) (email my personal email if necessary) 2012-03-29 16:23:35 PDT
(In reply to Alex Keybl [:akeybl] from comment #11)
> roc - do you agree with Anthony's hypothesis here?

It doesn't look like the page touches canvas at all, which is all that patch touches.

Nothing in that regression range jumps out at me, though.

It sounds like the problem is related to the Java plugin (comment #4). Josh, Stephen, can you reproduce this?
Comment 14 Marcia Knous [:marcia - use ni] 2012-03-29 16:37:50 PDT
Created attachment 610750 [details]
Hang Report from 10.7

Attaching the hang report from that site in case it is useful.
Comment 15 Robert O'Callahan (:roc) (email my personal email if necessary) 2012-03-29 16:44:30 PDT
Looks like we're just hanging in the Java plugin. No idea why that would have regressed.
Comment 16 Benoit Girard (:BenWa) 2012-03-30 08:58:00 PDT
Created attachment 610901 [details]
hang backtrace

I just got a hang and it appears to be the same one as here. i.e. mine also happens in SetWindow. The concerning part is that it appears the plugin calls 'Java_java_lang_UNIXProcess_forkAndExec' which I don't think you ever want to do within firefox, right?
Comment 17 Benoit Girard (:BenWa) 2012-03-30 09:07:36 PDT
Created attachment 610907 [details]
"call (void) pss()" results (Java stack)
Comment 18 Steven Michaud [:smichaud] (Retired) 2012-03-30 09:14:52 PDT
> The concerning part is that it appears the plugin calls 
> 'Java_java_lang_UNIXProcess_forkAndExec' which I don't think you ever want to do
> within firefox, right?

Nope, that's alright.  The Java plugin always runs in-process on the Mac.  (Its parent process does, that is.  But the Java plugin parent process also spawns its own child process.)

By the way, please post the result of "thread apply all bt" (before you've done "call (void) pss()").
Comment 19 Steven Michaud [:smichaud] (Retired) 2012-03-30 09:21:50 PDT
A quick look at the output of pss() suggests this isn't a Java hang ... so it may be our fault.

I've got a bunch of topcrashers on my plate, so I won't really be able to dig into this until next week.  But I'll try to reproduce later today.

I'm taking this bug, now -- I'm the resident Java guy.  But I won't cry if someone else figures it out before I have a chance to get to it.
Comment 20 Benoit Girard (:BenWa) 2012-03-30 10:14:35 PDT
I can reproduce this crash 100% with a nightly-profiling build but not my local build. Someone else should try it with that build.

Running 'Java Applet Plug-in 14.0.3'.
Comment 21 Steven Michaud [:smichaud] (Retired) 2012-03-30 10:57:10 PDT
Benoit, what version of OS X are you testing with?  Is it Lion?
Comment 22 Benoit Girard (:BenWa) 2012-03-30 11:06:36 PDT
10.7.2, so not the latest (haven't rebooted in months).
Comment 23 Alex Keybl [:akeybl] 2012-03-30 13:37:44 PDT
Bug 739259 is likely a dupe of this and has a regression window.
Comment 24 d.a. 2012-03-30 14:13:33 PDT
Looks like a dupe of bug 739259. It happens with Snow Leopard too, 10.6.8. 

When you open a page with a Java applet, Firefox creates a child process of the firefox binary, but no plugin process for the Java Plugin. If you force quit the child process, Firefox recreates it. The only way to properly quit Firefox is to force quit the parent process and then the child process.
Comment 25 Josh Matthews [:jdm] (on vacation until Dec 5) 2012-03-31 03:18:42 PDT
On mozilla.dev.platform, another way of reproducing this hang (same regression range) was provided:

On 12-03-30 12:21 PM, Patrick Brunschwig wrote:
> I'm using fork() with js-ctypes for the new subprocess implementation.
> This worked fine until a few days ago. On Daily (Mac OS X) it
> seems that the child process is consuming a full CPU and does not continue.
> 
> Is it forbidden to use fork(), am I doing something wrong, or is this a
> Gecko bug?
> 
> Here is the relevant code:
> 
>     var fork = <js-ctypes fork()>;
>     var execve = <js-ctypes execve()>;
>     pid = fork();
>     if (pid > 0) { // parent
>         ...
>         return pid;
>     } else if (pid == 0) { // child
>         ...
>         execve(command, _args, _envp);
>         exit(1);
>     } else { // fork failed (pid < 0)
>         ...
>     }
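
For comparison, the same fork-then-exec pattern can be sketched in plain C. This is not the js-ctypes code from the newsgroup post; the helper name spawn_and_wait is hypothetical, and the choice of _exit(127) when execve fails is an assumption that follows the usual shell convention.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `path` with the given argv/envp and return the
 * child's exit status, or -1 if fork/waitpid failed or the child did not
 * exit normally. */
int spawn_and_wait(const char *path, char *const argv[], char *const envp[])
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;                 /* fork failed */
    }
    if (pid == 0) {
        /* Child: go straight to exec. In a multithreaded parent, anything
         * that takes locks (malloc included) is risky between fork and exec. */
        execve(path, argv, envp);
        _exit(127);                /* only reached if execve failed */
    }

    /* Parent: wait for the child, retrying if interrupted by a signal. */
    int status;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) {
            return -1;
        }
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with "/bin/sh" and argv = { "sh", "-c", "exit 3", NULL } is a quick way to check that the status comes back as expected. The hang described above would show up as the child spinning before it ever reaches execve.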
Comment 26 Josh Matthews [:jdm] (on vacation until Dec 5) 2012-03-31 03:22:25 PDT
The common element here is forking, which leads me to strongly suspect bug 737084 (Do pthread_atfork in jemalloc on mac and android).
Comment 27 Mike Hommey [:glandium] 2012-03-31 06:32:42 PDT
(In reply to Josh Matthews [:jdm] from comment #26)
> The common element here is forking, which leads me to strongly suspect bug
> 737084 (Do pthread_atfork in jemalloc on mac and android).

I agree. Disabling pthread_atfork on Mac should fix it, and break nothing else, because I have figured out since bug 737084 that the system zone allocator on Mac was calling the locking functions anyway.
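
For readers unfamiliar with pthread_atfork, here is a minimal sketch of the pattern under discussion: take a lock in the prepare handler and release it in both the parent and child handlers. This is not the jemalloc code. demo_lock stands in for an allocator lock, and every name in the block is hypothetical.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for an allocator lock (assumption, not jemalloc's real lock). */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static int prefork_calls, postfork_parent_calls, postfork_child_calls;

/* Runs in the forking thread just before fork(): take the lock so no other
 * thread can hold it while the address space is copied. */
static void demo_prefork(void)
{
    pthread_mutex_lock(&demo_lock);
    prefork_calls++;
}

/* Runs in the parent after fork(): release the lock again. */
static void demo_postfork_parent(void)
{
    postfork_parent_calls++;
    pthread_mutex_unlock(&demo_lock);
}

/* Runs in the child after fork(): release the child's copy of the lock. */
static void demo_postfork_child(void)
{
    postfork_child_calls++;
    pthread_mutex_unlock(&demo_lock);
}

/* Register the three handlers once; returns 0 on success. */
int demo_register(void)
{
    return pthread_atfork(demo_prefork, demo_postfork_parent,
                          demo_postfork_child);
}

/* Fork once. The child exits 0 if its handler ran and the lock is usable,
 * 1 otherwise. Returns the child's exit status, or -1 on error. */
int demo_fork_once(void)
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        int ok = postfork_child_calls == 1 &&
                 pthread_mutex_trylock(&demo_lock) == 0;
        _exit(ok ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If some other mechanism already takes and releases the same lock around fork, as the zone allocator is said to do here, registering handlers like these as well means the lock is taken twice in the prepare phase. With a non-recursive lock that is one way to get stuck. This illustrates the general hazard and is not a confirmed diagnosis of this bug.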
Comment 28 Mike Hommey [:glandium] 2012-03-31 07:32:30 PDT
I do wonder why we don't catch that in the test suite, though...
Comment 29 Steven Michaud [:smichaud] (Retired) 2012-04-02 08:39:26 PDT
*** Bug 741268 has been marked as a duplicate of this bug. ***
Comment 30 Steven Michaud [:smichaud] (Retired) 2012-04-02 16:08:13 PDT
*** Bug 741603 has been marked as a duplicate of this bug. ***
Comment 31 Steven Michaud [:smichaud] (Retired) 2012-04-02 16:09:49 PDT
Bug 741603 has a stack which may indicate that Java needn't be involved.
Comment 32 Mike Hommey [:glandium] 2012-04-04 00:05:58 PDT
Created attachment 612118 [details] [diff] [review]
Don't call pthread_atfork from jemalloc on OSX, the zone allocator already registers functions for pre/post-fork
Comment 33 Justin Lebar (not reading bugmail) 2012-04-04 06:50:09 PDT
(In reply to Mike Hommey [:glandium] from comment #32)
> Created attachment 612118 [details] [diff] [review]
> Don't call pthread_atfork from jemalloc on OSX, the zone allocator already
> registers functions for pre/post-fork

Mike and I talked about this on IRC.  Although the zone allocator does some locking pre- and post-fork, it's not doing the equivalent of _malloc_prefork's locking.  So someone could come in during fork and allocate directly through jemalloc.

But this patch takes us to where we were before bug 737084, and that didn't randomly deadlock, so it's virtuous.  jemalloc2 will lock correctly here.
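
The gap described above can be illustrated with a small sketch: if another thread holds a lock at the moment of fork and no prepare handler has taken it first, the child inherits a locked mutex whose owner does not exist in the child. This is a generic demonstration, not jemalloc or zone allocator code. heap_lock and the other names are hypothetical.

```c
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in for an allocator lock that no fork handler protects. */
static pthread_mutex_t heap_lock = PTHREAD_MUTEX_INITIALIZER;

/* Handshake so the main thread forks only while heap_lock is held. */
static pthread_mutex_t gate = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t gate_cv = PTHREAD_COND_INITIALIZER;
static int holder_ready, holder_may_exit;

/* Second thread: grab heap_lock and hold it until told to let go. */
static void *holder(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&heap_lock);
    pthread_mutex_lock(&gate);
    holder_ready = 1;
    pthread_cond_broadcast(&gate_cv);
    while (!holder_may_exit) {
        pthread_cond_wait(&gate_cv, &gate);
    }
    pthread_mutex_unlock(&gate);
    pthread_mutex_unlock(&heap_lock);
    return NULL;
}

/* Returns 1 if the forked child found heap_lock already held, 0 if it
 * could take the lock, -1 on error. */
int child_sees_lock_held(void)
{
    pthread_t t;
    holder_ready = 0;
    holder_may_exit = 0;
    if (pthread_create(&t, NULL, holder, NULL) != 0) {
        return -1;
    }

    /* Wait until the holder thread owns heap_lock. */
    pthread_mutex_lock(&gate);
    while (!holder_ready) {
        pthread_cond_wait(&gate_cv, &gate);
    }
    pthread_mutex_unlock(&gate);

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: only this thread exists here, yet the lock looks taken.
         * A blocking lock or a spin lock would wait forever. */
        _exit(pthread_mutex_trylock(&heap_lock) == EBUSY ? 1 : 0);
    }

    /* Parent: release the holder thread and clean up. */
    pthread_mutex_lock(&gate);
    holder_may_exit = 1;
    pthread_cond_broadcast(&gate_cv);
    pthread_mutex_unlock(&gate);
    pthread_join(t, NULL);

    if (pid < 0) {
        return -1;
    }
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The sketch uses trylock in the child so that it reports the problem instead of hanging. With a spin lock the same situation would look like a child process pinned at full CPU.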
Comment 35 Steven Michaud [:smichaud] (Retired) 2012-04-05 09:18:54 PDT
I'm able to reproduce this bug (using the STR in comment #0) with today's mozilla-central nightly on OS X 10.6.8, with both Java For Mac OS X 10.6 Update 6 and the recently released Update 7.

And I've confirmed that the STR from comment #0 no longer works (on OS X 10.6.8 with Update 7) with the patch from comment #32 applied.

Later I'll test on Lion.
Comment 36 Justin Lebar (not reading bugmail) 2012-04-05 09:23:43 PDT
Just to be clear, this hasn't landed on m-c yet, so the fact that Steven can reproduce isn't a problem.  :)
Comment 38 Virgil Dicu [:virgil] [QA] 2012-06-21 08:31:23 PDT
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:14.0) Gecko/20100101 Firefox/14.0

No hang on Firefox 14 beta 8 on Mac OS 10.7 Lion here and CPU usage normal, but couldn't reproduce the problem with a build from the report date either. I'll leave it for someone who could reproduce initially to verify.