Closed Bug 826909 Opened 12 years ago Closed 9 years ago

FF hangs on signed Java applet load

Categories

(Core Graveyard :: Plug-ins, defect)

17 Branch
x86
Windows 7
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: glahti, Unassigned)

Details

(Keywords: hang, regression)

Attachments

(1 file, 4 obsolete files)

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20100101 Firefox/17.0
Build ID: 20121128204232

Steps to reproduce:

Loading a signed java applet through JNLP or embedded x-applet tag hangs on loading the .jar file.  


Actual results:

I ran into a bug with FF 17.0.1. I built a fresh Windows Pro x64 system and installed the latest FF 17.0.1 and the latest Java JRE 1.7.10. I have an application that loads a signed Java applet to do some printing on the client side. When using FF, the applet load hangs. It does not hang in IE9 or Chrome. Rolling the JRE back to 1.6.37 doesn't resolve the issue. Previous versions of FF worked (I have not verified which ones, but I had no issues up until the latest update).

Some of the details that I've been able to dissect:

The application is served by Apache 2.4.3 x64 on a separate Windows 2008 R2 server. This is the key, as a local host works fine. The server is on a different subnet and is reached through several routers, but the web server is serving pages as expected. Disabling the firewall on the server had no effect. The client is a Windows 7 x64 Pro install.

When FF kicks off the Java function to load the applet, either through JNLP or through the x-java-applet method with a directly embedded jar file, it hangs. The Java debug trace shows that it hangs on the JNLP load:

network: Created version ID: 1.7.0.10
network: Created version ID: 1.7
network: Created version ID: 2.2.4
network: Cache entry not found [url: http://192.168.15.39/patholog/ZebraPrintApplet.jnlp, version: null]
network: Cache entry not found [url: http://192.168.15.39/patholog/ZebraPrintApplet.jnlp, version: null]
network: Cache entry not found [url: http://192.168.15.39/patholog/ZebraPrintApplet.jnlp, version: null]
network: Connecting http://192.168.15.39/patholog/ZebraPrintApplet.jnlp with proxy=DIRECT
network: Connecting http://192.168.15.39:80/ with proxy=DIRECT
network: Connecting http://192.168.15.39/patholog/ZebraPrintApplet.jnlp with cookie "Patholog=vbh0t1oj69svfcb4ihi4bhq203"
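
The point where a trace like this stops can be pulled out mechanically. Below is a minimal sketch, assuming the Java console output has been saved as text. The function name `last_connect_url` is hypothetical, and the sample lines are copied from the trace above.

```python
import re

# Matches Java console lines such as:
#   network: Connecting http://host/path with proxy=DIRECT
CONNECT_RE = re.compile(r"^network: Connecting (\S+) with (.+)$")


def last_connect_url(trace_text):
    """Return (url, detail) for the last 'Connecting' line, or None."""
    last = None
    for line in trace_text.splitlines():
        match = CONNECT_RE.match(line.strip())
        if match:
            last = (match.group(1), match.group(2))
    return last


# Sample lines copied from the console trace in this report.
SAMPLE_TRACE = """\
network: Cache entry not found [url: http://192.168.15.39/patholog/ZebraPrintApplet.jnlp, version: null]
network: Connecting http://192.168.15.39/patholog/ZebraPrintApplet.jnlp with proxy=DIRECT
network: Connecting http://192.168.15.39:80/ with proxy=DIRECT
network: Connecting http://192.168.15.39/patholog/ZebraPrintApplet.jnlp with cookie "Patholog=vbh0t1oj69svfcb4ihi4bhq203"
"""

if __name__ == "__main__":
    print(last_connect_url(SAMPLE_TRACE))
```

The last "Connecting" entry is only the last request the plugin logged before output stopped. It does not by itself show whether the browser, the plugin, or the server is the one waiting.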

If I run a test on the server using FF 17.0.1 (i.e. the .jar file is local on the server and I'm running FF on the server), the applet load gets past the hang and I get the standard pop-up to load the applet .jar file. If I run IE9 or Chrome, it also gets past the hang point, loads the .jar file, and continues on to execute it.

A wireshark trace dump of the operation on the server side shows the "HTTP GET jnlp" operation works. I'm seeing two OCSP protocol responses with a responseStatus as successful (looks like two different signatures) but after that nothing.

The jar file is self-signed and on previous FF versions it loaded and I was able to get the pop-up from Java about accepting the signature for the signed jar file to run. 17.0.1 fails to even get there.

At this point, I'm stumped. I've tried turning off the certificates in the security section thinking that maybe the OCSP is barfing.

Any clues or thoughts?



Expected results:

The .jar file should have been loaded and executed, after the certificate was verified and the standard Java "accept this certificate" pop-up was shown.
Severity: normal → critical
Is it possible to attach your signed Java app, so we could test? (or a sanitized demo version just to test)
Severity: critical → normal
Flags: needinfo?(glahti)
Flags: needinfo?(glahti)
I can provide wireshark captures if required.
Severity: normal → critical
Keywords: hang
Can you test whether the problem exists in the current trunk build from http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-mozilla-central/firefox-21.0a1.en-US.win32.zip ? Just unpack it to a separate directory and run it from there.
Component: Untriaged → General
Flags: needinfo?(glahti)
Working on testing it (out on travel).
Flags: needinfo?(glahti)
I downloaded the build from the link above, still hangs.
Latest 18.0 also fails.  I get a little farther in the debug log from the java console:

security:  --- parseCommandLine converted : -Djava.net.preferIPv4Stack=true -Djavaplugin.trace=true -Djavaplugin.trace.option=basic|net|cache|security|ext|liveconnect|temp
into:
[-Djava.net.preferIPv4Stack=true, -Djavaplugin.trace=true, -Djavaplugin.trace.option=basic|net|cache|security|ext|liveconnect|temp]
basic: Added progress listener: sun.plugin.util.ProgressMonitorAdapter@cd70f7
basic: Plugin2ClassLoader.addURL parent called for http://192.168.15.39/patholog/java/patholog/dist/PathologLibS.jar
network: Cache entry not found [url: http://192.168.15.39/patholog/java/patholog/dist/PathologLibS.jar, version: null]
network: Connecting http://192.168.15.39/patholog/java/patholog/dist/PathologLibS.jar with proxy=DIRECT
network: Connecting http://192.168.15.39:80/ with proxy=DIRECT
network: Connecting http://192.168.15.39/patholog/java/patholog/dist/PathologLibS.jar with cookie "Patholog=lbpo0dubdt3l0dhdlrdlqc0q52"
basic: Loading Java Applet ...

But then it hangs.
Severity: critical → normal
Component: General → Java (Oracle)
Product: Firefox → Plugins
Version: 17 Branch → unspecified
Is it possible to provide a simple HTML testcase, please? (no PHP, no script to run on a server)
Flags: needinfo?(glahti)
Attached file test case with pure html execution (obsolete) —
Flags: needinfo?(glahti)
Still fails with 18.0.1 release.  I think this is a serious issue, I can't reliably use Firefox in any production setting.  

What's the best way to debug this?  I'd be happy to help debug, but unfamiliar with Firefox development process.  Any way to get a stack trace or something that can at least show why it's hanging?  

The latest run says this:

[-Djava.net.preferIPv4Stack=true, -Djavaplugin.trace=true, -Djavaplugin.trace.option=basic|net|cache|security|ext|liveconnect|temp]
basic: Added progress listener: sun.plugin.util.ProgressMonitorAdapter@3fc9b6
basic: Plugin2ClassLoader.addURL parent called for http://192.168.15.39/patholog/java/patholog/dist/PathologLibS.jar
network: Cache entry not found [url: http://192.168.15.39/patholog/java/patholog/dist/PathologLibS.jar, version: null]
network: Connecting http://192.168.15.39/patholog/java/patholog/dist/PathologLibS.jar with proxy=DIRECT
network: Connecting http://192.168.15.39:80/ with proxy=DIRECT
network: Connecting http://192.168.15.39/patholog/java/patholog/dist/PathologLibS.jar with cookie "Patholog=uptcu0frbju0r3pt9vg5514ch1"


It just hangs on the connection.  Maybe there's something out of the box that's wrong with the security trust certificate portion of Firefox?  I do not see this issue with Chrome or IE9.
I tried locally your testcase, but the Java applet stays blank (not sure if it's normal or not).

Try to use the tool mozregression to find a regression range if you think the issue has appeared since FF18: http://harthur.github.com/mozregression/
The 1st FF18 nightly builds started in August: mozregression --good=2012-08-01
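
For reference, the bisection idea behind a regression-range search can be sketched as below. This is not mozregression's implementation, only an illustration of the technique. `find_regression_range` and the `is_good` callback are hypothetical names, and the "first bad" date in the demonstration is invented.

```python
from datetime import date, timedelta


def find_regression_range(good, bad, is_good):
    """Bisect nightly build dates between a known-good and a known-bad date.

    is_good(day) is a caller-supplied check (hypothetical here) that returns
    True when the build from that day does not show the bug.
    Returns the (last_good, first_bad) pair of adjacent dates.
    """
    while (bad - good).days > 1:
        middle = good + timedelta(days=(bad - good).days // 2)
        if is_good(middle):
            good = middle
        else:
            bad = middle
    return good, bad


if __name__ == "__main__":
    # Pretend the bug landed on 2012-09-10 (an invented date for the demo).
    pretend_first_bad = date(2012, 9, 10)
    print(find_regression_range(
        date(2012, 8, 1),
        date(2013, 1, 15),
        lambda day: day < pretend_first_bad,
    ))
```

Each round halves the window, so a range of several months narrows to a single day in under ten manual checks. In practice each `is_good` call means running that day's build against the failing test case.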
Attached file HelloWorldApplet.html (obsolete) —
Another method into the class to just print out "HelloWorld" to the screen.
I realized that I had inadvertently left a test case in the attached jar file that prints "Hello World" to the screen, so I've attached a file HelloWorldApplet.html that calls it. Calling this method also hangs in the same fashion; I've tested this on my network setup.

At least this way you can validate if the operation occurs or not.  Still trying to narrow this down, but it would be great if there is some way of stack tracing what's going on or what Firefox is waiting on to cause the hang.
Attachment #704365 - Attachment mime type: text/plain → text/html
I tried, now I'm able to see "Hello world". But the applet loads fine, no hang in FF18.
Were you running the Apache server and browser on the same machine? I've found that configuration works fine. The problem occurs when a different server (Win 2008 R2 x64) runs Apache 2.4 across a network.

I initially wondered if it was a routing issue but since IE9 and Chrome work on the same configuration that dispelled that theory.
I'm running all locally (.jar file on my HDD). Is it possible for you to attach the .jar file on a public server so we could test online?
Finally!

More debugging, got it narrowed down to the following scenario in the following configuration:

* remote server running apache2.4
* latest FF 18.0.1 and JRE 1.7.11
* I've turned off Java "Keep temporary Files on my computer" but I've repeated it with or without the caching.


1) Execute the html page that loads the applet. Click on "allow java applet to continue" in the page (not the site info button to the left of the URL), then click yes when the security screen comes up to allow the applet to run. This works every time, even if I close the tab, reload it, or close the browser and restart it.

2) Execute the html page and click on the site info button to "always allow applet to run on this domain". When the security warning dialog for the applet pops up, use the "I accept the risk and want to run this application" check box and click run. This works every time, even if I close the tab, reload it, or close the browser and restart it.

3) If I execute the html page with the java applet always allowed to run on the domain (I left-clicked on the site info to the left of the URL and said "always allow") AND click on the check box "I accept the risk and want to run this application" AND open up the hide options and click on "Always trust content from this publisher", it works only once.

THEN IT HANGS! If I close the browser, re-open it and repeat step 3, it fails and hangs. Clearing the certificate in the Java cache AND clearing the temporary files will clear the hang. So it looks like once the certificate is cached locally, it fails to grab it. I've got the advanced settings of the Java control panel set to default except for the debugging and show console options.

I did a wireshark trace a while back on this trying to determine if it was a network issue and I noticed there was some weirdness in the OCSP call vs a good operation with IE9 and a bad hang with Firefox.  Yell if you want traces.

I don't see this issue with IE9 or Chrome, they are reliable when I have the certificate saved and have the "Always trust content from this publisher" option set.  

The server will hang up and require a restart after about 30 seconds of it hanging.  If I close the browser before then, it appears the socket gets closed by way of a crash and unloads the server operation.
We still need a regression range in Firefox and a stack trace (see https://developer.mozilla.org/docs/How_to_get_a_stacktrace_with_WinDbg).
Flags: needinfo?(glahti)
Tested 16.0.2 and it doesn't hang.  Tested latest 19.0 and it hangs.  I've re-written my application and I've got it to hang hard and consistently on a Java applet call.  The structure of the PHP/Javascript on the application has changed but the applet hasn't.  Both Chrome and IE9 have no issues.

I'll attempt a stack trace but it'll have to be end of next week.
Flags: needinfo?(glahti)
Attached file Debugger log of FF19.0 failure (obsolete) —
Not sure if this will help, but I couldn't get it to launch a Java app in the debugger. May need more instructions (other than how to get a stacktrace).
Severity: normal → critical
Status: UNCONFIRMED → NEW
Component: Java (Oracle) → Plug-ins
Ever confirmed: true
Keywords: stackwanted
Product: Plugins → Core
Hardware: x86_64 → x86
Summary: FF 17.0.1 hangs on signed Java applet load → FF hangs on signed Java applet load
Version: unspecified → 17 Branch
Keywords: regression
The stack trace in comment 20 apparently is from java.exe, not of Firefox?

We could use either:
* a local test case or a publicly accessible test URL or
* a more specific regression window per comment 11
Keywords: qawanted
Sorry about that, I'm used to debugging SystemVerilog, embedded C and Java apps not windows-based goo.  I'll see what I can get together in the next couple of days.
I've narrowed this down a bit more to an interaction with McAfee, Firefox, and java.  If I turn off the real-time scanning in McAfee (2012 version, fully-updated with latest patches), it downloads and executes the App.  One of my customers who was experiencing the same issue I think is using different anti-virus software, need to go check which version.

Note that I don't need to do this with Chrome or IE (9 or 10).  

How do I go about debugging and getting a stacktrace for this?  Same way outlined in the Firefox debugging tips?
(In reply to Gateslinger from comment #23)
> How do I go about debugging and getting a stacktrace for this?

You can find details on taking stacktraces here:
https://developer.mozilla.org/en-US/docs/How_to_get_a_stacktrace_with_WinDbg
I'm back to debugging this issue.  Using the same code, I have determined:

1) Still fails on version 22.  Has failed since version 16.  Does not fail on IE9, IE10, and the last 3 revisions of Chrome.
2) The fail involves Java (any release, including the latest 7u25) and executing a signed applet.  Self-signed or certified, it still fails.  
3) It fails on windows XP and Windows 7 pro clients.  
4) Fails with most of the last versions of PHP (5.3, 5.4.x NTS version compiled with VC9) and Apache 2.2 and 2.4.
5) Fails on a local webserver config of apache.  It will successfully hang the webserver and not allow any other processes on the webserver to run.
6) Does *NOT* fail with a local webserver using IIS.  

The #6 bothers me.

I am having issues getting a stack trace.  Using WinDBG, it breaks on an exception as soon as I launch the Java applet.  What do I need to do to get it to run to the hanging portion of the Java applet?  I am working on getting a packaged testcase.
Should point out that the antivirus issue was a dead-end.  Thought it was causing problems, but ultimately I've got a hard case with or without antivirus running.
Did you try to use mozregression to find a regression range? See my comment #11.
Flags: needinfo?(glahti)
Attached file ptest.zip
Test case which exhibits the hanging failure.  See the ptest/README.txt for info.
Attachment #698323 - Attachment is obsolete: true
Attachment #701771 - Attachment is obsolete: true
Attachment #704365 - Attachment is obsolete: true
Attachment #716027 - Attachment is obsolete: true
Flags: needinfo?(glahti)
Is this something other people can test running everything locally?
Providing us with a regression range using mozregression [1] is still the fastest way to get this somewhere.
If I understand the comments right, this started to fail with Fx 17? If so, you can run "mozregression --good=2012-07-15" to narrow this down to a smaller set of changes.

[1] http://mozilla.github.io/mozregression/
Flags: needinfo?(glahti)
Yes, they can test this locally. I started the regression but it seemed to fail right off the bat on 17.
Flags: needinfo?(glahti)
Just to confirm, Firefox with Apache 2.4 thru 2.4.6 exhibits this issue, Apache 2.2 or IIS does not get this hanging issue.
Is the Apache server hanging in the middle of the request? It could be that Java is (synchronously) waiting for the request to complete, and blocking. Because plugins block the browser's event loop, this would hang FF as well :(

Does killing the socket unfreeze the browser? If java is waiting on a request that will never complete, the only option from FF's point of view is to kill the plugin.
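
One way to check the "request never completes" theory without the browser or the plugin in the picture is to issue the same request from a small script with a timeout. The following is a minimal sketch; `probe_http` is a hypothetical helper, and it only tells you whether any bytes come back within the timeout, not why.

```python
import socket


def probe_http(host, port, path, timeout=5.0):
    """Send a bare HTTP GET and report whether the server answers in time.

    Returns "responded" if any bytes come back, "closed" if the server closes
    the connection without sending anything, and "stalled" if nothing arrives
    before the timeout expires.
    """
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n\r\n"
    ).encode("ascii")
    # The timeout applies to the connect and to every later socket operation.
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(request)
        try:
            data = conn.recv(4096)
        except socket.timeout:
            return "stalled"
    return "responded" if data else "closed"
```

With the host and path from the console trace earlier in this bug, a call would look like `probe_http("192.168.15.39", 80, "/patholog/java/patholog/dist/PathologLibS.jar")`. A "stalled" result while Firefox is hung would point at the server side. A "responded" result would suggest the server is still answering and the wait is elsewhere.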
Resolving this as incomplete based on the lack of updates over the last two years. Feel free to reopen if it still reproduces.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → INCOMPLETE
Product: Core → Core Graveyard