Closed
Bug 540640
Opened 16 years ago
Closed 15 years ago
[10.5] Crashes [@ libclient.dylib@0x119b59] [@ libclient.dylib@0x1192e9] [@ libclient.dylib@0x119289] triggered by bad interaction between JEP and Silverlight -- Apple and Microsoft bugs
Categories
(Core Graveyard :: Plug-ins, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: smichaud, Assigned: smichaud)
Attachments: 1 file (patch, 42.83 KB)
Several different crashes in libclient.dylib happen only on OS X 10.5
and are 100% associated with the Java Embedding Plugin and Silverlight
(agcore, coreclr). See recent interesting-modules files at
http://people.mozilla.com/crash_analysis/.
I can reproduce them using the STR from bug 532981 comment #22. Bug
532981's crashes happen only on 10.6. Both that bug and this one must
have the same underlying cause -- probably an Apple bug.
Here are the STR over again:
1) Start Firefox (3.0.X, 3.5.X, or 3.6) and visit the following URL:
http://java.sun.com/applets/jdk/1.4/demo/applets/Clock/example1.html
(Close this window or keep it open, as you choose.)
2) Open a new window and visit
http://silverlight.services.live.com/invoke/99030/ControlsDemo/iframe.html.
(Close this window or keep it open, as you choose.)
3) Open a new window and visit
http://java.sun.com/applets/jdk/1.4/demo/applets/Clock/example1.html.
At this point the browser should crash, though you may need to reload this page first.
Comment 1 (Assignee) • 16 years ago
Here's a Breakpad id for one of my crashes:
bp-2c73caac-f1e6-4423-acdb-dd7b62100119
Updated (Assignee) • 16 years ago
Assignee: nobody → smichaud
Comment 2 (Assignee) • 16 years ago
> See recent interesting-modules files at
> http://people.mozilla.com/crash_analysis/.
Only the FF 3.5.7 interesting-modules files are big enough (have
enough crashes) for the libclient.dylib crashes to show up in them.
Comment 3 (Assignee) • 15 years ago
With patient digging and a bit of luck, I've figured out why these crashes
(and those of bug 532981) happen, and how to work around them. It has to do
with Mach exception handling.
The situation is complex enough that it's difficult to know who to blame.
Suffice it to say that there are bugs and/or design flaws in Apple's JVM, the
Silverlight plugin, and even (at a fundamental level) in Mach exception
handling itself.
Applications that use Mach exception handling need (basically) to do two
things -- 1) Tell the kernel which exceptions are "supported", and which Mach
port to send its exception handling messages to; 2) Set up an exception
handling thread that accepts Mach messages on the appropriate port, and
handles them when they are sent (by the kernel, when it detects a "supported"
exception).
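The two steps can be sketched with a toy model. This is not the Mach API: ToyKernel, its methods and the string "EXC_BAD_ACCESS" used as a message are stand-ins invented for illustration, and the port number is arbitrary. The sketch only shows the division of labor between registering a port (step 1) and running a thread that services it (step 2).

```python
import queue
import threading


class ToyKernel:
    """Toy stand-in for the kernel: one exception port per thread, one queue per port."""

    def __init__(self):
        self.thread_port = {}  # thread name -> port number
        self.ports = {}        # port number -> queue of pending messages

    def allocate_port(self, port):
        self.ports[port] = queue.Queue()

    def set_exception_port(self, thread, port):
        # Step 1: name the port that receives this thread's exception messages.
        self.thread_port[thread] = port

    def raise_exception(self, thread, what):
        # The message goes to whichever port is registered at this moment.
        self.ports[self.thread_port[thread]].put(what)


def handler_loop(kernel, port, handled):
    # Step 2: a dedicated thread blocks on the port and handles each message.
    while True:
        msg = kernel.ports[port].get()
        if msg is None:  # sentinel used only to stop this demo
            break
        handled.append(msg)


kernel = ToyKernel()
kernel.allocate_port(0x1234)
kernel.set_exception_port("main", 0x1234)

handled = []
t = threading.Thread(target=handler_loop, args=(kernel, 0x1234, handled))
t.start()

kernel.raise_exception("main", "EXC_BAD_ACCESS")
kernel.ports[0x1234].put(None)
t.join()
print(handled)
```

In this model a message only gets handled if both steps were taken for the same port, which is the property the rest of this bug turns on.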
The calls to request that the kernel handle certain exceptions are per-task
(task_swap_exception_ports()/task_set_exception_ports()) or per-thread
(thread_swap_exception_ports()/thread_set_exception_ports()). But Mach
exception handling is really designed for use by *applications* (not by
plugins) -- the reason is that there's no way (that I can find) to specify
multiple Mach ports (and multiple handlers) for the main thread. Mozilla
browsers don't use Mach exception handling. And everything's fine if only one
plugin uses it. But Apple's JVM (in the JEP and in Apple's plugins), the
Silverlight plugin and the Flash plugin all use it. Furthermore, while the
Flash plugin tries to be well-behaved, both Apple's JVM and the Silverlight
plugin blithely assume that they're the only "applications" using Mach
exception handling -- so both set the main thread's Mach exception port
unconditionally (to different values), and leave it that way. (Secondary
threads aren't a problem -- each plugin creates its own, and can configure
their Mach exception handling as it sees fit.)
So here's how this bug's crashes happen:
Apple's JVM is loaded (by the JEP) and sets the main thread's Mach exception
port to (say) 0x00001234. Then the Silverlight plugin gets loaded and changes
the main thread's Mach exception port to (say) 0x00005678. Now, whenever a
"supported" exception happens in the JVM on the main thread, the kernel tries
to get it handled on the "wrong" Mach port -- which leads to a crash.
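The sequence above can be replayed in a toy model. Again this is only a sketch: ToyMainThread and jvm_sees_exception() are hypothetical names, the port numbers are the example values used above, and the model assumes nothing beyond "the main thread has one exception port, and the JVM's handler listens only on the port the JVM registered".

```python
JVM_PORT = 0x1234          # example values, as in the description above
SILVERLIGHT_PORT = 0x5678


class ToyMainThread:
    """Holds the single exception port the main thread can have at one time."""

    def __init__(self):
        self.exception_port = None

    def set_exception_port(self, port):
        # Unconditional overwrite: the previous owner is neither asked nor told.
        self.exception_port = port


def jvm_sees_exception(thread):
    # The JVM's handler thread listens only on JVM_PORT, so it can handle an
    # exception only while messages are still being sent there.
    return thread.exception_port == JVM_PORT


main_thread = ToyMainThread()

main_thread.set_exception_port(JVM_PORT)          # JVM loaded by the JEP
handled_before = jvm_sees_exception(main_thread)

main_thread.set_exception_port(SILVERLIGHT_PORT)  # Silverlight plugin loaded
handled_after = jvm_sees_exception(main_thread)

print(handled_before, handled_after)
```

The second value is the crashing case: the exception is raised in JVM code, but the message no longer goes to the port the JVM is watching.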
You might wonder why nothing happens in the opposite case -- when Apple's JVM
stomps on the Silverlight plugin's setting for the main thread's Mach
exception port. I'll have more to say about this in the next comment.
Comment 4 (Assignee) • 15 years ago
As I said in comment #3, there are two steps to setting up Mach
exception handling. Apple's JVM, the Flash plugin and the Silverlight
plugin all take the first step (tell the kernel which exceptions are
"supported", and which Mach port to send its exception handling
messages to). And both Apple's JVM and the Flash plugin take the
second step (set up an exception handling thread that accepts Mach
messages on the appropriate port). But the Silverlight plugin never
does this!
The docs I can find on Mach exception handling don't say what's
supposed to happen when you take "step 1" without "step 2". My best
guess is that the kernel, if it failed to send a handling message on a
"supported" exception, would simply crash the application. But why
bother to set up Mach exception handling if you aren't really going to
use it? The Silverlight plugin already uses C++ exception handling
(__cxa_allocate_exception(), __cxa_throw(), __cxa_begin_catch(),
__cxa_end_catch() and friends). So why should it also need Mach
exception handling ... especially if it's not really using it?
So Apple's JVM has (so far) one serious bug -- it uses Mach exception
handling unconditionally, without regard to other plugins. But
Silverlight has two -- it stomps on other plugins' Mach exception
handling settings for the main thread, and it does this for no reason
at all.
The fact that the Silverlight plugin doesn't use Mach exception
handling is probably the reason it doesn't crash (or otherwise
malfunction) when Apple's JVM stomps on its Mach exception settings
for the main thread.
Comment 5 (Assignee) • 15 years ago
But wait, there's more ...
No crashes happen (with the STR from comment #0) on OS X 10.4.11, or
using Java 1.4.2 (on OS X 10.5.X with older Apple JVMs). But these
older Apple JVMs still use Mach exception handling, and still perform
both "steps" (from comment #3) to set it up.
What did Apple do to make the crashes start happening?
It seems they changed how the "server" works on their Mach exception
handling thread. The crashes started when Apple's JVM began using
mach_msg_server(), the simple canned server function from
libSystem.dylib (see mach/mach.h).
Previous Apple JVMs called mach_msg() from some kind of custom server,
which somehow worked around the design flaw in Mach exception handling
that triggers these crashes. For some reason, older Apple JVMs keep
receiving error handling messages (from the kernel) at the "old" Mach
exception port for the main thread, even after the Silverlight plugin
has changed this setting!
So are there accepted/recommended (if undocumented) ways to work
around the design flaw in Mach exception handling on the main thread,
and make it possible for several plugins that use Mach exception
handling to peacefully coexist even while stomping on each other's
settings? But then why didn't Apple notice the problem and go back to
using their custom Mach exception handling server?
It's just as likely the previous "workaround" was accidental, and is
no longer possible.
So (in my own workaround for this problem) I'm not going to look for
this kind of solution. Neither am I going to try to make Apple's JVM
well-behaved. I'm going to make the smallest possible changes to keep
Apple's JVM happy when the Silverlight plugin stomps on its
main-thread Mach exception handling settings.
Updated (Assignee) • 15 years ago
Summary: [10.5] Crashes [@ libclient.dylib@0x119b59] [@ libclient.dylib@0x1192e9] [@ libclient.dylib@0x119289] triggered by bad interaction between JEP and Silverlight -- probable Apple bug → [10.5] Crashes [@ libclient.dylib@0x119b59] [@ libclient.dylib@0x1192e9] [@ libclient.dylib@0x119289] triggered by bad interaction between JEP and Silverlight -- Apple and Microsoft bugs
Comment 6 (Assignee) • 15 years ago
Here's the debugging patch I used to do much of the research behind my
previous comments. Though it "fixes" the crashes, it's not a viable
solution: the method it uses just happens to work, because Silverlight
is the only plugin that uses thread_swap_exception_ports() as opposed
to thread_set_exception_ports(). My "fix" is merely an illustration
(part of the evidence that supports my explanation of this bug's
crashes).
Here are the three plugins (Java, Silverlight and Flash) I used in my
tests:
http://java.sun.com/applets/jdk/1.4/demo/applets/Clock/example1.html
http://gallery.expression.microsoft.com/en-us/CWAStyle
(at http://silverlight.net/community/samples/silverlight-samples/)
http://www.playercore.com/bugFiles/ime/imekrjp.swf
Note that it's impossible to debug the Silverlight plugin in gdb --
the browser quickly freezes up, and main thread stack traces stop
making sense. This happens in both Firefox and Safari. It's why I
went to the trouble to use Apple's Symbolication framework to log
stack traces in my patch.
Here's a tryserver build:
https://build.mozilla.org/tryserver-builds/smichaud@pobox.com-bugzilla540640-debug/bugzilla540640-debug-macosx.dmg
Comment 7 (Assignee) • 15 years ago
And here's yet more ...
This bug's crashes only happen (on the main thread) when Apple's JVM handles
very low-level exceptions -- like java.lang.NullPointerException. They don't
happen with (for example) java.lang.ArrayIndexOutOfBoundsException.
So (practically speaking) I only need to worry about Java code that might run
on the main thread and might try to use try/catch to handle a
NullPointerException. There's very little of this in the JEP, and all of it
is called via JNI.
So I've fixed this bug (and bug 532981) in my current version of the JEP
(what will become JEP 0.9.7.3) by wrapping four JNI calls in code that
ensures the main thread's exception port is (re)set to whatever value Apple's
JVM originally set it to (when JNI_CreateJavaVM() was called as the first Java
applet was loaded).
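The shape of that wrapper can be sketched in a toy model. This is not the JEP's code: ToyMainThread, with_jvm_exception_port() and call_into_java() are hypothetical names, the port values are arbitrary, and the real fix works on Mach ports around JNI calls rather than on a Python attribute. The sketch only shows the idea of forcing the JVM's original port back into place just before Java code runs on the main thread.

```python
import functools

JVM_PORT = 0x1234          # hypothetical: the value recorded when the JVM was created
SILVERLIGHT_PORT = 0x5678  # hypothetical: the value another plugin set later


class ToyMainThread:
    """Holds the single exception port the main thread can have at one time."""
    exception_port = None


main_thread = ToyMainThread()


def with_jvm_exception_port(func):
    """Re-set the main thread's port to the JVM's original value before the call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        main_thread.exception_port = JVM_PORT
        return func(*args, **kwargs)
    return wrapper


@with_jvm_exception_port
def call_into_java():
    # Stand-in for one of the wrapped JNI calls; it reports which port is
    # active while the Java code would be running.
    return main_thread.exception_port


main_thread.exception_port = JVM_PORT          # JVM created
main_thread.exception_port = SILVERLIGHT_PORT  # another plugin stomps on it
seen = call_into_java()
print(hex(seen))
```

Whether the real wrapper also restores the other plugin's value after the call is not stated here, so the sketch leaves the JVM's port in place.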
I'm not entirely sure why these crashes don't happen in Safari. But I
strongly suspect it's because (by chance) Apple's Java plugin never calls Java
code on the main thread that might try to handle a NullPointerException.
Comment 8 (Assignee) • 15 years ago
I've just released a new version of the Java Embedding Plugin
(0.9.7.3) that fixes this bug (by working around it). For more
information see bug 551327.
Comment 9 (Assignee) • 15 years ago
JEP 0.9.7.3 has now landed on the 1.9.2 and 1.9.1 branches, and should
be in tomorrow's Firefox 3.6.3pre and 3.5.10pre nightlies (at
ftp://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/).
Please test with them and let us know your results.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Updated • 3 years ago
Product: Core → Core Graveyard