Closed Bug 822825 Opened 7 years ago Closed 7 years ago

Java plugin hits mozPoisonWriteMac ValidWriteAssert assertion on shutdown

Categories

(Core :: XPCOM, defect)

x86
macOS
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla20
Tracking Status
firefox19 --- affected
firefox20 --- fixed

People

(Reporter: cpeterson, Assigned: espindola)

References

()

Details

Attachments

(2 files, 2 obsolete files)

I have tested Java Applet Plug-in Java 7 Update 9 and 10 on Nightly 20 and Aurora 19. I am using Mac OS X 10.8.2. I don't know if this is Java bug or a Firefox bug.

STR:
1. Using a debug build on OS X, load a web page that contains a Java applet, such as https://panopticlick.eff.org/index.php?action=log&js=yes
2. Quit Firefox

RESULT:

Program received signal SIGSEGV, Segmentation fault.
[Switching to process 66442 thread 0x10603]
ValidWriteAssert (ok=false) at /Users/cpeterson/Code/mozilla/aurora/xpcom/build/mozPoisonWriteMac.cpp:89
89	    MOZ_ASSERT(ok);
(gdb) bt
#0  ValidWriteAssert (ok=false) at /Users/cpeterson/Code/mozilla/aurora/xpcom/build/mozPoisonWriteMac.cpp:89
#1  0x000000010386f513 in AbortOnBadWrite (fd=65, wbuf=0x10e834000, count=54) at /Users/cpeterson/Code/mozilla/aurora/xpcom/build/mozPoisonWriteMac.cpp:330
#2  0x0000000103870083 in wrap_write_temp<write_data> (fd=65, buf=0x10e834000, count=54) at /Users/cpeterson/Code/mozilla/aurora/xpcom/build/mozPoisonWriteMac.cpp:212
#3  0x000000012237032b in Java_com_sun_deploy_net_socket_UnixSocketImpl_unStreamSocketWrite ()
#4  0x00000001480e8f90 in ?? ()
#5  0x00000001480dd2d4 in ?? ()
#6  0x00000001480dd2d4 in ?? ()
#7  0x00000001480dd2d4 in ?? ()
#8  0x00000001480dd2d4 in ?? ()
#9  0x00000001480dd806 in ?? ()
#10 0x00000001480dd158 in ?? ()
#11 0x00000001480dd806 in ?? ()
#12 0x00000001480dd158 in ?? ()
#13 0x00000001480d74f7 in ?? ()
#14 0x00000001478b855f in JavaCalls::call_helper ()
#15 0x00000001478b8a3c in JavaCalls::call_virtual ()
#16 0x00000001478b8b76 in JavaCalls::call_virtual ()
#17 0x00000001478ef4f0 in thread_entry ()
#18 0x0000000147aa82b4 in JavaThread::thread_main_inner ()
#19 0x0000000147aa979a in JavaThread::run ()
#20 0x00000001479e51b5 in java_start ()
#21 0x00007fff972df742 in _pthread_start ()
#22 0x00007fff972cc181 in thread_start ()
(gdb)

Here is a link to the JDK source code for the Java_com_sun_deploy_net_socket_UnixSocketImpl_unStreamSocketWrite() function:

https://jdk-source-code.googlecode.com/svn/trunk/jdk6u21_src/deploy/src/common/unix/native/UnixDomainSocket.c
Assignee: nobody → respindola
Status: NEW → ASSIGNED
I was able to reproduce this. The thread doing the write has this backtrace:

#0  (anonymous namespace)::ValidWriteAssert (ok=false) at /Users/espindola/mozilla-central/xpcom/build/mozPoisonWriteMac.cpp:89
#1  0x0000000103a16f3b in (anonymous namespace)::AbortOnBadWrite (fd=63, wbuf=0x110525000, count=54) at /Users/espindola/mozilla-central/xpcom/build/mozPoisonWriteMac.cpp:330
#2  0x0000000103a17ab3 in _ZN12_GLOBAL__N_115wrap_write_tempILZNS_10write_dataEEEEliPKvm (fd=63, buf=0x110525000, count=54) at /Users/espindola/mozilla-central/xpcom/build/mozPoisonWriteMac.cpp:212
#3  0x0000000143c3132b in Java_com_sun_deploy_net_socket_UnixSocketImpl_unStreamSocketWrite ()
#4  0x0000000145811f90 in ?? ()
#5  0x00000001458062d4 in ?? ()
#6  0x00000001458062d4 in ?? ()
#7  0x00000001458062d4 in ?? ()
#8  0x00000001458062d4 in ?? ()
#9  0x0000000145806806 in ?? ()
#10 0x0000000145806158 in ?? ()
#11 0x0000000145806806 in ?? ()
#12 0x0000000145806158 in ?? ()
#13 0x00000001458004f7 in ?? ()
#14 0x00000001440f855f in JavaCalls::call_helper ()
#15 0x00000001440f8a3c in JavaCalls::call_virtual ()
#16 0x00000001440f8b76 in JavaCalls::call_virtual ()
#17 0x000000014412f4f0 in thread_entry ()
#18 0x00000001442e82b4 in JavaThread::thread_main_inner ()
#19 0x00000001442e979a in JavaThread::run ()
#20 0x00000001442251b5 in java_start ()
#21 0x00007fff984cd742 in _pthread_start ()
#22 0x00007fff984ba181 in thread_start ()

The data it is trying to write is

x /54xb wbuf
0x110525000:	0x00	0x00	0x00	0x0f	0x01	0x01	0x00	0x00
0x110525008:	0x00	0x00	0x00	0x00	0x00	0x00	0x46	0x72
0x110525010:	0x00	0x00	0x00	0x00	0x00	0x00	0x04	0xbf
0x110525018:	0x00	0x00	0x00	0x00	0x00	0x00	0x03	0xda
0x110525020:	0x4f	0x48	0x00	0x00	0x00	0x00	0x00	0x00
0x110525028:	0x00	0x00	0x00	0x00	0x27	0x10	0x00	0x00
0x110525030:	0x00	0x00	0x00	0x00	0x03	0xe8

And the main thread is at:


#0  JS::Value::gcKind (this=0x1120f4160) at jsapi.h:434
#1  0x0000000104d23c76 in MarkValueInternal (trc=0x10bdb72b8, v=0x1120f4160) at /Users/espindola/mozilla-central/js/src/gc/Marking.cpp:468
#2  0x0000000104d24338 in js::gc::MarkSlot (trc=0x10bdb72b8, s=0x1120f4160, name=0x1051cedf2 "private") at /Users/espindola/mozilla-central/js/src/gc/Marking.cpp:569
#3  0x0000000104d247ab in js::gc::MarkCrossCompartmentSlot (trc=0x10bdb72b8, src=0x1120f4140, dst=0x1120f4160, name=0x1051cedf2 "private") at /Users/espindola/mozilla-central/js/src/gc/Marking.cpp:648
#4  0x0000000104ad8cc3 in proxy_TraceObject (trc=0x10bdb72b8, obj=0x1120f4140) at /Users/espindola/mozilla-central/js/src/jsproxy.cpp:2813
#5  0x0000000104d301c8 in js::GCMarker::processMarkStackTop (this=0x10bdb72b8, budget=@0x7fff5fbf6b50) at /Users/espindola/mozilla-central/js/src/gc/Marking.cpp:1382
#6  0x0000000104d26288 in js::GCMarker::drainMarkStack (this=0x10bdb72b8, budget=@0x7fff5fbf6b50) at /Users/espindola/mozilla-central/js/src/gc/Marking.cpp:1426
#7  0x00000001049cf077 in DrainMarkStack (rt=0x10bdb7000, sliceBudget=@0x7fff5fbf6b50, phase=js::gcstats::PHASE_MARK) at /Users/espindola/mozilla-central/js/src/jsgc.cpp:3543
#8  0x00000001049ce56e in IncrementalCollectSlice (rt=0x10bdb7000, budget=0, reason=js::gcreason::SHUTDOWN_CC, gckind=js::GC_NORMAL) at /Users/espindola/mozilla-central/js/src/jsgc.cpp:3982
#9  0x00000001049cdba4 in GCCycle (rt=0x10bdb7000, incremental=false, budget=0, gckind=js::GC_NORMAL, reason=js::gcreason::SHUTDOWN_CC) at /Users/espindola/mozilla-central/js/src/jsgc.cpp:4149
#10 0x00000001049cc279 in Collect (rt=0x10bdb7000, incremental=false, budget=0, gckind=js::GC_NORMAL, reason=js::gcreason::SHUTDOWN_CC) at /Users/espindola/mozilla-central/js/src/jsgc.cpp:4267
#11 0x00000001049c8f46 in js::GC (rt=0x10bdb7000, gckind=js::GC_NORMAL, reason=js::gcreason::SHUTDOWN_CC) at /Users/espindola/mozilla-central/js/src/jsgc.cpp:4290
#12 0x00000001049b3d12 in js::GCForReason (rt=0x10bdb7000, reason=js::gcreason::SHUTDOWN_CC) at /Users/espindola/mozilla-central/js/src/jsfriendapi.cpp:178
#13 0x0000000102ce3ce6 in nsXPConnect::Collect (this=0x10bccc740, reason=28) at /Users/espindola/mozilla-central/js/xpconnect/src/nsXPConnect.cpp:373
#14 0x0000000102ce3d12 in non-virtual thunk to nsXPConnect::Collect(unsigned int) () at /Users/espindola/mozilla-central/js/xpconnect/src/nsXPConnect.cpp:374
#15 0x0000000103abe2e5 in nsCycleCollector::GCIfNeeded (this=0x1005e8000, aForceGC=true) at /Users/espindola/mozilla-central/xpcom/base/nsCycleCollector.cpp:2710
#16 0x0000000103abe582 in nsCycleCollector::Collect (this=0x1005e8000, aMergeCompartments=false, aResults=0x0, aTryCollections=5, aListener=0x0) at /Users/espindola/mozilla-central/xpcom/base/nsCycleCollector.cpp:2796
#17 0x0000000103abeb43 in nsCycleCollector::Shutdown (this=0x1005e8000) at /Users/espindola/mozilla-central/xpcom/base/nsCycleCollector.cpp:2968
#18 0x0000000103abf8e9 in nsCycleCollector_shutdown () at /Users/espindola/mozilla-central/xpcom/base/nsCycleCollector.cpp:3404
#19 0x0000000103a0e0d6 in mozilla::ShutdownXPCOM (servMgr=0x0) at /Users/espindola/mozilla-central/xpcom/build/nsXPComInit.cpp:622
#20 0x0000000103a0dc55 in NS_ShutdownXPCOM_P (servMgr=0x10053b4e8) at /Users/espindola/mozilla-central/xpcom/build/nsXPComInit.cpp:513
#21 0x00000001013d1db0 in ScopedXPCOMStartup::~ScopedXPCOMStartup (this=0x10050c240) at /Users/espindola/mozilla-central/toolkit/xre/nsAppRunner.cpp:1124
#22 0x00000001013d1cc5 in ScopedXPCOMStartup::~ScopedXPCOMStartup (this=0x10050c240) at /Users/espindola/mozilla-central/toolkit/xre/nsAppRunner.cpp:1105
#23 0x00000001013db22a in XREMain::XRE_main (this=0x7fff5fbfee70, argc=1, argv=0x7fff5fbffa60, aAppData=0x100008230) at /Users/espindola/mozilla-central/toolkit/xre/nsAppRunner.cpp:3916
#24 0x00000001013db60d in XRE_main (argc=1, argv=0x7fff5fbffa60, aAppData=0x100008230, aFlags=0) at /Users/espindola/mozilla-central/toolkit/xre/nsAppRunner.cpp:4089
#25 0x0000000100001ce3 in do_main (argc=1, argv=0x7fff5fbffa60) at /Users/espindola/mozilla-central/browser/app/nsBrowserApp.cpp:174
#26 0x000000010000154c in main (argc=1, argv=0x7fff5fbffa60) at /Users/espindola/mozilla-central/browser/app/nsBrowserApp.cpp:279

So it looks like we have to revert bugs 821534 and 818739 for now and figure out a way to trigger this earlier.
Blocks: 818739, 821534
The file being written to is /var/folders/pf/k3jphc791gj1b1jrnrqhm7cw0000gn/T/.com.sun.deploy.net.socket.66703.8671514364767356563.A
The file is a socket, so the correct answer *might* be to ignore sockets, but it would probably be better to figure out why is this being firing during the very last GC.
Attached patch ignore sockets (obsolete) — Splinter Review
Can you try this and see if it works for you? On my machine it avoids the crash.
Attachment #693667 - Flags: feedback?(cpeterson)
Comment on attachment 693667 [details] [diff] [review]
ignore sockets

Rafael, I tested your patch and it seems to fix the crash.

The Java plugin is running the applet in a separate process, using Unix domain sockets for IPC. When I load the Chess applet at http://english.op.org/~peter/ChessApp/ then Nightly forks a process called "Java Applet - english.op.org".

When I close the Chess applet's web page, the Java applet process lingers for 60 seconds.

Curiously, if I force Firefox to run the Java plugin out-of-process (dom.ipc.plugins.java.enabled=true), then the Java applet process (forked from the "Nightly Plugin Process") crashes.
Attachment #693667 - Flags: feedback?(cpeterson) → feedback+
Rafael, does your patch imply that calling exit(0) at this new site will potentially prevent Java applets from communicating their final state to their servers?
Flags: needinfo?(respindola)
(In reply to Vladan Djeric (:vladan) from comment #6)
> Rafael, does your patch imply that calling exit(0) at this new site will
> potentially prevent Java applets from communicating their final state to
> their servers?

The Java plugin runs in the Firefox process, but the Java applets run out-of-process. When Firefox exits normally or the applet's tab is closed, "Java Tutorials: Life Cycle of an Applet" [1] says the applet's stop() and destroy() life cycle methods will be called.

If the Firefox process crashes or exits without warning, the applet process would (in theory) have an opportunity to handle the Unix domain socket hangup and clean up. However, I don't know whether the applet process calls stop() and destroy() in this exceptional case.

[1] http://docs.oracle.com/javase/tutorial/deployment/applet/lifeCycle.html
> The Java plugin runs in the Firefox process, but the Java applets run
> out-of-process. When Firefox exits normally or the applet's tab is closed,
> "Java Tutorials: Life Cycle of an Applet" [1] says the applet's stop() and
> destroy() life cycle methods will be called.
> 
> If the Firefox process crashes or exits without warning, the applet process
> would (in theory) have an opportunity to handle the Unix domain socket
> hangup and clean up. However, I don't know whether the applet process calls
> stop() and destroy() in this exceptional case.
> 
> [1] http://docs.oracle.com/javase/tutorial/deployment/applet/lifeCycle.html


I was thinking about this a bit. I think we should do two things:

* Ignore sockets. The objective of the write poisoning is to avoid writes from process we control hitting the disk. For process we don't control, we would have to do something like a ptrace jail.
* We should still debug why the java plugin is reacting so late to us shutting down. If it is sending some shutdown message over the socket, it could have done so a long time before the last CC.
Flags: needinfo?(respindola)
Attached patch ignore sockets (obsolete) — Splinter Review
I will try to debug why java is doing anything at all during the last CC, but I think in the end we should also ignore sockets. While it is possible for the other process to be doing writes in response to data sent over the socket, that is probably out of scope of what we want to catch with write poisoning.
Attachment #693667 - Attachment is obsolete: true
Attachment #694006 - Flags: review?(bugs)
Comment on attachment 694006 [details] [diff] [review]
ignore sockets

Hmm, the code uses odd coding style.
(should be 2 line indentation and {} with 'if's)
But let's not change that now.

I think this is ok, though perhaps we should explicitly kill
plugins when we get xpcom-shutdown.
But I'd like to get comments from bsmedberg too.
Attachment #694006 - Flags: review?(bugs)
Attachment #694006 - Flags: review?(benjamin)
Attachment #694006 - Flags: review+
> I think this is ok, though perhaps we should explicitly kill
> plugins when we get xpcom-shutdown.

I think we should do that too. I have just started debugging what exactly triggers the java plugin this late.
(In reply to Rafael Ávila de Espíndola (:espindola) from comment #11)
> > I think this is ok, though perhaps we should explicitly kill
> > plugins when we get xpcom-shutdown.
> 
> I think we should do that too. I have just started debugging what exactly
> triggers the java plugin this late.

It looks like the JVM sets some kind of hartbeat system. I hacked the write poisoning to not abort, but print JAVA when there is a socket write. What I noticed is that the thread started by the plugin keeps sending those pings, about one a second. It "never" stops, so I guess our only option is really to ignore it.
Java should not be running in-process on mac ever, as far as I know. gfritzsche I know we talked about that, was that fixed?
FWIW, I'd be ok with taking this patch as a temporary measure until we can make sure that java is running out of process, but I'd like to back it out after that point.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #15)
> FWIW, I'd be ok with taking this patch as a temporary measure until we can
> make sure that java is running out of process, but I'd like to back it out
> after that point.

I think the argument from comment 9 applies, no?

While there is something strange going on with the java plugin, trying to figure out what goes out of process is a bit out of scope for write poisoning. The best it can do is find writes from processes we control. We already ignore pipes, so ignoring sockets seems natural.
Comment 9 is irrelevant, if java isn't loaded into the firefox process at all. And I'm saying that it should not be.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #18)
> Comment 9 is irrelevant, if java isn't loaded into the firefox process at
> all. And I'm saying that it should not be.

Why? Java is just how we found that we were asserting on sockets. The comment basically states that we should not assert on any sockets, be they created by java, another plugin, an extension or ourselves.
Why wouldn't we? If extensions are doing socket operations after we get to the safe shutdown point, that seems like a bug.

But also if this is just about the correct operation of write poisoning and not about plugins, the review should be done by one of the people working on write poisoning.
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #20)
> Why wouldn't we? If extensions are doing socket operations after we get to
> the safe shutdown point, that seems like a bug.

I guess any activity this late is a bug, but we should try to keep the scope. Avoiding problems from inter process communication is more than what we wanted to do with write poisoning. 

> But also if this is just about the correct operation of write poisoning and
> not about plugins, the review should be done by one of the people working on
> write poisoning.

Initially it looked like there was a plugin interaction problem. It is now clear that we hit this write just because the last cycle collection is slow enough for the java thread to try to write, it is not reacting to firefox shutting down.

It may be a good thing to move the plugin itself out of process, but that is probably an unrelated bug.
Attachment #694006 - Flags: review?(benjamin)
Comment on attachment 694006 [details] [diff] [review]
ignore sockets

Review of attachment 694006 [details] [diff] [review]:
-----------------------------------------------------------------

I'm ok with this change so we can keep looking for harmful writes. But ideally I agree with Benjamin that we want to run java OOP, at least on mac 64-bit. Once we fix that then we should consider reverting that to see if there are more unexpected instances of this.
Attachment #694006 - Flags: review?(bgirard) → review+
> I'm ok with this change so we can keep looking for harmful writes. But
> ideally I agree with Benjamin that we want to run java OOP, at least on mac
> 64-bit. Once we fix that then we should consider reverting that to see if
> there are more unexpected instances of this.

So, the oop issue is unrelated. Would you agree in using getsockname so we *always* ignore unix sockets? That is, we would keep ignoring them even after java is oop.

I am afraid this is still feature creeping, but not as bad as trying to filter out an IPC mechanism and allowing others (FIFOs, shared memory, etc).
note that java is oop again since bug 823559.
Attachment #694006 - Attachment is obsolete: true
Attachment #695682 - Flags: review?(bgirard)
Attachment #695682 - Flags: review?(bgirard) → review+
https://hg.mozilla.org/mozilla-central/rev/2aa1ecf204b1
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Component: Java (Oracle) → XPCOM
Product: Plugins → Core
Target Milestone: --- → mozilla20
Version: unspecified → Trunk
You need to log in before you can comment on or make changes to this bug.