Bug 64332 - Webclient test, EMWindow, freezes after a period of use.
Status: VERIFIED FIXED
Product: Core Graveyard
Classification: Graveyard
Component: Java APIs to WebShell
Version: Trunk
Platform: x86 Windows NT
Importance: -- normal with 4 votes
Target Milestone: ---
Assigned To: edburns
QA Contact: Alexei V. Mokeev
Mentors:
Duplicates: 96826
Depends on:
Blocks:
Reported: 2001-01-04 12:45 PST by edburns
Modified: 2012-04-09 22:27 PDT
CC List: 5 users
See Also:
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
cvs diff -u of related bug fix, and workaround (4.16 KB, patch)
2001-02-02 14:24 PST, edburns
Another workaround to try, iteration one. (9.61 KB, text/plain)
2001-02-06 16:00 PST, edburns
tar.gz of files in previous attachment. (8.22 KB, application/octet-stream)
2001-02-06 16:03 PST, edburns
Tar.gz of files for stress test. (13.49 KB, application/octet-stream)
2001-05-23 14:16 PDT, edburns
log of URL loads demonstrating that it ran for a half hour before crashing. (13.55 KB, text/plain)
2001-07-19 13:24 PDT, edburns

Description edburns 2001-01-04 12:45:33 PST
Environment:

Mozilla: Netscape_20000922_BRANCH
OS: Winnt 4.0 SP 6
Webclient: JAVADEV_RTM_20001102

Launch webclient with .\runem.  The app will work for a while, and then freeze.
Comment 1 edburns 2001-01-04 12:46:03 PST
I accept.
Comment 2 Brian Satterfield 2001-01-04 14:05:59 PST
I see the same behavior on Linux-SMP.  The embedded application will freeze
after a few minutes, leaving the java end running but seemingly waiting for an
event from mozilla. On the same OS but single processor, we've seen the
webclient run for roughly 24-36 hours without a hitch.

My config:
OS:  Red Hat 6.2, kernel 2.2.14-5.0smp
Box: Dell Precision, Dual PIII 866, 256 RAM
JDK: (1.2.2/1.2.2rc4) I've tried both the Sun distro and the blackdown distro.
Both with native and green threads...to no avail. 
Comment 3 edburns 2001-02-02 12:12:57 PST
I'm homing in on the cause of this bug.  After much trouble, I have a 
debuggable software stack, but I still can't debug into the HotSpot VM.  No 
matter, the problem occurs in the classic VM as well so I can trace it from 
there.  

It looks like it's deadlocking here:

USER32! 77e72c30()
Java_sun_awt_windows_WToolkit_eventLoop(JNIEnv_ * 0x00849510, _jobject * 
0x0516f0ec) line 1453
invoke_V_V(Hjava_lang_Object * 0x015d2b98, methodblock * 0x04e55914, int 1, 
execenv * 0x00849510) line 71
invokeLazyNativeMethod(Hjava_lang_Object * 0x015d2b98, methodblock * 
0x04e55914, int 1, execenv * 0x00849510) line 680 + 22 bytes
ExecuteJava_C(unsigned char * 0x0516fed8, execenv * 0x00849510) line 1559 + 22 
bytes
do_execute_java_method_vararg(execenv * 0x00849510, void * 0x015d2e48, char * 
0x0077d968, char * 0x00773bb8, methodblock * 0x00000000, int 0, char * 
0x0516ff60, long * 0x00000000, int 0) line 561 + 14 bytes
execute_java_dynamic_method(execenv * 0x00849510, Hjava_lang_Object * 
0x015d2e48, char * 0x1006bf50, char * 0x1006bf4c) line 277 + 33 bytes
ThreadRT0(Hjava_lang_Thread * 0x015d2e48) line 2084 + 23 bytes
saveStackBase(void * 0x10042340 ThreadRT0(Hjava_lang_Thread *)) line 139 + 10 
bytes
_start(sys_thread * 0x00849590) line 293 + 13 bytes
_threadstartex(void * 0x00849470) line 212 + 13 bytes
KERNEL32! 77f04f3e()

More status to come.
Comment 4 edburns 2001-02-02 13:41:49 PST
Hi Bryan,

Try this.  Add the following command line options to the java interpreter:

-classic -Djava.compiler=NONE

This disables hotspot, and disables the Just In Time compiler.  This seems to 
prevent freezes on my system.

Please post here whether or not this works around the freeze.
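
Before judging whether the workaround helps, it can be useful to confirm that the options actually reached the VM. The following is a minimal sketch, not part of webclient: the class name VmFlagCheck is hypothetical, and it is written in modern Java syntax, so it would need to be adapted for a JDK 1.3-era compiler. It only reads standard system properties. On the VMs discussed here, java.vm.name is expected to distinguish the classic VM from HotSpot, and java.compiler should read NONE when the -D option was passed.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: reports the VM properties affected by the workaround flags. */
final class VmFlagCheck {

    /** Collects the system properties relevant to -classic and -Djava.compiler=NONE. */
    static Map<String, String> vmSettings() {
        Map<String, String> settings = new LinkedHashMap<>();
        for (String key : new String[] {"java.vm.name", "java.vm.version", "java.compiler"}) {
            // A property that is not set is reported as the string "null".
            settings.put(key, String.valueOf(System.getProperty(key)));
        }
        return settings;
    }

    public static void main(String[] args) {
        vmSettings().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Running it with the same options as webclient and posting the three lines alongside a freeze report would show which VM configuration was really in use.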

Comment 5 Brian Satterfield 2001-02-02 13:54:35 PST
Just tried the "-classic -Djava.compiler=NONE" options on a Linux-SMP and it
still hangs after some use.  But the SMP freeze may be a separate issue. I'll
wait and see what Bryan Hunter observes on a single processor windows machine.

Brian
Comment 6 edburns 2001-02-02 14:24:07 PST
Created attachment 24250
cvs diff -u of related bug fix, and workaround
Comment 7 edburns 2001-02-02 14:26:32 PST
I posted the previous attachment, and am typing this message using webclient
with the attached patch.

You can patch your runem.pl script using the patch in the attachment to install
the workaround.  I'm planning on checking this workaround in once I get approval.

Ashu, the rest of this patch makes it so util_InitStringConstants() no longer
takes a JNIEnv pointer.  It now obtains one using JNU_GetEnv().
Comment 8 edburns 2001-02-02 15:57:38 PST
First attachment checked in.

Geetha, this is a dup of another bug.  Which bug is it?

Comment 9 Bryan K. Hunter 2001-02-05 04:56:02 PST
I tried running the test app for WebClient using the
"-classic -Djava.compiler=NONE" options but it still freezes up on me.  I tried
several times after clean reboots of my system, but no luck.  I modified the
runem.pl in src_share and verified on the screen that these options are being
invoked.
Comment 10 Bryan K. Hunter 2001-02-05 04:57:07 PST
My comments above were for a WinNT 4.0 SP6 system running JDK1.3
Comment 11 edburns 2001-02-06 16:00:21 PST
Created attachment 24599
Another workaround to try, iteration one.
Comment 12 edburns 2001-02-06 16:03:31 PST
Created attachment 24600
tar.gz of files in previous attachment.
Comment 13 edburns 2001-02-06 16:26:24 PST
I have confirmed that upgrading to jdk1.3.1 will fix this bug.  Please wait for 
jdk1.3.1 beta to come out and we'll re-try it then.
Comment 14 edburns 2001-02-07 11:09:52 PST
In the meantime, please try the most recent workaround.  If it works for you, 
I'll check it in.
Comment 15 Bryan K. Hunter 2001-02-08 06:56:17 PST
I applied the patch (id=24600) and tested.  It did appear to be more stable, 
but still froze up after some use.  I think this patch would only affect the 
test application and not the API itself, is this correct?  

I am going to try JDK1.3.0_01.
Comment 16 Bryan K. Hunter 2001-02-26 13:07:06 PST
I tried JDK 1.3.0_01 but did not see any improvement.

I recently downloaded and installed JDK1.3.1 Beta for WinNT (after uninstalling 
all previous versions of the JDK and confirming that the relevant env vars were 
changed accordingly).  The test app is more stable, but it still locks up on 
me.  I start at www.google.com, then proceed to "Google Web Directory" and 
navigate through the various levels of categories.  I can't find a fully 
reliable trigger, but this sequence locks it up most of the time:
From the Google main page, select "Google Web Directory", "Business", 
"Financial Services", "Mortgages", "United States", "Pennsylvania".  It usually 
locks up while loading the "Pennsylvania" page.

Environment:

Mozilla: Netscape_20000922_BRANCH
OS: Winnt 4.0 SP 6
Webclient: JAVADEV_RTM_20001102
Comment 17 Bryan K. Hunter 2001-03-13 06:21:23 PST
Rejoice.  JDK1.3.1Beta plus the newly released WebClient 1.0 is working well 
for me now.  I would recommend that everyone who needs WebClient use 1.0 with 
JDK1.3.1.  

As far as I'm concerned, this bug is fixed.
Comment 18 edburns 2001-03-16 12:30:01 PST
This is very good news.  I'll update the release notes bug 64334
Comment 19 Bryan K. Hunter 2001-03-19 10:44:14 PST
My previous comments referred to Linux.  On WinNT I'm still having trouble; 
this may be due to a configuration issue.  Will work with Ed to identify and 
resolve.
Comment 20 edburns 2001-03-19 10:58:49 PST
This ain't fixed.

Bryan, after the freeze occurs, can you please give the console from which you
started webclient keyboard focus and press

Ctrl Break?

This should give you a thread state dump.  Please post that to this bug.

Thanks,

Ed
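
For cases where the console is not reachable, a comparable dump can be produced from inside the application. This is a minimal sketch under a loud assumption: it relies on Thread.getAllStackTraces(), which exists only in Java 5 and later, so it does not apply to the JDK 1.2/1.3 VMs used in this bug. The class name ThreadDump is hypothetical, and the output format only loosely imitates the VM's own dump. Like Ctrl Break, it shows Java frames only; native frames inside mozilla code are not visible.

```java
import java.util.Map;

/** Hypothetical helper: builds a text dump of all live Java threads. */
final class ThreadDump {

    /** Returns one block per live thread: name, priority, state, then its Java frames. */
    static String dump() {
        StringBuilder out = new StringBuilder("Full thread dump:\n");
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            out.append('"').append(thread.getName()).append('"')
               .append(" prio=").append(thread.getPriority())
               .append(" state=").append(thread.getState())
               .append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("        at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```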
Comment 21 Bryan K. Hunter 2001-03-19 15:39:30 PST
Creating Event Queue

InitMozillaStuff(784670): Create the action queue
Init the baseWindow
Create the BaseWindow...
Creation Done.....
Show the webBrowser
in BrowserControlCanvas setBounds: x = 4 y = 59 w = 632 h = 410
native library does implement webclient.Navigation
in BrowserControlCanvas setBounds: x = 4 y = 59 w = 632 h = 410
native library does implement webclient.CurrentPage
native library does implement webclient.History
native library does implement webclient.Preferences
java.lang.Exception: nativeRegisterPrefChangedCallback: can't set callback
native library does implement webclient.EventRegistration
native library does implement webclient.Bookmarks
debug: edburns: got Bookmarks instance
+++++++++++++++++++++ Thread Id ---- 00785510

has multiple monitor apis is 0
+++++++++++++++++++++ Thread Id ---- 00785510

debug: edburns: Currently Viewing: http://www.google.com/
Button1
debug: edburns: Currently Viewing: http://directory.google.com/
Button1
debug: edburns: Currently Viewing: http://directory.google.com/Top/Arts/
Button1
debug: edburns: Currently Viewing: 
http://directory.google.com/Top/Arts/Animation/
Button1
debug: edburns: Currently Viewing: 
http://directory.google.com/Top/Arts/Animation/Anime/
Button1
debug: edburns: Currently Viewing: 
http://directory.google.com/Top/Arts/Animation/Anime/Fandom/
Full thread dump:

"Thread-1" prio=5 tid=0x9184910 nid=0x103 waiting on monitor [0..0x6fb30]

"Screen Updater" prio=5 tid=0x857d80 nid=0x127 waiting on monitor 
[0x916f000..0x916fdc0]
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:420)
        at sun.awt.ScreenUpdater.nextEntry(ScreenUpdater.java:76)
        at sun.awt.ScreenUpdater.run(ScreenUpdater.java:95)

"EventThread-7882352" prio=5 tid=0x785ed0 nid=0x112 runnable 
[0x8f6f000..0x8f6fdc0]
        at 
org.mozilla.webclient.wrapper_native.NativeEventThread.nativeProcessEvents
(Native Method)
        at org.mozilla.webclient.wrapper_native.NativeEventThread.run
(NativeEventThread.java:244)

"AWT-Windows" prio=7 tid=0x778470 nid=0x109 runnable [0x8f0f000..0x8f0fdc0]
        at sun.awt.windows.WToolkit.eventLoop(Native Method)
        at sun.awt.windows.WToolkit.run(WToolkit.java:188)
        at java.lang.Thread.run(Thread.java:484)

"SunToolkit.PostEventQueue-0" prio=7 tid=0x7773b0 nid=0x128 waiting on monitor 
[0x8ecf000..0x8ecfdc0]
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:420)
        at sun.awt.PostEventQueue.run(SunToolkit.java:491)

"AWT-EventQueue-0" prio=7 tid=0x777a90 nid=0x11f waiting on monitor 
[0x8e8f000..0x8e8fdc0]
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:420)
        at java.awt.EventQueue.getNextEvent(EventQueue.java:260)
        at java.awt.EventDispatchThread.pumpOneEvent
(EventDispatchThread.java:101)
        at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:93)
        at java.awt.EventDispatchThread.run(EventDispatchThread.java:84)

"Signal Dispatcher" daemon prio=10 tid=0x768180 nid=0xe2 waiting on monitor 
[0..0]

"Finalizer" daemon prio=9 tid=0x766560 nid=0x14d waiting on monitor 
[0x8d8f000..0x8d8fdc0]
        at java.lang.Object.wait(Native Method)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:108)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:123)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:162)

"Reference Handler" daemon prio=10 tid=0x765280 nid=0x140 waiting on monitor 
[0x8d4f000..0x8d4fdc0]
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:420)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:110)

"VM Thread" prio=5 tid=0x7644c0 nid=0x100 runnable

"VM Periodic Task Thread" prio=10 tid=0x7687d0 nid=0x14c waiting on monitor

Comment 22 edburns 2001-03-21 14:38:29 PST
Reopen
Comment 23 edburns 2001-04-06 15:24:56 PDT
I've compared Thread dumps from the same invocation in the pre and post freeze 
states, and they are exactly the same.  I think this tells me that the freeze 
has to be in mozilla code.
Comment 24 edburns 2001-04-06 16:06:42 PDT
I have determined that when the freeze occurs, the NativeEventThread does 
indeed stop.
Comment 25 Alexei V. Mokeev 2001-05-03 03:53:39 PDT
Changing QA contact
Comment 26 Vladimir Strigun 2001-05-07 02:50:06 PDT
I reproduced this bug with the latest nightly mozilla build, on Linux 2.2 on a 
machine with dual Intel Pentium processors.
Webclient freezes after about 10 minutes of use. 
Comment 27 edburns 2001-05-23 14:16:11 PDT
Created attachment 35855
Tar.gz of files for stress test.
Comment 28 Vladimir Strigun 2001-05-25 01:10:50 PDT
Ed, I ran your stress test on Linux SMP and on NT with nightly mozilla & 
Webclient. The test failed on both platforms. On Linux it crashed with the 
following errors:

+++++++++++++++++++++ Thread Id ---- 0x812dca8

debug: edburns: Currently Viewing: http://random.yahoo.com/bin/ryl
Enabling Quirk StyleSheet
/~mindeec/mindee.html
/adi/tr.ln/member;h=misc;sz=468x60;ord=104413188219354?
WEBSHELL- = 2
+++++++++++++++++++++ Thread Id ---- 0x812dca8

debug: edburns: Currently Viewing: http://random.yahoo.com/bin/ryl
###!!! ASSERTION: couldn't lazily create the server
: 'NS_SUCCEEDED(rv)', file nsMsgAccount.cpp, line 94
###!!! Break: at file nsMsgAccount.cpp, line 94
Enabling Quirk StyleSheet
/~danreed/
+++++++++++++++++++++ Thread Id ---- 0x812dca8

debug: edburns: Currently Viewing: http://random.yahoo.com/bin/ryl
Opening file cookperm.txt failed
Enabling Quirk StyleSheet
Enabling Quirk StyleSheet
/alt.irc.undernet
Opening file cookperm.txt failed
###!!! ASSERTION: NS_ENSURE_TRUE(NS_SUCCEEDED(mTreeOwner->FindItemWithName(aName, static_cast< nsIDocShellTreeItem * >( this), _retval))) failed: '(!((mTreeOwner->FindItemWithName(aName, static_cast< nsIDocShellTreeItem * >( this), _retval)) & 0x80000000))', file nsDocShell.cpp, line 1143
###!!! Break: at file nsDocShell.cpp, line 1143
+++++++++++++++++++++ Thread Id ---- 0x812dca8

debug: edburns: Currently Viewing: http://random.yahoo.com/bin/ryl
+++++++++++++++++++++ Thread Id ---- 0x812dca8

debug: edburns: Currently Viewing: http://random.yahoo.com/bin/ryl
Enabling Quirk StyleSheet
/life/cyber/tech/ctg660.htm
+++++++++++++++++++++ Thread Id ---- 0x812dca8

debug: edburns: Currently Viewing: http://random.yahoo.com/bin/ryl
Enabling Quirk StyleSheet
/
###!!! ASSERTION: You can't dereference a NULL nsCOMPtr with operator->().: 'mRawPtr != 0', file ../../../../dist/include/nsCOMPtr.h, line 652
###!!! Break: at file ../../../../dist/include/nsCOMPtr.h, line 652
#
# An unexpected exception has been detected in native code outside the VM.
# Program counter=0x4ae1821e
#
# Problematic Thread: prio=1 tid=0x48ec6d68 nid=0x147e runnable 
#


On the NT platform it hung after 5 minutes with the following output:

+++++++++++++++++++++ Thread Id ---- 050357D0

debug: edburns: Currently Viewing: http://random.yahoo.com/bin/ryl
debug: edburns: STATE_REDIRECTING
debug: edburns: STATE_TRANSFERRING
+++++++++++++++++++++ Thread Id ---- 050357D0

debug: edburns: Currently Viewing: http://random.yahoo.com/bin/ryl
debug: edburns: STATE_REDIRECTING
debug: edburns: STATE_TRANSFERRING
Enabling Quirk StyleSheet
debug: edburns: STATE_TRANSFERRING
/
+++++++++++++++++++++ Thread Id ---- 050357D0

debug: edburns: Currently Viewing: http://random.yahoo.com/bin/ryl
debug: edburns: STATE_REDIRECTING
debug: edburns: STATE_TRANSFERRING
Enabling Quirk StyleSheet
debug: edburns: STATE_TRANSFERRING
debug: edburns: STATE_TRANSFERRING
/
Opening file cookperm.txt failed
+++++++++++++++++++++ Thread Id ---- 050357D0

debug: edburns: Currently Viewing: http://random.yahoo.com/bin/ryl
debug: edburns: STATE_REDIRECTING
debug: edburns: STATE_TRANSFERRING
Enabling Quirk StyleSheet
debug: edburns: STATE_TRANSFERRING
/
+++++++++++++++++++++ Thread Id ---- 050357D0

debug: edburns: Currently Viewing: http://random.yahoo.com/bin/ryl
debug: edburns: STATE_REDIRECTING
debug: edburns: STATE_TRANSFERRING
+++++++++++++++++++++ Thread Id ---- 050357D0

debug: edburns: Currently Viewing: http://random.yahoo.com/bin/ryl
debug: edburns: STATE_REDIRECTING
debug: edburns: STATE_TRANSFERRING
+++++++++++++++++++++ Thread Id ---- 050357D0

debug: edburns: Currently Viewing: http://random.yahoo.com/bin/ryl
debug: edburns: STATE_REDIRECTING
debug: edburns: STATE_TRANSFERRING
Enabling Quirk StyleSheet
debug: edburns: STATE_TRANSFERRING
/dnsmith/Archie/archie.html
debug: edburns: Currently Viewing: http://www.pichamber.com/
Opening file cookperm.txt failed
Enabling Quirk StyleSheet
debug: edburns: STATE_TRANSFERRING
/pimaine.html
Opening file cookperm.txt failed
debug: edburns: STATE_TRANSFERRING
+++++++++++++++++++++ Thread Id ---- 050357D0

debug: edburns: Currently Viewing: http://random.yahoo.com/bin/ryl
WARNING: Never finished decoding the JPEG., file c:\mozilla\mozilla\modules\libpr0n\decoders\jpeg\nsJPEGDecoder.cpp, line 177
debug: edburns: STATE_REDIRECTING
Comment 29 Vladimir Strigun 2001-05-25 01:14:44 PDT
Sorry, on the NT platform Webclient hangs after 5 minutes :)
Comment 30 edburns 2001-07-13 14:10:58 PDT
Still occurs.  This does not occur with WinEmbed so it must be something to 
do with the interaction of Java and Mozilla.
Comment 31 edburns 2001-07-13 14:13:55 PDT
I notice that before the app hangs both the java event queue, in 
AwtToolkit::MessageLoop(), and the mozilla event queue, in 
NativeEventThread_nativeProcessEvents() continually fire.

After the hang, the mozilla event queue doesn't fire at all, and the java event 
queue only fires when the mouse is moved inside the java part of the app.
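
The observation above (a loop that fires continually and then stops) can be turned into an automatic check. The following is a minimal sketch, not code from webclient: the class LoopHeartbeat and its methods are hypothetical. The assumption is that each watched loop, for example the Java loop that calls nativeProcessEvents, would call beat() once per iteration, and that a separate watchdog thread would poll stalledFor() and log the moment the loop goes quiet.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: records when a loop last ran so a stall can be detected. */
final class LoopHeartbeat {
    private final AtomicLong lastBeatNanos = new AtomicLong(System.nanoTime());

    /** Call once per iteration of the event loop being watched. */
    void beat() {
        lastBeatNanos.set(System.nanoTime());
    }

    /** Returns true if no beat has been recorded for more than the given time. */
    boolean stalledFor(long millis) {
        return System.nanoTime() - lastBeatNanos.get() > millis * 1_000_000L;
    }

    public static void main(String[] args) throws InterruptedException {
        LoopHeartbeat mozillaLoop = new LoopHeartbeat();
        mozillaLoop.beat();
        System.out.println("stalled right after a beat (1 s limit): "
                + mozillaLoop.stalledFor(1000));
        Thread.sleep(50); // simulate the loop going quiet
        System.out.println("stalled after 50 ms of silence (10 ms limit): "
                + mozillaLoop.stalledFor(10));
    }
}
```

A timestamp is used rather than a counter so the watchdog needs no state of its own; it only asks how long ago the loop last ran.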
Comment 32 edburns 2001-07-13 14:44:28 PDT
Post freeze thread dump:

Full thread dump Classic VM (1.3.1-rc1-b21, native threads):
    "Thread-1" (TID:0x15c74f0, sys_thread_t:0x86ff90, state:CW, native 
ID:0x137) prio=5
    "Screen Updater" (TID:0x15c8f70, sys_thread_t:0x84ff70, state:CW, native 
ID:0x107) prio=5
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:420)
        at sun.awt.ScreenUpdater.nextEntry(ScreenUpdater.java:76)
        at sun.awt.ScreenUpdater.run(ScreenUpdater.java:95)
    "EventThread-85677680" (TID:0x15bcb40, sys_thread_t:0x83e8c0, state:R, 
native ID:0x125) prio=5
        at 
org.mozilla.webclient.wrapper_native.NativeEventThread.nativeProcessEvents
(Native Method)
        at org.mozilla.webclient.wrapper_native.NativeEventThread.run
(NativeEventThread.java:244)
    "AWT-Windows" (TID:0x15c63c8, sys_thread_t:0x7f00c0, state:R, native 
ID:0x13f) prio=6
        at sun.awt.windows.WToolkit.eventLoop(Native Method)
        at sun.awt.windows.WToolkit.run(WToolkit.java:188)
        at java.lang.Thread.run(Thread.java:484)
    "SunToolkit.PostEventQueue-0" (TID:0x15c62f0, sys_thread_t:0x7f08e0, 
state:CW, native ID:0x138) prio=6
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:420)
        at sun.awt.PostEventQueue.run(SunToolkit.java:491)
    "AWT-EventQueue-0" (TID:0x15c62c8, sys_thread_t:0x7ef1d0, state:CW, native 
ID:0x14d) prio=6
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:420)
        at java.awt.EventQueue.getNextEvent(EventQueue.java:260)
        at java.awt.EventDispatchThread.pumpOneEventForHierarchy
(EventDispatchThread.java:106)
        at java.awt.EventDispatchThread.pumpEventsForHierarchy
(EventDispatchThread.java:98)
        at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:93)
        at java.awt.EventDispatchThread.run(EventDispatchThread.java:85)
    "Finalizer" (TID:0x15a9528, sys_thread_t:0x78b9a0, state:CW, native 
ID:0x14c) prio=8
        at java.lang.Object.wait(Native Method)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:108)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:123)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:162)
    "Reference Handler" (TID:0x15a9300, sys_thread_t:0x789130, state:CW, native 
ID:0x13b) prio=10
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:420)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:110)
    "Signal dispatcher" (TID:0x15a9330, sys_thread_t:0x7894e0, state:R, native 
ID:0x144) prio=5
Monitor Cache Dump:
    sun.awt.ScreenUpdater@15C8F70/1637190: <unowned>
        Waiting to be notified:
            "Screen Updater" (0x84ff70)
    java.lang.ref.ReferenceQueue$Lock@15A9540/15DF358: <unowned>
        Waiting to be notified:
            "Finalizer" (0x78b9a0)
    java.lang.ref.Reference$Lock@15A9310/15DEF30: <unowned>
        Waiting to be notified:
            "Reference Handler" (0x789130)
    sun.awt.PostEventQueue@15C62F0/15E1A00: <unowned>
        Waiting to be notified:
            "SunToolkit.PostEventQueue-0" (0x7f08e0)
    org.mozilla.webclient.wrapper_native.NativeEventThread@15BCB40/1648938: owner "EventThread-85677680" (0x83e8c0) 1 entry
    java.awt.EventQueue@15C6418/1630538: <unowned>
        Waiting to be notified:
            "AWT-EventQueue-0" (0x7ef1d0)
Registered Monitor Dump:
    utf8 hash table: <unowned>
    JNI pinning lock: <unowned>
    JNI global reference lock: <unowned>
    BinClass lock: <unowned>
    Class linking lock: <unowned>
    System class loader lock: <unowned>
    Code rewrite lock: <unowned>
    Heap lock: <unowned>
    Monitor cache lock: owner "Signal dispatcher" (0x7894e0) 1 entry
    Thread queue lock: owner "Signal dispatcher" (0x7894e0) 1 entry
        Waiting to be notified:
            "Thread-1" (0x86ff90)
    Monitor registry: owner "Signal dispatcher" (0x7894e0) 1 entry
Comment 33 edburns 2001-07-13 14:45:59 PDT
Pre freeze thread dump:

Full thread dump Classic VM (1.3.1-rc1-b21, native threads):
    "Thread-1" (TID:0x15c66e8, sys_thread_t:0x86e210, state:CW, native 
ID:0x1ec) prio=5
    "Screen Updater" (TID:0x15c8f70, sys_thread_t:0x84ff70, state:CW, native 
ID:0x13d) prio=5
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:420)
        at sun.awt.ScreenUpdater.nextEntry(ScreenUpdater.java:76)
        at sun.awt.ScreenUpdater.run(ScreenUpdater.java:95)
    "EventThread-85677680" (TID:0x15bcb40, sys_thread_t:0x83e8c0, state:R, 
native ID:0x138) prio=5
        at 
org.mozilla.webclient.wrapper_native.NativeEventThread.nativeProcessEvents
(Native Method)
        at org.mozilla.webclient.wrapper_native.NativeEventThread.run
(NativeEventThread.java:244)
    "AWT-Windows" (TID:0x15c63c8, sys_thread_t:0x7f00c0, state:R, native 
ID:0xf7) prio=6
        at sun.awt.windows.WToolkit.eventLoop(Native Method)
        at sun.awt.windows.WToolkit.run(WToolkit.java:188)
        at java.lang.Thread.run(Thread.java:484)
    "SunToolkit.PostEventQueue-0" (TID:0x15c62f0, sys_thread_t:0x7f08e0, 
state:CW, native ID:0x13b) prio=6
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:420)
        at sun.awt.PostEventQueue.run(SunToolkit.java:491)
    "AWT-EventQueue-0" (TID:0x15c62c8, sys_thread_t:0x7ef1d0, state:CW, native 
ID:0x144) prio=6
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:420)
        at java.awt.EventQueue.getNextEvent(EventQueue.java:260)
        at java.awt.EventDispatchThread.pumpOneEventForHierarchy
(EventDispatchThread.java:106)
        at java.awt.EventDispatchThread.pumpEventsForHierarchy
(EventDispatchThread.java:98)
        at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:93)
        at java.awt.EventDispatchThread.run(EventDispatchThread.java:85)
    "Finalizer" (TID:0x15a9528, sys_thread_t:0x78b9a0, state:CW, native 
ID:0x147) prio=8
        at java.lang.Object.wait(Native Method)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:108)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:123)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:162)
    "Reference Handler" (TID:0x15a9300, sys_thread_t:0x789130, state:CW, native 
ID:0x10c) prio=10
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:420)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:110)
    "Signal dispatcher" (TID:0x15a9330, sys_thread_t:0x7894e0, state:R, native 
ID:0x1e3) prio=5
Monitor Cache Dump:
    sun.awt.ScreenUpdater@15C8F70/1637950: <unowned>
        Waiting to be notified:
            "Screen Updater" (0x84ff70)
    java.lang.ref.ReferenceQueue$Lock@15A9540/15DF3A8: <unowned>
        Waiting to be notified:
            "Finalizer" (0x78b9a0)
    java.lang.ref.Reference$Lock@15A9310/15DEF80: <unowned>
        Waiting to be notified:
            "Reference Handler" (0x789130)
    sun.awt.PostEventQueue@15C62F0/15E1A50: <unowned>
        Waiting to be notified:
            "SunToolkit.PostEventQueue-0" (0x7f08e0)
    org.mozilla.webclient.wrapper_native.NativeEventThread@15BCB40/16499A8: owner "EventThread-85677680" (0x83e8c0) 1 entry
    java.awt.EventQueue@15C6418/1630B18: <unowned>
        Waiting to be notified:
            "AWT-EventQueue-0" (0x7ef1d0)
Registered Monitor Dump:
    utf8 hash table: <unowned>
    JNI pinning lock: <unowned>
    JNI global reference lock: <unowned>
    BinClass lock: <unowned>
    Class linking lock: <unowned>
    System class loader lock: <unowned>
    Code rewrite lock: <unowned>
    Heap lock: <unowned>
    Monitor cache lock: owner "Signal dispatcher" (0x7894e0) 1 entry
    Thread queue lock: owner "Signal dispatcher" (0x7894e0) 1 entry
        Waiting to be notified:
            "Thread-1" (0x86e210)
    Monitor registry: owner "Signal dispatcher" (0x7894e0) 1 entry
Comment 34 edburns 2001-07-13 15:47:44 PDT
I think I have a testcase that will reliably bring on the freeze:

1. go to http://www.ebay.com/ in webclient
2. type something in the search text field and press enter.
Comment 35 edburns 2001-07-13 16:13:37 PDT
After writing some debugging code, I found the last MSG.message processed by 
the mozilla event queue is 0xc138.  What is this message value?  It's always 
the last one before the freeze.
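
One thing the number itself does tell us: Win32 reserves message identifiers 0xC000 through 0xFFFF for messages registered at run time with RegisterWindowMessage, so a value such as 0xC138 is not a fixed WM_ constant and may differ from one session to the next. The sketch below only classifies a message number by the documented ranges; the class name WinMessageRange is hypothetical, and it cannot say which component registered the message or under what name.

```java
/** Hypothetical helper: classifies a Win32 message number by its documented range. */
final class WinMessageRange {
    static final int WM_USER = 0x0400;          // first message private to a window class
    static final int WM_APP = 0x8000;           // first message private to an application
    static final int REGISTERED_FIRST = 0xC000; // first RegisterWindowMessage identifier
    static final int REGISTERED_LAST = 0xFFFF;  // last RegisterWindowMessage identifier

    /** Returns a short description of the range the message number falls in. */
    static String classify(int msg) {
        if (msg < 0 || msg > REGISTERED_LAST) {
            return "reserved by the system";
        }
        if (msg >= REGISTERED_FIRST) {
            return "registered at run time (RegisterWindowMessage)";
        }
        if (msg >= WM_APP) {
            return "application-private (WM_APP range)";
        }
        if (msg >= WM_USER) {
            return "window-class-private (WM_USER range)";
        }
        return "system-defined";
    }

    public static void main(String[] args) {
        System.out.println("0xC138: " + classify(0xC138));
        System.out.println("0x000F: " + classify(0x000F));
    }
}
```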
Comment 36 edburns 2001-07-13 16:15:47 PDT
Solicited help from newsgroup:

From: Ed Burns <ed.burnsREMOVE_THIS@sun.com>
Newsgroups: netscape.public.mozilla.embedding,netscape.public.mozilla.general
Subject: Judson Valeski (or any other embedding guru): Please Help
Date: 13 Jul 2001 13:17:46 -0700
Message-ID: <7cb4rsgeg7p.fsf@sun.com>
Comment 37 edburns 2001-07-16 15:38:04 PDT
I modified prmon.c to print out a message on monitor enter and exit like
this:

  fprintf(msgFile, "Enter Monitor: %p\n", mon);
  fflush(msgFile);

I analyzed the output and found that the monitor with pointer value
0x056BB390 was "Entered" 5153 times and "Exited" 5150 times before the
crash.  Also monitor 0x063FA9C0 "Entered" 11 times, "Exited" 9 times.

Could this be causing deadlock?

Here's the full output of my test data:

The number after the pointer is the number of times that monitor was
entered or exited.

Enter Monitor: 051D2170 37
Exit Monitor: 051D2170 37
Enter Monitor: 051D4420 6
Exit Monitor: 051D4420 6
Enter Monitor: 051D4CF0 117
Exit Monitor: 051D4CF0 117
Enter Monitor: 051D6880 4327
Exit Monitor: 051D6880 4327
Enter Monitor: 052A66C0 14
Exit Monitor: 052A66C0 14
Enter Monitor: 056566B0 3182
Exit Monitor: 056566B0 3182
Enter Monitor: 056BB390 5153
Exit Monitor: 056BB390 5150
Enter Monitor: 056BED70 9
Exit Monitor: 056BED70 9
Enter Monitor: 056D36A0 55
Exit Monitor: 056D36A0 55
Enter Monitor: 06382750 178
Exit Monitor: 06382750 178
Enter Monitor: 063B8D10 2
Exit Monitor: 063B8D10 2
Enter Monitor: 063C1030 7
Exit Monitor: 063C1030 7
Enter Monitor: 063C1420 5
Exit Monitor: 063C1420 5
Enter Monitor: 063C2D50 4
Exit Monitor: 063C2D50 4
Enter Monitor: 063F0E90 16
Exit Monitor: 063F0E90 16
Enter Monitor: 063F1A60 21
Exit Monitor: 063F1A60 21
Enter Monitor: 063FA9C0 11
Exit Monitor: 063FA9C0 9

Comment 38 edburns 2001-07-16 15:39:39 PDT
Here's a google link to the news articles posted about this bug.

http://groups.google.com/groups?as_umsgid=7cb4rsgeg7p.fsf@sun.com
Comment 39 edburns 2001-07-16 16:56:07 PDT
Ed Burns <ed.burnsREMOVE_THIS@sun.com> writes:

> Ed Burns <ed.burnsREMOVE_THIS@sun.com> writes:
> 
> > Grr.  Lamentably, incorporating these calls in my event loop in the same
> > manner as used in winEmbed did not fix the problem.  
> > 
> > This time the last event processed by the msg queue is 0xC16F.  Any ideas?
> 
> I modified prmon.c to print out a message on monitor enter and exit like
> this:
> 
>   fprintf(msgFile, "Enter Monitor: %p\n", mon);
>   fflush(msgFile);

I have refined the test data some more, writing a perl program to take
the monitor exit and enter output data and print out only the cases
where numEnters != numExits.  
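
The perl program itself is not attached here. As an illustration only, the same filtering could look like the sketch below, written in Java. It makes two assumptions: the raw log has one line per event, and an exit is logged as "Exit Monitor: <pointer>", mirroring the "Enter Monitor: %p" line shown above. The class name MonitorTally and the sample pointers are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: reports monitors whose enter and exit counts differ. */
final class MonitorTally {

    /** Counts "Enter Monitor: X" and "Exit Monitor: X" lines per pointer X. */
    static List<String> unbalanced(List<String> logLines) {
        Map<String, int[]> counts = new TreeMap<>(); // pointer -> {enters, exits}
        for (String line : logLines) {
            String[] parts = line.trim().split(":\\s*", 2);
            if (parts.length != 2) {
                continue; // not an "event: pointer" line
            }
            int slot;
            if (parts[0].equals("Enter Monitor")) {
                slot = 0;
            } else if (parts[0].equals("Exit Monitor")) {
                slot = 1;
            } else {
                continue; // unrelated output mixed into the log
            }
            counts.computeIfAbsent(parts[1].trim(), k -> new int[2])[slot]++;
        }
        List<String> report = new ArrayList<>();
        for (Map.Entry<String, int[]> entry : counts.entrySet()) {
            int enters = entry.getValue()[0];
            int exits = entry.getValue()[1];
            if (enters != exits) {
                report.add(entry.getKey() + " enters=" + enters + " exits=" + exits);
            }
        }
        return report;
    }

    public static void main(String[] args) {
        // Made-up sample: monitor 0000000B is entered twice but exited once.
        List<String> sample = Arrays.asList(
                "Enter Monitor: 0000000A",
                "Exit Monitor: 0000000A",
                "Enter Monitor: 0000000B",
                "Enter Monitor: 0000000B",
                "Exit Monitor: 0000000B");
        unbalanced(sample).forEach(System.out::println);
    }
}
```

When reading such a report, note that if the "Enter" message is printed before the monitor is actually acquired, threads that are merely blocked waiting for it also count as enters.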

I ran the program in the debugger until it froze and collected the data,
along with the stack traces on the threads mentioned in the data.  I did
this twice.

Hypothesis: I believe that deadlock is occurring and I have a hunch that
there are some valuable clues in the data below.  Can someone please
look at the stack traces and see if they can spot the deadlock?  Is this
the right forum for this kind of information?  I'm really stuck here.
My MO is to collect enough information for someone who is an expert to
gain insight.
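
To make the hypothesis concrete: a monitor that is entered and not yet exited keeps every other thread that tries to enter it waiting. The sketch below is not Mozilla code and does not reproduce this bug. It uses java.util.concurrent.locks.ReentrantLock as a stand-in for an NSPR PRMonitor, and the class name HeldMonitorDemo is hypothetical. One thread holds the lock; a second thread tries to acquire it with a timeout and fails.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical demo: a lock that stays held blocks every other thread. */
final class HeldMonitorDemo {

    /** Returns true if the calling thread could acquire the lock while another holds it. */
    static boolean secondThreadGetsIn(long waitMillis) throws InterruptedException {
        ReentrantLock monitor = new ReentrantLock();
        CountDownLatch held = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        Thread holder = new Thread(() -> {
            monitor.lock(); // "enter" with no matching "exit" until released
            try {
                held.countDown();
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                monitor.unlock();
            }
        }, "holder");
        holder.start();
        held.await(); // wait until the holder really owns the lock

        boolean acquired = monitor.tryLock(waitMillis, TimeUnit.MILLISECONDS);
        if (acquired) {
            monitor.unlock();
        }
        release.countDown(); // let the holder exit so the demo terminates
        holder.join();
        return acquired;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("second thread acquired the held lock: "
                + secondThreadGetsIn(100));
    }
}
```

In the real freeze there is no timeout: PR_EnterMonitor waits indefinitely, which is consistent with several threads sitting in PR_EnterMonitor on the same pointer.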

CASE 1
------

Enter Monitor: 056BBD70: 229949.
Exit Monitor: 056BBD70: 229945.
Enter Monitor: 08AFFA80: 14.
Exit Monitor: 08AFFA80: 12.

Thread A
========

PR_EnterMonitor(PRMonitor * 0x056bbd70) line 87 + 14 bytes
util_PostEvent(WebShellInitContext * 0x051d53d0, PLEvent * 0x08ae7404) line 49 
+ 21 bytes
Java_org_mozilla_webclient_wrapper_1native_NavigationImpl_nativeStop(JNIEnv_ * 
0x0086faa0, _jobject * 0x070dfebc, long 85808080) line 295 + 13 bytes

Thread B
========

PR_EnterMonitor(PRMonitor * 0x08affa80) line 87 + 14 bytes
nsAutoMonitor::nsAutoMonitor(PRMonitor * 0x08affa80) line 184 + 13 bytes
nsSocketTransport::Dispatch(nsSocketRequest * 0x08aff790) line 1288
nsSocketRequest::Cancel(nsSocketRequest * const 0x08aff790, unsigned int 
2152398850) line 2527
nsHttpConnection::OnTransactionComplete(unsigned int 2152398850) line 247
nsHttpTransaction::Cancel(nsHttpTransaction * const 0x08afadd0, unsigned int 
2152398850) line 598
nsHttpChannel::Cancel(nsHttpChannel * const 0x08afccb0, unsigned int 
2152398850) line 1563
nsLoadGroup::Cancel(nsLoadGroup * const 0x063765b0, unsigned int 2152398850) 
line 239 + 16 bytes
nsDocLoaderImpl::Stop(nsDocLoaderImpl * const 0x06376620) line 278 + 31 bytes
nsURILoader::Stop(nsURILoader * const 0x06376e40, nsISupports * 0x06376638) 
line 536 + 23 bytes
nsDocShell::Stop(nsDocShell * const 0x06376e90) line 2211
wsStopEvent::handleEvent() line 355 + 18 bytes
handleEvent(PLEvent * 0x08aff844) line 48 + 11 bytes
PL_HandleEvent(PLEvent * 0x08aff844) line 590 + 10 bytes
processEventLoop(WebShellInitContext * 0x051d53d0) line 439 + 9 bytes
Java_org_mozilla_webclient_wrapper_1native_NativeEventThread_nativeProcessEvents
(JNIEnv_ * 0x0083e840, _jobject * 0x0544febc, long 85808080) line 242 + 9 bytes

Thread C
========

PR_EnterMonitor(PRMonitor * 0x056bbd70) line 87 + 14 bytes
PL_PostEvent(PLEventQueue * 0x056bb740, PLEvent * 0x051d5eac) line 251 + 10 
bytes
nsEventQueueImpl::PostEvent(nsEventQueueImpl * const 0x056bb1a0, PLEvent * 
0x051d5eac) line 251 + 16 bytes
nsMemoryImpl::FlushMemory(const unsigned short * 0x05141db4, int 0) line 432 + 
30 bytes
MemoryFlusher::Run(MemoryFlusher * const 0x051d6d60) line 177 + 43 bytes
nsThread::Main(void * 0x051d6bb0) line 105 + 26 bytes
_PR_NativeRunThread(void * 0x051d6990) line 399 + 13 bytes
_threadstartex(void * 0x051d67e0) line 212 + 13 bytes

Thread D
========

PR_EnterMonitor(PRMonitor * 0x056bbd70) line 87 + 14 bytes
PL_PostEvent(PLEventQueue * 0x056bb740, PLEvent * 0x08aff324) line 251 + 10 
bytes
nsEventQueueImpl::PostEvent(nsEventQueueImpl * const 0x056bb1a0, PLEvent * 
0x08aff324) line 251 + 16 bytes
nsRequestObserverProxy::FireEvent(nsARequestObserverEvent * 0x08aff320) line 
244 + 35 bytes
nsRequestObserverProxy::OnStartRequest(nsRequestObserverProxy * const 
0x08afb630, nsIRequest * 0x08afadd0, nsISupports * 0x00000000) line 185 + 12 
bytes
nsStreamListenerProxy::OnStartRequest(nsStreamListenerProxy * const 0x08afcee0, 
nsIRequest * 0x08afadd0, nsISupports * 0x00000000) line 224
nsHttpTransaction::HandleContent(char * 0x07a90b88, unsigned int 0, unsigned 
int * 0x05cbfdc8) line 466 + 41 bytes
nsHttpTransaction::Read(nsHttpTransaction * const 0x08afadd4, char * 
0x07a90b88, unsigned int 0, unsigned int * 0x05cbfdc8) line 709 + 23 bytes
nsReadFromInputStream(nsIOutputStream * 0x08afb5c4, void * 0x08afadd4, char * 
0x07a90b88, unsigned int 0, unsigned int 4096, unsigned int * 0x05cbfdc8) line 
831
nsPipe::nsPipeOutputStream::WriteSegments(nsPipe::nsPipeOutputStream * const 
0x08afb5c4, unsigned int (nsIOutputStream *, void *, char *, unsigned int, 
unsigned int, unsigned int *)* 0x050b5530 nsReadFromInputStream(nsIOutputStream 
*, void *, char *, unsigned int, unsigned int, unsigned int *), void * 
0x08afadd4, unsigned int 16384, unsigned int * 0x05cbfe5c) line 704 + 29 bytes
nsPipe::nsPipeOutputStream::WriteFrom(nsPipe::nsPipeOutputStream * const 
0x08afb5c4, nsIInputStream * 0x08afadd4, unsigned int 16384, unsigned int * 
0x05cbfe5c) line 839
nsStreamListenerProxy::OnDataAvailable(nsStreamListenerProxy * const 
0x08afcee0, nsIRequest * 0x08afadd0, nsISupports * 0x00000000, nsIInputStream * 
0x08afadd4, unsigned int 0, unsigned int 16384) line 283 + 38 bytes
nsHttpTransaction::OnDataReadable(nsIInputStream * 0x08afc3b0) line 214 + 72 
bytes
nsHttpConnection::OnDataAvailable(nsHttpConnection * const 0x08afaa10, 
nsIRequest * 0x08aff790, nsISupports * 0x00000000, nsIInputStream * 0x08afc3b0, 
unsigned int 0, unsigned int 8192) line 631 + 15 bytes
nsSocketReadRequest::OnRead() line 2670 + 57 bytes
nsSocketTransport::doReadWrite(short 1) line 991 + 14 bytes
nsSocketTransport::Process(short 1) line 477 + 13 bytes
nsSocketTransportService::Run(nsSocketTransportService * const 0x056b9fb4) line 
419 + 13 bytes
nsThread::Main(void * 0x056bd950) line 105 + 26 bytes
_PR_NativeRunThread(void * 0x056bd730) line 399 + 13 bytes
_threadstartex(void * 0x056bd580) line 212 + 13 bytes

---------------------------------------------------------------------------

CASE 2
------

Enter Monitor: 056BBD70: 5828.
Exit Monitor: 056BBD70: 5825.
Enter Monitor: 063EA6B0: 6.
Exit Monitor: 063EA6B0: 3.

Thread A
========

PR_EnterMonitor(PRMonitor * 0x056bbd70) line 87 + 14 bytes
util_PostEvent(WebShellInitContext * 0x051d53d0, PLEvent * 0x063ebe64) line 49 
+ 21 bytes
Java_org_mozilla_webclient_wrapper_1native_NavigationImpl_nativeStop(JNIEnv_ * 
0x0086f4c0, _jobject * 0x0705fe98, long 85808080) line 295 + 13 bytes

Thread B
========

PR_EnterMonitor(PRMonitor * 0x056bbd70) line 87 + 14 bytes
PL_PostEvent(PLEventQueue * 0x056bb740, PLEvent * 0x063ea180) line 251 + 10 
bytes
nsEventQueueImpl::PostEvent(nsEventQueueImpl * const 0x056bb1a0, PLEvent * 
0x063ea180) line 251 + 16 bytes
nsProxyObject::Post(unsigned int 4, nsXPTMethodInfo * 0x06dc2d6c, 
nsXPTCMiniVariant * 0x05cbfd4c, nsIInterfaceInfo * 0x05296c10) line 470
nsProxyEventObject::CallMethod(nsProxyEventObject * const 0x063ea940, unsigned 
short 4, const nsXPTMethodInfo * 0x06dc2d6c, nsXPTCMiniVariant * 0x05cbfd4c) 
line 463 + 52 bytes
PrepareAndDispatch(nsXPTCStubBase * 0x063ea940, unsigned int 4, unsigned int * 
0x05cbfdfc, unsigned int * 0x05cbfdec) line 100 + 31 bytes
SharedStub() line 124
nsHttpConnection::OnStatus(nsHttpConnection * const 0x063eaf48, nsIRequest * 
0x063ea470, nsISupports * 0x063eaf40, unsigned int 2152398851, const unsigned 
short * 0x05cbfe50) line 666
nsSocketTransport::OnStatus(nsSocketRequest * 0x063ea470, nsISupports * 
0x063eaf40, unsigned int 2152398851) line 1772 + 63 bytes
nsSocketTransport::OnStatus(unsigned int 2152398851) line 1787
nsSocketTransport::Process(short 0) line 462
nsSocketTransportService::ProcessWorkQ() line 243 + 10 bytes
nsSocketTransportService::Run(nsSocketTransportService * const 0x056b9fb4) line 
446 + 11 bytes
nsThread::Main(void * 0x056bd950) line 105 + 26 bytes
_PR_NativeRunThread(void * 0x056bd730) line 399 + 13 bytes
_threadstartex(void * 0x056bd580) line 212 + 13 bytes

Thread C
========

PR_EnterMonitor(PRMonitor * 0x063ea6b0) line 87 + 14 bytes
nsAutoMonitor::nsAutoMonitor(PRMonitor * 0x063ea6b0) line 184 + 13 bytes
nsSocketTransport::OnFound(nsSocketTransport * const 0x063ea804, nsISupports * 
0x00000000, const char * 0x063ea350, nsHostEnt * 0x06db8e74) line 1337
nsDNSRequest::FireStop(unsigned int 0) line 271 + 62 bytes
nsDNSLookup::CompleteLookup(unsigned int 0) line 702 + 18 bytes
nsDNSService::ProcessLookup(HWND__ * 0x005502d8, unsigned int 1024, unsigned 
int 1, long 64) line 849 + 22 bytes
nsDNSEventProc(HWND__ * 0x005502d8, unsigned int 1024, unsigned int 1, long 64) 
line 869 + 27 bytes

Thread D
========

PR_EnterMonitor(PRMonitor * 0x063ea6b0) line 87 + 14 bytes
nsAutoMonitor::nsAutoMonitor(PRMonitor * 0x063ea6b0) line 184 + 13 bytes
nsSocketTransport::AsyncRead(nsSocketTransport * const 0x063ea800, 
nsIStreamListener * 0x063eaf40, nsISupports * 0x00000000, unsigned int 0, 
unsigned int 4294967295, unsigned int 3, nsIRequest * * 0x063eaf60) line 1420
nsHttpConnection::ActivateConnection() line 382 + 65 bytes
nsHttpConnection::SetTransaction(nsHttpTransaction * 0x063e9300) line 154 + 8 
bytes
nsHttpHandler::InitiateTransaction(nsHttpTransaction * 0x063e9300, 
nsHttpConnectionInfo * 0x063e7110, int 0) line 387 + 12 bytes
nsHttpChannel::Connect(int 1) line 242
nsHttpChannel::AsyncOpen(nsHttpChannel * const 0x063e71c0, nsIStreamListener * 
0x063e8d80, nsISupports * 0x00000000) line 1802 + 10 bytes
nsDocumentOpenInfo::Open(nsIChannel * 0x063e71c0, int 0, nsISupports * 
0x06376e80) line 184 + 18 bytes
nsURILoader::OpenURIVia(nsURILoader * const 0x06376e40, nsIChannel * 
0x063e71c0, int 0, nsISupports * 0x06376e80, unsigned int 0) line 521 + 20 bytes
nsURILoader::OpenURI(nsURILoader * const 0x06376e40, nsIChannel * 0x063e71c0, 
int 0, nsISupports * 0x06376e80) line 483
nsDocShell::DoChannelLoad(nsIChannel * 0x063e71c0, int 0, nsIURILoader * 
0x06376e40) line 4667 + 24 bytes
nsDocShell::DoURILoad(nsIURI * 0x063e5ba0, nsIURI * 0x00000000, nsISupports * 
0x00000000, int 0, nsIInputStream * 0x00000000, nsIInputStream * 0x00000000) 
line 4456 + 36 bytes
nsDocShell::InternalLoad(nsDocShell * const 0x06376e80, nsIURI * 0x063e5ba0, 
nsIURI * 0x00000000, nsISupports * 0x00000000, int 1, int 0, const unsigned 
short * 0x0544fab4, nsIInputStream * 0x00000000, nsIInputStream * 0x00000000, 
unsigned int 1, nsISHEntry * 0x00000000) line 4275 + 43 bytes
nsDocShell::LoadURI(nsDocShell * const 0x06376e80, nsIURI * 0x063e5ba0, 
nsIDocShellLoadInfo * 0x00000000, unsigned int 0) line 559 + 72 bytes
nsDocShell::LoadURI(nsDocShell * const 0x06376e90, const unsigned short * 
0x063e1d60, unsigned int 0) line 2161 + 31 bytes
wsLoadURLEvent::handleEvent() line 70 + 33 bytes
handleEvent(PLEvent * 0x063e1f84) line 48 + 11 bytes
PL_HandleEvent(PLEvent * 0x063e1f84) line 590 + 10 bytes
processEventLoop(WebShellInitContext * 0x051d53d0) line 439 + 9 bytes
Java_org_mozilla_webclient_wrapper_1native_NativeEventThread_nativeProcessEvents
(JNIEnv_ * 0x0083e840, _jobject * 0x0544febc, long 85808080) line 242 + 9 bytes

Comment 40 edburns 2001-07-19 13:14:23 PDT
For the record, here is a little audit trail of the email:

On 16 July 17:52:38, Judson Valeski wrote:
> There's definitely a lock/monitor imbalance here. I'm at a complete 
> loss as to what the cause could be though (esp. now that you're "doing 
> embedding idle stuff").

On 17 July 09:47:53, Judson Valeski wrote:
> Rick just landed (on the trunk) a re-write of the proxy object code to 
> prevent some crashing. I haven't looked at the code, but it's quite 
> possible that monitors/locks were re-worked in the process and may have 
> a positive impact here.
> 
> The socket transport thread shows up here too (no surprise).... could be 
> some bad interaction there.

On 17 July 10:05:01, Judson Valeski wrote:
> You're clearly banging on this hard :-/. I'm going to be out there next 
> week for a few days, but we're in the middle of shipping a few products 
> (NS6, and a couple embedding products). On top of that I'll be giving a 
> presentation in San Diego on embedding on Wed of next week. In short, 
> I'm totally slammed for the foreseeable future.

On 17 July 14:13:05, Rick Potts wrote:
> hey ed,
> 
> I looked at your thread stack-traces and here's my "wild ass guess" as 
> to what's happening :-)
> 
> It appears that the basic deadlock is between the UI thread and the 
> socket transport thread in a classic A-B-B-A deadlock.  The two monitors 
> appear to be the PLEventQ monitor and the socket transport monitor.
> 
> In order for this to happen, a couple of things must be true:
> 
>    1. Your version of nsSocketTransport.cpp is older than rev 2.206,
>       since in rev 2.206 darin added a patch to release the
>       sockettransport monitor *before* calling OnRead(...)
>    2. The function processEventLoop(...) must be holding on to the
>       PLEventQ monitor when PL_HandleEvent(...) is called.
> 
> It appears that the second case exposes another potential deadlock in 
> the nsSocketTransport, because the socketTransport lock is *not* 
> released before OnStatus(...) is called.  I think that it probably 
> should be...
> 
> ed, do these ramblings make any sense to you?
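
To make the A-B-B-A pattern Rick describes concrete, here is a minimal sketch of a pairwise lock-order check. It is illustrative only: LockOrderChecker and the monitor names "eventq" and "socket" are hypothetical and not part of NSPR or webclient, and the two acquisition orders are taken from Rick's guess above, not from instrumented code.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: remembers the order in which monitors are taken
// and flags an order that contradicts one recorded earlier (A-B then B-A).
// It only checks pairs; it does not look for longer cycles.
class LockOrderChecker {
public:
    // `order` lists monitor names in the order a single thread acquires them.
    // Returns true if any pair is taken in the reverse of a recorded order.
    bool recordAndCheck(const std::vector<std::string>& order) {
        bool inversion = false;
        for (std::size_t i = 0; i < order.size(); ++i) {
            for (std::size_t j = i + 1; j < order.size(); ++j) {
                // Was order[j] ever held while order[i] was being acquired?
                if (before_[order[j]].count(order[i]) != 0) {
                    inversion = true;
                }
                before_[order[i]].insert(order[j]);
            }
        }
        return inversion;
    }

private:
    // before_[a] contains b if a was held when b was acquired.
    std::map<std::string, std::set<std::string>> before_;
};
```

Feeding it the UI thread's order (event queue monitor, then socket transport monitor) and then the socket transport thread's order (socket transport monitor, then event queue monitor) reports an inversion on the second call. Either thread dropping its first monitor before taking the second removes the conflicting order.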

On 18 July 15:46:39, Rick Potts wrote:
> hey ed,
> 
> so it looks like you are still deadlocking because your function 
> processEventLoop(...) is holding onto the eventQ monitor when 
> PLEvent->HandleEvent(...) is called.
> 
> I'm assuming that processEventLoop(...) is your function :-)  It looks 
> like it should be very similar to PL_ProcessPendingEvents(...) except 
> that there, the monitor is released before calling HandleEvent(...)
> 
> We definitely need to fix the corresponding problem on our side - that 
> we are calling out of the sockettransport while we are holding onto the 
> socket transport lock...
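
The shape of the change Rick is suggesting, releasing the queue monitor before the handler runs, can be sketched roughly as follows. This is not the webclient or NSPR code: EventQueue, post and processPending are hypothetical names, and std::mutex stands in for the PLEventQ monitor.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical stand-in for an event queue guarded by a monitor.
class EventQueue {
public:
    // Queue a handler; the monitor is held only while the deque is touched.
    void post(std::function<void()> handler) {
        std::lock_guard<std::mutex> guard(monitor_);
        events_.push_back(std::move(handler));
    }

    // Drain the queue and return the number of events handled.
    int processPending() {
        int handled = 0;
        while (true) {
            std::function<void()> handler;
            {
                std::lock_guard<std::mutex> guard(monitor_);
                if (events_.empty()) {
                    return handled;
                }
                handler = std::move(events_.front());
                events_.pop_front();
            }  // monitor released here, before the handler is called

            // The handler runs without the queue monitor held, so it can
            // take other locks or post new events without deadlocking on it.
            handler();
            ++handled;
        }
    }

private:
    std::mutex monitor_;
    std::deque<std::function<void()>> events_;
};
```

With this shape, a handler that ends up needing another lock (the socket transport monitor in the traces above) never holds the queue monitor while waiting for it, so a thread blocked in post() can always make progress.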
Comment 41 edburns 2001-07-19 13:23:31 PDT
Rick suggested a modification to NativeEventThread.cpp::processEventLoop(), 
which I made to webclient.  I also updated to today's trunk on win32.  
After doing this, my WCRandom app, which reloads 
<http://random.yahoo.com/bin/ryl> every ten seconds, ran for a half hour, 
eventually crashing due to a memory problem.  It didn't hang.  

I'm posting the log data to this bug.

Then I'll try making Rick's modification in webclient, with the 0.9.1 build, in 
which Netscape 6.1 beta for Solaris will ship, and with which webclient will 
ship.
Comment 42 edburns 2001-07-19 13:24:31 PDT
Created attachment 42899 [details]
log of URL loads demonstrating that it ran for a half hour before crashing.
Comment 43 edburns 2001-07-19 15:51:48 PDT
Tried Rick's fix with 0.9.1 and it prevents the freeze.  This time it ran for 
45 minutes then crashed in some JavaScript code, not related to webclient.

Marking FIXED.  Woohoo!
Comment 44 edburns 2001-08-24 12:04:32 PDT
*** Bug 96826 has been marked as a duplicate of this bug. ***
Comment 45 Vladimir Strigun 2001-10-01 07:21:54 PDT
Verified with Webclient under Linux and mozilla 0.9.3
I loaded about 100 URLs in 1 hour and webclient did not freeze.
Comment 46 Alexei V. Mokeev 2001-10-05 08:27:40 PDT
Mark VERIFIED according to Vladimir's comment.
