Closed Bug 69285 Opened 24 years ago Closed 23 years ago

Conn: Mozilla is unresponsive after sleep

Categories

(Core :: Networking, defect, P2)

PowerPC
Mac System 9.x
defect

Tracking

()

VERIFIED FIXED
mozilla0.9.1

People

(Reporter: mozilla.org, Assigned: gordon)

References

Details

(Keywords: hang, testcase)

Attachments

(7 files)

With Mozilla 0.8 2001021502 on Mac OS 9.1 with virtual memory turned off,
Mozilla fails to respond after sleep. I have to "force quit" and restart Mozilla.
xpapps? nspr? mac-helpwanted
Assignee: asa → vishy
Component: Browser-General → XP Apps
Keywords: hang
QA Contact: doronr → sairuh
This might be related to bug 60204 "App hangs on wakeup after visiting any
secure page." This problem occurs for me even if I have not visitied any secure
sites, but I store my mail passwords in PSM, so that might be triggering the
problem.
Mozilla seems to sleep/wake fine if I haven't checked my mail.
mac, pchen.  marking new
Assignee: vishy → pchen
As an aside, I never use Mozilla for mail, and it still hangs.

There was a bug with networking problems after sleep (bug 49990) but I don't
think that ever caused the app to hang. I couldn't reproduce this (G4, OS 9.1,
2001030508)
I still have this problem with 2001030508 and the setup described earlier.

This might just be a duplicate of bug 60204 (possibly related to Password
Manager being activated).
*** Bug 72904 has been marked as a duplicate of this bug. ***
Marking NEW based on duplicate.
Status: UNCONFIRMED → NEW
Ever confirmed: true
I'm not sure what's happening here. For me, Mozilla remains responsive, but I 
can't load any pages after the machine goes to sleep. We never get down into 
NSPR, so I'm not sure what's going on.
It would be interesting to know if nsDNSService is calling 
OTInetStringToAddress(), and if so, does the lookup ever complete.  Setting break 
points in nsDNSLookup::InitiateLookup() on the line that calls 
OTInetStringToAddress() would be good, as well as somewhere in 
nsDNSService::Run(), which will collect the results unless there is a problem 
with the notifier routine.
Over to gordon.  This seems to be a problem with the socket transport thread 
never getting woken up after sleep. I'll attach a patch that adds an asssertion 
that fires for the first networking after sleep.
Assignee: pchen → gordon
Component: XP Apps → Networking
So it appears that sleep closes our local sockets that we use for pollable 
events. So pollable events break after sleep. I'm seeing a -3155 error on the 
first OTSnd() call on a reload after sleep, which should successfully wake the 
polling thread.
Verified that the local sockets get closed; the endpoint for each gets a 
'kOTProviderWillClose' notification as the machine goes to sleep. (I thus stand 
by my filing of bug 67892.)

I note also that PSM uses local sockets, and is also hosed when the machine goes 
to sleep (bug 58798).
client builds have just switched to PSM 2.0, so this bug should go away now.
QA Contact: sairuh → tever
Taking
Assignee: gordon → sfraser
0.9
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla0.9
That patch is an imlementatin of pollable events for Mac only that does not use 
local sockets, but just a simple struct that holds a boolean value, and a pointer 
to the polling thread that needs waking up.

I'd rather not have an explicit reference to the polling thread. Any suggestions?

Note: I tested browser, mail, and PSM 1 with these changes, and everything 
worked.
Priority: -- → P2
Why do we need to call WakeUpNotifiedThread() in PR_SetPollableEvent? 
PR_WaitForPollableEvent() doesn't block. Is this to ensure that a thread that is 
truly blocked on I/O will wakeup? I thought the whole point of PR_Poll() is that 
threads using it don't ever block.
I applied the patch to my tree, and voila, my PowerBook can now browse after 
sleep. Brilliant! r=beard
PR_SetPollableEvent() has to wake up the polling thread because, well, that's 
what pollable events are for. The local sockets version of PR_SetPollableEvent() 
woke up the polling thread by writing to the socket, with the result that the OT 
notifier got pinged, which woke the polling thread. This flag version has to 
emulate that behaviour somehow, but I don't like the specificity of the current 
implementation. I don't want to check this version in.
Boink.
Target Milestone: mozilla0.9 → mozilla0.9.1
*** Bug 58798 has been marked as a duplicate of this bug. ***
gordon, do you have cycles to take this on?
Assignee: sfraser → gordon
Status: ASSIGNED → NEW
I've got more 0.9.1 bugs than you do :-), but now that I have a titanium 
powerbook, I'd really like to have this fixed.  I'm not sure how much time I'll 
have to look at this.  I might have a better idea next week.

Would you like the bug back, or is there someone else we can get to work on it?
this is a beta stopper. all macs default to sleeping after 30 minutes.

sfraser, what's your confidence of the attached patch?
The patch works, but it makes assumptions about the caller's use of NSPR 
functions, that are not necessarily valid for users other than Mozilla.
*** Bug 82250 has been marked as a duplicate of this bug. ***
mass move, v2.
qa to me.
QA Contact: tever → benc
Attached patch Update patch.Splinter Review
Okay, here's the plan:

I'd like everyone with a Mac build to apply the patch and run with it for a day 
or so.  If we don't find problems with it, I'll check it in for 0.9.1.

After talking with Wan-Teh, Simon and I expect the patch to work correctly given 
the way pollable events are used in mozilla.  For 0.9.2, I'll rework it to (more) 
fully support the semantics of pollable events.

Please add a line to this bug describing your experience (positive or negative).
Thanks.
I get what sem to be an endless number of assertions on start-up after applying
the patch:

Assertion failure: NULL != bottomFD, at macsockotpt.c:1776
the new patch seems to work. my powerbook can wake up and go places (yay!). 
trying to log into amazon.com gives "unable to write to pipe" assertions in the 
unknown content type handler and locks up the product forever. someone else 
should verify https, my tree is a bit old.

i'm going to run an opt build through jrgm's tests to verify performance.
AFter applying the latest patch:

1) I visited https://www.verisig.com/
2) Put the PowerBook to sleep and woke it up after a few minutes.
3) Then I went to some internal test sites as well as https://www.thawte.com/

SSL seems to work.  I update my tree this morning.
houston, we have a problem. 

I ran an opt built of hyatt's style branch, for which i'd already collected 
timing data, with this new patch (4th attachment) and here are the results:

before: 2045 ms
after:  3842 ms  (!!)

almost twice as slow. :( though I haven't timed it, it feels slower to start up, 
and it locks up the ui more while loading pages.
I applied the patch sent by e-mail last night.
On Quit, I crashed (possibly unrelated?).  Here is a partial stack crawl:
  PR_Lock
  DoneWaitingOnThisThread+00030
  WakeUpNotifiedThread+00068
  PR_SetPollableEvent+00058
  nsSocketTransportService::AddToWorkQ
  nsSocketTransport::Dispatch
  nsSocketRequest::Cancel
  nsFtpControlConnection::Disconnect
  nsFtpState::KillControlConnection
  nsFtpState::StopProcessing
  nsFtpState::OnStopRequest
  nsFtpControlConnetion::OnStopRequest
  nsOnStopRequestEvent::HandleEvent
  ...
I backed out the patch and rebuilt nspr. I don't crash anymore after going to 
ftp://ftp.mozilla.org/
An if() test bungle in the previous patch missed a case that caused poor 
performance; that's fixed. We also worked around an FTP bug that causes the crash 
that brade saw.
In mozilla/nsprpub/pr/src/io/prpolevt.c, do you need to
include "primpl.h"?

If you don't include "primpl.h", you can just name the structure
below as PRFilePrivate and then you don't need to cast fd->secret
or event->secret to PRPollableDesc*.

+/*
+ * PRFilePrivate structure for the NSPR pollable events layer
+ */
+typedef struct PRPollableDesc {
+    PRBool      gotEvent;
+    PRThread    *pollingThread;
+} PRPollableDesc;
The #include of primpl was necessary to be able to get inside PRThread*, for the 
assert and test if state == _PR_DEAD_STATE. This code can go away when a necko 
bug is fixed.
patch 5 shows no slowdowns for me. can't test sleep on my desktop, and my build
doesn't have psm, so i can't test that either.
r=wtc, with the following comments.

1. If you include "primpl.h", the following should be deleted.

+/*
+ * These internal functions are declared in primpl.h,
+ * but we can't include primpl.h because the definition
+ * of struct PRFilePrivate in this file (for the pollable
+ * event layer) will conflict with the definition of
+ * struct PRFilePrivate in primpl.h (for the NSPR layer).
+ */
+extern PRIntn _PR_InvalidInt(void);
+extern PRInt64 _PR_InvalidInt64(void);
+extern PRStatus _PR_InvalidStatus(void);
+extern PRFileDesc *_PR_InvalidDesc(void);

2. In _pr_MacPolEvtPoll(), *out_flags is not initialized
   (although the current NSPR implementation does initialize
   *out_flags).  So it is safer to say
       *out_flags = PR_POLL_READ;
   as opposed to
       *out_flags |= PR_POLL_READ;

   That whole thing should read:
+    if ((in_flags & PR_POLL_READ) && pollDesc->gotEvent)
+        *out_flags = PR_POLL_READ;
+    else
+        *out_flags = 0;

3. We say "FIXME" rather than "Danger Will Robinson!" in NSPR
   source code. ;-)
Issue #1 with latest patch (5/24):

in prpolevt.c, can we add the following line:
#pragma unused(in_flags)

approximately to line 200 to get rid of the compiler warning?


Issue #2:
I get an assertion on exit after going to an ftp site (which is much better than 
crashing).
  Assertion failure: pollDesc->pollingThread->state != _PR_DEAD_STATE, at 
prpolevt.c:284

Simon--is this caused by the Necko bug?  Is there a bug # for that bug?
brade@netscape.com wrote:
> Issue #1 with latest patch (5/24):
> 
> in prpolevt.c, can we add the following line:
> #pragma unused(in_flags)
>
> approximately to line 200 to get rid of the compiler warning?

This will be unnecessary with my suggested change, which does
use the in_flags argument.
brade: the assertion indicates that bug 65273 needs to be fixed (FTP is trying to 
set a pollable event on a thread that has termindated already when shutting 
down).
Simon--thanks for the pointer to the bug.

This latest patch is working great for me; I don't seem to have problems when I 
turn off and back on my ISDN line anymore.
Gordon,

Why did you comment out my suggested change?

+//    if ((in_flags & PR_POLL_READ) && pollDesc->gotEvent)
+    if (pollDesc->gotEvent)
a=selmer when reviews all in.
Latest works for me. r=beard
The patch has been checked in.  Marking FIXED.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
thanks a lot!  works for me now.  removing myself from the CC: list.
REOPEN:
I have tried this several times in MacOS 9 on my G4 and it does not work
(Mozilla 0.9)

I'll try a daily commercial build tomorrow...
Status: RESOLVED → REOPENED
Keywords: qawanted
Resolution: FIXED → ---
Ben - this fix didn't go in for mozilla0.9.  Did you mean to say mozilla0.9.1?
duh, this wasn't fixed in 0.9
Status: REOPENED → RESOLVED
Closed: 23 years ago23 years ago
Resolution: --- → FIXED
I think I mean to say that "I'm not getting enough sleep..." I'll look at a
newer build.
Summary: Mozilla is unresponsive after sleep → Conn: Mozilla is unresponsive after sleep
VERIFIED: Mozilla 0.9.5
+testcase, - other keywords
Mac OS X build in 10.0.2
Mac OS build in classic mode
MacOS 9.1.

I've put some tests into the OS specific section of the general connectivity test.

Note: VM was on in all cases.
Status: RESOLVED → VERIFIED
Keywords: qawantedtestcase
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: