Conn: Mozilla is unresponsive after sleep

VERIFIED FIXED in mozilla0.9.1

Status

()

defect
P2
critical
VERIFIED FIXED
19 years ago
18 years ago

People

(Reporter: mozilla.org, Assigned: gordon)

Tracking

({hang, testcase})

Trunk
mozilla0.9.1
PowerPC
Mac System 9.x
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(7 attachments)

Reporter

Description

19 years ago
With Mozilla 0.8 2001021502 on Mac OS 9.1 with virtual memory turned off,
Mozilla fails to respond after sleep. I have to "force quit" and restart Mozilla.

Comment 1

19 years ago
xpapps? nspr? mac-helpwanted
Assignee: asa → vishy
Component: Browser-General → XP Apps
Keywords: hang
QA Contact: doronr → sairuh
Reporter

Comment 2

19 years ago
This might be related to bug 60204 "App hangs on wakeup after visiting any
secure page." This problem occurs for me even if I have not visitied any secure
sites, but I store my mail passwords in PSM, so that might be triggering the
problem.
Reporter

Comment 3

19 years ago
Mozilla seems to sleep/wake fine if I haven't checked my mail.

Comment 4

19 years ago
mac, pchen.  marking new
Assignee: vishy → pchen

Comment 5

19 years ago
As an aside, I never use Mozilla for mail, and it still hangs.

Comment 6

19 years ago
There was a bug with networking problems after sleep (bug 49990) but I don't
think that ever caused the app to hang. I couldn't reproduce this (G4, OS 9.1,
2001030508)
Reporter

Comment 7

19 years ago
I still have this problem with 2001030508 and the setup described earlier.

This might just be a duplicate of bug 60204 (possibly related to Password
Manager being activated).

Comment 8

19 years ago
*** Bug 72904 has been marked as a duplicate of this bug. ***

Comment 9

19 years ago
Marking NEW based on duplicate.
Status: UNCONFIRMED → NEW
Ever confirmed: true
I'm not sure what's happening here. For me, Mozilla remains responsive, but I 
can't load any pages after the machine goes to sleep. We never get down into 
NSPR, so I'm not sure what's going on.
Assignee

Comment 11

18 years ago
It would be interesting to know if nsDNSService is calling 
OTInetStringToAddress(), and if so, does the lookup ever complete.  Setting break 
points in nsDNSLookup::InitiateLookup() on the line that calls 
OTInetStringToAddress() would be good, as well as somewhere in 
nsDNSService::Run(), which will collect the results unless there is a problem 
with the notifier routine.
Over to gordon.  This seems to be a problem with the socket transport thread 
never getting woken up after sleep. I'll attach a patch that adds an asssertion 
that fires for the first networking after sleep.
Assignee: pchen → gordon
Component: XP Apps → Networking
So it appears that sleep closes our local sockets that we use for pollable 
events. So pollable events break after sleep. I'm seeing a -3155 error on the 
first OTSnd() call on a reload after sleep, which should successfully wake the 
polling thread.
Verified that the local sockets get closed; the endpoint for each gets a 
'kOTProviderWillClose' notification as the machine goes to sleep. (I thus stand 
by my filing of bug 67892.)

I note also that PSM uses local sockets, and is also hosed when the machine goes 
to sleep (bug 58798).

Comment 16

18 years ago
client builds have just switched to PSM 2.0, so this bug should go away now.
QA Contact: sairuh → tever
Taking
Assignee: gordon → sfraser
0.9
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla0.9
That patch is an imlementatin of pollable events for Mac only that does not use 
local sockets, but just a simple struct that holds a boolean value, and a pointer 
to the polling thread that needs waking up.

I'd rather not have an explicit reference to the polling thread. Any suggestions?

Note: I tested browser, mail, and PSM 1 with these changes, and everything 
worked.
Priority: -- → P2

Comment 22

18 years ago
Why do we need to call WakeUpNotifiedThread() in PR_SetPollableEvent? 
PR_WaitForPollableEvent() doesn't block. Is this to ensure that a thread that is 
truly blocked on I/O will wakeup? I thought the whole point of PR_Poll() is that 
threads using it don't ever block.

Comment 23

18 years ago
I applied the patch to my tree, and voila, my PowerBook can now browse after 
sleep. Brilliant! r=beard
PR_SetPollableEvent() has to wake up the polling thread because, well, that's 
what pollable events are for. The local sockets version of PR_SetPollableEvent() 
woke up the polling thread by writing to the socket, with the result that the OT 
notifier got pinged, which woke the polling thread. This flag version has to 
emulate that behaviour somehow, but I don't like the specificity of the current 
implementation. I don't want to check this version in.
Boink.
Target Milestone: mozilla0.9 → mozilla0.9.1

Comment 26

18 years ago
*** Bug 58798 has been marked as a duplicate of this bug. ***
gordon, do you have cycles to take this on?
Assignee: sfraser → gordon
Status: ASSIGNED → NEW
Assignee

Comment 28

18 years ago
I've got more 0.9.1 bugs than you do :-), but now that I have a titanium 
powerbook, I'd really like to have this fixed.  I'm not sure how much time I'll 
have to look at this.  I might have a better idea next week.

Would you like the bug back, or is there someone else we can get to work on it?
this is a beta stopper. all macs default to sleeping after 30 minutes.

sfraser, what's your confidence of the attached patch?
The patch works, but it makes assumptions about the caller's use of NSPR 
functions, that are not necessarily valid for users other than Mozilla.

Comment 31

18 years ago
*** Bug 82250 has been marked as a duplicate of this bug. ***

Comment 32

18 years ago
mass move, v2.
qa to me.
QA Contact: tever → benc
Assignee

Comment 33

18 years ago
Posted patch Update patch.Splinter Review
Assignee

Comment 34

18 years ago
Okay, here's the plan:

I'd like everyone with a Mac build to apply the patch and run with it for a day 
or so.  If we don't find problems with it, I'll check it in for 0.9.1.

After talking with Wan-Teh, Simon and I expect the patch to work correctly given 
the way pollable events are used in mozilla.  For 0.9.2, I'll rework it to (more) 
fully support the semantics of pollable events.

Please add a line to this bug describing your experience (positive or negative).
Thanks.

Comment 35

18 years ago
I get what sem to be an endless number of assertions on start-up after applying
the patch:

Assertion failure: NULL != bottomFD, at macsockotpt.c:1776
the new patch seems to work. my powerbook can wake up and go places (yay!). 
trying to log into amazon.com gives "unable to write to pipe" assertions in the 
unknown content type handler and locks up the product forever. someone else 
should verify https, my tree is a bit old.

i'm going to run an opt build through jrgm's tests to verify performance.

Comment 38

18 years ago
AFter applying the latest patch:

1) I visited https://www.verisig.com/
2) Put the PowerBook to sleep and woke it up after a few minutes.
3) Then I went to some internal test sites as well as https://www.thawte.com/

SSL seems to work.  I update my tree this morning.
houston, we have a problem. 

I ran an opt built of hyatt's style branch, for which i'd already collected 
timing data, with this new patch (4th attachment) and here are the results:

before: 2045 ms
after:  3842 ms  (!!)

almost twice as slow. :( though I haven't timed it, it feels slower to start up, 
and it locks up the ui more while loading pages.

Comment 40

18 years ago
I applied the patch sent by e-mail last night.
On Quit, I crashed (possibly unrelated?).  Here is a partial stack crawl:
  PR_Lock
  DoneWaitingOnThisThread+00030
  WakeUpNotifiedThread+00068
  PR_SetPollableEvent+00058
  nsSocketTransportService::AddToWorkQ
  nsSocketTransport::Dispatch
  nsSocketRequest::Cancel
  nsFtpControlConnection::Disconnect
  nsFtpState::KillControlConnection
  nsFtpState::StopProcessing
  nsFtpState::OnStopRequest
  nsFtpControlConnetion::OnStopRequest
  nsOnStopRequestEvent::HandleEvent
  ...

Comment 41

18 years ago
I backed out the patch and rebuilt nspr. I don't crash anymore after going to 
ftp://ftp.mozilla.org/
An if() test bungle in the previous patch missed a case that caused poor 
performance; that's fixed. We also worked around an FTP bug that causes the crash 
that brade saw.

Comment 44

18 years ago
In mozilla/nsprpub/pr/src/io/prpolevt.c, do you need to
include "primpl.h"?

If you don't include "primpl.h", you can just name the structure
below as PRFilePrivate and then you don't need to cast fd->secret
or event->secret to PRPollableDesc*.

+/*
+ * PRFilePrivate structure for the NSPR pollable events layer
+ */
+typedef struct PRPollableDesc {
+    PRBool      gotEvent;
+    PRThread    *pollingThread;
+} PRPollableDesc;
The #include of primpl was necessary to be able to get inside PRThread*, for the 
assert and test if state == _PR_DEAD_STATE. This code can go away when a necko 
bug is fixed.
patch 5 shows no slowdowns for me. can't test sleep on my desktop, and my build
doesn't have psm, so i can't test that either.

Comment 47

18 years ago
r=wtc, with the following comments.

1. If you include "primpl.h", the following should be deleted.

+/*
+ * These internal functions are declared in primpl.h,
+ * but we can't include primpl.h because the definition
+ * of struct PRFilePrivate in this file (for the pollable
+ * event layer) will conflict with the definition of
+ * struct PRFilePrivate in primpl.h (for the NSPR layer).
+ */
+extern PRIntn _PR_InvalidInt(void);
+extern PRInt64 _PR_InvalidInt64(void);
+extern PRStatus _PR_InvalidStatus(void);
+extern PRFileDesc *_PR_InvalidDesc(void);

2. In _pr_MacPolEvtPoll(), *out_flags is not initialized
   (although the current NSPR implementation does initialize
   *out_flags).  So it is safer to say
       *out_flags = PR_POLL_READ;
   as opposed to
       *out_flags |= PR_POLL_READ;

   That whole thing should read:
+    if ((in_flags & PR_POLL_READ) && pollDesc->gotEvent)
+        *out_flags = PR_POLL_READ;
+    else
+        *out_flags = 0;

3. We say "FIXME" rather than "Danger Will Robinson!" in NSPR
   source code. ;-)

Comment 48

18 years ago
Issue #1 with latest patch (5/24):

in prpolevt.c, can we add the following line:
#pragma unused(in_flags)

approximately to line 200 to get rid of the compiler warning?


Issue #2:
I get an assertion on exit after going to an ftp site (which is much better than 
crashing).
  Assertion failure: pollDesc->pollingThread->state != _PR_DEAD_STATE, at 
prpolevt.c:284

Simon--is this caused by the Necko bug?  Is there a bug # for that bug?

Comment 49

18 years ago
brade@netscape.com wrote:
> Issue #1 with latest patch (5/24):
> 
> in prpolevt.c, can we add the following line:
> #pragma unused(in_flags)
>
> approximately to line 200 to get rid of the compiler warning?

This will be unnecessary with my suggested change, which does
use the in_flags argument.
brade: the assertion indicates that bug 65273 needs to be fixed (FTP is trying to 
set a pollable event on a thread that has termindated already when shutting 
down).

Comment 51

18 years ago
Simon--thanks for the pointer to the bug.

This latest patch is working great for me; I don't seem to have problems when I 
turn off and back on my ISDN line anymore.
Assignee

Comment 52

18 years ago

Comment 53

18 years ago
Gordon,

Why did you comment out my suggested change?

+//    if ((in_flags & PR_POLL_READ) && pollDesc->gotEvent)
+    if (pollDesc->gotEvent)

Comment 55

18 years ago
a=selmer when reviews all in.

Comment 56

18 years ago
Latest works for me. r=beard
Assignee

Comment 57

18 years ago
The patch has been checked in.  Marking FIXED.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED

Comment 58

18 years ago
thanks a lot!  works for me now.  removing myself from the CC: list.

Comment 59

18 years ago
REOPEN:
I have tried this several times in MacOS 9 on my G4 and it does not work
(Mozilla 0.9)

I'll try a daily commercial build tomorrow...
Status: RESOLVED → REOPENED
Keywords: qawanted
Resolution: FIXED → ---

Comment 60

18 years ago
Ben - this fix didn't go in for mozilla0.9.  Did you mean to say mozilla0.9.1?
duh, this wasn't fixed in 0.9
Status: REOPENED → RESOLVED
Closed: 18 years ago18 years ago
Resolution: --- → FIXED

Comment 62

18 years ago
I think I mean to say that "I'm not getting enough sleep..." I'll look at a
newer build.

Updated

18 years ago
Summary: Mozilla is unresponsive after sleep → Conn: Mozilla is unresponsive after sleep

Comment 63

18 years ago
VERIFIED: Mozilla 0.9.5
+testcase, - other keywords
Mac OS X build in 10.0.2
Mac OS build in classic mode
MacOS 9.1.

I've put some tests into the OS specific section of the general connectivity test.

Note: VM was on in all cases.
Status: RESOLVED → VERIFIED
Keywords: qawantedtestcase
You need to log in before you can comment on or make changes to this bug.