Closed Bug 119592 Opened 24 years ago Closed 24 years ago

offline:mac only: Mail/News spins after you go back online

Categories

(SeaMonkey :: MailNews: Backend, defect, P2)

PowerPC
Mac System 9.x
defect

Tracking

(Not tracked)

VERIFIED FIXED
mozilla1.0

People

(Reporter: grylchan, Assigned: Bienvenu)

Details

Attachments

(1 file)

I've seen this for a while and decided to file a bug. Using 2002-01-11-08-trunk on mac 10.1 and mac 9.1.

Possibly related to bug 111015? I've filed other 'mac only' bugs (bug 114545, bug 112993) and they may or may not be related as well.

Problem: If you go offline and then go back online and click on any mail folder, it results in a 'spinning status bar' and an hourglass mouse icon. It hangs. Happens only in mac builds.

Steps to reproduce:
1. Login to your imap mail (either new or migrated profile)
2. Go offline via file menu or icon (doesn't matter if you download mail or not)
3. Go back online
4. Click on a few folders

result: spinning status bar, hourglass mouse icon, messenger hangs.
expected: to see contents of mail in the folder I select and messenger to not hang.

Changing subj line from 'IMAP spins after you go back online' to 'Mail/News spins after you go back online'.

Still see this in 2002-02-01-03-trunk / mac 9.2. If a user goes offline and then goes back online, the user will not be able to read any non-downloaded mesgs online unless they quit mozilla and restart. User can still read downloaded mesgs but will see the spinning status bar, the arrow is a 'watch', etc.

Adding nsbeta1 to keyword field.
Keywords: nsbeta1
Summary: offline:mac only: IMAP spins after you go back online → offline:mac only: Mail/News spins after you go back online
does the browser work after going offline and then online again? Can you browse to new sites? Linux had a problem with dns lookups after going offline and online again. I wonder if Mac has a similar problem...
Using 2002020418 build on Mac 10.1.2: If you have just the browser up, go offline, go online, you can still use the browser to visit other web sites. No problems. If you have messenger and browser running, go offline, go online, both don't work when they go back online. I can test mac 9.1 builds if you want.

David, does this help you? I was able to get an old mozilla build built and running (20020129), and I was running it in the mac debugger. I'm trying to play developer here and I noticed this in the MozillaDebug.out window. I see these lines after toggling the offline indicator in Messenger:

! ASSERTION: adding transport after service is shut down: 'mThreadRunning', file nsSocketTransportService.cpp, line 224
Assertion failure: pollDesc->pollingThread->state != _PR_DEAD_STATE, at prpolevt.c:287

Line 224 of nsSocketTransportService.cpp is:
    NS_ASSERTION(mThreadRunning, "adding transport after service is shut down");

Line 287 of prpolevt.c is:
    PR_ASSERT(pollDesc->pollingThread->state != _PR_DEAD_STATE);

That line in prpolevt.c is in a function, and the comment for that function is 'from macsockotpt.c.' I don't see either of these mesgs when I go offline via browser.
Gary, that assertion happens on windows too, which leads me to think that it's not responsible for the mac problem. The socket that's in trouble is going away anyway. I suppose mac nspr could be really horked by this, but it's hard to imagine how.
adding note: test bug 97310 (mac os only) when this bug gets fixed.
If I understand correctly a comment in bug 97310, the spinning does not happen if you shut down after going online and restart online. This makes me wonder if necko/nspr/mac networking is somehow damaged by going offline so that things don't work with networking again. I know that on linux, for example, as I mentioned earlier, dns doesn't work after going offline and online again. Darin or Simon, do you know of any problems in this area on the mac?

Gary, here's another test:
1. Startup online
2. Read an imap message
3. Go offline
4. Go back online immediately
5. Try to read another imap message (that's not cached)

Since you didn't do anything while offline, there should be nothing to playback, etc, from the offline session, so we should just be testing the ability to open a new connection to the imap server after going offline.

Another thing to try is a similar test, except in step 5, try to open a folder you haven't opened in that mail session, instead of a different mail message in the same folder.
Status: NEW → ASSIGNED
Keywords: nsbeta1 → nsbeta1+
Priority: -- → P2
Target Milestone: --- → mozilla1.0
David, you might be onto something. Using 2002021303 on Mac 9.2.2, if I repeat your steps in comment 6 (I assume when I go offline in step 3 I am prompted to download and I say don't download), I see the spin. I tried with a new mesg or a new folder and both result in hourglass mouse and spinning status bar. Yet if I use the same exact profile but a different build (2001101708) and I repeat your steps, it works as expected. Go offline, then go back online, I can click on a non-cached mesg or unvisited folder and it works. No spinning status bar.
bienvenu: works-for-me under redhat 7.2 linux. what linux system are you seeing offline/online DNS problems?
Darin, I'm just describing a problem that we ran into with linux, and saying this could be a similar type of problem. Seth and I spent a few hours debugging the linux problem at least 6 months to a year ago, and I don't remember what the bug number was. It was resolved wontfix - I believe it had to do with shutting down dns and then not being able to restart it - perhaps the relevant code was rewritten...I'll see if I can find the old bug.
OK, I found the linux bug - it was eventually fixed, but the history seems a bit complicated. Here are a few of the bugs: bug 63564, bug 83387, and a whole host of related/duped bugs.
Seth and I did some research on this and we determined that there's something going wrong with the event queues on the mac that causes this spinning. If you perform these steps:

Steps to reproduce:
1. Login to your imap mail (either new or migrated profile)
2. Go offline via file menu or icon (doesn't matter if you download mail or not)
3. Go back online
4. Click on a folder
5. Try to open a browser window

Both 4 and 5 will spin, so somehow event queues are messed up in general. Now, if you go offline and online again, things work again. Somehow, going offline again and online (from either browser or mail) cleans up all the event queues. You see a bunch of messages printed out about event queues getting cleaned up.

I'm hoping that a mac person who's familiar with necko and event queues can try this on the mac and figure out what's going on here.
Danm is the event queue guru
Wanted to add some info which may/may not help? I was testing 2002-02-18-08-trunk on mac 9.2.2. I was downloading mail/news mesgs, I went offline, I think I went back online and then tried to quit. Well, it actually crashed and I got a talkback report. I was not successful in getting another talkback report, as I tried to reproduce my steps about 4 more times. I don't know if this is a fluke or possibly not even remotely related to this bug. But I wanted to give you the stack trace and talkback id in case it can lead to the soln.

TBK: TB3079082Y

htmlparser.shlb + 0x5e5e0 (0x3e5ddf70)
htmlparser.shlb + 0x27c08 (0x3e5a7598)
htmlparser.shlb + 0x2a430 (0x3e5a9dc0)
uriloader.shlb + 0xba4 (0x3e877ed4)
mimeEmitter.shlb + 0x6cb8 (0x3e956518)
Mime.shlb + 0x29140 (0x3e274ed0)
uriloader.shlb + 0xca4 (0x3e877fd4)
Necko.shlb + 0x6d540 (0x3ea2b8b0)
Necko.shlb + 0x10d0 (0x3e9bf440)
Necko.shlb + 0x230 (0x3e9be5a0)
PL_HandleEvent() [plevent.c, line 590]
PL_ProcessPendingEvents() [plevent.c, line 520]
nsEventQueueImpl::ProcessPendingEvents() [nsEventQueue.cpp, line 388]
nsEventQueueImpl::ProcessPendingEvents() [nsEventQueue.cpp, line 394]
NS_ShutdownXPCOM() [nsXPComInit.cpp, line 552]
Netscape 6 + 0x63f0 (0x3f6e5650)
Netscape 6 + 0x1bc00 (0x3f6fae60)
Dan, any clues? Gary, that crash looks related, in the sense that there seem to be some event queues that are living longer than they should (just a guess) and crashing when we process their pending events at shutdown.
Seth and I have traced this down to things going wrong in necko/nspr because the imap code is trying to write out data to the socket ("Logout") and then cancelling the read request. Once we've done that, we're horked as far as making new connections is concerned. Here's a snippet from the nsSocketTransportLog:

0[b1ad61c]: nsSocketBOS PR_Write [this=b49b508] wrote 0 of 11
0[b1ad61c]: nsSocketBS::ReleaseSocket [this=b49b508 sock=ba79b1c]
0[b1ad61c]: nsSocketBS::GetTransport [this=b49b508 mTransport=b6e97b8]
0[b1ad61c]: nsSocketTransport::ReleaseSocket [this=b6e97b8 sock=ba79b1c]
0[b1ad61c]: nsSocketTransport::CloseConnection [this=b6e97b8] Calling PR_Close
0[b1ad61c]: nsSocketRequest: Cancel [this=bb2f94c status=804b0002]
16[b4fb0e4]: nsSocketBS::SetTransport [this=b492638 transport=0]
16[b4fb0e4]: nsSocketTransport::ClearSocketBS [this=b6e9700 bs=b492638]
16[b4fb0e4]: nsSocketTransport: Deleting [cyrus.andrew.cmu.edu:143 b6e9700], TotalCreated=3, TotalDeleted=2
18[bcbe784]: nsSocketBS::SetTransport [this=b49b508 transport=0]
18[bcbe784]: nsSocketTransport::ClearSocketBS [this=b6e97b8 bs=b49b508]
0[b1ad61c]: Assertion failure: pollDesc->pollingThread->state != _PR_DEAD_STATE, at prpolevt.c:287
0[b1ad61c]: nsSocketTransport: Resuming [cyrus.andrew.cmu.edu:143 b6e97b8]. rv = 80004005
0[b1ad61c]: nsSocketTransport: Creating [b6e9700], TotalCreated=4, TotalDeleted=2

Note the assertion failure in prpolevt.c. After that, we can't make more connections because AddToWorkQ fails.

Here's the imap code that's causing the problem in nsImapProtocol::TellThreadToDie:

    if (NS_SUCCEEDED(rv) && TestFlag(IMAP_CONNECTION_IS_OPEN) && m_outputStream)
    {
      IncrementCommandTagNumber();
      command = GetServerCommandTag();
      command.Append(" logout" CRLF);
      rv = m_outputStream->Write(command.get(), command.Length(), &writeCount);
      Log("SendData", "TellThreadToDie", command.get());
    }
    if (mAsyncReadRequest)
      mAsyncReadRequest->Cancel(NS_BINDING_ABORTED);
    PR_EnterMonitor(m_threadDeathMonitor);

If we put a call to PR_Sleep(500 milliseconds) between the call to write out the "logout" command and the call to cancel the async read request, then this bug goes away. We tried shorter intervals, and the problem still occurred, which makes me worry that more than just a thread context switch is required here, and that perhaps we even have to read the response back from the server to avoid horking things. We would rather not have to wait for a response, because this code is usually run at shutdown to tell the server to logout of the connections, and we don't want to have to hold shutdown hostage to the server response.

What we used to do in 4.x was just get hold of the socket directly and write the logout command directly, though I'm guessing that's not possible in mozilla. Can any necko people give us some advice about how to proceed?

tia,
- David
ok, try this: cancel the read request w/ NS_OK instead of an error code. that way you aren't forcing the socket transport to PR_Close the file descriptor. i suspect since the mac NSPR socket i/o is layered on top of the async native model, you end up with this race condition. canceling w/o error will terminate your async read w/o killing the socket. necko will PR_Close the socket once you release your last reference to the socket transport, which may be an indirect reference via the read request interface. there's a chance that this will solve the bug, or it may only hide it. we may need to fix NSPR to gracefully handle PR_Close when some data still needs to be flushed to the socket. wtc, sfraser, gordon?
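To make the suggestion above concrete, here is a minimal toy model of the two cancel paths. It is only a sketch, not necko or NSPR code: MockSocket, MockTransport, RunShutdown, kOk and kBindingAborted are hypothetical names invented for illustration. It assumes, as described above, that cancelling with an error status closes the socket right away, while cancelling with a success status leaves the socket open until the last reference to the transport is released.

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// 0x804b0002 is the status shown for the Cancel call in the socket
// transport log quoted in this bug. Both constants are stand-ins.
constexpr uint32_t kOk = 0;                       // stands in for NS_OK
constexpr uint32_t kBindingAborted = 0x804b0002;  // stands in for NS_BINDING_ABORTED

// Hypothetical socket whose writes are queued and only reach the wire
// when the async I/O layer gets a turn (Flush).
struct MockSocket {
    std::string pending;  // queued, not yet sent
    std::string sent;     // actually sent
    bool open = true;

    void Write(const std::string& data) { if (open) pending += data; }
    void Flush() { if (open) { sent += pending; pending.clear(); } }
    void Close() { open = false; pending.clear(); }  // unsent data is lost
};

// Hypothetical transport: an error status on cancel closes the socket at
// once; a success status only ends the read and leaves the socket open
// until the transport itself is released.
struct MockTransport {
    std::shared_ptr<MockSocket> sock;
    explicit MockTransport(std::shared_ptr<MockSocket> s) : sock(std::move(s)) {}
    void CancelRead(uint32_t status) { if (status != kOk) sock->Close(); }
    ~MockTransport() { sock->Close(); }
};

// Models the shutdown sequence: write logout, cancel the read, let the
// async layer run once, then drop the last reference to the transport.
// Returns the bytes that reached the wire.
std::string RunShutdown(uint32_t cancelStatus) {
    auto sock = std::make_shared<MockSocket>();
    {
        MockTransport transport(sock);
        sock->Write("1 logout\r\n");
        transport.CancelRead(cancelStatus);
        sock->Flush();  // async layer gets its turn
    }                   // transport released here, socket closed
    return sock->sent;
}
```

In this model, RunShutdown(kBindingAborted) returns an empty string because the queued logout is discarded when the socket is closed, while RunShutdown(kOk) returns the full logout line. Whether the real mac NSPR always gets its flush in before the final release is exactly the open question raised above, so the model only shows why the ordering matters, not that the race is gone.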
Attached patch: proposed fix
thanks, Darin, that fixed it for us.
Comment on attachment 72659: proposed fix

r=sspitzer (or sr=, whichever you need). I tested this patch on the mac, and it fixed the problem for me. Perhaps we can get darin to do the sr=, as he suggested the fix.
Attachment #72659 - Flags: review+
Comment on attachment 72659: proposed fix

sr=darin, though i wonder if this fixes the problem 100%... is there still a race condition that we are now winning 99% of the time?
Attachment #72659 - Flags: superreview+
Actually, I believe the fix for http://bugzilla.mozilla.org/show_bug.cgi?id=104020 is what really caused this problem.
fixes for this and 104020 checked in under 104020.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Using commercial trunk:
Mac 9.2.2 2002-3-19-08 trunk
Mac 10.1.3 2002-3-19-03 trunk

Verified, there is no longer a hang when going from offline back to online. Tried different scenarios (download/sync, offline via icon, etc.) and no problems when going back online. No spinning status bar. No hourglass mouse icon. Tried in both themes. Marking as verified.
Status: RESOLVED → VERIFIED
Product: Browser → Seamonkey