anyone else seeing this? this is on the tip, about 11:00 pm, tuesday.
adding granrose. this will probably turn into a blocker tomorrow morning.
at 21:49 a fix for bug 34359 was checked in. Perhaps related?
Yes, the fix for 34359 looks like the highly probable cause of this regression. I'll try backing it out in a minute.
The fix just made sure that timeouts are honored when the transport is in the Read/Write state. If it caused the regression, it might be because the imap handler was using nsSocketTransport in an unexpected way, like leaving it in that state with some outstanding reads in progress.
Is this corrected for this morning's release builds?
No, ruslan is blaming the imap code, so I guess we have to debug on our end. Unfortunately, I can't debug because I can't launch the product.
mark smoketest then since we won't be able to use the product very much with this. (I don't know about the product launching bug.)
Severity: normal → critical
Is this a Smoketest blocker? I'm able to launch, compose and send, read msg using today's build. I do see the "time out" alert sometimes but the mail is delivered to the recipient.
If the error msg does not come up all the time and doesn't prevent people from using the product, then no, not a smoketest blocker. Suresh/Esther - will let you guys make the call on this.
sending mail only uses IMAP to put a copy of the sent message in the sent mail folder. Opening imap folders, getting new mail, and reading messages would do a lot more IMAP stuff. Unfortunately, I can't do anything about this since my clobber build still won't run.
It's easy to roll back the fix for timeouts, but I'd rather not do it without a good reason, as I don't see anything wrong with the semantics of the socket transport. If there are outstanding reads/writes, it has to be able to time out. Otherwise, if you hit a webpage which never produces any result, the browser would sit and wait there forever.
hey ruslan did you see these timeouts when you ran the pre-checkin tests for mailnews? If we can't get a lead on what exactly in imap is triggering on this then I think we need to back out the changes that introduced the regression until we can figure out what in imap (or in the netwerk socket) needs to change...IMHO. I'm starting a new build right now and will report back when it's ready to go since david b. is having trouble with his build.
my win32 rebuild is about 30 minutes away from being done. I'll help debug / resolve this.
Technically we can work around this in the future by introducing different timeout setters (for connect/read/write) on the socket, so each protocol can decide whether to use them or not. But it'd still be nice to find out what exactly is happening in the imap case.
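A minimal sketch of the per-phase setter idea from the comment above. Every name here (PhaseTimeouts, Phase, SetReadTimeout and so on) is hypothetical and is not the actual nsSocketTransport interface. The sketch assumes that a phase with no timeout set never times out, which is one way to read "each protocol can decide whether to use them or not".

```cpp
#include <cstdint>

// Hypothetical names throughout; this is not the real nsSocketTransport API.
enum class Phase { Connect, Read, Write };

class PhaseTimeouts {
public:
    // Each setter enables a timeout for exactly one phase.
    void SetConnectTimeout(uint32_t ms) { Set(mConnect, ms); }
    void SetReadTimeout(uint32_t ms)    { Set(mRead, ms); }
    void SetWriteTimeout(uint32_t ms)   { Set(mWrite, ms); }

    // True only when the phase has a timeout set and the socket has been
    // idle for at least that long. Assumption: unset means "never time out".
    bool HasTimedOut(Phase phase, uint32_t idleMs) const {
        const Limit& limit = LimitFor(phase);
        return limit.isSet && idleMs >= limit.ms;
    }

private:
    struct Limit {
        bool isSet = false;
        uint32_t ms = 0;
    };

    static void Set(Limit& limit, uint32_t ms) {
        limit.isSet = true;
        limit.ms = ms;
    }

    const Limit& LimitFor(Phase phase) const {
        switch (phase) {
            case Phase::Connect: return mConnect;
            case Phase::Read:    return mRead;
            case Phase::Write:   return mWrite;
        }
        return mConnect;  // not reached for valid Phase values
    }

    Limit mConnect;
    Limit mRead;
    Limit mWrite;
};
```

With this shape, an http-style consumer could call SetReadTimeout and get read timeouts, while an imap-style consumer that calls none of the setters keeps its idle connection open and manages its own timeout at the protocol level.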
agreed, we should find out what's going on with imap.
marking blocker so it shows up on the radar.
Is anybody looking at this?
Scott, comments seem to indicate you're going to handle getting the build working again. Send to David for the long term fix when you're done?
Assignee: selmer → mscott
Target Milestone: --- → M15
Steve, Scott is really the king of sockets and channels and mock channels and necko interaction.
accepting...I have a build I'm playing with now. But if we don't get traction on this soon, and if this becomes our last blocker keeping the tree closed, then we may need to back out the nsSocketTransport changes until we can figure this out.
Status: NEW → ASSIGNED
Unless I'm missing something, the problem seems pretty obvious. The socket is initialized with a default timeout which is really small (which is good for protocols with short-lived connections like http). The timeout is currently DEFAULT_SOCKET_TIMEOUT_IN_MS = 35*1000, i.e. 35 seconds. If we don't read or write from the socket within that time period, then this timeout alert comes up. This is bogus for protocols like imap that hold a persistent connection, where we ALWAYS have an open connection to the server even if we aren't actively reading or writing something from/to the connection. This isn't to say imap doesn't have a timeout duration. It does, but it happens to be a user-configured pref that we manage on the protocol side, not the socket side. We need to expose an interface to allow a protocol to set the timeout field on a network socket (no such interface exists yet, since a protocol talks to a socket via the nsIChannel interface and it doesn't provide such a call). Then imap can set the same timeout it is using (which is usually on the order of minutes) and we shouldn't see this dialog. For now I'm thinking we need to back out the change to the socket transport until such an interface is provided.
Ruslan, the only part that we need to back out is the following change to nsSocketTransport::CheckForTimeout:
    if ((mCurrentState == eSocketState_WaitConnect || mCurrentState == eSocketState_WaitReadWrite) && idleInterval >= gTimeoutInterval)
should be
    if ((mCurrentState == eSocketState_WaitConnect) && idleInterval >= gTimeoutInterval)
Can you please review this so we can get this off the blocker list? Thanks!
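To make the effect of that one-line change concrete, here is a standalone sketch of the two conditions as plain functions. The two state names and the 35*1000 value come from the comments in this bug. The function names, eSocketState_Other, and kTimeoutIntervalMs are hypothetical stand-ins, and the real CheckForTimeout works on member state rather than on arguments.

```cpp
#include <cstdint>

// Stand-in for the transport's state enum. The first two names appear in
// this bug; eSocketState_Other is a hypothetical placeholder for the rest.
enum SocketState {
    eSocketState_WaitConnect,
    eSocketState_WaitReadWrite,
    eSocketState_Other
};

// Hypothetical stand-in for gTimeoutInterval, using the 35 second default
// quoted earlier in this bug.
const uint32_t kTimeoutIntervalMs = 35 * 1000;

// Condition as checked in for bug 34359: an idle read/write socket times
// out too, which is what an always-open imap connection runs into.
bool TimesOutWithReadWriteCheck(SocketState state, uint32_t idleMs) {
    return (state == eSocketState_WaitConnect ||
            state == eSocketState_WaitReadWrite) &&
           idleMs >= kTimeoutIntervalMs;
}

// Condition after the proposed back-out: only a pending connect times out.
bool TimesOutConnectOnly(SocketState state, uint32_t idleMs) {
    return state == eSocketState_WaitConnect && idleMs >= kTimeoutIntervalMs;
}
```

For an imap connection that has been idle in the read/write state for a minute, the first function reports a timeout and the second does not, while a stalled connect is reported by both.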
this bug is also horking folder discovery the first time I log into my imap server.
mscott's fix is working for me. at least now I can use mailnews again. are you going to check this in?
Scott, We already time out connections, so the new code was only for read/write timeout. I think you should back out the stuff Ruslan did if he isn't around to do it right now. If anything, there should be a setter for the timeout value (that http would use). Not setting it would == infinite timeout.
I can check it in. But let's not close the bug, and let's have somebody look into it, ok?
I actually already checked it in. I'd like to mark this one closed and open a new one on the socket service to allow http to set a timeout value on the socket. How does that sound to you? I really don't believe there is any "hanging AsyncRead" problem here in the imap case. We maintain a connection to the server and read and write to it all the time. We manage the timeout at the protocol level if we decide to kill a connection because we think it's timed out (as I said, it's a user-controlled pref for imap). I agree with warren: if a timeout is set on the socket, we should use it (a la the http case). If there isn't a timeout set in the read/write case, then it should be an infinite timeout. I'll mark this fixed and open a new networking bug.
Status: ASSIGNED → RESOLVED
Last Resolved: 19 years ago
Fine by me. I already opened up a bug against myself to add setters. Do you want me to quickly put it into M15, or is M16 fine?
It's up to you. if you want http timeouts to work for M15 then I guess it should be M15 otherwise M16. I think http would be the consumer of your setter since it's the only one interested in setting the timeout right now...
Update: Using build 2000-04-05-21 on win98 and 2000-04-06-10 on linux this is fixed. Still need to verify on Mac
Status: RESOLVED → VERIFIED
I added a default timeout of 30 seconds for http connections (can be changed via prefs). If 30 seconds is not enough we can make it bigger.
IMAP connections need to stay alive 30 minutes. But you only changed http connections, so that should NOT affect this bug, right?
Right. Sorry. Just updated the wrong bug :-) Imap connections are not affected at all.
mid-air collision? / bugzilla cleanup. Reopening (current state: verified and no resolution)
Status: VERIFIED → REOPENED
Status: REOPENED → RESOLVED
Last Resolved: 19 years ago → 18 years ago
Resolution: --- → FIXED
Status: RESOLVED → VERIFIED