Closed
Bug 17065
Opened 25 years ago
Closed 25 years ago
[DOGFOOD] Messenger stalls opening IMAP Inbox
Categories
(MailNews Core :: Networking, defect, P3)
Tracking
(Not tracked)
VERIFIED
FIXED
M13
People
(Reporter: trudelle, Assigned: mscott)
References
Details
(Whiteboard: [PDT+] Verified)
Today's opt bits (and last few days too)
Using existing profile/account, or creating all new ones.
Open Messenger
Select Inbox
throbber animates, status
> Receiving message headers ## of ###
## increases to a few dozen, then stalls.
I left it overnight, and it never got any further, nor timed out.
Updated•25 years ago
|
Assignee: phil → mscott
Summary: Messenger stalls opening IMAP Inbox → [DOGFOOD] Messenger stalls opening IMAP Inbox
Comment 1•25 years ago
|
||
Reassign to mscott, cc alecf, nominate for dogfood
Assignee | ||
Comment 2•25 years ago
|
||
I saw this too earlier this week and I thought I fixed it with some changes that
went in on Wednesday. But Peter's reporting this problem on today's build,
so the deadlock situation must still be there.
I'll take another look.
Reporter | ||
Comment 3•25 years ago
|
||
Unless I'm not installing correctly. Build ID = 1999102208. That's today,
right?
Assignee | ||
Comment 4•25 years ago
|
||
Using a build from Saturday afternoon, I was finally able to trigger this hang
again.
I think I am seeing the same thing. Here is my sequence:
1. launch today's build (much faster startup) to browser window
2. select Messenger
3. Double click account name to get tree hierarchy
4. select inbox
5. everything slows way down, CPU usage at 100% on NT, other apps don't respond
6. then password dialog comes up (I don't save password)
7. eventually I can enter text in the password dialog but it is very, very slow
to respond to keyboard input. Hit return
8. Stuff happens, still crawling; about 1-2 minutes later inbox comes up and
CPU usage is back to normal.
9. Windows NT, 128MB RAM
Assignee | ||
Comment 6•25 years ago
|
||
Hey Dave, I think your problem isn't necessarily a hang of the system but
a problem with imap performance which has regressed horribly over the weekend.
I'm also seeing huge delays when trying to bring up modal dialogs and doing
other things. We're tracking that in Bug # 17062. My first guess is to say that
problem is related to all the event queue changes danm has been making recently
but I need to debug it further before reaching that conclusion.
Reporter | ||
Comment 7•25 years ago
|
||
What I'm seeing is Linux-only, doesn't slow anything down, and happens after the
password dialog. It doesn't hang or freeze, just stalls that window while
retrieving headers.
Comment 8•25 years ago
|
||
we talked to daver, his comments only relate to the rather slow IMAP performance
as of late. This bug is still the "Hanging IMAP on linux" bug...
Comment 9•25 years ago
|
||
Let me just add, me too, to this bug. It looks like you've seen this, but if
you need a machine to reproduce this on, my Linux machine is doing this all of
the time.
Reporter | ||
Comment 10•25 years ago
|
||
Still happening every day for me, using the opt bits. Don't know if this is
related, but on the Mac, selecting the Inbox today also stalls, doesn't even get
to the authentication.
Assignee | ||
Comment 11•25 years ago
|
||
I've been trying to fix this today but every time I select an imap folder, I'm
crashing off in some gtk code that I'm not familiar with. I caught a bad checkin
in mid pull.
Reporter | ||
Comment 12•25 years ago
|
||
Today's Linux opt bits segfault on opening folder.
Assignee | ||
Comment 13•25 years ago
|
||
Yeah it turns out the crash I was seeing last night that was preventing me from
debugging this was the same one keeping the tree closed this morning. I just
thought there was something in my tree that was out of date for some reason.
Apparently not. The crash is in some gtk, glib code that I know nothing about.
I'm trying to find an expert on that stuff now. For more info, see Bug #17352
Reporter | ||
Comment 14•25 years ago
|
||
Pavlov is the module owner for GTK.
Assignee | ||
Updated•25 years ago
|
Component: Back End → Networking-Mail
Assignee | ||
Comment 15•25 years ago
|
||
Adding dougt to the cc list for advice. When I am able to reproduce this hang
(or maybe it is another hang altogether), here's what I'm seeing:
1) The imap thread has made a call to the UI thread through a proxied interface.
So the thread is now in the proxy code, blocked waiting for the response from the
UI thread. It's spinning in the code that pumps events through the nested event
queues (i.e. it processes pending events, then sleeps, then processes more
events).
2) the UI thread doesn't appear to ever process the event generated by the imap
code. On my debug build, the UI thread has a stack trace that looks like:
calling nsAppShell::Run, g_main_run(),
g_main_iterate(), g_hook_next_valid()
My guess is that our event somehow got lost and so the proxy call never returns
with an answer. Hence the imap thread stalls and we never finish parsing the
folder.
Note: I had to jump through a few hoops to reproduce this problem. I had to
delete all my .msf files. Then I opened my inbox. displayed a message. Opened
another folder. Then opened my trash folder which was 800 msgs. I would hang
just about every time trying to parse my trash folder.
So in my example, we would have the UI thread, 2 imap threads (inbox, and
another folder) which were just waiting on a url to run and a 3rd thread which
was the imap thread trying to open the trash folder.
Hmmm.
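
The blocking behavior described in this comment can be illustrated with a minimal Python sketch. It is not the Mozilla proxy code: `ProxyEvent`, `proxy_call`, and `process_pending` are hypothetical names, and the timeout exists only so the "lost event" case returns instead of hanging forever, as the real imap thread did.

```python
import queue
import threading


class ProxyEvent:
    """One proxied call: the function to run plus a completion flag."""

    def __init__(self, func, args):
        self.func = func
        self.args = args
        self.done = threading.Event()
        self.result = None


def proxy_call(target_queue, func, *args, timeout=None):
    """Post func to another thread's queue and block until it has been run."""
    event = ProxyEvent(func, args)
    target_queue.put(event)
    # If nobody ever processes the event, this wait never completes.
    if not event.done.wait(timeout):
        raise TimeoutError("proxied call was never processed")
    return event.result


def process_pending(event_queue):
    """Drain the queue on the calling thread, completing each proxied call."""
    while True:
        try:
            event = event_queue.get_nowait()
        except queue.Empty:
            return
        event.result = event.func(*event.args)
        event.done.set()


def demo():
    """Normal case: the 'UI' thread services its queue, so the call returns."""
    ui_queue = queue.Queue()
    results = {}

    def imap_thread():
        results["value"] = proxy_call(ui_queue, lambda n: n * 2, 21, timeout=5)

    worker = threading.Thread(target=imap_thread)
    worker.start()
    while worker.is_alive():
        process_pending(ui_queue)
        worker.join(0.01)
    return results["value"]


def demo_lost_event():
    """Failure case: the event sits in a queue that no thread ever drains."""
    lost_queue = queue.Queue()
    try:
        proxy_call(lost_queue, lambda: None, timeout=0.1)
    except TimeoutError:
        return "stalled"
    return "completed"
```

In `demo_lost_event` the caller only gets control back because of the timeout; without one, it would wait indefinitely, which matches the stall reported here.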
Updated•25 years ago
|
Target Milestone: M11
Assignee | ||
Comment 16•25 years ago
|
||
I've been playing around with this again this morning and I'm still seeing
the same behavior.
I'm now in a position where I don't have to jump through hoops. Just opening
my trash folder is causing the imap thread to hang for the reasons stated
above. We're blocked waiting for a proxied call to return. But the event
is never reaching the UI thread.
It's getting lost along the way. I think I'm going to need dougt or danm's help
on this puppy.
Assignee | ||
Updated•25 years ago
|
Status: NEW → ASSIGNED
Assignee | ||
Comment 17•25 years ago
|
||
I've set printfs in two places:
1) where the imap thread calls the proxied object
2) in the ui thread, a breakpoint inside the real object
Sure enough, in the case where the imap thread hangs, I see the printf
saying we are now calling into the proxied object. And there is no
matching printf in the real object. So the UI thread is never processing
our event.
I'd re-assign to dougt but I think he's going to need me to help debug this
down further...
Comment 18•25 years ago
|
||
Seems bad enough to hold M11
Assignee | ||
Comment 19•25 years ago
|
||
Doug, this problem may be related to some strange stuff I'm seeing with how
linux is processing events. I just posted an email thread asking for help to
the xpcom and unix groups.
Basically, I'm seeing the UI thread taking events out of the imap thread and
processing them. This is leading to bad things for imap because we are
executing code in the wrong thread!!
I can't tell for sure if that problem is also causing us to block later on when
we make a call through a proxy object.
Comment 20•25 years ago
|
||
this is no longer an M11 stopper, is it? We (mscott and I) checked in some code
last night to get around the problem.
Reporter | ||
Comment 21•25 years ago
|
||
I haven't seen this specific problem in 3 days, but now it is taking about 20
seconds to load even the smallest plain text messages. Is that the 'call
through proxy object' block mentioned above?
Comment 22•25 years ago
|
||
possibly, but it's also possibly just the general slowness painting on Linux -
the meteors run when loading a message because the web shell starts it, and
running the meteors really slows things down.
Assignee | ||
Comment 23•25 years ago
|
||
I think this is definitely still an M11 stopper. What we did last night was
pretty horrible for a workaround (at least the part about clearing the ODA
flag) and could easily be contributing to the message display performance
degradation that Peter is seeing in today's build.
I've tried asking for help in the newsgroups but no one's answered. I may have
to read up on glib and gtk to figure out why this code is misbehaving.
Reporter | ||
Comment 24•25 years ago
|
||
No, it isn't just general degradation. My.netscape.com starts to appear in less
than 5 seconds, and finishes in 15 seconds on the same machine. There is
something very wrong with message display time.
Comment 25•25 years ago
|
||
do you mean imap msg display, or msg display in general? How long does a
small news message or local message take to display?
Reporter | ||
Comment 26•25 years ago
|
||
POP message takes about as long. I can't create a news account, setup forces it
into POP.
Comment 27•25 years ago
|
||
yes, that sucks. apparently, drop down combo boxes in dialogs are broken on
mac and linux. Did you try using the arrow keys? I hear they work.
Reporter | ||
Comment 28•25 years ago
|
||
Cursor keys appear to work, but apparently now give the wrong result to the
caller, at least in the last 2 days.
Comment 29•25 years ago
|
||
(The arrow keys used to be a workaround. Now, even after you use the arrow key
to make your selection, it appears that the selection never holds. Bug 15476)
Assignee | ||
Updated•25 years ago
|
Assignee: mscott → brendan
Status: ASSIGNED → NEW
Assignee | ||
Comment 30•25 years ago
|
||
Brendan, this is the bug we talked about in the pork jockey meeting. Basically,
I've verified that necko is placing events such as OnDataAvailable into our imap
thread's event queue.
Then when that event is processed, I've seen that the thread we're running in is
the UI thread and not the imap thread.
From there, strange and erratic behavior happens.
In my posting to the newsgroup, you can see the stack trace where the UI thread
is listening to the UI event queue in main(). And you can then see where the UI
thread calls into the imap thread's event queue, asking it for an event.
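
As an illustration of the failure mode described in this comment, here is a small Python sketch of an event queue that remembers which thread owns it and refuses to be drained from any other thread. It is only an assumption-laden sketch, not the necko, xpcom, or gtk code: `OwnedEventQueue` and its methods are hypothetical names.

```python
import queue
import threading


class OwnedEventQueue:
    """Event queue that may only be processed by the thread that created it."""

    def __init__(self):
        self._owner = threading.get_ident()
        self._events = queue.Queue()

    def post(self, func):
        """Any thread may post an event."""
        self._events.put(func)

    def process_pending(self):
        """Run queued events; reject callers that are not the owning thread."""
        if threading.get_ident() != self._owner:
            raise RuntimeError("event queue processed on the wrong thread")
        count = 0
        while True:
            try:
                func = self._events.get_nowait()
            except queue.Empty:
                return count
            func()
            count += 1


def demo():
    """The 'UI' (main) thread is rejected; the owning 'imap' thread succeeds."""
    ready = threading.Event()
    go = threading.Event()
    state = {}

    def imap_thread():
        state["queue"] = OwnedEventQueue()  # owned by this worker thread
        ready.set()
        go.wait()
        state["processed"] = state["queue"].process_pending()

    worker = threading.Thread(target=imap_thread)
    worker.start()
    ready.wait()

    # Something like OnDataAvailable gets posted to the imap thread's queue.
    state["queue"].post(lambda: None)
    try:
        state["queue"].process_pending()  # wrong thread tries to drain it
        rejected = False
    except RuntimeError:
        rejected = True

    go.set()
    worker.join()
    return rejected, state["processed"]
```

The check makes the wrong-thread access fail loudly at the point where it happens, rather than surfacing later as erratic behavior in the imap code.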
Comment 31•25 years ago
|
||
I've tracked this down a bit further, if you look at my reply in the newsgroup
to your posting. Also see 18005
Updated•25 years ago
|
Target Milestone: M11 → M12
Comment 32•25 years ago
|
||
sounds too hairy for this late in m11.
let me know if that changes.
Updated•25 years ago
|
QA Contact: lchiang → huang
Comment 33•25 years ago
|
||
Dogfood bug for M12. Change QA Contact to me. Cc: Lisa.
Updated•25 years ago
|
Status: NEW → ASSIGNED
Comment 34•25 years ago
|
||
Brendan, do you have a projected fix date on this bug?
Comment 35•25 years ago
|
||
Reassigning to pavlov, since it's a linux bug where events are being processed
on wrong thread. Pavlov -- do you have the cycles to look at this?
Updated•25 years ago
|
Assignee: pavlov → dougt
Comment 36•25 years ago
|
||
assigning to me
Updated•25 years ago
|
Status: NEW → ASSIGNED
Whiteboard: [PDT+] → [PDT+] Fix ready, patch sent for review.
Updated•25 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Comment 37•25 years ago
|
||
fix checked in.
Updated•25 years ago
|
Status: RESOLVED → VERIFIED
Whiteboard: [PDT+] Fix ready, patch sent for review. → [PDT+] Verified the fixed already
Comment 38•25 years ago
|
||
Verified fixed on the 12-16-12-M12 commercial build.
Mark as Verified!!
Assignee | ||
Updated•25 years ago
|
Status: VERIFIED → REOPENED
Assignee | ||
Comment 39•25 years ago
|
||
Doug, I didn't think you checked in the gtk changes (removing the implementation
of listen to event queues) to fix this problem, just the assertion code. I
thought it didn't get in for M12.
Karen, the reason why this worked for you is because bienvenu and I have code
changes in place that hid the problem before. Those are still in place. After
we're sure that dougt checked in the gtk related changes, we need to pull out
these band aids.
Then you can verify this bug. I'm going to clear the verified resolution for
now...doug can you confirm if all your patches are in or just the bandaid
assertion stuff? We should leave as opened if that's the case
Assignee | ||
Updated•25 years ago
|
Resolution: FIXED → ---
Assignee | ||
Comment 40•25 years ago
|
||
re-opening so bienvenu and I can take out our imap hacks to hide the threading
problem. dougt, if your gtk changes are checked in, just re-assign to me and
I'll remove our hacks then karen can verify this.
Updated•25 years ago
|
Whiteboard: [PDT+] Verified the fixed already → [PDT+]
Comment 41•25 years ago
|
||
OK. Removed "verified" from the Status Whiteboard.
Assignee | ||
Updated•25 years ago
|
Whiteboard: [PDT+] → [PDT+] Verified the fixed already
Assignee | ||
Comment 42•25 years ago
|
||
Doug, is there any chance we can comment out those assertions? I get lots of
crashes on my linux box when working in imap because we are generating so many
events that these asserts about events getting processed on the wrong thread get
dumped to the console. Eventually, I'm just seeing a solid wall of these
assertions getting dumped to the console and it can't keep up, causing me to crash
in the printf in the assertion.
Assignee | ||
Updated•25 years ago
|
Whiteboard: [PDT+] Verified the fixed already → [PDT+]
Assignee | ||
Comment 43•25 years ago
|
||
I stomped on Karen's status white board change.
Comment 44•25 years ago
|
||
yeah, the new problem is that it takes up 100% CPU time while using IMAP
Updated•25 years ago
|
Assignee: dougt → mscott
Status: REOPENED → NEW
Comment 45•25 years ago
|
||
Take out your hacks. The fix that I put in for m12 will make sure that your
events will not get processed on the wrong thread. However, you will still see
asserts! The fix to the gtk widget was not checked in since it seemed to break
your password dialog from coming up. I am not sure why this is, and have not
really had a chance to debug this.
Comment 46•25 years ago
|
||
final m12 candidates are spinning now. moving to m13.
if we fall off track and need to respin m12 for some
yet unknown reason we can consider this if you get
a fix in hand.
Assignee | ||
Comment 47•25 years ago
|
||
This isn't something we need to fix for M12, Chris. So no need to worry about
taking a fix for it later.
Doug, I'll get permission from Chris H. to take out our imap hacks and then I'll
send this back to you as I know you were helping us track down why we still get
the asserts on linux =).
Assignee | ||
Updated•25 years ago
|
Whiteboard: [PDT+] → [PDT+] (mscott's part is NOT PDT+)
Assignee | ||
Comment 48•25 years ago
|
||
The changes I need to make here for removing our imap hacks are definitely not
PDT+. Doug, you posted a sweet patch for gtk yesterday that fixes the asserts we
were seeing. Can I assign this bug to you so you can check that patch in with
this bug report?
Then you can bounce it back to me, I'll remove the PDT label for my stuff and
check in the imap changes.
How does this sound?
Assignee | ||
Comment 49•25 years ago
|
||
*** Bug 22668 has been marked as a duplicate of this bug. ***
Assignee | ||
Updated•25 years ago
|
Status: NEW → RESOLVED
Closed: 25 years ago → 25 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 50•25 years ago
|
||
Okay, doug's stuff is in. I finally removed my imap hack in
CreateNewLineFromSocket which was there because events were getting processed on
the wrong thread. I parsed a 12,000 message folder (with no .msf) and didn't stall.
Marking this as fixed as my part of the bug is now done too.
When you go to verify it might help to make sure the cpu on your linux box isn't
spiked after you connect to an imap server (but aren't doing anything).
Comment 51•25 years ago
|
||
OK. I will verify this bug, then.
Updated•25 years ago
|
Status: RESOLVED → VERIFIED
Whiteboard: [PDT+] (mscott's part is NOT PDT+) → [PDT+] Verified
Comment 52•25 years ago
|
||
Verified on Linux 2000-01-07-08-M13 commercial build.
System used: Pentium/200MHz; it can read 5000 imap msgs in 55 seconds.
There is no stall anymore when reading large imap messages.
CPU no longer sits at 100% when idling for a short time.
Updating Status Whiteboard and marking as verified!!
Updated•20 years ago
|
Product: MailNews → Core
Updated•16 years ago
|
Product: Core → MailNews Core