Closed Bug 218414 Opened 21 years ago Closed 21 years ago

IMAP folder becomes corrupt; causes crash when clicked on [@ FnSortIdKeyPtr ][@ MSVCRT.DLL]

Categories

(MailNews Core :: Database, defect)

x86
Windows 2000
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: coldchrist, Assigned: Bienvenu)

Details

(Keywords: crash, stackwanted, Whiteboard: TB23827388H)

Crash Data

Attachments

(6 files)

User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; T312461; .NET CLR 1.1.4322)
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5b) Gecko/20030827

I have created several server-side IMAP folders for filtering work and personal email, since I access the same account from two locations. Periodically one of the folders will become corrupt. I haven't been able to determine why it happens. The symptom is that most (not all) subsequent actions taken on that folder cause an immediate crash. Clicking on the folder, or moving messages into it with drag and drop, will cause the error. Accessing the folder via webmail using Exchange works fine.

I have deleted and recreated the folder a couple of times and this fixes the problem temporarily. The last time it happened I renamed the folder by right-clicking on it; this worked and fixed the problem. If this turns out to be a reliable fix it will be an acceptable workaround. I have submitted three or four talkbacks related to this.

Reproducible: Sometimes

Steps to Reproduce:
1. Create multiple server-side folders with IMAP.
2. Create filters that move messages from the inbox to these folders. (I don't know if this step is necessary but that's how I have it set up.)
3. Move messages from inbox to these folders; read messages, file them locally, and continue to work with the folders.

Actual Results: About once a day the above actions caused a talkback crash. The rest of the time everything was fine.

Expected Results: Not crashed.

I have junk filtering turned on; don't know if that's relevant.
Severity: normal → critical
Keywords: crash
Mike: Could you provide the TalkBack incident ID? After Mozilla crashes, TalkBack will run and automatically submit the crash. Then run "mozilla.org/bin/components/talkback.exe" manually and post the talkback ID of that crash in this bug.
I will post the talkback incident ID next time it happens; or can I run that .exe right now and have it pick up the last one? If it helps, all the 1.5b talkbacks I have experienced have been related to this bug (I'm pretty sure), so if you can locate them via my email address that should get you to it.
Here are the talkback incident ids for the four crashes I experienced that I think are related to this: TB23354415M TB23329434X TB23327982X TB23269562X
Keywords: stackwanted
Whiteboard: TB23354415M, TB23329434X, TB23327982X, TB23269562X
This problem has now occurred on my XP Pro box too. It manifested slightly differently there; it didn't crash when I clicked on the folder, but gave me an hourglass cursor. Then when I clicked on a message in the folder I got a crash. Talkback incident ID is TB23557497X.
Keywords: stackwanted
Whiteboard: TB23354415M, TB23329434X, TB23327982X, TB23269562X
-> Mail DB
Assignee: sspitzer → bienvenu
Component: Mail Window Front End → Mail Database
Summary: IMAP folder becomes corrupt; causes crash when clicked on → IMAP folder becomes corrupt; causes crash when clicked on [@ FnSortIdKeyPtr ][@ MSVCRT.DLL]
I don't know if this is useful information, but in the multiple occurrences I have seen of this bug in the last week or so, renaming the folder is a completely reliable workaround. To do this, identify the folder that causes the crash when clicked on; then right-click on it and choose rename. Enter a new name and hit Enter. The problem is now resolved and you can rename back to the old name and continue to work.
This has now occurred on the 1.5 release candidate version. Details: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5) Gecko/20030916 Talkback incident id is TB23827388H. As before I was able to correct the problem by right-clicking and renaming the folder.
Keywords: stackwanted
Whiteboard: TB23827388H
I bet UID validity has rolled on that folder, causing us problems. I can patch the crash, but I need to find out why we're getting into the view sort code with a null db.
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Actually, a protocol log of selecting a folder that crashes might be helpful: http://www.mozilla.org/quality/mailnews/mail-troubleshoot.html#imap

It's not obvious to me why we're crashing. If we had a null db, we would have crashed earlier, I think, and I don't think we'd crash in msvcrt.dll. I tried forcing a rolling of uid validity and that worked fine for me. Also, what view are you using? Are you sorted by subject or sender, or in threaded mode? thx, - David
I have set up the log and will attach it to this bug when the next crash occurs. The crashes are usually days apart so this log will probably be large; is it OK to take the tail couple of hundred or thousand lines? The view I use is order received/ascending. Going down the view menu options I have all messages, all threads, normal headers, original HTML. I should also mention that this only ever has happened on the server-side folders I created, and never on the built-in folders such as Junk -- which is fortunate because I wouldn't be able to rename those.
While waiting for the next crash, I was trying to think about what they all had in common. One thing I am fairly sure they had in common was that in each case there were unread messages in the folder. That is, I had clicked on "Get Msgs" or hit Ctrl-T and new messages had come in and been filtered to the folder in question. Then when I clicked on the folder I got the crash. I can't be a hundred percent sure this is the case for every crash, but it is true for the last two or three at least.
Mike, I'm not quite sure what this last comment means: "there were unread messages in the folder. That is, I had clicked on "Get Msgs" or hit Ctrl-T and new messages had come in and been filtered to the folder in question. Then when I clicked on the folder I got the crash." Does this mean that Mozilla filtered new messages into the folder in question, or that server side filters did that? Re the log, I'd like to at least see the part of the log where we select the folder in question and download flags+hdrs. Do you notice us downloading all the headers for the folder before it crashes?
Sorry if the previous was unclear. I meant that Mozilla's filters, which I had configured, moved the unread messages into the server side folders. There are no server side filters in operation. The Mozilla filters operated correctly and the server-side folders that received emails were bolded to show that there were unread messages in them, as you would expect. Then when I clicked on a folder in the left hand pane, in order to view the message headers in the right hand pane, I got the crash. When I restart Mozilla, the folder still shows unread messages in it (unsurprisingly). It will crash again every time I click on it, until I do the rename (as described in comment #7 above). It is also the case that when this happens, another folder will be OK at the same time. For example: I have two server-side folders called "personal" and "work". I receive ten messages, five are filtered into personal and five into work. Everything looks normal -- both folder names are bolded. Then I click on work and it crashes. I restart; click on personal and everything is fine; I can read those messages. I click on work and it crashes again. It'll keep crashing till I rename the folder from work to something else. I believe, though I haven't tried this recently so I can't be sure, that while the folder is in this "corrupt" state, I can filter further messages into it if I press "Get Msgs" again. I can try that again next time if it would be interesting. I will send just the tail of the IMAP file as you request, when the next crash occurs. Thanks.
sorry, when you write server-side folders, I thought "server side filters", partly because I thought server side filters might throw off the uid validity. If you say IMAP folders, I'll understand better :-) - but client side filters seem unlikely to throw off uid validity, so I'm a bit confused...
I don't know much about how IMAP works, I'm afraid, so I don't understand the UID validity comments. Are you suggesting that the IMAP folders on the server get into a state which Mozilla can't handle? Or just that Mozilla's client-side representation of them gets into a corrupt state? I would think it has to be the latter, because these folders are still accessible via my webmail IMAP interface. I don't know the technology involved but via my https://webmail.[etc.]/exchange portal I can see and read messages in the folder that has been corrupted. There are no problems there at all. I agree it's hard to see how client-side filters can cause a problem on the server; presumably all they're doing is issuing requests to the server to move the messages from INBOX to "work" and so on.
sorry, UID VALIDITY is just a value that the client and server use to agree on the state of the folder. If it changes on the server, the client has to throw away all the information it has cached about the folder. It's not an error or a corruption; it's just a rare situation.
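To illustrate the UID validity rule described above, here is a minimal sketch of what a client might do when it selects a folder. It is not Mozilla's actual code: `FolderCache` and `OnFolderSelected` are hypothetical names invented for this example, and the cache is reduced to a simple map from UID to header text.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical client-side cache for one IMAP folder (illustration only).
struct FolderCache {
    std::uint32_t uidValidity = 0;                     // value last seen from the server
    std::map<std::uint32_t, std::string> headersByUid; // cached headers keyed by UID
};

// Called after SELECT, with the UIDVALIDITY value the server just reported.
// Returns true if the cached data had to be thrown away.
bool OnFolderSelected(FolderCache& cache, std::uint32_t serverUidValidity) {
    if (cache.uidValidity == serverUidValidity) {
        return false; // UIDs still refer to the same messages; keep the cache
    }
    // The server has renumbered the folder: every cached UID is now meaningless.
    cache.headersByUid.clear();
    cache.uidValidity = serverUidValidity;
    return true;
}
```

In this sketch a changed value is handled as an ordinary cache reset rather than as an error, which matches the description of it as a rare but legitimate situation.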
I'm attaching the tail of an IMAP log. The problem it precedes is slightly different from the one I've reported here, but since the fix was identical I suspect it may be related. This time, what happened was that when I clicked on "Get Msgs", two messages appeared in the INBOX. Both should have been filtered to the "personal" folder. When I clicked on the "personal" folder, I got an hourglass cursor and could not view the email. Only one of the two emails showed up in the summary (headers) pane. At this point, although the user interface was responsive to clicks, I was unable to view any emails from the server -- it would not refresh. I didn't try looking at client-side copies of emails in local folders. I exited Mozilla but this did not kill the process; I had to kill mozilla.exe via the task manager. Restarting Mozilla did not resolve the problem; when I looked in the "personal" folder it hung again. I had to exit, kill mozilla.exe again, and then restart and right-click on the IMAP folder and rename it. When I did this it fixed the problem. I've attached the last 400 or so lines of the IMAP log; let me know if you need more.
Attached file Tail of an IMAP log
I'm attaching another tail of an IMAP log. This time I got a crash after clicking on the IMAP folder named "work". Talkback ID is TB24003850Q. This IMAP log is for that talkback ID. However, when I restarted Mozilla and clicked on the "work" folder again, it does not crash. It does hang, however. The status line at the bottom of the page says "work Receiving: Message headers 1 of 1". The header pane at the top right does show the header of the new message. For reference, if you're reading the IMAP log the title of the message which caused the problem is "Re: New Meeting?" and the time received is 9:04 a.m. If I click on a different folder, either an IMAP folder or a local folder, the header pane refreshes with the appropriate list of headers. I can click on the headers and the highlight shows up correctly. However, the message body pane remains blank throughout, and the status bar continues to say "work Receiving: Message headers 1 of 1". The cursor is an hourglass in the folder pane and the header pane, and it's the normal pointer in the (empty) message pane. When I exit and restart, I get asked to log in with a different profile because Mozilla thinks it is still logged in. I exited that dialog, killed Mozilla via the task manager, restarted and renamed the "work" folder, and everything is now OK again.
Is it possible that this problem is caused by a memory leak? I ask because, having gone into the task manager several times to kill mozilla.exe, I noticed that the memory usage seems quite high. When I start Mozilla it's around 33Mb; it's currently 62Mb, and when I've killed it it's always been around 60-80Mb. Is that within the expected range of memory usage for Mozilla? I just tried experimenting with a couple of actions and watching the memory usage. It seems that when you click on an IMAP folder and there are new headers there, the total memory usage increases significantly. I tried forwarding a 0.5 Mb file to myself over and over again; this raised the memory usage from 62Mb to 70Mb in just a few minutes. When I deleted the test messages the memory didn't reduce, but the next time I downloaded a test message the memory did not increase either.
the log seems mostly OK - this is a little suspicious, however:

2020[3415e08]: 34c2e28:imap.athensgroup.com:S-INBOX:ProcessCurrentURL: entering
2020[3415e08]: 34c2e28:imap.athensgroup.com:S-INBOX:ProcessCurrentURL: entering
2020[3415e08]: 34c2e28:imap.athensgroup.com:S-INBOX:ProcessCurrentURL: entering
2020[3415e08]: 34c2e28:imap.athensgroup.com:S-INBOX:SendData: 29 uid copy 21502,21504,21506:21507 "Junk"

Not sure why you'd get three of those in a row, without anything else in between.
This has occurred a couple more times but I assume you have enough data now, so I've not been submitting more logs or talkbacks. I did check some recent logs and the "ProcessCurrentURL: entering" line does repeat in multiple locations; sometimes twice, a few times three times, as in the example you noted. It did occur three times in a row just prior to a recent crash; I didn't check every one of them. Let me know if you need either more logs or more talkbacks. One other item I noticed. When the crash occurs, the IMAP folder which is going to cause the problem is getting new files from the filters. The folder name (most recently "work") goes bold to indicate there are new files. Typically the next thing you see is that the number of new files is updated in that pane, and bolded. However, the crash occurs before that update, so that when I am looking at the crash popup, I can see the word "work" in bold, but no updated number in the folder pane. In addition, when I restart, the folder is no longer bolded. I hope this is useful in tracking down the location of the problem.
Mike, if you download tomorrow's 1.6 trunk build, it has more logging in it that might tell us what urls are getting run in sequence like that with "entering". I'll also try to check in some bulletproofing for potential crashes with that stack trace. Re memory leaks, no, that's not the cause of the problem. I'm still mystified why you'd be in this code at all if your sort order is order received, since that code is for when you're sorted by a string field like sender or subject. I tried pretending that uid validity had changed and was unable to recreate this problem.
I'm *usually* in order received sort, but I will check for sure next time. However, I just looked at the work folder and it is currently in date order, sorted with the most recent date at the bottom of the list. I don't recall changing it since the most recent crash, earlier today, so I think it's most likely that's the sort I was using. I'll grab the 1.6 build tomorrow; thanks. By "more logging", I take it you mean IMAP logging, so I'll resend that log for the next crash.
yes, thx, I meant IMAP log. The next time this happens (i.e., you crash after selecting the folder), could you try moving away the .msf file for the folder (it would be called something like "work folder.msf" in your user profile directory), and restarting, selecting the folder and see if you still crash? Also, I don't remember if I've asked you this before - it's a long shot, but do you have multiple mozilla accounts or just a single account? If you have multiple mozilla accounts, have you changed the local directory for any of them to point to the same local directory as another account?
OK, got a bit more info. I decided to leave it on date sort, and bingo, another crash. What I noticed this time that I hadn't noticed before is that there was no little triangle indicating the sort order on the header pane column headers. When I restarted and renamed the work folder, and redisplayed it, the sort order came up as date, which is correct. So perhaps the underlying problem is that the sort order is getting lost somehow.
working backwards from the talkback logs, I think we're crashing because a collation key is null, and strcmp is crashing. So this patch bulletproofs that.
I think your comparison is the wrong way around. When p2->key is null, you return 1, indicating that the keys are in order. Also, you check rv when you don't need to. And you only fixed one call site!
Attachment #132860 - Flags: superreview?(bienvenu)
Attachment #132860 - Flags: review?(bienvenu)
Comment on attachment 132860 (Supplementary patch)

Neil, you're missing the point of what I intended - if both key1 and key2 are empty, then we fall through to using the id as the tie-breaker. Also, I'm trying to see if this really is what's causing the crash. I'll attach a new patch fixing the sense, if I got it wrong, and catching the other place this can go wrong. But why are we getting empty collation keys in the first place?
Attachment #132860 - Flags: review?(bienvenu) → review-
my intent was to make empty collation keys sort to the top when sorted from a-z, like blank strings - I think my patch does that. If p1 is not empty and p2 is, p1 > p2, so we return 1. If p1 is empty and p2 is not, we return -1. If they're both empty, we use the id as tie breaker. rv is used for the assertion that CompareCollationKeys succeeded, and I didn't change that.
I misunderstood your comment about rv, we were checking an uninitialized rv if we had an empty collation key.
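The ordering discussed in the comments above can be sketched roughly as follows. This is an illustration only, not the patch that was attached: `IdKey` and `CompareIdKey` are hypothetical names, and the keys are treated as C strings compared with `strcmp`, which is a simplifying assumption about how the real collation keys are compared.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for one sort entry (not the real nsMsgDBView type).
struct IdKey {
    std::uint32_t id;   // message id, used as the tie-breaker
    const char*   key;  // collation key; null if key generation failed
};

// Returns <0, 0 or >0 in the manner of strcmp.
// A missing key sorts before any present key, like a blank string would.
int CompareIdKey(const IdKey& p1, const IdKey& p2) {
    int rv = 0; // initialized up front so it is never read uninitialized
    if (p1.key && p2.key) {
        rv = std::strcmp(p1.key, p2.key); // safe: both pointers are non-null
        if (rv != 0) return rv;
    } else if (p1.key) {
        return 1;   // p1 has a key, p2 does not: p1 > p2
    } else if (p2.key) {
        return -1;  // p1 has no key, p2 does: p1 < p2
    }
    // Both keys missing, or keys equal: fall back to the id.
    if (p1.id < p2.id) return -1;
    if (p1.id > p2.id) return 1;
    return rv;
}
```

The point of the null checks is that `strcmp` is never called with a null pointer, which is the crash suspected from the talkback stacks.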
Comment on attachment 132878 (check other call site, and init rv to NS_OK)

OK, so I didn't get everything right, but I didn't get everything wrong either ;-)
Attachment #132878 - Flags: review+
The key could be null if the code to generate a key fails (line 3322 of nsMsgDBView.cpp) - perhaps there is a problem fetching the header string or converting it to unicode.
Mike, have you had a chance to try a 1.6 build with this fix in it? I think any build from 10/10 or later will have the fix. I'm going to mark this fixed - if you still see this problem, could you re-open this?
Status: ASSIGNED → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
I downloaded a 1.6 build on 10/8 and the problem has not reoccurred. If it does, I'll reopen this. Thanks for the fix -- much appreciated, as is the rest of Mozilla!
Attachment #132860 - Flags: superreview?(bienvenu)
Product: MailNews → Core
Product: Core → MailNews Core
Crash Signature: [@ FnSortIdKeyPtr ] [@ MSVCRT.DLL]