Closed Bug 240788 Opened 21 years ago Closed 19 years ago

Marking Message as Junk Hangs Thunderbird

Categories

(Thunderbird :: Mail Window Front End, defect)

x86
Windows XP
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 3

People

(Reporter: mozilla, Assigned: jeongkyu.kim)

References

Details

(Keywords: dataloss, verified1.8.1.3)

Attachments

(1 file)

20040405 through 20040417 trunk builds. Attempting to mark any message as junk hangs Thunderbird with 99% CPU. Deleting training.dat fixes the problem. This is a VERY large training.dat that has been learning for several years. I would rather not just delete it, but instead figure out why it broke. The training.dat file is available in 7zip format at http://testing.bakerweb.biz/mozilla/training.7z
This error corrupted my Inbox. Luckily I had a backup that was run this morning.
Keywords: dataloss
Further details: I was having no problem with Thunderbird until this morning, when I downloaded and ran the 20040417 trunk build. It proceeded to download some messages to the Inbox that were obviously junk. At the end of the download process it froze, displaying "downloading 7 of 7 messages", and the junk mail stayed in the Inbox. After waiting about 10 minutes I killed the task and tried marking the junk messages as Junk with an older build - same problem. I tried deleting training.dat and it worked OK. If I restore training.dat from a backup made just a few hours before this hang, everything seems to be OK.
I have a similar "complaint", so let me add comments here instead of creating a new bug. The issue I have is that when a lot of messages are selected (in my case over 600) and I want to mark them all as not junk (I just fetched my mail for the first time from the remote account, and I know all of the messages to be valid), the email client takes a long time to process all of those messages as not junk. I realize that there may be some expensive underlying processes at work in marking 600 messages as not junk -- that's fine. But if a process is going to take a long time, it's good defensive UI design to tell the user so. I would have liked to see a message to the effect that Thunderbird was "working..." and that the delay (i.e. freeze) was temporary. I'm on Windows XP, running on a dual-proc 2.6GHz machine with 1GB RAM, and this process took about 20 seconds of freeze time for me. I can imagine that many Thunderbird users will have significantly slower machines and will benefit even more from proactive notification about a long process. Thanks for all the good work!
Same problem here as in comment 3: marking as junk (or as not junk) takes a long time, sometimes several minutes or more. It also takes a huge amount of memory. For example, marking 60K spams as such took an hour and 1.6GB of virtual memory (Athlon 3000+, 1GB of RAM; due to swapping, CPU was at about 30%). During this time, Thunderbird was completely unresponsive, and didn't even properly refresh its display.
(In reply to comment #0)
I got the same problem 2 days ago, after 3 months without any problem. Thunderbird hangs with 99% or 100% CPU, indefinitely, just after it has finished downloading mail from one of my mailboxes (this is when it should apply all user-defined filters and anti-spam filters). Nothing I tried worked:
- reinstalling Thunderbird (1.0.2 20050317)
- compacting all mail folders
- deactivating all extensions
And just now I came across the comment from Jerry Baker saying that deleting training.dat fixed the problem. I deleted my own training.dat file, and it's working fine now. Thanks :-) My training.dat file is 1.19MB in size; I kept a copy if anyone needs it.
I found that the training file (from the original reporter) seems to be somewhat corrupted and caused an infinite loop in TB's Bayesian filter. The format of the file can be found in nsBayesianFilter.cpp:

/* Format of the training file for version 1:
   [0xFEEDFACE]
   [number good messages][number bad messages]
   [number good tokens]
   [count][length of word]word
   ...
   [number bad tokens]
   [count][length of word]word
   ...
*/

The problem was that the posted training file had a huge 'length of word' value (2,217,919,599) in one of the bad tokens. When Thunderbird reads tokens, it doubles the suggested buffer size starting from 4096 bytes, which caused an overflow in this case. The overflow turned the buffer size into 0, and further doubling had no effect other than an infinite loop (see code below).

static PRBool readTokens(...)
{
    ...
    PRUint32 size;
    if (readUInt32(stream, &size) != 1) break;
    if (size >= bufferSize) {
        delete[] buffer;
        PRUint32 newBufferSize = 2 * bufferSize;
        while (size >= newBufferSize)
            newBufferSize *= 2;    // <-- infinite loop here
        ...
    }

One possible solution is to ignore the rest of the file when there is an overflow. A side effect of this is losing the training information in the rest of the file, but that is better than an infinite loop. :-) I will post my patch to handle this. By the way, is there any specific reason to use a binary format for the training data? In my opinion, it would be better to use a text format, considering cases such as this one.
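[Editorial note: the following is a minimal standalone sketch of the wrap-around described in this comment and of one possible guard against it. It is not the patch attached to this bug and not Mozilla code; the helper name growBuffer and the use of plain uint32_t instead of PRUint32 are illustration-only choices.]

#include <cstdint>
#include <cstdio>

// Mimics the buffer-growth loop in readTokens(): keep doubling until the
// buffer can hold `size` bytes. With the corrupt length 2,217,919,599 the
// doubling passes 2^31 and wraps to 0, after which the original loop never
// terminates. Returning false here corresponds to the "ignore the rest of
// the file" behaviour proposed above.
static bool growBuffer(uint32_t size, uint32_t &bufferSize)
{
    if (size < bufferSize)
        return true;                        // current buffer is already big enough
    uint32_t newBufferSize = 2 * bufferSize;
    while (size >= newBufferSize) {
        uint32_t doubled = newBufferSize * 2;
        if (doubled <= newBufferSize)       // multiplication wrapped around 2^32
            return false;                   // corrupt length field: stop reading tokens
        newBufferSize = doubled;
    }
    bufferSize = newBufferSize;
    return true;
}

int main()
{
    uint32_t bufferSize = 4096;             // initial token buffer size in readTokens()
    if (!growBuffer(2217919599u, bufferSize))
        std::printf("corrupt token length, truncating training data\n");
    else
        std::printf("buffer grown to %u bytes\n", bufferSize);
    return 0;
}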
see comment #6 seeking r?
Attachment #184221 - Flags: review?(mscott)
Thx for tracking this down Jeongkyu. One reason to use a binary file is that this file can get very large, and a binary format saves some space, which should speed up loading of the file. With your patch, what happens going forward after we throw away the rest of the file? Does new training information get remembered after that? I doubt we can recover from a corruption like this w/o throwing away the rest of the data, so this is probably the best we can do.
(In reply to comment #8)
Thanks for your comment, David. I also agree that a binary format is good for performance: besides saving space, we don't need to convert strings into numbers. However, one drawback is the lack of human readability. Because of this, it is very hard to make a diagnosis or fix things when problems happen. Actually, the effects of corrupted training data can vary. In the case of the original reporter, a size of more than 2G leads to an overflow and an infinite loop; if it had been 1G, TB would have tried to allocate 2GB of heap memory. So I believe it is worth thinking about using a text format. The performance hit from a text format could be compensated by other improvements. For example, TB currently uses at least 4KB of heap for each token, while tokens typically occupy at most a few bytes. I guess this situation could be improved a lot if we allocated a thread-safe common heap for all tokens.

Regarding the patch, I haven't had a chance to make sure that new training information is saved correctly. After applying the patch, I found that the problematic data file was shrunk as expected and there is no more infinite loop. I will try to verify the effect of the patch by writing some code to display the tokens inside the training data file. :-)

Jeongkyu
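[Editorial note: a hedged sketch of the kind of diagnostic dumper mentioned at the end of the previous comment. It is a standalone program invented for illustration, not part of Thunderbird; it follows the version-1 layout quoted in comment 6 and assumes the counts and lengths are stored as native-endian 32-bit integers, which is an assumption rather than something confirmed in this bug.]

#include <cstdint>
#include <cstdio>
#include <vector>

// Reads one 32-bit value in the machine's native byte order (assumed to match
// how Thunderbird wrote training.dat on the same machine).
static bool readUInt32(std::FILE *f, uint32_t *v)
{
    return std::fread(v, sizeof(*v), 1, f) == 1;
}

// Prints "count<TAB>word" for each token; rejects absurd lengths such as the
// corrupt 2,217,919,599 value instead of trying to allocate gigabytes.
static bool dumpTokens(std::FILE *f, uint32_t tokenCount)
{
    for (uint32_t i = 0; i < tokenCount; ++i) {
        uint32_t count, length;
        if (!readUInt32(f, &count) || !readUInt32(f, &length))
            return false;
        if (length > (1u << 20))
            return false;                    // token length looks corrupt
        std::vector<char> word(length + 1, '\0');
        if (length && std::fread(word.data(), 1, length, f) != length)
            return false;
        std::printf("%u\t%s\n", count, word.data());
    }
    return true;
}

int main(int argc, char **argv)
{
    if (argc < 2) {
        std::fprintf(stderr, "usage: %s training.dat\n", argv[0]);
        return 1;
    }
    std::FILE *f = std::fopen(argv[1], "rb");
    if (!f)
        return 1;
    uint32_t magic, good, bad, goodTokens, badTokens;
    bool ok = readUInt32(f, &magic) && magic == 0xFEEDFACE &&
              readUInt32(f, &good) && readUInt32(f, &bad) &&
              readUInt32(f, &goodTokens) && dumpTokens(f, goodTokens) &&
              readUInt32(f, &badTokens) && dumpTokens(f, badTokens);
    if (!ok)
        std::fprintf(stderr, "training.dat looks corrupt or truncated\n");
    std::fclose(f);
    return ok ? 0 : 1;
}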
If you can confirm that the training file grows again after we shrink it down, then I'll review the patch. I.e., after triggering this truncation we should make sure future attempts at training get reflected back into the training data file.
Is bug 295843 a dupe of this?
Comment on attachment 184221 [details] [diff] [review] a patch to check overflow in reading tokens of training data clearing the review request pending an answer to an old question of mine in the last comment.
Attachment #184221 - Flags: review?(mscott)
Hi Scott,

> If you can confirm that the training file grows again after we shrink it
> down, then I'll review the patch.

Sorry for being super late. :-) I just confirmed that the training file grows again. Here is what I've done:
1. Cleaned up the Thunderbird profile and set up a test account.
2. Turned on junk mail controls and closed Thunderbird.
3. Copied the problematic training.dat into my profile. The data size at this point was 2,162,378 bytes.
4. Opened Thunderbird and received a test mail.
5. Marked the mail as junk.
6. Confirmed that Thunderbird hung.
7. Applied my patch and rebuilt Thunderbird.
8. Opened the patched Thunderbird.
9. Marked the test mail as junk.
10. Confirmed that Thunderbird did not hang.
11. Closed Thunderbird.
12. Checked the data file. It was shrunken as expected and its size was 524,664 bytes.
13. Opened Thunderbird again and marked several test mails as junk.
14. Closed Thunderbird and checked the data file. It had grown as expected and its size was 525,042 bytes.
Attachment #184221 - Flags: review?(mscott)
Comment on attachment 184221 [details] [diff] [review] a patch to check overflow in reading tokens of training data cool. Although I don't think this issue needs to have the bug cited in a comment, folks can use lxr for that.
Attachment #184221 - Flags: review?(mscott) → review+
Comment on attachment 184221 [details] [diff] [review] a patch to check overflow in reading tokens of training data on the one hand, we could try incrementing the newBufferSize in the situation where doubling it produces 0, but on the other hand, when it's that big, might as well prune it.
Attachment #184221 - Flags: superreview+
(In reply to comment #14)
> (From update of attachment 184221 [details] [diff] [review] [edit])
> cool. Although I don't think this issue needs to have the bug cited in
> a comment, folks can use lxr for that.

Got it. Would you take the comment out when you commit the patch?

(In reply to comment #15)
> (From update of attachment 184221 [details] [diff] [review] [edit])
> on the one hand, we could try incrementing the newBufferSize in the situation
> where doubling it produces 0, but on the other hand, when it's that big, might
> as well prune it.

I believe that the initial buffer size (4KB) is already big enough for each token. The actual problem here is not how we handle the buffer size but why this problematic case happens at all. The data file was somehow damaged, but I could not think of any possible scenario.
Blocks: 295843
Jeongkyu Kim in comment #16:
> ...
> The data file was somehow damaged, but I could not think of any possible
> scenario.

Jeongkyu, there are testcases in bug 215701 and friends (perhaps this might lead to a fix for them).
Severity: normal → critical
QA Contact: front-end
Whiteboard: [checkin needed]
Assignee: mscott → jeongkyu.kim
mozilla/mailnews/extensions/bayesian-spam-filter/src/nsBayesianFilter.cpp 1.57
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Whiteboard: [checkin needed]
Target Milestone: --- → Thunderbird 3
Version: unspecified → Trunk
*** Bug 317798 has been marked as a duplicate of this bug. ***
*** Bug 295843 has been marked as a duplicate of this bug. ***
No longer blocks: 295843
Will this be picked up in TB 2 and/or SM 1.1?
Comment on attachment 184221 [details] [diff] [review] a patch to check overflow in reading tokens of training data Wayne: Ask for "approval1.8.1" on the attachment rather than just posting a comment, it'll be harder to miss that way.
Attachment #184221 - Flags: approval1.8.1?
Attachment #184221 - Flags: approval1.8.1? → approval-thunderbird2?
Comment on attachment 184221 [details] [diff] [review] a patch to check overflow in reading tokens of training data per a discussion with ajschultz
Attachment #184221 - Flags: approval-thunderbird2? → approval-thunderbird2+
landed on branch
Keywords: fixed1.8.1.2
Verified fixed for 1.8.1.3 using Mozilla/5.0 (Windows; U; Windows NT 5.2; en-US; rv:1.8.1.3) Gecko/20070326 Thunderbird/2.0.0.0 ID:2007032620 (Thunderbird 2 RC 1) Marking Messages as Junk is working fine, no hang on several tests.