Closed
Bug 240819
Opened 20 years ago
Closed 20 years ago
Crash in mail.dll when checking mail - TB073 [@ nsTransform2D::SetToIdentity ]
Categories
(Thunderbird :: Mail Window Front End, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
Thunderbird0.8
People
(Reporter: jay, Assigned: mscott)
Details
(Keywords: crash, topcrash, Whiteboard: fixed-aviary1.0)
Crash Data
Attachments
(1 file)
1.21 KB,
patch
Mozilla Thunderbird 0.6+ (20040417) Win98SE
Crash in MAIL.DLL when checking mail, manual or auto-check. No error message, just the crash.
Not 100% reproducible, 75% would be a more accurate figure.
Comment 1•20 years ago
(In reply to comment #0)
> Mozilla Thunderbird 0.6+ (20040417) Win98SE
>
> Crash in MAIL.DLL when checking mail, manual or auto-check. No error message,
> just the crash.
>
> Not 100% reproducible, 75% would be a more accurate figure.
I have the same issue. It appears to be caused by the adaptive Junk Mail Controls.
Comment 2•20 years ago
Jay Garcia: Can you reproduce this with Thunderbird 0.7.x? If so, could you provide the Talkback ID?
Comment 3•20 years ago
I've now disabled junk mail detection for a week or so; in that time, I have not had a single crash. Certainly seems related...
Comment 4•20 years ago
I forgot to mention... I'm running 0.7.2 on Win2k.
Can someone please turn on Talkback, turn on junk mail filtering, let it crash, send the Talkback report, and post the Talkback number here, pretty please? ;)
Comment 6•20 years ago
Of course, now I can't seem to get it to crash. However, here are some talkback IDs, in reverse chronological order:
TB451242H (30 July)
TB450929Q (30 July)
TB436895X (29 July)
TB436804Z (29 July)
TB435573G (28 July)
TB435141X (28 July)
TB433148M (28 July)
TB420253X (26 July)
TB368206Z (19 July)
TB360832E (18 July)
TB354886M (18 July)
TB351673Z (17 July)
Comment 7•20 years ago
David, your incidents are probably for several different crashes:
nsTransform2D::SetToIdentity
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=451242
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=450929
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=436895
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=436804
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=435573
nsBayesianFilter::classifyMessage
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=420253
nsViewManager::DispatchEvent
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=435141
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=433148
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=368206
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=360832
TimerThread::UpdateFilter
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=354886
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=351673
Comment 8•20 years ago
(In reply to comment #7) Well, the only crash I've seen has been caused by downloading mail. However, I did have some follow-on crashes related to Talkback (where Thunderbird would crash, Talkback would hang, lots of task zaniness ensues requiring Task Manager to kill things off). Besides, I would be surprised if things like nsTransform2D::SetToIdentity, nsViewManager::DispatchEvent, or TimerThread::UpdateFilter had segv-like bugs in their implementation -- most of TB wouldn't work. This makes me suspect the Bayesian filter is causing memory corruption (I'd need to run a purified version to test this hypothesis).
Comment 9•20 years ago
Wait a minute. These are all "invalid operation" exceptions, and they're all occurring in methods which use floating-point operations. (The line numbers in the stack traces are off, not sure why...) I'm running on an Athlon XP machine, nothing terribly out of the ordinary (except that it's old). Not overclocking, overtweaking, overanything. I've seen similar errors (floating point exceptions) in Acrobat Reader. Hmm.
Comment 10•20 years ago
Ok, just got it again. Here's the talkback: http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=548292 Again, this shows the error happening in nsTransform2D::SetToIdentity(). Very weird. Win2k (5.00.2195), sp 4, Athlon XP 1800+, 1GB of RAM.
Comment 11•20 years ago
Hmm. I see 308 talkbacks with the SetToIdentity() trace, and a lot of them mention "downloading mail." http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=1&searchby=stacksig&match=begins&searchfor=nstransform2d%3A%3ASetToIdentity However, I believe that downloading mail isn't the crux of the problem; it's something in the way we're doing floating-point. To dig any deeper, I'm going to need to explore this in a debugger (which means updating my version of MSVC from the ancient 5.0...).
Comment 12•20 years ago
Okay, this bug is about the crash in nsTransform2D::SetToIdentity. Chris, do you have any idea what situation could crash nsTransform2D?
TB548292:
nsTransform2D::SetToIdentity [../../../dist/include/gfx/nsTransform2D.h, line 89]
nsRenderingContextWinConstructor [e:/builds/tbird-0.7.2/WINNT_5.0_Clobber/mozilla/gfx/src/windows/nsGfxFactoryWin.cpp, line 63]
nsComponentManager::CreateInstance [e:/builds/tbird-0.7.2/WINNT_5.0_Clobber/mozilla/xpcom/components/nsComponentManagerObsolete.cpp, line 103]
nsWindow::OnPaint [e:/builds/tbird-0.7.2/WINNT_5.0_Clobber/mozilla/widget/src/windows/nsWindow.cpp, line 5039]
nsWindow::ProcessMessage [e:/builds/tbird-0.7.2/WINNT_5.0_Clobber/mozilla/widget/src/windows/nsWindow.cpp, line 3825]
nsWindow::WindowProc [e:/builds/tbird-0.7.2/WINNT_5.0_Clobber/mozilla/widget/src/windows/nsWindow.cpp, line 1349]
USER32.DLL + 0x1ef0 (0x77e11ef0)
USER32.DLL + 0x3869 (0x77e13869)
USER32.DLL + 0x38ab (0x77e138ab)
ntdll.dll + 0x1ff57 (0x77f9ff57)
USER32.DLL + 0x21af (0x77e121af)
nsAppShellService::Run [e:/builds/tbird-0.7.2/WINNT_5.0_Clobber/mozilla/xpfe/appshell/src/nsAppShellService.cpp, line 495]
main [e:/builds/tbird-0.7.2/WINNT_5.0_Clobber/mozilla/mail/app/nsMailApp.cpp, line 58]
KERNEL32.DLL + 0x11af6 (0x7c581af6)
Summary: Crash in mail.dll when checking mail → Crash in mail.dll when checking mail [@ nsTransform2D::SetToIdentity ]
Comment 13•20 years ago
(In reply to comment #12) Argh, no. This bug is about invalid floating point state that is being triggered by the Bayesian filter, not about one specific stack trace. It just so happens that nsTransform2D::SetToIdentity is one of the more common floating point functions being called after we leave the Bayesian filter. Tagging this as a nsTransform2D::SetToIdentity bug is a red herring. I have a hunch this is related to a floating-point optimization in the build flags that is failing on, say, AMD vs. Intel chips. Note that the failure is NOT a GPF but an Invalid Operation and *always* on FP code. This is key.
Comment 14•20 years ago
Adding TB073 and topcrash keyword. This is a topcrasher for Thunderbird 0.7.3 (currently #7): http://talkback-public.mozilla.org/reports/thunderbird/TB073/TB073-topcrashers.html
Keywords: topcrash
Summary: Crash in mail.dll when checking mail [@ nsTransform2D::SetToIdentity ] → Crash in mail.dll when checking mail - TB073 [@ nsTransform2D::SetToIdentity ]
Assignee | ||
Comment 15•20 years ago
Hopefully we can get some traction on this for 0.8.
Status: NEW → ASSIGNED
Target Milestone: --- → Thunderbird0.8
Comment 16•20 years ago
Here's the best breakdown I could come up with for "nsTransform2D::SetToIdentity" crashes...by processor type and brand/vendor:
80x86 (1 Subgroup): 124 Incidents
    GenuineIntel: 124 Incidents
Pentium (2 Subgroups): 230 Incidents
    AuthenticAMD: 149 Incidents
    GenuineIntel: 81 Incidents
Pentium II (2 Subgroups): 7 Incidents
    AuthenticAMD: 5 Incidents
    GenuineIntel: 2 Incidents
Doesn't look like Talkback collects any more details about the type of processor.
Assignee | ||
Comment 17•20 years ago
There's a pretty good chance the patch in Bug #244357 will fix this crash but I haven't had time to regression test it on the junk scores it generates.
Comment 18•20 years ago
I'm not sure bug 244357 will fix this problem. For the incident involving nsBayesianFilter.cpp I would look at this code.
    /* this part is similar to the Graham algorithm with some adjustments. */
    PRUint32 i, goodclues=0, count = tokenizer.countTokens();
--> double ngood = mGoodCount, nbad = mBadCount, prob;
    for (i = 0; i < count; ++i) {
        Token& token = tokens[i];
        const char* word = token.mWord;
        Token* t = mGoodTokens.get(word);
        double hamcount = ((t != NULL) ? t->mCount : 0);
        t = mBadTokens.get(word);
        double spamcount = ((t != NULL) ? t->mCount : 0);
-->     prob = (spamcount / nbad) / ( hamcount / ngood + spamcount / nbad);
        double n = hamcount + spamcount;
        prob = (0.225 + n * prob) / (.45 + n);
        ...
How do you know ngood and nbad are non-zero? Also, the second marked line should probably be written to eliminate some of the divisions
    prob = (spamcount * ngood)/(hamcount *nbad + spamcount * ngood)
I think.
Comment 19•20 years ago
I missed the more obvious issue. If t is null then both hamcount and spamcount are zero and you have a problem.
Comment 20•20 years ago
Maybe something like
    double denom = hamcount * nbad + spamcount * ngood;
    if (denom == 0.0) {
        // do something useful, but I don't know what
        continue;
    }
    else
        prob = (spamcount * ngood) / denom;
Assignee | ||
Comment 21•20 years ago
David Cuthbert, how easy is it for you to run into this crash? If we checked in some potential fixes can you use the build and say within a day or two that the crash is gone? Or is it not that frequent?
Comment 22•20 years ago
(In reply to comment #21) Oh, easily. I get enough spam that repeatedly testing this is trivial. Hm, a good test might be to back up the profile and download the same set of mail between the two versions (the idea being that the old one crashes, new one doesn't).
Assignee | ||
Comment 23•20 years ago
Here's a possible patch based on some comments by tenthumbs to avoid a possible division by zero situation.
Assignee | ||
Comment 24•20 years ago
Comment on attachment 156822
possible fix to protect against a division by zero
tenthumbs, what do you think of this?
Attachment #156822 -
Flags: review?(tenthumbs)
Assignee | ||
Comment 25•20 years ago
David C, I just checked in this potential fix into the 0.8 branch in the hopes that you can grab a build with the fix and see if it does indeed address the problem. Can you please look for a 0.8 test build here: http://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/latest-0.8/ You'll need to wait until builds for August 24th come out. Thanks!
Whiteboard: fixed-aviary1.0
Comment 26•20 years ago
Yep; I'll check it when it comes out.
Comment 27•20 years ago
Are you absolutely positive that ngood and nbad can never simultaneously be zero? I can't really see it from the code. The original Graham algorithm actually does this.
    n1 = min(1, spamcount / nbad);
    d1 = min(1, hamcount / ngood);
    d2 = min(1, spamcount / nbad);
    prob = n1 / (d1 + d2);
which would catch ngood or nbad being zero. That's inefficient and could throw exceptions but maybe it's useful. I'm not sure, though.
Assignee | ||
Comment 28•20 years ago
FYI David, the builds are now out: http://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/latest-0.8/
Comment 29•20 years ago
Ok... no good idea whether the fix helped or not yet. Here's what I did:
--- Verify that we have a proper testcase ---
1. Downloaded my mail (93 messages, ~500k, mostly spam) to a Linux host. Copied the mail so we can fool the POP3 server into hosting it multiple times.
2. In the buggy TB, I enabled junk mail filtering, and then closed the application.
3. Copied my Thunderbird profile to a backup so we could restore the state (C:\Documents and Settings\dacut\Application Data\Thunderbird -> Thunderbird_orig)
4. Start up the buggy TB. Grabbed mail from the POP3 server. It crashed with the same ol' talkback trace (TB645214 if you're curious).
Ok, at this point we know that we have a testcase which causes the bug.
--- Verify the fix ---
5. Installed the nightly into a different directory (C:\Program Files\thundertest).
6. Delete my Thunderbird profile, restore from Thunderbird_orig.
7. Restore my mail on the POP3 server.
8. Start up the nightly (from the command line to ensure I'm not starting the buggy version).
9. Download POP3 mail. No crash, lots of spam identified and properly filtered. Ooh, ok, this looks promising!
10. Shut down the nightly.
--- Sanity check: make sure buggy version still fails ---
11. Restore TB profile.
12. Restore mail.
13. Start up the buggy version.
14. Download mail. This time, however, no crash, and spam is again identified, just as if I had run the nightly. Hm. Puzzling. Perhaps it's picking up a component from the nightly?
15. Shut TB down.
16. Delete nightly install.
17. Restore TB profile
18. Restore mail.
19. Start up the buggy version again.
20. Download mail. Again, no crash, spam identified properly.
I'm... stumped. Is there anything stochastic about the spam classifier (e.g., using a random variable seeded by the timer)? If I rerun the buggy version multiple times (from the restored mail+profile), could I encounter the crash? Does the spam classifier store any state in the registry (in addition to training.dat)?
Also, the nightly reports its version as 0.7.0. I'm assuming this is because the number simply hasn't been updated... if I grabbed a bum build, let me know.
Assignee | ||
Comment 30•20 years ago
I think we fixed this. Optimistically marking this fixed as it's now on the trunk and the branch.
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Updated•19 years ago
Attachment #156822 -
Flags: review?(tenthumbs)
Updated•13 years ago
Crash Signature: [@ nsTransform2D::SetToIdentity ]