Closed
Bug 240819
Opened 21 years ago
Closed 20 years ago
Crash in mail.dll when checking mail - TB073 [@ nsTransform2D::SetToIdentity ]
Categories
(Thunderbird :: Mail Window Front End, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
Thunderbird0.8
People
(Reporter: jay, Assigned: mscott)
Details
(Keywords: crash, topcrash, Whiteboard: fixed-aviary1.0)
Crash Data
Attachments
(1 file)
patch, 1.21 KB
Mozilla Thunderbird 0.6+ (20040417) Win98SE
Crash in MAIL.DLL when checking mail, manual or auto-check. No error message,
just the crash.
Not 100% reproducible, 75% would be a more accurate figure.
Comment 1•21 years ago
(In reply to comment #0)
> Mozilla Thunderbird 0.6+ (20040417) Win98SE
>
> Crash in MAIL.DLL when checking mail, manual or auto-check. No error message,
> just the crash.
>
> Not 100% reproducible, 75% would be a more accurate figure.
I have the same issue. It appears to be caused by the adaptive Junk Mail
Controls.
Comment 2•21 years ago
Jay Garcia: Can you reproduce this with Thunderbird 0.7.x? If so, could you
provide the Talkback ID?
Comment 3•20 years ago
I've now disabled junk mail detection for a week or so; in that time, I have not
had a single crash. Certainly seems related...
Comment 4•20 years ago
I forgot to mention... I'm running 0.7.2 on Win2k.
Comment 5•20 years ago
Can someone please turn on Talkback, turn on junk mail filtering, let it crash,
send the Talkback report, and post the Talkback ID here? Pretty please? ;)
Comment 6•20 years ago
Of course, now I can't seem to get it to crash.
However, here are some talkback IDs, in reverse chronological order:
TB451242H (30 July)
TB450929Q (30 July)
TB436895X (29 July)
TB436804Z (29 July)
TB435573G (28 July)
TB435141X (28 July)
TB433148M (28 July)
TB420253X (26 July)
TB368206Z (19 July)
TB360832E (18 July)
TB354886M (18 July)
TB351673Z (17 July)
Comment 7•20 years ago
David, your incidents are probably for several different crashes:
nsTransform2D::SetToIdentity
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=451242
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=450929
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=436895
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=436804
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=435573
nsBayesianFilter::classifyMessage
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=420253
nsViewManager::DispatchEvent
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=435141
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=433148
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=368206
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=360832
TimerThread::UpdateFilter
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=354886
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=351673
Comment 8•20 years ago
(In reply to comment #7)
Well, the only crash I've seen has been caused by downloading mail. However, I
did have some follow-on crashes related to Talkback (where Thunderbird would
crash, Talkback would hang, lots of task zaniness ensues requiring Task Manager
to kill things off).
Besides, I would be surprised if things like nsTransform2D::SetToIdentity,
nsViewManager::DispatchEvent, or TimerThread::UpdateFilter had segv-like bugs in
their implementation -- most of TB wouldn't work. This makes me suspect the
Bayesian filter is causing memory corruption (I'd need to run a purified version
to test this hypothesis).
Comment 9•20 years ago
Wait a minute. These are all "invalid operation" exceptions, and they're all
occurring in methods which use floating-point operations. (The line numbers in
the stack traces are off, not sure why...)
I'm running on an Athlon XP machine, nothing terribly out of the ordinary
(except that it's old). Not overclocking, overtweaking, overanything. I've
seen similar errors (floating point exceptions) in Acrobat Reader.
Hmm.
Comment 10•20 years ago
Ok, just got it again. Here's the talkback:
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=2&type=iid&id=548292
Again, this shows the error happening in nsTransform2D::SetToIdentity().
Very weird.
Win2k (5.00.2195), sp 4, Athlon XP 1800+, 1GB of RAM.
Comment 11•20 years ago
Hmm. I see 308 talkbacks with the SetToIdentity() trace, and a lot of them
mention "downloading mail."
http://talkback-public.mozilla.org/talkback/fastfind.jsp?search=1&searchby=stacksig&match=begins&searchfor=nstransform2d%3A%3ASetToIdentity
However, I believe that downloading mail isn't the crux of the problem; it's
something in the way we're doing floating-point. To dig any deeper, I'm going
to need to explore this in a debugger (which means updating my version of MSVC
from the ancient 5.0...).
Comment 12•20 years ago
Okay, this bug is about the crash in nsTransform2D::SetToIdentity.
Chris, do you have any idea what situation could crash nsTransform2D?
TB548292:
nsTransform2D::SetToIdentity [../../../dist/include/gfx/nsTransform2D.h, line 89]
nsRenderingContextWinConstructor [e:/builds/tbird-0.7.2/WINNT_5.0_Clobber/mozilla/gfx/src/windows/nsGfxFactoryWin.cpp, line 63]
nsComponentManager::CreateInstance [e:/builds/tbird-0.7.2/WINNT_5.0_Clobber/mozilla/xpcom/components/nsComponentManagerObsolete.cpp, line 103]
nsWindow::OnPaint [e:/builds/tbird-0.7.2/WINNT_5.0_Clobber/mozilla/widget/src/windows/nsWindow.cpp, line 5039]
nsWindow::ProcessMessage [e:/builds/tbird-0.7.2/WINNT_5.0_Clobber/mozilla/widget/src/windows/nsWindow.cpp, line 3825]
nsWindow::WindowProc [e:/builds/tbird-0.7.2/WINNT_5.0_Clobber/mozilla/widget/src/windows/nsWindow.cpp, line 1349]
USER32.DLL + 0x1ef0 (0x77e11ef0)
USER32.DLL + 0x3869 (0x77e13869)
USER32.DLL + 0x38ab (0x77e138ab)
ntdll.dll + 0x1ff57 (0x77f9ff57)
USER32.DLL + 0x21af (0x77e121af)
nsAppShellService::Run [e:/builds/tbird-0.7.2/WINNT_5.0_Clobber/mozilla/xpfe/appshell/src/nsAppShellService.cpp, line 495]
main [e:/builds/tbird-0.7.2/WINNT_5.0_Clobber/mozilla/mail/app/nsMailApp.cpp, line 58]
KERNEL32.DLL + 0x11af6 (0x7c581af6)
Summary: Crash in mail.dll when checking mail → Crash in mail.dll when checking mail [@ nsTransform2D::SetToIdentity ]
Comment 13•20 years ago
(In reply to comment #12)
Argh, no. This bug is about invalid floating point state that is being
triggered by the Bayesian filter, not about one specific stack trace. It just
so happens that nsTransform2D::SetToIdentity is one of the more common floating
point functions being called after we leave the Bayesian filter. Tagging this
as a nsTransform2D::SetToIdentity bug is a red herring.
I have a hunch this is related to a floating-point optimization in the build
flags that is failing on, say, AMD vs. Intel chips. Note that the failure is
NOT a GPF but an Invalid Operation and *always* on FP code. This is key.
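For readers unfamiliar with the "Invalid Operation" class of floating-point error, here is a minimal sketch (not code from Thunderbird) of how a 0.0 / 0.0 division raises the IEEE 754 invalid-operation condition. The function names are hypothetical, and the sketch assumes the default C++ floating-point environment, where the exception is masked and only a status flag is set. On x87 hardware with the exception unmasked, the fault is delivered at a later waiting floating-point instruction, which would be consistent with crashes being reported in unrelated FP code; the thread itself does not confirm that this is what happened.

```cpp
#include <cfenv>
#include <cmath>

// Hypothetical helper: performs 0.0 / 0.0 at run time and reports whether
// the IEEE 754 "invalid operation" status flag was raised by it.
bool zeroOverZeroRaisesInvalid() {
    std::feclearexcept(FE_ALL_EXCEPT);      // start from a clean flag state
    volatile double zero = 0.0;             // volatile prevents compile-time folding
    volatile double result = zero / zero;   // invalid operation: produces NaN
    (void)result;
    return std::fetestexcept(FE_INVALID) != 0;
}

// Hypothetical helper: confirms that the result of 0.0 / 0.0 is a NaN.
bool zeroOverZeroIsNaN() {
    volatile double zero = 0.0;
    double result = zero / zero;
    return std::isnan(result);
}
```

With the exception masked, the NaN simply propagates through later arithmetic; the flag stays set until something clears it.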
Comment 14•20 years ago
Adding TB073 and topcrash keyword. This is a topcrasher for Thunderbird 0.7.3
(currently #7):
http://talkback-public.mozilla.org/reports/thunderbird/TB073/TB073-topcrashers.html
Keywords: topcrash
Summary: Crash in mail.dll when checking mail [@ nsTransform2D::SetToIdentity ] → Crash in mail.dll when checking mail - TB073 [@ nsTransform2D::SetToIdentity ]
Assignee | ||
Comment 15•20 years ago
Hopefully we can get some traction on this for 0.8.
Status: NEW → ASSIGNED
Target Milestone: --- → Thunderbird0.8
Comment 16•20 years ago
Here's the best breakdown I could come up with for
"nsTransform2D::SetToIdentity" crashes...by processor type and brand/vendor:
80x86 (1 Subgroup): 124 Incidents
    GenuineIntel: 124 Incidents
Pentium (2 Subgroups): 230 Incidents
    AuthenticAMD: 149 Incidents
    GenuineIntel: 81 Incidents
Pentium II (2 Subgroups): 7 Incidents
    AuthenticAMD: 5 Incidents
    GenuineIntel: 2 Incidents
Doesn't look like Talkback collects any more details about the type of processor.
Assignee | ||
Comment 17•20 years ago
There's a pretty good chance the patch in Bug #244357 will fix this crash but I
haven't had time to regression test it on the junk scores it generates.
Comment 18•20 years ago
I'm not sure bug 244357 will fix this problem. For the incident involving
nsBayesianFilter.cpp I would look at this code.
    /* this part is similar to the Graham algorithm with some adjustments. */
    PRUint32 i, goodclues=0, count = tokenizer.countTokens();
--> double ngood = mGoodCount, nbad = mBadCount, prob;
    for (i = 0; i < count; ++i)
    {
        Token& token = tokens[i];
        const char* word = token.mWord;
        Token* t = mGoodTokens.get(word);
        double hamcount = ((t != NULL) ? t->mCount : 0);
        t = mBadTokens.get(word);
        double spamcount = ((t != NULL) ? t->mCount : 0);
-->     prob = (spamcount / nbad) / ( hamcount / ngood + spamcount / nbad);
        double n = hamcount + spamcount;
        prob = (0.225 + n * prob) / (.45 + n);
        ...
How do you know ngood and nbad are non-zero?
Also, the second marked line should probably be written to eliminate some of
the divisions
prob = (spamcount * ngood)/(hamcount *nbad + spamcount * ngood)
I think.
Comment 19•20 years ago
I missed the more obvious issue. If t is null then both hamcount and spamcount
are zero and you have a problem.
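To make the concern concrete, here is a minimal sketch of a guarded version of the per-token calculation. It is not the code in nsBayesianFilter.cpp or the attached patch: the function name and the bool/out-parameter shape are assumptions made for illustration, and counts are assumed to be non-negative. It uses the algebraically equivalent form with a single division, so one check on the denominator covers both the "token unknown in both corpora" case and the "ngood and nbad both zero" case.

```cpp
// Hypothetical helper. Returns false when the token carries no usable
// evidence (the denominator is zero); otherwise writes the smoothed
// probability to `prob` and returns true.
bool tokenSpamProbability(double hamcount, double spamcount,
                          double ngood, double nbad, double& prob) {
    // (spamcount / nbad) / (hamcount / ngood + spamcount / nbad), with the
    // numerator and denominator both multiplied by ngood * nbad.
    const double denom = hamcount * nbad + spamcount * ngood;
    if (denom == 0.0) {
        return false;  // would be 0.0 / 0.0; caller should skip this token
    }
    const double raw = (spamcount * ngood) / denom;

    // Smoothing step copied from the quoted snippet.
    const double n = hamcount + spamcount;
    prob = (0.225 + n * raw) / (0.45 + n);
    return true;
}
```

A caller would skip the token (the `continue` in the loop) whenever the function returns false, which leaves open the question of what a "useful" fallback value would be.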
Comment 20•20 years ago
Maybe something like
double denom = hamcount * nbad + spamcount * ngood;
if (denom == 0.0)
{
    // do something useful, but I don't know what
    continue;
}
else
    prob = (spamcount * ngood) / denom;
Assignee | ||
Comment 21•20 years ago
David Cuthbert, how easy is it for you to run into this crash? If we checked in
some potential fixes can you use the build and say within a day or two that the
crash is gone? Or is it not that frequent?
Comment 22•20 years ago
(In reply to comment #21)
Oh, easily. I get enough spam that repeatedly testing this is trivial.
Hm, a good test might be to back up the profile and download the same set of
mail between the two versions (the idea being that the old one crashes, new one
doesn't).
Assignee | ||
Comment 23•20 years ago
Here's a possible patch based on some comments by tenthumbs to avoid a possible
division by zero situation.
Assignee | ||
Comment 24•20 years ago
Comment on attachment 156822 [details] [diff] [review]
possible fix to protect against a division by zero
tenthumbs, what do you think of this?
Attachment #156822 -
Flags: review?(tenthumbs)
Assignee | ||
Comment 25•20 years ago
David C, I just checked in this potential fix into the 0.8 branch in the hopes
that you can grab a build with the fix and see if it does indeed address the
problem.
Can you please look for a 0.8 test build here:
http://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/latest-0.8/
You'll need to wait until builds for August 24th come out. Thanks!
Whiteboard: fixed-aviary1.0
Comment 26•20 years ago
Yep; I'll check it when it comes out.
Comment 27•20 years ago
Are you absolutely positive that ngood and nbad can never simultaneously be
zero? I can't really see it from the code.
The original Graham algorithm actually does this:
n1 = min(1, spamcount / nbad);
d1 = min(1, hamcount / ngood);
d2 = min(1, spamcount / nbad);
prob = n1 / (d1 + d2);
which would catch ngood or nbad being zero. That's inefficient and could throw
exceptions but maybe it's useful. I'm not sure, though.
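As a rough illustration of the clamped form, the sketch below adds explicit zero checks so that no division by zero is ever executed. The helper names and the choice of fallback values are assumptions, not something taken from the patch or from Graham's code: a positive count over a zero total is treated as 1 (the value the clamp would give for an infinite ratio), and 0 over 0 is treated as 0.

```cpp
#include <algorithm>

// Hypothetical helper: min(1, count / total) without ever dividing by zero.
double clampedRatio(double count, double total) {
    if (total <= 0.0) {
        // Assumed fallbacks: saturate when there is a count but no total,
        // and report no evidence when both are zero.
        return count > 0.0 ? 1.0 : 0.0;
    }
    return std::min(1.0, count / total);
}

// Hypothetical helper: prob = n1 / (d1 + d2) from the comment above.
// Returns false when both ratios are zero, so the caller can skip the token.
bool clampedSpamProbability(double hamcount, double spamcount,
                            double ngood, double nbad, double& prob) {
    const double bad = clampedRatio(spamcount, nbad);
    const double good = clampedRatio(hamcount, ngood);
    if (good + bad == 0.0) {
        return false;
    }
    prob = bad / (good + bad);
    return true;
}
```

The clamp alone handles a zero total only when the count is positive; the 0 / 0 case still needs the separate check, which is the situation the thread is worried about.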
Assignee | ||
Comment 28•20 years ago
FYI David, the builds are now out:
http://ftp.mozilla.org/pub/mozilla.org/thunderbird/nightly/latest-0.8/
Comment 29•20 years ago
Ok... no good idea whether the fix helped or not yet. Here's what I did:
--- Verify that we have a proper testcase ---
1. Downloaded my mail (93 messages, ~500k, mostly spam) to a Linux host. Copied
the mail so we can fool the POP3 server into hosting it multiple times.
2. In the buggy TB, I enabled junk mail filtering, and then closed the application.
3. Copied my Thunderbird profile to a backup so we could restore the state
(C:\Documents and Settings\dacut\Application Data\Thunderbird -> Thunderbird_orig)
4. Start up the buggy TB. Grabbed mail from the POP3 server. It crashed with
the same ol' talkback trace (TB645214 if you're curious).
Ok, at this point we know that we have a testcase which causes the bug.
--- Verify the fix ---
5. Installed the nightly into a different directory (C:\Program Files\thundertest).
6. Delete my Thunderbird profile, restore from Thunderbird_orig.
7. Restore my mail on the POP3 server.
8. Start up the nightly (from the command line to ensure I'm not starting the
buggy version).
9. Download POP3 mail. No crash, lots of spam identified and properly filtered.
Ooh, ok, this looks promising!
10. Shut down the nightly.
--- Sanity check: make sure buggy version still fails ---
11. Restore TB profile.
12. Restore mail.
13. Start up the buggy version.
14. Download mail. This time, however, no crash, and spam is again identified,
just as if I had run the nightly.
Hm. Puzzling. Perhaps it's picking up a component from the nightly?
15. Shut TB down.
16. Delete nightly install.
17. Restore TB profile
18. Restore mail.
19. Start up the buggy version again.
20. Download mail. Again, no crash, spam identified properly.
I'm... stumped. Is there anything stochastic about the spam classifier (e.g.,
using a random variable seeded by the timer)? If I rerun the buggy version
multiple times (from the restored mail+profile), could I encounter the crash?
Does the spam classifier store any state in the registry (in addition to
training.dat)?
Also, the nightly reports its version as 0.7.0. I'm assuming this is because
the number simply hasn't been updated... if I grabbed a bum build, let me know.
Assignee | ||
Comment 30•20 years ago
I think we fixed this. Optimistically marking this fixed, as it's now on the
trunk and the branch.
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Updated•20 years ago
Attachment #156822 -
Flags: review?(tenthumbs)
Updated•14 years ago
Crash Signature: [@ nsTransform2D::SetToIdentity ]