training.dat not created or updated with many accounts

RESOLVED FIXED

Status

RESOLVED FIXED
14 years ago
10 years ago

People

(Reporter: tuukka.tolvanen, Assigned: Bienvenu)

Tracking

Trunk
x86
Linux
Bug Flags:
blocking-aviary1.5 +

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Reporter)

Description

14 years ago
User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8b) Gecko/20050218 Firefox/1.0+
Build Identifier: version 1.0+ (20050223)

training.dat isn't being created or updated in my main profile. I fully
recreated it from scratch, copying only the mailboxes and .mfs files, but it
still fails. There are 2 POP3S, 2 POP, 1 IMAPS, 5 NNTP and 1 RSS accounts
configured.

If I create a test profile with just one account, training.dat is created
properly. Woo hoo :(

Reproducible: Always

Steps to Reproduce:
0. delete training.dat if you fancy
1. start, get some mail
2. mark some spam and ham
3. exit
Actual Results:  
observe that training.dat is not present/updated in profile

Expected Results:  
training.dat should be present/updated in profile
(Reporter)

Comment 1

14 years ago
The more exact pull time for the build is 2005-02-23-14Z, i.e. about 20h after
the checkin for bug 283080.
(Reporter)

Comment 2

14 years ago
I set the following prefs:

user_pref("mailnews.bayesian_spam_filter.flush.minimum_interval", 10000);
user_pref("mailnews.bayesian_spam_filter.flush.diryting_messages_threshold", 1);

(sic, <http://lxr.mozilla.org/mozilla/source/mailnews/extensions/bayesian-spam-filter/src/nsBayesianFilter.cpp#937>)

and I'm now seeing training.dat created while running. The corresponding
defaults are

+#define DEFAULT_MIN_INTERVAL_BETWEEN_WRITES             15*60*1000
+#define DEFAULT_WRITE_TRAINING_DATA_MESSAGES_THRESHOLD  50

So the problem is that the sync on exit fails, and to a lesser degree that the
dirtying threshold is way too high. The default 15 minutes for the timer is
sensible enough, but I don't see any point in requiring >1 unsynced training
actions for the timed flushing to ever even take place. (50 training actions in
one session just indicates abuse or benchmarking anyway imo.)
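
To make the interaction of the two prefs concrete, here is a minimal sketch of
the timed-flush decision as described in this comment. It is not the code from
nsBayesianFilter.cpp: the function name shouldFlush, its parameters, and the
use of >= for both comparisons are assumptions made for illustration. Only the
two default values are taken from the defines quoted above.

```cpp
#include <cstdint>

// Defaults quoted above from the patch (15 minutes, 50 messages).
constexpr std::int64_t kDefaultMinIntervalMs = 15 * 60 * 1000;
constexpr int kDefaultDirtyingThreshold = 50;

// Hypothetical helper, not the real nsBayesianFilter logic.
// Assumption: a timed write of training.dat is due only when BOTH
// the dirtying threshold and the minimum interval have been reached.
constexpr bool shouldFlush(int dirtyMessages,
                           std::int64_t msSinceLastWrite,
                           int threshold = kDefaultDirtyingThreshold,
                           std::int64_t minIntervalMs = kDefaultMinIntervalMs) {
    return dirtyMessages >= threshold && msSinceLastWrite >= minIntervalMs;
}
```

Under these assumptions, a session that trains on 10 messages never triggers a
timed write with the defaults, however long it runs, whereas with the prefs set
above (threshold 1, interval 10000 ms) a single training action is written out
after 10 seconds.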
(Reporter)

Comment 3

14 years ago
I opened bug 283493 specifically on the timed writing issue, to focus on the
save-on-exit issue here.

Comment 4

14 years ago
I just deleted my training.dat in response to Asa's suggestion on his blog. A
new training.dat was automatically created, but it doesn't seem to learn
anything (the file size stays at 72 KiB, and it's marking valid e-mails as spam,
even after I've repeatedly marked these types as Not Junk).

This is highly disturbing, because I get 200+ junk/day, and overlooking all the
erroneously marked junk "hidden" among all the real junk is highly likely and
dangerous.

I too have numerous accounts (4x IMAP, 2x POP). 

Thunderbird: version 1.0+ (20050308), winXP ---> *OS = All*

Also suggest: Dataloss 
Severity:     Major
Flags: blocking-aviary1.1?

Comment 5

14 years ago
If Seth is "not reading bugmail" and there is no "QA", will anyone with the
authority/capability even notice this (IMO important) bug?

Comment 6

14 years ago
Using a tree checked out very late 20050322 (and finished building 20050323 at
0:10) there is no updating of training.dat happening. Last update to it is over
24 hours old:

# ls -l .thunderbird/189nypdk.default/training.dat
-rw-r--r--    1 dzm      dzm      10446488 Mar 22 09:53 .thunderbird/189nypdk.default/training.dat

Comment 7

14 years ago
(In reply to comment #6)
> Using a tree checked out very late 20050322 (and finished building 20050323 at
> 0:10) there is no updating of training.dat happening. Last update to it is over
> 24 hours old:
> 
> # ls -l .thunderbird/189nypdk.default/training.dat
> -rw-r--r--    1 dzm      dzm      10446488 Mar 22 09:53 .thunderbird/189nypdk.default/training.dat

Bug 245499
Bug 283080
Bug 283493
might be related
(Assignee)

Comment 8

14 years ago
Created attachment 188609 [details] [diff] [review]
release filter plugin so it will get destroyed and write itself out

We weren't hitting the destructor for the Bayesian filter, so we weren't
writing out the training data, at least on my profile. With this change, my
training data gets written out on shutdown.
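
The failure mode can be illustrated with a small self-contained sketch. It does
not reproduce the attached patch: it uses std::shared_ptr in place of XPCOM
reference counting, and TrainingFilter, FilterOwner, releasePlugin and
filesWrittenBeforeExit are hypothetical names. It only shows why an object that
saves its data in its destructor writes nothing while some owner still holds a
reference to it.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the Bayesian filter: it "writes" its
// training data only when it is destroyed.
class TrainingFilter {
public:
    explicit TrainingFilter(std::vector<std::string>& disk) : disk_(disk) {}
    ~TrainingFilter() { disk_.push_back("training.dat"); }
private:
    std::vector<std::string>& disk_;  // models the profile directory
};

// Hypothetical owner that keeps a counted reference to the filter.
struct FilterOwner {
    std::shared_ptr<TrainingFilter> plugin;
    // Dropping the last reference runs the filter's destructor.
    void releasePlugin() { plugin.reset(); }
};

// Returns how many files exist at the point where shutdown is considered
// finished, i.e. while the owner object is still alive.
std::size_t filesWrittenBeforeExit(bool releaseOnShutdown) {
    std::vector<std::string> disk;
    FilterOwner owner{std::make_shared<TrainingFilter>(disk)};
    if (releaseOnShutdown) {
        owner.releasePlugin();
    }
    return disk.size();  // evaluated before owner goes out of scope
}
```

In this model, filesWrittenBeforeExit(false) yields 0 because the reference is
still held when the count is taken, and filesWrittenBeforeExit(true) yields 1
because the explicit release destroys the filter first.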
Assignee: sspitzer → bienvenu
Status: NEW → ASSIGNED
Attachment #188609 - Flags: superreview?(mscott)

Updated

14 years ago
Attachment #188609 - Flags: superreview?(mscott) → superreview+
(Assignee)

Updated

14 years ago
Attachment #188609 - Flags: approval-aviary1.1a2?

Updated

14 years ago
Attachment #188609 - Flags: approval-aviary1.1a2? → approval-aviary1.1a2+
(Assignee)

Updated

14 years ago
Status: ASSIGNED → RESOLVED
Last Resolved: 14 years ago
Resolution: --- → FIXED

Updated

14 years ago
Flags: blocking-aviary1.1? → blocking-aviary1.1+
Product: Core → MailNews Core