Open
Bug 181471
Opened 22 years ago
Updated 2 years ago
Enable sharing of junk Bayesian database
Categories
(MailNews Core :: Filters, enhancement)
Tracking
(Not tracked)
NEW
People
(Reporter: john.peacock, Unassigned)
References
Details
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.3a) Gecko/20021122
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.3a) Gecko/20021122

In a corporate environment, the same spam messages often come in to many users. It would be very helpful if the already trained junk filter databases could be combined and shared back out. Alternatively, a client/server model could be created in which all clients train a server-based database that can then be accessed by all.

Reproducible: Didn't try

Steps to Reproduce:
It would also be beneficial if there was a way to have IMAP users (such as myself) have the junk filter settings follow them. Project Cyrus has an ACAP system ( http://asg.web.cmu.edu/acap/ ) which might be applicable to this situation.
Comment 2•22 years ago
Sharing would be rather difficult, since what is spam for me isn't always spam for someone else. See also bug 153522.
Reporter
Comment 3•22 years ago
As I said, in a _corporate_ environment, the needs are very different from an ISP. I am also more interested in tagging than blocking, so I would foresee a combination model:

1) shared spam database is rated at a lower priority
2) user spam database is rated higher
3) user whitelist outweighs shared spam database

This way, spam seen across the shared database gets enough hits to be marked; if the user rating concurs, the spam can be deleted unread, if the user chose that. A user can manually whitelist any address to prevent tagging or deletion.
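The combination model described in this comment could be sketched roughly as follows. This is only an illustration, not an existing Mozilla API: the function name `decide`, the weights, and both thresholds are assumptions chosen to show the priorities (whitelist first, user database weighted above the shared one, tag before delete).

```python
def decide(shared_score, user_score, sender, whitelist,
           shared_weight=1.0, user_weight=2.0,
           tag_threshold=0.3, delete_threshold=0.9):
    """Return "keep", "tag" or "delete" for one message.

    shared_score and user_score are junk probabilities in [0, 1] from the
    shared and the per-user database. All weights and thresholds here are
    hypothetical values for illustration only.
    """
    # 3) The user whitelist outweighs everything else.
    if sender.lower() in whitelist:
        return "keep"

    # 1) + 2) Weighted average: the user database counts more than the shared one.
    total = shared_weight + user_weight
    score = (shared_weight * shared_score + user_weight * user_score) / total

    if score >= delete_threshold:
        return "delete"  # both databases agree strongly
    if score >= tag_threshold:
        return "tag"     # enough evidence to mark, not enough to delete
    return "keep"
```

With these example values, a message that only the shared database considers junk is tagged, and it is deleted only when the user's own database concurs.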
Comment 4•22 years ago
*** Bug 181569 has been marked as a duplicate of this bug. ***
Comment 5•22 years ago
dylang: What you're suggesting is bug #78858. It's unrelated to this bug.
Comment 6•22 years ago
Please note bug #182131.
Comment 7•22 years ago
*** Bug 187363 has been marked as a duplicate of this bug. ***
Comment 8•22 years ago
to dmose
Assignee: sspitzer → dmose
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 9•22 years ago
If the basic population of training mails is big enough and the contributing users give lots of different "junk" and "not-junk" criteria, then no false positive junk marking should occur. Of course, the global spam filter is not as sharp as the individual one, trained with familiar junk mail. But a wave of newly composed MMF or EYPN messages may be blocked faster, after only a couple of user reports to the global filter. I'd appreciate such a feature highly.
Comment 10•22 years ago
#9: What you are suggesting is very similar to the software called "SpamNet" by Cloudmark, which only works for Outlook (and I think Outlook Express support is coming soon). When I used to use Outlook I used this add-on extensively and it worked quite well. It's essentially collaborative centralized spam filtering. The only downside I can see to such a project for Mozilla Mail is that it may require a beefy server or a list of mirrors for such a database of caught spam, and that the traffic for updating it may be significant. Otherwise, I think it is a very good goal because it essentially creates a peer-reviewed blacklist of e-mails. Anyway, it's worth contemplating. This gets my vote.
Comment 11•21 years ago
This would also help greatly for people who read mail on multiple machines. None of my machines is very good at catching spam now, but I bet they'd be a lot better if they could share data. I've tried copying training.dat from one machine to another but it doesn't seem to work.
Comment 12•21 years ago
So in the meantime, what about some tool for merging several training.dat files? Users could help each other by exchanging this info. Now, everybody must train on his own...
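As a rough idea of what such a merge tool would have to do, here is a minimal sketch that combines per-token counts from several training sets. It deliberately does not read or write the real training.dat format, which is not described in this bug; the `TrainingData` class and its fields are assumptions made only for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class TrainingData:
    """Hypothetical in-memory view of one user's Bayesian training data."""
    good_messages: int = 0
    junk_messages: int = 0
    good_tokens: Counter = field(default_factory=Counter)
    junk_tokens: Counter = field(default_factory=Counter)


def merge(*corpora):
    """Return a new TrainingData whose counts are the sums of all inputs."""
    merged = TrainingData()
    for corpus in corpora:
        merged.good_messages += corpus.good_messages
        merged.junk_messages += corpus.junk_messages
        # Counter.update adds counts instead of overwriting them.
        merged.good_tokens.update(corpus.good_tokens)
        merged.junk_tokens.update(corpus.junk_tokens)
    return merged
```

Summing counts this way treats every contributor's judgement as equally valid, so it inherits the "one user's spam is another's mail" problem raised elsewhere in this bug.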
Comment 13•21 years ago
Rather difficult to do: different people have different ideas of what spam is. For me, every Chinese mail is spam, but not for my colleague in Shanghai. When he tried to use my Mozilla mailer (different account, but with my training data), he had to search for his messages in my junk folder.
Comment 14•21 years ago
*** Bug 207579 has been marked as a duplicate of this bug. ***
Comment 15•21 years ago
Many seem to consider this request "difficult to do", not technically, but from the standpoint that one man's spam is another's gold. I think that issue is irrelevant. The feature would be implemented as an option, for those of us who have concrete use cases: I guarantee that what is spam to me is definitely spam to my 9-year-old daughter, and I would like an easy way for her inbox to take advantage of my spam training.
Comment 16•20 years ago
(In reply to comment #6)
> Please note bug #182131.

I'd vote for this. A corporate environment can use a shared-writable IMAP folder to share the Bayesian data. This depends on bug 182131.
Depends on: 182131
Comment 17•20 years ago
In what way does general sharing (which does not require IMAP) depend on an IMAP-specific bug? I think you may have that dependency backward.
Comment 18•20 years ago
*** Bug 239772 has been marked as a duplicate of this bug. ***
Updated•20 years ago
Product: Browser → Seamonkey
Comment 19•19 years ago
*** Bug 291136 has been marked as a duplicate of this bug. ***
Comment 20•17 years ago
Assigning bugs that I'm not actively working on back to nobody; use SearchForThis as a search term if you want to delete all related bugmail at once.
Assignee: dmose → nobody
Updated•16 years ago
Assignee: nobody → mail
QA Contact: laurel
Updated•15 years ago
Assignee: mail → nobody
Component: MailNews: Message Display → Filters
Product: SeaMonkey → MailNews Core
QA Contact: filters
Comment 21•15 years ago
For followers of this bug, I'd like to point out the work that I am doing in bug 506397, as it is the backend work needed for this bug. Bug 506397 allows message corpus training data from an external source to be added to the Bayes training database for a local user. If wanted (and usually you would want this), the identity of the external data is kept separate from the training data that the users themselves have prepared. That way, you could update or add to the external data without messing with the local user's data, though both would be used in any classifications done by the filter. The intention of bug 506397 is to provide backend support; the final implementation of the concept would live in an extension. I will probably provide such an extension myself in the future.
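To illustrate the idea of keeping external data identifiable while still using it for classification, here is a small sketch. It is not the interface from bug 506397; the `CorpusStore` class, its method names, and the source ids are hypothetical and exist only to show the separation.

```python
from collections import Counter


class CorpusStore:
    """Hypothetical store that keeps token counts separately per source."""

    def __init__(self):
        # source id -> (good token counts, junk token counts)
        self._sources = {}

    def set_source(self, source_id, good, junk):
        """Add or replace one source without touching any other source."""
        self._sources[source_id] = (Counter(good), Counter(junk))

    def remove_source(self, source_id):
        """Drop one source, for example an outdated shared corpus."""
        self._sources.pop(source_id, None)

    def token_counts(self, token):
        """Return (good, junk) counts for a token summed over all sources."""
        good = sum(g[token] for g, _ in self._sources.values())
        junk = sum(j[token] for _, j in self._sources.values())
        return good, junk
```

A classifier would call `token_counts` and see the combined evidence, while updating or removing the shared corpus leaves the locally trained counts intact.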
Updated•2 years ago
Severity: normal → S3