Closed
Bug 492555
Opened 15 years ago
Closed 15 years ago
attachment reminder: Too common keywords, many false triggers
Categories
(Thunderbird :: Message Compose Window, defect)
Thunderbird
Message Compose Window
Tracking
(Not tracked)
RESOLVED
FIXED
Thunderbird 3.0b3
People
(Reporter: BenB, Assigned: mkmelin)
References
Details
Attachments
(1 file, 2 obsolete files)
3.59 KB,
patch
|
philor
:
review+
|
Details | Diff | Splinter Review |
See bug 244455 comment 80 ff and comment 41 ff.
It currently triggers on:
attachment
attached
doc
pdf
resume
cover letter
The list should not contain "attached" and "resume", as these
are normal language words ("I resumed the computer", "I was so attached to this
girl..." - TB: "Did you forget to add an attachment?"). Also, you can only
trigger on ".pdf", if you have code to not trigger on URLs, as .pdf very often
appears in URLs, and we do *not* want to encourage people to attach PDFs which
can just as well be linked, rather the opposite.
Even then, the keyword should be "\.doc", not ".doc", as this is a regexp and "." is a wildcard.
Keywords should, for now, only contain:
"attachment" and "cover letter"
After URLs are not checked, it should only contain:
attachment,\.doc,\.pdf
-mail.compose.attachment_reminder_keywords=attachment,attached,.doc,.pdf,resume,cover letter
+mail.compose.attachment_reminder_keywords=attachment,cover letter
Flags: blocking-thunderbird3?
Comment 1•15 years ago
|
||
I'm not a native speaker, but "please find attached ..." is definitely one of the common phrases, at least in business correspondence.
Others I stumbled upon are
"attached please find/see"
"please find enclosed"
"I'm enclosing"
So I'd leave "attached" in, and add "enclosed" and "enclosing".
@Ben: You used as an example of false trigger, "I resumed the computer".
Does this then mean that, in the regexp, there is no checking to make sure that the characters preceding and following the keyword are non-word character? That is: (^|[^\w])keyword(?!\w) (or something to that effect). This would be a problem as not checking would trigger on sections of long or compound words.
And then there is the issue of this triggering if someone is sending from a saved draft of a reply having original correspondences that contain the keyword.
That said, it would also be nice to have an option with the dialog to check "Do not use this feature", in case there a user is exacerbated by too many false positives.
Reporter | ||
Comment 3•15 years ago
|
||
Steffen: Yes, but to me, the number (or percentage) of false positive is what matters, not the positives. And if a word is also often used in ways that do not referly to attachments, that means it should not trigger, as that would be
* annoying - MS paperclip effect
* diminish the useful of the feature itself.
* Worst, it causes dialogbox-weardown and -ignore, it's a well-known effect that users then ignore *all* dialog boxes, which is harmful for security. Such a feature here can well destroy the efforts the security team puts in reducing useless dialog boxen and improving security warnings.
Reporter | ||
Comment 4•15 years ago
|
||
> it would also be nice to have an option with the dialog to check
> "Do not use this feature"
Filed bug 492573 about that.
(In reply to comment #3)
> Steffen: [...]
Not Steffen. :-) I usually add myself only to the voter list; that way I can add or remove myself from a bug without generating bug spam--especially on large lists.
Nonetheless, what do you think of using a regexp like: (^|[^\w])keyword(?!\w)
That would more strictly match a word and significantly reduce false positives.
So, for instance, if you have a keyword like ".doc", and people are communicating about code, you would avoid triggering on something like: xmlDoc.documentElement
You would also avoid, if a keyword was "resume", from triggering on: "resumed", "presumed", etc.
In my above regexp, "keyword" assumes that if you have a keyword like ".doc", the "." is already properly escaped somewhere else in code.
Assignee | ||
Comment 6•15 years ago
|
||
Yeah it's probably a good idea include word boundaries in the regexp, and the make the code escape dots before using dot-words in the regexp.
Though of course we can never be 100% right...
Assignee | ||
Comment 7•15 years ago
|
||
Chinmay: is that something you can take on?
Flags: blocking-thunderbird3? → blocking-thunderbird3+
Target Milestone: --- → Thunderbird 3.0b3
Reporter | ||
Comment 8•15 years ago
|
||
This - for now - removes the keywords that cause too many false positives.
Once we have code to ignore URLs, we can re-add "\.doc" and "\.pdf", but for now, it's triggering too often (see comment 0).
Attachment #377021 -
Flags: review?(mkmelin+mozilla)
Comment 9•15 years ago
|
||
Magnus: Yes, I will handle it.
I will add word boundaries in the regexp, and make the code escape dots before using dot-word in the regexp.
As I am busy with some other stuffs, I will be back with the solution in just 2 days.
Reporter | ||
Comment 10•15 years ago
|
||
This keeps "attached" in the list
Attachment #377030 -
Flags: review?(mkmelin+mozilla)
Assignee | ||
Comment 11•15 years ago
|
||
(In reply to comment #8)
> Once we have code to ignore URLs, we can re-add "\.doc" and "\.pdf", but for
> now, it's triggering too often (see comment 0).
The word boundaries (+ dot escaping) will fix that, no?
Assignee | ||
Comment 12•15 years ago
|
||
Hm, though the extensions won't be terribly usable then i suppose.
Reporter | ||
Comment 13•15 years ago
|
||
> > ignore URLs, we can re-add "\.doc" and "\.pdf"
> The word boundaries (+ dot escaping) will fix that, no?
If you only want to trigger on "Please see the .doc in this mail", yes.
If you want to trigger on "In planning.doc, you can see", but not on
"http://foo/planning.pdf", then no, word boundaries don't help, you'll
need to recognize URLs.
Assignee | ||
Comment 14•15 years ago
|
||
I guess it shouldn't be too hard to filter out words starting with http: https:
Assignee | ||
Comment 15•15 years ago
|
||
Comment on attachment 377021 [details] [diff] [review]
Fix, step 1, v1
We'll have to bump the entity name too... I do thing "attached" should be in there. Resume should proably be résumé or get removed. Anyway, since it's a pain to bump, lets hold on until we know we have word boundary matching.
Attachment #377021 -
Flags: review?(mkmelin+mozilla) → review-
Assignee | ||
Updated•15 years ago
|
Attachment #377030 -
Attachment is obsolete: true
Attachment #377030 -
Flags: review?(mkmelin+mozilla)
Comment 16•15 years ago
|
||
Re-assigning to dmose so we can move this forward. Chinmay, I'm assuming you've gotten busy with work over the summer. Let me know if you are in fact working on this still.
Assignee: chinmay_patel → dmose
Status: NEW → ASSIGNED
Comment 17•15 years ago
|
||
David:
My laptop has broken and I have sent it to HP to fix it :( .
I use Toronto Public Library PCs for internet access. Therefore, I am sorry to say that I won't be able to continue with this bug. Certainly, I am willing to work on other projects once I get my laptop back.
Once again, I am really sorry to leave the bug unresolved.
Comment 19•15 years ago
|
||
This might help.
This code will set word boundaries and escape dotes from the keyword. Moreover, dot at the beginning of the keyword is most of the time an extension. Hence, I am taking out the starting word boundaries for such keywords.
for (let i = 0; i < keywordsArray.length && !keywordFound; i++)
{
var temp = keywordsArray[i].replace(/\./g, "\\\.");
if (temp.match("^\\\\\."))
temp = temp + "\\b";
else
temp = " \b" + temp + "\\b";
var re = new RegExp(temp, "i");
keywordFound = re.test(mailData);
}
Hopefully, this will help.
(Note: I could't test this code for all possible cases.)
Comment 20•15 years ago
|
||
this is too smart. I created an email in the German language and used "doch" which also triggered this sheet. I have deactivated this *smart* feature because it is not useful for me. Please keep in mind keywords vs. locales.
Assignee | ||
Comment 21•15 years ago
|
||
Require word boundaries, ignore links, tweak the keywords a bit (seem we can't bump the l10n value after all).
Some testing i did:
o .doc test http://example.com/test.doc -> no match
o doc test http://example.com/test.doc now -> no match
o .doc lieber doch nicht -> no match
o doc lieber doch nicht -> no match
o .pdf see this .pdf -> MATCH
o .pdf ...have a look at this .pdf? -> MATCH
o .pdf see the file testcase.pdf -> MATCH
o attachment can you have a look at this attachment? -> MATCH
Word boundary does mean stuff like /doc will also match, which sounds ok to me.
Attachment #377021 -
Attachment is obsolete: true
Attachment #382356 -
Flags: review?(bugzilla)
Assignee | ||
Updated•15 years ago
|
Status: NEW → ASSIGNED
Comment 22•15 years ago
|
||
Comment on attachment 382356 [details] [diff] [review]
proposed fix
This looks reasonable at a quick glance, but I'm going to pass to Phil as I think he'll be better at reviewing this.
Attachment #382356 -
Flags: review?(bugzilla) → review?(philringnalda)
Updated•15 years ago
|
Attachment #382356 -
Flags: review?(philringnalda) → review+
Comment 24•15 years ago
|
||
Comment on attachment 382356 [details] [diff] [review]
proposed fix
Where by "better" I think he meant "way slower." Looks fine, let's see if it's enough.
Assignee | ||
Comment 25•15 years ago
|
||
changeset: 2879:7674cfc3f3a4
http://hg.mozilla.org/comm-central/rev/7674cfc3f3a4
->FIXED
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Summary: attachment warning: Too common keywords, many false triggers → attachment reminder: Too common keywords, many false triggers
Comment 26•15 years ago
|
||
Devs,
You're not going to come up with a list which pleases everyone; some of us (who forget attachments a lot) want a very broad list. Other people who don't forget attachments find this feature very annoying ... and a lot of us use different words to indicate attachments.
The only real answers is to somehow make the word list user-editable. If that's too hard, then just at least give this feature an on|off preference.
Comment 27•15 years ago
|
||
Josh: Have a look into the Composition preference settings, General tab.
Both options (on/off checkbox, list of keywords) are available there.
Comment 28•15 years ago
|
||
RSX,
Not on Mac they're not.
Assignee | ||
Comment 29•15 years ago
|
||
Josh: are you using the latest nightly? (This wasn't available for tb3beta2)
You need to log in
before you can comment on or make changes to this bug.
Description
•