Words that should or need to be added to the dictionary

VERIFIED FIXED in mozilla1.8.1beta2

Status

Core
Spelling checker
VERIFIED FIXED
11 years ago
10 years ago

People

(Reporter: u88484, Assigned: Brett Wilson)

Tracking

({verified1.8.1})

1.8 Branch
mozilla1.8.1beta2
verified1.8.1
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 2 obsolete attachments)

(Reporter)

Description

11 years ago
Tracking bug for all words that need to be added to the dictionary.
(Reporter)

Updated

11 years ago
Summary: [meta] Words that should/need to be added to the dictionary → [meta] Words that should or need to be added to the dictionary
(Reporter)

Updated

11 years ago
Depends on: 308744
(Assignee)

Updated

11 years ago
Whiteboard: swag: 1d
(Assignee)

Updated

11 years ago
Depends on: 339899

Updated

11 years ago
No longer depends on: 339899

Updated

11 years ago
Depends on: 340176

Comment 1

11 years ago
"proven" doesn't seem to be in the dictionary
(Assignee)

Comment 2

11 years ago
+ add "online"

Comment 3

11 years ago
Should words be added to this bug, or should a new bug be filed and added here?
(Assignee)

Comment 4

11 years ago
Add words to this bug. I'm closing all the other bugs and listing them here:

Linux distributions:
+ debian
+ ubuntu
+ slackware
+ gentoo
Rationale: Nerds will complain if we don't add them :) and they aren't similar to real words that would confuse "normal" people. I explicitly did NOT add "suse" because it is very likely normal people will type "sues" "sue" "use", etc. and we don't want to confuse them.

Internet:
+ online (as above)
+ unsubscribe
+ blog
+ blogger
+ blogging
+ blogged
+ blogs
+ podcast
+ internet (currently it has only the capitalized version)

Random:
+ caffeinated
+ gauge

Mozilla:
+ Mozilla
+ Thunderbird
+ Firefox
+ Sunbird
+ Seamonkey

Companies and product names:
+ Google
+ eBay
+ PayPal
+ PowerPoint
Company names are a very slippery slope as there is no place to stop. These are the company names that I think are very likely to appear in web forms among the general internet population. "Microsoft" is already in the dictionary, and "Yahoo" is a real word. We can consider adding a FEW other names, but not many. If this is going to be contentious, I'd prefer adding none.
(Assignee)

Comment 5

11 years ago
*** Bug 223322 has been marked as a duplicate of this bug. ***
(Assignee)

Comment 6

11 years ago
*** Bug 259916 has been marked as a duplicate of this bug. ***
(Assignee)

Comment 7

11 years ago
*** Bug 236757 has been marked as a duplicate of this bug. ***
(Assignee)

Comment 8

11 years ago
*** Bug 214519 has been marked as a duplicate of this bug. ***
No longer depends on: 214519
(Assignee)

Comment 9

11 years ago
*** Bug 308744 has been marked as a duplicate of this bug. ***
No longer depends on: 308744
(Assignee)

Comment 10

11 years ago
*** Bug 340176 has been marked as a duplicate of this bug. ***
No longer depends on: 340176
(Assignee)

Updated

11 years ago
No longer depends on: 223322, 236757, 259916
(Assignee)

Updated

11 years ago
Summary: [meta] Words that should or need to be added to the dictionary → Words that should or need to be added to the dictionary
(Assignee)

Comment 11

11 years ago
+ webcast

Comment 12

11 years ago
That would be SeaMonkey, with capital M, no?
And the linux distros, shouldn't they also be capitalized (or maybe both?). And why not add SUSE in all caps?
(Assignee)

Comment 13

11 years ago
- Seamonkey above
+ SeaMonkey
+ JavaScript
+ inline
+ http
+ ftp
+ https
Target Milestone: --- → mozilla1.8.1beta2

Comment 14

11 years ago
TCP is in the dictionary, but IP and UDP are not
(Assignee)

Updated

11 years ago
Assignee: mscott → brettw

Comment 15

11 years ago
By "the" dictionary, are we talking:
- English (Australia)?
- English (Canada)?
- English (United Kingdom)?
- English (US)?
- English (New Zealand)?
- something else entirely?

We should look at them all when deciding which words to add. OK, I'd expect a lot of these computing neologisms to be more or less dialect-neutral, but there might be a few that aren't. For example, what about "online" vs. "on-line" and "inline" vs. "in-line"? And is "caffeinated" (which strikes me as possibly a back-formation) established in most English dialects?

Comment 16

11 years ago
The dictionary allows (and suggests) some invalid words, such as "yous" and "thats".  Should I start a new bug about this?

Comment 17

11 years ago
+ focussed
(Assignee)

Comment 18

11 years ago
I think focussed must be British spelling or something. I've never seen it and it looks very strange to me. MS Word marks it as misspelled. I wonder if there is a separate en-GB dictionary?

Comment 19

11 years ago
I'm not sure, but both Answers.com ("v., -cused or -cussed") and Dictionary.com ("v. fo·cused, or fo·cussed") list it, and neither suggests that it's a British variant.

Comment 20

11 years ago
(In reply to comment #18)
> I think focussed must be British spelling or something. I've never seen it and
> it looks very strange to me.

A quick look through some BrE dictionaries shows that both spellings are valid.

> MS Word marks it as misspelled.

Regardless of which dictionary you use?  Here (MS Office X), only the US dictionary rejects "focussed" - for UK and Australia it accepts both "focused" and "focussed".

> I wonder if there is a separate en-GB dictionary?

Separate - from what?
en-GB dictionary - of course there is.  That's exactly what I was talking about in comment 15, as you could have seen for yourself by selecting "Download More" from the spellchecker UI.

Comment 21

11 years ago
*** Bug 344372 has been marked as a duplicate of this bug. ***

Comment 22

11 years ago
+ proven

Comment 23

11 years ago
D'oh -- someone has already said "proven".  Apologies.
(Assignee)

Comment 24

11 years ago
Created attachment 229555 [details] [diff] [review]
Words described above

I added all the words given above, focussed, and SUSE. SUSE looks OK because I was unable to get it to suggest SUSE for a misspelling of a real word.
Attachment #229555 - Flags: review?(mscott)
Attachment #229555 - Flags: approval1.8.1?
(Assignee)

Comment 25

11 years ago
The patch above is for en-US only (the one that is checked into the tree). We should apply this to the other English dictionaries. I think all the words apply to other variants, but I don't know where those live.
Whiteboard: swag: 1d → has patch
(Assignee)

Updated

11 years ago
Version: Trunk → 1.8 Branch

Comment 26

11 years ago
Comment on attachment 229555 [details] [diff] [review]
Words described above

If we add these words here, all of these words are going to get clobbered the next time we update the dictionary. This file gets copied over from OpenOffice.org every time we update the spell checker. At least that's the idea anyway :)
(Reporter)

Comment 27

11 years ago
(In reply to comment #26)
> (From update of attachment 229555 [details] [diff] [review] [edit])
> If we add these words here, all of these words are going to get clobbered the
> next time we update the dictionary. This file gets copied over from open
> office.org every time we update the spell checker. At least that's the idea
> anyway :)
> 

Heh, mid-air collision with basically the same comment. Some dictionaries on openoffice.org haven't been updated in years though, so I would think that the clobber wouldn't happen too often, but it's still a pain to make sure it doesn't before each release.

Via http://lingucomponent.openoffice.org/spell_dic.html

English (United States) en_US 2004-06-23
(Assignee)

Comment 28

11 years ago
(In reply to comment #26)
> (From update of attachment 229555 [details] [diff] [review] [edit])
> If we add these words here, all of these words are going to get clobbered the
> next time we update the dictionary. This file gets copied over from open
> office.org every time we update the spell checker.

I thought OO wasn't updating MySpell anymore and had switched to Hunspell.

We've never updated this file before.

Even if you are right, what would you suggest doing? We really need some words added to the dictionary, and I wouldn't expect MySpell/HunSpell to add all of them (for example, Mozilla, SeaMonkey, etc). I don't see any way around it.

We could check the patch in the directory like we do with sqlite if you are really worried.

Comment 29

11 years ago
Comment on attachment 229555 [details] [diff] [review]
Words described above

Clearing approval request until reviewed
Attachment #229555 - Flags: approval1.8.1?

Comment 30

11 years ago
Don't know if it is too late, but "spam" is also missing
(Assignee)

Comment 31

11 years ago
I'll do a new patch with
+ spam
+ cafe
+ webmaster

I'm less sure about:
+ phishing (and variants)

Comment 32

11 years ago
FWIW, it seems that "proven" is missing from the current patch.

Comment 33

11 years ago
Phishing protection is one of the new features of Firefox, so why not add it.

Also saw that "spammer" and "spammers" are missing.
(Reporter)

Comment 34

11 years ago
+ uninstall/uninstalling 
(Assignee)

Comment 35

11 years ago
Created attachment 230586 [details] [diff] [review]
A few more
Attachment #229555 - Attachment is obsolete: true
Attachment #230586 - Flags: review?(mscott)
Attachment #229555 - Flags: review?(mscott)
(Assignee)

Comment 36

11 years ago
Some of these words should probably be added to the source dictionary, although I'm not sure if it's still being maintained.

In any case, many other of these words should not be added (like "Mozilla") but we need them in our dictionary. How about if I check this patch in to the dictionary directory so we can apply it if we ever update the dictionary? We do this successfully with sqlite.

Comment 37

11 years ago
viewport
preload
preloading
JavaScript
CSS
XHTML

Comment 38

11 years ago
Do we really want to add every tech word we can come up with? If so, do we also want to add words from other jargons? If so, where do we stop?

Comment 39

11 years ago
(In reply to comment #35)
> Created an attachment (id=230586) [edit]
> A few more

Why -HTTP?
(Assignee)

Comment 40

11 years ago
The criteria are words that are likely to be typed into webmail or other forms by a large number of our target audience. We're not adding HTML tag names, and I would also argue against "viewport" for this reason. But JavaScript is OK, as well as some words like "unsubscribe" that are often found in email.

The Linux names are a bit silly and maybe we shouldn't add them, but I checked to make sure they won't get suggested by other words, so it doesn't really hurt.
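
A minimal sketch of the kind of check described above: whether a proposed word sits one edit away from an everyday word, and so might be offered as a suggestion for a typo of it. This is only an approximation; the real spell checker builds suggestions in its own way, and the `COMMON` list and both function names are hypothetical.

```python
import string


def one_edit_variants(word, alphabet=string.ascii_lowercase):
    """Return every string exactly one edit away from `word`.

    Edits considered: delete a letter, swap two adjacent letters,
    replace a letter, insert a letter.
    """
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    swaps = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    replaces = {a + c + b[1:] for a, b in splits if b for c in alphabet}
    inserts = {a + c + b for a, b in splits for c in alphabet}
    return (deletes | swaps | replaces | inserts) - {word}


def confusable_with(candidate, common_words):
    """List the common words that are one edit away from `candidate`."""
    variants = one_edit_variants(candidate.lower())
    return sorted(w for w in common_words if w.lower() in variants)


# Hypothetical sample of everyday words, for illustration only.
COMMON = {"sue", "sues", "use", "used", "gent"}

print(confusable_with("suse", COMMON))    # ['sue', 'sues', 'use']
print(confusable_with("gentoo", COMMON))  # []
```

With this sample list, "suse" collides with "sue", "sues" and "use", which matches the concern raised in comment 4, while "gentoo" collides with nothing.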
(Assignee)

Comment 41

11 years ago
Because I added "http" which is not case sensitive. If we don't get URL identification working, this will help in URLs.
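
The case behaviour mentioned here can be sketched roughly as follows. This is a simplified model inferred from this thread (a lowercase entry such as "http" also covers "HTTP", while a capitalized entry such as "Internet" does not cover "internet"); it is an assumption, not the spell checker's actual code, and `accepts` is a hypothetical name.

```python
def accepts(entry, typed):
    """Simplified model: does dictionary `entry` accept the `typed` form?"""
    if typed == entry:
        return True
    # Assumed: any entry is also accepted when typed in ALL CAPS.
    if typed == entry.upper():
        return True
    # Assumed: an all-lowercase entry is also accepted with an initial capital.
    if entry.islower() and typed == entry.capitalize():
        return True
    return False


print(accepts("http", "HTTP"))          # True: lowercase entry covers all caps
print(accepts("internet", "Internet"))  # True: lowercase entry covers initial cap
print(accepts("Internet", "internet"))  # False: capitalized entry is stricter
```

Under this model a separate "HTTP" entry adds nothing once "http" is present, which is the reason given for dropping it.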

Comment 42

11 years ago
(In reply to comment #28)
> We've never updated this file before.

I've updated the en-US dictionary files several times over the years. But they have moved into the locale-specific directory without preserving CVS history, which is why you can't see that from the log.

> We could check the patch in the directory like with do with sqlite if you are
> really worried.

I was just going to suggest adding a README comment, like we do for the myspell changes, that can sit alongside the dictionary, but your suggestion of checking in the actual patch alongside the dictionary is probably an even better idea. I like it.

(Assignee)

Comment 43

11 years ago
Created attachment 231249 [details] [diff] [review]
Patch with patch in it

This implements the requirements of the previous discussion, checking in the patch alongside the dictionary. I also added a small readme.

There's no reason that we need to let this bake on trunk before checking into branch, so I'm requesting approval.
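
As an illustration of re-applying local additions after an upstream dictionary update, here is a minimal sketch. It swaps in a plain word-list merge rather than the checked-in diff this comment describes, and it assumes the MySpell-style `.dic` layout: an entry count on the first line, then one word per line with optional `/FLAGS`. The function name and sample data are hypothetical.

```python
def merge_additions(dic_text, additions):
    """Append words missing from a .dic text and refresh the count line."""
    lines = dic_text.splitlines()
    entries = lines[1:]  # the first line holds the entry count
    # Compare on the bare word, ignoring any "/FLAGS" suffix.
    known = {entry.split("/", 1)[0] for entry in entries}
    for word in additions:
        bare = word.split("/", 1)[0]
        if bare not in known:
            entries.append(word)
            known.add(bare)
    return "\n".join([str(len(entries))] + entries) + "\n"


# Hypothetical upstream file and local additions, for illustration only.
upstream = "3\nhello\nonline\nworld/S\n"
local = ["Mozilla", "SeaMonkey", "online", "blog/S"]

print(merge_additions(upstream, local), end="")
```

Words that upstream has since added (here "online") are skipped, so the merge can be run again after every update without creating duplicates.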
Attachment #230586 - Attachment is obsolete: true
Attachment #231249 - Flags: review?(mscott)
Attachment #231249 - Flags: approval1.8.1?
Attachment #230586 - Flags: review?(mscott)
(Assignee)

Updated

11 years ago
Whiteboard: has patch → [needs review]

Comment 44

11 years ago
Comment on attachment 231249 [details] [diff] [review]
Patch with patch in it

Thanks for adding the readme and the patch files, Brett.
Attachment #231249 - Flags: review?(mscott) → review+
(Assignee)

Updated

11 years ago
Attachment #231249 - Attachment description: Patch with path in it → Patch with patch in it
(Assignee)

Updated

11 years ago
Whiteboard: [needs review] → [needs approval]
(Assignee)

Comment 45

11 years ago
Fixed on trunk, leaving open for branch checkin.

Comment 46

11 years ago
Comment on attachment 231249 [details] [diff] [review]
Patch with patch in it

a=drivers, please land this on the MOZILLA_1_8_BRANCH.
Attachment #231249 - Flags: approval1.8.1? → approval1.8.1+
(Assignee)

Comment 47

11 years ago
Fixed on branch.
Status: NEW → RESOLVED
Last Resolved: 11 years ago
Keywords: fixed1.8.1
Resolution: --- → FIXED
Whiteboard: [needs approval]

Comment 48

11 years ago
Not sure if this is the place to complain about more missing words (if not, please let me know where the right place is).

The dictionary is missing the word "programmatically", although my dictionary tells me it's a real word. (http://www.answers.com/main/ntquery?s=programmatically)
(Assignee)

Comment 49

11 years ago
(In reply to comment #48)
> Not sure if this is the place to complain about more missing words (if not,
> please let me know where the right place is).

No. Perhaps you should file a new bug for keeping track of new words? Adding words is a never-ending task and I'm *so* done worrying about this, so don't CC me on it :)

Comment 50

11 years ago
+ toolbar   ?

too late?
(Assignee)

Comment 51

11 years ago
(In reply to comment #50)
> + toolbar   ?
> 
> too late?

Yes.

Comment 52

11 years ago
(In reply to comment #51)
> (In reply to comment #50)
> > + toolbar   ?
> > 
> > too late?
> 
> Yes.
> 

How are we going to maintain the dictionary in future? Are we forking from ooo, or are we going to track their releases and then apply the patch with mozilla/tech-related words every time?

Should we be suggesting our words to ooo perhaps, and jointly maintain it?
(Assignee)

Comment 53

11 years ago
There is a bug on replacing MySpell with HunSpell. Hopefully we'll use that in Firefox 3 because it should be much better, so maybe worrying about OO is a waste of time. My goal for this release was to get the major words that were important (e.g. "Mozilla") and a few other random things that would be useful.

We don't have a policy right now. For strategy moving forward, I have no idea, and hopefully I won't be the person in charge of doing this forever.
(Reporter)

Updated

11 years ago
Status: RESOLVED → VERIFIED
Keywords: verified1.8.1
Keywords: fixed1.8.1

Comment 54

11 years ago
The word gauge is misspelled in the spell checker/dictionary.  Gage is *not* a correct spelling, rather it was introduced into English by Toyota Motor Co. who misspelled the things on the dashboard.  Whether it shows up now in online dictionaries is not a good gauge of its correctness.

Regarding some of the other comments, focussed is correct, so is targetting, and a whole host of other words that have been re-written since MS Word decided to weigh in on spelling without knowing how to spell properly.  The rules are that if a vowel is short and precedes the last letter of a word, and leaving the last consonant single would normally be pronounced as a long vowel, you add a repeat of the last letter.  There are a lot of other rules that get ignored sometimes too, like adding a 'k' after a 'c' before 'ing' (ci is pronounced like in cigar), as in mimicking or mimicked.  Retaining an 'e' to avoid the wrong pronunciation of 'g' is also correct, as in manageable.  This business about "British variants" is malarky unless you are talking about colour or humour.
(Assignee)

Comment 55

11 years ago
(In reply to comment #54)
> The word gauge is misspelled in the spell checker/dictionary.

The job of the spellchecking dictionary is not to implement somebody's idea of correctness. "Gage" is used quite commonly on the web, with 10s of millions of hits on Google, including many gauge-making companies. My American car also uses "gage".

Updated

10 years ago
Duplicate of this bug: 383970

Comment 56

10 years ago
(In reply to comment #25)
> The patch above is for en-US only (the one that is checked into the tree). We
> should apply this to the other English dictionaries. I think all the words
> apply to other variants, but I don't know where those live.

We might want to reopen this since this never happened.