Closed Bug 460351 Opened 16 years ago Closed 14 years ago

random nouns with apostrophe+s 's on the end in en-US.dict

Categories

(Core :: Spelling checker, defect)

All
Windows XP
defect
Not set
minor

RESOLVED FIXED

People

(Reporter: info, Unassigned)

Details

(Whiteboard: [fixed by bug 479334])

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b2pre) Gecko/20081016 Minefield/3.1b2pre
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b2pre) Gecko/20081016 Minefield/3.1b2pre ID:20081016033525

The en-US spelling dictionary includes English contractions like "would've". But it also includes a few hundred nouns with 's on the end for no reason I can discern.

Reproducible: Always

Steps to Reproduce:
1. In your Firefox nightly, open dictionaries/en-US.dic
2. Search for 's (single quote or apostrophe)

Actual Results:
Seemingly at random the dictionary includes word variants with 's on the end, e.g.
man's but not woman's
zip's and zone's but not strip's and bone's
strain's but not train's

Expected Results:
I believe many of these could be eliminated with no effect on spell checking behavior. The apostrophe doesn't even seem necessary to get spell checking to suggest the "'s" variant as a correction -- paste the following into a textarea and right-click each word.

manns womanns zipps zonees stripps bonees strainss trainss

http://hunspell.sourceforge.net/ points to OOo for dictionaries, http://ftp.services.openoffice.org/pub/OpenOffice.org/contrib/dictionaries/en_US.zip has a different but seemingly equally random set of 's entries.
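For anyone who wants to repeat step 2 without a text editor, here is a minimal sketch of the same search. It assumes the usual Hunspell .dic layout (an entry count on the first line, then one `word` or `word/FLAGS` entry per line). The function name and the sample entries are made up for illustration and are not taken from the shipped en-US.dic.

```python
def possessive_entries(dic_lines):
    """Return the words in a Hunspell-style .dic listing that end in 's."""
    found = []
    for line in dic_lines:
        entry = line.strip()
        # Skip blank lines and the leading entry-count line.
        if not entry or entry.isdigit():
            continue
        # Drop any affix flags after the slash, e.g. "strip/S" -> "strip".
        word = entry.split("/", 1)[0]
        if word.endswith("'s"):
            found.append(word)
    return found


# Hypothetical sample, not real dictionary contents.
sample = ["7", "man's", "woman", "zip's", "zone's", "strip/S", "would've"]
print(possessive_entries(sample))
```

To run it against a real file, pass the open file object instead of `sample`; the encoding of the file is something to check first, since it is declared in the matching .aff file.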
Interesting. Most of these are examples of someone -- or rather some autogeneration program -- getting too clever with prefixes and suffixes. For example, inactivity's should be allowed but *inactivities shouldn't...but the prefix /I is used for activity allowing all forms of inactivity so you need to add a special case for activity's.

So fixing these by hand would just introduce weird errors into the dictionary --> INVALID (or fix upstream)

BUT, there were some definite bugs lurking here. For example bullshitter's and cocksucker's weren't marked as taboo (!), so Firefox was suggesting them. Thanks, Firefox! I'll fix these as part of Bug 479334.
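As a rough illustration of the prefix/suffix interaction described in this comment, the sketch below expands a single flagged entry into every form it would allow. The rules are toy assumptions: flag I adds the prefix "in", flag M adds 's, and flag S adds a simplified plural. They are loosely modelled on Hunspell affix flags and are not copied from the real en-US.aff.

```python
# Toy rules for illustration only; not the real en-US.aff.
PREFIXES = {"I": "in"}


def plural(word):
    """Simplified plural: consonant + y -> ies, otherwise add s."""
    if len(word) > 1 and word.endswith("y") and word[-2] not in "aeiou":
        return word[:-1] + "ies"
    return word + "s"


SUFFIXES = {"M": lambda w: w + "'s", "S": plural}


def expand(entry):
    """List every form that a 'word/FLAGS' entry permits under the toy rules."""
    word, _, flags = entry.partition("/")
    # The bare word plus each prefixed variant.
    stems = [word] + [PREFIXES[f] + word for f in flags if f in PREFIXES]
    forms = list(stems)
    # Every suffix flag applies to every stem (the cross product).
    for f in flags:
        if f in SUFFIXES:
            forms.extend(SUFFIXES[f](s) for s in stems)
    return forms


print(expand("activity/IMS"))
```

Under these assumed rules the single entry allows inactivities along with inactivity's, because each suffix is applied to the prefixed stem as well as the bare one.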
Status: UNCONFIRMED → NEW
Depends on: 479334
Ever confirmed: true
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Whiteboard: [fixed by bug 479334]
Just FYI: I'm the upstream maintainer and I can tell you for a fact that this has nothing to do with "getting too clever with prefixes and suffixes". I do not work with words in affix-compressed form; rather, I affix-compress the final list to create one suitable for hunspell.

What it does have to do with is the fact that I do not have any reliable source of information on which words take possessive forms, so I have to make educated guesses. Basically I apply a possessive form to any word which could possibly be a noun, with limited exceptions. Unfortunately this results in many words getting a possessive form which shouldn't. I hope to eventually be more precise.
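The two-step process described here (guess possessives on the full word list, then affix-compress the result) could look roughly like the sketch below. This is not the maintainer's actual tooling: the function names, the `maybe_nouns` set, the exception list, and the single flag M are all hypothetical stand-ins.

```python
def add_possessives(words, maybe_nouns, exceptions=frozenset()):
    """Add word + 's for every word that might be a noun, minus exceptions."""
    expanded = []
    for word in words:
        expanded.append(word)
        if word in maybe_nouns and word not in exceptions:
            expanded.append(word + "'s")
    return expanded


def compress(expanded, flag="M"):
    """Fold each word/word's pair into one flagged entry (hypothetical flag)."""
    present = set(expanded)
    entries = []
    for word in expanded:
        if word.endswith("'s") and word[:-2] in present:
            continue  # already covered by the flag on the base word
        entries.append(f"{word}/{flag}" if word + "'s" in present else word)
    return entries


# Hypothetical inputs; a real run would need a part-of-speech source.
words = ["man", "strain", "zip", "quickly"]
maybe_nouns = {"man", "strain", "zip"}
full = add_possessives(words, maybe_nouns, exceptions={"zip"})
print(full)
print(compress(full))
```

With these inputs the expanded list gains man's and strain's but not zip's, and compression yields man/M, strain/M, zip, quickly. The quality of the result depends entirely on how good the "could be a noun" guess is, which is the limitation the comment describes.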