Closed
Bug 1490541
Opened 6 years ago
Closed 6 years ago
add words to en-US.dic
Categories
(Core :: Spelling checker, enhancement)
Tracking
RESOLVED
FIXED
mozilla64
| | Tracking | Status |
|---|---|---|
| firefox64 | --- | fixed |
People
(Reporter: ananuti, Assigned: ananuti)
References
Details
User Story
merchantability - The condition, state, or quality of being merchantable; saleability. (Chiefly in legal contexts.)
https://en.oxforddictionaries.com/definition/us/merchantability

salability - capable of being or fit to be sold
https://en.oxforddictionaries.com/definition/us/salability

sucky - very bad or unpleasant ‘What a sucky, sucky way to end a sucky, sucky day.’
https://en.oxforddictionaries.com/definition/us/sucky

====
From: https://www.merriam-webster.com/words-at-play/new-words-in-the-dictionary-september-2018

Latinx - (Capitalized) - a gender-neutral alternative to Latino or Latina
https://www.merriam-webster.com/dictionary/Latinx
https://en.oxforddictionaries.com/definition/latinx

adorbs - extremely charming or appealing; adorable
https://www.merriam-webster.com/dictionary/adorbs
https://en.oxforddictionaries.com/definition/us/adorbs

avo - an avocado
https://www.merriam-webster.com/dictionary/avo

bingeable - something you can binge
https://www.merriam-webster.com/dictionary/bingeable

biohacking, biohacker - biological experimentation (as by gene editing or the use of drugs or implants) done to improve the qualities or capabilities of living organisms especially by individuals and groups outside of a traditional medical or scientific research environment
https://www.merriam-webster.com/dictionary/biohacking
https://en.oxforddictionaries.com/definition/us/biohacking

bougie - bourgeois
https://www.merriam-webster.com/dictionary/bougie
https://en.oxforddictionaries.com/definition/us/bougie

fav - synonym for fave (a verb and noun)
https://www.merriam-webster.com/dictionary/fav
https://en.oxforddictionaries.com/definition/us/fave

fintech - Financial technology (plural fintechs - can't use magic "S" 🎯)
https://www.merriam-webster.com/dictionary/fintech
https://en.oxforddictionaries.com/definition/us/fintech

gochujang - Korean chili paste ‘lamb cutlets with gochujang, pickled cucumber, and carrot’
https://www.merriam-webster.com/dictionary/gochujang
https://en.oxforddictionaries.com/definition/gochujang

guac - short for guacamole ‘we got chips, salsa, and guac’
https://www.merriam-webster.com/dictionary/guac
https://en.oxforddictionaries.com/definition/guac

hangry, hangrier, hangriest - angry from hunger
https://www.merriam-webster.com/dictionary/hangry
https://en.oxforddictionaries.com/definition/hangry

hophead - one who likes beer
https://www.merriam-webster.com/dictionary/hophead
https://en.oxforddictionaries.com/definition/hophead

iftar - the meal taken by Muslims at sundown
https://www.merriam-webster.com/dictionary/iftar
https://en.oxforddictionaries.com/definition/iftar

mise - from mise en place
https://www.merriam-webster.com/dictionary/mise%20en%20place

mise - the issue in a legal proceeding upon a writ of right; also : the writ itself
https://www.merriam-webster.com/dictionary/mise

mocktail - an alcohol-free cocktail
https://www.merriam-webster.com/dictionary/mocktail
https://en.oxforddictionaries.com/definition/mocktail

rando - a random person
https://www.merriam-webster.com/dictionary/rando
https://en.oxforddictionaries.com/definition/us/rando

ribbie - a spelling based on RBI in baseball land
https://www.merriam-webster.com/dictionary/ribbie
https://en.oxforddictionaries.com/definition/us/ribbie

zuke - a zucchini
https://www.merriam-webster.com/dictionary/zuke
Attachments
(1 file)
5.41 KB, patch | ehsan.akhgari: review+
No description provided.
Assignee
Comment 1•6 years ago
Attachment #9008273 - Flags: review?(ehsan)
Comment 2•6 years ago
Comment on attachment 9008273 [details] [diff] [review]
bug1490541.patch

Review of attachment 9008273 [details] [diff] [review]:
-----------------------------------------------------------------

::: extensions/spellcheck/locales/en-US/hunspell/en-US.dic
@@ +26397,5 @@
> fink/MDGS
> finned
> finny
> +fintech
> +fintechs

Good observation on not using S!
Attachment #9008273 - Flags: review?(ehsan) → review+
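The remark about not using the S flag for "fintech" can be illustrated with a small sketch. It assumes the S suffix rules look like those in the stock Hunspell en_US.aff (add "es" after s, x, z or h; turn consonant + y into "ies"; otherwise add "s"); the rules in Mozilla's own affix file may differ, and `apply_s_flag` is a hypothetical helper, not part of Hunspell.

```python
import re

# Assumed S suffix rules, modelled on the stock en_US.aff.
# Each rule: (characters to strip, characters to add, condition on the stem).
S_RULES = [
    ("y", "ies", r"[^aeiou]y$"),
    ("", "s", r"[aeiou]y$"),
    ("", "es", r"[sxzh]$"),
    ("", "s", r"[^sxzhy]$"),
]


def apply_s_flag(stem):
    """Return the plural that a `stem/S` dictionary entry would generate."""
    for strip, add, condition in S_RULES:
        if re.search(condition, stem):
            # Remove the stripped ending (if any) before appending the suffix.
            base = stem[: len(stem) - len(strip)] if strip else stem
            return base + add
    return stem  # no rule matched; the flag would add nothing


if __name__ == "__main__":
    for word in ("fintech", "mocktail", "rando"):
        print(word, "->", apply_s_flag(word))
```

Under these assumed rules, `fintech/S` would yield "finteches" because the stem ends in "h", which is why the patch lists `fintech` and `fintechs` as two separate entries.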
Pushed by eakhgari@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/84b407430f08 Bug - Add words to en-US dictionary. r=ehsan
Comment 4•6 years ago
Once upon a time we had a long discussion on what should be in the Mozilla en-US dictionary. It's basically based on the (regular) SCOWL dataset, plus some common names, accented words, plus some Mozilla terms and some extra words.

I argued back then that Mozilla should use the large SCOWL dataset and Ehsan argued against it, IIRC, basically saying that users should get a "basic" dictionary without slang or niche or speciality words, also in order to avoid misspellings.
See bug 1235506 comment #9: mask spelling of more common words, for example "calender"/"calendar"
See bug 1235506 comment #20: large -> WONTFIX

Here you've added a bunch of words, most of which should *not* have been added at all. The ambition has never been to offer the most complete dictionary. If you want that and don't care about en-US, use https://addons.mozilla.org/en-GB/firefox/addon/british-english-dictionary-2/, that's the most complete one available.

Coming back to the words added. "hangry" is of course an absolute no-no, since it will now allow the spelling mistake "hangry" instead of "hungry". Particularly bad since phonetically the "u" in "hungry" is pronounced as "a" (/ˈhʌŋɡri/).

Please take the time and check those words against SCOWL. You can just paste them all
===
merchantability salability sucky latinx adorbs bingeable biohacking fav fintech fintechs gochujang guac hangry hophead iftar mise mocktail rando ribbie zuke
===
here: http://app.aspell.net/lookup

You will see that most words are not recommended for their dictionary, and some are in the large dataset. I would back this out. According to SCOWL large, only merchantability, salability, hophead and mocktail are desirable.
Flags: needinfo?(ehsan)
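The "masking" concern in comment 4 (a newly added word hiding a typo of a more common one, as with "hangry"/"hungry" or "calender"/"calendar") could be screened for before a patch is written. A minimal sketch, assuming a single-edit distance is a good enough proxy for a likely typo; `one_edit_variants` and `masked_words` are hypothetical helpers, and the `existing` set stands in for the words already in the dictionary.

```python
import string


def one_edit_variants(word):
    """All strings one deletion, substitution, insertion or swap away from word."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    substitutions = {a + c + b[1:] for a, b in splits if b for c in letters}
    inserts = {a + c + b for a, b in splits for c in letters}
    swaps = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    return (deletes | substitutions | inserts | swaps) - {word}


def masked_words(candidate, existing):
    """Existing words that the candidate could be a one-edit typo of."""
    return sorted(one_edit_variants(candidate.lower()) & set(existing))


if __name__ == "__main__":
    # Illustrative stand-in for the current dictionary contents.
    existing = {"hungry", "angry", "calendar", "mocktail"}
    for candidate in ("hangry", "calender", "gochujang"):
        print(candidate, "->", masked_words(candidate, existing))
```

A non-empty result does not by itself mean the word should be rejected; it only flags candidates that deserve a closer look by a reviewer.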
Comment 5•6 years ago
I agree about hangry actually, that's a good point, but do you mind filing a new bug? No point in backing out the whole patch just because of one hunk.

(In reply to Jorg K (GMT+2) from comment #4)
> You will see that most words are not recommended for their dictionary, and
> some are in the large dataset.

I'm well aware that our en-US dictionary diverges from SCOWL, intentionally so. I'd rather not have that debate again, since I think it's a matter of opinion.
Flags: needinfo?(ehsan)
Comment 6•6 years ago
Indeed, but could you please state some clear guidelines that lay out what words should be included. What is the definition for the Mozilla en-US dictionary? What are your plans? The discussion in the past only happened because clear guidelines were lacking.

IMHO the criterion for inclusion cannot be that someone presents a patch and anything is included. That will make for a very inconsistent and patchy result.

Ehsan, can you as the custodian please make sure such rules are defined and followed. Please don't put the onus of rectifying the current situation on someone who made a drive-by comment (after coincidentally seeing the changeset on inbound).

Looking at the history of manual additions:
https://hg.mozilla.org/mozilla-central/log/tip/extensions/spellcheck/locales/en-US/hunspell/en-US.dic
most seem very welcome, so I don't quite understand what happened this time.

Ekanan, can you please fix the problem. I think not only "hangry" should be removed, but in fact many of the words added this time, see comment #4. I'm happy to stand corrected if Ehsan comes up with some guidelines.

Personally I'd suggest looking up all future additions in http://app.aspell.net/lookup. If they are in the large dictionary, there's no problem including them. If the word has a "should include" rating of one star, I would generally not include it.
Flags: needinfo?(ananuti)
Assignee
Comment 7•6 years ago
> Personally I'd suggest to look all future additions up in
> http://app.aspell.net/lookup. If they are in the large dictionary, there's
> no problem including them. If the word has a "should include" rating of one
> star, I would generally not include it.

Usually, I use that toy ONLY if they are NOT in the AmEng dictionary (OFD/M-W). If they are, I'll bake a patch. That's my modus operandi. gabish? period.

> I agree about hangry actually, that's a good point, but do you mind filing a new bug?
> No point in backing out the whole patch just because of one hunk.

Next time around, I'll take it away.
Flags: needinfo?(ananuti)
Comment 8•6 years ago
Ehsan, I really think we need some guidelines here. If anything in the Oxford or Merriam-Webster dictionaries should be included, then we need a different approach to making the dictionary complete.

I don't agree that Kevin Atkinson's tool is a "toy". I believe he and his SCOWL friends analyse Google Books for the frequency of words and maintain their word lists with great care. Maintaining a dictionary is really the job of a linguist, and neither you nor I have English as our native language.
Comment 9•6 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/84b407430f08
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
Comment 10•6 years ago
I researched the words added here and they seem to have some usage from what I see in web searches. I came to the conclusion adding them was OK, so please accept my apologies.

Apparently we're aiming for a certain completeness of Mozilla's en-US dictionary, so may I suggest the following:

Download SCOWL's "toy" word lists in the "normal" and "large" size from
https://sourceforge.net/projects/wordlist/files/speller/2018.04.16/wordlist-en_US-2018.04.16.zip/download
https://sourceforge.net/projects/wordlist/files/speller/2018.04.16/wordlist-en_US-large-2018.04.16.zip/download

Compare en_US.txt to en_US-large.txt using some comparison tool and note "useful" words not contained in the smaller set and most likely therefore not contained in Mozilla's dictionary.

I did that for the letter "A" and found these useful words: Alicante (city in Spain, we already have Madrid, Barcelona, Valencia), Amazonas, Americanist, Anglicist (all marked as spelling errors).

May I also mention enrobe, relict, residuary, enforceability which were in the Mozilla dictionary before May 2015 (picked from bug 1235506 comment #10).
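The comparison suggested in comment 10 could be done with a few lines of script instead of a diff tool. A minimal sketch, assuming both files hold one word per line and that the encoding passed in matches the downloaded files (UTF-8 is only a guess here); `load_words` and `only_in_large` are hypothetical helper names.

```python
from pathlib import Path


def load_words(path, encoding="utf-8"):
    """Read a word list with one word per line, ignoring blank lines."""
    text = Path(path).read_text(encoding=encoding)
    return {line.strip() for line in text.splitlines() if line.strip()}


def only_in_large(normal_path, large_path, first_letter=None, encoding="utf-8"):
    """Words present in the large list but missing from the normal one.

    If first_letter is given, keep only words starting with that letter
    (compared case-insensitively), e.g. "A" to review one letter at a time.
    """
    extra = load_words(large_path, encoding) - load_words(normal_path, encoding)
    if first_letter is not None:
        extra = {w for w in extra if w[:1].upper() == first_letter.upper()}
    return sorted(extra)
```

For example, `only_in_large("en_US.txt", "en_US-large.txt", "A")` would list the candidates for the letter "A"; deciding which of them count as "useful" still needs human judgement.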
Comment 11•6 years ago
(In reply to Jorg K (GMT+2) from comment #6)
> Indeed, but could you please state some clear guidelines that lay out what
> words should be included. What is the definition for the Mozilla en-US
> dictionary? What are your plans? Only due to lack of clear guidelines there
> has been the discussion in the past.

There are *no guidelines* at this time. I invite all who are interested in such guidelines to start putting in the time and expertise necessary to research and develop the kind of guidelines they would like to see here, instead of simply demanding them into existence.

Until such a day where we have some guidelines that we would follow, the process will remain as follows: Contributions to the en-US dictionary are encouraged. Contributors are encouraged to study resources including dictionaries, the SCOWL wordset, and any other data sources that should be helpful in the word selection process. The reviewers will do their best to provide guidance. Occasionally we will get things wrong, and when that happens we encourage bug reports so that we can fix the mistakes.

> Ehsan, can you as the custodian please make sure such rules are defined and
> followed. Please don't put the onus of rectifying the current situation
> on someone who made a drive-by comment (after coincidentally seeing the
> changeset on inbound).

I did no such thing. I simply asked you to file a bug instead of commenting on a bug with a patch landed (common Mozilla development practice).

> Looking at the history of manual additions:
> https://hg.mozilla.org/mozilla-central/log/tip/extensions/spellcheck/locales/en-US/hunspell/en-US.dic
> most seem very welcome, so I don't quite understand what happened this time.

Well, mistakes happen. Anyway, Ekanan will take care of the issue as mentioned in comment 7, so I don't think there's more to discuss here.

For future discussions, I invite you to read https://bugzilla.mozilla.org/page.cgi?id=etiquette.html again before commenting. Thank you.