Open Bug 499593 Opened 10 years ago Updated 3 days ago
[meta] Add word _____ which is not in American English dictionary / recognized / suggested / misspelled by spell checker (spell checking, spelling, spell checker, en-US)
In Bug 479334, I updated the dictionary to include a large number of new words suggested by bug reporters, which were filed as over 25 separate bugs. Along the lines of the "source code spelling fixes" bug, I would like to make this the consolidated bug for adding new words to the spell checker dictionary. From now (June 2009) until the next time the dictionary gets patched, please leave a comment in this bug with the word you would like added (and a rationale, if any). Please also dupe bugs (with bug numbers > 5,000,000) that make dictionary suggestions to this bug.
add: natively, New Zealand
correct: opthalmic -> ophthalmic, opthalmologic -> ophthalmologic, opthalmology -> ophthalmology (best to grep for "opthal" and fix all to add an "h" after the "p". This error dates back to the original ispell dictionaries.)
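The "grep for opthal" fix suggested above could be scripted along these lines. This is only a minimal sketch: the function name `fix_opthal` and the sample entries are hypothetical, and it assumes the dictionary is available as a list of plain-text lines.

```python
import re

def fix_opthal(lines):
    """Insert the missing "h" after the "p" in every "opthal" stem."""
    fixed = []
    for line in lines:
        # Keep the case of the leading letter; correctly spelled
        # "ophthal..." entries do not match and pass through unchanged.
        fixed.append(re.sub(r"([oO])pthal", r"\1phthal", line))
    return fixed

# Hypothetical sample entries, not taken from the real en-US.dic
sample = ["opthalmic", "Opthalmology", "ophthalmic"]
corrected = fix_opthal(sample)
```

After running it, a second grep for "opthal" over the output should come back empty.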
Add testsuite and testcase From bug 503687, add afterwards.
html in lowercase is not in the dictionary, so saying "foo.html" appears spelled wrong. The same applies to other extensions like jpg, gif, png, css, js, php, etc. Though these aren't very important, they would be helpful additions. Also from my last comment, the plural forms would be needed too, "testsuites" and "testcases".
Neither "testsuite" nor "testcase" is an English word; neither should be added. Similarly, HTML, JPG, GIF, etc. are acronyms and should properly appear in uppercase in written documents. Users who regularly write about file names can add them to personal dictionaries if desired; for similar reasons, things like "usr" shouldn't be in the dictionary. (FWIW, I write about file names all the time, and I've gotten along just fine with per-file whitelisting, which protects me from typos in situations where file names aren't the topic without adding significant inconvenience.)
(In reply to comment #7)
> Neither "testsuite" nor "testcase" is an English word; neither should be added.
>
> Similarly, HTML, JPG, GIF, etc. are acronyms and should properly appear in
> uppercase in written documents.
So how should XML be written? In almost all the Microsoft material I see, it is "Xml". In the real world, the IT industry standard is to use "testsuite", "testcase", html, jpg, gif, url, popup, plugin, etc. Many times in design document reviews I insisted on uppercase for COBOL, but when other issues took priority my comments were ignored, so I stopped pointing it out. Firefox/Mozilla needs to know whom we are targeting as spellcheck users. Are we targeting a person who is writing a novel in Firefox, or people who are just using email and forums, posting comments, etc. on intranet/internet sites? I can agree that "usr" shouldn't be in the dictionary. But "hmm", "um", and "umm" would be good additions, as everybody uses them in chat. Alternatively, we could allow the use of multiple dictionaries, the way WordPerfect allowed a Technical Dictionary alongside the English Dictionary.
(In reply to comment #8.) The point of a spell checker is to help people who want to spell things correctly. That doesn't mean adding misspellings simply because they are popular. A dictionary shouldn't contain "Xml" any more than it should contain "definately", no matter how many people at Microsoft or elsewhere screw it up. There is little harm in having a squiggly underline under a word. But the misspelled "ophthalmic", in comment #3 above, got a person in trouble with his boss. I believe that spelling dictionaries are setting a standard and thus should be conservative. Personal dictionaries and multiple dictionaries have advantages (and disadvantages), but that should go in a different bug report.
I find it wrong, wrong, wrong that an institution as geeky as ours does not recognize the noble theremin. Please add "theremin" to our English dictionaries.
Thatcherism is missing from the dictionary. http://en.wikipedia.org/wiki/Thatcherism
Might've - contraction of might have
All suggestions through Comment 17 (plus comment 19) have been fixed by Bug 479334. --------- Some words to start the next patch: * Dieing trifecta quiniela exacta superfecta
I would vote (strongly) against "exacta". As a general rule, it's not a good idea to insert words that are rarely used and are a small typo away from a common word. The other three are OK, since they aren't close to common words, although horse-race betting is sufficiently specialized that it wouldn't break my heart to see them omitted.
Geoff, thanks for the suggestion. "exacta" occurs on over 2 million English web pages, which is a lot relative to many of our dictionary words (although "exact" occurs on around 500 million). So it depends on what we're trying to do -- prevent false positive "spelling errors" of "exacta," or prevent false negative misspellings of "exact." We should think about it more when preparing the next patch. "trifecta," however, is part of ordinary language and should definitely be included.
To deal with words like "exacta" the best solution would be to make the spell checker more intelligent. It could do some basic grammar checking to spot that kind of thing. For the moment I think exacta should be included. Adding an extra 'a' to the end of "exact" does not seem very likely to happen. The 'a' key is far from the 't' key and the word "a" never follows "exact". If the following word started with 'a' and the user pressed the space bar after it rather than before, then the next word would be misspelt and the error should come to light.
I'll point out the following, even though my preference is for the spell checker to be a superset of commonly accepted dictionaries (dictionary words + common "web words"). So although I agree with email@example.com's conclusion (that we should keep exacta), "exacts" is an English word that could easily be "typo'ed" to "exacta," since "s" and "a" are adjacent.
I suppose you have to ask what the spelling checker is for. To my mind its sole purpose is to detect misspelt words. Detecting the wrong word is up to grammar checking. I think the debate over "exacta" is a bit of a red herring for that reason. The spelling checker does not proof-read the text. Even typos are outside of its core functionality IMHO.
Thanks for the thoughtful comments, James and Paul. Let's try to keep this bug for word additions/deletions, not a place to debate spell checker policy. If you feel very strongly about some such issue, please open another bug in the spell check component.
Paul, I can assure you (based on over 20 years of experience in spell checking) that adding an extra "a" is quite common for a number of reasons. Typos come from more than just hitting adjacent keys. The best solution for this sort of problem is the private dictionary. People who regularly type a word can add it, while people who don't won't be bothered by false positives. I presume Mozilla supports a private dictionary? As to grammar checking, I wish.
Bug 153104 has a list of country names. Some are still not in the dictionary: Darussalam (in Brunei Darussalam) Bouvet (in Bouvet Island) Malvinas (in Falkland Islands) Faroe (in Faroe Islands) Sri Lanka Miquelon (in Saint Pierre and Miquelon) Mayen (in Svalbard and Jan Mayen) Sao (in Sao Tome and Principe) Tokelau Timor-Leste Futuna (in Wallis and Futuna) Mayotte
Sao also appears in a lot of city names (e.g., Sao Paulo, Brazil). My only question is about Darussalam. Since this is a transliteration from the Arabic, is there a standard spelling? (I don't know; I'm asking. I've seen "Dar-es-Salaam" and am wondering if it's a different transliteration of the same word.)
repost, reposts, reposter, reposted, reposting
neuroscientist/MS astrobiologist/MS lepidopterist/MS limnologist/MS volcanologist/MS (source: http://en.wikipedia.org/wiki/Scientist)
malware/S adware/S crimeware/S rootkit/MS keylogger/MS botnet/MS We should really try to keep up with the Internet. A browser shouldn't look dated, even the dictionary.
Oh, and undesignated.
commenters (plural) not allowed in Minefield/3.7a1pre. I don't know if comment/GSMDR can be tweaked to allow "commenters"; it seems other words have a separate "-er" form that handles the plural noun (e.g. kill/JMDRSZG + killer/M).
Thanks for that one. We should keep affixes as simple and consistent across the dictionary as possible, so commenter/MS is best.
The "Z" flag will add "ers" to a word. So comment/GSMDRZ will handle the plural (but not "commenter's" because there isn't a flag for that, though there should be). In general, though, the simplest approach is to use munchlist to generate affix flags. That way, you can just add any random word form and not have to worry about affix rules (which are sometimes a bit confusing).
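To make the flag discussion above concrete, here is a toy expansion of a "stem/FLAGS" entry. It is a sketch only: the suffix table is a simplified assumption (the thread confirms only that "Z" adds "ers"; the other mappings are assumed here), real .aff rules also strip letters and test conditions, and the function name `expand` is hypothetical.

```python
def expand(entry):
    """Expand a "stem/FLAGS" .dic entry into the set of word forms it accepts."""
    stem, _, flags = entry.partition("/")
    forms = {stem}
    # Simplified, assumed suffix table; only valid for stems that need
    # no letter changes (so "comment" works, "badge" would not).
    suffixes = {"S": "s", "M": "'s", "D": "ed", "G": "ing", "R": "er", "Z": "ers"}
    for flag in flags:
        if flag in suffixes:  # flags missing from the toy table are ignored
            forms.add(stem + suffixes[flag])
    return forms

before = expand("comment/GSMDR")    # lacks "commenters"
with_z = expand("comment/GSMDRZ")   # gains "commenters", still no "commenter's"
separate = expand("commenter/MS")   # separate entry: both plural and possessive
```

Under these assumptions the separate commenter/MS entry is the only variant that yields both "commenters" and "commenter's", which matches the trade-off described in the two comments above.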
"someone else's browser" I'm far from a Hunspell .aff expert, but perhaps the bare "else" in en-US.dic should be else/M
Here's a list of country names (taken from <http://www.iso.org/iso/list-en1-semic-3.txt>, which is the ISO 3166-1 list) that are still missing in the latest trunk : Bouvet (in Bouvet Island) Faroe (in Faroe Islands) Hong (in Hong Kong) Mayotte Puerto (Puerto Rico) Réunion Barthélemy (in Saint Barthélemy) Cunha (Saint Helena, Ascension And Tristan Da Cunha) Miquelon (in Saint Pierre And Miquelon) Marino (in San Marino) Sao (Sao Tome And Principe) Sri (in Sri Lanka) Mayen (in Svalbard And Jan Mayen) Leste (in Timor-Leste) Tokelau Viet (in Viet Nam) Futuna (in Wallis And Futuna) I left out some names that could not be presented in English or that already had common English names (Åland Islands, Côte d'Ivoire, Libyan Arab Jamahiriya, ...). I don't think we should add every name of every little island (a dictionary is not an encyclopedia). I would add : Faroe Hong Puerto Sri Viet
chemistries (plural), and also biochemistries and possibly geochemistries; even photochemistries appears in some research papers. I think in en-US.dic chemistry/M should be chemistry/MS, and similarly for any other correct -chemistries plural.
badging, now very popular as a shorthand for "branding with logos and emblems". I think in en-US.dic badge/MZDRS should have a G as well.
parkour traceur both have crossed over into English (traceuse, not so much)
(In reply to comment #47)
> parkour
> traceur
>
> both have crossed over into English (traceuse, not so much)
They're not mentioned on www.merriam-webster.com
7 Million English language google results are enough for inclusion, I think. Also, http://dictionary.reference.com/browse/parkour Jo, there are hundreds or thousands of words in our dictionary that aren't in M-W, especially newer words or words prevalent on the internet. If you believe our policy of including words that aren't in some particular dictionary is a problem, please file another bug.
I've seen several bugs where such requests were rejected because the words could not be found in m-w (or Oxford or whatever). Parkour has only 2,920,000 results btw, traceur 109,000. And not all of them are English anyway (even on the first page of the Google results), as the language is often guessed wrongly. But that doesn't matter.
I don't think the Google hit count is a good metric. After all, there are over 8 million hits for "independant" (it hurts me to even type it!). A better criterion is twofold: (a) do enough people use the word that it's a burden to ask them all to put it into their private dictionaries, and (b) is it likely to mask a typo of an existing word. By that criterion, I'd argue that parkour is barely OK but traceur should be avoided, since it's rare and hides mistypings of "tracer".
decertify decertified decertifies decertifying decertification
eschatological (http://dictionary.reference.com/browse/eschatological) inertance (http://dictionary.reference.com/browse/inertance)
Upstream Maintainer Here: I have looked through many of the bug reports and I would roughly classify them as falling into the following categories:
1) Common words which should probably be added; the only reason they are not there is because of bugs in the creation process. For example: analyses
2) New words which should probably be added.
3) Proper names which possibly should be added.
4) Not-so-common words which user X thinks should be added.
5) Variant spellings of a word which are acceptable, but perhaps not very common.
As the upstream maintainer I would really like to know about (1) since they represent a problem. In fact I just created a release and I would like to have known about "analyses" beforehand. I would also like to know about (2) and (3), even if I don't add every suggestion. I do not have time to actively monitor other projects' bug reports. The best thing is to file bug reports at http://sourceforge.net/tracker/?group_id=10079&atid=1014602; however, I admit that the sourceforge bug tracker can be a pain to use. You can also email firstname.lastname@example.org. Just adding email@example.com to the CC list would be a big help. If this would be possible please email me privately to work out the details (like making sure the post gets through).
As for (4), I purposely limit the dictionary size to avoid many of the problems Geoff Kuenning talked about. The upstream dictionary is generated from SCOWL (http://wordlist.sourceforge.net), which offers wordlists in a variety of sizes. I currently use the 60 size, which I think is a good compromise in including common words while avoiding the not-so-common words which could mask spelling mistakes. I can very easily generate a dictionary from a larger size. Size 70 will include most words found in the dictionary, and if that is not big enough for you, I can include size 80. See the SCOWL readme for additional information.
Another thing that could be done is to only let in words from a larger size if they would not mask spelling mistakes. This will take a bit of work, but it might be worth it. It will require some sort of automatic check for when a word is close enough to a common word that it does not warrant inclusion. If anyone is interested in pursuing this approach it is perhaps best to take the discussion to the wordlist-devel mailing list. Note that another problem with a large dictionary is cluttering the suggestion list; my proposed solution above will not help here.
As for (5), this is a deliberate choice. In order to promote consistent spelling I, for the most part, only include one spelling for a word. However, if you want I can easily create a dictionary for you which includes additional variants. You have two choices: I can include one with just the common variants (for example, "grey" is not included) or one which includes all variants which are considered acceptable (including "grey").
Hi Kevin, thanks for taking a look. I'd like to discuss this with you at greater length, but one preliminary point (I think you're aware of this but want to make it explicit) is that the volume of bug reports here isn't really typical of the current Firefox dictionary patch -- they're largely the squeaky wheel getting greased (and I include many of my own additions in that group ;), and thus include a lot of (4) and (5)s in your categorization. I would say the patch itself is mostly (2) new words, especially from Chromium; a large set of (3) proper names, source unclear; and a fair amount of (1) omissions. I do like the idea of selectively adding words that don't cause spelling check false negatives on high frequency words. That seems like an interesting but manageable coding challenge.
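The "does this word mask a typo of a common word" check discussed above could be sketched roughly as follows. The assumptions are mine, not anyone's actual implementation: "close" means one edit (delete, transpose, replace, or insert), words are lowercase ASCII, and the names `edits1` and `masked_words` plus the tiny common-word set are hypothetical.

```python
import string

def edits1(word):
    """All strings one edit (delete, transpose, replace, insert) away from word."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    transposes = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    replaces = {a + c + b[1:] for a, b in splits if b for c in letters}
    inserts = {a + c + b for a, b in splits for c in letters}
    return (deletes | transposes | replaces | inserts) - {word}

def masked_words(candidate, common_words):
    """Return the common words that a one-edit typo could turn into candidate."""
    return sorted(edits1(candidate) & set(common_words))

# Hypothetical stand-in for a high-frequency word list
common = {"exact", "exacts", "tracer"}
exacta_hits = masked_words("exacta", common)
traceur_hits = masked_words("traceur", common)
trifecta_hits = masked_words("trifecta", common)
```

A candidate with an empty result could be let in from a larger word list, while one with hits would need a judgment call that also weighs how frequent each word is.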
Matt, thanks for your fast reply. Although many of the additions mentioned in comments in this bug fall in (4) and (5), many of the bugs mentioned in Bug 479334 do not, and in fact after reviewing all of them I have acted on most of them. We should probably take any remaining discussion offline. I encourage everyone with an interest in the English dictionary to join the wordlist-devel mailing list (http://lists.sourceforge.net/mailman/listinfo/wordlist-devel); it is a very low volume list.
transgender http://dictionary.reference.com/browse/transgender http://www.change.org/petitions/tell-facebook-to-stop-correct-spell-of-transgender#comments (The more common "transgendered" is already apparently in the dictionary).
undescended, as in "Francis II of France's undescended testicles". This is the only common form (420,000 Google matches, 389,000 of them for undescended testicle), NOT undescend (not in Wiktionary and only 1,820 Google matches), undescending (not in Wiktionary and only 3,220 Google matches), etc.
Can we please just have a comprehensive collection of words added that don't force us to look up correctly-spelled words because the application has the vocabulary of a twelve-year-old?
Dan, I'm sure we'd all LOVE to have a comprehensive word collection. If you can point us to a source, it would be much appreciated. But I will note, to forestall unhelpful suggestions, that copyrighted collections aren't useful, nor are definitional dictionaries like Webster's and Wiktionary always helpful, for several reasons.
How about the dictionary used in OpenOffice? Is that not an open-source project?
The OpenOffice dictionary is what we're using right now. Currently, Firefox has a somewhat larger dictionary than OpenOffice. This bug is for word suggestions, not dictionary suggestions or general spellchecker complaints. If there are specific missing words that you're having trouble with, add them as a comment here and they will be considered for future dictionary patches. If you have a practical proposal for a new dictionary or word source that can be correctly licensed to use with open-source software such as Firefox, open a NEW bug.
To add: Dudeism, Dudeist, Duder All terms associated with Dudeism, the most laid back religion on the planet.
Find a real word, and I'm sure they'll consider it. Locate it in a print dictionary.
phonons phonon's I think phonon/MS (the same suffixes as photon/MS in the dictionary) would take care of this. My copy of kompozer 0.8b3pre has "phonon/M".
disintermediate disintermediation Still popular 90s buzzwords, both in Merriam-Webster, both on Wikipedia, 58,000 & 235,000 Google results respectively.
Found 2 today: millennia opposable
"advisor", superior (and more commonly used) synonym of adviser: http://answers.yahoo.com/question/index?qid=20090724065947AAyhdKx
teleport mage (Suggested in https://bugzilla.mozilla.org/show_bug.cgi?id=831411)
Theres should suggest There's, a contraction for there is or, rarely, there has.
immersive im·mers·ive [ih-mur-siv] adjective noting or pertaining to digital technology or images that deeply involve one's senses and may create an altered mental state: immersive media; immersive 3-D environments.
triages 3rd person singular present, plural of tri·age Noun The action of sorting according to quality. Verb Assign degrees of urgency to (wounded or ill patients).
bloviated to speak or write verbosely and windily.
synesthesia (American English spelling of synæsthesia or synaesthesia )
Unchecking (unchecked) unbridled: not restrained or controlled; "unbridled rage"; "an unchecked temper"; "ungoverned rage" (Current spelling correction recommendation is 'Chungking' oddly enough)
sommelier the wine expert at a restaurant
stormtrooper. Strangely enough, stormtroopers is in the dictionary. Don't the agents of the evil empire ever go alone?? spellchecking. Really, now, a spellchecker that doesn't know what it does? unreinforced. precalculated. circularization. Used in orbital mechanics. aerobrake. More space stuff. realtime. durian. Yeah, they stink. While you might not want one in your house that's no reason to keep it out of the dictionary. phlebotomist. It's a very rare person that hasn't met one of these. colonoscopy. Something us older people have to deal with. Las. As in city names. rheumatologist. You have rheumatism but not the docs that deal with it. Given the post above about the choice of word lists, could I suggest offering a few choices of dictionary--say, the 60, 70 and 80 levels?
"Stormtrooper" is not a word; it's two words: storm trooper. The correct fix is to remove "stormtroopers" from the dictionary. It's "spell checker", not "spellchecker". Actually, it's "spelling checker." And "checking spelling". Although "spell-check" has become fairly common usage. Likewise, "realtime" isn't a word. You should write "real time" or "real-time" depending on whether you're using it as a noun or an adjective. "Las" is a tough one, since it can hide typos (e.g., leaving the last letter off "last"). But if it's only capitalized in the dictionary, I'd add it.
Sorry everyone that I had not seen this bug yet. First things first, Kevin, nice to meet you! We at Mozilla have been accepting individual word additions in other bugs for a while, and our dictionary is quite diverged from the upstream hunspell dictionary unfortunately. It would be nice if we could revisit this at some point. Is that something which interests you? To others, I don't think this bug is in a very good shape to be actionable. There is a long list of words here and it's very difficult to go through the full list and accept them into the en-US dictionary. Does someone want to volunteer to create a comprehensive list somewhere so that we can look into adding the words on that list to our dictionary? If yes, please get in touch with me. I don't know if doing that work throughout the comments on this bug is going to make things very easy to handle, so we might want to do that work outside of Bugzilla to keep things maintainable.
Does the openoffice dictionary migrate into FF periodically, or is this an old version we just keep using? I ask because I wonder if the answer isn't to go upstream and add everything here to that one.
(In reply to comment #83)
> Does the openoffice dictionary migrate into FF periodically, or is this an old
> version we just keep using? I ask because I wonder if the answer isn't to go
> upstream and add everything here to that one.
We unfortunately don't currently have a good way of receiving updates from any upstream project. To the best of my knowledge, we got the initial word list from SCOWL some time in 2008, added some extra words that the chromium projects added to their dictionaries, added in some more Mozilla specific words, and have not received any data from either of those upstreams since then. (Note that I was not working on this code back when the original dictionary was added.) Since then we've been making individual small updates to the dictionary every now and then. At some point we should get a better story for communicating these fixes back and forth with the upstream projects but that doesn't necessarily block adding the words suggested here.
(In reply to :Ehsan Akhgari (lagging on bugmail, needinfo? me!) from comment #82)
> First things first, Kevin, nice to meet you! We at Mozilla have been
> accepting individual word additions in other bugs for a while, and our
> dictionary is quite diverged from the upstream hunspell dictionary
> unfortunately. It would be nice if we could revisit this at some point. Is
> that something which interests you?
Yes it is. I very much want to avoid another fork of the English dictionaries. I do not have a lot of time to devote to maintaining the dictionary, but fortunately the English language does not change that fast. :) David Bartlett's British hunspell dictionary was an unfortunate fork that was created primarily due to a communication breakdown. I am somewhat picky about what words I accept but I am not unreasonable. Suggestions are best submitted as a bug report at http://sourceforge.net/p/wordlist/issues/. But before doing so please take the time to understand what SCOWL is (see http://wordlist.sourceforge.net/) and how the hunspell dictionary is created from SCOWL. In particular, note that the official hunspell dictionary is a subset of SCOWL and it is possible to create your own customized dictionary by choosing a different subset. Also, as a matter of policy, I generally only include one spelling of a word and have decided that fewer words (up to level 60 in SCOWL) is better than more (level 70, or even 80) for the purpose of spell checking. With this in mind, one huge time saver for me would be to determine if suggestions to add are (1) already in the newest upstream version or (2) in SCOWL but not in the subset that was used to create the Hunspell dictionary. (It should be easy to write a script to do this once you understand SCOWL, but I don't have the time right now.) If (2) is the case I may still consider it given a compelling enough argument.
Also, it is good to check http://corpus.byu.edu/coca/ and see how frequently the word is used in a respectable corpus, as I do not consider Google result count a valid source without a compelling reason. (Also, as a super long term goal, I hope to use the COCA corpus to create a better wordlist.) Thanks, Kevin
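The triage script described above (is a suggestion already upstream, or in SCOWL but outside the chosen subset?) might start from something like this. It is a rough sketch under stated assumptions: the inputs are plain word collections that have already been loaded, only exact stems are compared (no affix expansion), and the function names and sample data are hypothetical rather than real SCOWL or Hunspell contents.

```python
def read_dic_stems(lines):
    """Extract stems from Hunspell-style .dic lines, dropping "/FLAGS" and the count line."""
    stems = set()
    for line in lines:
        line = line.strip()
        if not line or line.isdigit():  # skip blanks and the leading entry count
            continue
        stems.add(line.split("/", 1)[0])
    return stems

def classify(suggestions, upstream_words, scowl_words):
    """Sort suggested words into: already upstream, in SCOWL only, or unknown."""
    buckets = {"in_upstream": [], "in_scowl_only": [], "unknown": []}
    for word in suggestions:
        if word in upstream_words:
            buckets["in_upstream"].append(word)
        elif word in scowl_words:
            buckets["in_scowl_only"].append(word)
        else:
            buckets["unknown"].append(word)
    return buckets

# Hypothetical sample data for illustration only
upstream = read_dic_stems(["3", "theremin/MS", "photon/MS", "phonon/M"])
scowl = {"theremin", "photon", "phonon", "parkour"}
report = classify(["theremin", "parkour", "Dudeism"], upstream, scowl)
```

Only the "in_scowl_only" and "unknown" buckets would then need a human argument for inclusion, which is the time saving the comment asks for.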
Hi, I'm not sure why no one has replied to my last comment. In any case I wanted to let you know that I moved my Dictionaries to GitHub (https://github.com/kevina/wordlist) and also created a new release today with several new words. There is also a simple web-app now to look up words at http://app.aspell.net/lookup. The best way to get changes upstream is to submit requests for new words on the GitHub issue tracker. If you find that I am not letting in enough words, I am willing to consider having a special Mozilla supplement file if it will make staying in sync easier. Please let me know if you have any questions, as I know I am not always as clear as I could be. :) Thanks.
Holy ****. Bug tracking / source control / your pickiness seem like a really poor fit for the task of creating and maintaining a real dictionary. Why can't it be derived from an authoritative source instead? Do we really care enough that the word list be open source, to make such a big sacrifice to quality? If this isn't the place to ask these questions, sorry. Add 'pickiness' to the list of missing words.
"Deliverables": plural form of "deliverable" "Mitigations": plural form of "mitigation"
martialed. The past tense of martial--as in court martial.
Paul Gordon: I'm not sure if your comment was directed specifically at me. However, please note that one of the reasons I am so picky is because I mostly rely on other sources, with the primary one being 12Dicts (http://wordlist.aspell.net/12dicts/). If a word is not in one of my source lists I am hesitant to add it, to keep the size of the list under control. My dictionary may be smaller than some people like, but I can assure you it is of a high quality.
I'm not sure exactly what's going on: It definitely does not accept unreceive as a word. However, sometimes it's accepting unreceived as a word--why the past but not the present? Furthermore, it's being inconsistent here, sometimes flagging unreceived but accepting the incorrect unrecieved. Of course now that I try to put it on the list I can't get it to misbehave. (I haven't gotten it to accept unrecieve at all.) (Use case: An operator marked the wrong job as received. Now he needs to undo that--unreceive a job.)
"Unreceive" isn't accepted as a word because it isn't a word. It's that simple. "Unreceived" isn't the past tense of a verb, it's an adjective. That's why "unreceive" and "unreceiving" aren't in the dictionary. If somebody receives something in error, they don't "unreceive", they send it back. If an operator marks a job as received (which is, in this case, an adjective) they don't fix the mistake by unreceiving the job, they fix it by unmarking it.
(In reply to Geoff Kuenning from comment #92)
> "Unreceive" isn't accepted as a word because it isn't a word. It's that
> simple.
>
> "Unreceived" isn't the past tense of a verb, it's an adjective. That's why
> "unreceive" and "unreceiving" aren't in the dictionary.
>
> If somebody receives something in error, they don't "unreceive", they send
> it back. If an operator marks a job as received (which is, in this case, an
> adjective) they don't fix the mistake by unreceiving the job, they fix it by
> unmarking it.
If it's not a word then why is "unreceived" accepted? And Firefox is flagging "unmarking".
Go back and read what I wrote. "Unreceived" with a D is a word: it's an adjective. "Unreceive" without the D isn't a word. And "unmark" shouldn't be in the dictionary either; I erred by using it--see dictionary.com, for example. I should have said "removing the mark."
Fukushima. I note that Chernobyl is in the dictionary, the other big nuclear disaster also should be.
"A cappella" should be handled better. "cappella" isn't in the dictionary and gets the squiggly line and a suggestion to change to "Capella", and if someone mistypes the phrase as a single word acappella, accapella, etc. the spell checker should suggest the two words "a cappella", in the manner that it corrects the spelling of "alot" (for which fix I love Hunspell and its developers A LOT!) (I guess "Capella" with a capital 'C' is in the dictionary because "Capella is the brightest star in the constellation Auriga.")
"inverter" is not accepted in Firefox nightly 54.0a1 build ID 20170213030206. It's definitely a word, https://en.wikipedia.org/wiki/Solar_inverter
'victimless' eg: https://en.wikipedia.org/wiki/Victimless_crime Lowering perceived quality of Firefox: https://twitter.com/ChuckBaggett/status/884077572644118529 also opened in wordlist: https://github.com/en-wl/wordlist/issues/185
'combust' 'combusts' eg: https://www.merriam-webster.com/dictionary/combust Lowering perceived quality of Firefox: https://twitter.com/Neurisko/status/884431031171940357 also opened in wordlist: https://github.com/en-wl/wordlist/issues/186
'dystopia' eg: https://en.wikipedia.org/wiki/Dystopia Lowering perceived quality of Firefox: https://twitter.com/AphixJS/status/890091686881357831 also opened in wordlist: https://github.com/en-wl/wordlist/issues/188
There's no way you're going to get this to work. Firefox can't hit the **bullseye**!
'balkanize' eg: https://www.merriam-webster.com/dictionary/balkanize
'cockatiel' eg: https://en.wikipedia.org/wiki/Cockatiel Lowering perceived quality of Firefox: https://twitter.com/Falcc/status/898280615136591872 also opened in wordlist: https://github.com/en-wl/wordlist/issues/192
"Balkanize" and its variants should be capitalized in the dictionary: Balkanize/Balkanizes/Balkanized/Balkanization, since they're derived from a proper name.
The AmEng dictionary, M-W, AHD and Random House Webster's, does have `balkanize`. That's why I added those words in lowercase. :)
'tranche', 'tranches' http://www.investopedia.com/terms/t/tranches.asp Kind of crazy that the entire fiscal crisis was about tranches of debt that were sold and resold, and that that was almost ten years ago, and that FF still doesn't have this word. :)
'intifada' eg: https://en.wikipedia.org/wiki/Intifada Lowering perceived quality of Firefox: https://twitter.com/shipstogaza/status/909424563821453316
'vanishingly' As in: "Her hopes were vanishingly small that the yo yo would ever come back up."
'preinstall' 'preinstalled' https://www.merriam-webster.com/dictionary/preinstall Also added to wordlist: https://github.com/en-wl/wordlist/issues/201
Please add "nondeterminism" and "nondeterministic" http://www.dictionary.com/browse/determinism / https://en.wikipedia.org/wiki/Nondeterminism And "bimodal" https://www.merriam-webster.com/dictionary/bimodal
Depends on: 1412668
remediate http://www.dictionary.com/browse/remediate Lowering perceived quality of Firefox: https://twitter.com/cryptoishard/status/928089390450176000 also opened in wordlist: https://github.com/en-wl/wordlist/issues/204
'fugacious' It means fleeting, like my hope of this being an effective way to add words to a dictionary.
amicus analytics API collegial discoverable encodings explainer expungement, expunction. expungements Ghostscript JSON online parsers PDFs rasterization roadmap screenshare searchable strategizing, strategize vanishingly
APIs, plural of API.
copyrightable - something that can be copyrighted
Divorcee(s) -- somebody that's been divorced. (I feel like there's a better way to do this....)
Banc, as in "en banc": https://en.wikipedia.org/wiki/En_banc
Kryptonite --- Not sure if it should be capitalized. I think not at this point.
Egads - expressing surprise, anger, or affirmation. Singular is in there, plural is missing.
"webpages" isn't a word. It's a mistake made by people who don't want to type a space in "web pages" (which should actually be "Web pages", since the WWW is a proper noun). "egads" also is not the plural of "egad", and a quick check of authorities like Merriam-Webster suggests that "egad" is the correct usage.
AP Style Guide has website as one word and recommends lowercasing web these days, but yeah, I can't find their stance on "webpage." I'll retract that one for the moment! Egads is indeed not the plural per se, but it's listed as an alternative on dictionary.com and in Google. It's not important what's the "correct usage", provided both are in the dictionary. The spellchecker should be as broad as possible. If it's a word or word variant and it's in the dictionary, it shouldn't be flagged as a misspelling.
> APIs, plural of API. > fundraising listed in en_US.dic
> The spellchecker should be as broad as possible. If it's a word or word variant and it's in the dictionary, it shouldn't be flagged as a misspelling. Actually, it's the reverse. The whole point of spellcheckers is to help people conform to a standard. So if there is a preferred usage, the spellchecker should encourage that preference. Otherwise we should accept mistakes like "desparate". There's also the issue of consistency: if a person has a habit of not always writing a word the same way, we should help them choose one of them. That's why ispell's dictionary only accepts "totaled" even though "totalled" is considered an acceptable alternate (and I hope Mozilla's dictionary does the same). Dictionary.com and (especially) Google are not particularly reliable authorities. I prefer sources like OED and MW, where the editors have devoted their life to the language. So IMHO "egads" shouldn't be there. "Fundraising"...definitely; it's in common usage now, as is "fundraiser". Although somewhat oddly, IMO not "fundraise" (instead, people should write "raise funds").
(In reply to Geoff Kuenning from comment #132) > I prefer sources like OED and MW, where the editors have > devoted their life to the language. https://en.oxforddictionaries.com/definition/us/egad https://www.merriam-webster.com/dictionary/egads I decided to patch this in bug 1422346.
pruno - http://www.dictionary.com/browse/pruno?s=t an alcoholic wine-like drink made by prison inmates (1936)
I'd argue against "pruno" on the grounds that it's rarely used and would hide typos for more common words such as prune. Of course if it's regularly used in some regions then it should be there...
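To illustrate the concern above, here is a minimal sketch of how adding a rare word can hide a typo for a common one. It assumes a toy checker that only tests set membership; the `misspelled` helper and the tiny word lists are hypothetical and are not how Firefox's spell checker is implemented.

```python
def misspelled(text, dictionary):
    """Return the words in text that are not in the dictionary (case-insensitive)."""
    return [word for word in text.lower().split() if word not in dictionary]


# Hypothetical, tiny word list standing in for the real dictionary.
base_dictionary = {"i", "ate", "a", "prune"}

sentence = "I ate a pruno"  # intended word: "prune"

# Without "pruno" in the dictionary, the typo is flagged.
flagged_before = misspelled(sentence, base_dictionary)

# With "pruno" added, the same typo passes silently.
flagged_after = misspelled(sentence, base_dictionary | {"pruno"})

print(flagged_before)
print(flagged_after)
```

The trade-off is the one discussed in this bug: every added word helps the people who use it and slightly weakens typo detection for everyone else.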
> if it's regularly used in some regions then it should be there... This is the *American* English dictionary, for gosh sakes. Pruno belongs in this dictionary.
fearmonger and fearmongering How am I supposed to discuss politics??
preparer, as in tax preparer. (has a frequency of 205 in https://corpus.byu.edu/coca/ mentioned in comment 85) Folks arguing for inclusion, please read comment 85, and IMO anyone who knows enough to use an archaic word or actually prefers a rarer alternate spelling can right-click and Add to dictionary.
spam, as in the mail you don't want. Lowercased.
ISPs - Plural of Internet Service Provider.
metadata - There has got to be a better way to do this. So many of these words are wildly basic.
URLs - Plural is missing.
Wikipedia - It's earned it.
podcast - Needs no introduction/definition
Samsung - It's a pretty big, important company.
oyez - the thing you say at the beginning of a court case, "Oyez, oyez, oyez!", or that was said by town criers (pronounced O-yay: https://www.youtube.com/watch?v=EyveyqLQKDg)
curation
Cryptocurrency, cryptocurrencies. https://en.oxforddictionaries.com/definition/cryptocurrency http://www.dictionary.com/browse/cryptocurrency (In reply to mlissner from comment #146) > curation That's already in Firefox's dictionary: bug 1422346.
(In reply to Gingerbread Man from comment #147) > That's already in Firefox's dictionary: bug 1422346. Fantastic, my apologies! I guess I should get back to running Nightly...
Fundraiser - http://www.dictionary.com/browse/fundraiser?s=t
soffit - the underside of an architectural structure such as an arch, a balcony, or overhanging eaves.
PDF JPEG PNG XLSX Basically all of the major file extensions should be in here with plurals allowed.
Crowdsource and all variations of it: http://www.dictionary.com/browse/crowdsource?s=t
redactions - singular is in there, plural is missing. unredacted - http://www.dictionary.com/browse/unredacted
sponsorships - Singular is there, plural is missing.
streetwalking. You can refer to a streetwalker but not what she does.
olds - as in two-year-olds. See #27, here: http://www.dictionary.com/browse/olds?s=t
Onwards - plural. Singular is there; plural is missing.
'interoperability' eg: https://en.wikipedia.org/wiki/Interoperability also opened in wordlist: https://github.com/en-wl/wordlist/issues/210
Also "interoperable". I would argue against "trendline" (should be written "trend line", which is perfectly usable) and "onwards". "Onward" is an adverb, not a noun, so there is no such thing as a plural form--and the added "s" is an unnecessary colloquialism that merely encourages inconsistency. (OTOH I'll note that the related form "backwards" is well established in English.)
relatable -- something you can relate to. Can we crowdsource this list somehow?
Crowdsourcing a spelling list would be a really bad idea. You'd quickly wind up with disasters like "dependant".
Maybe we just use Wikipedia titles as the starting point. I don't know. Doing it this way is wrong.
Emeryville - City near Berkeley, CA. Population 11k.
Pixar - It has earned it. Add-ons - Plural of add-on.
blockchain - like it or not, it's a thing now
Incorporator(s) -- people that found a corporation
**** - http://www.dictionary.com/browse/****?s=t "colloquial modern alternative spelling of **** (n.), preserving the original vowel of the Old English verb."
These are the biggest companies on the NASDAQ that are missing:
NVIDIA
Amgen
Qualcomm
Priceline
Celgene
Biogen
Baidu
Altaba
Walgreens
Surely if it's a billion dollar corp, people will be writing about it. Facebook, Microsoft, Amazon, Google, and a bunch of others are indeed in there.
Firefox is awfully vanilla. It doesn't even know what a safeword is.
rebrand, rebranded, rebranding, etc.
> These are the biggest companies on the NASDAQ that are missing:
>
> NVIDIA
> Amgen
> Qualcomm
> Priceline
> Celgene
> Biogen
> Baidu
> Altaba
> Walgreens

I'd rather file this upstream. https://github.com/en-wl/wordlist/issues/211
Fine by me, so long as they get in one way or another. My energy for this topic stops here though.
Well you've contributed a lot, and that list of companies was an excellent addition. Thanks for the energy!
I work in legal data. One thing I can do is make lists of important judges. These are all SCOTUS judges that aren't in Firefox's dictionary, with their date of appointment, followed by the specific missing words. Your call how far we want to go back, but I'd make an argument for at least the more recent judges:
Neil Gorsuch (2017-04-10): Gorsuch
Elena Kagan (2010-08-06): Kagan
Sonia Sotomayor (2009-08-06): Sotomayor
Samuel A. Alito Jr. (2006-01-31): Alito
Stephen Gerald Breyer (1994-08-03): Breyer
Ruth Bader Ginsburg (1993-08-05): Bader
David Souter (1990-10-03): Souter
Antonin Scalia (1986-09-25): Scalia
Harry Blackmun (1970-05-14): Blackmun
Thurgood Marshall (1967-08-30): Thurgood
Abe Fortas (1965-08-11): Fortas
Sherman Minton (1949-10-05): Minton
James Francis Byrnes (1941-06-25): Byrnes
James Clark McReynolds (1914-08-29): McReynolds
Mahlon Pitney (1912-03-13): Mahlon, Pitney
Willis Van Devanter (1910-12-16): Devanter
Horace Harmon Lurton (1909-12-20): Lurton
Joseph McKenna (1898-01-21): McKenna
Rufus Wheeler Peckham (1895-12-09): Peckham
George Shiras Jr. (1892-07-26): Shiras
Samuel M. Blatchford (1882-03-22): Blatchford
Noah Haynes Swayne (1862-01-24): Swayne
Levi Woodbury (1846-01-03): Woodbury
John Catron (1837-03-08): Catron
Robert Trimble (1826-05-09): Trimble
Gabriel Duvall (1811-11-18): Duvall
Bushrod Washington (1798-12-20): Bushrod
James Iredell (1790-02-10): Iredell
William Cushing (1789-09-27): Cushing
Did the same analysis for American Presidents. Only missing word is "Buren" as in: Martin Van Buren (1837-01-01)
bupkis - "something worthless; nothing" http://www.dictionary.com/browse/bupkis?s=t
whitelist, whitelisted, whitelisting, etc.
You're old school. You don't know what SpaceX is!
Spitballing - To speculate; propose conclusions or possibilities : Well, I'm just spit-balling/ You're just spit-balling (1955+)
revelatory - revealing something hitherto unknown. "an invigorating and revelatory performance"
crapola - slang for ****.
empathic - http://www.dictionary.com/browse/empathic?s=t of, relating to, or characterized by empathy, the psychological identification with the feelings, thoughts, or attitudes of others: a sensitive, empathetic school counselor.
gunsmoke Some dictionaries have it, some don't. One that does: https://www.oxfordlearnersdictionaries.com/us/definition/english/gunsmoke
redirections - plural is missing, ugh, so close.
unmaintained - Not kept in good condition. https://en.oxforddictionaries.com/definition/unmaintained
TIFF and TIFFs it's a common file format.
unbox, unboxing, unboxed, etc.
Exfiltrate, exfiltrated, exfiltrating, etc.
résumé - a brief written account of personal, educational, and professional qualifications and experience, as that prepared by an applicant for a job.
recommender - a related form of recommend: http://www.dictionary.com/browse/recommender?s=t
kerning - how close letters are to each other. Goodness there are some major words missing.
distractible - Able to be distracted
neato - slang for neat. I forget the policy on slang, but this goes back to the 60's. Anyway, that's three words missing from a two sentence email I just wrote. Jeesh.
woah - an interjection "Woah — four missing words in under 15 minutes."
hornbeam - a genus of tree
oy - http://www.dictionary.com/browse/oy?s=t (used to express dismay, pain, annoyance, grief, etc.)
situ as in: in situ
snazziness - from the word snazzy.
850 words added to the dictionary: https://www.merriam-webster.com/words-at-play/new-words-in-the-dictionary-march-2018, including:
harissa - a spicy North African chili paste
wordie - Word lover
demonym - names a person who comes from a specific place, like Hoosier or Parisian
tzatziki - a Greek yogurt sauce made with cucumbers and garlic
kabocha - a kind of Japanese pumpkin
kombucha - a fermented and effervescent tea drink.
blockchain - a way of keeping track of transactions
microfinance
microcredit
glamping - combination of glamorous and camping
neoadjuvant - treatment for a disease or condition that is administered before the primary treatment in order to improve the likelihood of a successful outcome.
antifa was borrowed from the German abbreviated combination of "anti-fascist."
embiggen
subtweet
respellings
welp
Alas, the full list isn't available, but these at least were in their blog post.
hoc - as in ad hoc.
schemas - plural of schema (an alternative to schemata).
replead - to plead a case again
nystagmus - Rapid involuntary movements of the eyes.
praecipe - an order requesting a writ or other legal document.
doozy - something outstanding or unique of its kind
discoverability 'discoverable' is already included.
gerontocracy - government ruled by old people
xenophile - a person who is attracted to foreign peoples and cultures
kakistocracy - Government by the least suitable or competent citizens of a state.
reglet - a thin strip of wood
stopword - A word that is automatically omitted from a computer-generated concordance or index.
wicking, wicked, etc. Verb form is missing.
reproducibility - the ability of research to be reproduced
Bing - Google is in there already.
opioid - Ouch.
disbarments - Plural. Singular is in there.
badass - tough, uncompromising, or intimidating person. badasses - plural badass - adj
Eutrophic - Rich in mineral and organic nutrients that promote a proliferation of algae and aquatic plants, resulting in a reduction of dissolved oxygen. Used of a lake or pond.
Eutrophication - (Environmental Science) a process by which pollution from such sources as sewage effluent or leachate from fertilized fields causes a lake, pond, or fen to become overrich in organic and mineral nutrients, so that algae and cyanobacteria grow rapidly and deplete the oxygen supply
overrich - having wealth or great possessions; abundantly supplied with resources, means, or funds; wealthy:
leachate - water that has percolated through a solid and leached out some of the constituents.
cyanobacteria - a phylum of bacteria that obtain their energy through photosynthesis
glom, verb, Become stuck or attached to.
Phyllo - a kind of dough that can be stretched into thin sheets, used in layers to make pastries, especially in eastern Mediterranean cooking. (Common alternate spelling to Filo)
Antiderivative - An indefinite integral
Asymptote - a line that continually approaches a given curve but does not meet it at any finite distance. (This one is interesting because "Asymptotic" is already in the dic)
Collinear - lying on the same line
Collinearity
Differentiable - Able to be differentiated
Decile - each of ten equal groups into which a population can be divided according to the distribution of values of a particular variable.
Deciles
Quartile - each of four equal groups into which a population can be divided according to the distribution of values of a particular variable.
Quartiles
Interquartile - Between quartiles
Octant - one of the 8 divisions of the 3-D space by coordinate planes
Octants
Isometry - a mapping of a metric space onto another or onto itself so that the distance between any two points in the original space is the same as the distance between their images in the second space
Invertible - (adj.) Able to be inverted
Noninvertible - Not able to be inverted
Monomial - Consisting of one term
Multivariable - Involving two or more variable quantities
Nonnegative - Not negative (zero or positive)
Nonreal - Not real
Parametrize - describe or represent in terms of a parameter or parameters.
Piecewise - a function defined by multiple sub-functions, each sub-function applying to a certain interval of the main function's domain, a sub-domain.
extremum - the maximum or minimum value of a function.
extrema - plural form
Laplacian - Mathematical operator
integrand - a function that is to be integrated
IIRC - If I recall correctly (BTW, OTOH, and others are already in there)
switcheroo - A reversal. Sometimes I'm so proud of humanity. Then I learn that "switcheroo" isn't in Firefox's dictionary, and I do a switcheroo.
backsplash - tile behind a sink
HVAC - Heating, Ventilation, Air Conditioning
keister - A person's buttocks
Garamond - the font
weirded,weirding,etc - as in weirded out
retainage - A portion of a payment for construction work which is withheld from the contractor until the work is satisfactorily completed; the practice of withholding such a payment.
IANAL - I am not a lawyer
precedential - Of the nature of or constituting a precedent; providing a guide or rule for subsequent cases.
Iteratively - iterative is in there.
nosings - Plural, a rounded edge of a step or moulding. moulding - A shaped strip of wood or other material fitted as a decorative architectural feature, especially in a cornice. First time I've had a word in a definition ("moulding") trigger another word I needed to submit. Yikes.
MDF - Medium density fiberboard.
> moulding - A shaped strip of wood or other material fitted as a decorative
> architectural feature, especially in a cornice.

this is *British* spelling, yes? https://en.oxforddictionaries.com/definition/us/molding
Yeah, that's probably a fair point. Thanks for catching that. I'm seeing a lot of American manufacturers calling it moulding, but in hindsight, I think they're just wrong.
performant - functioning well.
thusly - done in such a manner. I added the word "thusly" to the Firefox spellchecker, *thusly*.
Summary: Add word _____ which is not in American English dictionary / recognized / suggested / misspelled by spell checker (spell checking, spelling, spell checker, en-US) → [meta] Add word _____ which is not in American English dictionary / recognized / suggested / misspelled by spell checker (spell checking, spelling, spell checker, en-US)
Willy-nilly: Without direction or planning; haphazardly.
SSN - Social Security Number
neurotoxin and neurotoxins
carte blanche - Neither word is available currently.
gazillionth - Gazillion is in there, gazillionth is not.
screener and its plural, screeners. (I was talking about the airport security screeners.)
(In reply to mlissner from comment #263) > carte blanche - Neither word is available currently. Is this really an **English** word? (It's French for "blank paper".)
Well, is burrito? It's in the dictionary? https://en.oxforddictionaries.com/definition/carte_blanche What is the English language?
Subrogation: The substitution of one person or group by another in respect of a debt or insurance claim, accompanied by the transfer of any associated rights and duties.
funder - somebody that funds something. God, we have got to do better.
reskin - replace the skin of an airplane or something else (verb)
diverter - A mechanical or electrical part used to divert electricity or water.
kibibyte, mebibyte, gibibyte, tebibyte, pebibyte, exbibyte, zebibyte, yobibyte -- All units of measurement for file sizes. See: https://en.wikipedia.org/wiki/Binary_prefix
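For anyone unsure why these units matter when writing about file sizes, here is a minimal sketch of how a byte count maps onto the binary prefixes listed above. Each step is a factor of 1024; the `to_binary_unit` helper and the `BINARY_UNITS` list are hypothetical names used only for illustration.

```python
# Unit names in increasing order; each is 1024 times the previous one.
BINARY_UNITS = [
    "byte", "kibibyte", "mebibyte", "gibibyte", "tebibyte",
    "pebibyte", "exbibyte", "zebibyte", "yobibyte",
]


def to_binary_unit(n_bytes):
    """Scale a byte count down by 1024 until it fits the largest suitable unit."""
    value = float(n_bytes)
    index = 0
    while value >= 1024 and index < len(BINARY_UNITS) - 1:
        value /= 1024
        index += 1
    return value, BINARY_UNITS[index]


print(to_binary_unit(1536))     # 1.5 of the second unit in the list
print(to_binary_unit(2 ** 30))  # exactly one gibibyte
```

Singular unit names are returned here; pluralizing them for display is left out of the sketch.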
Charlottesville - A city in Virginia. Population around 46k. Constantly in the news.
GDPR - This is an international law that is affecting websites around the world. I think it deserves an entry in here.
ad infinitum, backlit, bijection, commoditization, else's, handwrite, heliocentrism, merchanting, photosensor, pre-fill, preload, prepend, scoresheet, surjection, unrequested From https://www.reddit.com/r/firefox/comments/8zvk1k/firefox_62_how_to_improve_english_spelling/
Dox - Search for and publish private or identifying information about (a particular individual) on the Internet, typically with malicious intent. (verb)
Barcode, as well as its variations barcodes, barcoded, barcoding. See: https://en.oxforddictionaries.com/definition/barcode
Relatedly - Owing to a relationship or connection; so as to be related, in conjunction. Also as a sentence adverb: on a related point.
toodles - an exclamation
rebar - Reinforcing steel used as rods in concrete.
de facto - Both words need to be added
prepend - Opposite of append, verb.
segregable - Something you can segregate
Yowza - Used to express approval, excitement, or enthusiasm.
coupler, couplers aluminize, aluminized
retroreflector, retroreflectors trailhead, trailheads
Infringer - somebody that infringes something
Beaucoup: Lots https://en.oxforddictionaries.com/definition/beaucoup
paver - A paving stone.
stupefyingly - adverb form of stupefying
WTF - What the f***.
pounder - A person or thing weighing a specified number of pounds. Like, you know, a "¼ pounder with cheese."
accrete - add little by little
cardinality - The number of elements in a set or other grouping, as a property of that grouping.
newish - fairly new
post-partum - After birth
criminalization - The action of turning an activity into a criminal offence by making it illegal. https://en.oxforddictionaries.com/definition/criminalization
tung - An oil used as a drying agent in inks, paints, and varnishes. https://en.oxforddictionaries.com/definition/tung_oil
OMG - Oh my god.
sucky - very bad or unpleasant
Recordings - plural of recording
merchantability - The condition, state, or quality of being merchantable; saleability. (Chiefly in legal contexts.).
From: https://www.merriam-webster.com/words-at-play/new-words-in-the-dictionary-september-2018
biohack - verb
fintech - Financial technology
fav - synonym for fave (a verb and noun)
bougie - bourgeois
ribbie - a spelling based on RBI in baseball land
adorbs - adorable
rando - a random person
zuke - a zucchini
avo - an avocado
guac - short for guacamole
iftar - the meal taken by Muslims at sundown
gochujang - Korean chili paste
mise - from mise en place
hophead - one who likes beer
mocktail - an alcohol-free cocktail
hangry - angry from hunger
Latinx - (Capitalized) - a gender-neutral alternative to Latino or Latina
bingeable - something you can binge
Sorry, one more: biohack - also a noun, can be pluralized
(In reply to mlissner from comment #305)
> Sorry, one more:
>
> biohack - also a noun, can be pluralized

biohack isn't in the dictionary. you meant biohacking, biohacker?
decisis - from "stare decisis" the legal principle that lets one case rest on the findings of previous cases.
jeesh - another way of saying jeez.
affordance - Psychology. A property of an object or an aspect of the environment, especially relating to its potential utility, which can be inferred from visual or other perceptual signals; (more generally) a quality or utility which is readily apparent or available. (used a lot with software design — what affordances does a new feature provide?)
arrestee - Somebody that is being arrested
Gadzooks - an exclamation of excitement
bizarro - Bizarre
en masse - In a group, altogether.
overbroad - Of an accent: too broad. rare.
pro bono - Denoting work undertaken without charge, especially legal work for a client on low income.
requestor - alternative of requester.
kludgy - Poorly done.
uncopyrightable - Cannot have a copyright
'theming' eg: https://en.wikipedia.org/wiki/Theming also opened in wordlist: https://github.com/en-wl/wordlist/issues/233
headspace - Several definitions, but I'm thinking of the "room to think" definition.
dispositive - Relating to or bringing about the settlement of an issue or the disposition of property.
superset - A set which includes another set or sets.
evidentiary - another term for evidential
discoverable - Able to be discovered or found; findable.
unencrypted - Not encrypted
Congressionally - As in "Congressionally mandated"
uptime - time during which a piece of equipment (such as a computer) is functioning or able to function
Rebecca - A person's name
integrations - Plural
habeas corpus - A writ requiring a person under arrest to be brought before a judge or into court, especially to secure the person's release unless lawful grounds are shown for their detention. (corpus is a word, habeas is not.)
amongst - among
rambutan - A red, plum-sized tropical fruit with soft spines and a slightly acidic taste.
seldomly - Something that's not done frequently is done seldomly.
aspirational - of, relating to, or characterized by aspiration
Petaluma - a city in western California north of San Francisco population 57,941
'monetization' eg: https://en.wikipedia.org/wiki/Monetization also opened in wordlist: https://github.com/en-wl/wordlist/issues/238
nutjob - a mentally unbalanced person : a crazy person
et al - and others