Closed
Bug 1144254
Opened 7 years ago
Closed 7 years ago
naïve is not in the en-US dictionary
Categories
(Core :: Spelling checker, defect)
Tracking
()
RESOLVED
FIXED
mozilla46
Tracking | Status | |
---|---|---|
firefox46 | --- | fixed |
People
(Reporter: jrmuizel, Assigned: jorgk-bmo)
References
Details
Attachments
(1 file, 2 obsolete files)
764 bytes, patch | ehsan.akhgari: review+
It should be
Comment hidden (obsolete) |
Comment hidden (obsolete) |
Comment 3•7 years ago
(Comment 1 is incorrect, as discussed in bug 1183512 comment 2. Reopening & tagging that comment as obsolete.)
Summary: naïve is not in the dictionary → naïve is not in the en-US dictionary
Assignee | ||
Comment 4•7 years ago
Looks like the spelling with the diaeresis has its merit: https://en.wiktionary.org/wiki/na%C3%AFve Also: naïvely, naïveness, naïveté. Ekanan, can you please make this change? As I said in comment #2, "naïve" is in the British dictionary and I don't see why it shouldn't be in the US dictionary.
Status: RESOLVED → REOPENED
Flags: needinfo?(ananuti)
Resolution: INVALID → ---
Comment 5•7 years ago
OK, both OFD and M-W have this in en-US. Patch coming.
Flags: needinfo?(ananuti)
Comment 6•7 years ago
Assignee: nobody → ananuti
Attachment #8702227 -
Flags: review?(ehsan)
Comment 7•7 years ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1c5235e686f8 Let's see if the `ï` breaks things.
Comment 8•7 years ago
Comment on attachment 8702227 [details] [diff] [review]
Add naïve, naïvely, naïver, naïvest, naïveness and naiveness to the en-US dictionary

Unfortunately, we can't fix this bug without UTF-8 in the affix file. *sigh* If we use UTF-8, the spellchecker will treat non-Latin words as misspelled. See bug 1162823. :( Maybe WONTFIX?
Attachment #8702227 -
Flags: review?(ehsan)
Assignee | ||
Comment 9•7 years ago
Can you please elaborate further?

> Unfortunately, we can't fix this bug without UTF-8 in the affix file.

So why don't we use UTF-8? Is there a word missing?

> If we use UTF-8, the spellchecker will treat non-Latin words as [not?] misspelled.

I assume "naïve" classifies as non-Latin. Looking at en-GB.dic (the one maintained by Marco A.G.Pinto), I see:

naive/YT
naiveness
naivete/Z
naivety/SM
naiveté/SM
naïve/Y
naïveness
naïvety/S
naïveté/S

So why does it work there? I'm using the GB dictionary to write this comment and it works just fine. en-GB.aff has:

SET UTF-8

The other word typically spelled with an accent is "résumé" (as in CV): https://en.wikipedia.org/wiki/R%C3%A9sum%C3%A9 In the GB dictionary I see:

résumé/S

In fact, if you grep for á, é, í or ó in the GB dictionary, you find heaps, like Bogotá (https://en.wikipedia.org/wiki/Bogot%C3%A1, capital of Colombia, country in South America) or cliché (https://en.wikipedia.org/wiki/Clich%C3%A9).

With all due respect to you and Ehsan, I think we should move the maintenance of the dictionary out of Core::Spelling Checker. This is really a community effort and shouldn't involve (busy) core developers. I think the French model is great (from bug 1229406): you request words here http://www.dicollecte.org/dictionary.php?prj=fr and it gets done for you.
Assignee | ||
Updated•7 years ago
Summary: naïve is not in the en-US dictionary → naïve is not in the en-US dictionary (and neither are many other accented terms that have Wikipedia entries, like Bogotá or cliché).
Assignee | ||
Comment 10•7 years ago
OK, perhaps comparing to en-GB is not the right thing to do. So let's compare to the add-on en-US dictionary from https://addons.mozilla.org/en-US/firefox/addon/united-states-english-spellche/. The affix file says:

SET ISO8859-1

(so Latin with some accented characters, etc.). Now let's look for some words:

clichéd
cliché/SM
(heaps of words with é)

They also have Bogotá. And they have:

naive/SRTYP
naiveté/SM
naivety/MS

They don't have naïve. Anyway, I don't see why the en-US dictionary that ships with Mozilla products should be worse than others.
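Both affix files quoted so far announce their encoding with a `SET` line, and the matching .dic file has to be read with that encoding. A minimal sketch of looking the value up, assuming the affix file is available as bytes; the helper name `declared_encoding` and the one-line sample are made up for illustration and are not part of any Mozilla or Hunspell tooling:

```python
def declared_encoding(aff_bytes):
    """Return the value of the first SET line in affix-file bytes, or None."""
    for raw in aff_bytes.splitlines():
        # The SET line itself is plain ASCII; other lines may not be.
        line = raw.decode("ascii", errors="replace").strip()
        if line.startswith("SET "):
            return line.split()[1]
    return None


# Illustrative sample, not the real en-US.aff contents.
sample_aff = b"SET ISO8859-1\n"
encoding = declared_encoding(sample_aff)

# A .dic entry stored in that encoding decodes cleanly with the declared name.
entry = b"na\xefve".decode(encoding)
print(encoding, entry)
```

Python accepts `ISO8859-1` and `UTF-8` directly as codec names, so the declared value can be passed straight to `decode` for these two cases.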
Assignee | ||
Comment 11•7 years ago
Why would UTF-8 be necessary, why is ISO8859-1 not good enough? Let's fix all the issues in bug 1235506.
Depends on: 1235506
Comment 12•7 years ago
(In reply to Jorg K (GMT+1) from comment #11)
> Why would UTF-8 be necessary, why is ISO8859-1 not good enough?

I have no idea. You can try out the build from here: http://archive.mozilla.org/pub/firefox/try-builds/ananuti@gmail.com-1c5235e686f856f58812313da6d9b1272d35757f/ If you paste `naïve` into a textarea, you'll see the red underline. If you replace `ISO8859-1` with `UTF8`, the red underline will disappear. But we can't use UTF8 (bug 1162823). Feel free to investigate further; bug 1164263 is open.

> Let's fix all the issues in bug 1235506.

Go for it :)
Assignee | ||
Comment 13•7 years ago
I don't see why I'd need a try run for adding one word to the dictionary. I simply added "naïve" to the en-US.dic I already have on my system. I did so in Notepad++ on Windows and made sure the file encoding was "ANSI", which is ISO8859-1. "naïve" works just fine.

Your mistake was that you added the word and saved the file as UTF-8. You can see it in your patch. And surely, if you present a UTF-8 file to the spellchecker and pretend it's ISO8859-1, it ain't working ;-)

Conclusion: If we decide that we want it, "naïve" in all its variations can be added without a problem. As I suggested in bug 1235506, we should also add the word to the yet-to-be-created "Mozilla knows better" file.
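The mismatch described here can be reproduced outside the spellchecker. The following is only a sketch of the byte-level effect, assuming a reader that trusts the declared ISO8859-1 encoding; it does not model how the spellchecker itself loads the file:

```python
word = "naïve"

utf8_bytes = word.encode("utf-8")        # ï is stored as two bytes: C3 AF
latin1_bytes = word.encode("iso8859-1")  # ï is stored as one byte: EF

# What a reader sees if the file is really UTF-8 but is decoded as ISO8859-1:
misread = utf8_bytes.decode("iso8859-1")

print(len(utf8_bytes), len(latin1_bytes))
print(misread == word)
```

The misread entry has six characters instead of five, so it can never match "naïve" as typed by the user, which is consistent with the red underline seen in the try build.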
Assignee | ||
Comment 14•7 years ago
Comment on attachment 8702227 [details] [diff] [review]
Add naïve, naïvely, naïver, naïvest, naïveness and naiveness to the en-US dictionary

Wrong encoding (UTF-8) used for the patch. It should be ISO8859-1. In fact, the word addition is encoded in UTF-8, yet the checkin comment is in ISO8859-1:

Add na, naly, nar, nast, naness and naiveness to the en-US dictionary.
Attachment #8702227 -
Flags: feedback-
Updated•7 years ago
Assignee: ananuti → nobody
Updated•7 years ago
Attachment #8702227 -
Attachment is obsolete: true
Assignee | ||
Comment 15•7 years ago
Requested at SCOWL: https://github.com/kevina/wordlist/issues/139
Status: REOPENED → NEW
Assignee | ||
Comment 16•7 years ago
OK, expanding the current en-US.dic file and looking for "naiv" I get 10 words:

naive
naively
naiver
naivest
naivete <-- this is really naiveté without the accent. No need to add ï there.
naivete's <-- same here.
naivety
naivety's
naiveté
naiveté's

Therefore we should add 8 words:

naïve
naïvely
naïver
naïvest
naïvety - see https://en.wikipedia.org/wiki/Naivety
naïvety's
naïveté
naïveté's

Patch coming.
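The selection above can be written down as a small rule: take every expanded "naiv" word, skip the two unaccented "naivete" forms, and swap in the ï. This is only a sketch of that reasoning over the ten words listed in this comment; it does not expand the real en-US.dic:

```python
# The ten expanded words listed in this comment.
existing = [
    "naive", "naively", "naiver", "naivest",
    "naivete", "naivete's",
    "naivety", "naivety's",
    "naiveté", "naiveté's",
]

# Skip "naivete" and "naivete's" (accent-less spellings of naiveté);
# every other word gets an ï variant.
to_add = [
    w.replace("naiv", "naïv", 1)
    for w in existing
    if not w.startswith("naivete")
]

print(len(to_add))
print(to_add)
```

The `startswith("naivete")` check does not catch "naiveté", because its seventh character is "é" rather than "e", so both accented forms are kept.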
Assignee | ||
Comment 17•7 years ago
Note the ANSI/windows-1252 encoding of the patch.
Attachment #8710941 -
Flags: review?(ehsan)
Assignee | ||
Comment 18•7 years ago
Changing the summary back to what it was. Accented words got added in bug 1238031.
Assignee: nobody → mozilla
Status: NEW → ASSIGNED
Summary: naïve is not in the en-US dictionary (and neither are many other accented terms that have Wikipedia entries, like Bogotá or cliché). → naïve is not in the en-US dictionary
Assignee | ||
Comment 19•7 years ago
(In reply to Jorg K (GMT+1) from comment #17)
> Note the ANSI/windows-1252 encoding of the patch.

I meant to say ISO 8859-1. Same thing for the purpose of the patch. Details: https://en.wikipedia.org/wiki/Windows-1252

This character encoding is a superset of ISO 8859-1, but differs from the IANA's ISO-8859-1 by using displayable characters rather than control characters in the 80 to 9F (hex) range.
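A quick check of why the two encodings are interchangeable for this patch, sketched with Python's built-in `cp1252` and `iso8859-1` codecs: "naïve" produces identical bytes in both, and the encodings only part ways inside the 80 to 9F range.

```python
word = "naïve"

# ï is byte EF in both encodings, so the patch bytes are identical.
same_bytes = word.encode("cp1252") == word.encode("iso8859-1")

# Byte 80 shows the difference between the two encodings.
euro_cp1252 = b"\x80".decode("cp1252")     # a displayable character (euro sign)
ctrl_latin1 = b"\x80".decode("iso8859-1")  # the control character U+0080

print(same_bytes, euro_cp1252 == "€", ctrl_latin1.isprintable())
```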
Assignee | ||
Comment 20•7 years ago
Oops, forgot to update word count in the first line.
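Since the first line of the .dic file carries the word count, adding entries by hand means touching that line too. A hedged sketch of doing both in one step; the helper `add_words` and the two-entry sample are hypothetical and are not how the actual patch was produced:

```python
def add_words(dic_text, new_words):
    """Append new entries to .dic text and refresh the count on line 1."""
    lines = dic_text.splitlines()
    entries = lines[1:]  # line 0 holds the entry count
    existing = set(entries)
    # Entries are compared verbatim, flags included (e.g. "naive/SRTYP").
    entries += [w for w in new_words if w not in existing]
    return "\n".join([str(len(entries))] + entries) + "\n"


# Illustrative sample, not the real en-US.dic contents.
sample = "2\nnaive/SRTYP\nnaivety/MS\n"
updated = add_words(sample, ["naïve", "naïvely"])
print(updated)
```

When writing the result back to disk, the file would still need to be saved in the encoding the affix file declares, for example `open(path, "w", encoding="iso8859-1")`.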
Attachment #8710941 -
Attachment is obsolete: true
Attachment #8710941 -
Flags: review?(ehsan)
Attachment #8710996 -
Flags: review?(ehsan)
Updated•7 years ago
Attachment #8710996 -
Flags: review?(ehsan) → review+
Assignee | ||
Comment 21•7 years ago
Dear Sheriff, this patch changes three lines in the en-US dictionary. I promise, no test will fail due to this. Please combine with other patches when landing. Thanks.
Keywords: checkin-needed
Comment 22•7 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/61910fcb5817
Keywords: checkin-needed
Comment 23•7 years ago
bugherder |
https://hg.mozilla.org/mozilla-central/rev/61910fcb5817
Status: ASSIGNED → RESOLVED
Closed: 7 years ago → 7 years ago
status-firefox46:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla46