Closed
Bug 637949
Opened 13 years ago
Closed 7 years ago
that'll & that'd are flagged as misspelled
Categories
(Core :: Spelling checker, defect)
Tracking
()
RESOLVED
FIXED
mozilla54
People
(Reporter: dholbert, Assigned: RyanVM)
References
Details
(Keywords: regression)
STEPS TO REPRODUCE:
1. Visit http://pastebin.mozilla.org/
2. Type "that'll" or "that'd" in the text box

ACTUAL RESULTS: Red underline (flagged as misspelled)
EXPECTED RESULTS: No red underline

Trunk gives me ACTUAL RESULTS.
Firefox 3.6.13 gives me EXPECTED RESULTS.

Mozilla/5.0 (X11; Linux x86_64; rv:2.0b13pre) Gecko/20110301 Firefox/4.0b13pre

NOTE: I tried some variations with different base words (it'll/it'd/you'll/you'd) and they all worked as expected. "that" seems to be the odd one out.
Comment 1•13 years ago
Can you test with 3.6.14 please?
Comment 2•13 years ago
This could be a regression from the hunspell update (bug 579649) that landed for 1.9.2.14...
Reporter | ||
Comment 3•13 years ago
You're right - this is broken in Firefox 3.6.14. Mozilla/5.0 (X11; U; Linux i686 (x86_64); en-US; rv:1.9.2.14) Gecko/20110218 Firefox/3.6.14
Assignee | ||
Comment 4•13 years ago
Broken with Hunspell 1.3.2 as well.
Comment 5•13 years ago
Nemeth, is this something to be fixed on the hunspell side?
Comment 6•13 years ago
Ehsan, Could you check the English dictionaries? In my platform both of British and American English dictionaries contain these forms, too:

$ grep "'\(d\|ll\)" en_US.dic | tr '\n' ' '
I'd I'll cont'd he'd he'll it'd it'll rec'd she'd she'll somebody'll someone'll spec'd that'd that'll there'd there'll they'd they'll this'll today'll we'd we'll what'd where'd who'd who'll you'd you'll

But the typographical apostrophe (’ = U+2019) could be a problem. You need correct UTF-8 -> 8-bit conversion for 8-bit dictionaries: U+2019 -> U+0027 (ASCII apostrophe). But if your English dictionaries are UTF-8 encoded, you have to add these abbreviated forms with typographical apostrophes to the dic file, or the following input conversion to the aff file:

ICONV 1
ICONV ’ '

It's useful to define output conversion, too:

OCONV 1
OCONV ' ’
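A minimal sketch of what the ICONV/OCONV lines suggested in comment 6 would do, assuming a simplified character-by-character mapping. The names `ICONV`, `OCONV`, `convert`, `is_known` and the small `DICTIONARY` sample are illustrative stand-ins and not Hunspell's API; in a real setup the tables live in the aff file and the library applies them.

```python
# Hypothetical stand-ins for the aff-file tables from comment 6.
# ICONV: applied to input before lookup; OCONV: applied to output (suggestions).
ICONV = {"\u2019": "'"}   # typographical apostrophe -> ASCII apostrophe
OCONV = {"'": "\u2019"}   # ASCII apostrophe -> typographical apostrophe

# Tiny sample of dictionary entries, stored with ASCII apostrophes.
DICTIONARY = {"that'd", "that'll", "it'd", "it'll"}


def convert(word, table):
    """Replace each character that has an entry in the table (simplified)."""
    return "".join(table.get(ch, ch) for ch in word)


def is_known(word):
    """Look the word up after input conversion, as ICONV would arrange."""
    return convert(word, ICONV) in DICTIONARY


if __name__ == "__main__":
    print(is_known("that\u2019ll"))     # typed with U+2019
    print(convert("that'll", OCONV))    # suggestion shown with U+2019
```

Without the input conversion, a word typed with U+2019 would not match an entry stored with U+0027, which is the mismatch comment 6 warns about.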
Comment 7•13 years ago
(In reply to comment #6)
> Ehsan, Could you check the English dictionaries? In my platform both of British
> and American English dictionaries contain these forms, too:
>
> $ grep "'\(d\|ll\)" en_US.dic | tr '\n' ' '
> I'd I'll cont'd he'd he'll it'd it'll rec'd she'd she'll somebody'll someone'll
> spec'd that'd that'll there'd there'll they'd they'll this'll today'll we'd
> we'll what'd where'd who'd who'll you'd you'll

ehsanakhgari:~/moz/mozilla-central/extensions/spellcheck/locales/en-US/hunspell [03:53:23]$ grep "'\(d\|ll\)" en-US.dic | tr '\n' ' '
I'd I'll he'd he'll it'd it'll rec'd she'd she'll they'd they'll we'd we'll who'd who'll why'd you'd you'll

Does that mean that this will be fixed by just adding these to the dictionary? Why was it not broken before the hunspell upgrade we had? Our en_US.dic has not changed in ages...

> But the typographical apostrophe (’ = U+2019) could be a problem. You need
> correct UTF-8 -> 8-bit conversion for 8-bit dictionaries: U+2019 -> U+0027
> (ASCII apostrophe).

How would we do that? Are you talking about the usage of U+2019 in en-US.dic?

> But if your English dictionaries are UTF-8 encoded, you have to add these
> abbreviated forms with typographical apostrophes to the dic file, or the
> following input conversion to the aff file:
>
> ICONV 1
> ICONV ’ '
>
> It's useful to define output conversion, too:
>
> OCONV 1
> OCONV ' ’

Is there any drawback to specifying these directives in the affix file anyways?
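The two grep outputs quoted in comments 6 and 7 can be compared directly. The sketch below only uses the word lists as they appear in those comments; the variable names are made up for illustration, and it says nothing about affix rules, which could in principle also produce some of these forms.

```python
# Word lists copied from the grep outputs in comment 6 (upstream) and
# comment 7 (mozilla-central en-US.dic). Variable names are illustrative.
upstream = set("""
I'd I'll cont'd he'd he'll it'd it'll rec'd she'd she'll somebody'll
someone'll spec'd that'd that'll there'd there'll they'd they'll this'll
today'll we'd we'll what'd where'd who'd who'll you'd you'll
""".split())

mozilla = set("""
I'd I'll he'd he'll it'd it'll rec'd she'd she'll they'd they'll we'd
we'll who'd who'll why'd you'd you'll
""".split())

# Forms listed upstream but absent from the Mozilla list, and vice versa.
missing_in_mozilla = sorted(upstream - mozilla)
only_in_mozilla = sorted(mozilla - upstream)

if __name__ == "__main__":
    print("missing in Mozilla list:", " ".join(missing_in_mozilla))
    print("only in Mozilla list:", " ".join(only_in_mozilla))
```

Both "that'd" and "that'll" fall in the first group, which is consistent with the idea that adding the words to the dictionary would address this bug.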
Comment 8•8 years ago
Mozilla/5.0 (X11; Linux i686; rv:49.0) Gecko/20100101 Firefox/49.0 I have tested this issue on Ubuntu 12.04 x32, Mac OS X 10.11 and Windows 10 x64, with the latest Firefox release (46.0.1) and the latest Nightly (49.0a1-20160531030258) and managed to reproduce it. I have typed "that'll" and "that'd" in the text box and both have been flagged as misspelled.
OS: Linux → All
Comment 9•7 years ago
Same as bug 422982?
Reporter | ||
Comment 10•7 years ago
I don't think so.
(1) Bug 422982 goes away when you change lines (per bug 422982 comment 1), whereas this bug does not.
(2) Bug 422982 only affects rich text fields & not plain text fields (per bug 422982 comment 10), whereas this bug *does* affect plain text fields.
(3) Bug 422982 mentions "couldn't" and "isn't" being flagged as misspelled, but those words are fine if I use them with this bug's STR (typing them in at https://pastebin.mozilla.org/ ).
Comment 11•7 years ago
I cannot reproduce this. When one reports bugs like this, it is very important to report the actual dictionary used (en-US SCOWL, en-GB from Marco Pinto, etc.) and its version. Otherwise, trying to reproduce the bug is a shot in the dark.
Assignee | ||
Comment 12•7 years ago
The en-US dictionary we ship is a tweaked version of upstream SCOWL. You can see it at: https://dxr.mozilla.org/mozilla-central/source/extensions/spellcheck/locales/en-US/hunspell And FWIW, I see both that'd and that'll being flagged as misspelled inside this text box as I write the comment :)
Comment 13•7 years ago
I'm on Ubuntu right now, where Firefox uses the system-wide dictionaries installed via the package manager. Ubuntu ships the old English dictionary from 2007 that can be found on the Hunspell SourceForge site. With that dictionary this bug is not reproducible. I was able to reproduce the bug with en-US SCOWL, so it's a dictionary bug and should be delegated to SCOWL. In summary:
Old dic en_US - OK
Marco en_GB - OK
SCOWL en_US - BAD
Assignee | ||
Comment 14•7 years ago
Hey Kevin, any thoughts for how to fix this?
Flags: needinfo?(kevin.bugzilla)
Comment 15•7 years ago
Submit an issue at https://github.com/en-wl/wordlist/issues.
Assignee | ||
Comment 16•7 years ago
Is this https://github.com/en-wl/wordlist/issues/122? If so, we'd need to change WORDCHARS to 0123456789’ IIUC.
Comment 17•7 years ago
No. The words are missing, that is all. If you submit an issue I will consider adding them.
Comment 18•7 years ago
Nope, it's probably not that. Funnily enough, Hunspell the library never actually uses WORDCHARS. Those characters are meant to be passed on to the host application so that it can take them into account when parsing text. The command-line hunspell uses that field. AFAIK, Mozilla's text parser doesn't use WORDCHARS at all, and that is a good thing. From today's perspective, WORDCHARS should be made obsolete and carefully designed text parsing should be put inside the library. The issue is probably in the affixes.
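To illustrate why WORDCHARS is a matter for the host application's text parser, here is a rough sketch of a tokenizer that takes an extra set of word characters. The function `tokenize` and its `wordchars` parameter are hypothetical; this is neither Mozilla's parser nor the command-line hunspell's, only a simplified picture of how such a setting changes which strings get handed to the spell checker.

```python
def tokenize(text, wordchars=""):
    """Split text into words; letters plus any character in `wordchars`
    count as part of a word (a simplified, hypothetical host-side parser)."""
    words, current = [], []
    for ch in text:
        if ch.isalpha() or ch in wordchars:
            current.append(ch)
        elif current:
            # A non-word character ends the current word.
            words.append("".join(current))
            current = []
    if current:
        words.append("".join(current))
    return words


if __name__ == "__main__":
    # Apostrophe not a word character: the contraction is split in two.
    print(tokenize("that'll work"))
    # Apostrophe treated as a word character: the contraction stays whole.
    print(tokenize("that'll work", wordchars="'"))
```

In this bug the host already passes "that'll" through as one word, which fits the later finding that the entries were simply missing from the dictionary.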
Assignee | ||
Comment 19•7 years ago
Done, thanks.
See Also: → https://github.com/en-wl/wordlist/issues/163
Comment 20•7 years ago
Dimitrij: I am the author of the upstream wordlist. The words are not in the upstream word list; see: http://app.aspell.net/lookup?dict=en_US;words=that%27ll%0D%0Athat%27d%0D%0A. They may be added to Mozilla's version of the dictionary, but I do not have time to check.
Flags: needinfo?(kevin.bugzilla)
Comment 21•7 years ago
Fixed by bug 1333648.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Comment 22•7 years ago
See the added words here: https://hg.mozilla.org/mozilla-central/rev/5dc60728bb7e#l5.60
Assignee | ||
Updated•7 years ago
Assignee: nobody → ryanvm
status-firefox52: --- → affected
status-firefox53: --- → affected
status-firefox54: --- → fixed
Assignee | ||
Updated•7 years ago