Closed Bug 1049349 Opened 11 years ago Closed 7 years ago

Don't autocorrect a correct but low-frequency word, e.g. yikes (unless the correction is *really* high-frequency)

Categories

(Firefox OS Graveyard :: Gaia::Keyboard, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: dholbert, Unassigned)

Details

Attachments

(2 files)

STR: In an SMS, type the word "yikes" and look at the autocorrect results (or just hit space to let autocorrect do its magic).

EXPECTED RESULTS: Autocorrect should treat "yikes" as a valid English word.

ACTUAL RESULTS: Autocorrect doesn't recognize it. Its best guess is "yokes".

"Yikes" is totally a valid English word. Dictionary.com page: http://dictionary.reference.com/browse/yikes
It's also in Firefox Desktop's spellcheck dictionary, as shown by: http://mxr.mozilla.org/mozilla-central/search?string=yikes&find=extensions%2Fspellcheck

I can reproduce this on my Flame (running a 1.3-based build) as well as in Firefox OS Simulator 2.0.
OS: Linux → All
Hardware: x86_64 → All
Hmm, the word is in the dictionary in 1.4 and up, so it seems we hit a bug...
So this is defined behavior. 'yokes' has a much higher frequency (60) than 'yikes' (18). When the matching is done, 'yokes' has a score of 7.014, while 'yikes' has a score of 4. That difference is bigger than 1.3, so our algorithm for words like ill->I'll kicks in and autocorrects to 'yokes'. So in theory this is all correct.

But I don't think we should do it for low-frequency matches. So if suggestion[0][1] is below X (10? 15?), then don't run this code. For example, these tests still work in that case:
'ill' has: [ [ "I'll", 22.57 ], [ "ill", 17 ] ]
'wont' has: [ [ "won't", 21.85 ], [ "wont", 8 ] ]

@djf, what do you think?
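One possible reading of the rule in comment 2, as a minimal sketch. The function name `shouldAutocorrect`, the constant names, and the use of a subtractive difference are assumptions made for illustration; none of this is taken from the Gaia keyboard source. The scores are the ones quoted above.

```javascript
// Hypothetical sketch of the rule described in comment 2; not the Gaia code.
const MARGIN = 1.3;        // "bigger than 1.3" from the comment
const MIN_TOP_SCORE = 10;  // the proposed X; the comment floats 10 or 15

// suggestions: [[word, score], ...] sorted by descending score.
// typedScore: score of the word the user typed (itself a dictionary word).
// minTopScore: absolute floor on the top score; 0 reproduces the behavior
// described as current.
function shouldAutocorrect(suggestions, typedScore, minTopScore = MIN_TOP_SCORE) {
  if (suggestions.length === 0) {
    return false;
  }
  const topScore = suggestions[0][1];
  // Proposed guard: skip the override when the best match is itself weak.
  if (topScore < minTopScore) {
    return false;
  }
  // Existing idea: override the typed word only if the top match is far ahead.
  return topScore - typedScore > MARGIN;
}

const yikes = [["yokes", 7.014], ["yikes", 4]];
const ill = [["I'll", 22.57], ["ill", 17]];
const wont = [["won't", 21.85], ["wont", 8]];

console.log(shouldAutocorrect(yikes, 4, 0)); // no floor: "yikes" is overridden
console.log(shouldAutocorrect(yikes, 4));    // floor of 10: "yikes" is kept
console.log(shouldAutocorrect(ill, 17));     // still corrected to "I'll"
console.log(shouldAutocorrect(wont, 8));     // still corrected to "won't"
```

With these numbers, a floor of either 10 or 15 leaves the 'ill' and 'wont' cases unchanged, because their top scores (22.57 and 21.85) are above both candidate values.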
Flags: needinfo?(dflanagan)
Attached image screenshot
(In reply to Jan Jongboom [:janjongboom] (Telenor) from comment #2)
> So this is defined behavior. Yokes has a much bigger frequency (60) than
> yikes (18). When the matching is done, 'yokes' has a score of 7.014, while
> 'yikes' has a score of 4. That difference is bigger than 1.3, so our
> algorithm for words like ill->I'll kicks in

But "yikes" isn't *any* of the 3 suggestions. As shown here in this screenshot from Simulator w/ Firefox OS 2.0, it suggests "yokes", "hikes", and "yoke's".

Is "yikes" really lower-frequency than "yoke's" and "hikes" (particularly with whatever likelihood-bumping is introduced by the fact that I *did* actually hit the buttons for "y", "i", "k", etc., not an "h" and not an "o")?
(In reply to Daniel Holbert [:dholbert] from comment #3)
> Created attachment 8468523 [details]
> screenshot
>
> (In reply to Jan Jongboom [:janjongboom] (Telenor) from comment #2)
> > So this is defined behavior. Yokes has a much bigger frequency (60) than
> > yikes (18). When the matching is done, 'yokes' has a score of 7.014, while
> > 'yikes' has a score of 4. That difference is bigger than 1.3, so our
> > algorithm for words like ill->I'll kicks in
>
> But "yikes" isn't *any* of the 3 suggestions. As shown here in this
> screenshot from Simulator w/ Firefox OS 2.0, it suggests "yokes", "hikes",
> and "yoke's".
>
> Is "yikes" really lower-frequency than "yoke's" and "hikes" (particularly
> with whatever likelihood-bumping is introduced by the fact that I *did*
> actually hit the buttons for "y", "i", "k", etc., not a "h" and not a "o")

That's intended behavior. We basically show 4 suggestions there. The first is the one you typed in (the X button), then the one we would autocorrect to (in blue), then two more. See bug 927286.
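The layout described in this comment can be modeled roughly as follows. This is only an illustrative sketch: `buildSuggestionBar` and its field names are hypothetical, and the real keyboard UI code may be organized quite differently.

```javascript
// Hypothetical model of the suggestion bar described above; not Gaia code.
// typed: the word exactly as the user entered it.
// ranked: candidate words, best first (may or may not include the typed word).
function buildSuggestionBar(typed, ranked) {
  // Leave the typed word out of the candidate slots so it is not shown twice.
  const others = ranked.filter((word) => word !== typed);
  return {
    keepTyped: typed,                  // the "X" slot: keep what was typed
    autocorrect: others[0] ?? null,    // the slot highlighted in blue
    alternatives: others.slice(1, 3),  // the two remaining slots
  };
}

console.log(buildSuggestionBar("yikes", ["yokes", "hikes", "yoke's"]));
```

Under this model, typing "yikes" correctly fills the first slot with "yikes" itself, so the three words that look like suggestions ("yokes", "hikes", "yoke's") are the autocorrect candidate plus two alternatives.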
OK -- but even if I misspell it "Yikws" (fat-fingering the "e" and hitting "w" above it instead), I don't get "yikes" as any of the suggestions. Instead, I'm suggested "yokes", "hikes", and "yuk's". For the sake of argument, I'll assume that "yokes" and "hikes" are there because we consider them much more frequent than "yikes" (despite being further away in terms of edit-distance), per comment 2. But I can't believe that's the case for "yuk's"... This is really looking like "yikes" is just not in the dictionary. :)
Daniel: the wordlist on which the dictionary is based is at apps/keyboard/js/imes/latin/dictionaries/en_wordlist.xml. It is straight out of the Android AOSP repo. It says that "yuk's" has a frequency of 25/255 and "yikes" has a frequency of 18/255. If I type "uikes", I get "yikes" as one of the suggestions, though it still wants to autocorrect to "hikes".

Because of patent issues we can't display "yikes" if that is the word you actually typed, so it is expected that you would not see it as one of the suggestions when you type it correctly. But you're right that it should not autocorrect to something else. We only want that to happen for really high-frequency things like "ill" to "I'll".

(And for what it's worth, I agree with you that yikes is a totally useful word that should have a higher frequency in the wordlist, but maybe you and I are outliers... who are we to argue with Google about word frequency :-)

Jan: what are we currently doing to decide whether to autocorrect? Is it the ratio of the weight of the highest-weighted suggestion to the weight of the user's input that we consider? I think you're right that there should be an absolute threshold. (And, in addition, a close word-length match as you've proposed in another bug.)
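To put numbers on the ratio question, here is a small sketch that uses only the scores quoted in comment 2. `scoreRatio` and `autocorrectByRatio` are hypothetical names, and the defaults of 1.3 and 10 are the figures floated in this thread, not values confirmed from the code.

```javascript
// Scores quoted in comment 2: [typed word, typed score, top suggestion, top score].
const cases = [
  ["yikes", 4, "yokes", 7.014],
  ["ill", 17, "I'll", 22.57],
  ["wont", 8, "won't", 21.85],
];

// Hypothetical helper: how many times heavier the top suggestion is than the input.
function scoreRatio(typedScore, topScore) {
  return topScore / typedScore;
}

// Hypothetical combined rule: a ratio test plus an absolute floor on the top score.
function autocorrectByRatio(typedScore, topScore, minRatio = 1.3, minTopScore = 10) {
  return topScore >= minTopScore && scoreRatio(typedScore, topScore) > minRatio;
}

for (const [typed, typedScore, top, topScore] of cases) {
  const ratio = scoreRatio(typedScore, topScore).toFixed(2);
  console.log(`${typed} -> ${top}: ratio ${ratio}, autocorrect ${autocorrectByRatio(typedScore, topScore)}`);
}
```

On these numbers the yikes/yokes ratio (about 1.75) is higher than the ill/I'll ratio (about 1.33), so a ratio test alone cannot separate the two cases. In this sketch it is the absolute floor that leaves "yikes" alone while "ill" and "wont" are still corrected.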
Flags: needinfo?(dflanagan)
(In reply to David Flanagan [:djf] from comment #6)
> en_wordlist.xml [...] straight out
> of the Android AOSP repo. [...] says that "yuk's" has a frequency of 25/255
> and yikes has a frequency of 18/255.

I stand corrected, then. (And I'm curious where AOSP's word list comes from, but that's not a discussion for this bug.)

Thanks for the clarification!
Update summary according to comment 6.
Summary: Add "yikes" to b2g autocorrect dictionary → Don't autocorrect a correct but low-frequency word, e.g. yikes
Summary: Don't autocorrect a correct but low-frequency word, e.g. yikes → Don't autocorrect a correct but low-frequency word, e.g. yikes (unless the correction is *really* high-frequency)
Firefox OS is not being worked on
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX