Closed Bug 1773802 Opened 2 years ago Closed 2 years ago

Multi-spellchecker: English + Russian ignores all errors in Cyrillic text

Categories

(Core :: Spelling checker, defect, P3)

Firefox 101
defect

Tracking

()

RESOLVED FIXED
103 Branch
Tracking Status
firefox-esr102 102+ fixed
firefox102 --- fixed
firefox103 --- fixed

People

(Reporter: mentin, Assigned: dminor)

References

Details

Attachments

(2 files, 1 obsolete file)

Steps to reproduce:

  1. Open some web site with text box.
  2. Type some words in English and Russian, mixing correct and invalid words, e.g.
    "correct правильный, incarrect непровильный"
    (correct English and correct Russian words, followed by grammar errors in both)
  3. Enable 'Check Spelling',
  4. Make sure enabled languages are 'English (United States)' and 'Russian'

Actual results:

When both English and Russian spellcheckers are enabled,
the Latin words are correctly checked with English spellchecker,
but any Cyrillic word is assumed correct and is never flagged as invalid.

I.e. with example above only the third word is flagged.
The fourth word, while being invalid one is not flagged, which is a bug.

The workaround is to turn off English spellchecker, which
flags all Latin words and incorrect Cyrillic words.

I.e. with example above the first, third and fourth word are flagged.

Expected results:

I expect to be able to turn on both English and Russian spellcheckers, and
that to flag words that are neither correct English, nor correct Russian words.

I.e. with example above I expect the third and fourth word to be flagged,
like it works in e.g. Chrome or Microsoft Word.

The Bugbug bot thinks this bug should belong to the 'Core::Spelling checker' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Spelling checker
Product: Firefox → Core

Have multiple simultaneously active spell checkers worked before?

Flags: needinfo?(bugs)
Flags: needinfo?(bugs) → needinfo?(dminor)

The ability to enable multiple spell checkers is a relatively new feature - see bug 1402822, and deep history in bug 69687.
So it is a defect report for a new feature, which is not yet working in a usable way, not a regression.

I think we have a problem when spellchecking with the English dictionary and another language which uses a non-latin alphabet, this has been reported for Greek and Hebrew as well. (Bugs 1767932, Bug 1772762). I've had no problems with multilingual spellchecking with English and Spanish, and when I investigated the Greek bug, Spanish and Greek seemed to work as expected.

Flags: needinfo?(dminor)
See Also: → 1767932, 1772762

This seems rather bad bug. Dan, do you think you have time to take a look.

Severity: -- → S3
Flags: needinfo?(dminor)
Priority: -- → P3

Sure, I'll have a look.

Assignee: nobody → dminor
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(dminor)

It's definitely something with the character sets, with just the en-US default dictionary installed, in the test string correct правильный, incarrect непровильный only incarrect is flagged as misspelled, but the Cyrillic text should show up as misspelled as well.

For some reason, the en-US dictionary is accepting these strings.

This adds a check to see if the encoded word is not empty and does not start
with the null character. Hunspell accepts C-style strings and marks the empty
string as correctly spelled. This prevents other dictionaries from detecting
misspelled strings in languages that use a different charset, which has lead
to problems when using the en-US dictionary packaged with Firefox and Greek,
Hebrew and Russian dictionaries, where misspellings are not detected in the
non-English language.

These were added in Bug 1162823, but there is no explanation given
as to why we would want to ignore non-Latin words when spellchecking
when using the en-US dictionary. Doing this makes it impossible to
spellcheck using the en-US dictionary and a non-Latin dictionary at
the same time.

Depends on D149899

Attachment #9282409 - Attachment is obsolete: true

As a user and reporter of this bug, I think the behavior introduced in Bug 1162823 did make some sense when user could only select a single dictionary. At that time, having all non-Latin words in mixed text highlighted as errors only made it harder to find real errors in English words, so while it is a questionable behavior, it was probably overall win for the user.

But now with support for multiple simultaneous dictionaries, this behavior is no longer needed or beneficial, thanks for making this change.

Pushed by dminor@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/38138f07343b
Ignore empty strings when spellchecking; r=smaug
Pushed by mlaza@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/9792b73e9e1a
Specify charset in new test; r=smaug CLOSED TREE
Regressions: 1776285
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 103 Branch

Comment on attachment 9282233 [details]
Bug 1773802 - Ignore empty strings when spellchecking; r=smaug!

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: Without this patch, multilingual spellchecking doesn't work properly when using the en-US dictionary and a non-latin character set.
  • User impact if declined: Broken multilingual spellchecking for users who want to spellcheck English and Russian, Hebrew, Greek or other languages at the same time.
  • Fix Landed on Version: 103
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Low risk, because the patch is very small and the functionality is covered by unit tests.
Attachment #9282233 - Flags: approval-mozilla-esr102?
Attachment #9282632 - Flags: approval-mozilla-esr102?

Comment on attachment 9282233 [details]
Bug 1773802 - Ignore empty strings when spellchecking; r=smaug!

Beta/Release Uplift Approval Request

  • User impact if declined: Broken multi-lingual spellchecking when using en-US dictionary and a language with a non-latin character set like Russian, Greek, Hebrew.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Low risk, the patch is small and is covered by automated tests.
  • String changes made/needed: None
  • Is Android affected?: Unknown
Attachment #9282233 - Flags: approval-mozilla-beta?
Attachment #9282632 - Flags: approval-mozilla-beta?
Attachment #9282233 - Flags: approval-mozilla-beta? → approval-mozilla-release?
Attachment #9282632 - Flags: approval-mozilla-beta? → approval-mozilla-release?
QA Whiteboard: [qa-103b-p2]

Comment on attachment 9282233 [details]
Bug 1773802 - Ignore empty strings when spellchecking; r=smaug!

Approved for 102.0.1, thanks.

Attachment #9282233 - Flags: approval-mozilla-release? → approval-mozilla-release+

Comment on attachment 9282632 [details]
Bug 1773802 - Specify charset in new test; r=smaug

Approved for 102.0.1, thanks.

Attachment #9282632 - Flags: approval-mozilla-release? → approval-mozilla-release+

Comment on attachment 9282233 [details]
Bug 1773802 - Ignore empty strings when spellchecking; r=smaug!

Approved for 102.1esr.

Attachment #9282233 - Flags: approval-mozilla-esr102? → approval-mozilla-esr102+

Comment on attachment 9282632 [details]
Bug 1773802 - Specify charset in new test; r=smaug

Approved for 102.1esr.

Attachment #9282632 - Flags: approval-mozilla-esr102? → approval-mozilla-esr102+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: