Open Bug 1073827 (spell-lang-change) Opened 10 years ago Updated 10 months ago

[meta] Spellchecker fails to honour/remember user's choice of language/dictionary

Categories

(Core :: Spelling checker, defect)

People

(Reporter: smjg, Unassigned)

References

(Depends on 4 open bugs)

Details

(Keywords: meta)

Attachments

(1 obsolete file)

A number of people have found that they can't get the language they select for spellchecking (either in about:config or from the context menu) to stay put.  This is a tracker for bug reports of this nature.
Adding dependencies.  Some of them might be duplicates - need to look through them and see.
Alias: spell-lang-change
Depends on: 1073840
Keywords: meta
This thread on mozillaZine is the best general description of the problem, including attempts to resolve it. However, the only effective workaround remains to remove the less-used dictionaries that Firefox is switching to.

http://forums.mozillazine.org/viewtopic.php?f=38&t=2865587
(In reply to Paul from comment #2)
> However the only effective workaround remains to remove the 
> less-used dictionaries that Firefox is switching to.

By "less-used" do you mean:
(a) less used by the user having the problem?  Then it wouldn't make sense.  Uninstalling that dictionary, and installing it again on the odd occasions that one wants to use it, doesn't make sense at all.  If on the other hand nobody who uses that computer uses said dictionary at all, _then_ it would make sense to uninstall it.
(b) less used by the general world population?  It isn't always one of these that gets switched to - for example, I've frequently had it switching to en-US against my will.
Case (a), i.e. less frequently used from the user's perspective. I agree that this workaround is not satisfactory.

It's probably more common for a non-native (or even simply non-US) English speaker to suffer this problem, when Firefox switches to en-US per your experience.
Depends on: 1139550
At the request of a user I had a close look at the code involved and it seems the logic is broken in nsEditorSpellCheck.cpp

If I read the code correctly (please correct me if I'm wrong), the editor first checks whether a content-pref is stored (and if so, sets it and bails). If not, it gets the document language and stores this in a global pref (spellchecker.dictionary), so that the *next* document visited that has no content-pref stored and no document language defined (all too common) uses the pref from the previous document. That seems to be all sorts of wrong and leads to the confusing, erratic behavior seen.

Similarly, if a document is loaded with a content-pref, the content-pref language is stored in that global pref which will be applied to the next document loaded that has neither content-pref nor document language.

You're also clearing the content-pref if it matches the document language - which will in turn cause issues as well if you're visiting sites with multiple language documents with defined doc languages.
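
To make the leak described above concrete, here is a minimal Python sketch. It is a toy model of the behaviour as this comment describes it, not the code in nsEditorSpellCheck.cpp; the class, method and host names are all hypothetical, and the clearing of content prefs is left out.

```python
# Toy model of the behaviour described in this comment, not Gecko code.
# All names (SpellCheckModel, load, the hosts) are hypothetical.

class SpellCheckModel:
    def __init__(self, default="en-US"):
        self.content_prefs = {}   # host -> dictionary stored for that host
        self.global_pref = None   # stands in for spellchecker.dictionary
        self.default = default

    def load(self, host, doc_lang=None):
        """Return the dictionary picked for a document on `host`."""
        if host in self.content_prefs:
            chosen = self.content_prefs[host]
        elif doc_lang is not None:
            chosen = doc_lang
        elif self.global_pref is not None:
            chosen = self.global_pref
        else:
            chosen = self.default
        # The criticised side effect: every load overwrites the global
        # pref, so the result leaks into the next document.
        self.global_pref = chosen
        return chosen


model = SpellCheckModel()
model.load("example.de", doc_lang="de-DE")
# No content pref and no document language: inherits the previous page.
leaked = model.load("no-lang.example")
print(leaked)  # de-DE
```

In this model the second page is checked in German only because of the page visited before it, which is the erratic behaviour being reported.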

It probably just needs to be made a lot simpler: 
1) Check if a content pref is stored, if so use it. If not, check user language preference as selected by the user (spellchecker.dictionary should only be written if a user selects a global preference for language other than their own locale - might even be about:config only, although having a UI for it would be great). If not set, check document language and use that.
2) If a user selects a different dictionary in the context menu, store the content pref.
3) Do not clear the content pref based on doc language because other docs on the same host may have different languages.
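
The three points above could be sketched roughly as follows. This is only an illustration of the proposal, not a patch: the function names and the plain dict standing in for the content prefs are hypothetical, and the final locale fallback is an assumption taken from the "use locale" step mentioned later in this thread.

```python
# Sketch of the proposed order; names and data structures are hypothetical.

def choose_dictionary(host, content_prefs, user_pref, doc_lang, locale):
    """First available value wins:
    per-host content pref > global user pref > document language > locale."""
    for candidate in (content_prefs.get(host), user_pref, doc_lang, locale):
        if candidate:
            return candidate
    return None


def on_user_selects(host, dictionary, content_prefs):
    """Point 2: a context-menu choice is stored per host. Nothing here
    clears it based on the document language (point 3)."""
    content_prefs[host] = dictionary


prefs = {}
print(choose_dictionary("a.example", prefs, None, "fr", "en-GB"))  # fr
on_user_selects("a.example", "en-GB", prefs)
print(choose_dictionary("a.example", prefs, None, "fr", "en-GB"))  # en-GB
```

Note that the lookup has no side effects: loading a document never writes a pref, so one page cannot influence the next.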
I guess the third point also adds to the confusion of the broken logic here: if a user gets a wrong language for a site with no content-pref, and sets the document language to the proper one when noticing wrong spell checking, it'll get stored, but then cleared again later on, making it once again dependent on the global pref.
I guess the first criterion should be the language of the page which has the form. Likely the user will respond in that language.
(In reply to Boris 'pi' Piwinger from comment #7)
> I guess the first criterion should be the language of the page which has the
> form. Likely the user will respond in that language.

Not necessarily. There are many situations where a document language would not necessarily determine what language a user writes in - not to mention many pages not defining the language correctly which is often done site-wide.

But what I proposed would still adhere to this if the user hasn't set a globally preferred language with the pref. (content-pref (per host) > global pref (spellchecker.dictionary) > doc pref (form/content header) > use locale)
(In reply to Boris 'pi' Piwinger from comment #7)
> I guess the first criterion should be the language of the page which has the
> form. Likely the user will respond in that language.

This tracker is about honouring the _user's_ choice of spellchecker language.  Controlling the ability of websites to override the user's choice is part of this.  See bug 1073840.
Mark - what is this "content-pref"?  I've never seen any facility whereby the user can configure the spellchecker language on a per-site basis.  Do you have any idea which Mozilla browsers/versions thereof provide access to this feature?
Stewart,

If you select a dictionary to use from the context menu on a form element, it'll be stored in the content prefs (content-prefs.sqlite), and should be remembered from that point forward. Because of this bug, that's not always the case.

I don't know when this was introduced, but I gather a long time ago.
Attached patch Patch for dictionary selection (obsolete) — Splinter Review
This is a proposed patch for the logic. I'm not sure if it applies cleanly to anything here (probably not) since it's against my own browser's source, but should be a starting point.

What this does: prioritizes content-prefs > spellchecker.dictionary.override user pref > content > user locale > LANG env > en-US > any.

spellchecker.dictionary is no longer used. I chose a different name for the override because it may previously have been set by existing logic and it would be frozen to something unknown otherwise.

I hope this helps fix the issue.
This is a tracker bug.  No patches should be posted here.  Please post this patch under the specific blocking bug that the patch fixes.
OK.. hah, I'll just pick one at random then?
Posted in bug 1073840 since it seemed to be the best match of all the similar reports this meta bug tracks. Feel free to obsolete the patch attachment here.
Attachment #8612220 - Attachment is obsolete: true
I think most (all?) of those bugs are actually duplicates of the problem Mark is describing, but I don't have enough knowledge of the internals to be sure. If someone who knows more about it can take 10 minutes to look at them all and mark as many as possible as duplicates, that would be awesome, as it would allow the discussion and fix to happen in one place. (Right now as a user I don't know which one to follow, put myself on CC for, and/or vote for.)
There are issues in at least two distinct areas:
- the ability of websites to override the user's choice of language
- persistence of the user's choice of language itself

So clearly they aren't all duplicates of each other.  I've had a look and tried to identify which are duplicates, but can't easily tell because the steps either are too vague or rely on the user having specific dictionaries installed.
Stewart, thanks for trying anyway!

If Mark is right (and I think he is), the persistence problem is caused by the buggy behaviour attempting to allow websites to override it, and his patch will fix both. The problem, of course, is that we won't know for sure until the patch is thoroughly tested, and we can't go marking bugs as duplicates based on a theory. ¯\_(ツ)_/¯
It may be the case that some of these bugs aren't identical but have the same root cause.  I'm convinced that fixing bug 959785 is the key to fixing at least a few of them.  Mark - any prediction of what effect your patch will have on this one?
I think it all has the same root cause.

This root cause being that the choice from a previous load affects the next load through the pref that persists the dictionary setting across documents (and then having the next document be a toss up of any of the things that can determine the dictionary language depending on if they are set or not, and in case of a language match can even clear a previously user-set per-site preference).
As-is I already had to draw a flowchart to understand the current sequence of events so it should probably just be vastly simplified and clearly isolated per document which it currently isn't.

I don't think however that my patch is 100% correct as it is (it seems to prevent proper storage of dictionary languages in per-site prefs) so it needs a good look-over. I ended up just adding an override for the time being, but it's still buggy.
It seems to me that your patch isn't a fix for any specific bug that's listed as a dependency of this one.  As such, I think it should be under a new bug report, set to block other bugs as appropriate.
Chipping in here following on from bug 909040. With the current non-smart locale selection (I'll get to what I mean by that later) the language of a site should never override the user preference. There are many more spellcheckers out there than there are locales on site forms. Case in point, Facebook, where I have seen people write in way more languages than Facebook itself is offered in. Then there is of course the scenario where users do not use an available localization for various reasons but still prefer their spellchecker locale to be a certain locale unrelated to the site language or even the browser locale. As a 3rd or 4th level option, I could live with the site language impacting the selection of the spellchecker but not as an override to user choice.

Now coming to my 'smart selection' - all suggestions seem to focus on the locale of a certain page, which in my view is a terrible predictor of the likely language a user will write in. IMHO, if we were to be really smart, we would somehow make the browser language-aware, i.e. there's the base setting, whether user-selected or driven by the site locale, but beyond that the browser could compare the words typed against the installed dictionaries and, if the percentage of matching words goes over a certain threshold, switch language. So if I started air sgrìobhadh ann an cànan eile nach eil 'na Bheurla, it would at some point start checking that sentence using a different dictionary.

If we wanted to be even smarter, we'd allow that on a sentence by sentence basis because all multilinguals codeswitch and it's very far from reality that a user will always stick to a certain language fully, even within the same text. And the more informal it gets, the more we switch and the more we need a smart spellchecker.
(In reply to Michael Bauer from comment #22)
> If we wanted to be even smarter, we'd allow that on a sentence by sentence
> basis because all multilinguals codeswitch and it's very far from reality
> that a user will always stick to a certain language fully, even within the
> same text. And the more informal it gets, the more we switch and the more we
> need a smart spellchecker.

Hmm.  The danger of that is that an innocent misspelling might inadvertently trigger the spellchecker to switch languages.  Of course, it wouldn't decide which language to use based on a single word.  But a typical example is a UK English speaker/writer mistyping "labelled" as "labeled" (I've seen it happen), thereby causing it to switch to US English because that's the language in which the greatest number of the last 20 words, the sentence, the paragraph or whatever happen to be in the dictionary.
If you set the % high enough, it shouldn't be too much of an issue. We could block it between regional variants like en-US and en-GB. Getting it wrong between en-US and fr-FR or zh-TW is much less likely. I use a predictive texting app like that and it works quite well, though I'd personally set the % threshold a bit higher.

Plus we could offer a user choice for disabling it.

But either way, defaulting to en-US all the time over manual user choice is not good.
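
A rough Python sketch of the threshold idea discussed in the last few comments, with switching blocked between regional variants of the same language. Everything here is hypothetical: the function names, the 0.8 threshold and the tiny word sets standing in for real dictionaries. It is not an existing Firefox feature.

```python
# Hypothetical sketch of threshold-based dictionary switching.

def base_language(tag):
    """Primary language subtag, e.g. 'en' for 'en-GB'."""
    return tag.split("-")[0].lower()


def hit_ratio(words, vocabulary):
    """Fraction of the typed words that the vocabulary knows."""
    return sum(1 for w in words if w.lower() in vocabulary) / len(words)


def suggest_switch(words, current, dictionaries, threshold=0.8):
    """Return the dictionary to switch to, or `current` if none qualifies.

    `dictionaries` maps a language tag to a set of lowercase words.
    """
    if not words:
        return current
    best = current
    best_ratio = hit_ratio(words, dictionaries.get(current, set()))
    for tag, vocabulary in dictionaries.items():
        # Never flip between regional variants such as en-GB and en-US.
        if base_language(tag) == base_language(current):
            continue
        ratio = hit_ratio(words, vocabulary)
        if ratio >= threshold and ratio > best_ratio:
            best, best_ratio = tag, ratio
    return best


toy = {
    "en-GB": {"the", "labelled", "box", "is"},
    "en-US": {"the", "labeled", "box", "is"},
    "gd": {"air", "ann", "an", "eile"},
}
print(suggest_switch(["the", "labeled", "box"], "en-GB", toy))    # en-GB
print(suggest_switch(["air", "ann", "an", "eile"], "en-GB", toy))  # gd
```

With the variant block in place, the "labeled" typo from the earlier comment cannot push a UK English writer to en-US, while a run of words that only another language's dictionary recognises can still trigger a switch.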
Depends on: 1200533
Some work was done in bug 1073840 on the issues summarised here.

Sadly, the proposed solution was opposed and has fallen through:
https://groups.google.com/forum/#!topic/mozilla.dev.platform/Et02D8Mk2d0

Instead, a solution that started at the UI was suggested. As instructed, I contacted the UX team and got told (quote):
> I think I'd defer to ehsan's comment about this being a complicated issue full of pitfalls.
> Not a priority for us to work on right now.

So any solution has been postponed indefinitely.
See Also: → 682564
(In reply to Jorg K (GMT+2) from comment #25)
> So any solution has been postponed indefinitely.

Solution to what, exactly?
All of it.

Please read this carefully:
https://groups.google.com/forum/#!topic/mozilla.dev.platform/Et02D8Mk2d0
Quoting:

===
No amount of changes to the fallback paths in the code mentioned above
will make things work well for everyone.  That is why in my previous
email I said that we should get out of the guessing game and let the
user actually tell us what they want to happen. 

Believe me when I say that *every single time* we have tried to "fix"
something here, someone has told us that they actually want a different
behavior.  As such, I think that any change here must be performed as a
larger project to fix our spell checking UI.  Even fixing bugs in this
code has proved to cause more issues.

I have looked at this code for years.  May I suggest that looking at
things for a few days may not give you the full picture of where the
problems are?  :-) 
===

Ehsan, the reviewer of this area, apparently doesn't want to touch it. Instead he wants a "top down" solution starting at the user interface, but the user interface team is not interested.
Depends on: 1200186
Is this a general view of the UI team or just someone who maybe didn't quite get the issue? I've had this before, especially if the person at the other end is, well, monolingual.
We convinced the powers that be that fixes are necessary. Watch bug 1200533.
However, the current behaviour will be maintained more or less.
I forgot to say: Of the initial "Depends on" list, two bugs have already been closed. Once bug 1200533 lands, I will close these nine as well:
bug 455235, bug 728069, bug 836230, bug 853970, bug 858666, bug 909040, bug 923356, bug 932925, bug 1200186.
After carrying out adequate testing to verify that the patch for bug 1200533 has indeed fixed these others, I presume?
These others are so vague that they cannot be tested adequately.

To do a proper test you need to:
1) remove the content preferences in content-prefs.sqlite (or start with a new profile).
2) Reset spellchecker.dictionary before you start.
3) report all the dictionaries installed.
4) report all the sites you visited.
5) report all the actions taken.

None of the bugs do that. So they will all be closed. After bug 1200533 lands, the expected behaviour will be crystal clear. If something goes "wrong", it will be due to a misunderstanding.
Now that bug 1200533 and bug 717433 have landed, the spell checker no longer fails to honour/remember user's choice of language/dictionary.

The priorities for selecting the dictionary are clearly defined as follows:
1) Content preference, so the language the user set for the site before.
   (Introduced in bug 678842 and corrected in bug 717433.)
2) Language set by the website, or any other dictionary that partly
   matches that. (Introduced in bug 338427.)
   Eg. if the website is "en-GB", a user who only has "en-US" will get
   that. If the website is generic "en", the user will get one of the
   "en-*" installed, (almost) at random.
   However, we prefer what is stored in "spellchecker.dictionary",
   so if the user chose "en-AU" before, they will get "en-AU" on a plain
   "en" site. (Introduced in bug 682564.)
3) The value of "spellchecker.dictionary" which reflects a previous
   language choice of the user (on another site).
   (This was the original behaviour before the aforementioned bugs
   landed).
4) The user's locale.
5) The dictionary that is currently set.
6) The content of the "LANG" environment variable (if set).
7) The first spell check dictionary installed.

Note that the preference "spellchecker.dictionary", which serves as a fallback (see 2) and 3) above), will *only* be set when the user manually sets a dictionary via the context (right-click) menu.

Let me repeat what the priorities mean: They mean that the language is determined by many factors, "spellchecker.dictionary" is only one of them. It is *not* correct to assume that the language stored in "spellchecker.dictionary" is the one that will be used when clicking in a text input field.
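
For illustration, the seven priorities above can be modelled in a few lines of Python. This is a sketch of the list as written in this comment, not the Gecko implementation. The function and parameter names are hypothetical, the "(almost) at random" partial match is simplified to "first installed match", steps 3) to 6) are assumed to require an exact installed dictionary, and the normalisation of LANG (e.g. "en_GB.UTF-8" to "en-GB") is also an assumption.

```python
# Sketch of the priority list above; names and simplifications are
# hypothetical, see the accompanying text.

def base(tag):
    """Primary language subtag, e.g. 'en' for 'en-GB'."""
    return tag.split("-")[0].lower()


def match_installed(wanted, installed, preferred=None):
    """Exact match first, then any dictionary sharing the base language.
    Among partial matches, `preferred` wins if it is one of them."""
    if not wanted:
        return None
    for name in installed:
        if name.lower() == wanted.lower():
            return name
    partial = [name for name in installed if base(name) == base(wanted)]
    if preferred in partial:
        return preferred
    return partial[0] if partial else None


def select_dictionary(installed, content_pref=None, site_lang=None,
                      global_pref=None, locale=None, current=None,
                      lang_env=None):
    if not installed:
        return None
    # 1) Content preference stored for the site.
    if content_pref in installed:
        return content_pref
    # 2) Site language, with partial matching that prefers the value of
    #    spellchecker.dictionary (global_pref).
    from_site = match_installed(site_lang, installed, preferred=global_pref)
    if from_site:
        return from_site
    # 3) to 6) spellchecker.dictionary, locale, current dictionary, LANG.
    env_tag = lang_env.split(".")[0].replace("_", "-") if lang_env else None
    for candidate in (global_pref, locale, current, env_tag):
        if candidate in installed:
            return candidate
    # 7) First dictionary installed.
    return installed[0]


print(select_dictionary(["en-US"], site_lang="en-GB"))  # en-US
print(select_dictionary(["en-US", "en-AU"], site_lang="en",
                        global_pref="en-AU"))           # en-AU
print(select_dictionary(["en-US", "ru-RU"], site_lang="en",
                        global_pref="ru-RU"))           # en-US
```

The last call shows why the value of "spellchecker.dictionary" is not necessarily the language used: a site that declares "en" wins at step 2) before the global preference is consulted at step 3).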

Most of the bugs on which this meta-bug depends reflect a misunderstanding by the user.

I will therefore close most of these bugs which only gave a vague problem description. The bugs can be reopened, if and only if a clearly reproducible test case meeting the following requirements can be supplied:

1) Content preferences in content-prefs.sqlite removed before the test (or test with a new profile).
2) Preference "spellchecker.dictionary" reset.
3) All dictionaries installed are reported.
4) All actions taken and all sites visited are reported.
(In reply to Jorg K (GMT+2) from comment #33)
> Note that the preference "spellchecker.dictionary", which serves as a
> fallback (see 2) and 3) above), will *only* be set when the user manually
> sets a dictionary via the context (right-click menu).
OK, I have to admit that I was wrong. I discovered bug 1204147 where a content preference was created and the preference "spellchecker.dictionary" was set without user interaction. This bug is fixed, the fix will land today or tomorrow.

> Most of the bugs on which this meta-bug depends reflect a misunderstanding
> by the user.
While this may be so, bug 1204147 has potentially caused an awful lot of confusion, since unwanted things happened behind the user's back.

If any strange behaviour is observed in the future, it is very important to have a 100% reproducible test case (see previous comment).
Depends on: 1204147
Depends on: 1193293
Too bad, another bug, bug 1193293. Once again, the user choice got ignored on first click. Works on second click. Quite confusing.
Depends on: 1205983, 1209220, 1207713
I just read bug 682564, which took me here, as it is not fixed (in Firefox 42.0) from my point of view. If I set the spell checker to use English (AU), I expect it to be locked on to that from that point on. I don't want the browser changing to English (US) no matter what content is on any page. This is the way preferences are supposed to work. I will change it manually if needed. Why is this so difficult to fix?

If there are people who want this kind of smart switching (people who regularly do text entry in more than one language), would it not be better to do this as an add-on and simplify the coding? It would not be the first time core functionality is dropped from Firefox.
(In reply to Scott R from comment #36)
> I just read bug 682564 which took me here as it is not fixed (in Firefox
> 42.0.) from my point of view.
Please test in the latest nightly build as bug 697981 and bug 1205983, genuine bugs erroneously switching back to English(US), were fixed in Firefox 44.
Some of the dependent bugs were already fixed in Firefox 43, but for the fully fixed behaviour, you need to try Firefox 44.
Depends on: 1128294
Depends on: 1437111

Looks like I found a pattern: the spellchecker.dictionary preference is ignored if the page's <html> tag contains a language code which matches the bundled dictionary.

Example:

Result:

  • if webpage contains <html lang="en"> Firefox will use bundled dict by default instead of spellchecker.dictionary pref
  • if webpage contains <html lang="ru"> or just <html> (without language code) Firefox will use ru-EN dict

Tested on Firefox 73

This is very annoying, for example, on GitHub, where some of the discussions in our project are conducted in English and some in Russian, so I need to check the spelling in different languages within the same site.

The last two comments look to me like bug 1073840.

My preference for the webpage language is French. As a consequence, when I visit something like a bug tracker, the interface is sometimes in French (when available), even though in practice, the comments from the users are always written in English. So in many cases, matching the webpage language does not make sense.

So either the user's default choice should be preferred over the webpage language, or there should be an option to let the user choose whether their choice should take precedence on sites not yet visited.

Another solution would be to autodetect the language(s) when the user enters text. That would be like bug 1203024, which had been marked as a duplicate of the much more general bug 69687 (I don't know why).

Severity: normal → S3