Closed Bug 1073840 Opened 10 years ago Closed 9 years ago

Pref: Always honour setting of spellchecker language/dictionary in user prefs, regardless of lang attribute or Content-Language header

Categories

(Core :: Spelling checker, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: smjg, Assigned: jorgk-bmo)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 9 obsolete files)

User-Agent: Mozilla/5.0 (Windows NT 6.0; rv:32.0) Gecko/20100101 Firefox/32.0

At the moment, it appears that the language to select is determined as described in §8.1.2 of the HTML 4.01 specification.
http://www.w3.org/TR/html401/struct/dirlang.html#h-8.1.2

<quote>
An element inherits language code information according to the following order of precedence (highest to lowest):

    The lang attribute set for the element itself.
    The closest parent element that has the lang attribute set (i.e., the lang attribute is inherited).
    The HTTP "Content-Language" header (which may be configured in a server). For example:

    Content-Language: en-cockney

    User agent default values and user preferences.
</quote>

The HTML5 specification makes a similar statement, albeit explained in a very different way:
http://www.w3.org/TR/html5/dom.html#the-lang-and-xml:lang-attributes

Note that "User agent default values and user preferences" is at the bottom of the list.  So while in principle the browser is acting according to the specs, in practice it is a nuisance.  The user has gone to the trouble to select a spellchecking language, and the website is being given power to override the user's selection.

In most cases, this wasn't even the developer's intention.  In a webmail or CMS interface, for example, the lang attribute is likely to reflect the language of the UI, and doesn't mean that the email or content that the user is editing is in that language.  You could argue that in such an interface any text field should have lang="" to indicate that the language is unknown.  There are two problems with this argument:
(a) "" wasn't a valid value for lang prior to HTML5
(b) the developer is liable to neglect to set this attribute (Mail.com is guilty of this, for example)

I therefore propose a setting to enable the user to override the behaviour prescribed in the W3C specs and instead just spellcheck in the language that the user has set in the prefs.  I suggest a three-way setting:
* follow the W3C specs (current behaviour)
* use the user-set language unless overridden at the level of the input element (e.g. <textarea lang="eo">)
* always use the user-set language

It's all about giving power to the user and meeting user expectations.
Is there positive opinion about this bug on Mozilla side? I could start to look into it if there is some.
I comment this bug just because this issue is annoying as hell for a multilingual user like me. I've installed 4 dictionaries (en_US it_IT en_FR and es_ES) and the selected dictionary is randomly changing according to some choice made by an unknown webmaster of a casual visited webpage. :'(

I've looked for this issue in Mozilla bugzilla and there are a lot of bugs (many duplications) filed that may be solved from the introduction of this pref:

bug #717433 #728069 #853970 #858666 #909040 #922586 #923356 #926166 #932925 #959785 #992852
(In reply to Snertev from comment #2)
> bug #717433 #728069 #853970 #858666 #909040 #922586 #923356 #926166 #932925
> #959785 #992852

Of these, all apart from bug 922586 are already listed as blockers of bug 1073827.  Some of them are vaguely stated, such that it's hard to tell which are duplicates.  But any help in identifying duplicates and marking them as such would be most appreciated!
Since I got slapped on the wrist for posting this in the tracking bug first, posting here as it seems to be the best match of the many deps/duplicates.

This is a proposed patch for the dictionary logic. I'm not sure if it applies cleanly to anything here (probably not) since it's against my own browser's source, but should be a starting point.

What this does: prioritizes content-prefs > spellchecker.dictionary.override user pref > content > user locale > LANG env > en-US > any.

spellchecker.dictionary is no longer used. I chose a different name for the override because it may previously have been set by existing logic and it would be frozen to something unknown otherwise.

I hope this helps fix the issue.
Ehsan, do you think we should cut through the "determine a language" confusion one day?

The meta-bug 1073827 depends on currently 16 other bugs, including this one, for example bug 697981, were you've already been involved (for example, bug 697981 comment #4 down to bug 697981 comment #30).

Mark has made a start. I think, as long as we can define a priority (as already proposed in comment #4), this isn't rocket-science. Perhaps you could give the patch a quick look, it's not long. I know: "Don't ask for review, please", but who else should I ask? Note that the spelling of "preferred" was corrected (was "prefered"), so lines where no functionality has changed appear as changed. It took me a while so see the difference.

I haven't tried the patch, perhaps it needs to be refreshed. My commments are:
Don't change "spellchecker.dictionary" to "spellchecker.dictionary.override", you'll break many add-ons. Also, I'm not sure whether this should really be removed:
} else {	
  // If user sets a dictionary matching (even partially), lang defined by	
  // document, we consider content pref has been canceled, and we clear it.	
  ClearCurrentDictionary(mEditor);

What do you think?
Flags: needinfo?(ehsan)
Coming up with a useful answer here requires a lot of investigation, and I don't really have time to do that right now.  Sorry.
Flags: needinfo?(ehsan)
If you or someone else has a clear proposal on what they want to do, please post to dev-platform.
This is Mark's patch refreshed. Things move fast in Mozilla, and the directory where the patched file resided doesn't even exist any more ;-(
I removed the new preference "spellchecker.dictionary.override" for the time being.

I added what was necessary in order not to break the editors with eEditorMailMask set, since they do NOT use the preference.

This patch establishes the following priority:
1) content-prefs (not for eEditorMailMask editors).
2) user preference spellchecker.dictionary (not for eEditorMailMask editors).
3) page content (eEditorMailMask editors want this!!)
4) user locale, or any dictionary matching it partially.
5) LANG environment variable
6) en-US
7) any dictionary found.

The patch does this:
a) reversed the order of user preference 2) and page content 3).
   Without the patch, the page content had a higher priority than
   the user preference.
b) Never writes to the user preference value. Before that always got changed.
c) Never clears the content preference. *This is a problem*.
   We need to test the case were - say - the page is in "en", the user sets "de-DE"
   and later changes to "en-US". This must clear the "de-DE" or else, we can never
   get rid of it. Unless we adopt point d) (below) where the content preference is 
   always set.
d) Calls SetCurrentDictionary when mPreferredLang is empty. *This is a problem*.
   So the content preference is always written, perhaps by accident.
   However, that has a positive side effect (and also solves point c)):
   On Bugzilla, the document language is en-US. When I set en-AU, this never got
   registered since "en" matches "en". Switching to another tab and returning 
   to the BMO page always reverted to en-US. Now the en-AU sticks, since its comes
   from the stored content preference.
   If this behaviour is desired, we should remove the use of mPreferredLang
   and store the content preference unconditionally.

Questions to Mark:
1) Does the refresh look right to you?
2) How does the user set the preference spellchecker.dictionary?
   There is no option anywhere in the user interface apart from about:config.
3) How would you like to address problems c) and d) pointed out above?
   Were you aware that moving the setting of mPreferredLang further down had
   the accidental side-effect of always writing out the content preference?

P.S.: When submitting a patch, please use "hg qnew" and "hg qref" to prepare them.
Attachment #8612239 - Attachment is obsolete: true
Flags: needinfo?(mark)
The refresh looks fine, but the thing with 'spellchecker.dictionary" (and the reason why I used ".override") is that spellchecker.dictionary was previously used to temporarily store the lang found (when not stored in contentprefs otherwise).

What I think should happen, ultimately, is that spellchecker.dictionary is used to (temporarily!) hold the language for the current document (determined by the priority you listed) but not stored otherwise. IOW what is set and stored in contentprefs should not directly depend on spellchecker.dictionary (including clearing of the contentpref after it has previously been set if it happens to align with the content language); *only the user* should be able to set the contentprefs by choice.

So, what I think needs to happen is pretty much the same, with the exception that your point (2) in the priority list should be a different pref that can be set by the user to override content detection when a per-site pref (1) is not present.
Flags: needinfo?(mark)
Assignee: nobody → mozilla
Proposed solution - Option 1: Clean up existing behaviour.

Remove the random writing to the user preference "spellchecker.dictionary".
This is the root of all because this is written when the content preference is written or cleared, but not on other occasions. It does NOT reliably contain the language used for spellchecking in the following cases:
- a pre-existing content preference is retrieved
- the language gets set based on the page content or by any other method.

It simply doesn't make any sense to write this user preference at random.

With this "option 1" patch we confirm the pre-existing priority of language choice, but
without a random value in spellchecker.dictionary and two further tweaks.
Priority:
1) content-prefs (not for eEditorMailMask editors), unless they have become invalid for some
   reason (removal of dictionary).
2) "page" content (element/document) (eEditorMailMask editors want this)
3) IF SET: user preference "spellchecker.dictionary".
   Not for eEditorMailMask editors, since they already get their language in 2).
   Tweak one: user preference is *always* used, not just if it matches page content.
4) user locale
5) If 2) 3) or 4) don't set a valid dictionary, try a partial match.
6) If no "page" language was defined (and all the above hasn't worked):
   7) LANG environment variable
   -) Tweak two: (removed: en-US, simply makes no sense).
   8) any dictionary found.
   Note that 7) and 8) are *not* executed if a page language was defined.
   We might still want to change that.

Yes, the bug is about that the user preference should take preference over the page content. Whilst this is worth considering I think it may change too much existing behaviour and may be hard to get consensus on.

If the users are not happy with the language choice, they can set a different language and that will be stored as content preference. This part already works.

If the site they visit doesn't have any language information, then the user preference will be used instead. But it will be the user preference as set by the user and not some random value generated when dealing with a content preferences, maybe in a previous session.
Check MXR: http://mxr.mozilla.org/mozilla-central/search?string=spellchecker.dictionary
The preference is only ever written once in the entire code and I propose to remove that since it has no good reason.

This option could be landed without great discussion on "dev-platform" as suggested in comment #7.

Option 2 based on Mark's patch will follow.
Attachment #8653478 - Attachment is obsolete: true
Oops, spotted another problem in option 1.
Attachment #8653876 - Attachment is obsolete: true
Oops, spotted yet another problem in option 1.
Attachment #8653903 - Attachment is obsolete: true
OK, here is my proposed solution, option 2. It mainly follows Mark's idea, but undoes some of the damage he did regarding the writing/clearing of content preferences.

This patch establishes the following priority:
1) content-prefs, unless they have become invalid for some
   reason (removal of dictionary) - (not for eEditorMailMask editors).
2) IF SET: user preference "spellchecker.dictionary" (not for eEditorMailMask editors)
3) "page" content (element/document) (eEditorMailMask editors want this)
4) user locale
5) If 2) 3) or 4) don't set a valid dictionary, try a partial match.
6) If no "page" language was defined (and all the above hasn't worked):
   7) LANG environment variable
   -) Tweak: (removed: en-US, simply makes no sense).
   8) any dictionary found.
   Note that 7) and 8) are *not* executed if a page language was defined.
   We might still want to change that.

Basically it contains everything that option 1) does, but it reverts page content and user preference.

Now if the user preference is set, it will override the language set by the page. So users who set the preference to a valid dictionary, will always get their spellchecking in the specified language.

If the user preference is not set, users will just get the previous behaviour: If no content preference is established, the page will decide the spelling language.

I believe this is a clear and simple solution that any user can understand.

Note: Since the use of spellchecker.dictionary being written at random was removed, I can now use it as the "override" preference that Mark established separately.

Mark, please try the patch, I've left debug statements in the code so you can see.

If you're happy with this solution, I will undertake the necessary steps to get this discussed (see comment #7). The result of the discussion might still be to leave the existing behaviour and use option 1). 

As I said, option 1) is simpler and could be implemented without discussion.
Flags: needinfo?(smjg)
Flags: needinfo?(mark)
Hi Jorg,

I'll have a look at the patch this weekend. 

Although I understand that your option 1 is more in line with what is currently established in code, simpler to implement, and what the W3C apparently prefers, it's clear from the feedback from users that it's definitely not what *they* want -- they want to be in control of their choice of dictionary, not the website, regardless of (often also incorrectly defined) language for docs/elements. It should be discussed if there's resistance against this change in behavior of course, but I think establishing a technical baseline and working solution in this bug is indeed priority ;)

I'll report back when I've had time to patch, build and check.
Mark, I am glad that you're taking the time to follow this.

Sadly this code is quite a mess. Even if you look at option 1), I had to do a few corrections apart from the tweaks that I mentioned. A few return statuses weren't checked, etc. As you said in bug 1073827 comment #20, the code is so difficult, that you have to draw a flowchart. I think more and more fallback options were added, always with
  if (NS_FAILED(stuff we tried before)) {
    try something else.

There are also some details I don't understand but have left alone for now:
6) If no "page" language was defined (and all the above hasn't worked): ... or in code:
   if (mPreferredLang.IsEmpty()) {
     nsAutoString currentDictionary;
     rv = GetCurrentDictionary(currentDictionary);
     if (NS_FAILED(rv) || currentDictionary.IsEmpty()) {
OK, if a current dictionary is set, leave it alone, otherwise, try getenv("LANG") or else anything.
But why do we only do this if mPreferredLang.IsEmpty()?? Makes no sense and should most likely be removed in both options.

Coming to the options:
In option 1) the user can also be in control of the choice of dictionary using a content pref, that is setting the dictionary individually for each site. This works. They just don't have a global override option.

I also see no problem with option 2). If the user does not set the "spellchecker.dictionary", they get the original behaviour.

Note that http://www.w3.org/TR/html401/struct/dirlang.html#h-8.1.2 lists user preferences last.

This is already totally violated with the content preferences, which take the highest priority!! Only that they are so well hidden, and you have to dig through a SQLite database to figure out what's going on.

I have no problem to move the "spellchecker.dictionary" as a global override up in priority and just after the individual content preference.
Note that setting "spellchecker.dictionary" was introduced in bug 678842, attachment 555104 [details] [diff] [review], landed here:
http://hg.mozilla.org/mozilla-central/rev/a624f57a9e6f
http://hg.mozilla.org/mozilla-central/diff/a624f57a9e6f/editor/composer/src/nsEditorSpellCheck.cpp
I guess, that wasn't a good choice back then.
Since I am already active with a broom, let's sweep away some more stuff.

First, the content preferences only looked at the language part, so on an "en" site, you couldn't store "en-GB" as a preference. When you came back to the tab after visiting another tab, your "en-GB" was gone. Not so nice.

On an "en" site, we will already switch to an en-* dictionary, more or less at random. I the user is not happy with the choice and chooses another one, we shouldn't obliterate the choice. This BMO site is an "en" site, but I can never get it to respect my "en-AU" choice.

The next thing is http://www.w3.org/TR/html401/struct/dirlang.html#h-8.1.2.
This clearly states to look at the element, and then and the "Content-Language" header.
I moved this processing together to the front. No idea why this was torn apart.

Altogether I'm quite happy with all the clean-up and straightening out of little quirks. 

So Mark, this is the patch to try - unless I change it again ;-)
Attachment #8653966 - Attachment is obsolete: true
Digging around the past is quite interesting:
http://hg.mozilla.org/mozilla-central/diff/a624f57a9e6f/editor/composer/src/nsEditorSpellCheck.cpp

Back then, this was removed:
   1.379 -// Save the last set dictionary to the user's preferences.
   1.380 -NS_IMETHODIMP
   1.381 -nsEditorSpellCheck::SaveDefaultDictionary()
   1.382 -{
   1.383 -  if (!mDictWasSetManually) {
   1.384 -    return NS_OK;
   1.385 -  }
   1.386 -
   1.387 -  nsAutoString dictName;
   1.388 -  nsresult rv = GetCurrentDictionary(dictName);
   1.389 -  NS_ENSURE_SUCCESS(rv, rv);
   1.390 -
   1.391 -  return Preferences::SetString("spellchecker.dictionary", dictName);
   1.392 -}
   1.393 -

So if the user set the dictionary *once* manually, that was recorded in the preference and used forever, since other code said:
   1.411 -nsEditorSpellCheck::UpdateCurrentDictionary(nsIEditor* aEditor)
   1.412 +nsEditorSpellCheck::UpdateCurrentDictionary()
   1.413  {
   1.414 -  if (mDictWasSetManually) { // user has set dictionary manually; we better not change it.
   1.415 -    return NS_OK;
   1.416 -  }
   1.417 -

So quite clearly, back in the olden days, "spellchecker.dictionary" was a *global* override, that was set when the user set a dictionary manually. When content preferences were implemented, it sadly wasn't done right, since the author of the change just thought that he'd keep setting "spellchecker.dictionary". Only by then, it was NOT the global override anymore that is used to be. The marriage of content preferences and "spellchecker.dictionary" wasn't a happy one.

It is quite clear, that users want the "spellchecker.dictionary" back as a global preference as it used to be.

This is what option 2 promotes:

If "spellchecker.dictionary" is *NOT* set:
Document determines language, unless overridden individually by content preferences.

If "spellchecker.dictionary" is set:
It overrides the document language, but in individual cases, content preferences can be used and have higher priority.
I've just had a little look at Chrome. They give the user a choice of spelling language that will then apply to all pages. This is what "spellchecker.dictionary" would do if set.

So our option 2 is the way to go. One should consider adding the setting of the preference to the UI somewhere. But fixing up the editor code is the first step in the right direction.
Comment on attachment 8653918 [details] [diff] [review]
Proposed solution - Option 1: Clean up existing behaviour (v3)

I don't think option 1 is viable, so let's drop it.
Attachment #8653918 - Attachment is obsolete: true
As requested by Ehsan in comment #7 I will post this to the "dev-platform" mailing list:

===

How would you like to spell check this?

Before I get to my proposal, we need to take a little stroll down memory lane.

In the good old days, when setting the spell check language in a text field, the user preference "spellchecker.dictionary" was set to the selected language and this language was used until the user changed it again.

Before bug 338427 the language of a certain element (or its closest parent) was not respected for text input fields, so users had to change the language manually if they wanted to enter text in different languages on different sites. With the change, the language of the site/document overrode the setting in "spellchecker.dictionary". Details here:
http://hg.mozilla.org/mozilla-central/diff/f3f7872db0ae/editor/composer/src/nsEditorSpellCheck.cpp

Obviously the "single language users" who always want to write in their mother tongue were left standing in the rain, since they now had to change dictionary many times.

Consequently, after landing of this bug in Firefox 8, people started to complain. If users didn't want to use the language proposed by the site (or the site provided the wrong information), they had to set the dictionary manually. So in bug 678842 the language choice for a particular site was stored in the so-called "content preferences" (which is a SQLite database in the user profile). This was landed in Firefox 9. Users could now store the language they wanted to use on a particular site individually.

However, the "single language users" who always want to write in their mother tongue were still standing out in the rain. They did actually not want to change the language on all the different sites they visited.

Additionally, both implementation caused even more confusion. After bug 338427, the user preference in "spellchecker.dictionary" was still used as a fall-back, if the site didn't provide language information. After bug 678842, the user preference also still served as a fall-back in cases where no content preference was specified and where the site didn't provide any language information. However, the user preference was set at the moment where the user changed the spell check dictionary and a content preference was created. Details here:
http://hg.mozilla.org/mozilla-central/diff/a624f57a9e6f/editor/composer/src/nsEditorSpellCheck.cpp

This three-tier approach, with the user preference being set more or less randomly but later not obeyed because it was overruled by content preference or site language has lead to a host of problems:

Bug 1073827 - Meta bug to track all the problems
Bug 455235 - spellchecker doesn't choose the correct locale dictionary in Firefox
Bug 717433 - Chosen dictionary intermittently resets from en-GB to en-US after session restart
Bug 728069 - Firefox should remember manual setting of spell checker language
Bug 836230 - Preference spellchecker.dictionary doesn't retain user set value
Bug 853970 - Spell checker in browser forgets language setting
Bug 858666 - spelling checker language setting changes spontaneously
Bug 909040 - Spell checking language gets reset
Bug 923356 - Wrong language used on start up to spell check
Bug 926166 - changing the dictionary (language) is sometimes ignored
Bug 932925 - Spellcheck keeps changing to other languages
Bug 959785 - Chosen language is forgotten after re-focusing textbox
Bug 1073840 - Pref: Always honour setting of spellchecker language/dictionary in user prefs,
              regardless of lang attribute or Content-Language header
Bug 1139550 - Firefox keeps resetting spellcheck language to Cuban Spanish

Mark Straver and I have been looking into bug 1073840. Our proposal is the following:

Reinstate "spellchecker.dictionary" as a global override, if it has a value set. So we establish the following list of priorities (in this bug this is called "Option 2"):
1) Content preference
2) "spellchecker.dictionary", if set
3) language determined by site according to http://www.w3.org/TR/html401/struct/dirlang.html#h-8.1.2.
4) Fall-back options including the locale and others.

We also suggest not to ever set "spellchecker.dictionary" programatically. If the user changes the dictionary via the context (right click) menu, this will be stored as a content preference. "spellchecker.dictionary" can be set via "about:config" or an option which would have to be provided somewhere in the user interface.

This approach cuts through all the confusion:
"Single language users" set "spellchecker.dictionary" to their preferred language and forget about the issue.
"Single language users" who went to a language course and want to try writing in a different language, change the dictionary on a particular site and store a content preference.
"Polyglot users" unset "spellchecker.dictionary" and enter the text in the language the site requires. However, they lose "spellchecker.dictionary" as a fall-back, so their fall-back would be their locale (see point 4).

I have also considered another option, where the priority is reversed (in this bug this is called "Option 1" since it was considered first):
1) Content preference
2) language determined by site
3) "spellchecker.dictionary", if set

Mark and I believe that this is not what most users want who are "single language users". They want the "global override" back. (The only advantage of this option would be that "spellchecker.dictionary" could serve as a fall-back for the minority "polyglot users". However, I think their locale is good enough as a fall-back.)

Looking further afield:
Thunderbird from version 38 has changed the way it uses "spellchecker.dictionary" (bug 967494). There the user preference remains unchanged and determines the language for every new e-mail. If the users wants to write in a different language, they can change the dictionary, but that doesn't change the default. The Thunderbird team is working on a smarter solution to set the language from the recipients or other hints.
Chrome offers the users the Firefox pre-version-8 state: They can set the dictionary, and the choice sticks for all sites until they change it to something else.

So please voice your objections to the proposed solution, if any ;-)
I had a look at the patch, and I think the logic should work this way. it's still rather complex and I think one of the fallbacks would cause content pref inflation as it is now, as well as unnecessary rewriting of prefs:

Rewriting of prefs if a content pref is already present, and element/doc language is either empty or not equal to the content pref. The same value would be written back to content prefs (not very efficient) ;)

Inflation:
1. If there's no element/doc preferred language (W3C)
2. If there's no content pref

This means mPreferredLang is always empty, and the routine falls back to the user's locale or the preferred language set in spellchecker.dictionary (or first available if all else fails). Since mPreferredLang is empty, this would write a content pref for *every* page visited that doesn't have one yet and where the language isn't set (most of the pages out there), basically locking in the user's locale or their pref-set language for each visited site as a content pref from that point forward, which I sincerely doubt is what people want ;)

I think part of the problem is that SetCurrentDictionary is used as a catch-all, but the possible situations are so different that it's not going to be as simple to use. I assume this function is primarily called from when the user manually chooses a dictionary from the context menu (? Correct me if I'm wrong), in which case it does exactly what it should be doing (store content pref when mPreferredLang is empty or not equal to the dictionary passed in, and clear if equal).

At the very least, I guess the mPreferredLang.IsEmpty() should be dropped in the if statement in SetCurrentDictionary() *if* coming from DictionaryFetched; that would prevent unnecessary writing of content prefs for exisitng content prefs, spellchecker.dictionary, and user locale when there's no document/element language.

I also think that, likely, we wouldn't want to store the global override language as a per-site content pref if it happens to not be equal to the element/doc language (do we want to do ANY sort of content pref manipulation based on the override pref, anyway? probably not...)

It's probably a better solution to make decisions in DictionaryFetched itself about storing the content prefs instead of calling SetCurrentDictionary, so this confusion can be avoided altogether:

1. If a content pref exists, don't write it again, but if the doc/element language is equal we can still opt to clear it.
2. If an override pref is used, don't do anything with content prefs (the user will choose it, themselves, if so, from the GUI)
3. If 1 or 2 don't apply, this means element/doc language is used, which will always be equal to mPreferredLang, and will always clear the content pref if present (in the current logic) -- but there is no content pref (1 doesn't apply), so this is a pointless call to clear data...
4. If the user's local or first-available is used as a last-ditch effort, we wouldn't want to write the content prefs either with these values.

But, I guess not using SetCurrentDictionary would cause issues because it's made async? If that's not an issue then this can probably be simplified even more...
Flags: needinfo?(mark)
(In reply to Mark Straver from comment #22)
> Since mPreferredLang is empty, this
> would write a content pref for *every* page visited that doesn't have one
> yet and where the language isn't set (most of the pages out there),
> basically locking in the user's locale or their pref-set language for each
> visited site as a content pref from that point forward, which I sincerely
> doubt is what people want ;)

Did you look at the patch or did you actually run it with the debug?

Please take another look at SetCurrentDictionary():
This is the key line:
  if (!mUpdateDictionaryRunning) {

Setting the content preference only happens when the user does a right click and sets the language:
http://mxr.mozilla.org/mozilla-central/source/toolkit/modules/InlineSpellChecker.jsm#341

In all other cases where SetCurrentDictionary() is called from nsEditorSpellCheck::DictionaryFetched() only the last line is executed:
return mSpellChecker->SetCurrentDictionary(aDictionary);
This sets the spell check language if the dictionary is available.
mUpdateDictionaryRunning is set here:
http://mxr.mozilla.org/mozilla-central/source/editor/composer/nsEditorSpellCheck.h#70
which is called in the constructor of UpdateDictionaryHolder.

Please *run* the software to see what it does. That's why I left the debug in. After that, please revise your comment, which left me rather confused. If you don't want to compile it yourself, I can do a try build and you can download that.

All I need to know from you is whether you agree with was was said in comment #21, which I have posted to the mailing list.
Well sorry for not understanding what you wanted from me then. Yes I agree with what you posted on the mailing list. I also was fairly sure that you wanted me to have a look at the patch and check the logic.

And no, I did not understand from this:

   // The purpose of mUpdateDictionaryRunning is to avoid doing all of this if
   // UpdateCurrentDictionary's helper method DictionaryFetched, which calls us,
   // is on the stack.

that this would actually be used for what you just said in your comment. I assumed that mUpdateDictionaryRunning would be used to prevent multiple instances of this running concurrently. That was probably my mistake for assuming I understood the code comment and the naming of the variable in use.

Maybe I made a mistake porting the patch across because I did run it and it showed me some odd behavior as I described. Sorry if that was a typo on my end in that case. I'll put more time in and go over my code with a fine comb.
Try run here:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=6d527919576c
This time I built Linux, Windows and Mac, since I don't know which system you're using. Please let me know. You can download from here: ... (coming up when it's built).

P.S.: Yes, I wanted your comment on the patch, but only after running it, as you said you would.
Anyway, soon you can download a version and don't have to compile anything.
I said I did. I did patch it, and I did run it; then went back to the patch and tried to follow the logic based on what I saw in the result (which was a blatant application of everything in setcurrentdict, regardless of mUpdateDictionaryRunning). Please don't call me a liar - I probably just made a typo or had a slip of the finger.

Thanks for launching a try run, it'll probably be faster than a local build on my end, and I'll have a poke at a hopefully typo-free build from that before I poke at my own code again to see where I made a mistake :P
Download here: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mozilla@jorgk.com-6d527919576c/
Right now, only the Linux and Mac stuff has built, so check in a few hours or tomorrow.

P.S.: I'm not sure how a patch can go wrong:
- Download the patch to .hg/patches.
- Insert patch into series file
- hg qpush file.patch
- mach build editor/composer

Here's the debug I get:
On a Spanish site upon first visit:
***** mPreferredLang (element) |es|
***** Assigned from element/doc |es|
***** Setting of |es| failed
***** Trying dictStr |en-GB|
***** Trying dictStr |es-ES|
***** Setting worked.

After setting "spellchecker.dictionary" to "en-AU":
***** mPreferredLang (element) |es|
***** Assigned from spellchecker.dictionary |en-AU|
***** Setting worked.

With a content preference, after switching the dictionary to "de-DE":
***** Writing content preferences for |de-DE|
(switch to another tab an return)
***** mPreferredLang (element) |es|
***** Assigned from content preferences |de-DE|

Works beautifully and exactly how *you* wanted it!
It was a silly typo getting in the way. I've built and run it again from an incremental build which finished pretty fast and the behavior seems pretty much exactly as-intended from a casual test on multiple sites, preserving intended priority of content prefs > spellchecker.dictionary > element ~!

So :thumbsup:

The only oddity I saw was when elements/docs would prefer a single language code for something that has flavo(u)rs of the language: setting of [en] on the element/doc header fails, and the browser will then choose a fallback (first-available) instead of the closest match for the base language (afaict it skips LangCode? seems to happen with your build/testing too - not sure if it's intentional):

***** mPreferredLang (element) |en|
***** Assigned from element/doc |en|
***** Setting of |en| failed
***** Trying dictStr |en-GB|
***** Setting worked.

I'm on an en-US browser in an en-US OS, but have en-GB installed as a secondary dictionary for when I'm on en-GB sites.

PS: I'm on Windows.

PPS: As for patching, it can go wrong if you aren't exactly building Nightly, and have to port things across and insert code manually ;)
Also, I'm unfamiliar with mercurial queues (and the few times I tried it it blew up my repos, so I got rather discouraged at using it ever again) and nobody ever told me how to add these patch files that way. I'm more of a git person and rely on SourceTree for most of my repo work.
(In reply to Mark Straver from comment #28)
> The only oddity I saw was when elements/docs would prefer a single language
> code for something that has flavo(u)rs of the language: setting of [en] on
> the element/doc header fails, and the browser will then choose a fallback
> (first-available) instead of the closest match for the base language (afaict
> it skips LangCode? seems to happen with your build/testing too - not sure if
> it's intentional):
> 
> ***** mPreferredLang (element) |en|
> ***** Assigned from element/doc |en|
> ***** Setting of |en| failed
> ***** Trying dictStr |en-GB|
> ***** Setting worked.

Sorry, what is the problem?

Look at the code here:
      // Otherwise, try langCode (**if we haven't tried it already**)
      if (NS_FAILED(rv)) {
        if (!dictName.Equals(langCode) && !preferredDict.Equals(langCode)) {
          rv = SetCurrentDictionary(langCode);
          printf ("***** Trying langCode |%s|\n", NS_ConvertUTF16toUTF8(langCode).get());
        }
      }

At that stage "en" == mPreferredLang == dictName == langCode, so no point of trying it again. So the langCode test is skipped, that is intentional. It would only be triggered if dictName came in as "xx-XX", either from the document or the preference, then setting "xx-XX" fails, so we try "xx" alone, which will most likely also fail. But anyway, it's just a try.

Actually, "&& !preferredDict.Equals(langCode)" can go, since if preferredDict was set, it would have been assigned to dictName before. If preferredDict isn't set, then this part always comes out 'true'. This is a hang-over from the original, that I didn't spot. Anyway, this second part doesn't change the behaviour.

For "es" sites it will use the first es-* match it can find. The debug is somewhat unfortunate:
***** mPreferredLang (element) |es|
***** Assigned from element/doc |es|
***** Setting of |es| failed
***** Trying dictStr |en-GB| <=== This was NOT tried, it was skipped.
***** Trying dictStr |es-ES|
***** Setting worked.

Perhaps you can give it some more testing. If we're happy that it works and no objections were raised via the mailing list, I'll submit this for review.
Additional testing on this, please.

Changes:
- Further cleaned up fall-back behaviour: Trying langCode was unnecessary, since
  all available dictionaries starting with langCode are tried later.
  Thanks, Mark, for pointing that out!
- Improved debug, so it only reports the ones that are really tried.
- Straightened out further fall-back behaviour which will now always be
  executed, regardless of "mPreferredLang.IsEmpty()"
Attachment #8654078 - Attachment is obsolete: true
Oh boy, fall-back needed some more straightening out.

That was noted in testing, with test case "kr". Test will be attached next.
Attached file Some test cases.
You need en-US and en-GB for the test.

Debug for the four test cases:

***** mPreferredLang (element) ||
***** mPreferredLang (content-language) |en-US|
***** Assigned from element/doc |en-US|
***** Setting worked.

***** mPreferredLang (element) |en-GB|
***** Assigned from element/doc |en-GB|
***** Setting worked.

***** mPreferredLang (element) |en-ZA|
***** Assigned from element/doc |en-ZA|
***** Setting of |en-ZA| failed
***** Trying dictStr |en-GB|
***** Setting worked.

***** mPreferredLang (element) |en|
***** Assigned from element/doc |en|
***** Setting of |en| failed
***** Trying dictStr |en-GB|
***** Setting worked.

***** mPreferredLang (element) |kr|
***** Assigned from element/doc |kr|
***** Setting of |kr| failed
***** Assigned from locale |en-US|
***** Setting worked.

So Mark, more testing, on the current patch, version 4, please.
Attachment #8654721 - Attachment is obsolete: true
Make that "five" test cases.
I've done some testing with the updated patch and the test cases, as well as some other sites like fora.

Everything seems to be working just fine, with one exception: checking the element/doc language is case-sensitive.

If webmasters set the language to something in all lowercase, for example |en-gb| instead of |en-GB| (a common occurrence, e.g. phpBB does this) then the browser seems to pick the wrong one because it'll fail on a case mismatch, and go down the list of matching langcode entries for the first |en| one, and in my test picked en-US as a result.

***** mPreferredLang (element) |en-gb|
***** Assigned from element/doc |en-gb|
--> nsEditorSpellCheck::SetCurrentDictionary 1
***** Setting of |en-gb| failed
***** Trying dictStr |en-US|
--> nsEditorSpellCheck::SetCurrentDictionary 1
***** Setting worked.

Probably needs to just make the check case-insensitive?

I think with that addressed, everything would be perfect :)
FYI a good test case for manual testing (known to be problematic with the current implementation) is Facebook.
Thanks for the testing and feedback.

(In reply to Mark Straver from comment #34)
> ***** mPreferredLang (element) |en-gb|
> ***** Assigned from element/doc |en-gb|
> ***** Setting of |en-gb| failed
> ***** Trying dictStr |en-US|
> ***** Setting worked.

Hmm, this never worked. The en-gb is passed to mSpellChecker->SetCurrentDictionary(dict); and this fails, since it has no dictionary "en-gb". The dictionary is called "en-GB". The code that rejects "en-gb" is not in the file I'm working on. It would be a different issue to change that:
http://mxr.mozilla.org/mozilla-central/source/extensions/spellcheck/src/mozSpellChecker.cpp#377

https://www.phpbb.com/ is funny, look:
<!DOCTYPE html>
<html lang="en-GB"><head>
<meta http-equiv="content-type" content="text/html; charset=UTF-8" />
<meta http-equiv="content-style-type" content="text/css" />
<meta http-equiv="content-language" content="en-gb" />

On their search box text entry field I get:
***** mPreferredLang (element) |en-GB|
***** Assigned from element/doc |en-GB|
***** Setting worked.

And re. Facebook. "My" Facebook page has <html lang="en">, so the system picks the first en-* dictionary it can find. All good.
So, what shall we do -- format the element/doc language in this code to be properly cased, since we're right here and fixing this up anyway, or should this be a new bug to change mozSpellChecker?
Well, it's not obvious how to properly "case" it. The bit after the hyphen can also be lower case, see fr-modern, fr-classic, etc. I'm afraid this is yet another bug. Also, please keep the summary of this bug in mind: "Pref: Always honour setting of spellchecker language/dictionary in user prefs, regardless of lang attribute or Content-Language header".

This is not bug: "Fix spell checker so it can handle websites which specify the dictionary in lower case".
Filed Bug 1200186
Thanks. You missed the honourable 1.2 million ;-)

I will let this rest for a day or so to see whether we get any feedback on the mailing list and then present this for review. Then we'll need to see whether to change the UI to gain access to the preference.
Looks like the proposal just got canned:
https://groups.google.com/forum/#!topic/mozilla.dev.platform/Et02D8Mk2d0, see Ehsan's post.

I might still carry some of the clean-up forward to another bug.
So the consensus from that discussion is "do nothing and keep broken logic"? :D
I'm sorry if I absolutely fail to understand the reasoning here. The pref has been used for multiple things over time, and currently it's used to programmatically hold something that bleeds through to the next site/page (and can result in removing user's content prefs in bad cases) so it's not doing what it should.

Ehsan's post would make sense if the code actually worked in the as-intended way explained, but it doesn't. As you assert it's broken in many many ways that took me a while to follow through as well. I think the proposal is a big step forward from what we had before.

In any case, I personally AM taking this proposal and tested patch for my own browser (since I do think this is the way it is supposed to work, from a user perspective) with one change: the overriding pref will be spellchecker.dictionary.override instead of spellchecker.dictionary, as I originally intended to begin with -- that way nobody has to "unset" anything if they don't want to be subjected to a totally random setting from the previous behavior.

I hope you can come to some form of consensus on what you want in Mozilla Firefox in this respect.
(In reply to Mark Straver from comment #42)
> So the consensus from that discussion is "do nothing and keep broken logic"?
I don't think so, I will lobby at least fixing the broken logic in another bug.

> I'm sorry if I absolutely fail to understand the reasoning here. The pref
> has been used for multiple things over time, and currently it's used to
> programmatically hold something that bleeds through to the next site/page ...
I can partly understand the reasoning. Ehsan prefers an approach that starts at the user interface. As I already noted, having the user preference do something with no UI to set or clear it, wasn't a brilliant approach. I believe that there is a good chance that the UI team might be in favour of an override, so we might revisit this one day.

> As you assert it's broken in many many ways
> that took me a while to follow through as well.
I hope to fix that at least. My approach to Mozilla is that I like to fix what annoys me myself, and in this case it would be not being able to store "en-AU" for BMO (this site) (since is content preference is not stored for part matches).

> I hope you can come to some form of consensus on what you want in Mozilla
> Firefox in this respect.
I'll try.

I am as disappointed as you are, since I invested a considerable amount of effort. Sadly community supported projects are like this, particularly if the people making the decisions are strapped for time and rather push off problems elsewhere than take them on themselves. We also need to keep in mind that multi-lingual people are in the minority and people who care about spelling even more so. (It's a different story in Thunderbird, a tool people use to write).
Flags: needinfo?(smjg)
> I can partly understand the reasoning. Ehsan prefers an approach that starts at 
> the user interface. As I already noted, having the user preference do something 
> with no UI to set or clear it, wasn't a brilliant approach.

There are many preferences without a UI in Mozilla code, and people in general have no issue using about:config to set them.
I think the main focus first and foremost should be to fix what is so terribly broken, which is what this bug (and all the others dep'd on the meta bug that are all symptoms of the same) is about.
Adding a UI for the override (which is only part of this problem) can be done easily enough later on - and shouldn't block a clear fix of broken back-end code. UI shouldn't even be an argument for the basic change of behavior you posted on the list, IMHO. But eh, what do I know?
I agree.

Sadly, as you can see as reaction to my post on the mailing list, there seems to be very little interest in the subject.
FWIW —

> That is only true if the majority of the websites on the Internet use
> the lang attribute incorrectly, and also only for those users who use a
> localized build in a language different than their language (for
> example, a German speaker who doesn't ever enter text in English) but
> uses an en-US Firefox.

That's not true, because there is the bug mentioned above about not storing en as a content preference (is it en or is it whatever the browser language is?)

In my case, my browser is in English, my OS locale is English, but sometimes a site will specify de in the language attribute (sometimes even correctly), and that will propagate to all other sites because it gets stored in the preference, and the other sites don't have a content pref. If I then change the language manually, instead of setting a content pref to en, it clears it — which means if I go back to the de site, the same thing happens again.
The various problems with the current behaviour will be looked at in bug 1200533.
Update:
The change here was opposed:
https://groups.google.com/forum/#!topic/mozilla.dev.platform/Et02D8Mk2d0

A solution that started at the UI was suggested. As instruction, I contacted the UX team and got told (quote):
> I think I'd defer to ehsan's comment about this being a complicated issue full of pitfalls.
> Not a priority for us to work on right now.

So any solution here has been postponed indefinitely.
Considering the work in this bug directly fixes things reported as far back as FF12 (bug 717433), and actually restores functionality (that didn't have a UI either) that was present before but broke, I don't exactly see how this should hinge on an as of yet non-drafted UI design change that is not even considered a priority. Do you want another 3.5 years to pass...? X}

We looked at the pitfalls, saw where the logic stepped in a few of them, and found a solution. Whether this bug or bug 1200533 is chosen (basically with the same fixes but with different behavior, this one with more user control, the other with more website control) doesn't really matter for the actual fix of most of the issues, but please do allow it to be fixed? If we're going to have to wait for this elusive UI change that nobody is working on, it'll just remain terribly broken. I don't see how that is a good thing for anyone.
We convinced the powers that be that fixes are necessary. Watch bug 1200533.
Comment on attachment 8654755 [details] [diff] [review]
Proposed solution - Option 2: Changing the existing behaviour (v4)

After bug 1200533 landed, this is 100% obsolete.
Attachment #8654755 - Attachment is obsolete: true
Won't fix for now, perhaps later as part of bug 1203024. Sorry, it found no consensus when discussed on dev-platform: https://groups.google.com/forum/#!topic/mozilla.dev.platform/Et02D8Mk2d0
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: