Open Bug 666731 Opened 13 years ago Updated 2 years ago

Tie spellchecker language interface into new language tag master list

Categories

(Core :: Spelling checker, defect)

People

(Reporter: GPHemsley, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Whiteboard: [bcp47])

Once the master language tag list (bug 666662) is implemented, tying the spellchecker's language detector interface into it should carry the least risk and the highest reward. It should cause little disruption while also resolving any outstanding bugs in which a language shows only its language code (not its name) when its spellchecker is installed.
The spellchecker automatically ties in to the list of language names, but it doesn't tie in to the list of region or script names. It appears that script subtags are stripped and region subtags are displayed as literals.

We'll need to modify the code (wherever it is; I haven't looked for it yet) to fully parse the language tag.
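
To illustrate what "fully parse the language tag" could mean, here is a minimal Python sketch that splits a tag into language, script, and region subtags by position and shape. It is not the Mozilla code; `parse_tag` and its result layout are hypothetical, and the sketch deliberately ignores extlang, variant, extension, and private-use subtags (anything it cannot place goes into `rest`).

```python
import re


def parse_tag(tag):
    """Split a language tag into language, script, and region (simplified).

    Assumption: only the language-script-region prefix is recognized;
    every other subtag is collected, unparsed, in "rest".
    """
    subtags = re.split(r"[-_]", tag)
    result = {
        "language": subtags[0].lower(),
        "script": None,
        "region": None,
        "rest": [],
    }
    for sub in subtags[1:]:
        is_script = len(sub) == 4 and sub.isalpha()
        is_region = (len(sub) == 2 and sub.isalpha()) or (
            len(sub) == 3 and sub.isdigit()
        )
        # A script is only accepted before any region has been seen.
        if is_script and result["script"] is None and result["region"] is None:
            result["script"] = sub.title()
        elif is_region and result["region"] is None:
            result["region"] = sub.upper()
        else:
            result["rest"].append(sub)
    return result
```

With this sketch, a tag such as 'sr-Cyrl-RS' yields a language, a script, and a region, whereas a script-shaped subtag that appears after the region ends up in `rest` instead of being silently dropped.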
Target Milestone: --- → mozilla7
Version: Trunk → 7 Branch
Target Milestone: mozilla7 → ---
Version: 7 Branch → Trunk
Hmm... it appears that the code may be taking whatever comes last in the hyphenated tag and displaying it as a literal. Macedonian takes advantage of this in order to display both the Latin and Cyrillic variants of its spellchecker, but it does so by using the invalid tags 'mk-MK-Latn' and 'mk-MK-Cyrl' (BCP 47 places the script subtag before the region, as in 'mk-Latn-MK' and 'mk-Cyrl-MK').

Moving this to the spellchecking component, since it seems like it may be better categorized there.
Assignee: smontagu → nobody
Component: Internationalization → Spelling checker
QA Contact: i18n → spelling-checker
Just recording an observation from our chat just now: the Macedonian spell checker has files named "mk-MK-Cyrl.aff", and in my 6.0 ga-IE build it shows up in the context menu as "Macadóinis / An Mhacadóin (Cyrl)". In en-US that would probably be "Macedonian / Macedonia (Cyrl)".
Assignee: nobody → smontagu
Component: Spelling checker → Internationalization
QA Contact: spelling-checker → i18n
Assignee: smontagu → nobody
Component: Internationalization → Spelling checker
QA Contact: i18n → spelling-checker
I believe this is the code that decides what language (sub)tag is used to pick the dictionary:
http://mxr.mozilla.org/mozilla-central/source/extensions/spellcheck/hunspell/src/mozHunspell.cpp#129

AIUI, it basically just takes a substring of the filename from the start to the first occurrence of a '-' character (or a '_' character, if '-' is not found).
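
As a rough illustration of that substring logic, here is a Python sketch (not the actual C++ in mozHunspell.cpp). The function name `language_subtag` is hypothetical, and returning the whole string when neither separator is present is an assumption.

```python
def language_subtag(dictionary_name):
    """Return the part of the name before the first '-', or before the
    first '_' if there is no '-' at all."""
    pos = dictionary_name.find("-")
    if pos == -1:
        pos = dictionary_name.find("_")
    # Assumption: with no separator, the whole name is the language.
    return dictionary_name if pos == -1 else dictionary_name[:pos]
```

Note that '_' is only consulted when no '-' exists anywhere in the name, so a name like "en_US-custom" would yield "en_US" rather than "en" under this reading.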

However, I've been on a fruitless wild goose chase, going around in circles trying to track down all the different definitions and uses of mDictionary, aDictionary, mLanguage, aLanguage, etc. I don't really know how to read C/C++ code or what it does, so I can't figure out where the values of the variables originate. Nor can I figure out where it trades the language code for the (localized) language name.
In that code, the desired dictionary comes in as the argument aDictionary.  The SetDictionary method gets called from mozSpellChecker::SetCurrentDictionary:
http://mxr.mozilla.org/mozilla-central/source/extensions/spellcheck/src/mozSpellChecker.cpp#367
which is called from various places in the editor code, e.g. here:
http://mxr.mozilla.org/mozilla-central/source/editor/composer/src/nsEditorSpellCheck.cpp#112
This function implements the logic for choosing the spell checker: it looks at the pref spellchecker.dictionary, then falls back to the current locale, then falls back to en-US.
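
That fallback order can be sketched in Python as follows. This is only an illustration of the order described above, not the nsEditorSpellCheck.cpp implementation; `choose_dictionary`, its parameters, and the exact-match test against the installed list are all assumptions.

```python
def choose_dictionary(pref_value, current_locale, installed):
    """Pick a dictionary: pref first, then current locale, then en-US.

    Assumption: a candidate is usable only if it exactly matches an
    installed dictionary name; the real code may match more loosely.
    """
    for candidate in (pref_value, current_locale, "en-US"):
        if candidate and candidate in installed:
            return candidate
    return None  # nothing suitable is installed
```

For example, with an empty pref and a ga-IE locale, the sketch returns "ga-IE" if that dictionary is installed and otherwise falls through to "en-US".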
Gordon, here's the code for building the localized list of dictionaries: language names + region names:
http://mxr.mozilla.org/mozilla-central/source/toolkit/content/InlineSpellChecker.jsm#172
It looks pretty straightforward, and it's clear why it's broken: it assumes the region comes second, and it tacks on anything after the second component in parentheses (which explains the Macedonian weirdness).
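
The behavior described here can be reproduced with a small Python sketch. It is a reconstruction from the description in this bug, not a copy of InlineSpellChecker.jsm; `naive_display_name` and the two lookup tables are hypothetical stand-ins for the localized language and region name lists.

```python
# Hypothetical stand-ins for the localized name lists.
LANGUAGE_NAMES = {"mk": "Macedonian", "sr": "Serbian"}
REGION_NAMES = {"mk": "Macedonia", "rs": "Serbia"}


def naive_display_name(dictionary_id):
    """Build a menu label assuming the layout language-region-anything."""
    parts = dictionary_id.replace("_", "-").split("-")
    name = LANGUAGE_NAMES.get(parts[0].lower(), parts[0])
    if len(parts) > 1:
        # The second component is always treated as a region; if it is
        # not a known region, it is shown as a literal.
        name += " / " + REGION_NAMES.get(parts[1].lower(), parts[1])
    if len(parts) > 2:
        # Everything after the second component goes in parentheses.
        name += " (" + "-".join(parts[2:]) + ")"
    return name


print(naive_display_name("mk-MK-Cyrl"))  # Macedonian / Macedonia (Cyrl)
print(naive_display_name("sr-Cyrl-RS"))  # Serbian / Cyrl (RS)
```

The second call shows why a well-formed language-script-region tag fares worse than the invalid Macedonian one under this assumption: the script lands in the region slot and the region ends up in parentheses.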
Depends on: 730209
Depends on: 739861
Severity: normal → S3