Bug 514151 - Spelling dictionary names not human readable on Linux
Product: Toolkit
Classification: Components
Component: XUL Widgets
Version: unspecified
Platform: x86 Linux
Importance: -- normal
Target Milestone: mozilla1.9.3a1
Assigned To: Reed Loden [:reed] (use needinfo?)
QA Contact: Neil Deakin
Depends on: 528831
Reported: 2009-09-02 04:24 PDT by era+mozilla
Modified: 2009-12-30 18:41 PST

patch - v1 (1004 bytes, patch)
2009-10-29 23:31 PDT, Reed Loden [:reed] (use needinfo?) review+
mbeltzner: approval1.9.2+
dveditz: approval1.9.0.18-
patch for 1.9.1 (539 bytes, patch)
2009-11-22 23:49 PST, Mike Hommey [:glandium]
dveditz: approval1.9.1.8+

Description era+mozilla 2009-09-02 04:24:48 PDT
User-Agent:       Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv: Gecko/2009080316 Ubuntu/8.10 (intrepid) Firefox/3.0.13
Build Identifier: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv: Gecko/2009080316 Ubuntu/8.10 (intrepid) Firefox/3.0.13

It seems that the code for finding spelling dictionaries in inlineSpellCheckUI.js expects nonstandard dictionary names.  On Linux, most spelling dictionaries have a file name like lc_RC where lc = language code (e.g. "en") and RC = region code (e.g. "US"), separated by an underscore.  By contrast, Mozilla expects the separator to be a hyphen.

On a Fedora 12 alpha Live CD, when you install hunspell-en (the dictionary pack for English for hunspell) you get a language menu with unreadable computer-ish names like "en_GB" and "en_US".  On Windows (and Debian, because they have a workaround in place), you get human-readable names like "English / United Kingdom" and "English / United States" in the spelling dictionary language menu.

Reproducible: Always

Steps to Reproduce:
1. Navigate to a page with a <textarea> or other spell-checkable element
2. Type in some typos
3. Right-click in the textarea, and inspect the Languages > submenu in the menu you got when you right-clicked
Actual Results:  

Expected Results:  
 English / Australia
 English / South Africa
 English / United Kingdom
 English / United States

A related Ubuntu bug on Launchpad has lots of triage notes (and also a fair amount of noise).

Note that Ubuntu inherited the Debian workaround, which has a problem (which is really the topic of the Ubuntu bug in question): it displays *both* the underscored names and the human-readable names.
Comment 1 Jo Hermans 2009-09-02 04:55:13 PDT
This is supposed to be done by the fix in bug 335600. Why doesn't it work on Linux?
Comment 2 era+mozilla 2009-09-02 09:58:37 PDT
> var isoStrArray = list[i].split("-");

This splits the lc and RC on hyphen, but if they are not hyphen-separated, it doesn't manage to split them properly.  Most other tools on Linux (OpenOffice, for example) have the underscore as separator.  So the problem isn't really in the code per se, but in the assumption that the dictionary file names will be using a hyphen between these parts.
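
To illustrate the point, here is a minimal sketch rather than the actual code in inlineSpellCheckUI.js. The helper name `splitOnHyphen` is hypothetical; it only mirrors the quoted `split("-")` call to show what happens with each naming convention:

```javascript
// Hypothetical helper mirroring the quoted line: split a dictionary name on "-" only.
function splitOnHyphen(name) {
  return name.split("-");
}

// Hyphen-separated name: language and region are separated as expected.
const hyphenated = splitOnHyphen("en-US");

// Underscore-separated name: no "-" is found, so the whole name stays in one piece
// and no region code is available for the lookup.
const underscored = splitOnHyphen("en_US");

console.log(hyphenated, underscored);
```

With only one array element, there is presumably nothing to look up as a region, so the raw name ends up in the menu.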

(The Debian workaround is to install symlinks lc-RC -> lc_RC for each installed dictionary, but that has other problems, as mentioned above.)

My proposed fix would simply use an underscore and be done with it, but that requires Windows (and Mac?) dictionaries to be named according to the Linux / ISO convention.  Who controls that: are the Windows and Mac spelling dictionaries distributed by Mozilla or by a third party?  Alternatively, the code could use a different separator on different platforms, but I'm not well enough versed in JavaScript to know how to do that elegantly and idiomatically.  Or perhaps it can split on *either* hyphen or underscore, which should work with minimal fuss on all platforms.

(A crude patch is available as an attachment to the Launchpad bug linked above, but it's completely trivial.)
Comment 3 Reed Loden [:reed] (use needinfo?) 2009-10-29 23:21:45 PDT
shows en-US. So, do we need to support both '_' and '-' as separators?
Comment 4 Reed Loden [:reed] (use needinfo?) 2009-10-29 23:31:39 PDT
Created attachment 409292
patch - v1

Support both separators using a regex.
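
The attached patch itself is not reproduced in this thread, so the following is only a sketch of the general idea, assuming a character-class regex passed to `split`. The lookup tables `LANGUAGE_NAMES` and `REGION_NAMES` and the function `displayName` are hypothetical stand-ins for wherever the real code obtains its localized names:

```javascript
// Hypothetical lookup tables (assumptions for illustration only).
const LANGUAGE_NAMES = new Map([["en", "English"]]);
const REGION_NAMES = new Map([
  ["AU", "Australia"],
  ["GB", "United Kingdom"],
  ["US", "United States"],
  ["ZA", "South Africa"],
]);

// Split on either "-" or "_" and build a readable label.
// Unknown languages fall back to the raw dictionary name;
// unknown regions fall back to the raw region code.
function displayName(dictName) {
  const [lang, region] = dictName.split(/[-_]/);
  const langName = LANGUAGE_NAMES.get(lang.toLowerCase());
  if (!langName) return dictName;
  if (!region) return langName;
  const regionName = REGION_NAMES.get(region.toUpperCase()) || region;
  return `${langName} / ${regionName}`;
}

console.log(displayName("en_GB"));
console.log(displayName("en-US"));
console.log(displayName("xx_YY"));
```

Because the separator is matched by a character class, the same code path would serve hyphen-named and underscore-named dictionaries without any per-platform branching.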
Comment 5 era+mozilla 2009-11-09 06:17:36 PST

Dunno if I'm allowed to say that here but that seems like an elegant and unintrusive fix.
Comment 6 Reed Loden [:reed] (use needinfo?) 2009-11-09 09:34:15 PST
Comment 7 Reed Loden [:reed] (use needinfo?) 2009-11-09 09:43:06 PST
Comment on attachment 409292
patch - v1

Ubuntu and other Linux distros would appreciate this fix being backported, as it fixes a very ugly UI eyesore.
Comment 8 Mike Hommey [:glandium] 2009-11-22 23:33:56 PST
It only fixes half of the issue though. The duplicates still show up.
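
For context on why duplicates appear: with the symlink workaround in place, the dictionary list contains both the `lc-RC` and the `lc_RC` spelling of each name. How the duplicates were eventually handled is not stated in this thread; the sketch below shows one possible approach, assuming the list is a plain array of names. The function `dedupeDictionaries` is hypothetical:

```javascript
// Keep one entry per dictionary, treating "en-US" and "en_US" as the same name.
function dedupeDictionaries(names) {
  const seen = new Set();
  const unique = [];
  for (const name of names) {
    // Normalize the separator and case so both spellings map to one key.
    const key = name.replace(/_/g, "-").toLowerCase();
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(name); // keep the first spelling encountered
    }
  }
  return unique;
}

const deduped = dedupeDictionaries(["en_GB", "en-GB", "en_US", "en-US"]);
console.log(deduped);
```

Removing the symlinks on the distribution side would make this kind of filtering unnecessary.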
Comment 9 Reed Loden [:reed] (use needinfo?) 2009-11-22 23:42:43 PST
(In reply to comment #8)
> It only fixes half of the issue though. The duplicates still show up.

Isn't that easily fixable on your end by removing the hack (symlinks) you've used for years for this?
Comment 10 Mike Hommey [:glandium] 2009-11-22 23:49:52 PST
Created attachment 414001
patch for 1.9.1

FWIW, the file to which the patch - v1 applies doesn't exist in older releases. This patch applies to the right file on these older releases.
Comment 11 Mike Hommey [:glandium] 2009-11-23 00:01:15 PST
(In reply to comment #9)
> Isn't that easily fixable on your end by removing the hack (symlinks) you've
> used for years for this?

I really don't know where they come from, though. I certainly didn't add them myself as I'm not the dictionary maintainer, and I don't recall requesting them... I would happily have fixed the current bug if I had known these links were being added especially for Mozilla products, which seems to be the case. So yes, removing the symlinks is the fix. I'm still puzzled why they got there, though.
Comment 12 Micah Gersten 2009-11-23 00:51:10 PST
(In reply to comment #8)
> It only fixes half of the issue though. The duplicates still show up.

I filed Bug 528831 to address that issue.
Comment 13 era+mozilla 2009-11-23 02:42:26 PST
FWIW Mike Hommey opened a bug on the Debian side about the now-redundant symlinks:
Comment 14 Mike Beltzner [:beltzner, not reading bugmail] 2009-12-02 08:13:39 PST
Comment on attachment 409292
patch - v1

Comment 15 Reed Loden [:reed] (use needinfo?) 2009-12-02 12:32:55 PST
Comment 16 Daniel Veditz [:dveditz] 2009-12-21 15:19:31 PST
Comment on attachment 414001
patch for 1.9.1

Approved for, a=dveditz for release-drivers
Comment 17 Reed Loden [:reed] (use needinfo?) 2009-12-30 18:41:20 PST