Closed Bug 908286 Opened 11 years ago Closed 11 years ago

Dictionaries needed for v1.1 locales

Categories

(Firefox OS Graveyard :: Gaia::Keyboard, defect)

x86
macOS
defect
Not set
normal

Tracking

(blocking-b2g:leo+, b2g18 fixed, b2g-v1.1hd fixed)

RESOLVED FIXED

People

(Reporter: jcheng, Assigned: djf)


Partners require these locales to ship:

Polish, Croatian, Czech, English, German, Romanian, Hungarian, Greek, Bulgarian, Dutch, Russian, Slovak, Turkish, Serbian


From the email thread. Missing keyboard layouts: Croatian, Dutch, Hungarian, Romanian, Bulgarian
Added all corresponding localizers to this bug.

Please note that you mentioned Bulgarian above, but that is not a v1.1 shipping locale that the l10n team is aware of. Please confirm.
Flags: needinfo?(jcheng)
The title of this bug suggests that it is just about the autocorrect dictionaries for these locales, and that is all I am able to provide as owner of the bug.  The description in comment 0, however, suggests that it may also be intended as a metabug to cover missing keyboard layouts.

The bugs that are dependent on this one actually don't have anything to do with auto correct, and are related to keyboard layout instead.

Joe, could you clarify the scope of this bug?  If this is the main tracking bug for all of the keyboard localization stuff, I can't be the owner of it.

I'd suggest that we have one tracking bug for this leo+ keyboard localization effort for all of these Eastern European locales. We then need one child bug for each of the missing keyboard layouts, and one child bug (assigned to me) to add auto-correct dictionaries for as many of the locales as I can.
Flags: needinfo?(jcheng)
Hi David,

I have a patch to add the missing keyboard layouts, Bug 907763 Comment 8.
So I guess we can handle the dictionary part in this bug.

Please let me know if you have any questions.
Is there a guide on how the community can generate such dictionaries? I plan to use the word collection of the Hunspell spell checker.
Correction to comment 0:
From the email thread. Missing keyboard layouts: Croatian, Dutch, Hungarian, Romanian

Bulgarian is actually not needed. Sorry about the incorrect information.

Let's make this bug for dictionaries only, covering Croatian, Dutch, Hungarian, and Romanian. I changed the description as well; hopefully it's better. Let me clear the dependencies as well.

Does that sound good, :djf?
Flags: needinfo?(jcheng)
Summary: Dictionaries needed for all v1.1 locales → Dictionaries needed for v1.1 locales (Croatian, Dutch, Hungarian, Romanian)
No longer blocks: 908329, 908343, 908350, 905051
Blocks: 908393
(In reply to Kami from comment #4)
> Any guide, how can the community generate such dicts? I plan to use the word
> collection of HunSpell spell checker.

Kami,

When possible we base the dictionaries on open-source wordlists from Android.

When Android does not have a wordlist available, we've been asking contributor Kevin Scannell to create wordlists for us. He's been involved in creating spelling-checker dictionaries for a long time and seems to be a real expert at this.
Note that we currently have keyboard layouts for a number of languages that do not have auto-correct dictionaries installed. We may only need layouts for four locales, but if we want auto-correction as an option for every 1.1 locale, then we're going to need to add more than 4 dictionaries.

Here's the current status of master (before Rudy's patch to add new layouts lands):

Language        Code    Has Layout      Has Dictionary
----------------------------------------------------------

                Already supported locales

Spanish         es      yes             yes
Portuguese (br) pt_BR   yes             yes (pt_br)
French          fr      yes             yes
Catalan         ca      yes             yes

                Required for v1.1

Croatian        hr      no              no
Czech           cs      no              no
Dutch           nl      no              no
English         en      yes             yes (en_us)
  Dvorak                yes             yes (en_us)
German          de      yes             yes
Greek           el      yes             no
Hungarian       hu      no              no
Polish          pl      yes             yes
Romanian        ro      yes             no
Russian         ru      yes             no
Serbian         sr
  cyrillic              yes             no
  latin                 yes             no
Slovak          sk      yes             no
Turkish         tr
  Turkish Q             yes             no
  Turkish F             yes             no

This may be a little different on v1-train. I don't know if we have the two separate Turkish layouts in that version, for example.  In any case, I'm going to convert the title of this bug back to its original form and assume that we want auto-correct dictionaries for all languages that we have a keyboard layout for.

The reason we haven't done this before is that each dictionary adds about 3 MB to the size of the build and we didn't want to clutter things up with dictionaries that our partners didn't need. But if we need all of them, then they should be there.
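As a rough illustration of the table and the size concern above, here is a minimal Python sketch that lists the languages still lacking a dictionary and estimates the build growth at the approximate 3 MB per dictionary mentioned in this comment. It is only a back-of-the-envelope aid, not part of the Gaia build. The names `STATUS`, `missing_dictionaries`, and `estimated_growth_mb` are hypothetical, and the sketch assumes one dictionary per language code (so the Serbian and Turkish layout variants are each collapsed into a single entry).

```python
# Rows transcribed from the status table: code -> (has_layout, has_dictionary).
# Assumption: layout variants (Serbian cyrillic/latin, Turkish Q/F, Dvorak)
# are collapsed into one entry per language code.
STATUS = {
    "es": (True, True),
    "pt_BR": (True, True),
    "fr": (True, True),
    "ca": (True, True),
    "hr": (False, False),
    "cs": (False, False),
    "nl": (False, False),
    "en": (True, True),
    "de": (True, True),
    "el": (True, False),
    "hu": (False, False),
    "pl": (True, True),
    "ro": (True, False),
    "ru": (True, False),
    "sr": (True, False),
    "sk": (True, False),
    "tr": (True, False),
}

# Rough per-dictionary size quoted in the comment; real sizes will vary.
APPROX_MB_PER_DICTIONARY = 3


def missing_dictionaries(status):
    """Return the sorted language codes that have no dictionary yet."""
    return sorted(code for code, (_, has_dict) in status.items() if not has_dict)


def estimated_growth_mb(status, mb_per_dict=APPROX_MB_PER_DICTIONARY):
    """Estimate build growth if every missing dictionary were added."""
    return len(missing_dictionaries(status)) * mb_per_dict


if __name__ == "__main__":
    print("Missing dictionaries:", ", ".join(missing_dictionaries(STATUS)))
    print("Estimated growth (MB):", estimated_growth_mb(STATUS))
```

The estimate is an upper bound under the flat 3 MB assumption; actual dictionary sizes depend on each wordlist.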
Edited the bug summary to remove the four enumerated locales, since I'll be adding dictionaries for more languages than that.
Summary: Dictionaries needed for v1.1 locales (Croatian, Dutch, Hungarian, Romanian) → Dictionaries needed for v1.1 locales
Just wanted to note that there were some keyboard issues in Russian as well: see Bug 908350 and Bug 908343.

Not sure if this has an impact on this bug, though. Please let me know.
(In reply to delphine from comment #9)
> Just wanted to note that there were some keyboard issues in Russian as well:
> see Bug 908350 and Bug 908343
> 
> Not sure if this has an impact on this bug though, please let me know

We should take the patches on those two bugs instead.
But if Russian is a required shipping language, why shouldn't it get an autocorrect dictionary as well? (I just checked and it doesn't have one.)
(Neither do Slovak and Greek.)
I'm trying to understand the logic so we can then test everything correctly.
Thanks
delphine: see comment 7 above.

I'll add all auto-correct dictionaries, and have changed the bug title appropriately.
makes sense, got confused with all these changes :)
thanks
The linked pull request adds wordlists and auto-correct dictionaries for all of the supported locales that we have dictionaries for.

We don't yet have a wordlist for Slovak or a latin wordlist for Serbian (we have Cyrillic only for Serbian).

This pull request adds something like 30 MB to the size of the builds.
Attachment #794921 - Flags: review?(anygregor)
I miscalculated the increase in build size.  It is more like 18 MB, not 30.
Comment on attachment 794921 [details]
link to patch on github

Not super excited about the build size increase but we should start testing the dictionaries and not block on the customization feature.
Attachment #794921 - Flags: review?(anygregor) → review+
Uplifted to v1-train: https://github.com/mozilla-b2g/gaia/commit/29a83977c6c7a9f5f1422c38b235461990fe438a

While testing on v1-train, I discovered a problem with the Hungarian wordlist, so this uplift includes a modified Hungarian wordlist and dictionary.

I'll now have to create a new pull request to get those modified files back to master.
Applied the v1-train Hungarian fix (down-lifted?) to master: https://github.com/mozilla-b2g/gaia/commit/639d0ab2b9221a9e0136dc95abe1e1160452d89d
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
I've closed this bug, and have filed 908938 and 908941 for the still-missing Slovak and Serbian (Latin) dictionaries
v1.1.0hd: 29a83977c6c7a9f5f1422c38b235461990fe438a
v1.1.0hd: 24b42e6e6e7637c64c11e68aec8d22dcd94b0230
It looks like this may all get backed out because we're running out of space on the Buri device with this many autocorrect dictionaries.
Never mind, it now looks like it was an unrelated bug and this won't get backed out.
(In reply to David Flanagan [:djf] from comment #21)
> I've closed this bug, and have filed 908938 and 908941 for the still-missing
> Slovak and Serbian (Latin) dictionaries

Those two bugs now have pull requests under review, and need to be given leo+ so that the dictionaries can be uplifted to v1-train.