Closed Bug 1033717 Opened 10 years ago Closed 6 years ago

[B2G][Keyboard] Tamil: 'Auto correction' and 'Word suggestion' are not working with the Tamil keyboard

Categories

(Firefox OS Graveyard :: Gaia::Keyboard, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(tracking-b2g:backlog, b2g-v2.0 affected)

RESOLVED WONTFIX
tracking-b2g backlog
Tracking Status
b2g-v2.0 --- affected

People

(Reporter: dharris, Unassigned)

Details

(Whiteboard: LocRun2.0)

Attachments

(3 files)

Description:
With the Tamil keyboard selected, neither 'Auto correction' nor 'Word suggestion' works.

Pre-req: Use 'GAIA_KEYBOARD_LAYOUTS=ta make reset-gaia' to add the Tamil keyboard to the build

Repro Steps:
1) Update a Flame to BuildID: 20140702000201
2) Open Settings App> Keyboards> Selected Keyboards
3) Select 'Add more keyboards'
4) Tap on Tamil to select> tap on English to deselect
5) Go to Homescreen> Open Messages App
6) Select Compose new message icon> Start typing

Actual:
There is no 'Auto correction' or 'Word suggestion' for Tamil

Expected:
'Auto correction' and 'Word suggestion' appear above the keyboard

Environmental Variables:
Device: Flame 2.0
BuildID: 20140702000201
Gaia: 3bfe47c58c959c42f5ffe0309b5380ea514ccd69
Gecko: f40e767ea283
Version: 32.0a2 (2.0) 
Firmware Version: v122
User Agent: Mozilla/5.0 (Mobile; rv:32.0) Gecko/32.0 Firefox/32.0

Repro frequency: 100%
Link to failed test case: https://moztrap.mozilla.org/manage/case/12393/
See attached: Logcat, Video - http://youtu.be/w3qN_AuhxmI
This issue does not occur on 1.4 Flame.

Environmental Variables:
Device: Flame 1.4
BuildID: 20140630000201
Gaia: aa896d5db1b4929f3bf31a0f4bb7de50530222a8
Gecko: 8cba60bc12ef
Version: 30.0 (1.4) 
Firmware Version: v122
User Agent: Mozilla/5.0 (Mobile; rv:30.0) Gecko/30.0 Firefox/30.0

Swahili was not an available language.
No longer blocks: 1032262
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
Does this issue occur on 2.1?
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage-]
Flags: needinfo?(ktucker) → needinfo?(dharris)
This issue does reproduce on 2.1.

No auto-correct or word suggestion bar appears when typing

Flame 2.1

Environmental Variables:
Device: Flame Master
Build ID: 20140702040207
Gaia: 5725321dd1aef29077b6fc5c4c49b43dccf208dc
Gecko: 7075808c3306
Version: 33.0a1 (Master) 
Firmware Version: v122
User Agent: Mozilla/5.0 (Mobile; rv:33.0) Gecko/33.0 Firefox/33.0
Flags: needinfo?(dharris) → needinfo?(ktucker)
QA Whiteboard: [QAnalyst-Triage-] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
Delphine - Can you triage this to determine if this is a blocker for l10n?
Flags: needinfo?(lebedel.delphine)
This is a shipping locale, so I am nominating it.
blocking-b2g: --- → 2.0?
Flags: needinfo?(lebedel.delphine)
blocking-b2g: 2.0? → 2.0+
We don't have a Tamil dictionary to support this feature request.
Since Professor Kevin Scannell is our source for wordlists for languages not supported by Android, let's ask for his opinion on this.

Hi Kevin,

Is Tamil also a language that you could help to provide a wordlist for?
Thanks.
Flags: needinfo?(kscanne)
(In reply to Rudy Lu [:rudyl] from comment #6)
> We don't have a Tamil dictionary to support this feature request.
> Since Professor Kevin Scannell is our source for wordlists for languages not
> supported by Android, let's ask for his opinion on this.
> 
> Hi Kevin,
> 
> Is Tamil also a language that you could help to provide a wordlist for?
> Thanks.

I don't think our word suggestion engine supports non-European languages, even if we have a word list available. Unless I am wrong and latin.js applies generally and covers Tamil, we need to mark this feature as out of 2.0.

Needinfo Bruce to confirm.
Flags: needinfo?(bhuang)
I can do this without too much trouble.  Will wait for confirmation that the engine supports Tamil script.
Flags: needinfo?(kscanne)
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage+][lead-review+]
(In reply to Tim Guan-tin Chien [:timdream] (MoCo-TPE) (please ni?) from comment #7)
> 
> I don't think our word suggestion engine supports non-European languages,
> even if we have a word list available. Unless I am wrong and latin.js applies
> generally and covers Tamil, we need to mark this feature as out of 2.0.
> 
> Needinfo Bruce to confirm.

Is this confirmed? My previous understanding was that only the word list is needed. If it's more than that, we need to look into this as an additional feature (meaning out of 2.0).
Flags: needinfo?(bhuang) → needinfo?(timdream)
Let's not guess but confirm with our localization contributors.

Arun, Kengatharaiyer, as team leads of the ta and ta-LK teams, would you mind telling us how spell check works in Tamil? A quick web search showed that open source libraries like hunspell do come with Tamil support, but I don't know how effective it is.

(djf is OOO)
Flags: needinfo?(timdream)
Flags: needinfo?(iamsarves)
Flags: needinfo?(arunprakash.pts)
There are complications with Tamil spell checking, and there is no perfect spell checker for Tamil. As of now, hunspell works better than the alternatives.
This is worth a look: https://addons.mozilla.org/en-US/firefox/addon/thamizha-solthiruthi/?src=ss
Flags: needinfo?(arunprakash.pts)
Arun, thanks for the feedback.

Let's just assume word suggestion will work to a certain degree if we have a word list. We can always disable it if that turns out not to be the case. Rudy, could you take this bug and check in Prof. Scannell's word list when it's available?
Flags: needinfo?(iamsarves) → needinfo?(kscanne)
Assignee: nobody → rlu
Whiteboard: LocRun2.0 → LocRun2.0 [p=1]
Target Milestone: --- → 2.0 S6 (18july)
Professor Kevin kindly provided the word list here:
 http://borel.slu.edu/obair/ta.zip

However, I could not get word suggestion to work with the latin IME. I made the following attempts:
 1. I converted the word list to a dict directly, which gave a ta.dict of 14.4 MB.
 2. I then filtered out the words with frequency <= 2, which cut ta.dict down to about 4.8 MB.

With neither of these dictionaries did the latin IME show word suggestions.
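
The frequency cut in attempt 2 can be sketched roughly as follows. This is only an illustration: it assumes each line of the word list holds a word followed by whitespace and an integer count, which this thread does not confirm, and `filterByFrequency` is a hypothetical helper, not part of Gaia's dictionary tooling. The sample words and counts are made up.

```javascript
// Hypothetical helper: drop word-list entries whose frequency is at or
// below a threshold. Assumes one "word count" pair per line (an assumption,
// not the documented format of the ta.zip word list).
function filterByFrequency(text, minExclusive) {
  return text
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const parts = line.split(/\s+/);
      return { word: parts[0], freq: Number(parts[1]) };
    })
    // Lines without a numeric count become NaN and are dropped here too.
    .filter((entry) => Number.isFinite(entry.freq) && entry.freq > minExclusive);
}

// Made-up sample data; the counts are invented for the example.
const sample = 'அம்மா 120\nவீடு 2\nநன்றி 3\nபள்ளி 1\n';
const kept = filterByFrequency(sample, 2);
console.log(kept.map((entry) => entry.word).join(' '));
```

Raising the threshold shrinks the resulting dictionary further, at the cost of dropping rarer but valid words.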


--
David, could you help on this issue?
I was wondering whether the latin IME engine has any limitation that prevents it from handling this kind of non-Latin script. If that is the case, maybe we could ask for the PM's help to put this out of v2.0's scope.
Thanks.
Flags: needinfo?(dflanagan)
Whiteboard: LocRun2.0 [p=1] → LocRun2.0
Thanks for Kevin's help to provide the word list.
Flags: needinfo?(kscanne)
Kevin: any idea why your Tamil wordlist is so much bigger than all the others we have? It includes 992K words, compared to 411K for Hungarian.
Flags: needinfo?(dflanagan) → needinfo?(kscanne)
The ta.js keyboard layout file specifies the india input method, so word suggestions are not expected to work with it.
I edited the apps/keyboard/js/layouts/ta.js file like this:

-  imEngine: 'india',
+  imEngine: 'latin',
+  autoCorrectLanguage: 'ta',

And with the ta.dict file, I got word suggestions and autocorrections to appear.
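
For readers unfamiliar with the layout files, the effect of the change can be pictured as below. This is a simplified sketch, not the real contents of ta.js: only the `imEngine` and `autoCorrectLanguage` properties and their values come from the patch above, while the object shape and the `expectsSuggestions` helper are assumptions made for illustration.

```javascript
// Simplified stand-ins for the layout definition before and after the patch.
const layoutBefore = { imEngine: 'india' };
const layoutAfter = { imEngine: 'latin', autoCorrectLanguage: 'ta' };

// Hypothetical check mirroring the reasoning in this comment: suggestions
// are only expected when the layout uses the latin engine and names a
// dictionary language.
function expectsSuggestions(layout) {
  return layout.imEngine === 'latin' && Boolean(layout.autoCorrectLanguage);
}

console.log(expectsSuggestions(layoutBefore));
console.log(expectsSuggestions(layoutAfter));
```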

There are still bugs, however: sometimes after an autocorrection, input would freeze up and I would not be able to type or delete any more characters. So something is going wrong, but we are able to do at least basic autocorrect. I don't know whether the word suggestions make any sense, though.
Arun: do you have a FirefoxOS device that you could try the patch above out on? Or can you try it on desktop?  It enables autocorrect for Tamil, though as noted above it seems to only work a couple of times before getting stuck.  Still you might be able to tell whether it looks like it would be useful or if we will be better off without it.

Enabling autocorrect means that we switch to the Latin IME. That IME has auto-capitalization rules that may not make sense for the Tamil keyboard, so that could be a reason not to enable this.
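
One way to see why the capitalization rules may not fit: Tamil script has no upper and lower case, so a case-mapping step has nothing to change. A minimal sketch using only standard JavaScript string methods; `hasCase` is a hypothetical helper name, not something in the keyboard app.

```javascript
// Hypothetical helper: true if the character has distinct upper and lower
// case forms, i.e. auto-capitalization could actually change it.
function hasCase(ch) {
  return ch.toUpperCase() !== ch.toLowerCase();
}

console.log(hasCase('a')); // true: Latin letters have case
console.log(hasCase('த')); // false: Tamil letters do not
```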
Flags: needinfo?(arunprakash.pts)
(In reply to David Flanagan [:djf] from comment #16)
> The ta.js keyboard layout file specifies the india input method, so word
> suggestions are not expected to work with it.

Yes, I should have changed that as well in my test. After more testing, I also think I was not using python3 to convert the wordlist to .dict.
David, thanks for your help on this issue.
(In reply to David Flanagan [:djf] from comment #15)
> Kevin: any idea why your Tamil wordlist is so much bigger than all the others
> we have? It includes 992K words, compared to 411K for Hungarian.

The other languages I've done had mature spell checkers, so for those I filtered out anything not accepted by the spell checker. For Tamil this would have cut the list down to less than 100k words, leaving out some common words.
Flags: needinfo?(kscanne)
Rudy, could you provide an update on this bug? It is unclear to me which solution we want here, and project management requires daily updates on 2.0 blockers.

IMHO switching the IMEngine from india to latin is too risky at this point. We don't know whether latin will auto "incorrect" the Tamil words.
Flags: needinfo?(rlu)
Right now, we are waiting for Arun's help to evaluate whether the current suggestions are really useful for Tamil users.

But even with that evaluation available, we still face the following issues:
 1. The typing might get stuck when auto-correction is enabled.
 2. Latin IME has auto-capitalization rules that may not apply well to Tamil.
 3. The Tamil dictionary is much larger than other languages.

With these issues to be addressed, I would suggest we re-consider putting this feature into v2.0.
Ni Howie and Bruce for their opinion.
Flags: needinfo?(rlu)
Flags: needinfo?(hochang)
Flags: needinfo?(bhuang)
Attached image Screenshot
Sorry for the delay. I just checked it. Whenever I type some characters, only about four related words are shown as suggestions.
Flags: needinfo?(arunprakash.pts)
With the current status, I'm going to recommend we just keep the layout and defer autocorrect to a later release. If this were only lacking a word list, as originally thought, we'd be in better shape, but our current mechanism seems to need additional work for specific languages.
Flags: needinfo?(bhuang)
Unblocking this since auto correction and word suggestion for Tamil are out of scope for v2.0.
blocking-b2g: 2.0+ → backlog
Flags: needinfo?(hochang)
Unassigning myself for now; I will focus on other feature work first.
Assignee: rlu → nobody
Target Milestone: 2.0 S6 (18july) → ---
Tamil has been requested on 1.4 and all later versions. We will therefore need auto correction and word suggestion on all those versions. Please let me know if there are any concerns about getting this in. Thanks!
blocking-b2g: backlog → ---
Blocks: 1183455
Firefox OS is no longer being worked on.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX