Closed Bug 1102835 Opened 10 years ago Closed 9 years ago

[UserDictionary] Let Keyboard App use User-Dictionary

Categories

(Firefox OS Graveyard :: Gaia::Keyboard, defect)

Hardware: ARM
OS: Gonk (Firefox OS)
Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

Status: RESOLVED FIXED
Target Milestone: 2.2 S9 (3apr)

People

(Reporter: mnjul, Assigned: mnjul)

References

Details

(Whiteboard: [p=5])

Attachments

(2 files, 1 obsolete file)

In this bug we'll open a new prediction worker for the keyboard app's Latin IME engine to process the user dictionary binary blob, and finalize the preliminary version of the feature.
We'll need access to IndexedDB from the worker thread ;)
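
For reference, the plumbing will look roughly like this (all names below are illustrative sketches, not the actual Gaia code):

  // latin.js side (sketch): spawn a dedicated prediction worker and hand it
  // the user dictionary blob once it has been read on the main thread.
  var userDictWorker = new Worker('js/imes/latin/worker.js');

  function setUserDictionary(dictArrayBuffer) {
    // An empty dictionary is represented as undefined, so only a real
    // buffer is posted when we actually have one.
    userDictWorker.postMessage({
      cmd: 'setUserDictionary',
      dict: dictArrayBuffer
    });
  }

  userDictWorker.onmessage = function(evt) {
    if (evt.data.cmd === 'predictions') {
      // evt.data.suggestions is assumed to be an array of [word, weight]
      // tuples to be merged with the built-in dictionary's predictions.
      handleUserDictSuggestions(evt.data.suggestions);
    }
  };

  function handleUserDictSuggestions(suggestions) {
    // Merging with built-in predictions happens elsewhere in latin.js.
  }
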
Depends on: AsyncIDB
(The estimated story points include a feedback loop with UX.)
Assignee: nobody → jlu
Whiteboard: [p=3]
Target Milestone: --- → 2.2 S7 (6mar)
We also no longer depend on the AsyncIDB feature, since the blob retrieval is done on the main thread as of bug 1110028.
No longer depends on: AsyncIDB
Depends on: 1138343
Comment on attachment 8573020 [details] [review]
[gaia] mnjul:bug_1102835_kb_app_use_ud_to_predict > mozilla-b2g:master

Tim,
Here's the feedback request. As discussed offline, I'll continue to work on writing tests.

Apart from the commit messages, here are some noteworthy remarks:
- For prediction merging: latin.js will always show at least one suggestion from the user dictionary, even if that suggestion has a very low weight. It knows a suggestion comes from the user dictionary through the worker's annotation. That annotation is rather ugly because we use a heterogeneous array to hold the suggestion/weight in a tuple-like fashion (suggestion[0] is the word, suggestion[1] is the weight); see the sketch after this list. Let me know if you want that changed.
- We currently don't do any caching of the user dictionary blob; at the Latin IME's activation, the blob is always retrieved from indexedDB, re-passed to the worker (if one is there), and predictions.js does all the tree parsing again -- even though the blob hasn't changed. This leaves much to be optimized, but if we opportunistically start prediction with a cached blob only to discover the blob has changed, we'd have to deal with quite a bit of asynchrony and aborting/restarting (and somehow updating the returned status object).
- The user dictionary blob is currently not in the TransferList, as we represent an empty dictionary as |undefined|, which cannot be Transferred. (We wouldn't want to Transfer it anyway if we want to cache it or detect changes.)
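
Here's the sketch mentioned in the first remark -- a rough illustration of the tuple-like suggestion arrays, the annotation, and the merging; none of these function names necessarily match the actual code:

  // Suggestions are heterogeneous arrays: suggestion[0] is the word,
  // suggestion[1] is the weight.
  function annotateUserDictSuggestions(suggestions) {
    return suggestions.map(function(s) {
      var annotated = [s[0], s[1]];
      // The "ugly" part: tack an extra property onto the tuple-like array
      // so latin.js knows this word came from the user dictionary.
      annotated.fromUserDictionary = true;
      return annotated;
    });
  }

  function mergeSuggestions(builtIn, userDict) {
    // Keep the top user dictionary suggestion so at least one always shows
    // up, even if its weight is very low, then sort by weight (descending).
    var merged = builtIn.concat(userDict.slice(0, 1));
    merged.sort(function(a, b) { return b[1] - a[1]; });
    return merged;
  }
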

Omega,
I'd like to invite you to do a UX/UI check on this. You'll surely stumble upon non-ideal prediction results. I'd like to hear a number of examples from you (user dictionary words & expected/actual predictions for some inputs) and will adjust parameters and strategies according to your examples. However, I'd really like to defer such adjustments to follow-up bugs, and I expect the criterion for closing this bug to be no major UX breakage.

When both of you check out the patch, please keep in mind that we’re still affected by bug 1119157 right now; from my observation, you need to *at least restart your device once* after you |make reset-gaia| to see the prediction worker working. If you don’t see any prediction coming up and you see |Too much recursion| in adb log, please reboot and try again. Omega, please feel free to ask me if you need any assistance on this.

If it's hard to get the worker stable due to bug 1119157 then I might need to ping Cervantes, though.
Attachment #8573020 - Flags: ui-review?(ofeng)
Attachment #8573020 - Flags: feedback?(timdream)
Comment on attachment 8573020 [details] [review]
[gaia] mnjul:bug_1102835_kb_app_use_ud_to_predict > mozilla-b2g:master

Also Ricky: user dictionary is now enabled by default in this patch and all build-related mechanisms to enable it optionally by flag are now removed. Please take a look. Thanks!
Attachment #8573020 - Flags: feedback?(ricky060709)
ETA is late next week to account for Tim's PTO.
Target Milestone: 2.2 S7 (6mar) → 2.2 S8 (20mar)
(In reply to John Lu [:mnjul] [MoCoTPE] from comment #5)
> Comment on attachment 8573020 [details] [review]
> [gaia] mnjul:bug_1102835_kb_app_use_ud_to_predict > mozilla-b2g:master
> 
> When both of you check out the patch, please keep in mind that we’re still
> affected by bug 1119157 right now; from my observation, you need to *at
> least restart your device once* after you |make reset-gaia| to see the
> prediction worker working. If you don’t see any prediction coming up and you
> see |Too much recursion| in adb log, please reboot and try again. Omega,
> please feel free to ask me if you need any assistance on this.
> 
> If it's hard to get the worker stable due to bug 1119157 then I might need
> to ping Cervantes, though.

Side note: I was told by Cervantes that changing the device preference |dom.ipc.processPrelaunch.enabled| to |false| allows us to work around that bug.
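
For anyone else needing the workaround, it's the usual Gecko prefs.js syntax (this assumes you edit the profile's prefs.js on the device, or your build's custom prefs, yourself):

  // Appended to the Gecko profile's prefs.js on the device:
  user_pref("dom.ipc.processPrelaunch.enabled", false);
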
Comment on attachment 8573020 [details] [review]
[gaia] mnjul:bug_1102835_kb_app_use_ud_to_predict > mozilla-b2g:master

f+, Build part seems fine to me.
Attachment #8573020 - Flags: feedback?(ricky060709) → feedback+
Gaia-try seems not so happy, so remember to fix Gu and jshint failures before landing.
(In reply to Ricky Chien [:rickychien] from comment #10)
> Gaia-try seems not so happy, so remember to fix Gu and jshint failures
> before landing.

Yeah this patch doesn't include tests, thanks for noting that :)
Comment on attachment 8573020 [details] [review]
[gaia] mnjul:bug_1102835_kb_app_use_ud_to_predict > mozilla-b2g:master

I believe it would be good to invite Jan to take a glimpse at this too :)
Attachment #8573020 - Flags: feedback?(janjongboom)
Comment on attachment 8573020 [details] [review]
[gaia] mnjul:bug_1102835_kb_app_use_ud_to_predict > mozilla-b2g:master

Unit tests have been amended, so I'm converting the feedback request to a review request.
Attachment #8573020 - Flags: feedback?(timdream) → review?(timdream)
Comment on attachment 8573020 [details] [review]
[gaia] mnjul:bug_1102835_kb_app_use_ud_to_predict > mozilla-b2g:master

@John, as discussed, here are some findings:
1) The issue of combining Latin and non-Latin characters is fine.
2) We almost always show one suggestion from the user dictionary even if it doesn't really match. (e.g. Type "p" and it shows "petrificustotalus" from the user dictionary.) Words from the user dictionary should have a slightly higher score but don't need to be displayed all the time. (I have no idea how to define the score. We can tweak it and try it anyway.)
3) The title "User dictionary" should be "User Dictionary".
Attachment #8573020 - Flags: ui-review?(ofeng)
Attached image Benchmark result sheet
Alright, I did some benchmarks on the user dictionary feature.

Questions under investigation: how much time we spend on dictionary blob generation, how much time PromiseStorage.setItems() takes, and how much time we spend on prediction -- when predicting from the built-in dictionary only, and when predicting from both dictionaries.

Word lists of 100, 200, 400, 800, and 1200 words are generated, with each word consisting of 4~12 random lowercase characters; tests use the QWERTY layout and the en_us built-in dictionary.

For the blob generation & PromiseStorage.setItems() tests, timing measurements from 20 iterations are taken (we report average, max, min, stddev). For the prediction tests, we measure predictions for 100 random inputs (each consisting of 3~6 random lowercase characters) and take those 100 measurements. The two prediction tests (one with the built-in dictionary only and one with both dictionaries) run on the same set of 100 randomized user inputs.
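
For reference, the word lists are generated roughly like this (simplified sketch only; the actual benchmark code is in the branch linked at the end of this comment):

  // Generate one random word of 4~12 lowercase a-z characters.
  function randomWord() {
    var len = 4 + Math.floor(Math.random() * 9);  // 4..12
    var word = '';
    for (var i = 0; i < len; i++) {
      word += String.fromCharCode(97 + Math.floor(Math.random() * 26));
    }
    return word;
  }

  // Generate a word list of the requested size, e.g. 100, 200, ..., 1200.
  function randomWordList(count) {
    var words = [];
    for (var i = 0; i < count; i++) {
      words.push(randomWord());
    }
    return words;
  }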

Please see the results in the attached screenshot (numbers are in milliseconds).
Legend: u: average, x: max, n: min, s: stddev

Some analyses from the results:
1. No significant difference between 1G / 319M on Flame.
2. Time spent on blob generation is acceptable with 100 user words (mean ~53ms) and can get quite a bit longer *in the worst case* for larger word lists. However, the standard deviation of the blob generation time is pretty large, and the minimum value stays within 70ms even for 1200 words, so I wonder whether Gecko's JavaScript engine occasionally does something synchronous (GC?) that impacts the performance.
3. From point 2, the worst-case time is still longer than our dialog transition duration (500ms), so my wild guess is that, if we want, we could overlap blob generation with the dialog transition, maybe via a WebWorker (see the sketch after this list). That would be quite an undertaking, though.
4. Unsurprisingly, predicting from both the built-in and user dictionaries takes more time than predicting from the built-in dictionary only. However, only the maximum time seems to be much impacted, with the average value remaining roughly the same; the difference does not scale with the size of the UD, at least up to the max UD size in this test (1200 words).
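
For point 3, the overlap could look roughly like this -- purely a sketch, assuming blob generation can move into a worker and the dialog transition exposes a Promise (neither the worker script nor |dialog.transitionOut()| exists today, and the setItems() argument shape is assumed):

  function saveUserDictionary(words, dialog) {
    // Kick off blob generation in a (hypothetical) worker right when the
    // dialog transition starts, and wait for both before persisting.
    var blobPromise = new Promise(function(resolve) {
      var worker = new Worker('js/settings/user_dict_blob_worker.js');
      worker.onmessage = function(evt) { resolve(evt.data.blob); };
      worker.postMessage({ words: words });
    });

    var transitionPromise = dialog.transitionOut();  // assumed to return a Promise

    return Promise.all([blobPromise, transitionPromise]).then(function(results) {
      // Argument shape for setItems() is assumed here.
      return PromiseStorage.setItems({ dictBlob: results[0] });
    });
  }
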

I can't explain why Flame/1G/1200 words has a lower max prediction time with UD than without UD, though (I tested a couple of times and it stayed like that).

Some further information:
- The device is restarted after each change of the number of UD words.
- Only the Settings app & keyboard settings app are open during the blob generation/saving tests; only the UI tests app is open during the prediction tests.
- Tests are carried out against Gecko build ID 20150315160203.
- A list of 1200 user dictionary words, randomly generated as described above, takes about 19~20KB of blob size.
Please see https://github.com/mnjul/gaia/tree/bug_1102835_benchmarks for the benchmark code (very nastily written, though) and some calling usage notes in the code's comments.
Attachment #8574397 - Attachment is obsolete: true
Comment on attachment 8573020 [details] [review]
[gaia] mnjul:bug_1102835_kb_app_use_ud_to_predict > mozilla-b2g:master

(In reply to Omega Feng [:Omega] [:馮於懋] (please ni?) from comment #15)
> 2) We almost always show one suggestion from the user dictionary even if
> it doesn't really match. (e.g. Type "p" and it shows "petrificustotalus"
> from the user dictionary.) Words from the user dictionary should have a
> slightly higher score but don't need to be displayed all the time. (I
> have no idea how to define the score. We can tweak it and try it anyway.)
I've revised the merging strategy and discussed it offline with Omega; he's good with the new approach.

Now, we'll only bump a prediction from the user dictionary if its weight is >= 1 or if we have <= 2 built-in dictionary predictions. To make the "1" magic value meaningful, the uniform frequency of user dictionary words has been reduced from 31 to 10.
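
In code, the new rule is roughly this (names are illustrative, not necessarily what latin.js uses):

  var USER_DICT_BUMP_WEIGHT = 1;
  var MAX_BUILTIN_PREDICTIONS_FOR_BUMP = 2;

  // prediction is a [word, weight] tuple from the user dictionary; user
  // dictionary words now carry a uniform base frequency of 10 (was 31).
  function shouldBumpUserDictPrediction(prediction, builtInPredictions) {
    return prediction[1] >= USER_DICT_BUMP_WEIGHT ||
           builtInPredictions.length <= MAX_BUILTIN_PREDICTIONS_FOR_BUMP;
  }
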

> 3) The title "User dictionary" should be "User Dictionary".
Also fixed.

Omega, could you set ui-review+ here? Thanks!
Attachment #8573020 - Flags: ui-review?(ofeng)
Comment on attachment 8573020 [details] [review]
[gaia] mnjul:bug_1102835_kb_app_use_ud_to_predict > mozilla-b2g:master

Good job! Thanks!
Attachment #8573020 - Flags: ui-review?(ofeng) → ui-review+
Comment on attachment 8573020 [details] [review]
[gaia] mnjul:bug_1102835_kb_app_use_ud_to_predict > mozilla-b2g:master

Sorry for the really late review. Everything is cool except the Promise chain. Please rearrange that part and ask for review. The next review should be really quick.
Attachment #8573020 - Flags: review?(timdream) → feedback+
(In reply to Tim Guan-tin Chien [:timdream] (slow response; please ni? to queue) from comment #20)
> Comment on attachment 8573020 [details] [review]
> [gaia] mnjul:bug_1102835_kb_app_use_ud_to_predict > mozilla-b2g:master
> 
> Sorry for the really late review. Everything is cool except the Promise
> chain. Please rearrange that part and ask for review. The next review should
> be really quick.

I'd like to discuss the rationale of that Promise chain with you. Please see the short-version github comment there :)
Flags: needinfo?(timdream)
(In reply to John Lu [:mnjul] [MoCoTPE] from comment #21)
> (In reply to Tim Guan-tin Chien [:timdream] (slow response; please ni? to
> queue) from comment #20)
> > Comment on attachment 8573020 [details] [review]
> > [gaia] mnjul:bug_1102835_kb_app_use_ud_to_predict > mozilla-b2g:master
> > 
> > Sorry for the really late review. Everything is cool except the Promise
> > chain. Please rearrange that part and ask for review. The next review should
> > be really quick.
> 
> I'd like to discuss the rationale of that Promise chain with you. Please see
> the short-version github comment there :)

Got it. My mistake.
Flags: needinfo?(timdream)
Off we go :)
Keywords: checkin-needed
Whiteboard: [p=3] → [p=5]
Target Milestone: 2.2 S8 (20mar) → 2.2 S9 (3apr)
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Comment on attachment 8573020 [details] [review]
[gaia] mnjul:bug_1102835_kb_app_use_ud_to_predict > mozilla-b2g:master

Hoooray! Thanks everyone :)
Attachment #8573020 - Flags: feedback?(janjongboom)
Depends on: 1148326