Closed Bug 1571915 Opened 1 year ago Closed 10 months ago

Switch to `unic-langid` to back MozLocale

Categories

(Core :: Internationalization, task, P3)

task

Tracking

()

RESOLVED FIXED
mozilla72
Tracking Status
firefox72 --- fixed

People

(Reporter: zbraniecki, Assigned: zbraniecki)

References

Details

Attachments

(3 files, 1 obsolete file)

As a warm-up to bug 1560038, I'd like to move MozLocale to use unic-langid.

This will reduce the number of codepaths we have to parse Language Identifiers and pave the way to use unic-locale there in the future.

This will also unify our locale management between LocaleService and fluent-rs.

Assignee: nobody → gandalf
Status: NEW → ASSIGNED

The patch is still a WIP, but is almost fully functional (two tests to fix), and ready for initial Rust/cbindgen/FFI review from :emilio. Setting NI!

Flags: needinfo?(emilio)

It looks reasonable, I took a pass on phab, but I think there's missing code? I don't see the implementation of the FFI functions.

Flags: needinfo?(emilio)
Priority: -- → P3
See Also: → 1581960

I profiled the startup locale parsing between central and this patch. There are small differences due to changes in callers, but overall the results are similar.

We parse ~400 language identifiers during startup.

On average, parsing a locale takes:

  • central: 665 ns
  • unic-langid: 417 ns (37% faster)

data: https://docs.google.com/spreadsheets/d/1eWjslErJmzwGnWSX5eNXctjwa_KlK9QkyyM3NqPHgm8/edit#gid=0

The patch is now passing all tests and is pending a couple review decisions from emilio.

I updated the spreadsheet with new data, which I separated for the initial cold startup with a new profile, and 3 runs with the same profile averaged.

== New profile:

Count: 956 -> 786 (-18%)
Sum: 671661 -> 403094 (-40%)
Average: 702 -> 512 (-28%)
Min: 255 -> 148 (-42%)
Max: 4595 -> 11901 (150%)
Stdev: 497 -> 986 (98%)

The count drops thanks to the change in how we calculate likely subtags (without having to parse the output into a new Locale).
The average and min drops, which comes from better performance (already measured before).

The combination of the two impacts the total time that the parsing takes, with 40% win, which is really nice!

Max goes up - the first parsing takes significantly longer with Rust than C++. I'm wondering if it is because that becomes the first point where we load gecko-rust lib? In result stdev goes up.
I'm not super worried about it, because majority of parsings are now much faster, and the total numbers are very good.

It still may be worth investigating why the max goes up.

== Old profile (avg. of 3 runs)

Count: 340 -> 274 (-20%)
Sum: 238151 -> 144606 (-39%)
Average: 523 -> 394 (-25%)
Min: 212 -> 112 (-47%)
Max: 10340 -> 9618 (-7%)
Stdev: 606 -> 812 (33%)

Here we can see that the total count stays 20% lower, total time remains 40% lower, average remains 25% lower and min remains 45% lower!
Max is now lower by 7%, which makes me think that the new profile max may have been a fluke. Stdev is still a little bit higher.

All in all, I think it looks really good!

Attachment #9083506 - Attachment is obsolete: true
Depends on: 1591191

Hi Jonathan - can you please take a look at the r? in the PR?

Flags: needinfo?(jfkthame)

I updated spreadsheet with new numbers based on today's nightly: https://docs.google.com/spreadsheets/d/1eWjslErJmzwGnWSX5eNXctjwa_KlK9QkyyM3NqPHgm8/edit#gid=0

== New profile:

Count: 956 -> 784 (-18%)
Sum: 671661 -> 142324 (-79%)
Average: 702 -> 181 (-74%)
Min: 255 -> 46 (-82%)
Max: 4595 -> 5424 (+18%)
Stdev: 497 -> 391 (-21%)

== Old profile (avg. of 3 runs)

Count: 454 -> 339 (-25%)
Sum: 317534 -> 64443 (-79%)
Average: 697 -> 190 (-72%)
Min: 282 -> 45 (-84%)
Max: 13786 -> 7819 (-43%)
Stdev: 808 -> 595 (-26%)

This looks really good. I also measured unic-langid against raw ICU4C 64 - https://github.com/zbraniecki/intl-measurements#current-results-as-of-oct-30-2019

Flags: needinfo?(jfkthame)
Attached patch mozlocale.diffSplinter Review

Adding a patch I used to dump all MozLocale constructors without the unic-langid patch and their performance cost.

Attached patch unic-time.diffSplinter Review

Adding a patch I'm using with the unic-langid patch to collect all MozLocale constructors and their performance cost.

Pushed by zbraniecki@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5a8fc2574be5
Switch MozLocale to use unic-langid. r=emilio,jfkthame
Status: ASSIGNED → RESOLVED
Closed: 10 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla72
Regressions: 1599532
Regressions: 1607052
You need to log in before you can comment on or make changes to this bug.