Closed Bug 1570370 Opened 1 year ago Closed 4 months ago

Move language tag parsing and Intl.Locale to C++

Categories

(Core :: JavaScript: Internationalization API, enhancement)

enhancement
Not set

Tracking

()

RESOLVED FIXED
mozilla71
Tracking Status
firefox70 --- wontfix
firefox71 - fixed

People

(Reporter: anba, Assigned: anba)

References

Details

Attachments

(9 files)

47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
No description provided.

Also removes the "Native"-suffix from function names.

Depends on D40072

StringBuffer directly calls NewInlineString for short strings, which prevents
looking up static strings.

Depends on D40073

We already have MozLocale in intl/locale which is aiming to cover the same scope (Unicode Locale Identifier) and I recently released unic-locale crate for Rust, that we'll include in Gecko soon and also aims at Unicode Locale Identifier.

My plan is to migrate MozLocale to use unic-locale.

Is there a chance to use it for SpiderMonkey as well?

The parser is for the most part uninteresting, but canonicalisation is more problematic, because there are different canonicalisation algorithms in UTS 35 and ECMA-402 has some specific requirements which are most likely neither covered in MozLocale nor unic-locale. :-/

because there are different canonicalisation algorithms in UTS 35 and ECMA-402 has some specific requirements which are most likely neither covered in MozLocale nor unic-locale. :-/

Gotcha. That's sad. I was hoping to unify all of that (SpiderMonkey, Gecko and Rust) around a single UTS 35 based algorithm and have a single implementation in Gecko.

The canonicalisation parts in ECMA-402 are likely to change again [1, 2], but perhaps we end up with something that can be shared across multiple users, so maybe there's a future where we actually can use a single implementation. But that's not today. :-)

[1] I've listed a few to-dos in https://github.com/tc39/ecma402/issues/330.
[2] https://github.com/tc39/test262/pull/2097 and https://github.com/tc39/test262/pull/2261 updated test262 to use the canonicalisation as currently defined in ECMA-402. Because I know that V8 doesn't implement that yet, their next test262 update will lead to dozens of failures, which quite likely (and understandably) is going to annoy someone, but also ensures https://github.com/tc39/ecma402/issues/330 will gain some traction. :-/

Blocks: 1373089
Blocks: 1570921

Are you still intending to work on this ? I see a bunch of patches but no priority on the bug. I think it is too late to get a fix into 70 but still possible for 71.

Flags: needinfo?(jwalden)
Flags: needinfo?(andrebargull)

Jeff, can help determine what work is left and help to get this landed?

I guess I can pick this up and do the last work needed on the earliest patches to get it landed. The issue is going to be that the requested changes on those parts are not trivial, so more review is going to be needed -- and at that point the two people most capable of reviewing will have written all of it, so we'll have to drag in someone with much less experience with this code and these algorithms. :-| And if it took me considerable time to carefully review this coming in with familiarity with it all, the delay for review is going to be much worse for someone else.

Flags: needinfo?(jwalden)

Probably more reasonable to aim for 71/72 then. Thanks Steven and Jeff.

(In reply to Liz Henry (:lizzard) from comment #14)

Are you still intending to work on this ? I see a bunch of patches but no priority on the bug. I think it is too late to get a fix into 70 but still possible for 71.

Yes, I'm actively working on this. I've already worked through most of the review comments and right now I'm checking how to best apply some of the comments without sacrificing runtime performance. 71 is probably too ambitious given that the soft-freeze is in 11 days (and the past review-cycles on this bug took its time), but 72 should be doable.

Flags: needinfo?(andrebargull)

Updated the patches to address the review comments.

Blocks: 1583269

Mm. I went and spent most of the last two days working on addressing my review comments in those patches locally. :-| Not ideal that we duplicated work. But, I guess it did let me page this back into memory, which isn't all bad. And it's a second approach to things somewhat, so I can compare what you've done and what I've done and we can get to a more harmonious final state, ideally.

Looking at patches now...

Incidentally, when I imported the first three patches in order to test stuff out for myself, I left them in my tree and have been periodically rebasing them "just in case". And part of testing/updating here, involved rerunning the intl-data generation script. And it happens that, despite our best efforts, there is a small bit of nondeterminism in it, in the order in which things are sorted/tested in certain cases. I've fixed that in a separate patch that I've locally applied before patches here, and I'll probably spin it out to a new bug shortly. Shouldn't greatly affect anything that happens here, but it might require rerunning the intl-data generation script per patch that changes generation.

Pushed by cbrindusan@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5dad4aa34c7f
Part 1: Port Unicode BCP 47 locale identifier parser to C++. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/4eca32dbe70f
Part 2: Port Intl.Locale to C++. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/64993f76caaf
Part 3: Switch Intl.Locale from JS to C++ version. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/22ff988b77c6
Part 4: Rename NativeLocaleObject to LocaleObject. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/f22fdbd968ed
Part 5: Rename NativeLocale.cpp to Locale.cpp. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/7e272f3c9fa4
Part 6: Switch language tag parser from JS to C++. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/e7dbd9ac5b7d
Part 7: Remove no longer used JS code generation parts from make_intl_data. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/51c5dbb80a13
Part 8: Add lookup for static strings to StringBuffer. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/614162937d31
Part 9: Move UnicodeExtensionsGenerated.cpp into LanguageTagGenerated.cpp. r=jwalden

Keywords: checkin-needed

== Change summary for alert #23456 (as of Tue, 15 Oct 2019 04:45:53 GMT) ==

Improvements:

4% Base Content JS windows7-32-shippable opt 3,250,930.67 -> 3,124,816.00
4% Base Content JS windows7-32-shippable opt 3,250,534.67 -> 3,124,784.00
4% Base Content JS windows7-32 opt 3,250,248.00 -> 3,124,816.00
4% Base Content JS windows10-64 opt 4,179,232.00 -> 4,022,640.00
4% Base Content JS windows10-64-qr opt 4,179,468.00 -> 4,022,578.67
4% Base Content JS windows10-64-shippable opt 4,179,308.00 -> 4,022,677.33
4% Base Content JS windows10-64-shippable-qr opt 4,179,468.00 -> 4,022,605.33
4% Base Content JS linux64 opt 4,111,880.00 -> 3,959,376.00
4% Base Content JS linux64-qr opt 4,112,146.67 -> 3,959,562.67
4% Base Content JS linux64-shippable opt 4,112,030.67 -> 3,959,376.00
4% Base Content JS linux64-shippable-qr opt 4,112,110.67 -> 3,959,456.00
4% Base Content JS macosx1014-64-shippable opt 4,117,112.00 -> 3,964,528.00

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=23456

== Change summary for alert #23444 (as of Sat, 12 Oct 2019 18:25:06 GMT) ==

Improvements:

24% raptor-tp6m-amazon-search-geckoview-cold loadtime android-hw-g5-7-0-arm7-api-16 pgo 3,669.08 -> 2,787.04
7% raptor-tp6m-amazon-search-geckoview-cold android-hw-g5-7-0-arm7-api-16 pgo 1,985.27 -> 1,855.28

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=23444

You need to log in before you can comment on or make changes to this bug.