Unify ListFormat in SpiderMonkey
Categories
(Core :: Internationalization, task, P3)
Tracking
()
Tracking | Status | |
---|---|---|
firefox94 | --- | fixed |
People
(Reporter: gregtatum, Assigned: allstars.chh)
References
Details
(Whiteboard: [i18n-unification])
Attachments
(3 files)
https://searchfox.org/mozilla-central/source/js/src/builtin/intl/ListFormat.cpp
This will most likely need a mozilla::intl::ListFormat
implementation, as a replacement for UListFormatter
.
Updated•2 years ago
|
Reporter | ||
Comment 1•2 years ago
|
||
FYI, we have a design document from April. I left a few comments, but I plan on updating it for the current state of things: https://docs.google.com/document/d/1Mr16-dEoQxR-nGNB6RQmiZrTBqxCxHiI8n0WJpL7T38/edit#
Assignee | ||
Comment 2•2 years ago
|
||
Move implementations to mozilla::intl::ListFormat
Assignee | ||
Comment 3•2 years ago
|
||
Still working on some tests.
According to the document from comment 1, the implementation is as following:
-
ListFormat.format:
SM allocates a 'list' of type ListFormatStringList and a FormatBuffer and pass them to Mozilla::intl::ListFormat, and Mozilla::intl::ListFormat will fill the result into FormatBuffer. -
ListFormat.formatToParts
Like format, SM also allocates a list and the parts of type ListPartVector, and Mozilla::intl::ListFormat will append the parts into ListPartVector.
The ListPartVector implementation is referenced from NumberPartVector from Mozilla::intl::NumberFormat
Updated•2 years ago
|
Reporter | ||
Comment 4•2 years ago
|
||
Still working on some tests.
Ah I missed this while reviewing. Happy to take another looks with more granular tests.
Assignee | ||
Comment 5•2 years ago
|
||
Hi Daniel, Greg and Zibi
For the "Formatting Operation API" in https://docs.google.com/document/d/1Mr16-dEoQxR-nGNB6RQmiZrTBqxCxHiI8n0WJpL7T38
I have some questions regarding to the formatToParts.
For the ListFormat.format(), it's implemented according to the gdoc,
js::intl::ListFormat provides a FormatBuffer, and mozilla::intl::ListFormat fills the buffer with FillBufferWithICUCall API.
But for ListFormat.formatToParts is a little more complicated.
So in ListFormat.formatToParts will return an array of parts according to https://tc39.es/ecma402/#sec-formatlisttoparts
In my patch the ListPart is represented as std::pair<ListPartType, Span<const char16_t>>,
my question is, who takes the ownership of the data in the mozilla::Span in this 'formatToParts' case?
I have some approaches.
-
As shown in my patch, the string of the part.value is from ICU (the variable 'formattedChars' in ListFormat::formatToParts), then mozilla::intl::ListFormat will pass the Span of the string (from ICU) to js::intl::ListFormat.
The downside is now we have to make sure the ICU string is valid, so I move ulistfmt_openResult to ListFormat cstor, and ulistfmt_closeResult to ListFormat dtor. Originally it was done in https://searchfox.org/mozilla-central/rev/3a8091d1c29473a0839ad7a5810028f41363fe2e/js/src/builtin/intl/ListFormat.cpp#293-298 -
js::intl::ListFormat will allocate a Vector<FormatBuffer<char16_t>>, and mozilla::ListFormat will fill those FormatBuffers in the vector.
-
Add a Callback parameter in the ListFormat::formatToParts.
-
Add another Listformat::closeParts interface and js::intl::ListFormat will call this interface after Spidermonkey has copied the data from the parts.
What do you think?
Thanks
Reporter | ||
Comment 6•2 years ago
|
||
- The downside is now we have to make sure the ICU string is valid
Yeah I think this is a bit risky. I word prefer avoiding "use after free" type of bugs if possible.
- js::intl::ListFormat will allocate a Vector<FormatBuffer<char16_t>>, and mozilla::ListFormat will fill those FormatBuffers in the vector.
This seems like it would work, but at the cost of being a little allocation heavy, or inefficient in memory usage. Tt might be weird to inflate the stack with lots of memory to hold all of it. For short lists you could avoid heap allocations, but I wouldn't want to have the stack size be too large here. The benefit here is that it's harder to use the API incorrectly.
- Add a Callback parameter in the ListFormat::formatToParts.
I think a callback parameter may be a reasonable solution, and allow the caller to handle the memory. This seems like a simple and safe implementation. I prefer to have something that is hard to do the wrong thing.
- Add another Listformat::closeParts interface and js::intl::ListFormat will call this interface after Spidermonkey has copied the data from the parts
This seems like a way to make 1. safer, but it's still possible to do the wrong thing. You could at least add assertions to check for incorrect behavior.
I'm having a little trouble visualizing all of the flows, but can you do option 3 with a callback, where the lifetime of the original string is guaranteed valid across the callback, and then work with Span<char16_t>
that point back to the valid string? Then SpiderMonkey could do the right thing from there with the Spans. Maybe that's what you are saying in option.
Updated•2 years ago
|
Assignee | ||
Comment 7•2 years ago
|
||
Thanks, will update the patch with method 3.
Updated•2 years ago
|
Assignee | ||
Comment 8•2 years ago
|
||
Assignee | ||
Comment 9•2 years ago
|
||
As discussed in https://bugzilla.mozilla.org/show_bug.cgi?id=1719747#c6,
In Part 1, the ownership of the Span in PartsVector is maintained by
ICU. This method adds a callback so js::intl::ListFormat could copy the
content of the Span to Spidermonkey, and mozilla::intl::ListFormat could
use ScopedICUObject to release the string owned by ICU earlier.
Updated•2 years ago
|
Assignee | ||
Comment 10•2 years ago
|
||
Ted, review ping.
Your review has been pending for weeks, if you are busy, can you suggest a different reviewer review it?
Or if Greg and Anba's reviews are enough, then I'll remove your r?.
I have been rebasing to the latest m-c many times to fix the conflict during my leave, now my leave is going to end however the patches are still waiting for your review.
Thanks
Updated•2 years ago
|
Comment 12•2 years ago
|
||
Pushed by allstars.chh@gmail.com: https://hg.mozilla.org/integration/autoland/rev/a83961b4491c Part 1: Unify ListFormat in SM. r=gregtatum,anba,platform-i18n-reviewers,tcampbell https://hg.mozilla.org/integration/autoland/rev/1aa6a3fcd632 Part 2: Use NewDenseFullyAllocatedArray. r=anba https://hg.mozilla.org/integration/autoland/rev/0433a2711b0a Part 3: ListFormat::FormatToParts takes a callback. r=gregtatum,tcampbell
Comment 13•2 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/a83961b4491c
https://hg.mozilla.org/mozilla-central/rev/1aa6a3fcd632
https://hg.mozilla.org/mozilla-central/rev/0433a2711b0a
Description
•