Use char rather than uint8_t for utf-8 in unified components
Categories
(Core :: Internationalization, enhancement)
Tracking
| | Tracking | Status |
|---|---|---|
| firefox92 | --- | fixed |
People
(Reporter: dminor, Assigned: dminor)
Details
(Whiteboard: [i18n-unification])
Attachments
(1 file)
We should replace uint8_t with char in the unified components. I originally used uint8_t for compatibility with Rust, which uses u8, but the ICU4X C FFI has been developed around char instead.
This is pending the resolution of https://github.com/unicode-org/icu4x/issues/769, in case we end up choosing something different there.
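To illustrate the kind of change described above, here is a minimal sketch of a UTF-8 buffer crossing a C FFI boundary that is declared in terms of char. Every name in it (example_count_code_points, CountFromBytes, CountFromChars) is hypothetical and exists only for illustration; none of it is the actual unified components code or a real ICU4X function. It only shows that a caller holding uint8_t data needs a cast at the boundary, while a caller holding char data does not.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for a C FFI entry point that takes UTF-8 as char.
// It counts code points by skipping continuation bytes (0b10xxxxxx).
extern "C" size_t example_count_code_points(const char* utf8, size_t len) {
  assert(utf8 != nullptr || len == 0);
  size_t count = 0;
  for (size_t i = 0; i < len; i++) {
    // Convert to an unsigned byte before masking so the result does not
    // depend on whether plain char is signed on this platform.
    unsigned char byte = static_cast<unsigned char>(utf8[i]);
    if ((byte & 0xC0) != 0x80) {
      count++;
    }
  }
  return count;
}

// Caller that stores UTF-8 as uint8_t: a cast is needed at the boundary.
size_t CountFromBytes(const uint8_t* bytes, size_t len) {
  return example_count_code_points(reinterpret_cast<const char*>(bytes), len);
}

// Caller that stores UTF-8 as char: the buffer is passed through unchanged.
size_t CountFromChars(const std::string& utf8) {
  return example_count_code_points(utf8.data(), utf8.size());
}
```

Both callers see the same bytes; the only difference is where the pointer type has to be adjusted.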
Comment 2•2 years ago
My gut instinct would be to prefer an unsigned type here; UTF-8 uses (virtually) the full range of byte values from 0x00 to 0xFF, and having half of them be negative on the C side can be something of a footgun.
But I guess we need to see where the ICU4X issue goes....
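The concern about negative values can be sketched as follows. Whether plain char is signed is implementation-defined, so widening a UTF-8 byte of 0x80 or above straight to int gives a negative number on platforms where char is signed. The helper names below are hypothetical, and the sketch assumes 8-bit bytes; it is an illustration of the pitfall, not code from the patch.

```cpp
#include <cassert>
#include <climits>

// True where plain char is a signed type (implementation-defined).
constexpr bool kCharIsSigned = CHAR_MIN < 0;

// Widening directly: the byte 0xC3 becomes -61 where char is signed
// and 195 where it is unsigned.
int WidenDirectly(char c) { return c; }

// Converting through unsigned char first always yields 0..UCHAR_MAX.
int WidenAsByte(char c) {
  int value = static_cast<unsigned char>(c);
  assert(value >= 0 && value <= UCHAR_MAX);
  return value;
}

// Non-ASCII test written so that it does not depend on char signedness.
bool IsNonAscii(char c) { return static_cast<unsigned char>(c) >= 0x80; }
```

Code that compares bytes against constants such as 0x80, or uses them as table indices, is where the direct form tends to go wrong.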
Comment 3•2 years ago (Assignee)
My take on the upstream issue is that the decision is to use char, and the issue remains open to update the unit tests where needed. I'll double check.
Comment 4•2 years ago (Assignee)
Confirmed that upstream is going to use char.
Comment 5•2 years ago (Assignee)
Pushed by dminor@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5e6a1afb2e39
Use char rather than uint8_t for utf-8 in unified components r=platform-i18n-reviewers,gregtatum
Comment 7•2 years ago
bugherder