Complain about attempts to vendor crates that duplicate ICU4X functionality
Categories
(Developer Infrastructure :: Mach Vendor & Updatebot, enhancement)
Tracking
(firefox144 fixed)
People
(Reporter: hsivonen, Assigned: hsivonen)
Attachments
(1 file)
Bug 1948330 added a transitive dependency on unic-ucd-ident (and its unic-* dependencies), which last saw development activity on 2020-10-21 and is therefore out of date relative to the latest Unicode version. Additionally, the duplication with icu_properties is bad for binary size. (This is now being addressed.)
mach vendor rust should probably complain about attempts to vendor crates that duplicate functionality in ICU4X, which we're trying to migrate towards.
In addition to the Unicode-aware functionality in the standard library, the Rust ecosystem has four sets of crates for Unicode stuff:
- ICU4X (crates prefixed with icu_; this is what we are trying to migrate towards)
- unicode-rs (crates prefixed with unicode-; a loose constellation, not all managed by https://github.com/unicode-rs/ )
- UNIC (crates prefixed with unic-)
- rust_icu (crates prefixed with rust_icu_; not actually Rust internals but bindings for ICU4C)
I suggest adding the following denylist to mach vendor rust:
- Block vendoring of crates whose name starts with unic-, with the exception of unic-langid and unic-langid-ffi. Rationale: UNIC in general is unmaintained and not being updated to new Unicode versions. unic-langid is maintained, though, and is used widely enough in Gecko to require a more careful migration plan to icu_locale.
- Block vendoring of crates whose name starts with rust_icu. Rationale: These are bindings for ICU4C, and we're trying to migrate away from ICU4C towards ICU4X. (The risk of accidentally vendoring these is low, but it's also an easy denylist item.)
- Block vendoring of unicode-normalization, unicode-segmentation, unicode-ccc, unicode-canonical-combining-class, unicode-general-category, unicode-joining-type, and unicode-case-mapping. Rationale: These would be duplicative relative to icu_normalizer, icu_segmenter, icu_properties, and icu_casemap. (It's not practical to deny-list everything that starts with unicode-. Notably, unicode-bidi as a whole is out of scope for ICU4X and we use unicode-bidi with Unicode Database things redirected to ICU4X. At present, ICU4X does not have the API surface of unicode-width even though it has the raw East Asian Width data.)
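To make the three rules above concrete, here is a minimal Python sketch of the name check. It is not the actual mach vendor rust code: the names ALLOWED_UNIC, DENIED_UNICODE_RS, and is_denied are hypothetical, and the sketch follows the wording of the proposal literally.

```python
# Hypothetical sketch of the proposed denylist; not the real mach vendor code.

# Exceptions to the unic- prefix rule (still widely used in Gecko).
ALLOWED_UNIC = {"unic-langid", "unic-langid-ffi"}

# unicode-rs crates that duplicate icu_normalizer, icu_segmenter,
# icu_properties, or icu_casemap.
DENIED_UNICODE_RS = {
    "unicode-normalization",
    "unicode-segmentation",
    "unicode-ccc",
    "unicode-canonical-combining-class",
    "unicode-general-category",
    "unicode-joining-type",
    "unicode-case-mapping",
}


def is_denied(crate_name: str) -> bool:
    """Return True if vendoring crate_name should be rejected."""
    # Rule 1: UNIC crates, except the unic-langid exceptions.
    if crate_name.startswith("unic-"):
        return crate_name not in ALLOWED_UNIC
    # Rule 2: ICU4C bindings.
    if crate_name.startswith("rust_icu"):
        return True
    # Rule 3: explicitly listed unicode-rs crates.
    return crate_name in DENIED_UNICODE_RS
```

With this sketch, unic-ucd-ident and unicode-normalization are rejected, while unic-langid, unicode-bidi, and unicode-width pass.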
Comment 1•3 months ago
It would be great for us to also be able to migrate onwards from the unic-langid crates, as they're in practice only "maintained" in the sense that if we get something like bug 1917175 or bug 1872962 reported, I'll need to provide a fix for the upstream packages. But as noted above, such a migration would need more care and attention than has been available so far, and would need to include work on the fluent.rs crates, which are no longer maintained by Mozilla, but by outside contributors.
Comment 2•3 months ago
(In reply to Henri Sivonen (:hsivonen) from comment #0)
> It's not practical to deny-list everything that starts with unicode-.
On second thought, perhaps we should try denylisting crates whose name starts with unicode- and make exceptions for unicode-bidi, unicode-bidi-ffi, unicode-width, and unicode-ident and see how it goes. (unicode-ident seems to be used enough that it's not practical to denylist it at this point in time. It seems that unicode-normalization has been re-introduced while I wasn't looking. I'll file a separate bug about that.)
While at it, let's put feruca on a mach vendor rust denylist. It is duplicative with icu_collator (bug 1937541) but less complete.
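The revised idea in this comment (deny the whole unicode- prefix with a short exception list, plus feruca) could be sketched as below. This is only an illustration under assumptions: UNICODE_PREFIX_EXCEPTIONS and complaint_for are hypothetical names, and the message strings are made up for the example rather than taken from mach.

```python
# Hypothetical sketch of the revised rule; not the real mach vendor code.
from typing import Optional

# Crates allowed despite the unicode- prefix, per this comment.
UNICODE_PREFIX_EXCEPTIONS = {
    "unicode-bidi",
    "unicode-bidi-ffi",
    "unicode-width",
    "unicode-ident",
}


def complaint_for(crate_name: str) -> Optional[str]:
    """Return a complaint message for a denied crate, or None if allowed."""
    if crate_name == "feruca":
        return "feruca duplicates icu_collator; use ICU4X instead"
    if (
        crate_name.startswith("unicode-")
        and crate_name not in UNICODE_PREFIX_EXCEPTIONS
    ):
        return f"{crate_name} is on the unicode- denylist; check for an ICU4X equivalent"
    return None
```

Returning a message instead of a bare boolean lets the caller tell the developer why the crate was rejected, which fits the "complain" wording of the bug title.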
Comment 3•3 months ago
unicase vs icu_casemap might be the same issue.
Comment 5•3 months ago
(In reply to Makoto Kato [:m_kato] from comment #3)
> unicase vs icu_casemap might be the same issue.
Indeed. Let's talk about the Application Services aspect in bug 1986265. I also filed bug 1986401.