Closed Bug 1431356 Opened 3 years ago Closed 3 years ago

Update encoding_rs to 0.7.2

Categories

(Core :: Internationalization, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla60
Tracking Status
firefox60 --- fixed

People

(Reporter: hsivonen, Assigned: hsivonen)

References

Details

Attachments

(1 file)

encoding_rs adds a new mem module for dealing with in-memory text (as opposed to data coming from or going to an IO boundary). Docs: https://docs.rs/encoding_rs/0.7.2/encoding_rs/mem/index.html

This module a prerequisite for bug 1402247 and bug 1431025.

The code doesn't look so reviewable, so it's probably better to review the process instead.

There is a (slower) second implementation of the same functions in 100% safe and clean Rust: https://github.com/hsivonen/safe_encoding_rs_mem/blob/master/src/lib.rs

Then there's a fuzzer that checks that the faster and messier implementation agrees with the safe and clean one: https://github.com/hsivonen/encoding_rs/blob/master/fuzz/fuzzers/fuzz_mem.rs

Hopefully reviewing safe_encoding_rs_mem and fuzz_mem is convincing that encoding_rs::mem is worthy of landing.
It's intentional that no Gecko callers of encoding_rs::mem are introduced in this bug.
(In reply to Henri Sivonen (:hsivonen) from comment #1)
> It's intentional that no Gecko callers of encoding_rs::mem are introduced in
> this bug.

Could you attach Gecko callers to blocking bugs? I would like to see how these functions are used.
(In reply to Masatoshi Kimura [:emk] from comment #3)
> (In reply to Henri Sivonen (:hsivonen) from comment #1)
> > It's intentional that no Gecko callers of encoding_rs::mem are introduced in
> > this bug.
> 
> Could you attach Gecko callers to blocking bugs? I would like to see how
> these functions are used.

There's now one example in https://reviewboard.mozilla.org/r/213924/diff/1#index_header

I'll link to more examples as I write them.

In general, str is for Rust code to call directly, UTF-16 is for nsAString-oriented Gecko code and non-str UTF-8 is for nsACString Gecko code that forgets whether it's valid UTF-8.

Some of the UTF-8 things are written in anticipation of the new XML parser. (It's unclear if calling is_utf8_bidi() is ever going to be a good idea. At least it should not bloat our binary size as long as there are no callers.)
(In reply to Henri Sivonen (:hsivonen) from comment #4)
> I'll link to more examples as I write them.

There's now more WIP code associated with bug 1402247.
Does the code on ReviewBoard in bug 1402247 and in bug 1431025 give a sufficient demonstration of usage?
Flags: needinfo?(VYV03354)
Comment on attachment 8943557 [details]
Bug 1431356 - Update encoding_rs to 0.7.2 and simd to 0.2.1. .

https://reviewboard.mozilla.org/r/213906/#review220872

I didn't actually read thorough the SIMD-optimized code.
Attachment #8943557 - Flags: review?(VYV03354) → review+
Flags: needinfo?(VYV03354)
Thank you.
Pushed by hsivonen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d02263dd031f
Update encoding_rs to 0.7.2 and simd to 0.2.1. r=emk.
https://hg.mozilla.org/mozilla-central/rev/d02263dd031f
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
Depends on: 1433747
You need to log in before you can comment on or make changes to this bug.