Closed Bug 1395527 Opened 7 years ago Closed 7 years ago

Delegate IsASCII and IsUTF8 to encoding_rs

Categories

(Core :: XPCOM, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox57 --- fixed

People

(Reporter: hsivonen, Assigned: hsivonen)

Details

Attachments

(2 files)

Consolidate implementations if faster.
See https://groups.google.com/d/msg/mozilla.dev.platform/k8bKVfUx4h4/_9n3qXQjBwAJ for the behavior change to IsUTF8(). TL;DR: I believe it was wrong to introduce an option for WebSockets to opt out of a quirk that no one was sure was necessary to begin with and, instead, we should give the non-quirky behavior that WebSockets need to all callers.
Nevermind. Comment 6 tested wrong revision.
Attachment #8903123 - Flags: review?(nfroyd)
Attachment #8903503 - Flags: review?(nfroyd)
Comment on attachment 8903123 [details]
Bug 1395527 part 1 - Add microbenchmarks for IsASCII and IsUTF8.

https://reviewboard.mozilla.org/r/174924/#review180610

r=me assuming we are reasonably confident the compiler is not magically making this go away.

::: xpcom/tests/gtest/TestStrings.cpp:1368
(Diff revision 2)
> +    nsCString test(OneASCII);
> +    for (int i = 0; i < 200000; i++) {
> +      IsUTF8(test);
> +    }

How confident are we that the compiler is not seeing this as a no-op function and optimizing the whole thing out?  Have you, say, looked at the benchmark assembly on at least one different platform?  (Similarly for the remaining tests.)

I suppose the compiler simply considers IsUTF8 to be a black box that could have unknown side effects at this time, but a smarter, LTO-aware compiler might not seen things that way...or if we added compiler-specific annotations to IsUTF8 to tell the compiler it didn't modify global memory or similar.)
Attachment #8903123 - Flags: review?(nfroyd) → review+
Comment on attachment 8903503 [details]
Bug 1395527 part 2 - Delegate IsASCII and IsUTF8 to encoding_rs.

https://reviewboard.mozilla.org/r/175336/#review180608

Well, that's a nice pile of code we don't have to maintain any more.  Thank you!
Attachment #8903503 - Flags: review?(nfroyd) → review+
Comment on attachment 8903123 [details]
Bug 1395527 part 1 - Add microbenchmarks for IsASCII and IsUTF8.

https://reviewboard.mozilla.org/r/174924/#review180610

> How confident are we that the compiler is not seeing this as a no-op function and optimizing the whole thing out?  Have you, say, looked at the benchmark assembly on at least one different platform?  (Similarly for the remaining tests.)
> 
> I suppose the compiler simply considers IsUTF8 to be a black box that could have unknown side effects at this time, but a smarter, LTO-aware compiler might not seen things that way...or if we added compiler-specific annotations to IsUTF8 to tell the compiler it didn't modify global memory or similar.)

I'm not at all confident that the short-string benches aren't optimized away, but I'm confident the old code couldn't have been faster for short strings that the new tiny loops that inline. I'm confident that cases that call into Rust can't be optimized away by the C++ compiler.

Do we have a black box function in gtest to inhibit code elimination? E.g. `BlackBox(void*)` such that does nothing with the argument in such a way that the compiler can't tell it's nothing?
Comment on attachment 8903123 [details]
Bug 1395527 part 1 - Add microbenchmarks for IsASCII and IsUTF8.

https://reviewboard.mozilla.org/r/174924/#review180610

> I'm not at all confident that the short-string benches aren't optimized away, but I'm confident the old code couldn't have been faster for short strings that the new tiny loops that inline. I'm confident that cases that call into Rust can't be optimized away by the C++ compiler.
> 
> Do we have a black box function in gtest to inhibit code elimination? E.g. `BlackBox(void*)` such that does nothing with the argument in such a way that the compiler can't tell it's nothing?

Filed bug 1396208.
Comment on attachment 8903123 [details]
Bug 1395527 part 1 - Add microbenchmarks for IsASCII and IsUTF8.

https://reviewboard.mozilla.org/r/174924/#review180610

I'm reasonably confident that part 2 is a real improvement, so I'm going to land these with a follow-up filed to make the benchmarks more trustworthy for the parts that the C++ optimizer can see. (I'm confident it can't see Rust code, so Rust code showing an improvement over C++ indicates that the baseline C++ didn't get optimized away.)
Pushed by hsivonen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/40934888ed06
part 1 - Add microbenchmarks for IsASCII and IsUTF8. r=froydnj
https://hg.mozilla.org/integration/autoland/rev/d59b62713c66
part 2 - Delegate IsASCII and IsUTF8 to encoding_rs. r=froydnj
Backed out for Android bustage at uriloader/exthandler/android/nsMIMEInfoAndroid.cpp:62: too many arguments to function 'bool IsUTF8(const nsACString&)':

https://hg.mozilla.org/integration/autoland/rev/e990298e1596c5d5df0a73c929c0dacef88f5e53
https://hg.mozilla.org/integration/autoland/rev/cae6eeaf3f0c73cb62b87f465e9dd69ea2132bd3

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=d59b62713c66023954833e89374c5ecc2b92df72&filter-resultStatus=testfailed&filter-resultStatus=runnable&filter-resultStatus=busted&filter-resultStatus=usercancel&filter-resultStatus=retry&filter-resultStatus=exception
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=127968819&repo=autoland

[task 2017-09-02T08:33:06.363264Z] 08:33:06     INFO -  In file included from /builds/worker/workspace/build/src/obj-firefox/uriloader/exthandler/Unified_cpp_uriloader_exthandler0.cpp:65:0:
[task 2017-09-02T08:33:06.365832Z] 08:33:06     INFO -  /builds/worker/workspace/build/src/uriloader/exthandler/android/nsMIMEInfoAndroid.cpp: In static member function 'static bool nsMIMEInfoAndroid::GetMimeInfoForMimeType(const nsACString&, nsMIMEInfoAndroid**)':
[task 2017-09-02T08:33:06.365942Z] 08:33:06     INFO -  /builds/worker/workspace/build/src/uriloader/exthandler/android/nsMIMEInfoAndroid.cpp:62:30: error: too many arguments to function 'bool IsUTF8(const nsACString&)'
[task 2017-09-02T08:33:06.366431Z] 08:33:06     INFO -     if (!IsUTF8(aMimeType, true))
[task 2017-09-02T08:33:06.366755Z] 08:33:06     INFO -                                ^
Flags: needinfo?(hsivonen)
Pushed by hsivonen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/06b701902e64
part 1 - Add microbenchmarks for IsASCII and IsUTF8. r=froydnj
https://hg.mozilla.org/integration/autoland/rev/f62dccfbc68e
part 2 - Delegate IsASCII and IsUTF8 to encoding_rs. r=froydnj
Flags: needinfo?(hsivonen)
https://hg.mozilla.org/mozilla-central/rev/06b701902e64
https://hg.mozilla.org/mozilla-central/rev/f62dccfbc68e
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
You need to log in before you can comment on or make changes to this bug.