Avoid calling std::tolower in hex-conversions, replace JS7_xxx macros, and remove remaining calls to ctype.h functions

RESOLVED FIXED in Firefox 68

Status

()

task
RESOLVED FIXED
3 months ago
2 months ago

People

(Reporter: anba, Assigned: anba)

Tracking

Trunk
mozilla68
Points:
---

Firefox Tracking Flags

(firefox68 fixed)

Details

Attachments

(10 attachments)

47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
Assignee

Description

3 months ago

While rebasing bug 1421400, I saw that JS7_UNHEX is calling std::tolower to convert A-F to a-f. mozilla::AsciiAlphanumericToNumber provides a more efficient and modern alternative. For example the following µ-benchmark improves from 545 ms to 415 ms for me with the forthcoming patches for this bug:

function f() {
    var xs = ["%E0%BF%BF".repeat(10),"%E0%BF%BF".repeat(10)];
    var q = 0;
    var t = dateNow();
    for (var i = 0; i < 1000000; ++i) {
        q += decodeURI(xs[i & 1]).length;
    }
    return [dateNow() - t, q];
}

But why stop with only replacing JS7_UNHEX, the other JS7_xxx macros can be modernised as well. And then we might as well just replace the remaining callers to ctype.h functions to use the alternatives from mozilla/TextUtils.h (mozilla::IsAsciiDigit for std::isdigit, mozilla::IsAsciiAlpha for std::isalpha), util/Unicode.h (unicode::ToLowerCase for std::tolower, unicode::IsSpace for std::isspace), or when no alternative exists, simply add it (std::isprint).

Assignee

Comment 1

3 months ago

This avoids a call to std::tolower, making hex-conversions slightly faster.

Assignee

Comment 2

3 months ago

Clang and GCC generate slightly better assembly when IsAsciiHexDigit is called,
because the cmp instruction for the < 127 check in JS7_ISHEX is no longer
emitted.

Depends on D26504

Assignee

Comment 3

3 months ago

JS7_ISOCT and JS7_UNOCT were only used in TokenStream, so the new functions were
directly moved into that file instead of adding them to util/Text.h.

Depends on D26505

Assignee

Comment 4

3 months ago

js::AsciiDigitToNumber is an optimised version of mozilla::AsciiAlphanumericToNumber
for known ASCII digit-only cases, which avoids the extra comparisons for ASCII
alphabetical characters. This ensures replacing JS_UNDEC with js::AsciiDigitToNumber
still emits the same assembly.

Depends on D26506

Assignee

Comment 5

3 months ago

mozilla::IsAsciiDigit is equivalent to std::isdigit, except it's not necessary
to worry about UB when calling it with an input which can't be represented as
unsigned char.

Depends on D26507

Assignee

Comment 6

3 months ago

Provide js::IsAsciiPrintable as a safe alternative to std::isprint, which doesn't
lead to UB for inputs not representable as unsigned char and which also doesn't
depend on the current locale.

Depends on D26508

Assignee

Comment 7

3 months ago

std::tolower can be safely replaced with js::unicode::ToLowerCase in both contexts.

Depends on D26509

Assignee

Comment 8

3 months ago

More removal of ctype functions.

Depends on D26510

Assignee

Comment 9

3 months ago

Remove the last remaining call to ctype functions.

Depends on D26511

Comment 12

2 months ago

Pushed by nerli@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5b6bf24882ce
Part 1: Replace JS7_UNHEX with mozilla::AsciiAlphanumericToNumber. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/bac7b9ec895c
Part 2: Replace JS7_ISHEX with mozilla::IsAsciiHexDigit. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/ae1ac8554873
Part 3: Replace JS7_ISOCT and JS7_UNOCT macros with proper functions. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/544ee6df501b
Part 4: Replace JS_UNDEC with js::AsciiDigitToNumber. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/21497aef0046
Part 5: Replace std::isdigit with mozilla::IsAsciiDigit. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/e130d82fa361
Part 6: Replace std::isprint with js::IsAsciiPrintable. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/ed1ef9737580
Part 7: Replace std::tolower with js::unicode::ToLowerCase. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/ad9ff91000a9
Part 8: Replace std::isalpha with mozilla::IsAsciiAlpha. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/e28d3c25f82c
Part 9: Replace std::isspace with js::unicode::IsSpace. r=jwalden
https://hg.mozilla.org/integration/autoland/rev/27abf0e843c3
Part 10: Remove unnecessary includes for ctype.h. r=jwalden

Keywords: checkin-needed
You need to log in before you can comment on or make changes to this bug.