Closed Bug 1995626 Opened 5 months ago Closed 4 months ago

fromBase64/fromHex are slower than Chrome

Tracking

Status:

RESOLVED FIXED

Milestone:

146 Branch

Tracking Flags:

            Tracking  Status
firefox146  ---       fixed

People

(Reporter: mgaudet, Assigned: anba)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

Bug 1995626 - Part 1: Use a table lookup to replace HexDigitToNibbleOrInvalid. r=iain! (4 months ago, André Bargull [:anba], 48 bytes, text/x-phabricator-request)
Bug 1995626 - Part 2: Add separate loop to decode full chunks in FromBase64. r=iain! (4 months ago, André Bargull [:anba], 48 bytes, text/x-phabricator-request)

Matthew Gaudet (he/him) [:mgaudet]

Reporter

Description

•

5 months ago

From Comment 18 on Bug 1994067:

I see:

1.4x improvement in fromHex, but it's still ~2x slower than JS

1.3x improvement in fromBase64, but it's still ~1.7x slower than JS in SpiderMonkey (or 2x slower than JS in Firefox)

4.3x improvement in toHex, that is now faster than JS (but ~2.7x slower than v8)

3.7x improvement in toBase64, that is now faster than JS in SpiderMonkey and in Firefox (but still ~8x slower than v8)

I think the original issue still stands for fromHex and fromBase64, and this should be reopened.
As in: JS is still faster for those two.

See that bug for test cases as well.

Matthew Gaudet (he/him) [:mgaudet]

Reporter

Updated

•

5 months ago

Severity: -- → S3

Priority: -- → P3

Iain Ireland [:iain]

Comment 1

•

5 months ago

Note: further improvements here are likely to involve using SIMD instructions. Anba made all the easy perf fixes in the previous bug.

It looks like JSC and V8 are both using the simdutf library in (at least) the fromBase64 case.

Nikita Skovoroda

Comment 2

•

4 months ago

The issue inherited from Bug 1994067 is that the fromHex/fromBase64 implementations in Firefox are 2x slower than a JS-based implementation run in Firefox

Even disregarding Chrome/WebKit

André Bargull [:anba]

Assignee

Comment 3

•

4 months ago

Attached file Bug 1995626 - Part 1: Use a table lookup to replace HexDigitToNibbleOrInvalid. r=iain!

Use a table lookup to replace HexDigitToNibbleOrInvalid. The decode table has
256 entries so that Latin-1 characters can be decoded branch-free. The table
element type is int8_t, to keep the table size small. The elements are later
loaded as int32_t for faster error detection.

The generated code below decodes four characters. It was extracted from a
standalone C++ implementation, but it should be similar enough to the code
generated for FromHex:

;; Load four characters
movzx   eax, byte ptr [rdi]
movzx   ecx, byte ptr [rdi + 1]
movzx   edx, byte ptr [rdi + 2]
movzx   esi, byte ptr [rdi + 3]

;; Decode table
lea     rdi, [rip + Hex::Table]

;; Decode c2, sign-extend int8 to int32
movsx   edx, byte ptr [rdx + rdi]
shl     edx, 12

;; Decode c3, ...
movsx   esi, byte ptr [rsi + rdi]
shl     esi, 8
or      esi, edx

;; Decode c0, ...
movsx   edx, byte ptr [rax + rdi]
shl     edx, 4
or      edx, esi

;; Decode c1, ...
movsx   eax, byte ptr [rcx + rdi]
or      eax, edx

;; Check SF set by previous or-instruction
js      .invalid_char
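
The table-lookup idea described in this comment can be sketched in standalone C++ roughly as follows. This is only an illustration under stated assumptions, not the SpiderMonkey patch: the names kHexTable, HexLookup, and DecodeHex are hypothetical, and the output goes into a std::vector rather than a typed array. As in the listing above, four table entries are sign-extended, shifted, and OR-ed together, so a single sign-bit test detects an invalid character in any of the four positions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical 256-entry table: hex digits map to 0..15, all other bytes to -1.
static std::array<int8_t, 256> MakeHexTable() {
  std::array<int8_t, 256> table;
  table.fill(-1);
  for (int i = 0; i < 10; i++) {
    table[static_cast<size_t>('0' + i)] = static_cast<int8_t>(i);
  }
  for (int i = 0; i < 6; i++) {
    table[static_cast<size_t>('a' + i)] = static_cast<int8_t>(10 + i);
    table[static_cast<size_t>('A' + i)] = static_cast<int8_t>(10 + i);
  }
  return table;
}
static const std::array<int8_t, 256> kHexTable = MakeHexTable();

// Sign-extend the int8_t entry to 32 bits, so that -1 becomes 0xFFFFFFFF.
static uint32_t HexLookup(unsigned char c) {
  return static_cast<uint32_t>(static_cast<int32_t>(kHexTable[c]));
}

// Decodes `hex` into `out`. Returns false for an odd length or a non-hex
// character; on failure `out` may hold a partially decoded prefix.
bool DecodeHex(const std::string& hex, std::vector<uint8_t>& out) {
  const unsigned char* src = reinterpret_cast<const unsigned char*>(hex.data());
  const size_t length = hex.size();
  out.clear();
  if (length % 2 != 0) return false;

  const uint32_t kSignBit = 0x80000000u;
  size_t i = 0;
  // Four characters (two output bytes) per iteration, one error check for all.
  for (; length - i >= 4; i += 4) {
    uint32_t bits = (HexLookup(src[i]) << 4) | HexLookup(src[i + 1]) |
                    (HexLookup(src[i + 2]) << 12) | (HexLookup(src[i + 3]) << 8);
    // Any invalid entry leaves bit 31 set, even after shifting.
    if (bits & kSignBit) return false;
    out.push_back(static_cast<uint8_t>(bits & 0xFF));
    out.push_back(static_cast<uint8_t>(bits >> 8));
  }
  // The length is even, so at most one pair of characters remains.
  if (i < length) {
    uint32_t bits = (HexLookup(src[i]) << 4) | HexLookup(src[i + 1]);
    if (bits & kSignBit) return false;
    out.push_back(static_cast<uint8_t>(bits));
    i += 2;
  }
  assert(i == length);
  return true;
}
```

Because the table covers all 256 byte values, no range check is needed before indexing, which is what makes the lookup branch-free for Latin-1 input.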

Phabricator Automation

Updated

•

4 months ago

Assignee: nobody → andrebargull

Status: NEW → ASSIGNED

André Bargull [:anba]

Assignee

Comment 4

•

4 months ago

Attached file Bug 1995626 - Part 2: Add separate loop to decode full chunks in FromBase64. r=iain!

Two changes:

1. Extend the decode table to 256 elements and change the element type to
   int8_t. This matches the changes from part 1.
2. Add a separate loop to process full chunks. This saves additional branches,
   because we no longer have to check if the output is full for each character
   read. Also try to read four consecutive characters if possible and treat
   whitespace characters in a slow path.
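
A rough standalone C++ sketch of the full-chunk loop described above, under several assumptions: the names kBase64Table, Base64Lookup, ReadChunkSlow, DecodeFullChunks, and DecodeResult are hypothetical, only the standard alphabet is handled, and padding or a trailing partial chunk is left to the caller. It is meant to show the shape of the technique (one capacity check per three output bytes, a four-character fast path, whitespace handled in a slow path), not the actual patch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical 256-entry table: alphabet characters map to 0..63, others to -1.
static std::array<int8_t, 256> MakeBase64Table() {
  std::array<int8_t, 256> table;
  table.fill(-1);
  const char* alphabet =
      "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  for (int i = 0; i < 64; i++) {
    table[static_cast<unsigned char>(alphabet[i])] = static_cast<int8_t>(i);
  }
  return table;
}
static const std::array<int8_t, 256> kBase64Table = MakeBase64Table();

// Sign-extend the int8_t entry to 32 bits, so that -1 becomes 0xFFFFFFFF.
static uint32_t Base64Lookup(unsigned char c) {
  return static_cast<uint32_t>(static_cast<int32_t>(kBase64Table[c]));
}

static bool IsAsciiWhitespace(unsigned char c) {
  return c == ' ' || c == '\t' || c == '\n' || c == '\f' || c == '\r';
}

enum class SlowStatus { Chunk, Partial, Invalid };

// Slow path: gather four sextets one character at a time, skipping whitespace.
static SlowStatus ReadChunkSlow(const unsigned char* src, size_t length,
                                size_t& pos, uint32_t& bits) {
  bits = 0;
  for (int count = 0; count < 4;) {
    if (pos == length) return SlowStatus::Partial;
    unsigned char c = src[pos++];
    if (IsAsciiWhitespace(c)) continue;
    if (c == '=') return SlowStatus::Partial;  // padding: not a full chunk
    int8_t value = kBase64Table[c];
    if (value < 0) return SlowStatus::Invalid;
    bits = (bits << 6) | static_cast<uint32_t>(value);
    count++;
  }
  return SlowStatus::Chunk;
}

struct DecodeResult {
  size_t read;     // input characters consumed by full chunks
  size_t written;  // output bytes produced
  bool ok;         // false if an invalid character was seen
};

// Decodes only full four-character chunks into `out` (at most `capacity` bytes).
DecodeResult DecodeFullChunks(const std::string& input, uint8_t* out,
                              size_t capacity) {
  const unsigned char* src =
      reinterpret_cast<const unsigned char*>(input.data());
  const size_t length = input.size();
  const uint32_t kSignBit = 0x80000000u;
  size_t read = 0;
  size_t written = 0;

  // The output space is checked once per chunk, not once per character.
  while (capacity - written >= 3) {
    size_t pos = read;
    uint32_t bits = kSignBit;
    if (length - pos >= 4) {
      // Fast path: four consecutive characters, one combined error check.
      bits = (Base64Lookup(src[pos]) << 18) | (Base64Lookup(src[pos + 1]) << 12) |
             (Base64Lookup(src[pos + 2]) << 6) | Base64Lookup(src[pos + 3]);
      pos += 4;
    }
    if (bits & kSignBit) {
      // Whitespace, padding, an invalid character, or too little input left.
      pos = read;
      SlowStatus status = ReadChunkSlow(src, length, pos, bits);
      if (status == SlowStatus::Invalid) return {read, written, false};
      if (status == SlowStatus::Partial) break;
    }
    out[written++] = static_cast<uint8_t>(bits >> 16);
    out[written++] = static_cast<uint8_t>(bits >> 8);
    out[written++] = static_cast<uint8_t>(bits);
    read = pos;
  }
  assert(written <= capacity);
  return {read, written, true};
}
```

In this sketch the caller would continue from `read` with a per-character loop to handle padding, a final partial chunk, or an output buffer with fewer than three bytes left.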

André Bargull [:anba]

Assignee

Comment 5

•

4 months ago

These two patches should help to make the fromHex and fromBase64 cases noticeably faster.

André Bargull [:anba]

Assignee

Comment 6

•

4 months ago

I've filed bug 1996197 for another possible fromBase64 optimisation.

Pulsebot

Comment 7

•

4 months ago

Pushed by andre.bargull@gmail.com:
https://github.com/mozilla-firefox/firefox/commit/37920fe3b2c1
https://hg.mozilla.org/integration/autoland/rev/1291aaccc5b7
Part 1: Use a table lookup to replace HexDigitToNibbleOrInvalid. r=iain
https://github.com/mozilla-firefox/firefox/commit/0eb57901407a
https://hg.mozilla.org/integration/autoland/rev/963a08d7c24a
Part 2: Add separate loop to decode full chunks in FromBase64. r=iain

Pulsebot

Comment 8

•

4 months ago

Pushed by amarc@mozilla.com:
https://github.com/mozilla-firefox/firefox/commit/12d2f3d8575d
https://hg.mozilla.org/integration/autoland/rev/d33331bedf01
Revert "Bug 1995626 - Part 2: Add separate loop to decode full chunks in FromBase64. r=iain" for causing SM bustages @ TypedArrayObject.cpp

amarc

Comment 9

•

4 months ago

Backed out for causing SM bustages @ TypedArrayObject.cpp

Flags: needinfo?(andrebargull)

André Bargull [:anba]

Assignee

Updated

•

4 months ago

Flags: needinfo?(andrebargull)

Pulsebot

Comment 10

•

4 months ago

Pushed by andre.bargull@gmail.com:
https://github.com/mozilla-firefox/firefox/commit/8befa9c7c4b6
https://hg.mozilla.org/integration/autoland/rev/f621bb9d2ef2
Part 1: Use a table lookup to replace HexDigitToNibbleOrInvalid. r=iain
https://github.com/mozilla-firefox/firefox/commit/336e5db150e3
https://hg.mozilla.org/integration/autoland/rev/2666f184df80
Part 2: Add separate loop to decode full chunks in FromBase64. r=iain

Nikita Skovoroda

Comment 11

•

4 months ago

Will test after this gets into a nightly build

Nikita Skovoroda

Comment 12

•

4 months ago

Tested on https://archive.mozilla.org/pub/firefox/integration/autoland/2025/10/25/860fd0ffd63968ac63e185431cd3612c8b2cf125/

Yes, that fixed it, thanks!

I see a ~4.8x improvement on both fromHex and fromBase64.
V8 is still faster, but now only by ~1.7-2x.

Native is now faster than JS for all four methods, so I think this can be closed.

Nikita Skovoroda

Comment 13

•

4 months ago

(While this is fixed, the numbers from the build above are likely lower than what will actually be in Nightly, as the integration build is apparently less optimized.)

Sandor Molnar[:smolnar]

Comment 14

•

4 months ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/f621bb9d2ef2
https://hg.mozilla.org/mozilla-central/rev/2666f184df80

Status: ASSIGNED → RESOLVED

Closed: 4 months ago

status-firefox146: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 146 Branch

André Bargull [:anba]

Assignee

Comment 15

•

4 months ago

Thanks for verifying that the changes improved performance!

Camelia Badau [:cbadau], Desktop Test Engineering

Updated

•

4 months ago

QA Whiteboard: [qa-triage-done-c147/b146]

Denis Palmeiro [:denispal]

Updated

•

3 months ago
