Open Bug 1406366 Opened 7 years ago Updated 2 years ago

Manually vectorize nsBidi::GetLogicalRun

Tracking

()

Status:

NEW

Tracking Flags:

Tracking

Status

firefox58

---

affected

People

(Reporter: xidorn, Unassigned)

References

(Blocks 1 open bug)

Details

Xidorn Quan [:xidorn] UTC+11

Reporter

Description

•

7 years ago

Before bug 1288761, nsBidi::GetLogicalRun uses ubidi_GetLogicalRun which is based on runs, so it's faster when there are a few runs but lots of characters.

Bug 1288761 changes it to a level-based algorithm to address O(n^2) performance issue when there are lots of runs. But its current implementation can be slow in case there are few runs but lots of characters, since it compares one byte each time.

We should be able to vectorize it to compare 4byte/8byte/16byte each time so that it can run faster.

I've verified that there is no compiler we use so far do this auto-vectorization.

Xidorn Quan [:xidorn] UTC+11

Reporter

Comment 1

•

7 years ago

With SSE2, we can do 16 bytes each loop, with intrinsics
* __m128i _mm_cmpeq_epi8 to compare 16 bytes together
* _mm_movemask_epi8 to compress the result into an int32_t
then we can locate the first one bit.

Bad side of using SSE2 is that we would probably need more tests to ensure its correctness because it cannot be generalized to a template function...

We can probably just write a 4/8 bytes version first...

Jonathan Kew [:jfkthame]

Comment 2

•

7 years ago

Before spending too much effort on this, we should profile (or instrument) some examples with really long direction runs, and see whether the current implementation actually takes enough time to be worth optimizing further.

Priority: -- → P3

BMO Automation

Updated

•

2 years ago

Severity: normal → S3

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Quick Search

Manually vectorize nsBidi::GetLogicalRun

Categories

(Core :: Layout: Text and Fonts, enhancement, P3)

Tracking

()

People

(Reporter: xidorn, Unassigned)

References

(Blocks 1 open bug)

Details

Crash Data

Security

(public)

User Story

Description

Comment 1

Comment 2

Updated