Experiment with removing char16_t JS parser
Categories
(Core :: JavaScript Engine, task, P2)
People
(Reporter: tcampbell, Unassigned)
References
(Blocks 1 open bug)
Details
(Whiteboard: [sp3])
We still have two copies of the parser compiled. This can hurt the CPU instruction cache (icache) since the parser is quite large. Ideally we would remove the char16_t parser and allow the UTF-8 parser to receive unmatched char16_t surrogates (à la WTF-8 encoding).
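For context, WTF-8 generalizes UTF-8 by allowing unpaired surrogate code points (U+D800–U+DFFF) to be encoded with the ordinary three-byte pattern that strict UTF-8 rejects. A minimal sketch of that encoding (a hypothetical helper for illustration, not SpiderMonkey code):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Encode a single code point as generalized UTF-8 ("WTF-8"): lone
// surrogates U+D800..U+DFFF, which strict UTF-8 forbids, are simply
// encoded with the normal three-byte pattern (ED A0 80 .. ED BF BF).
static void appendWtf8(std::vector<uint8_t>& out, uint32_t cp) {
    if (cp < 0x80) {
        out.push_back(uint8_t(cp));
    } else if (cp < 0x800) {
        out.push_back(uint8_t(0xC0 | (cp >> 6)));
        out.push_back(uint8_t(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        // Surrogates land in this branch.
        out.push_back(uint8_t(0xE0 | (cp >> 12)));
        out.push_back(uint8_t(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(uint8_t(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(uint8_t(0xF0 | (cp >> 18)));
        out.push_back(uint8_t(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(uint8_t(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(uint8_t(0x80 | (cp & 0x3F)));
    }
}
```

The only difference from a plain UTF-8 encoder is the *absence* of a surrogate rejection check, which matches the idea of letting the UTF-8 parser accept unmatched surrogates.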
Reporter
Comment 1•3 years ago
In practice, whether we use UTF-8 vs char16_t is determined by call site.
- char16_t
  - eval
  - js:// urls
  - inline scripts
  - event handlers
- utf-8
  - out-of-line scripts
The bulk of JS bytes parsed goes through the UTF-8 parser, but more than half of parse invocations enter through the char16_t entry points, so icache pollution remains a potential problem.
The UTF-8 parser seems to be functional if I simply disable the check for invalid surrogates and eagerly turn on UTF-8 parsing for evals.
The next step is to convert the remaining scripts (which are typically much smaller) to WTF-8 just before the parse, and then remove the actual char16_t parser from the build.
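The conversion step described above could look roughly like the sketch below (an illustrative stand-in, not the actual patch): walk the char16_t units, combine valid surrogate pairs into supplementary code points, and let lone surrogates fall through so the WTF-8 encoding preserves them losslessly.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Convert a char16_t buffer to WTF-8: valid surrogate pairs become
// four-byte sequences, lone surrogates become three-byte sequences
// (the generalization that distinguishes WTF-8 from UTF-8).
static std::vector<uint8_t> toWtf8(const std::u16string& in) {
    std::vector<uint8_t> out;
    for (size_t i = 0; i < in.size(); i++) {
        uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Valid surrogate pair: combine into a supplementary code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            i++;
        }
        // Lone surrogates fall through and are encoded like BMP code points.
        if (cp < 0x80) {
            out.push_back(uint8_t(cp));
        } else if (cp < 0x800) {
            out.push_back(uint8_t(0xC0 | (cp >> 6)));
            out.push_back(uint8_t(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.push_back(uint8_t(0xE0 | (cp >> 12)));
            out.push_back(uint8_t(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(uint8_t(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(uint8_t(0xF0 | (cp >> 18)));
            out.push_back(uint8_t(0x80 | ((cp >> 12) & 0x3F)));
            out.push_back(uint8_t(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(uint8_t(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}
```

For well-formed input this produces ordinary UTF-8, so scripts converted this way can feed straight into a UTF-8 parser; only scripts containing unmatched surrogates produce the WTF-8-only byte sequences.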
Reporter
Comment 2•3 years ago
The basic prototype seems to be working, and I was able to do some initial testing. Unsurprisingly, I was not able to see obvious wins in Speedometer, but across many retries the highest-confidence results are almost all improvements.
In this incarnation of the prototype, I see about a 150 kB reduction in Firefox installer size. I was seeing 400 kB in SpiderMonkey shell builds, and I'm not sure whether there is more that can be removed.
The main blocker to doing this for real is that I don't yet handle the unmatched char16_t surrogate cases precisely, and that requires a more consistent approach to allowing WTF-8 strings in Gecko.
For now, there are a number of pieces of the prototype that can land today and will move more cases onto the UTF-8 parser.
Comment 3•4 months ago
(In reply to Ted Campbell [:tcampbell] from comment #0)
> We still have two copies of the parser compiled. This can hurt the CPU instruction cache (icache) since the parser is quite large. Ideally we would remove the char16_t parser and allow the UTF-8 parser to receive unmatched char16_t surrogates (à la WTF-8 encoding).
From having investigated this issue at the time of jsparagus / SmooshMonkey, I can tell that even a single instance of the JS parser does not fit in the instruction cache (L1i). So the point is valid, but not solvable just by having a single instance of our parser.
However, the problem with having multiple instances of the parser, for different token sizes, shows up in the L2/L3 caches and in the download size of the binary.