Open
Bug 2007152
Opened 5 months ago
Updated 5 months ago
Speed up computeColumnOffsetForUTF8 for ASCII strings
Categories
(Core :: JavaScript Engine, task, P2)
NEW
People
(Reporter: iain, Unassigned)
References
(Blocks 2 open bugs)
Details
While looking at parser profiles for multi-inspector-code-load in JS3, I noticed that UTF8 parsing spends ~5% of its time in computeColumnOffset, whereas UTF16 parsing spends ~0.1%. Most of the time is spent here counting the length of a range in UTF16 code units. However, in the ASCII case, this is trivial. It would be nice if we could detect this case and avoid counting.
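To illustrate why the ASCII case is trivial, here is a minimal sketch of the two computations. It is not the SpiderMonkey implementation: `utf16LengthOfUtf8` and `utf16LengthOfAscii` are hypothetical names, and the code assumes the input is already valid UTF-8.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not SpiderMonkey code): number of UTF-16 code units
// needed to represent a range of valid UTF-8. Every byte must be inspected.
std::size_t utf16LengthOfUtf8(std::string_view utf8) {
    std::size_t units = 0;
    for (unsigned char byte : utf8) {
        // Any byte that is not a continuation byte (10xxxxxx) starts a code point.
        if ((byte & 0xC0) != 0x80) {
            ++units;
        }
        // A 4-byte lead (11110xxx) encodes a supplementary code point,
        // which takes a surrogate pair, so count one extra unit.
        if (byte >= 0xF0) {
            ++units;
        }
    }
    return units;
}

// For pure ASCII, each byte is exactly one UTF-16 code unit, so the
// length is just the byte count and no scan is needed.
std::size_t utf16LengthOfAscii(std::string_view ascii) {
    return ascii.size();
}
```

For ASCII input both functions return the same value, which is why the per-byte loop is wasted work in that case.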
A few options:
- If we know for some reason that the input is ASCII before parsing, we could set a flag to skip counting here. (One example of where we might know this is for eval on a Latin1 string; I am experimenting with a patch that quickly checks whether a Latin1 string is valid ASCII and then parses it as UTF8.)
- Alternatively, we could consider a quick initial scan of the UTF8 string to see if it is valid ASCII. I imagine this would succeed for many UTF8 inputs in the real world.
- We might also be able to set a flag dynamically when we see the first non-ASCII character on a line (here-ish?), and compute columns the cheap way until that flag is set. This is the most flexible, but also adds extra code in a hot path.
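The first two options could look roughly like the sketch below: an up-front ASCII scan that checks eight bytes at a time, and a column computation that skips counting when a flag says the source is ASCII. All names here (`isAscii`, `columnOffset`, `sourceIsAscii`) are assumptions for illustration, not existing SpiderMonkey APIs, and the slow path assumes valid UTF-8.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string_view>

// Hypothetical scan: true if no byte has its high bit set.
bool isAscii(std::string_view bytes) {
    constexpr std::uint64_t kHighBits = 0x8080808080808080ULL;
    std::size_t i = 0;
    // Check eight bytes per iteration; memcpy avoids unaligned reads.
    for (; i + 8 <= bytes.size(); i += 8) {
        std::uint64_t word;
        std::memcpy(&word, bytes.data() + i, sizeof(word));
        if (word & kHighBits) {
            return false;
        }
    }
    // Check the remaining tail bytes one at a time.
    for (; i < bytes.size(); ++i) {
        if (static_cast<unsigned char>(bytes[i]) & 0x80) {
            return false;
        }
    }
    return true;
}

// Hypothetical column computation: UTF-16 column of `byteOffset` within
// `line`. With the flag set, the byte offset is already the answer.
std::size_t columnOffset(std::string_view line, std::size_t byteOffset,
                         bool sourceIsAscii) {
    if (sourceIsAscii) {
        return byteOffset;  // cheap path: one byte per code unit
    }
    std::size_t units = 0;
    for (std::size_t i = 0; i < byteOffset && i < line.size(); ++i) {
        unsigned char byte = static_cast<unsigned char>(line[i]);
        if ((byte & 0xC0) != 0x80) {
            ++units;  // start of a code point
        }
        if (byte >= 0xF0) {
            ++units;  // 4-byte sequence needs a surrogate pair
        }
    }
    return units;
}
```

A caller would compute `isAscii(source)` once before parsing and pass the result to every `columnOffset` call. The trade-off is one linear pass over the input up front in exchange for constant-time column lookups when the scan succeeds.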