Closed Bug 1345703 Opened 7 years ago Closed 6 years ago
[meta] Byte stream-specialized tokenizer
A tokenizer specialized for Latin1 should be a healthy speedup for Latin1 scripts.
Summary: [meta] Latin1-specialized tokenizer → [meta] Byte stream-specialized tokenizer
See comments in bug 1344152. UTF-8 seems to be the better approach. Being able to parse byte streams directly without inflating is likely the important thing, regardless of approach.
Marking p1 because parsing improvements lead directly to load time improvements. Inflation takes time linear in the length of the source, and its output is twice the size of the input, causing lots of memory/cache churn.
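To illustrate the cost described above, here is a minimal sketch, not SpiderMonkey's actual code. It assumes Latin1 input held in a byte vector; the names InflateLatin1, CountWordRuns, and ByteSize are hypothetical helpers invented for this example. The first function widens every byte to a 16-bit code unit (one linear pass, output twice the size), while the second scans the original bytes directly and needs no second buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: widen Latin1 bytes into 16-bit code units.
// Each Latin1 byte maps to the code point with the same value, so this is
// one pass over the input (linear time) that writes a second, wider buffer.
std::u16string InflateLatin1(const std::vector<uint8_t>& bytes) {
  std::u16string out;
  out.reserve(bytes.size());
  for (uint8_t b : bytes) {
    out.push_back(static_cast<char16_t>(b));
  }
  assert(out.size() == bytes.size());
  return out;
}

// Hypothetical scanner: count runs of ASCII word characters directly on the
// byte stream, without inflating first. A real tokenizer does far more; this
// only shows that scanning can work on the narrow representation.
size_t CountWordRuns(const std::vector<uint8_t>& bytes) {
  auto isWordChar = [](uint8_t c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') || c == '_' || c == '$';
  };
  size_t count = 0;
  bool inRun = false;
  for (uint8_t b : bytes) {
    bool word = isWordChar(b);
    if (word && !inRun) {
      ++count;  // a new run starts here
    }
    inRun = word;
  }
  return count;
}

// Storage used by each representation, in bytes.
size_t ByteSize(const std::vector<uint8_t>& bytes) {
  return bytes.size() * sizeof(uint8_t);
}
size_t ByteSize(const std::u16string& units) {
  return units.size() * sizeof(char16_t);
}
```

For a source of N bytes, the inflated copy occupies 2N bytes on typical platforms where char16_t is two bytes wide, and it has to be written and then read back by the tokenizer. The byte-level scanner reads only the original N bytes.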
Whiteboard: [qf] → [qf:p1]
Jeff, can you evaluate whether this work still makes sense?
Assignee: nobody → jwalden+bmo
Whiteboard: [qf:p2] → [qf:p1]
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → DUPLICATE
Performance Impact: --- → P3