Closed Bug 1542106 Opened 5 years ago Closed 5 years ago

Cache the last computed column and use it to optimize a next column computation further in the same line

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla68
Tracking Status
firefox68 --- fixed

People

(Reporter: Waldo, Assigned: Waldo)

References

Details

Attachments

(2 files)

Attached file worker.js

The current column-computation approach is to iterate through the entire line every time. This is obviously linear, but in the length of the script already being parsed -- and parsing is necessarily linear in that, so big-O isn't changing, so ideally that shouldn't be a problem.

But linear repeated a bunch can become quadratic. I'd hoped that wouldn't matter in the real world, but it looks like it does. For example, running the attached file (derived from a file in our source tree, and probably something that appears on the web somewhere as well) in a debug shell in an unaltered build results in this:

[jwalden@find-waldo-now src]$ time dbg/js/src/js -f /tmp/worker.js 

real	0m1.590s
user	0m1.579s
sys	0m0.011s

Slow, but okay, nothing related to this. But if you flip the switch to count whole code points...

[jwalden@find-waldo-now src]$ time dbg/js/src/js -f /tmp/worker.js 

real	0m15.592s
user	0m15.479s
sys	0m0.021s

Yikes. 10x increase on an actual real-world thing? No bueno.

So, let's do the simplest easiest trick, akin to what SourceCoords already does for offset-to-line-number mapping to avoid a binary search every time. After a column has been computed for line number lineno and offset offset1 corresponding to column c, if the next column computation is still on line lineno, at offset offset2 >= offset1, then only linearly count code points from offset1 forward to offset2 and add that to the previously-computed c.

In principle, for column computations at increasing offsets, we'll only scan through any given code unit once, and it's back to linear cost in the size of the script. And it seems that backwards-looking column computations aren't especially common, so the unhandled case isn't absolutely terrible. If we implement that, we get this result:

[jwalden@find-waldo-now src]$ time dbg/js/src/js -f /tmp/worker.js 

real	0m1.617s
user	0m1.590s
sys	0m0.027s

So let's do that to start. There are source code constructs that will do backwards-looking column computations -- I think for(;;) loops will, for example, and minified code might make this a concern -- but those can be addressed if they show up in the wild.

Pushed by jwalden@mit.edu:
https://hg.mozilla.org/integration/autoland/rev/2756d57cc947
Cache the last (line number, offset => column) mapping returned and use it to optimize a subsequent lookup that's further along in the same line.  r=arai
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla68
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: