Closed Bug 511566 Opened 15 years ago Closed 8 months ago

Reconsider atomizing strings

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect

Tracking

()

RESOLVED INCOMPLETE

People

(Reporter: gal, Unassigned)

Details

Histogram of the length of strings going into js_AtomizeString for a SS run.

It might be faster to actually use string comparison and not atomize, in particular since SSE makes strcmp really fast these days, and atomizing requires a global lock which is slow like hell.

0 48
1 45190
2 49743
3 3966
4 2354
5 573
6 800
7 8536
8 670
9 653
10 2953
11 351
12 350
13 220
14 146
15 117
16 76
17 59
18 49
19 27
20 31
21 32
22 29
23 21
24 19
25 11
26 18
27 8
28 9
29 9
30 2
31 4
32 9
33 4
34 4
35 6
36 2
37 2
39 4
40 3
41 3
42 8
44 5
45 2
46 1
47 2
49 1
51 1
52 1
54 1
55 1
60 1
64 3
65 1
66 1
76 1
85 1
162 1
287 1
405 1
428 1
604 1
608 1
629 1
989 1
1343 1
4448 1
5321 1
11305 1
14593 1
14680 1
21955 1
38454 1
52252 1
101745 1
146663 1
146665 1
Just discussed this with shaver. Its fine to atomize where we care about space, but if we do comparison by value, we don't have to atomize if we don't want to because it hurts perf.
Atomizing uses compare and swap. Is that not happening and we are using a mutex?

We already added an ATOMIZED flag to strings to avoid reatomizing a property id computed from a string literal. Is that broken?

I think first we need to be sure something hasn't regressed.

Second, let's stop micro-optimizing for sunspider. Computed property id accesses are rare compared to identifier-based (atomized by the lexer) accesses. If you try to add a strcmp path you'll degrade all the common identifier-based property accesses. There's no way to avoid it.

Avoiding locking for unit strings would be swell.

/be
(In reply to comment #2).
> 
> Second, let's stop micro-optimizing for sunspider. Computed property id
> accesses are rare compared to identifier-based (atomized by the lexer)
> accesses. If you try to add a strcmp path you'll degrade all the common
> identifier-based property accesses. There's no way to avoid it.

I agree that degrading identifier-based property access would be bad. Computed property access might be rare in source code, but it is also highly likely to occur in loops (e.g. sync.js usage), so it is worth optimizing.
(In reply to comment #3)
> 
> I agree that degrading identifier-based property access would be bad. Computed
> property access might be rare in source code, but it is also highly likely to
> occur in loops (e.g. sync.js usage), so it is worth optimizing.

to be clear, agreeing with brendan--don't do stuff that degrades identifier-based property access :)
Long ago kjs had no atomizing (interning), but Apple peeps added it for a big speedup. FWIW.

We could hack on more lock-free atom table goodness, but I wonder whether the CAS already in place (JSAtomState::lock) isn't working.

/be
Assignee: general → nobody
Severity: normal → S3
Status: NEW → RESOLVED
Closed: 8 months ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.