Reconsider atomizing strings

NEW
Unassigned

Status

()

Core
JavaScript Engine
8 years ago
3 years ago

People

(Reporter: gal, Unassigned)

Tracking

Trunk
x86
Mac OS X
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

8 years ago
Histogram of the length of strings going into js_AtomizeString for a SS run.

It might be faster to actually use string comparison and not atomize, in particular since SSE makes strcmp really fast these days, and atomizing requires a global lock which is slow like hell.

0 48
1 45190
2 49743
3 3966
4 2354
5 573
6 800
7 8536
8 670
9 653
10 2953
11 351
12 350
13 220
14 146
15 117
16 76
17 59
18 49
19 27
20 31
21 32
22 29
23 21
24 19
25 11
26 18
27 8
28 9
29 9
30 2
31 4
32 9
33 4
34 4
35 6
36 2
37 2
39 4
40 3
41 3
42 8
44 5
45 2
46 1
47 2
49 1
51 1
52 1
54 1
55 1
60 1
64 3
65 1
66 1
76 1
85 1
162 1
287 1
405 1
428 1
604 1
608 1
629 1
989 1
1343 1
4448 1
5321 1
11305 1
14593 1
14680 1
21955 1
38454 1
52252 1
101745 1
146663 1
146665 1
(Reporter)

Comment 1

8 years ago
Just discussed this with shaver. Its fine to atomize where we care about space, but if we do comparison by value, we don't have to atomize if we don't want to because it hurts perf.
Atomizing uses compare and swap. Is that not happening and we are using a mutex?

We already added an ATOMIZED flag to strings to avoid reatomizing a property id computed from a string literal. Is that broken?

I think first we need to be sure something hasn't regressed.

Second, let's stop micro-optimizing for sunspider. Computed property id accesses are rare compared to identifier-based (atomized by the lexer) accesses. If you try to add a strcmp path you'll degrade all the common identifier-based property accesses. There's no way to avoid it.

Avoiding locking for unit strings would be swell.

/be

Comment 3

8 years ago
(In reply to comment #2).
> 
> Second, let's stop micro-optimizing for sunspider. Computed property id
> accesses are rare compared to identifier-based (atomized by the lexer)
> accesses. If you try to add a strcmp path you'll degrade all the common
> identifier-based property accesses. There's no way to avoid it.

I agree that degrading identifier-based property access would be bad. Computed property access might be rare in source code, but it is also highly likely to occur in loops (e.g. sync.js usage), so it is worth optimizing.

Comment 4

8 years ago
(In reply to comment #3)
> 
> I agree that degrading identifier-based property access would be bad. Computed
> property access might be rare in source code, but it is also highly likely to
> occur in loops (e.g. sync.js usage), so it is worth optimizing.

to be clear, agreeing with brendan--don't do stuff that degrades identifier-based property access :)
Long ago kjs had no atomizing (interning), but Apple peeps added it for a big speedup. FWIW.

We could hack on more lock-free atom table goodness, but I wonder whether the CAS already in place (JSAtomState::lock) isn't working.

/be
(Assignee)

Updated

3 years ago
Assignee: general → nobody
You need to log in before you can comment on or make changes to this bug.