Currently JSAtom instances are global and stored in the default compartment. That requires taking a lock when atomizing strings. We should consider avoiding that by moving the atoms into the compartment or by doing some mostly-lockless hashing.
An obvious concern is that, with potentially hundreds of compartments in a normal web browser, this could waste a lot of memory.

This exact issue of "the runtime is too coarse, the compartment is too fine, the thread-data isn't right b/c compartments can hop between threads" has come up at least twice in recent memory. Ultimately, I think we need a concept that satisfies two constraints:
- allows single-threaded access to data
- a compartment is only ever associated with one of these things

In Firefox, IIUC, we could get away with having 1 for the main thread, 1 for each web worker compartment (and probably a similar 1-for-each situation for add-ons, I'm not sure). The mjit needed this for ExecutablePools and created a JaegerCompartment which, at the moment, is associated 1:1 with a compartment but, hopefully, could one day be tacked onto this new entity. Dave called it a "zone" (there are only so many topological-sounding terms...).

Back to the bug at hand: I don't know whether the waste due to per-compartment atom duplication would be small enough to ignore; I just wanted to point out that there is a recurring theme here and, if it continues, we stand to save a lot of total memory by fixing it properly.
Created attachment 490129 [details]
Shell microbenchmark

This compares a property get using a string name that is a literal in the script to one that is not. I get the following numbers for this testcase:

-j:
  Literal string: 127
  Generated string: 815
-m:
  Literal string: 74
  Generated string: 75

Presumably JM PICs this without atomizing, or something? The TM time for the literal doesn't involve js_AtomizeString at all, since RootedStringToId checks for already-atomized strings. For the non-literal, 82% of the runtime is under js_AtomizeString, breaking down further as (percentages of total runtime):

  20% self time
  20% js_EqualStrings
  16% js_HashString
  13% OSAtomicCompareAndSwapPtr
  10% js_Lock
   4% js_Unlock

Some of the self time might or might not be setting up those calls, but in any case the lock/unlock/CAS work is at least 27% of the total time. For reference, the v8 shell on this testcase shows:

  Literal string: 98
  Generated string: 314
My concern with moving atoms into a certain compartment is:

1) Compartments don't get deleted anymore because there are still atoms inside.
2) Measurements show that a very high percentage (95-99%) of the cross-compartment pointers are atom-related. If they point into the default compartment we can "ignore" them. If they are added to the root set of another compartment we introduce further complications.
(In reply to comment #3)
> 2) Measurements show that a very high percentage of the cross-compartment
> pointers (95-99%) are atom-related.

Do you mean here that 95-99% are just atom pointers (which always point into the default compartment)?
(In reply to comment #2)
> For the non-literal, 82% of the runtime is under js_AtomizeString, further
> breaking down as (percentages of total runtime):
>
> 20% self time
> 20% js_EqualStrings
> 16% js_HashString
> 13% OSAtomicCompareAndSwapPtr
> 10% js_Lock
> 4% js_Unlock

Is this for the case where the atom already exists in the table, or when a new atom is added? If it is the latter, then 20% of the time spent in js_EqualStrings implies bad hash performance with a lot of collisions.
> Is this for the case where the atom already exists in the table or when a
> new atom is added?

The above percentages are for the "2" test in there. I would assume that the atom is already in the table, but check?
GCThings in the default compartment are never wrapped. They leak into other compartments. The rule is "same-compartment or in the default compartment."
Conceptually it would be cleaner to have atoms per compartment and then wrap them like any other value. As we cross compartment boundaries, we have to re-lookup atoms in the local compartment, so we eliminate locking in the atomization path, but we get additional lookups during compartment transitions.
(In reply to comment #6)
> The above percentages are for the "2" test in there.

For that micro-benchmark per-compartment atoms would help. Currently atoms live in the default compartment, so a fast path that checks whether the string is already atomized would always fail in the benchmark loop:

    for (var i = 0; i < 1e7; ++i)
        x = testObj2[str2];

Here str2 comes from the current compartment and cannot be an atom. With per-compartment atoms str2 would itself become an atom on the first iteration, leading to a fast lookup on subsequent iterations.
(In reply to comment #1)
> Ultimately, I think we need a concept that satisfies two constraints:
> - allows single-threaded access to data
> - a compartment is only ever associated with one of these things
> In Firefox, IIUC, we could get away with having 1 for the main thread, 1 for
> each web worker compartment (and probably a similar 1-for-each situation for
> add-ons, I'm not sure).

If the association between compartment and thingy is unchanging, then the thingies are compartment groups, where only one thread is ever active within a group's compartments at a time. (Otherwise, if two threads were active within two compartments that share the same thingy, the thingy's data couldn't be accessed without locking.)
(In reply to comment #10) Indeed. This is what the whole 'zone' discussion on the js-internal newsgroup has been about, btw. Use cases (and hopefully one day the patches) are being collected in bug 621205.
Do we have actual data showing that per-compartment atoms would be a significant memory overhead? The default compartment tends to be tiny in comparison to actual compartment memory use.
Andreas: numbers? "tiny" * many == "big". /be
Also, we precompile XUL and XBL and execute them in many compartments. This currently requires atoms to be runtime-wide. A linking step could be added, but why bother if we can use lock-free zone-based atoms? /be
Comment 14 is not accurate. The code itself and the function object live in one specific compartment and we clone them into another compartment as needed. That data is many times bigger than the space atoms take up. tiny * many == big, but big << bigger to the point of irrelevance. Let's measure first.
JSRuntime is single-threaded now, and of course JSAtomState within it, so there are no locks in any of this.