Open Bug 1157839 Opened 10 years ago Updated 5 months ago

Investigate using refcounted strings for wrapping

Categories

(Core :: JavaScript Engine, enhancement, P3)

Tracking

()

People

(Reporter: sfink, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [MemShrink:P2])

Currently, strings wrapped for cross-compartment access are copied (unless they are atoms, in which case they are shared runtime-wide). In some situations, I can imagine this leading to a large amount of excess memory usage. An alternative scheme would be to have larger strings share their data and use reference counting to clean them up. So this bug is to (1) measure how much this would save in practice, and then (2) implement some sort of sharing scheme.
It is unclear to me what the best sharing scheme would be. Options I can think of:

1. Malloc a chunk of data for the string and the reference count. I'm not sure if the refcount needs to be updated atomically or not, but it probably does.
2. When sharing, allocate a separate GCthing to hold the data, and make an edge from each of the strings. But that begs the question of where to allocate that GCthing, and the atoms zone seems tempting, which leads to...
3. Just atomize these strings, and get sharing through existing mechanisms. AFAICT, the JS language level wants them to compare identically equal anyway.
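For illustration only, option 1 could look roughly like the following sketch. Every name here (SharedChars, NewSharedChars, AddRef, Release) is hypothetical and is not SpiderMonkey API. It assumes two-byte characters and assumes the refcount does need to be atomic, which the comment above says is probable but not certain.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical header placed directly in front of the character data, so the
// reference count and the characters live in a single malloc'ed chunk.
struct SharedChars {
    std::atomic<std::size_t> refCount{1};  // the creator holds the first reference
    std::size_t length;                    // number of char16_t units that follow

    explicit SharedChars(std::size_t len) : length(len) {}

    // The characters start immediately after the header.
    char16_t* chars() { return reinterpret_cast<char16_t*>(this + 1); }
};

// Allocate header and characters together and copy the source characters in.
inline SharedChars* NewSharedChars(const char16_t* src, std::size_t length) {
    void* mem = std::malloc(sizeof(SharedChars) + length * sizeof(char16_t));
    if (!mem) return nullptr;
    SharedChars* sc = new (mem) SharedChars(length);
    std::memcpy(sc->chars(), src, length * sizeof(char16_t));
    return sc;
}

// A wrapper in another compartment would take a reference instead of copying.
inline void AddRef(SharedChars* sc) {
    sc->refCount.fetch_add(1, std::memory_order_relaxed);
}

// Drop one reference. Returns true if this was the last one and the chunk
// was freed.
inline bool Release(SharedChars* sc) {
    if (sc->refCount.fetch_sub(1, std::memory_order_acq_rel) != 1) return false;
    sc->~SharedChars();
    std::free(sc);
    return true;
}
```

In this sketch, wrapping a large string would call AddRef rather than copying the characters, and each string's finalizer would call Release. Whether that beats atomizing (option 3) is exactly what the measurement in (1) would need to show.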
bz points out that this might enable sharing of (some) strings between spidermonkey and gecko, for even moar memory savings:

<bz> sfink: if you made it atomic, we could use the same string buffer for both gecko and spidermonkey.
<sfink> bz: er... with TwoByteChars? I thought our encodings were still mismatched
<bz> well So What gecko has right now is a generic "stringbuffer" thing Which has a refcount, capacity (in bytes) and buffer All as a single allocation The consumer is expected to maintain the refcount and keep track of how much of the capacity is actually used. And whether the buffer is storing char or char16_t
<sfink> hm, so at least for the subsets of strings that are identical, it's a fine backend
<bz> So yeah, in the case when you have 1-byte chars we'd need to do something ....
<sfink> and we'd need to be sure gecko didn't mutate things
<bz> sure Note that right now gecko always copies JS strings on entry And uses either copying or an external string on return to JS....
<sfink> do the external strings point into these stringbuffers already?
<bz> yes They just have a finalizer that casts from the char16_t to the stringbuffer and then releases
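As a rough sketch of the layout bz describes (refcount, capacity in bytes, and buffer in a single allocation) and of a finalizer that casts from the char16_t pointer back to the buffer header and releases it: the names below (Buffer, AllocBuffer, ReleaseBuffer, FinalizeExternalChars) are hypothetical stand-ins, not Gecko's or SpiderMonkey's actual API, and the field widths are assumptions.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for the "stringbuffer" from the chat: a refcount and
// a capacity in bytes, followed by the data, all in one allocation. The
// header does not record how much of the capacity is used or whether the
// data is char or char16_t; the consumer tracks that.
struct Buffer {
    std::atomic<std::uint32_t> refCount;
    std::uint32_t capacityBytes;

    explicit Buffer(std::uint32_t cap) : refCount(1), capacityBytes(cap) {}

    // The data starts immediately after the header.
    void* data() { return this + 1; }

    // Recover the header from a pointer to the start of the data.
    static Buffer* fromData(void* data) { return static_cast<Buffer*>(data) - 1; }
};

// Allocate header plus capacityBytes of storage; the caller owns one reference.
inline Buffer* AllocBuffer(std::uint32_t capacityBytes) {
    void* mem = std::malloc(sizeof(Buffer) + capacityBytes);
    return mem ? new (mem) Buffer(capacityBytes) : nullptr;
}

// Drop one reference. Returns true if the allocation was freed.
inline bool ReleaseBuffer(Buffer* buf) {
    if (buf->refCount.fetch_sub(1, std::memory_order_acq_rel) != 1) return false;
    buf->~Buffer();
    std::free(buf);
    return true;
}

// What an external-string finalizer might do: it is handed only the character
// pointer, so it steps back to the header and releases that.
inline bool FinalizeExternalChars(char16_t* chars) {
    return ReleaseBuffer(Buffer::fromData(chars));
}
```

Because the finalizer only needs the character pointer, the JS side would not have to store a separate pointer to the header. The sketch does nothing about sfink's concern that the other holder must not mutate shared data.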
Whiteboard: [memshrink]
Hah, SM-Gecko sharing is also the first thing that popped into my head; see bug 686989 for more discussion.
(In reply to Steve Fink [:sfink, :s:] from comment #0)
> So this bug is to (1) measure how much this would save in practice
[MemShrink:P1] for this part
> (2) implement some sort of sharing scheme.
[MemShrink:P?] after #1 comes back :)
Whiteboard: [memshrink] → [MemShrink:P1]
Let's MemShrink:P2 this until we have data on how much it might save.
Whiteboard: [MemShrink:P1] → [MemShrink:P2]
Severity: normal → S3

We sort of have this now, but maybe there's more here?

Blocks: sm-runtime
Severity: S3 → N/A
Type: defect → enhancement
Priority: -- → P3