Open
Bug 1367471
Opened 8 years ago
Updated 4 days ago
De-duplicate strings or other constant & common data during compaction
Categories
(Core :: JavaScript: GC, enhancement, P3)
Tracking
NEW
People
(Reporter: pbone, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: triage-deferred, Whiteboard: [MemShrink:P2])
erahm noticed that about:memory reported 10MB of copies of the string "#000000" (black), so fitzgen suggested de-duplicating strings during compaction.
Comment 1•7 years ago
In bug 1338930 there is 56MB (2.5 million copies) of the string ":DIV".
Updated•7 years ago
Whiteboard: [MemShrink]
Updated•7 years ago
Whiteboard: [MemShrink] → [MemShrink:P2]
Updated•7 years ago
Keywords: triage-deferred
Priority: -- → P3
Comment 2•6 years ago
This sort of thing makes me sad... Yes, it's a website leak, but perhaps we could do (some) deduping without a big perf hit to limit the pain of this sort of thing, or run a separate idle-time-scheduled de-duping pass. I'd suggest using some heuristics to decide whether there's any chance of a big win before throwing too many cycles at it; perhaps during compaction we can record a histogram of string sizes and see if there are hot spots.
1,248.01 MB (70.42%) -- string(length=648061, copies=624, "url("" (truncated))
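The histogram heuristic could look roughly like the sketch below. It is written in Python for brevity rather than in SpiderMonkey's C++, and every name in it (`dedup_worth_trying`, `min_wasted_bytes`, `bytes_per_char`) is hypothetical, as is the 1MB default threshold. It only buckets strings by length, so it gives an upper bound on the possible win, not a measurement.

```python
from collections import Counter


def dedup_worth_trying(strings, min_wasted_bytes=1024 * 1024, bytes_per_char=1):
    """Cheaply estimate whether a de-duplication pass could pay off.

    Strings are bucketed by length only, so no hashing or comparison is
    needed. Strings of different lengths can never be equal, so a bucket
    holding n strings of length L can free at most (n - 1) * L characters.
    The threshold and bytes_per_char are assumptions for illustration.
    """
    histogram = Counter(len(s) for s in strings)
    upper_bound = sum(
        (count - 1) * length * bytes_per_char
        for length, count in histogram.items()
    )
    return upper_bound >= min_wasted_bytes, histogram
```

A hot spot such as the 624 strings of length 648061 above would dominate the upper bound, which is the signal for spending cycles on actual hashing.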
Comment 3•6 years ago
We talked about this during the GC meeting at Orlando.
1) We should probably prioritize solving this for short strings like in comment #0 and comment #1; given how short they are, the overhead of hashing them should be relatively small.
2) For huge strings like in comment #2 we'd need to do something a little more clever - maybe we could hash the first page and compare prefixes before we commit to comparing the full strings.
3) I remember a bug where we had a large number of strings that shared a long prefix but each had a unique suffix - it would be great if we could turn these into ropes somehow.
4) Another bug I remember has us keeping long strings alive even though the JS only used short substrings; it would be great to copy/inline/deduplicate these during compaction as well.
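Points 1 and 2 could share one lookup table, sketched here in Python rather than SpiderMonkey's C++. Everything in it is hypothetical: `deduplicate`, `PAGE_CHARS`, and the 4096-character "page" are made up for illustration and are not existing GC APIs. The key is the string length plus a hash of at most the first page, so short strings are hashed whole and huge strings only pay for a full comparison when length and prefix already match.

```python
PAGE_CHARS = 4096  # assumed "page" size in characters, for illustration only


def deduplicate(strings, prefix_chars=PAGE_CHARS):
    """Return a list in which equal strings share one canonical object.

    Candidates are keyed by (length, hash of the first prefix_chars
    characters). The full comparison only runs against strings that
    already agree on both, so mismatches are rejected cheaply.
    """
    table = {}   # (length, prefix hash) -> list of canonical strings
    result = []
    for s in strings:
        key = (len(s), hash(s[:prefix_chars]))
        candidates = table.setdefault(key, [])
        for canonical in candidates:
            if canonical == s:        # full comparison, last resort
                result.append(canonical)
                break
        else:
            # First time this content is seen: it becomes the canonical copy.
            candidates.append(s)
            result.append(s)
    return result
```

In a compacting GC the "result" would instead be the forwarding of each duplicate to the canonical copy, which this sketch does not model.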
I wondered whether it would be a good idea to store a hash per page of a string (or rope) so we could deduplicate page-sized chunks. Of course, it won't work if you're comparing flattened strings with variable length prefixes, but it would work on mostly identical strings with differing fixed length prefixes.
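A rough Python sketch of the per-page hash idea, to show both the case where it works and the case where it does not. The names (`page_hashes`, `shared_page_count`), the 4096-character page, and the choice of SHA-1 are all assumptions for illustration; a real implementation would live in the engine and likely use a cheaper hash.

```python
import hashlib

PAGE_CHARS = 4096  # assumed "page" size in characters, for illustration only


def page_hashes(s, page_chars=PAGE_CHARS):
    """Hash each page-sized chunk of s, split at fixed offsets from the start."""
    return [
        hashlib.sha1(s[i:i + page_chars].encode("utf-8")).digest()
        for i in range(0, len(s), page_chars)
    ]


def shared_page_count(a, b, page_chars=PAGE_CHARS):
    """Count distinct page hashes that appear in both strings."""
    return len(set(page_hashes(a, page_chars)) & set(page_hashes(b, page_chars)))


# Non-repeating body so that pages only match when they really line up.
body = "".join(str(i) for i in range(5000))

same_len_a = "A" * PAGE_CHARS + body   # differing prefixes of equal,
same_len_b = "B" * PAGE_CHARS + body   # page-aligned length
shifted = "C" * 100 + body             # prefix of a different length
```

With these inputs, `shared_page_count(same_len_a, same_len_b)` finds every body page shared, while `shared_page_count(same_len_a, shifted)` finds none because the chunk boundaries no longer line up. Here the fixed-length prefix happens to be exactly one page long; equal prefix lengths are what keep the chunk boundaries aligned.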
Updated•2 years ago
Severity: normal → S3
Updated•4 days ago
Blocks: sfink.backlog