Closed Bug 1568923 Opened 5 years ago Closed 4 years ago

Deduplicate nursery strings

Categories

(Core :: JavaScript: GC, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
mozilla79
Tracking Status
firefox79 --- fixed

People

(Reporter: sfink, Assigned: sfink)

References

(Blocks 4 open bugs)

Details

Attachments

(7 files, 6 obsolete files)

7.52 MB, application/x-javascript
Details
43.33 KB, text/plain
Details
6.25 KB, text/plain
Details
45.37 KB, text/plain
Details
56.48 KB, patch
Details | Diff | Splinter Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review

Before considering deduplicating strings during compaction, we should try deduplicating them during tenuring, so that few duplicate strings will have a chance to make it to the tenured heap.

String deduplication during tenuring

Most live nursery strings can be deduplicated in moveToTenured through a hashset. 
Dependent strings are complicated to deal with since their chars need to be updated with the newly deduplicated base chars.
If the dependent string is tenured, its bases cannot be deduplicated since a tenured DS chars cannot be updated. Otherwise, the following steps will be able to update its chars.

  1. Preserve the nursery base chain by saving DS nursery bases in its relocation overlay. This allows DS nursery root base to be reached.
  2. Calculate the original DS base offset: DS original chars - DS root base original chars. Root base original chars is saved in its relocation overlay.
  3. Update the DS chhars with its newly forwarded DS root base chars and the calculated offset.
  4. Assign either the root base or the undepended string in which the DS uses chars from as the new DS base to unchain any potentially long DS chain.
Attachment #9080802 - Attachment description: Bug 1568923 r=sfink → Bug 1568923 - String deduplication during tenuring. r=sfink
Priority: -- → P1
Pushed by sfink@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/d99e941429d0 String deduplication during tenuring. r=sfink

Comment on attachment 9080802 [details]
Bug 1568923 - String deduplication during tenuring. r=sfink

We're having trouble tracking down some remaining problems here. It crashes the browser with ASAN pretty often on try, and crashes much less often without it. It would be nice to have a shell test case.

Flags: needinfo?(krystalyang2)
Attachment #9080802 - Flags: feedback?(nth10sd)
Attachment #9080802 - Flags: feedback?(choller)

Comment on attachment 9080802 [details]
Bug 1568923 - String deduplication during tenuring. r=sfink

I have an upcoming hard-to-reduce but reproducible testcase:

--enable-debug --enable-more-deterministic --enable-address-sanitizer (and maybe even disable jemalloc)
on m-c rev fce0b326cd31
run with:
--fuzzing-safe --no-threads --no-baseline --ion-eager --more-compartments --gc-zeal=10,421 --blinterp --blinterp-eager --blinterp-warmup-threshold=100 --execute="setJitCompilerOption(\"ion.forceinlineCaches\",1)"

Change the top of the testcase to point to your local m-c repository on rev fce0b326cd31

Attachment #9080802 - Flags: feedback?(nth10sd) → feedback-

Pernosco session with this testcase:

https://pernos.co/debug/DbuTTxd2JqWCl3eKbJcsmQ/index.html

However, this is without --disable-optimize as adding it seemed to make the ASan crash go away.

Flags: needinfo?(sphink)

https://pernos.co/debug/DbuTTxd2JqWCl3eKbJcsmQ/index.html

gdb in Pernosco failed to go backwards when I ran reverse-next, so not sure how useful this is (or whether I'm doing things properly).

gdb in Pernosco failed to go backwards when I ran reverse-next, so not sure how useful this is (or whether I'm doing things properly).

This was a Pernosco bug. I fixed it and the fix is live, so that trace should be good for debugging.

// Adapted from randomly chosen test: js/src/jit-test/tests/basic/bug674776.js
let x = "";
for (let i = 0; i < 99999; i++) {
    x += "12345678" + i + "123456" + i + "1";
}

asserts at Assertion failure: aKey->flags() == aLookup->flags(), at gc/Nursery.h:572 (as per CI), on the same debug ASan build as per comment 6, run with --fuzzing-safe --no-threads --no-baseline --ion-eager --gc-zeal=24,452

(In reply to Gary Kwong [:gkw] [:nth10sd] from comment #12)

asserts at Assertion failure: aKey->flags() == aLookup->flags(), at gc/Nursery.h:572 (as per CI), on the same debug ASan build as per comment 6, run with --fuzzing-safe --no-threads --no-baseline --ion-eager --gc-zeal=24,452

https://pernos.co/debug/xfJPwQNatDpkAxe84eUieg/index.html

rr rev: cbe18bc1807c9bcc822ba7119d112c3aec2ec0fc
pernosco-submit rev: 6a50323d71a8d2ea4d782945e6074f5bbc1c2cee

(In reply to Gary Kwong [:gkw] [:nth10sd] from comment #12)

asserts at Assertion failure: aKey->flags() == aLookup->flags(), at gc/Nursery.h:572 (as per CI), on the same debug ASan build as per comment 6, run with --fuzzing-safe --no-threads --no-baseline --ion-eager --gc-zeal=24,452

Ugh. I thought the fix for this one had already been uploaded to phabricator, but it looks like it wasn't. I'll update the patch.

I'm not sure if the previous recording was triggered from the same thing, but I don't think so. It looks more like the problem I was trying to track down.

Thank you!

Flags: needinfo?(sphink)

Excellent, that first recording enabled me to track the bug down and the fix was easy. (A previous fix was applied to AutoStableStringChars::init but there's a similar ::initTwoByte that needs the same treatment.) This was a solid win for having Pernosco recordings of fuzz bugs, even of unminified test cases running on optimized builds.

Comment on attachment 9080802 [details]
Bug 1568923 - String deduplication during tenuring. r=sfink

There's an updated patch up now that I'm running through try, but it would be nice to get some fuzzing on as well.

Attachment #9080802 - Flags: feedback- → feedback?(nth10sd)
This is an automated crash issue comment: Summary: Assertion failure: ptr.found() && &*ptr == &e.front(), at js/src/gc/Zone.cpp:327 Build version: mozilla-central revision 0cf9eded35d8+ Build flags: --enable-valgrind --enable-gczeal --disable-tests --enable-debug --enable-optimize Runtime options: --fuzzing-safe --ion-offthread-compile=off Testcase: evaluate(` var g2 = newGlobal({newCompartment: true}); var dbg = Debugger(g2); dbg.onDebuggerStatement = function (frame) { let env = frame.environment; try { env.setVariable('c', 12) } catch (exc) {} env.setVariable('x', 7); }; f23 = g2.eval("debugger;"); `, {catchTermination : true}); gczeal(13, 8); Backtrace: received signal SIGSEGV, Segmentation fault. JS::Zone::checkStringWrappersAfterMovingGC (this=this@entry=0x7ffff4d61000) at js/src/gc/Zone.cpp:327 #0 JS::Zone::checkStringWrappersAfterMovingGC (this=this@entry=0x7ffff4d61000) at js/src/gc/Zone.cpp:327 #1 0x00005555560f97c0 in JS::Zone::checkAllCrossCompartmentWrappersAfterMovingGC (this=0x7ffff4d61000) at js/src/gc/Zone.cpp:311 #2 0x00005555560684f1 in js::gc::CheckHashTablesAfterMovingGC (rt=rt@entry=0x7ffff5f26000) at js/src/gc/GC.cpp:8118 #3 0x00005555560dfc88 in js::Nursery::doCollection (this=this@entry=0x7ffff5f29290, reason=reason@entry=JS::GCReason::DESTROY_RUNTIME, tenureCounts=...) at js/src/gc/Nursery.cpp:1110 #4 0x00005555560dfff8 in js::Nursery::collect (this=0x7ffff5f29290, reason=reason@entry=JS::GCReason::DESTROY_RUNTIME) at js/src/gc/Nursery.cpp:936 #5 0x00005555560541b7 in js::gc::GCRuntime::minorGC (this=this@entry=0x7ffff5f266d8, reason=reason@entry=JS::GCReason::DESTROY_RUNTIME, phase=phase@entry=js::gcstats::PhaseKind::EVICT_NURSERY_FOR_MAJOR_GC) at js/src/gc/GC.cpp:7460 #6 0x00005555560816eb in js::gc::GCRuntime::gcCycle (this=this@entry=0x7ffff5f266d8, nonincrementalByAPI=nonincrementalByAPI@entry=true, budget=..., gckind=..., reason=reason@entry=JS::GCReason::DESTROY_RUNTIME) at js/src/gc/GC.cpp:7027 #7 0x0000555556081f7e in js::gc::GCRuntime::collect (this=this@entry=0x7ffff5f266d8, nonincrementalByAPI=nonincrementalByAPI@entry=true, budget=..., gckindArg=..., reason=reason@entry=JS::GCReason::DESTROY_RUNTIME) at js/src/gc/GC.cpp:7247 #8 0x0000555556082565 in js::gc::GCRuntime::gc (this=this@entry=0x7ffff5f266d8, gckind=gckind@entry=GC_NORMAL, reason=reason@entry=JS::GCReason::DESTROY_RUNTIME) at js/src/gc/GC.cpp:7329 #9 0x0000555555b81b2a in JSRuntime::destroyRuntime (this=this@entry=0x7ffff5f26000) at js/src/vm/Runtime.cpp:287 [...] #13 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at js/src/shell/js.cpp:11257 rax 0x555557d1d120 93825033949472 rbx 0x555556c8ae88 93825016573576 rcx 0x7ffff6c1c2dd 140737333281501 rdx 0x0 0 rsi 0x7ffff6eeb770 140737336227696 rdi 0x7ffff6eea540 140737336223040 rbp 0x7fffffffcca0 140737488342176 rsp 0x7fffffffcbc0 140737488341952 r8 0x7ffff6eeb770 140737336227696 r9 0x7ffff7fe6cc0 140737354034368 r10 0x0 0 r11 0x0 0 r12 0x7ffff4d61000 140737301057536 r13 0x7fffffffcc10 140737488342032 r14 0x7fffffffcc00 140737488342016 r15 0xffffffffffffff 72057594037927935 rip 0x5555560f94ab <JS::Zone::checkStringWrappersAfterMovingGC()+763> => 0x5555560f94ab <JS::Zone::checkStringWrappersAfterMovingGC()+763>: movl $0x0,0x0 0x5555560f94b6 <JS::Zone::checkStringWrappersAfterMovingGC()+774>: ud2 This issue only reproduces with the patch applied.

The other issue the fuzzer was seeing was

Assertion failure: aKey->flags() == aLookup->flags(), at js/src/gc/Nursery.h:572

but I can confirm that is fixed with the new patch.

Comment on attachment 9080802 [details]
Bug 1568923 - String deduplication during tenuring. r=sfink

f+ with that one outstanding Debugger issue fixed :)

Attachment #9080802 - Flags: feedback?(choller) → feedback+

Comment on attachment 9080802 [details]
Bug 1568923 - String deduplication during tenuring. r=sfink

I don't think I have found anything with the newer versions of the patch so far either. :)

Attachment #9080802 - Flags: feedback?(nth10sd) → feedback+

I'm still seeing a number of different crashes with the updated patch, though decoder's comment 17 is the most common. Hopefully the others are different manifestations of the same issue. Thanks, that'll help a lot! My local attempts at reproducing have failed so far.

I am wondering if there any one is still working on this bug?!

Hi Thinker! Good to see you here.

(In reply to Thinker Li [:sinker] from comment #22)

I am wondering if there any one is still working on this bug?!

I was intending to get back to this, but haven't made it yet and likely won't for a couple of weeks.

The main things to do here are (1) rebase on top of the removal of undepended strings, which should simplify it somewhat, and (2) fix the remaining issues like comment 17.

Krystal wrote up a nice overview doc at https://docs.google.com/document/d/1CSOJQ_PS0X4rTwH_3qgYYIuXGVUnXoG9miMoK40y8_g/

There was another more detailed doc that we used over the summer to communicate running progress and issues and fixes, but I'm failing to find it right now.

I have rebased the patch to latest m-c. There are two major changes here.

  1. Code blocks dealing undepended strings are removed. I am not sure if it is correct, but I guess undepended strings have been removed from the code base. So, handling it is no more necessary.

  2. Fix the crash mentioned at comment #17. The major problem here is related to NeureryAwareHashMap. It maintains a map, crossZoneStringWrappers, from strings to wrappers. With deduplicating, two or more strings are deduplicated by replacing them with the same string. That means two or more entries in the map will be rekeyed with the same dest key. With the existing impl. of rekey, it will create two or more entries with the same key but different values/wrappers for this case. When doing enumeration, it would return all entries of with the same key. However, doing lookup always return the same entry for a given key. This is why the assertion in Zone::checkStringWrappersAfterMovingGC() fails.

I forgot to mention about the solution of 2 in the last comment.
I check the duplication in NuseryAwareHashMap when doing sweeping, and remove the duplicated ones instead of rekey them.

The open issue here is deduplicating string wrappers. I am not sure if it is worth to do.

Attached patch dedup-string.diff (obsolete) — Splinter Review

fix a minor bug.

Attachment #9123925 - Attachment is obsolete: true

The testcase, deduplicateTenuringStrings.js, fails at calling |ensureFlatString()|. I guess this symbol has gone. My questions are followings.

  • Is there any replacement of |ensureFlatString()|?
  • The tests for undepended strings should be removed, right?
  • Is there any new types of strings that should be tested, but not in the test case yet?

Flat strings are gone now, see bug 1330776. Any string that was previously flat, ought instead be linear now, I think. You might be able to replace with ensureLinear() now. I think undepended strings got removed too, don't remember the bug number, part of the same change I think? There are no new strings types -- we only removed existing ones.

Attached file dedup-string-v2.diff

Fix the test case. Have passed all test cases in js/src/tests & js/src/jit-test.

Attachment #9123936 - Attachment is obsolete: true

I have done tests for this patch to see how much we can gain from this patch.
The test is to scroll the tab of facebook, up and down, until it's js-zone (almost) grows to 30MB.
The results are

  1. Without the patch here, all copies=289942, duped copies=232859, dup size=12821 k, 29.75MB.
  2. With attachment #9123936 [details] [diff] [review], all copies=176201, duped copies=100839, dup size=7154 k, 29.75MB.
  3. With attachment #9123936 [details] [diff] [review] and the patch of reservoir sampling, all copies=177913, duped copies=88304, dup size=4199 k, while the size of all strings is 29.34MB.

Terminologies:

  • All copies here means the number of JSString instances.
  • Duped copies is the number of JSString instances that are duplicates of other strings. For example, there are 10 "ABCDEFG" strings, then it counts 9 (10-1) duped copies.
  • dup size is the size of all duplicates. For example, there are 10 "ABCDEFG". then the dup size of these instances are (10-1) * strlen("ABCDEFG").
  • reservoir sampling picks N instances in total, no master how many instances there is. The reservoir patch here take 1024 samples randomly and detects their duplicates at the mark phase. Once any of them has duplicates, it will be promoted to a duplicate set that is used to dedup during minor GC.

Results:

  1. With attachment #9123936 [details] [diff] [review], it reduces more than 5.5mb of duplicates. (about 18.3% of 30mb)
  2. With attachment #9123936 [details] [diff] [review] and the reservoir sampling patch, it reduces more than 8.5mb of duplicates. (about 28.3% of 30mb)

The reservoir sampling patch will be uploaded later.

Attached patch reservoir.diffSplinter Review

Implement the reservoir sampling. Pick samples at the mark phase.

Another thing to fix. The test is to scroll a facebook tab, until the size of all strings (not js-zone's size) in it's js-zone (almost) grows to 30mb.

Hi Steve,
do you think reservoir sampling is reasonable change?

Flags: needinfo?(sphink)

(In reply to Thinker Li [:sinker] from comment #34)

do you think reservoir sampling is reasonable change?

Sorry for the (long) delay.

I do think reservoir sampling is a good idea. I'd like the nursery-based deduplication to land separately, though. It is a very separable piece, and I think the reservoir sampling has a lot of places where we might want to adjust various details:

  • what should it do on out-of-memory?
  • should it even try to build its data structures when memory use is high?
  • how many reservoirs?
  • would it be good to turn deduplicated tenured strings into atoms?
  • do we need to schedule GCs with the reservoir a little differently so we are more likely to be able to free up some memory before we hit OOM?

Among others. We can always land something that basically works and then worry about tuning later, except we probably wouldn't want to land anything that would crash on OOM. (It would be better to discard all of the deduplication data structures and allow the GC to continue without deduplication.) Still, there are enough details that I would still prefer to split out the nursery deduplication to land first.

What do you think? And thank you very much for doing this.

Flags: needinfo?(sphink) → needinfo?(thinker.li)

I agree that landing separately is better and safer. I would open another bug for reservoir sampling.

Flags: needinfo?(thinker.li)

This patch bases on the work of krystalyang2 with minor bug fixes.
The patch includes three major parts,

  1. mark nursery strings pointed by tenured strings as
    non-deduplicatable,

  2. deduplicate strings when they are moved to tenured from nursery, and

  3. adjust dependent strings to correct it's pointers to the base
    string and external buffer after tenuring.

Assignee: krystalyang2 → sphink
Status: NEW → ASSIGNED
Attachment #9146686 - Attachment description: Bug 1568923 - Nursery string deduplication - trace whole cell buffer first → Bug 1568923 - String deduplication during tenuring. r=jonco
Attachment #9139978 - Attachment is obsolete: true
Attachment #9080802 - Attachment is obsolete: true

I split off the mechanism for disabling/re-enabling nursery strings. If I still want it (it's probably a good thing to have fuzzed, even though we don't implement re-enabling yet), I'll do it in another bug.

Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla79
Regressions: 1646694
Blocks: 1648173
Blocks: 1648174
Blocks: 1648196
Attachment #9146686 - Attachment is obsolete: true
Pushed by sfink@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/2b4e1cece238 String deduplication fixups from review r=jonco
Regressions: 1645415
Attachment #9153261 - Attachment is obsolete: true
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: