Closed
Bug 1242691
Opened 8 years ago
Closed 8 years ago
Investigate why CodeMirror frequently has ~20ms long minor GCs because of FULL_STORE_BUFFER
Categories
(Core :: JavaScript: GC, defect)
Core
JavaScript: GC
Tracking
()
RESOLVED
FIXED
mozilla47
Tracking | Status | |
---|---|---|
firefox47 | --- | fixed |
People
(Reporter: fitzgen, Assigned: fitzgen)
References
Details
Attachments
(1 file, 2 obsolete files)
4.33 KB,
patch
|
fitzgen
:
review+
|
Details | Diff | Splinter Review |
Tweaked MonoTypeBuffer::MaxEntries to see if that would lower the length of gc pauses, but it didn't help much: """ Loading 50MB source text into CodeMirror, max minor gc pause (microseconds, lower is better) current 48 * 1024 / sizeof(T) = 24775 2 * 1024 / sizeof(T) = 23029 4 * 1024 / sizeof(T) = 25364 8 * 1024 / sizeof(T) = 22432 16 * 1024 / sizeof(T) = 22230 32 * 1024 / sizeof(T) = 23001 """ Then I dumped out the number of slots in the store buffer for each minor GC (this is mostly from Array#splice which uses the slots buffer not the whole cell buffer), and found that it was consistently in the same ballpark, regardless the value of MaxEntries. Current theory is that we aren't reaching a safepoint in time, and maybe we should check in Array#splice if we should minor GC because the slots buffer might be full.
Assignee | ||
Comment 1•8 years ago
|
||
I am consistently seeing only 245 slots edges in the buffer, but the current MaxEntries is 6144. Added this to the bottom of array_splice_impl: cx->runtime()->gc.gcIfRequested(cx); But I'm not seeing any differences, which makes sense if we are only putting 245 slots in the store buffer. But that doesn't make sense since my gcreason always seems to be FULL_STORE_BUFFER... So I set a breakpoint in StoreBuffer::setAboutToOverflow, set breakpoint commands for it to dump the 5 youngest frames and then continue, and then ran the code mirror test case again. *We aren't ever hitting this breakpoint when running the test case.* Yes, we hit it on startup and shutdown, but never when running the actual test case. Digging in some more to see what is actually triggering the minor collections...
Assignee | ||
Comment 2•8 years ago
|
||
Frequently, the mutator will modify nearly the same elements of an object repeatedly. However, because the set of elements aren't exactly the same, the single item buffer in front of MonoTypeBuffer can't de-duplicate these edges. For example, in one CodeMirror test case, we would add 245 SlotsEdges entries for almost the same 50,000 elements in an object, causing us to trace these same 50,000 elements 245 times! This patch makes `js::gc::StoreBuffer::putSlot` check to see if the new range is overlapping with the last range added, and if so, merge the ranges rather than adding partially duplicated elements into the store buffer. This gives a 1000 point increase on Octane's pdf.js subsuite locally. The CodeMirror test case mentioned above goes from ~10 seconds execution time to ~1.5 seconds, with the max minor gc pause dropping from up to 40 milliseconds, down to 4 milliseconds.
Attachment #8713260 -
Flags: review?(terrence)
Comment 3•8 years ago
|
||
Comment on attachment 8713260 [details] [diff] [review] Merge overlapping SlotsEdges in the store buffer Review of attachment 8713260 [details] [diff] [review]: ----------------------------------------------------------------- \o/ ::: js/src/gc/StoreBuffer.h @@ +296,5 @@ > + // false if they do not overlap. > + bool overlaps(const SlotsEdge& other) const { > + // TODO FITZGEN: should we just do objectAndKind_ != > + // other.objectAndKind_ here or can we assume the compiler will see > + // through all this? I don't think the perf is going to matter that much, but I would still just compare objectAndKind_: I think the naming makes it clear enough what's being compared. @@ +303,5 @@ > + } > + auto end = start_ + count_; > + auto otherEnd = other.start_ + other.count_; > + return object() == other.object() && > + kind() == other.kind() && You are comparing object and kind a second time? I'd think you could skip this as they will always be true after the test above.
Attachment #8713260 -
Flags: review?(terrence) → review+
Assignee | ||
Comment 4•8 years ago
|
||
Addressed review comments (d'oh!) and made it so that adjacent ranges are considered overlapping so that a series of writes to element 0, then 1, then 2, ..., then N are coalesced into a single range [0, N] in the store buffer.
Attachment #8713326 -
Flags: review+
Assignee | ||
Updated•8 years ago
|
Attachment #8713260 -
Attachment is obsolete: true
Assignee | ||
Comment 5•8 years ago
|
||
Try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=15074bf694e7
Assignee | ||
Updated•8 years ago
|
Attachment #8713326 -
Attachment is obsolete: true
Assignee | ||
Comment 7•8 years ago
|
||
New try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=931905133c68
Assignee | ||
Updated•8 years ago
|
Keywords: checkin-needed
Comment 9•8 years ago
|
||
(In reply to Nick Fitzgerald [:fitzgen] [⏰PST; UTC-8] from comment #2) > This gives a 1000 point increase on Octane's pdf.js subsuite locally. The > CodeMirror test case mentioned above goes from ~10 seconds execution time to > ~1.5 seconds, with the max minor gc pause dropping from up to 40 > milliseconds, > down to 4 milliseconds. Can you confirm this is still the case with the newest patch? I didn't see a bump on AWFY on x86 and x64 shell. Which surprised me since there is only 50 points noise. As a result a 1000 point increase (doubling) would definitely be visible?
Comment 10•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/ba837846ee44
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
status-firefox47:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
Assignee | ||
Comment 11•8 years ago
|
||
(In reply to Hannes Verschore [:h4writer] from comment #9) > (In reply to Nick Fitzgerald [:fitzgen] [⏰PST; UTC-8] from comment #2) > > This gives a 1000 point increase on Octane's pdf.js subsuite locally. The > > CodeMirror test case mentioned above goes from ~10 seconds execution time to > > ~1.5 seconds, with the max minor gc pause dropping from up to 40 > > milliseconds, > > down to 4 milliseconds. > > Can you confirm this is still the case with the newest patch? > I didn't see a bump on AWFY on x86 and x64 shell. Which surprised me since > there is only 50 points noise. As a result a 1000 point increase (doubling) > would definitely be visible? ni myself to figure this out -- I didn't re-test the updated patches...
Assignee | ||
Updated•8 years ago
|
Flags: needinfo?(nfitzgerald)
Comment 12•8 years ago
|
||
Comment on attachment 8713392 [details] [diff] [review] Merge overlapping SlotsEdges in the store buffer Review of attachment 8713392 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/gc/StoreBuffer.h @@ +307,5 @@ > + auto start = start_ - 1; > + > + auto otherEnd = other.start_ + other.count_; > + return (start <= other.start_ && other.start_ <= end) || > + (start <= otherEnd && otherEnd <= end); I think this could be simplified to: return !(otherEnd < start || end < other.start); I don't understand why end needs to be _start + count_ + 1, either with or without my suggestion.
Assignee | ||
Comment 13•8 years ago
|
||
Just realized that I was testing this on top of the patch in bug 1233857, so that might be contributing to the speedup I observed when combined with this one.
Assignee | ||
Comment 14•8 years ago
|
||
Ran octane with all variations of this patch and WIP patch from bug 1233857: M = Adding elements edges write barrier into the MIR (patch from bug 1233857) R = Merging SlotsEdge ranges (this patch) M+R R (none) M pdf.js subscore 19951 16858 17991 20109 total score 32729 32358 32662 33127 Conclusions: I need to have larger sample sizes; I think this is all noise. Hopefully once I stop using shared js for my shell builds, I will be able to run octane under the shell rather than in the browser and gathering more samples will get easier. That said, the CodeMirror testcase's speedup stands with this patch, and that alone is super valuable IMO.
Flags: needinfo?(nfitzgerald)
Comment 16•8 years ago
|
||
Should this have the "perf" key word?
You need to log in
before you can comment on or make changes to this bug.
Description
•