Closed Bug 1100652 Opened 5 years ago Closed 5 years ago

Fix the StoreBuffer

Categories

(Core :: JavaScript Engine, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla36

People

(Reporter: terrence, Assigned: terrence)

References

(Blocks 1 open bug)

Details

Crash Data

Attachments

(1 file, 1 obsolete file)

Attached patch simplify_storebuffer-v0.diff (obsolete) — Splinter Review
I am incredibly dumb.

When I initially designed the StoreBuffer, we did not allow the GC to OOM. Thus, I ended up with a simple ungrowable buffer; a complicated O(nlogn) algorithm to keep the size tractable between GCs; and another complicated pile of heuristics to keep this O(nlogn) algorithm from eating the world. This was not terribly successful, sadly, as per bug 996813 and others.

Eventually we relaxed the the no-OOM requirement for other reasons. This allowed us to re-think the StoreBuffer a bit and we replaced the O(nlogn) algorithm with an O(n)  algorithm. This uses a HashSet to filter out duplicate entries. We still, naturally, need the pile of heuristics to tell us when to do this dedup so that we can keep the canonical linear buffer from getting too big, but also keep from re-doing the same work over and over.

This is simpler than the previous scheme, but seriously, fundamentally stupid. Like facepalm stupid.

We are putting every entry that came into the store buffer into a HashSet, then throwing the set away and keeping the linear buffer, then re-doing that work every time the linear buffer gets full. Why not just keep the HashSet around and ditch the buffer? Well, uh....

At no cost whatsoever, we can ditch all of our complex heuristics, our frequent OOMS from transient allocations for the dedup set, the long hangs from compacting the store buffer. Everything, basically, except for the HashSet itself. That said, in this patch, I have kept a trivial linear buffer in front of the set insertions to make the eventual jit implementation easier and to keep the largish HashSet insertion logic out-of-line. Otherwise though, all gone.

In a follow-up, I'd like to also reimplement GenericBuffer as well on top of realloc and ditch all of the weird LifoAlloc nonsense. That can come later though.
Attachment #8524158 - Flags: review?(sphink)
Actually, jitting is simpler if the buffer uses an insertion pointer instead of an offset.
Attachment #8524158 - Attachment is obsolete: true
Attachment #8524158 - Flags: review?(sphink)
Attachment #8524228 - Flags: review?(sphink)
Comment on attachment 8524228 [details] [diff] [review]
simplify_storebuffer-v1.diff

Review of attachment 8524228 [details] [diff] [review]:
-----------------------------------------------------------------

Um, yeah. No further comment. New code looks good.
Attachment #8524228 - Flags: review?(sphink) → review+
Uh, let's try that again, only this time with /all/ the hunks I intended:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=79091451eb13
(In reply to Terrence Cole [:terrence] from comment #0)
Nice, this is a whole lot simpler now!
Thanks Wes! Sorry for the trouble. This test also failed on Try, but I totally overlooked it: it didn't repro locally and I had similar bustage on tip under a different Try run, so assumed it was the same thing.
Flags: needinfo?(terrence)
There appears to be a case where we can clear without ever having marked the buffer. In this case we need to clear the linear array as well as the set. I've added a clearBuffer that resets the pointer and poisons in debug mode and called this from all the places we need to reset insert_.

https://hg.mozilla.org/integration/mozilla-inbound/rev/5f35b06c4c8b
https://hg.mozilla.org/mozilla-central/rev/5f35b06c4c8b
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla36
I see a bunch of OOM reports on beta at various flavors of js::gc::StoreBuffer::MonoTypeBuffer<T>::put. Do you expect this patch to fix/reduce those OOMs?
Flags: needinfo?(terrence)
It depends on the program being evaluated. The existing solution is a segmented array which intermittently allocates a hashset for all edges to dedup and shrink the original array. The new solution is just a hashset.

So for the before case:
  1x required entries with growth based on seen entries and occasional spikes adding 2x required entries to this.

For the new case:
  Flat 2x required entries.

Although the base memory requirements are higher with a hashset, this is mitigated heavily by three things. First, the set only grows with actual new edges, not proportional to observed edges. Since repeated edges are extremely common, this should actually lower average utilization. Secondly, it removes the temporary spikes to allocate a new hashset when the array is nearly full. Thirdly, the allocations are capped and not freed, so the currently OOMy allocations are likely to be made up front and much less likely to be made when the system is near the limit.

In summary, yes, I think it will reduce these OOMs, but it's not a given. It will be interesting to keep an eye on it for the next few weeks.
Flags: needinfo?(terrence)
Also, this code should be easy and safe to uplift if it does a good job of fixing the OOMs.
Crash Signature: [@ OOM | unknown | js::CrashAtUnhandlableOOM(char const*) | js::gc::StoreBuffer::MonoTypeBuffer<js::gc::StoreBuffer::WholeCellEdges>::put(js::gc::StoreBuffer*, js::gc::StoreBuffer::WholeCellEdges const&)] [@ OOM | unknown | js::CrashAtUnhandlableOOM(char…
I don't see an improvement on nightly (it's actually slightly worse) but the volume is so low to begin with that it's hard to say.
Duplicate of this bug: 988369
(In reply to dmajor (away/busy) from comment #14)
> I don't see an improvement on nightly (it's actually slightly worse) but the
> volume is so low to begin with that it's hard to say.

I guess that means we normally get wedged at some specific workload, rather than building up slowly in a session. I guess that's probably what we should have expected.
You need to log in before you can comment on or make changes to this bug.