Bug 1112278 (Closed): Opened 5 years ago, Closed 5 years ago

Build a GC micro-benchmark harness for regression testing and optimization

Categories

(Core :: JavaScript: GC, defect)

Tracking

RESOLVED FIXED
mozilla37

People

(Reporter: terrence, Assigned: terrence)

References

(Blocks 2 open bugs)

Details

Attachments

(2 files)

Generally, micro-benchmarks are bad. However, we still don't have a "real" GC benchmark, largely because building one is unbelievably hard. I think we're at the point now where perfect is the enemy of good, and we need something (anything, really) to act as a guide -- reasonable or not -- for our efforts.

There are two aspects of GC testing that make this approach less terrible here than it is in general:

1) There are only a fixed (if pretty large) number of ways to trigger a GC, so we should be able to get coverage complete enough that we might end up being good at "real" workloads as well.

2) Everyone else has the same problem. Most of the time when people complain about Firefox's GC being terrible, they're using some horrid micro-benchmark that is totally unrepresentative of "real" apps. We get enough of these that I'm starting to think that "crap GC benchmarks" is itself a real workload.

This is a heavily modified version of Bill's JS pause graph. It has been extended with the ability to load multiple tests, tweak test parameters, and run one or all tests over a fixed interval. Currently there are only the original workload and a "no garbage" test to choose from, and the tests do not report any results yet. I figured I might as well get some early feedback and get it in-tree for others to play with.

It probably also needs a README: so far I've been serving it with `python -m SimpleHTTPServer`. Adding a new test is best done by copying one of the JS files in benchmarks/, modifying it to taste, and adding it to the list of scripts loaded at the top of index.html.
Attachment #8538124 - Flags: review?(sphink)
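For concreteness, a new benchmark file might look something like the following sketch. The shape is purely illustrative: the property names (`description`, `garbagePerFrame`, `makeGarbage`) are assumptions for this example, not necessarily the harness's real interface.

```javascript
// Hypothetical sketch of a gc-ubench-style test file. The property names
// (description, garbagePerFrame, makeGarbage) are illustrative assumptions,
// not necessarily the harness's actual interface.
var linkedListTest = {
  description: "Allocate a fresh linked list each frame, dropping the old one",
  garbagePerFrame: 1000,
  makeGarbage(n) {
    // Build an n-node list; the previously built list becomes garbage.
    let head = null;
    for (let i = 0; i < n; i++)
      head = { next: head, value: i };
    this.head = head; // keep only the newest list alive
  },
};
```

Each animation frame the harness would then call something like `linkedListTest.makeGarbage(linkedListTest.garbagePerFrame)`, generating steady allocation pressure while the live set stays roughly constant.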
I'm not 100% sure of the scoring system yet -- milliseconds missed per second -- but the histogram output looks like it's going to be quite helpful.
Attachment #8538788 - Flags: review?(sphink)
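As a sketch of how "milliseconds missed per second" could be computed: at 60 fps each frame has a budget of roughly 16.7 ms, so any time by which a frame gap overruns its budget counts as "missed", normalized by the run's duration. The function name and exact formula here are my assumptions for illustration, not necessarily what the patch implements.

```javascript
// Plausible sketch of a "milliseconds missed per second" score; the exact
// formula is an assumption, not necessarily what the patch implements.
const FRAME_BUDGET_MS = 1000 / 60; // ~16.7 ms per frame at 60 fps

function missedMsPerSecond(frameTimestamps) {
  if (frameTimestamps.length < 2) return 0;
  let missedMs = 0;
  for (let i = 1; i < frameTimestamps.length; i++) {
    const gap = frameTimestamps[i] - frameTimestamps[i - 1];
    missedMs += Math.max(0, gap - FRAME_BUDGET_MS); // overrun beyond budget
  }
  const elapsedSec =
    (frameTimestamps[frameTimestamps.length - 1] - frameTimestamps[0]) / 1000;
  return missedMs / elapsedSec;
}
```

A perfectly smooth run scores 0; long GC pauses push the score up in proportion to how much frame time they consumed. The same per-gap overruns could also be bucketed to drive a pause-time histogram.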
Comment on attachment 8538124 [details] [diff] [review]
gc-ubench-v0.diff

Review of attachment 8538124 [details] [diff] [review]:
-----------------------------------------------------------------

Hey, it's something.
Attachment #8538124 - Flags: review?(sphink) → review+
Comment on attachment 8538788 [details] [diff] [review]
add_scoring-v0.diff

Review of attachment 8538788 [details] [diff] [review]:
-----------------------------------------------------------------

I'm just going to r+ this and play with it later. Seems like it's more important to just play around with this without worrying about review overhead for a while.
Attachment #8538788 - Flags: review?(sphink) → review+
(In reply to Steve Fink [:sfink, :s:] from comment #4)
> Comment on attachment 8538788 [details] [diff] [review]
> add_scoring-v0.diff
> 
> Review of attachment 8538788 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I'm just going to r+ this and play with it later. Seems like it's more
> important to just play around with this without worrying about review
> overhead for a while.

I strongly agree. Let me get this in so that we can all play with it equally.
4 more simple tests to start things off:
https://hg.mozilla.org/integration/mozilla-inbound/rev/6cd496d732f7
https://hg.mozilla.org/mozilla-central/rev/30f9b7719d69
https://hg.mozilla.org/mozilla-central/rev/6cd496d732f7
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla37