Closed Bug 787464 Opened 13 years ago Closed 13 years ago

GC: Run spinning balls benchmark overnight

Categories

(Core :: JavaScript Engine, enhancement)

RESOLVED WONTFIX

People

(Reporter: jonco, Unassigned)

Details

(Whiteboard: [js:t])

Attachments

(3 files, 1 obsolete file)

To try to track our GC performance, and especially the size of pause times, I put together a Python/Selenium script to run the spinning balls benchmark in the browser and scrape the results. It would be really cool if we could arrange for this to be run overnight so we could track our progress on this. I've attached the script and the results for weekly builds run on MacOS. The results vary quite a lot between builds, and I haven't yet made any attempt to track down why. The graph shows the average of ten runs of the benchmark, with the error bars showing one standard deviation.
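For anyone reproducing the graph, here is a minimal sketch of how each plotted point could be derived from ten scraped scores. This is not the attached script: the function name `summarize_runs` is hypothetical, and the use of the sample standard deviation (rather than the population one) is an assumption, since the comment above does not say which was used.

```python
import statistics


def summarize_runs(scores):
    """Return (mean, standard deviation) for the scores of one build.

    `scores` is a list of benchmark scores, one per run (ten in the
    comment above). The sample standard deviation is an assumption.
    """
    if len(scores) < 2:
        raise ValueError("need at least two runs to compute a deviation")
    mean = statistics.mean(scores)
    stdev = statistics.stdev(scores)  # assumption: sample, not population
    return mean, stdev


if __name__ == "__main__":
    # Made-up scores for illustration only, not measured results.
    example_scores = [10, 12, 14, 16, 18]
    mean, stdev = summarize_runs(example_scores)
    print(f"mean={mean} error bar=+/-{stdev:.2f}")
```

The mean would be the plotted value and the deviation the half-height of the error bar for that build.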
Attached image Results for weekly builds on MacOS (obsolete)
Good idea. Jon, you could file a bug to get this added to Talos. For an example, see bug 767225.
I was thinking something a little more internal to the team - I'm not sure we want to make this a part of the official test runs just yet, at least until we get to the bottom of why the results are changing so much between builds. I'm currently working on getting some more detailed results which I hope will help.
Attachment #657331 - Attachment is obsolete: true
(In reply to Jon Coppeard (:jonco) from comment #4)
> Created attachment 659733 [details]
> Results for daily builds on MacOS

Interesting. Do you know what happened (if anything) around Aug 14?
Whiteboard: [js:t]
Here's what landed in that range:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=ea4dc0320767&tochange=b441413e4c2d

The only GC-related change is some telemetry stuff, which shouldn't cause any trouble. If I had to guess, I would blame bug 539356, which is a huge graphics-related change. Jon, if you have time, could you directly test before and after that change landed? I get uniformly horrible scores on my machine throughout the entire range.
I removed the drawing of the balls from the benchmark to generate the second graph, to focus on GC rather than graphics performance. This gives much better results, in that the score is increasing over time.
To run the benchmark without graphics, I commented out the line: scene.draw(); from render() in v.js.
Is higher better? Running locally I get 18.
(In reply to Terrence Cole [:terrence] from comment #10)
Higher is better. The score is calculated as:

score = number of frames * 1000 / sum ( pause time ^ 2 )

So it is very sensitive to even a few long pauses, for example if the machine is doing something else at the same time.
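To make the sensitivity concrete, here is a rough sketch of that formula in Python. It assumes `^` means exponentiation and that pause times are in milliseconds; the function name `spinning_balls_score` and the sample numbers are made up for illustration and are not taken from the benchmark source.

```python
def spinning_balls_score(num_frames, pause_times_ms):
    """score = number of frames * 1000 / sum(pause time ^ 2).

    Assumes pause times are in milliseconds and that '^' in the
    formula above denotes squaring.
    """
    sum_of_squares = sum(p * p for p in pause_times_ms)
    if sum_of_squares == 0:
        raise ValueError("no non-zero pauses recorded")
    return num_frames * 1000 / sum_of_squares


if __name__ == "__main__":
    # Illustrative numbers only: 100 frames with uniform 10 ms pauses,
    # then the same run with a single 200 ms pause.
    steady = spinning_balls_score(100, [10] * 100)
    one_spike = spinning_balls_score(100, [10] * 99 + [200])
    print(steady, one_spike)
```

Because each pause is squared, the single 200 ms pause in the second call contributes 40000 to the denominator against 9900 for the other 99 frames combined, so one outlier cuts the score to roughly a fifth.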
In the end I came to the conclusion that this benchmark is just too sensitive to give useful results. It is also focused on a single-compartment use case, when a lot of what we're doing is trying to improve the situation where we have lots of compartments. So I'm closing this.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WONTFIX