636096 - really slow on earthmodel testcase/benchmark

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Reporter

Description

14 years ago

From a WebGL mailing list post:

----
Using JavaScript Typed Arrays:
http://visual-demos.dev.concord.org/seasons/earth/model2d.html

Using regular JavaScript Arrays:
http://visual-demos.dev.concord.org/seasons/earth/model2d-reg-arrays.html

steps per second

browser/version                      Regular Arrays    Typed Arrays
------------------------------------------------------------------------
Minefield 4.0b12pre (2011-02-22):         17.2             13.3
WebKit 79303:                             15.5             27.5
Chrome 9.0.597.102                        67.6             42.2
Chrome 10.0.648.82 beta                  109.8             38.4

I ran my tests on a MacBook Pro, Mac OS X 10.6.6, Intel Core i& 2.66 GHz

The code is available here: https://github.com/stepheneb/seasons

More specifically:
https://github.com/stepheneb/seasons/blob/master/earth/model2d-reg-arrays.html
https://github.com/stepheneb/seasons/blob/master/earth/model2d.html
---

Note that having typed arrays be slower is suspect to begin with...
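For anyone who wants to reproduce the regular-array vs typed-array comparison outside the full model, here is a minimal sketch. It is not the Concord model code: the `relax` stencil loop, the grid size, and the choice of `Float64Array` are all assumptions made for illustration (the original pages may use a different typed array kind). It only shows the shape of the comparison, running the same loop over both storage types and timing each.

```javascript
// Hypothetical micro-benchmark: the same math/array loop over a regular
// Array and a typed array. Sizes and the stencil itself are illustrative only.
function makeGrid(ArrayType, n) {
  const a = new ArrayType(n);
  for (let i = 0; i < n; ++i) a[i] = i % 7; // deterministic fill
  return a;
}

// A simple 1D smoothing pass: array gets/sets plus a little arithmetic.
function relax(a, passes) {
  for (let p = 0; p < passes; ++p) {
    for (let i = 1; i < a.length - 1; ++i) {
      a[i] = 0.5 * a[i] + 0.25 * (a[i - 1] + a[i + 1]);
    }
  }
  return a;
}

function timeIt(ArrayType, n, passes) {
  const a = makeGrid(ArrayType, n);
  const t0 = Date.now();
  relax(a, passes);
  // Return one element as a checksum so both runs can be compared.
  return { ms: Date.now() - t0, checksum: a[n >> 1] };
}

const regular = timeIt(Array, 10000, 50);
const typed = timeIt(Float64Array, 10000, 50);
console.log('regular:', regular.ms, 'ms; typed:', typed.ms, 'ms');
```

Both runs do identical double-precision arithmetic, so the checksums should match; only the timings are expected to differ between engines.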

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Reporter

Comment 1

14 years ago

Looking into this. First off, the testcase uses:

    paintInterval = setInterval("renderModelStep()", 0);

for benchmarking. Additionally, inside renderModelStep(), it sets innerHTML per step to a string. None of that is helping us. Changing things to use postMessage didn't change the numbers much.

I wrote a simple benchmark function in model2d.html:

    function do_benchmark() {
      var model = new model2d.Model2D();
      var t0 = (new Date()).getTime();
      for (var i = 0; i < 100; ++i) {
        model2d.addHotSpot(model, 50.0);
        model.nextStep();
      }
      var t1 = (new Date()).getTime();
      step_count.innerHTML = '100 steps: ' + (t1 - t0) + 'ms';
    }

Tie it to a button somewhere. This gives me 3000 in Minefield, and 2500 in Chrome 10 -- the 2500 number is ~40fps, which fits well with 38fps above (because it's not doing setInterval or the innerHTML set inside the hot loop). Minefield is at ~33fps, which is over double what I get with the original benchmark (~15/~16), leading me to believe that the innerHTML set is killing us the most.

Ok, now on to why we're at 3000. addHotSpot() does a very simple loop with some misc math and array setting. The meat is in nextStep(). Note that sunShine() calls shootAtAngle() which has an early return before all the meat -- e.g. no Photon constructors in loops etc. A lot of the meat is in fluidsolver.solve() -- commenting that out, I get 924ms with Minefield, and 1168ms with Chrome10 (that is, we end up being faster, and 2/3 of the time is in fluidsolver).

Both those functions seem to be largely math and array sets/gets, and some property gets from |this|. I don't have a debug build, but I wonder if we're just aborting due to too long of a trace?
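The ms-to-fps conversions above (3000ms for 100 steps is ~33fps, 2500ms is ~40fps) can be sketched as below. The helper names `stepsPerSecond` and `timeSteps` are made up for this illustration and are not part of model2d; `timeSteps` only shows the general idea of timing the hot loop with no DOM writes inside it.

```javascript
// Convert "N steps took T ms" into steps per second.
function stepsPerSecond(steps, elapsedMs) {
  return steps / (elapsedMs / 1000);
}

// Time `steps` calls of a step function, keeping DOM updates (such as
// innerHTML writes) out of the loop. `stepFn` is a hypothetical stand-in
// for something like model.nextStep().
function timeSteps(stepFn, steps) {
  const t0 = Date.now();
  for (let i = 0; i < steps; ++i) {
    stepFn(i);
  }
  return { steps: steps, elapsedMs: Date.now() - t0 };
}

console.log(stepsPerSecond(100, 3000).toFixed(1)); // 33.3
console.log(stepsPerSecond(100, 2500).toFixed(1)); // 40.0
```

Reporting the result once, after the loop, keeps the measurement focused on the model code rather than on layout or string-to-DOM work.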

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Reporter

Comment 2

14 years ago

Attached file: standalone benchmark

standalone benchmark. can be trivially turned into shell testcase if needed.

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Reporter

Comment 3

14 years ago

(Also, our first run is generally ridiculously slow -- more indication of too-long tracer abort?)

Jan de Mooij [:jandem]

Comment 4

14 years ago

Framerate goes from 14 to 23 after enabling methodjit_always (or disabling the tracer). Maybe a regression from bug 631951. Seems bad.

Jan de Mooij [:jandem]

Comment 5

14 years ago

In the shell this is even more dramatic:

-m -j   : 100 steps: 6187ms (16 fps)
-m -j -a: 100 steps: 3262ms (31 fps)

Jan de Mooij [:jandem]

Comment 6

14 years ago

Two problems:

1) JM does not compile JSOP_CASE. In function FluidSolver2D.prototype.applyBuoyancy we have cases like this: case model2d.BUOYANCY_AVERAGE_COLUMN. This is bug 628073. Working around this wins 6 fps with -m only (from 31 to 37).

2) This JM abort makes mjp much slower than mjpa. Without the abort there is not much difference.

Jan de Mooij [:jandem]

Comment 7

14 years ago

Attached file: Modified shell test case (obsolete)

For the attached file:

-m -j -p   : 38 fps
-m -j -p -a: 93 fps

After commenting out line 939:

-m -j -p   : 93 fps
-m -j -p -a: 93 fps

Does not slow down when working around the JM abort by changing model2d.BUOYANCY_AVERAGE_ALL to 0 and model2d.BUOYANCY_AVERAGE_COLUMN to 1.
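To illustrate the workaround, here is a rough sketch of the two switch styles. The `model2d` object below is a hypothetical stand-in holding only the two constants, using the values 0 and 1 mentioned above, and the function names and return strings are invented for the example; this is not the body of applyBuoyancy. The first function uses property lookups as case labels (the form that hits the JSOP_CASE abort in JM), the second uses literals.

```javascript
// Hypothetical stand-in for the benchmark's namespace object.
const model2d = { BUOYANCY_AVERAGE_ALL: 0, BUOYANCY_AVERAGE_COLUMN: 1 };

// Original style: each case label is a property lookup, i.e. a
// non-constant case expression.
function modeWithPropertyCases(mode) {
  switch (mode) {
    case model2d.BUOYANCY_AVERAGE_ALL:
      return 'all';
    case model2d.BUOYANCY_AVERAGE_COLUMN:
      return 'column';
    default:
      return 'unknown';
  }
}

// Workaround style: the same dispatch with literal case labels.
function modeWithLiteralCases(mode) {
  switch (mode) {
    case 0:
      return 'all';
    case 1:
      return 'column';
    default:
      return 'unknown';
  }
}

console.log(modeWithPropertyCases(1), modeWithLiteralCases(1)); // column column
```

The two forms behave the same as long as the constants keep those values; the difference discussed in this bug is only in how the method JIT of the time handled them.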

David Mandelin [:dmandelin]

Updated

14 years ago

Blocks: WebJSPerf

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 8

14 years ago

I noticed that, in Jan's test case, there is also a huge 32-bit/64-bit difference. When running it with -m -j -p -a, I get these numbers:

32-bit: 44fps
64-bit: 98fps

These are for the same machine, of course. Normally we're faster on 32-bit. I guess we generate better typed array code on 64-bit? Or maybe it's register allocation?

I'll look into the -a difference, which seems unrelated.

David Mandelin [:dmandelin]

Updated

14 years ago

Depends on: 636219

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 9

14 years ago

I filed bug 636219 for the -a/no -a issue. There's a patch for it in that bug.

No longer depends on: 636219

David Mandelin [:dmandelin]

Updated

14 years ago

Depends on: 636219

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Reporter

Comment 10

14 years ago

I like 93fps! How do we get 93fps?

Mike Shaver (:shaver emeritus)

Comment 11

14 years ago

Can you confirm that bill's fix in bug 636219 gives us the good FPS?

Jan de Mooij [:jandem]

Comment 12

14 years ago

(In reply to comment #10)
> I like 93fps! How do we get 93fps?

I hacked the test case a bit to locate the problem.. But still, the patch should help a lot.

Boris Zbarsky [:bzbarsky]

Updated

14 years ago

Depends on: 628073

Boris Zbarsky [:bzbarsky]

Comment 13

14 years ago

Do we need a separate bug on the innerHTML issue? It shouldn't generally take 16ms to set innerHTML once!

Jan de Mooij [:jandem]

Comment 14

14 years ago

Attached file: Shell version of benchmark

Bug 636219 has been fixed; attaching the unmodified shell test case. Bug 628073 will win another 23%.

Attachment #514444 - Attachment is obsolete: true

Jan de Mooij [:jandem]

Updated

14 years ago

Blocks: 467263

Stephen Bannasch

Comment 15

14 years ago

On my MacBook Pro the speed I get running Vladimir's stand-alone benchmark (https://bug636096.bugzilla.mozilla.org/attachment.cgi?id=514432) has increased from 22 to 31 model steps per second.

Stephen Bannasch

Comment 16

14 years ago

Interesting that after updating to the latest Minefield Vladimir's benchmark slowed by 25% on my MacBook Pro 10.6.7

https://bug636096.bugzilla.mozilla.org/attachment.cgi?id=514432

Minefield 4.0pre23             100 steps: 2452ms (41 fps)
Minefield 6.0a1 (2011-04-21)   100 steps: 3114ms (32 fps)
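As a quick cross-check of the numbers in this comment, the two result lines can be parsed and compared as sketched below. `parseResult` is a hypothetical helper written for this illustration; it assumes the "N steps: Tms" format the standalone benchmark prints. Depending on whether you compare elapsed time or step rate, the change comes out slightly above or below the quoted 25%.

```javascript
// Parse a benchmark result line of the form "100 steps: 2452ms (41 fps)".
function parseResult(line) {
  const m = /(\d+) steps: (\d+)ms/.exec(line);
  if (!m) throw new Error('unrecognized result line: ' + line);
  const steps = Number(m[1]);
  const ms = Number(m[2]);
  return { steps: steps, ms: ms, fps: steps / (ms / 1000) };
}

const before = parseResult('100 steps: 2452ms (41 fps)');
const after = parseResult('100 steps: 3114ms (32 fps)');

// Two ways to express the same regression.
const timeIncreasePct = (after.ms / before.ms - 1) * 100;
const rateDecreasePct = (1 - after.fps / before.fps) * 100;

console.log(Math.round(before.fps), Math.round(after.fps)); // 41 32
console.log(timeIncreasePct.toFixed(1), rateDecreasePct.toFixed(1)); // 27.0 21.3
```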

David Mandelin [:dmandelin]

Comment 17

14 years ago

(In reply to comment #16)
> Interesting that after updating to the latest Minefield Vladimir's benchmark
> slowed by 25% on my MacBook Pro 10.6.7
>
> https://bug636096.bugzilla.mozilla.org/attachment.cgi?id=514432
>
> Minefield 4.0pre23 100 steps: 2452ms (41 fps)
> Minefield 6.0a1 (2011-04-21) 100 steps: 3114ms (32 fps)

Would you be able to bisect that using Tracemonkey nightly builds?

Boris Zbarsky [:bzbarsky]

Comment 18

14 years ago

Or even just m-c builds; I'm seeing no perf change between 2011-04-01 and now, and both are 30% faster than 2011-03-01 builds.

Jan de Mooij [:jandem]

Comment 19

14 years ago

TI does really well here. Just added the regular array version to awfy-assorted to make sure it won't regress. I had to reduce the number of steps to make it run faster; this does not otherwise affect the results.

-m -n   :  157 ms
-m      :  770 ms
-m -j -p:  770 ms
d8      :  814 ms
-j      : 3480 ms

Till Schneidereit [:till]

Comment 20

11 years ago

We're not in tracer-territory, speed-wise, but we're a lot faster than the competition:

Web:
Safari 6.0.5:            100 steps: 1922ms (52 fps)
Chrome 30.0.1568.2 dev:  100 steps: 521ms (192 fps)
Nightly 2013-07-22:      100 steps: 382ms (262 fps)

Shell:
jsc (from above Safari): 100 steps: 1988ms (50 fps)
d8 3.18.5:               100 steps: 369ms (271 fps)
SpiderMonkey:            100 steps: 362ms (276 fps)

I'm going to declare victory.

Status: NEW → RESOLVED

Closed: 11 years ago

Resolution: --- → WORKSFORME

Stephen Bannasch

Comment 21

11 years ago

We've integrated my initial spike developing a JavaScript version of Energy2D into our Lab framework: http://lab.concord.org

Here are two Interactives running different Energy2D models in the Interactive Browser page.

Benard Cell (mainly convective solver):
http://lab.concord.org/interactives.html#interactives/energy2d/imported/benard-cell.json

Material Conduction (only conductive solver):
http://lab.concord.org/interactives.html#interactives/energy2d/htb/S3A1.json

Near the bottom of the Interactive Browser page is a tool for running benchmarks for the selected Interactive. The benchmark tool produces data for rate in steps/s for graphics (just rendering), model-only, model+graphics, and fps (model+graphics integrated with the browser anim frame callback). There is one additional column "fps-webgl". We've also implemented the physics solvers and renderers in WebGL.

Results: Chrome 30.0.1572.0 is about 4-5 times faster than Firefox Nightly 25.0 (2013-07-23)

http://lab.concord.org/interactives.html#interactives/energy2d/imported/benard-cell.json

Web:

browser                       model (steps/s)    fps    fps-webgl
----------------------------------------------------------------------
Chrome 30.0.1572.0 canary:         30.6          27.5     25.8
Nightly 25.0 (2013-07-23):          4.8           5.3     19.3

http://lab.concord.org/interactives.html#interactives/energy2d/htb/S3A1.json

Web:

browser                       model (steps/s)    fps    fps-webgl
----------------------------------------------------------------------
Chrome 30.0.1572.0 canary:         91.3          46.3     44.5
Nightly 25.0 (2013-07-23):         22.3          14.5     24.5

NOTE: these tests were run on a 2010 MacBook Pro. Chrome 30.0.1572.0 no longer supports WebGL on this computer so the fps-webgl data reported for Chrome are actually just a second test for the JavaScript solvers and renderers.

Should I make a new performance issue for these results?

Boris Zbarsky [:bzbarsky]

Comment 22

11 years ago

Yes, please.

Till Schneidereit [:till]

Comment 23

11 years ago

Interesting, thanks for the update. It would be great if you could file two new bugs: one for the webgl part, and one for the js part. These are most likely interesting to different teams within Mozilla. (Please CC me.)

As a quick note, my results on a 2012 rMBP@2.7GHz are quite different from yours:

benard-cell:

browser                      graphics    model    model+graphics    fps    fps-webgl
----------------------------------------------------------------------------
Chrome 30.0.1568.2 dev:       4173.9      44.9         45.3         20.5     58.3
Nightly 25.0 (2013-07-23):    3125.0      40.3         40.4         34.5     60.0

S3A1:

browser                      graphics    model    model+graphics    fps    fps-webgl
----------------------------------------------------------------------------
Chrome 30.0.1568.2 dev:       2779.9     138.3        143.7         45.3     57.0
Nightly 25.0 (2013-07-23):    1398.5     121.9        111.7         34.5     60.0

I find it a bit puzzling that we're slower at graphics, model and model+graphics, but get higher fps. I don't fully understand what's being tested, though, so that might have an easy explanation.

Attachments:

standalone benchmark, 14 years ago, Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond), 59.60 KB, text/html
Modified shell test case, 14 years ago, Jan de Mooij [:jandem], 48.31 KB, application/x-javascript
Shell version of benchmark, 14 years ago, Jan de Mooij [:jandem], 59.55 KB, application/x-javascript