really slow on earthmodel testcase/benchmark

RESOLVED WORKSFORME

Status

defect
9 years ago
6 years ago

People

(Reporter: vlad, Unassigned)

Tracking

(Blocks 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 1 obsolete attachment)

From a WebGL mailing list post:

----
Using JavaScript Typed Arrays: 
  http://visual-demos.dev.concord.org/seasons/earth/model2d.html

Using regular JavaScript Arrays:
  http://visual-demos.dev.concord.org/seasons/earth/model2d-reg-arrays.html

                                      steps per second
browser/version                       Regular Arrays       Typed Arrays
------------------------------------------------------------------------
Minefield 4.0b12pre (2011-02-22):      17.2                 13.3
WebKit 79303:                          15.5                 27.5
Chrome 9.0.597.102:                    67.6                 42.2
Chrome 10.0.648.82 beta:              109.8                 38.4

I ran my tests on a MacBook Pro, Mac OS X 10.6.6, Intel Core i7 2.66 GHz

The code is available here: https://github.com/stepheneb/seasons

More specifically:
  https://github.com/stepheneb/seasons/blob/master/earth/model2d-reg-arrays.html
  https://github.com/stepheneb/seasons/blob/master/earth/model2d.html
---

Note that having typed arrays be slower is suspect to begin with...
Looking into this.  First off, the testcase uses:

    paintInterval = setInterval("renderModelStep()", 0);

for benchmarking.  Additionally, inside renderModelStep(), it sets innerHTML to a string on every step.  None of that is helping us.  Changing things to use postMessage didn't change the numbers much.  I wrote a simple benchmark function in model2d.html:

    function do_benchmark() {
        var model = new model2d.Model2D();

        // Time 100 model steps back to back, with no timers or DOM writes in the loop.
        var t0 = (new Date()).getTime();
        for (var i = 0; i < 100; ++i) {
            model2d.addHotSpot(model, 50.0);
            model.nextStep();
        }
        var t1 = (new Date()).getTime();

        // Report once, after the loop has finished.
        step_count.innerHTML = '100 steps: ' + (t1 - t0) + 'ms';
    }

Tie it to a button somewhere.  This gives me 3000ms in Minefield and 2500ms in Chrome 10.  The 2500ms number is ~40fps, which fits well with the 38fps above (because it's not doing setInterval or the innerHTML set inside the hot loop).  Minefield is at ~33fps, which is over double what I get with the original benchmark (~15/~16), leading me to believe that the innerHTML set is killing us the most.
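
For reference, the conversion from "N steps took T ms" to steps per second, plus a generic version of the timing loop, can be sketched as below. This is only an illustration: timeSteps and stepsPerSecond are hypothetical helper names, not part of the testcase, and stepFn stands in for whatever the model does per step.

```javascript
// Sketch only: both helper names are made up for illustration.

// Convert a "steps took elapsedMs" measurement into steps per second.
function stepsPerSecond(steps, elapsedMs) {
    return steps / (elapsedMs / 1000);
}

// Run stepFn `steps` times back to back and time the whole batch.
// Nothing inside the loop touches timers or the DOM.
function timeSteps(stepFn, steps) {
    var t0 = Date.now();
    for (var i = 0; i < steps; ++i) {
        stepFn(i);
    }
    return { steps: steps, elapsedMs: Date.now() - t0 };
}

// The two measurements quoted above:
console.log(stepsPerSecond(100, 3000).toFixed(1)); // Minefield: 33.3
console.log(stepsPerSecond(100, 2500).toFixed(1)); // Chrome 10: 40.0
```

Timing the whole batch and reporting once keeps the per-step overhead of the reporting itself out of the measurement.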

Ok, now on to why we're at 3000ms.  addHotSpot() does a very simple loop with some misc math and array setting.  The meat is in nextStep().  Note that sunShine() calls shootAtAngle() which has an early return before all the meat -- e.g. no Photon constructors in loops etc.

A lot of the meat is in fluidsolver.solve() -- commenting that out, I get 924ms with Minefield, and 1168ms with Chrome10 (that is, we end up being faster, and 2/3 of the time is in fluidsolver).
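
The "2/3" figure follows from comparing the run with and without the solver. A rough sketch of that arithmetic, assuming the 3000ms and 2500ms totals from the earlier run (shareOfTime is a hypothetical helper name, and the estimate ignores any side effects that removing the solver has on the rest of the step):

```javascript
// Hypothetical helper: estimate the share of total time spent in a component
// by timing the run with it (totalMs) and with it commented out (withoutMs).
function shareOfTime(totalMs, withoutMs) {
    return (totalMs - withoutMs) / totalMs;
}

console.log(Math.round(shareOfTime(3000, 924) * 100) + "%");  // Minefield: 69%
console.log(Math.round(shareOfTime(2500, 1168) * 100) + "%"); // Chrome 10: 53%
```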

Both those functions seem to be largely math and array sets/gets, and some property gets from |this|.  I don't have a debug build, but I wonder if we're just aborting due to too long of a trace?
Standalone benchmark.  Can be trivially turned into a shell testcase if needed.
(Also, our first run is generally ridiculously slow -- more indication of too-long tracer abort?)
Framerate goes from 14 to 23 after enabling methodjit_always (or disabling the tracer). Maybe a regression from bug 631951. Seems bad.
In the shell this is even more dramatic:

-m -j   : 100 steps: 6187ms (16 fps)
-m -j -a: 100 steps: 3262ms (31 fps)
Two problems:

1) JM does not compile JSOP_CASE. In function FluidSolver2D.prototype.applyBuoyancy we have cases like this: case model2d.BUOYANCY_AVERAGE_COLUMN. This is bug 628073.

Working around this wins 6 fps with -m only (from 31 to 37)

2) This JM abort makes -m -j -p much slower than -m -j -p -a. Without the abort there is not much difference.
Posted file Modified shell test case (obsolete) —
For the attached file:

-m -j -p   : 38 fps
-m -j -p -a: 93 fps

After commenting out line 939:

-m -j -p   : 93 fps
-m -j -p -a: 93 fps

It does not slow down when the JM abort is worked around by changing model2d.BUOYANCY_AVERAGE_ALL to 0 and model2d.BUOYANCY_AVERAGE_COLUMN to 1.
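
To make the workaround concrete, here is a minimal sketch of the two switch shapes. It is not the real applyBuoyancy body: the model2d object below is a hypothetical stand-in, only the two constant names come from this bug, and the values 0 and 1 are assumed from the workaround described above.

```javascript
// Hypothetical stand-in for the real model2d namespace. Only the two constant
// names come from this bug; the values 0 and 1 are assumed from the workaround.
var model2d = {
    BUOYANCY_AVERAGE_ALL: 0,
    BUOYANCY_AVERAGE_COLUMN: 1
};

// Shape reported in applyBuoyancy: case labels are property lookups.
function describeModeWithLookups(mode) {
    switch (mode) {
    case model2d.BUOYANCY_AVERAGE_ALL:
        return "average all";
    case model2d.BUOYANCY_AVERAGE_COLUMN:
        return "average column";
    default:
        return "unknown";
    }
}

// Workaround shape: the same switch with literal case labels.
function describeModeWithLiterals(mode) {
    switch (mode) {
    case 0: // model2d.BUOYANCY_AVERAGE_ALL
        return "average all";
    case 1: // model2d.BUOYANCY_AVERAGE_COLUMN
        return "average column";
    default:
        return "unknown";
    }
}
```

Both functions return the same results; the difference is only in what the engine has to evaluate for each case label, which per the comments above is what triggers the JM abort.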
I noticed that, in Jan's test case, there is also a huge 32-bit/64-bit difference. When running it with -m -j -p -a, I get these numbers:
  32-bit: 44fps
  64-bit: 98fps
These are for the same machine, of course. Normally we're faster on 32-bit. I guess we generate better typed array code on 64-bit? Or maybe it's register allocation?

I'll look into the -a difference, which seems unrelated.
Depends on: 636219
I filed bug 636219 for the -a/no -a issue. There's a patch for it in that bug.
No longer depends on: 636219
Depends on: 636219
I like 93fps! How do we get 93fps?
Can you confirm that bill's fix in bug 636219 gives us the good FPS?
(In reply to comment #10)
> I like 93fps! How do we get 93fps?

I hacked the test case a bit to locate the problem. But still, the patch should help a lot.
Do we need a separate bug on the innerHTML issue?  It shouldn't generally take 16ms to set innerHTML once!
Bug 636219 has been fixed; attaching the unmodified shell test case. Bug 628073 will win another 23%.
Attachment #514444 - Attachment is obsolete: true
Blocks: 467263
On my MacBook Pro the speed I get running Vladimir's stand-alone benchmark (https://bug636096.bugzilla.mozilla.org/attachment.cgi?id=514432) has increased from 22 to 31 model steps per second.
Interesting: after updating to the latest Minefield, Vladimir's benchmark slowed by about 25% on my MacBook Pro (Mac OS X 10.6.7).

https://bug636096.bugzilla.mozilla.org/attachment.cgi?id=514432

Minefield 4.0pre23:            100 steps: 2452ms (41 fps)
Minefield 6.0a1 (2011-04-21):  100 steps: 3114ms (32 fps)
(In reply to comment #16)
> Interesting that after updating to the latest Minefield Vladimir's benchmark 
> slowed by 25% on my MacBook Pro 10.6.7
> 
> https://bug636096.bugzilla.mozilla.org/attachment.cgi?id=514432
> 
> Minefield 4.0pre23                      100 steps: 2452ms (41 fps)
> Minefield 6.0a1 (2011-04-21) 100 steps: 3114ms (32 fps)

Would you be able to bisect that using Tracemonkey nightly builds?
Or even just m-c builds; I'm seeing no perf change between 2011-04-01 and now, and both are 30% faster than 2011-03-01 builds.
TI does really well here. I just added the regular array version to awfy-assorted to make sure it won't regress. I had to reduce the number of steps to make it run faster; this does not otherwise affect the results.

-m -n   :  157 ms
-m      :  770 ms
-m -j -p:  770 ms
d8      :  814 ms
-j      : 3480 ms
We're not in tracer-territory, speed-wise, but we're a lot faster than the competition:

Web:
Safari 6.0.5:            100 steps: 1922ms (52 fps)
Chrome 30.0.1568.2 dev:  100 steps: 521ms (192 fps)
Nightly 2013-07-22:      100 steps: 382ms (262 fps)

Shell:
jsc (from above Safari): 100 steps: 1988ms (50 fps)
d8 3.18.5:               100 steps: 369ms (271 fps)
SpiderMonkey:            100 steps: 362ms (276 fps)


I'm going to declare victory.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WORKSFORME
We've integrated my initial spike developing a JavaScript version of Energy2D into our Lab framework: http://lab.concord.org

Here are two Interactives running different Energy2D models in the Interactive Browser page.

Benard Cell (mainly convective solver):
http://lab.concord.org/interactives.html#interactives/energy2d/imported/benard-cell.json

Material Conduction (only conductive solver)
http://lab.concord.org/interactives.html#interactives/energy2d/htb/S3A1.json

Near the bottom of the Interactive Browser page is a tool for running benchmarks for the selected Interactive.

The benchmark tool produces data for rate in steps/s for graphics (just rendering), model-only, model+graphics, and fps (model+graphics integrated with the browser anim frame callback). There is one additional column "fps-webgl". We've also implemented the physics solvers and renderers in WebGL.

Results: Chrome 30.0.1572.0 is about 4-5 times faster than Firefox Nightly 25.0 (2013-07-23)

http://lab.concord.org/interactives.html#interactives/energy2d/imported/benard-cell.json

Web:

browser                      model (steps/s)    fps      fps-webgl
----------------------------------------------------------------------
Chrome 30.0.1572.0 canary:   30.6               27.5     25.8
Nightly 25.0 (2013-07-23):    4.8                5.3     19.3

http://lab.concord.org/interactives.html#interactives/energy2d/htb/S3A1.json

Web:

browser                      model (steps/s)    fps      fps-webgl
----------------------------------------------------------------------
Chrome 30.0.1572.0 canary:   91.3               46.3     44.5
Nightly 25.0 (2013-07-23):   22.3               14.5     24.5

NOTE: these tests were run on a 2010 MacBook Pro. Chrome 30.0.1572.0 no longer supports WebGL on this computer so the fps-webgl data reported for Chrome are actually just a second test for the JavaScript solvers and renderers.

Should I make a new performance issue for these results?
Interesting, thanks for the update. It would be great if you could file two new bugs: one for the webgl part, and one for the js part. These are most likely interesting to different teams within Mozilla. (Please CC me.)

As a quick note, my results on a 2012 rMBP @ 2.7GHz are quite different from yours:

benard-cell:
browser                      graphics  model  model+graphics  fps   fps-webgl
-----------------------------------------------------------------------------
Chrome 30.0.1568.2 dev:      4173.9    44.9   45.3            20.5  58.3
Nightly 25.0 (2013-07-23):   3125.0    40.3   40.4            34.5  60.0


S3A1:
browser                      graphics  model  model+graphics  fps   fps-webgl
-----------------------------------------------------------------------------
Chrome 30.0.1568.2 dev:      2779.9    138.3  143.7           45.3  57.0
Nightly 25.0 (2013-07-23):   1398.5    121.9  111.7           34.5  60.0


I find it a bit puzzling that we're slower at graphics, model and model+graphics, but get higher fps. I don't fully understand what's being tested, though, so that might have an easy explanation.