I assume the actual steps to reproduce are to load http://stepheneb.github.com/avalanche2d-js/avalanche2d.html and select the "Go" radio button? If so, and using the default settings (100x100 desk array size, visualization+graph rendering, Uint8Array), I get numbers like so on Mac OS X 10.6:

Firefox 7: 2265.5 (steps/s)
Current trunk nightly: 5417.1 (steps/s)
Chrome dev build (16.x): 7680.5 (steps/s)

If I turn off the rendering part, these become:

Firefox 7: 3043.2 (steps/s)
Current trunk nightly: 9124.1 (steps/s)
Chrome dev build (16.x): 8741.3 (steps/s)

though all the numbers are very noisy (about 10% or more noise). If I enable either the visualization or the graph, we drop back down to about the first set of numbers.

Stephen, how likely is this page to stick around in its current form?

In any case, it seems like the main bottleneck at this point for nightly is in fact the canvas bits. Brian, do you want to look into whatever JS issues are left here and file bugs as needed?
Status: UNCONFIRMED → NEW
Component: General → General
Ever confirmed: true
Product: Firefox → Core
QA Contact: general → general
Summary: Slow Computational Performance → Slow canvas and simulation performance on this testcase
Component: General → Canvas: 2D
QA Contact: general → canvas.2d
A profile of the testcase with the rendering bits (both visualization and graph) enabled shows:

7% kernel-level GL stuff
15% painting
31% JIT-generated JS code
7% array_shift
7% non-quickstubbed gets of window.self
5% under mjit::ic::Call (some of this is canvas stuff, note)
3% under mjit::stubs::SlowCall (likewise)
8% stroking on canvas (about 2/3 the path ops and 1/3 invalidation)

and various array ops, stubs::Pos, js_Date, stubs::sub, BeginPath, etc.

If I collapse all the libmozjs stuff, then 53% is JS engine code, 8% is canvas stroking, 7% is window.self, 1% is setAttribute on elements, and some minor stuff for getImageData, beginPath, putImageData, etc.

Looks like the window.self gets all happen from this sort of pattern in the grapher:

  y.call(self, pts.y, 0)

Is there a reason that's using "self" and not just "window" or better yet "null" as the first argument to call() there?
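The cost difference is easy to see in isolation. A minimal sketch (the `y` below is a stand-in for the d3 accessor, not the actual grapher code): when the callback never uses `this`, the receiver passed to call() is irrelevant, and a literal null avoids the per-call get of the window.self accessor.

```javascript
// Stand-in accessor (illustrative; the real grapher's y comes from d3).
// It ignores `this`, so what we pass as the receiver doesn't matter.
var y = function (d, i) { return d * 2; };
var pts = { y: 21 };

// Same result as y.call(self, ...), but with no property get of window.self:
var result = y.call(null, pts.y, 0);
```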
The other interesting thing is the "Animation loop timing measurements for this browser" number, which is rather different for us and Chrome. What is that measuring?
Depends on: 697244
I normally run two tests: (1) model, visualization, and grapher, AND (2) just the model. It makes sense to describe these results as a pair of numbers, so I might describe your Nightly results as 5417/9124.

Model, Canvas Visualization, Graph:
1) Close other CPU-intensive processes
2) Open page: http://stepheneb.github.com/avalanche2d-js/avalanche2d.html
3) Press "Go" and wait for 5000 model-steps to complete
4) Press "Reset"
5) Press "Go" and wait for 5000 model-steps to complete
6) Record the value displayed in "rate: nnnn (steps/s)"

Just Model -- no Canvas Visualization or Graph. This should mainly just test the computational speed:
1) Close other CPU-intensive processes
2) Open page: http://stepheneb.github.com/avalanche2d-js/avalanche2d.html
3) Uncheck "Visualization" and "Graph"
4) Press "Go" and wait for 5000 model-steps to complete
5) Press "Reset"
6) Press "Go" and wait for 5000 model-steps to complete
7) Record the value displayed in "rate: nnnn (steps/s)"

I run the test twice because the page sometimes seems to run slower the first time.

One reason the result is noisy is that the computational algorithm has chaotic aspects. A model step consists of randomly "dropping" a folder on one of the Bureaucrats' desks. If that Bureaucrat then has more than 3 folders, they take four folders from their desk and distribute them to their neighbors. In this process any Bureaucrat to whom a folder has been distributed must be checked to see if they now have more than 3 folders ... and so on. Only when NO Bureaucrat has more than 3 folders is the model-step over. Bureaucrats on the edges throw their folders over the edge and these are lost. The graph shows the average folders per desk over time; the sharp drops are in effect "avalanches".

Another reason is that I am dynamically adjusting how much work I do before each requestAnimFrame to try and use about 80% of the available time.
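The toppling rule described above can be sketched as follows. This is a hedged sketch, not the actual avalanche2d-js code: the names and the flat-array grid layout are mine, and where the real model drops a folder on a random desk, this version takes the desk index as a parameter so it stays deterministic.

```javascript
// Drop one folder on desk i of an n x n grid (stored as a flat array) and
// relax until no desk holds more than 3 folders. Folders pushed past an
// edge are simply lost, as in the model described above.
function dropFolder(folders, n, i) {
  folders[i] += 1;
  var queue = [i];
  while (queue.length > 0) {
    var k = queue.shift();
    if (folders[k] <= 3) continue;       // this desk is stable
    folders[k] -= 4;                     // take four folders from the desk...
    var row = Math.floor(k / n), col = k % n;
    if (row > 0)     { folders[k - n] += 1; queue.push(k - n); } // ...and give
    if (row < n - 1) { folders[k + n] += 1; queue.push(k + n); } // one to each
    if (col > 0)     { folders[k - 1] += 1; queue.push(k - 1); } // neighbor that
    if (col < n - 1) { folders[k + 1] += 1; queue.push(k + 1); } // is on the grid
    if (folders[k] > 3) queue.push(k);   // may still be unstable after toppling
  }
}
```

For example, dropping four folders on the center desk of an empty 3x3 grid leaves the center at 0 and each of its four neighbors holding one folder.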
Normally this would be considered "unfriendly" behavior by a web page, but this page is specifically designed to find out how fast a computational simulation can be run. In a real version of this app the user would select what speed, size, and resolution to run the model at -- and if they chose a large, complex model to run fast, presumably they are OK with the larger power usage that would entail.

I just updated my Nightly and measured 5549/11136 on my Mac OS X 10.6.8, 2.66 GHz Core i7 system. Chrome 14 yesterday was: 9578/12594. I hadn't measured Nightly for a few days and it seems much closer to Chrome than before. To me this new data indicates that the computational performance is close to Chrome, and you are correct that the slowdown now appears to be more with the Canvas performance.

While the graphing framework uses D3.js (which uses SVG), the "real-time graphing" is to a Canvas object that is layered over the plotting section of the SVG graph. This is used while the model is running. When the model stops, the Canvas object is placed behind the SVG graph and the SVG graph is re-drawn.

The graphics performance can be further distinguished by running the model with "Visualization" and "Graph" on, then with just the "Visualization" on, and finally with just the model running. Here's the data I just collected using this pattern:

Nightly: 6975/8802/10638
Chrome Canary: 10183/11682/13089

I think I am done for a while making performance changes to the application. I was quite surprised at how much performance I was able to get. This graph shows the performance change over the last two weeks (the Y axis is a log scale!): https://img.skitch.com/20111023-q8rkqfkhynifqhtj6yefm86aaw.jpg

If it would be helpful I could also put up a static copy of the current site that wouldn't change.
(In reply to Boris Zbarsky (:bz) from comment #2)
> Looks like the window.self gets all happen from this sort of pattern in the
> grapher:
>
> y.call(self, pts.y, 0)
>
> Is there a reason that's using "self" and not just "window" or better yet
> "null" as the first argument to call() there?

No reason other than that was the same form used in the D3.js example I based the initial work on.
> I am dynamically adjusting how much work I do before each requestAnimFrame
> to try and use about 80% of the available time.

Hmm. How are you determining the "available time"? Just new Date() and seeing whether some threshold has passed? If so, which threshold?

> No reason other than that was the same form used in the D3.js example

Ah. Passing either "window" or "null" would be somewhat faster, for what it's worth.

Did you see comment 3?
(In reply to Boris Zbarsky (:bz) from comment #3)
> The other interesting thing is the "Animation loop timing measurements for
> this browser" number, which is rather different for us and Chrome. What is
> that measuring?

I'm still learning about how requestAnimFrame works and how best to do this -- the intent is to use a large percentage of the available cycles, but not all of them, and to dynamically adjust this. I have the impression (without direct evidence yet) that a browser running on a phone or tablet might not call back every 16ms. One hypothesis is that this algorithm ends up using a smaller percentage of the available CPU time than the same code running on Chrome. My use case might be a good example where setImmediate (https://github.com/NobleJS/setImmediate) would be useful.

Here's what I wrote a couple of days ago about this question:

My latest implementation is designed to render during every screen refresh -- and to only render what is necessary for each refresh. It definitely makes sense to only do the work to render what can be seen, at the maximum screen refresh rate. To get the line graph to appear I still render it for every model step. However, for the canvas visualization of the folder-count on the array of desks I am only rendering once per screen repaint.

There's a bunch of fiddling around to do the maximum amount of work in one screen repaint cycle. I dynamically adjust the maximum time I spend in runModelLoop to try and always end at about 70-80% of the interval between screen refreshes. Nominally the screen refresh rate is 60 Hz, about 17ms. The browser can choose to make this interval longer. I think this is likely to happen on phones and tablets. It will also happen when the web page is in a tab that is not being displayed (in this situation my algorithm for dynamically adjusting will be more greedy than a good web citizen should be).
Here's some data from a run in Chrome:

avalanches: 5000
rate: 9041.6 (steps/s)
last sample time: 0 (ms)
loop compute maximum: 20
anim loop timing: 23
folders: 2.122 (ave)

The variables that pertain to this discussion are:

loop compute maximum: 20
This is what the last model loop took. A model loop always includes one, and usually more than one, model step.

anim loop timing: 23
This is the most recent time between callbacks from the browser to my code that executes the runModelLoop.

So at the end of that run I was:
* computing and rendering in my runModelLoop until 20 ms were up,
* returning from my runModelLoop to the browser,
* then getting called 3 ms later to start the runModelLoop all over again.

And I was rendering at 44 times/s (why this is less than 60 I'm not certain right now). This is tricky because there are model steps which take more than one screen refresh interval -- this happens in big avalanches.
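One consistent reading of the 44/s figure (an assumption on my part, not something confirmed by the run data): if the browser really is calling back every 23 ms, the callback rate is just the reciprocal of that interval.

```javascript
// With ~23 ms between requestAnimFrame callbacks, the callback rate is
// simply 1000 / 23, which lands close to the observed ~44 renders per second.
var animLoopTimingMs = 23;
var callbacksPerSecond = 1000 / animLoopTimingMs;  // ~43.5
```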
(In reply to Boris Zbarsky (:bz) from comment #6)
> > I am dynamically adjusting how much work I do before each requestAnimFrame
> > to try and use about 80% of the available time.
>
> Hmm. How are you determining the "available time"? Just new Date() and
> seeing whether some threshold has passed? If so, which threshold?

Yes.

See: measureAnimmationLoop()
https://github.com/stepheneb/avalanche2d-js/blob/master/avalanche2d.html#L508

which calls: runModelLoop()
https://github.com/stepheneb/avalanche2d-js/blob/master/avalanche2d.html#L469

which runs the modelStep once and, if the elapsed time is less than model_loop_time, runs the model step again (without rendering the Canvas visualization again).

> > No reason other than that was the same form used in the D3.js example
>
> Ah. Passing either "window" or "null" would be somewhat faster, for what
> it's worth.
>
> Did you see comment 3?

Yes ... typing as fast as I can ;-) Thanks for your help!
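The run-until-budget pattern described above can be sketched like this. It's a hedged sketch, not the actual avalanche2d-js code: the names are mine, and the clock is injected as a parameter so the loop is testable (the page itself checks new Date() directly).

```javascript
// Run stepFn repeatedly until budgetMs of wall-clock time has elapsed, then
// return how many steps fit in the budget. The do/while guarantees at least
// one step per callback, even when a single step overruns the budget.
function runUntilBudget(stepFn, budgetMs, now) {
  now = now || Date.now;
  var start = now();
  var steps = 0;
  do {
    stepFn();
    steps += 1;
  } while (now() - start < budgetMs);
  return steps;
}
```

In the page this would be driven from the requestAnimFrame callback, with budgetMs set to roughly 80% of the measured interval between callbacks, and the Canvas visualization rendered once after the loop returns.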
(In reply to Stephen Bannasch from comment #8)
> > Hmm. How are you determining the "available time"? Just new Date() and
> > seeing whether some threshold has passed? If so, which threshold?
>
> Yes.
>
> See: measureAnimmationLoop()
> https://github.com/stepheneb/avalanche2d-js/blob/master/avalanche2d.html#L508

That's a red herring! It's only used at startup for informational purposes.

modelGo()
https://github.com/stepheneb/avalanche2d-js/blob/master/avalanche2d.html#L381

is where the process starts; it sets up a callback to runModelLoop().
This is where the model_loop_goal is dynamically adjusted:

https://github.com/stepheneb/avalanche2d-js/blob/master/avalanche2d.html#L489
Computational performance plays a larger factor than it might appear. There's a non-linearity in the system because I am only rendering the Canvas visualization once each time runModelLoop() is called by the requestAnimFrame() callback.

So if 5000 model-steps with just the Canvas visualization on takes one second, then avalanche2d.displayFolderCanvas() will execute 60 times. If avalanche2d.displayFolderCanvas() takes 5ms, then 60 executions take 300ms. That means (discounting the time spent not in the callback) that the model itself on average takes 700ms/5000 => 140 us per step.

What if the model instead averages 170us? 170us * 5000 = 850ms. Add to that the initial 60 avalanche2d.displayFolderCanvas() executions for 300ms and you get 1250ms ... but because we're spending another 250ms (at least), we need to add in another 15 avalanche2d.displayFolderCanvas() executions for another 75ms. So that leaves an intermediate total of 1375.

So with a 5ms estimate for the Canvas visualization, a 21% increase in the time it takes to run the model results in a 38% increase in the time it takes to run 5000 model steps. I don't actually know how long displayFolderCanvas() takes -- that would be interesting info.
(In reply to Stephen Bannasch from comment #11)
> executions for another 75ms. So that leaves an intermediate total of 1375.

Basic math error ... actually: 1325 => 33%
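The feedback described above can also be carried to its fixed point in closed form. This is a sketch under the same assumptions as the comment (a steady 60 Hz callback rate and an assumed 5 ms per displayFolderCanvas() call; the function name is mine): total time T satisfies T = S*t + 60*T*r, so T = S*t / (1 - 60*r).

```javascript
// Fixed point of the model/render feedback: S steps at stepSeconds each,
// plus one render (renderSeconds) per callback at `fps` callbacks/second.
//   T = S*t + fps*T*r   =>   T = S*t / (1 - fps*r)
function totalSeconds(stepSeconds, steps, renderSeconds, fps) {
  return (steps * stepSeconds) / (1 - fps * renderSeconds);
}
```

With the comment's numbers (5000 steps, 5 ms renders, 60 Hz) this gives 1.0 s at 140 us/step and about 1.21 s at 170 us/step; the step-by-step estimate above is an iteration toward this fixed point.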
> There's a non-linearity in the system

Thank you for doing that analysis! That's what I was sort of after in trying to understand the way the throttling works. So yeah, it seems like having slightly slower JS performance could just lead us to have lots more canvas ops (which may also be slower, of course)....
I've updated the deployed code so that if you are running the model by itself, no callbacks are used. This removes a variable from the measurements when just running the model alone. Here are some updated measurements:

Avalanche2D-JS Benchmarks

Model-steps/s over 5000 model-steps, visualization and graphing ON
                                      run 1    run 2    run 3    run 4    run 5    average  comparison
---------------------------------------------------------------------------------------------------
Chrome 17.0.920.1 canary              9345.8   9523.8   9276.4   10040.2  10204.1  9678     100%
Chrome 14.0.835.186                   8210.2   9009     9058     9276.4   9191.2   8949     92%
Firefox Nightly 10.0a1 (2011-10-28)   5861.7   6784.3   7042.3   7278     7052.2   6804     70%
Firefox 7.0.1                         2161.7   2395.8   2356.3   2343     2401.5   2332     24%

Model-steps/s over 5000 model-steps, visualization and graphing OFF
                                      run 1    run 2    run 3    run 4    run 5    average  comparison
---------------------------------------------------------------------------------------------------
Chrome 17.0.920.1 canary              15822.8  15723.3  16129    16077.2  16233.8  15997    100%
Chrome 14.0.835.186                   15243.9  14619.9  15151.5  14970.1  14925.4  14982    94%
Firefox Nightly 10.0a1 (2011-10-28)   12626.3  12787.7  13089    12658.2  12468.8  12726    80%
Firefox 7.0.1                         3728.6   3776.4   3742.5   3652.3   3663     3713     23%

page: http://stepheneb.github.com/avalanche2d-js/avalanche2d.html
commit: https://github.com/stepheneb/avalanche2d-js/commit/f8c4f24df
date: 10/28/11
computer: Mac OS X 10.6.8, MacBook Pro, 2.66 GHz Intel Core i7
Actually, I'm going to split this into two bugs given that data. Leaving this as the canvas bug; going to file a separate jseng bug.
Summary: Slow canvas and simulation performance on this testcase → Slow canvas performance on this testcase
I've updated the code a bit. Basically I made the canvas rendering part of the avalanche2d.model object and am keeping a persistent copy of the canvas, context, ImageData, and CanvasPixelArray. I am also pre-rendering the alpha values in the CanvasPixelArray so that those bytes don't need to be written during regular rendering.

https://github.com/stepheneb/avalanche2d-js/commit/608030b

There is now a Google Doc with updated timings here:
https://docs.google.com/spreadsheet/ccc?key=0AtvlFoSBUC5kdDM4cHJhU0I4NGg5TjlWRkQ0ZExpcGc

In general, an increase of about 10% when running model, visualization, and grapher.
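The alpha pre-fill can be illustrated in isolation. A hedged sketch (the function names are mine, not the actual avalanche2d-js code, which writes into the ImageData kept on the model object): RGBA pixel data stores alpha in every fourth byte, so writing those bytes once up front means per-frame rendering only touches R, G, and B.

```javascript
// Write 255 into the alpha byte of every RGBA pixel, once, at setup time.
function prefillAlpha(pixels) {
  for (var i = 3; i < pixels.length; i += 4) pixels[i] = 255;
  return pixels;
}

// Per-frame work then only writes the three color bytes for each cell
// (grayscale here for brevity) and never touches alpha again.
function writeGray(pixels, cellIndex, value) {
  var p = cellIndex * 4;
  pixels[p] = value;
  pixels[p + 1] = value;
  pixels[p + 2] = value;
}

var pixels = prefillAlpha(new Uint8ClampedArray(2 * 2 * 4)); // a 2x2 "canvas"
writeGray(pixels, 0, 128);
```

In the page, this array would be the .data of a persistent ImageData, pushed to the canvas each frame with putImageData().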