Closed Bug 599318 Opened 14 years ago Closed 8 months ago

Javascript Voxel Spacing Chrome Experiment is significantly slower in Minefield in comparison to Chromium

Categories

(Core :: JavaScript Engine, defect)

x86
Windows 7
defect

RESOLVED INCOMPLETE

People

(Reporter: kpangilinan, Unassigned)

References

(Depends on 1 open bug)

Details

(Whiteboard: [chromeexperiments])

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.10) Gecko/20100914 Firefox/3.6.10
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.10) Gecko/20100914 Firefox/3.6.10 

Minefield runs at 2-4FPS
Chromium runs at 22-24FPS

FPS increases when navigating through the sky or underground (there is less to render).

Reproducible: Always

Steps to Reproduce:
1. Load the experiment
2. Navigate around in environment
3. Take note of FPS (in Minefield and Chromium)
Actual Results:  
Chromium's FPS is almost 10x that of Minefield.

Expected Results:  
Minefield should run the experiment as fast as, if not faster than, Chromium.
CC'ing Bas. Please CC him in every 2D graphics performance bug on Windows Vista and Windows 7, because he's the Direct2D guy.
For me this runs at 35 fps in minefield with D2D, 30 fps in Minefield without D2D and 55 fps in Chromium. I'm not so sure this is graphics bound.

I ran a really quick profile and we spend a fair amount of time (25% of our execution time) in something the profiler marks as ?!?, i.e. it doesn't understand it. This might have to do with WOW64 context switching, but I'm not sure.

Other than that, the NVidia UMD does show up in the profile run with about 7% of execution time, most likely due to bug 598936. Note that this shows up high mainly because everything that happens in the NVidia UMD gets thrown into a single stack, since there are no symbols.

Generally there's far more time spent in JS-processing-related stuff. The part of the table I found interesting follows; it's basically the top stack traces below the ?!? and NVidia UMD. (Nothing can be seen inside the NVidia DLL symbol-wise, but the work seems to be well distributed, with no single stack trace being particularly guilty of the whole 7%.)

Firefox total weight here is 15337 with 15211 samples. That is roughly 12.5% weight (one core on an eight-core machine).

I.e. the functions pasted below account for over a quarter of Firefox's total processing time.

Process, Stack, Module, Function, DPC/ISR, Weight, % Weight, Count, TimeStamp
,    |- xul.dll!JSDOUBLE_IS_INT32, , , , 1016.118135, 0.82, 1009, 
,    |- xul.dll!js_math_ceil, , , , 811.897510, 0.66, 806, 
,    |- MOZCRT19.DLL!_ceil_pentium4, , , , 530.702303, 0.43, 527, 
,    |- xul.dll!InitPropOrMethod, , , , 483.735952, 0.39, 481, 
,    |- MOZCRT19.DLL!_isnan, , , , 327.228122, 0.27, 325, 
,    |- xul.dll!FinalizeArenaList<JSObject;&FinalizeObject>, , , , 269.941366, 0.22, 268, 
,    |- xul.dll!js::NewNativeClassInstance, , , , 244.728484, 0.20, 243, 
,    |- xul.dll!js_math_min, , , , 224.608162, 0.18, 223, 
,    |- xul.dll!js::mjit::stubs::SetElem<0>, , , , 211.589560, 0.17, 210, 
,    |- xul.dll!_ftol2_pentium4, , , , 190.325258, 0.15, 189,
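
The "over a quarter" figure can be checked against the rows above. The following is a minimal sketch, assuming the Weight column and the Firefox total weight of 15337 are in the same units; the variable names are illustrative and not part of the profiler output.

```python
# Weight column copied from the profiler rows pasted above.
weights = {
    "xul.dll!JSDOUBLE_IS_INT32": 1016.118135,
    "xul.dll!js_math_ceil": 811.897510,
    "MOZCRT19.DLL!_ceil_pentium4": 530.702303,
    "xul.dll!InitPropOrMethod": 483.735952,
    "MOZCRT19.DLL!_isnan": 327.228122,
    "xul.dll!FinalizeArenaList<JSObject;&FinalizeObject>": 269.941366,
    "xul.dll!js::NewNativeClassInstance": 244.728484,
    "xul.dll!js_math_min": 224.608162,
    "xul.dll!js::mjit::stubs::SetElem<0>": 211.589560,
    "xul.dll!_ftol2_pentium4": 190.325258,
}

# Total Firefox weight quoted in the comment (assumed same units as above).
FIREFOX_TOTAL_WEIGHT = 15337

listed = sum(weights.values())
share = listed / FIREFOX_TOTAL_WEIGHT

print(f"listed weight: {listed:.1f}")
print(f"share of Firefox weight: {share:.1%}")
```

Under that assumption the ten listed functions come to about 28% of the Firefox weight, which is consistent with the statement above.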
Assignee: nobody → general
Component: Canvas: 2D → JavaScript Engine
QA Contact: canvas.2d → general
(In reply to comment #2)
> For me this runs at 35 fps in minefield with D2D, 30 fps in Minefield without
> D2D and 55 fps in Chromium. I'm not so sure this is graphics bound.
> 
> I can a really quick profile and we spend a fair amount of time(25% of our
> execution time) in something the profiler marks as ?!?, i.e. it doesn't
> understand it, this might have to do with WOW64 context switching but I'm not
> sure.

The ?!? could be JITted JavaScript, no?

Does it disappear when you disable the JITs in your prefs?
(In reply to comment #3)
> (In reply to comment #2)
> > For me this runs at 35 fps in minefield with D2D, 30 fps in Minefield without
> > D2D and 55 fps in Chromium. I'm not so sure this is graphics bound.
> > 
> > I can a really quick profile and we spend a fair amount of time(25% of our
> > execution time) in something the profiler marks as ?!?, i.e. it doesn't
> > understand it, this might have to do with WOW64 context switching but I'm not
> > sure.
> 
> The ?!? could be JITted Javascript, no?
> 
> Does it disappear when you disable the jit's in your prefs?

It does, and it basically all goes to js::Interpret. Excellent observation. So that 25% is indeed JITted JS.
Whiteboard: [chromeexperiments]
So (on Mac, but I think it doesn't matter that much given the above), I see the following time breakdown:

  12% painting the window (note that for Chrome this may happen in a different
      process, so in parallel with all this other stuff on a multicore system).
  39% mjit-generated code
  14% drawImage on the canvas
  12% stubs::NewInitObject (about half of this is gc)
   5% js_math_ceil (In particular, about 1/6 of this is the JSDOUBLE_IS_INT32
      test in Value::setNumber, especially the ucomisd instruction in there. 
      About 1/3 of the time here is actually calling ceil(); this is a build
      with the math operation cache).
   5% InitPropOrMethod (almost entirely self time).
   4% stubs::NewArray
   3% stubs::Mod (all self time)
   2% stubs::SetElem (on a typed array)
   2% js_math_min
   2% math_abs
  
then various minor stuff (canvas putImageData, stubs::InitProp, canvas getImageData, js_math_max, stubs::SetName, fillRect).  That gets us down to the 0.1% range.  In any case, the only places we can really win big here are in the list above, since those add up to 90%+ of the time (I rounded all the percentages up, so it doesn't quite add up to the 98% it looks like).
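
To get a feel for how much each entry in the breakdown could buy, here is a rough Amdahl's-law sketch. It assumes the percentages are shares of per-frame CPU time and that the frame rate scales inversely with that time, which ignores idle or blocked time. The function names, the 30 fps baseline, and the 2x factor are illustrative assumptions, not measurements.

```python
def overall_speedup(fraction: float, local_speedup: float) -> float:
    """Amdahl's law: whole-program speedup when `fraction` of the time
    is made `local_speedup` times faster."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


def projected_fps(current_fps: float, fraction: float, local_speedup: float) -> float:
    """Projected frame rate, assuming fps is inversely proportional to frame time."""
    return current_fps * overall_speedup(fraction, local_speedup)


# Hypothetical scenario: mjit-generated code (39% of the time) becomes 2x faster.
mjit_case = projected_fps(30.0, 0.39, 2.0)

# Hypothetical scenario: stubs::NewInitObject (12%) becomes 2x faster.
newinit_case = projected_fps(30.0, 0.12, 2.0)

print(f"mjit code 2x faster:     {mjit_case:.1f} fps")
print(f"NewInitObject 2x faster: {newinit_case:.1f} fps")
```

Even a 2x win on a 12% item moves the frame rate by only a couple of fps under these assumptions, which is why the large entries matter most.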

What I didn't do is take a profile that would indicate whether we're spending idle time blocked on something. With the imagedata stuff here we _could_ be getting readbacks that wouldn't show up in this data, maybe. Not sure how Shark accounts for those, or the profiler Bas used.

One other note: I hit about 30fps with the build I profiled above (tracemonkey tip).  Chrome is at 40fps on the same hardware.  Safari 5 at 25fps, Opera 10.6 at 45fps.
Status: UNCONFIRMED → NEW
Ever confirmed: true
(In reply to comment #5)
>    5% js_math_ceil (In particular, about 1/6 of this is the JSDOUBLE_IS_INT32
>       test in Value::setNumber, especially the ucomisd instruction in there. 
>       About 1/3 of the time here is actually calling ceil(); this is a build
>       with the math operation cache).

Note: I didn't actually turn on the math cache for ceil/floor because (lazily, without measuring) I thought these should be pretty fast.
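
For readers unfamiliar with the JSDOUBLE_IS_INT32 test discussed above: it decides whether a double result can be stored as an int32 value instead. The sketch below only illustrates what such a check has to verify; it is not the SpiderMonkey source, and the function name `double_is_int32` is hypothetical.

```python
import math

INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def double_is_int32(d: float):
    """Return (True, i) if d is exactly representable as an int32
    (and is not negative zero), else (False, None)."""
    if math.isnan(d) or math.isinf(d):
        return False, None
    # Negative zero compares equal to 0 but must stay a double,
    # because an int32 cannot carry the sign of zero.
    if d == 0.0 and math.copysign(1.0, d) < 0:
        return False, None
    if d < INT32_MIN or d > INT32_MAX:
        return False, None
    i = int(d)  # truncates toward zero
    if float(i) != d:
        return False, None  # d had a fractional part
    return True, i


print(double_is_int32(3.0))    # integral and in range
print(double_is_int32(3.5))    # fractional part
print(double_is_int32(-0.0))   # negative zero
print(double_is_int32(2.0**31))  # one past INT32_MAX
```

A ceil() result is always integral, so for in-range values the check succeeds, but it still has to run on every call, and the round trip through an integer conversion plus a floating-point compare is where the cost noted in the profile would come from.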
I get 22fps with the latest nightly. Can this be closed?
> I get 22fps with the latest nightly. Can this be closed?

What do you get in Chrome on the same hardware?
(In reply to comment #8)
> > I get 22fps with the latest nightly. Can this be closed?
> 
> What do you get in Chrome on the same hardware?

29fps with x86_64 beta channel.
I get around 22fps with 4.0b8pre and around 40fps with Chrome, so not fixed yet.
Depends on: 550389
Assignee: general → nobody
Severity: normal → S3
Status: NEW → RESOLVED
Closed: 8 months ago
Resolution: --- → INCOMPLETE