Open Bug 732621 Opened 12 years ago Updated 2 years ago

Rendering/processing in onmousemove is very laggy in fractal demo

Categories

(Core :: Widget, defect)

All
Linux
defect

Tracking

()

People

(Reporter: azakai, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [games:p?][tpi:-])

Attachments

(2 files)

This cool demo:

http://www.fragsworth.com/fractals/

is very smooth on Chrome (move the mouse around, the fractal shifts with it), and hardly usable on Firefox. It looks like it only renders one frame when the mouse stops moving.

Attached is a version of that page which modifies the onmousemove event handler to do all its work inside a setTimeout, like this:

  function onCanvasMouseMove(e)
  {
    setTimeout(function() {
      ..
    }, 0);
  }

With that added setTimeout, rendering does happen and it becomes very usable, just slightly less responsive than Chrome.

I have no idea why this has such an effect, however... is it expected?
Blocking gecko-gaming because many games require onmousemove event handling to be responsive (anything that continuously tracks the mouse).
Blocks: gecko-games
OS: Linux → All
Hardware: x86 → All
Please profile this. I doubt this has anything to do with DOM events or event handling.
To elaborate on my point from before, my suspicion that it has to do with event handling stems from the fact that adding a setTimeout 0 inside the event fixes the bug entirely, it makes a huge difference. Without the setTimeout, it looks like the events are not being handled at all - only one frame is rendered after the mouse stops moving, all other frames while the mouse is moving simply have no effect. But it is just a guess on my part that event handling is relevant here.
A bit unreliable profiling shows that 70% of the time is spent in JS when moving mouse in
http://www.fragsworth.com/fractals/

Though, perhaps we're just dispatching mousemove events more often than some other browsers.
Assignee: nobody → general
Component: DOM: Events → JavaScript Engine
QA Contact: events → general
> Though, perhaps we're just dispatching mousemove events more often than some other
> browsers.

I just did an informal test, and on Mac we seem to fire those at about the same rate as Chrome.

Of course on that same Mac I can't reproduce the bug; the rendering updates in realtime as I move the mouse....
(In reply to Boris Zbarsky (:bz) from comment #5)
> > Though, perhaps we're just dispatching mousemove events more often than some other
> > browsers.
> 
> I just did an informal test, and on Mac we seem to fire those at about the
> same rate as Chrome.
> 
> Of course on that same Mac I can't reproduce the bug; the rendering updates
> in realtime as I move the mouse....

Can you share the code you used to test, and I'll run it here on Linux? (I want to run the same code to make the comparison as fair as possible between the different OSes.)
> Can you share the code you used to test

<!DOCTYPE html>
<div style="width: 400px; height: 400px; border: 1px solid black"
     onmouseover="events = 0"
     onmousemove="++events"
     onmouseout="setTimeout(function() { alert(events); }, 1000)">

and then drag the mouse pointer at a given constant rate across the square
Thanks. On my system we also seem to fire the same amount of events in Firefox and Chrome.

If it isn't that, and it isn't JavaScript execution speed, then the next suspect is graphics I think, since it looks like this bug is Linux-only (at least from a sample of my Linux machine and bz's Mac).
Assignee: general → nobody
Component: JavaScript Engine → Graphics
OS: All → Linux
QA Contact: general → thebes
On this machine (64bit linux, intel graphics) it was mostly JS
It should use a lot of JS. The question is why it doesn't render any frames at all (until the mouse stops moving). Did you see that on your machine?
No. It is just a bit slower than Chrome.
If no one can reproduce this, maybe it is something specific on my system. I have no addons enabled, so maybe a graphics driver issue? about:support says

  Graphics

        Adapter Description
        NVIDIA Corporation -- GeForce GT 330M/PCI/SSE2

        Vendor ID
        NVIDIA Corporation

        Device ID
        GeForce GT 330M/PCI/SSE2

        Driver Version
        3.3.0 NVIDIA 280.13

        WebGL Renderer
        NVIDIA Corporation -- GeForce GT 330M/PCI/SSE2 -- 3.3.0 NVIDIA 280.13

        GPU Accelerated Windows
        0

        AzureBackend
        skia
Very strange. I built FF with "-O2 -g" in order to profile, but can't reproduce the problem. With the same profile, I get the bug in Nightly but not in my custom build. Only explanation I can think of is that Nightly is a little behind the custom build, and that the bug was just fixed - but the difference is less than a day, so I am confused.
I still can't figure this out.

When I build m-c myself, I don't see this bug. When I run Nightly, however - even with the same profile - I do see the bug. I've done a few builds over the last week and that stays constant.

I am building m-c with

  mk_add_options MOZ_OBJDIR=../m-c
  ac_add_options --enable-application=browser
  ac_add_options --enable-optimize="-O2 -g"
  ac_add_options --disable-debug
  ac_add_options --disable-tests

- no special settings.

What difference could there be between my build of m-c and Nightly that could cause such different behavior?
Version: unspecified → Trunk
I don't know what the difference could be, but this is probably Bug 602303.

I remember experiencing similar differences in mc vs self compiled builds during the Firefox 4 cycle. IIRC, it even happened with mozilla provided 3.6 builds vs distro provided ones.
Whiteboard: [games:p2]
I can reproduce this with a nightly on win32
I may have an idea about this...
Assignee: nobody → ehsan
Actually, I don't!

This seems to be all JS.  Moving to the JS component.
Assignee: ehsan → general
Component: Graphics → JavaScript Engine
Yeah, this is pretty bad.  I did a quick shark profile.  97% of time is under JaegerShot.  Of this time:
 39% is in jit code
 37% is under nsIDOMImageData_GetData, of this time:
  + 41% is self time of nsIDOMImageData_GetData and dom::ImageData::GetData
  + 38% is just calling JS_WrapValue from dom::ImageData::GetData
  + 17% is calling GCThingIsMarkedGray
 20% is js::mjit::stubs::SlowCall just calling js_math_floor (no wrappers), which spends a lot of time doing TI monitoring and the general Invoke path

So:

 1. Why is Math.floor taking such a slow call path?  Perhaps you could look at this Brian?

 2. Why is nsIDOMImageData doing so much wrapping?  I feared this was cross-global CPG stuff, but, in that case, I would expect to see more CCW junk on the stack which I don't.  In particular, I see calls to WrapperFactory::WrapForSameCompartment under JSCompartment::wrap, so is this some crazy same-compartment wrapper shenannigans?

 3. Unmark grey... anyone know what that's for?
Talking to billm about the unmark-gray overhead, we were wondering if maybe we can statically know that we don't have to unmark grey on the returned object because it was transitively marked black by the JS callee/receiver which we know is black.  (I don't understand enough to know whether this is the case, just hoping it is.)

Also, Bill mentioned Andrew had a patch to inline the fast path of unmark-grey...
(In reply to Luke Wagner [:luke] from comment #20)
> Talking to billm about the unmark-gray overhead, we were wondering if maybe
> we can statically know that we don't have to unmark grey on the returned
> object because it was transitively marked black by the JS callee/receiver
> which we know is black.  (I don't understand enough to know whether this is
> the case, just hoping it is.)
> 
> Also, Bill mentioned Andrew had a patch to inline the fast path of
> unmark-grey...

That's in bug 747066.  The patch itself is too gross to actually land, but the actual operation is just a few loads and bit operations, so maybe we could make it simpler somehow.

My patch eliminated a 6ns overhead bz measured. I'm not sure if these operations you are measuring are fast enough that that would be a problem.

It certainly could be the case that we're unmark graying too much. We've spread them around fairly liberally.
> Why is nsIDOMImageData doing so much wrapping?

Because it always has to call JS_WrapValue on the data, because there's no guarantee that it's called while in the compartment of the data.

The fact that it has to worry about this is mostly bug 751926, though given that we're calling the XPCOM API for now it would have to call JS_WrapValue anyway.  I thought we had an existing bug on switching ImageData to WebIDL once non-Azure canvas dies, but I can't find it anywhere.  Peter?

The fact that JS_WrapValue is slow doesn't seem to have a dedicated bug, though it's a pretty well-known fact.  Maybe there should be such a dedicated bug.
We should rebenchmark this in a post-Jaeger world and see where things are at.
Whiteboard: [games:p2] → [games:p?]
No change, identical to before.
(In reply to Alon Zakai (:azakai) from comment #24)
> No change, identical to before.

The URL in comment 0 no longer works, but I tested the attachment and it seems to work just like in Chrome on OS X and Windows. It's hard to compare them without some kind of fps counter, but it's definitely not "very laggy".

Instruments shows most time is spent in JIT code, with a few % under CanvasRenderingContext2D::PutImageData_explicit, that seems pretty reasonable.

Could this be a Linux graphics issue maybe? What are others seeing?
Super smooth for me even on Linux. I can't see any noticeable to Chromium either.
The previous attachment does *not* show the problem. Looks like the original URL is down, so here is an attachment that does show the issue.

The only difference between the two attachment is that the first one does a setTimeout(.., 0) inside the mouse event handler. Due to that, it renders at a respectable pace. When not doing the setTimeout, though, I get almost no frames rendered until the mouse stops moving - as if it can't render from the mouse event handler.
Is this just about us processing more mousemove events than Chrome. They drop events if they can't
process them fast enough, IIRC.
In the setTimeout testcase, do we end up processing some doFrame calls quite late after the
mousemove actually happened?
Thanks Alon. This version runs fine on OS X and Windows, but I do see the problem on Linux. So it's probably OS / graphics related. Is it possible we have more mouse move events on Linux?

Since this works fine on OS X and Windows, I doubt this is a JS bug.
smaug: I see us emitting more mouse events than chrome, yes. 12/second for chrome and 16/second for us, using this change to the code

var recentMouseEvents = 0;
setInterval(function() {
  console.log('mouse events over last second: ' + recentMouseEvents);
  recentMouseEvents = 0;
}, 1000);
function onCanvasMouseMove(e) {
  recentMouseEvents++;

With that logging once per second in the console, it is clear that the whole browser becomes unresponsive (and no loggings are shown) when the rendering stalls. So perhaps the event loop is just full with events that take significant enough time to cause blockage. Throttling mouse events when the event loop is blocked sounds like it could definitely help.

jandem: Yes, seems more like an event handling issue.
Event handling or event loop issue. Probably the latter.
Sounds like the fact that we have logic to handle processing of starved paints on Windows, but no such thing on Linux if it is a Linux only problem.
Whiteboard: [games:p?] → [games:p2]
On linux bug 930793 makes a huge difference here when testing
https://bugzilla.mozilla.org/attachment.cgi?id=798566 (the non-setTimeout version)
On Windows OMTC might have helped with this already.
Depends on: 930793
Assignee: general → nobody
Component: JavaScript Engine → Widget
Same problem exists on my Macbook pro running yosemite using latest firefox release. The onmousemove event is very slow on firefox compared to chrome and safari. We use a similar technique the original poster using, but still no cigar. Could it be related to GPU compositing or vsync timing?
mousemove event is not slow, but Gecko is dispatching more events and such.
Have you tried to modify your code to use requestAnimationFrame or setTimeout?


(Need to get back to bug 930793)
Yes, with latest firefox, the RequestAnimationFrame gives better results if we put a starvation interval. I should have to check the older versions of Firefox though to confirm that if it would be a viable option, or not. Gecko is indeed dispatching every single mouse move event to the handler starving everything in the process.
dom.event.handling-user-input-time-limit config options is by default 1000. If I set it to zero, the move events are processed much faster. What is this time limit for?
Never mind. It actullay didnt make any difference.
Setting gfx.vsync.compositor to false makes mouse move events more responsive now. The rendering delayed by the vsync but events continue to flow that is why the lag happens. This is on macbook pro Yosemite with Firefox 39. Why is this gfx.vsync.compositor is true by default?
Source of regression is probably this 

https://bugzilla.mozilla.org/show_bug.cgi?id=732621
Ooops sorry about that. This is the bug that is the source of regression

https://bugzilla.mozilla.org/show_bug.cgi?id=1118530
Whiteboard: [games:p2] → [games:p?]
Whiteboard: [games:p?] → [games:p?][tpi:-]
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.