Closed Bug 1339758 Opened 7 years ago Closed 7 years ago

Accessing offsetX/offsetY on MouseEvents dispatched from click() is slower than in other engines

Categories

(Core :: DOM: Core & HTML, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla55
Performance Impact low
Tracking Status
firefox55 --- fixed

People

(Reporter: jandem, Assigned: smaug)

References

(Blocks 3 open bugs)

Details

Attachments

(4 files)

Attached file Test
In a Speedometer profile I noticed we were spending a lot of time under the MouseEvent offsetY getter, doing a reflow:

750.00 ms  mozilla::dom::MouseEvent::OffsetY()
750.00 ms  mozilla::dom::Event::GetOffsetCoords
748.00 ms  mozilla::PresShell::FlushPendingNotifications

Is this something we could optimize?

(Also, this was from an Ion IC so our Ion DOM paths didn't work, need to figure out why...)

I'm attaching a simple offsetX micro-benchmark, but it's probably not very representative. Still, we're slower than Chrome and Safari:

Safari:   2 ms
Chrome:  12 ms
Nightly: 48 ms

Maybe Safari is loop-hoisting or DCE-ing this micro-benchmark, but it would still be good to look into our own perf numbers.
I guess the question is do they not flush layout.
Webkit does call targetNode->document().updateLayoutIgnorePendingStylesheets(); once per event.
Blink does something similar.  We should be able to achieve parity with them in terms of the number of reflows if we cached what Event::GetOffsetCoords() computes...
yeah, although that is against the current spec.
So a few things are going on here:

1)  Our flushing code is somewhat dumb in this case: it does a lot of no-op work.  The dumbness can be fixed now, though, due to bug 1334735 being fixed.  I filed bug 1339891 on this.  Fixing that gives us an immediate 30% speedup here, because the flush code becomes a very quick boolean check and then we know we don't need to flush.  Note that it's not clear whether this would help in the original speedometer testcase, because there maybe we _do_ have dirty layout!

2)  The remaining work is due to us doing a bunch of dynamic computation of offsets on every offsetX/Y get.  Specifically, lots of time under GetTransformToAncestor, FindNearestCommonAncestor, Matrix4x4 bits.  This is the part where Blink and WebKit cache the results of this computation.  I'm pretty sure they still have identical code there, based on a brief skim.

It's not clear to me, for #2, what the spec actually requires.  The spec text at https://drafts.csswg.org/cssom-view/#dom-mouseevent-offsetx is:

  If the event’s dispatch flag is set, return the x-coordinate of the position where the event occurred
  relative to the origin of the padding edge of the target node, ignoring the transforms that apply
  to the element and its ancestors, and terminate these steps.

Seems to me like the "correct" per-spec implementation would be neither our "compute it every time" nor Blink/WebKit's "compute it on first get" but a "compute it eagerly at event creation time".  And yes, the three approaches can all give different behavior in various cases...  :(
I don't understand where you get the "compute it eagerly at event creation time" interpretation.
> I don't understand where you get the "compute it eagerly at event creation time" interpretation.

Well, it says "where the event occurred relative to the origin of the padding edge of the target node".  I guess the question is whether it means "the padding edge when the event occurred" or "the padding edge now"?  I read it as the former, but you're right that it's not super-clear.
Jan's test case shouldn't cause any reflows.  It's a bit hard to believe that we're slower *because* of the reflow cost in that test case at least.  I was curious so I profiled the micro benchmark: https://perfht.ml/2lkvQdf

30ms reflow that's all the cost of vsnprintf (profiler overhead).  Filed bug 1339897.
32ms nsLayoutUtils::TransformPoint() and a bunch of other overheads.

I think from this it's clear that the caching does make a difference, even disregarding reflow cost.
Sure.  Just to put numbers to this, on my hardware Safari is at ~10ns per iteration on this testcase if I up the iteration count some.  We're at ~380ns with the patch for bug 1339891.  All of that time, pretty much, is under TransformPoint.  I will claim it's impossible to do the work of computing all that transformation matrix stuff in ~10ns.  ;)

If I add the obvious cache bits, like so:

MouseEvent::OffsetX()
{
  if (!mCachedOffsetPoint) {
    mOffsetCoords = Event::GetOffsetCoords(mPresContext, mEvent, mEvent->mRefPoint,
                                           mClientPoint);
    mCachedOffsetPoint = true;
  }
  return mOffsetCoords.x;

we end up at ~7.5ns per iteration.
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #9)
> Sure.  Just to put numbers to this, on my hardware Safari is at ~10ns per
> iteration on this testcase if I up the iteration count some.  We're at
> ~380ns with the patch for bug 1339891.  All of that time, pretty much, is
> under TransformPoint.  I will claim it's impossible to do the work of
> computing all that transformation matrix stuff in ~10ns.  ;)

Fine.  :P

> If I add the obvious cache bits, like so:
> 
> MouseEvent::OffsetX()
> {
>   if (!mCachedOffsetPoint) {
>     mOffsetCoords = Event::GetOffsetCoords(mPresContext, mEvent,
> mEvent->mRefPoint,
>                                            mClientPoint);
>     mCachedOffsetPoint = true;
>   }
>   return mOffsetCoords.x;
> 
> we end up at ~7.5ns per iteration.

Nice.  So I guess one question that we need to answer is what the spec actually wants.  (I wonder what Edge does.)  Also, it's still quite possible that in Speedometer we're getting slow perf because of the reflow and not the caching, since if we actually have a dirty layout, flushing it can be much more expensive than the GetOffsetCoords overhead.
So in terms of it being the actual reflow... Caching would help a _lot_ with that if there are multiple accesses with interspersed mutation, since it makes sure there is at most one reflow, period.

In terms of what Edge does, I wrote some testcases:

1) https://jsfiddle.net/7x5e701m/1/ -- on this one we report a number near 0, then a number near -200.  Blink/WebKit report a number near 0 twice.  Edge 14 matches WebKit/Blink.  The actual testcase is:

  <div style="position: relative" id="mydiv">Click on the "C" of the first "Click" in this line</div>
  <script>
  document.getElementById("mydiv").onclick = function(e) {
    console.log(e.offsetX);
    document.querySelector("div").style.left = "200px";
    console.log(e.offsetX);
  };
  </script>

2) https://jsfiddle.net/7x5e701m/2/ -- on this one both Gecko and Blink/WebKit report a single number near -200.  Edge reports a number near 0.  The actual testcase is:

  <div style="position: relative" id="mydiv">Click on the "C" of the first "Click" in this line</div>
  <script>
  document.getElementById("mydiv").onclick = function(e) {
    document.querySelector("div").style.left = "200px";
    console.log(e.offsetX);
  };
  </script>

3) https://jsfiddle.net/7x5e701m/4/ -- on this one both Gecko and Blink/WebKit report a number near 200, then a number near -200.  Edge reports a number near 200, then a number near 0.  The actual testcase is:

  <div style="position: relative" id="mydiv">Click on the "C" of the first "Click" in this line</div>
  <script>
  document.getElementById("mydiv").onclick = function(e) {
    var d = document.querySelector("div");
    d.style.left = "200px";
    console.log(d.offsetLeft)
    console.log(e.offsetX);
  };
  </script>

So looks to me like Edge is doing something like my interpretation of the spec, where the offsetX is computed based on the layout when the event fires or something?  And yes, cached after that.
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #11)
> So in terms of it being the actual reflow... Caching would help a _lot_ with
> that if there are multiple accesses with interspersed mutation, since it
> makes sure there is at most one reflow, period.

Good point.

> So looks to me like Edge is doing something like my interpretation of the
> spec, where the offsetX is computed based on the layout when the event fires
> or something?  And yes, cached after that.

Technically from these results it's also possible to interpret that Edge just doesn't do a reflow, but that seems far fetched.
> Technically from these results it's also possible to interpret that Edge just doesn't do a reflow

No, that's what the third testcase is about.  Edge is reporting an offsetLeft of 200 for that div, so it did the reflow for the .style.left change, yes?
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #13)
> > Technically from these results it's also possible to interpret that Edge just doesn't do a reflow
> 
> No, that's what the third testcase is about.  Edge is reporting an
> offsetLeft of 200 for that div, so it did the reflow for the .style.left
> change, yes?

Oh yes, you're right.

This is a terrible situation.  :(  We have three competing implementations.
Priority: -- → P1
So ok, we need to make a call here.  Do we follow IE and implement the spec, but with a performance penalty due to having to do this computation at firing time for every single mouse event?  Do we get the spec change to the somewhat incoherent but fast Blink/WebKit behavior?  Something else?

Paging events owner.  ;)
Flags: needinfo?(bugs)
And the events owner reads the spec differently saying that IE/Edge doesn't follow the spec but we do.
But given that webkit/blink model gives the best performance, and at least it isn't easily visible to the user of offsetX/Y that at which point the value is cached, I think we should try that model.
And we of course need a spec change/clarification
https://github.com/w3c/csswg-drafts/issues/1070


Though, how fast can we be with Edge model, assuming we don't flush anything, but just do computation based on the current layout state?
Flags: needinfo?(bugs)
But still, I can live with webkit/blink model, as horrible as it is.
> https://github.com/w3c/csswg-drafts/issues/1070

You probably want to add "[cssom-view]" to the beginning of the issue title, so the CSS WG folks end up with it in the right bucket.

> how fast can we be with Edge model, assuming we don't flush anything

That's a good question.  It sort of depends on the depth of the frame tree and whatnot, right?  In the testcase attached to this bug it takes us about 300-400ns per call to do the computation; the flushing is not really a factor there because there is nothing to flush.

If I add ten levels of nesting, then it's closer to 500ns.  I expect if I added some transform styles and whatnot I could make it a bit slower still, but not sure about that.
Whiteboard: [qf:p3]
Looking closer at Speedometer, I wonder if the problem here is instead that Webkit short-circuits simulated events and we don't:

https://github.com/WebKit/webkit/blob/8ff2aef949f9fb1857fe85990d8c5c831b99d872/Source/WebCore/dom/MouseRelatedEvent.cpp#L222-L223

At least some Speedometer tests trigger events by doing things like element.click().

Am I missing something?
Flags: needinfo?(bugs)
Hmm.  That might be something we can fix much more cheaply/easily than the real event case, yes...
(In reply to Boris Zbarsky [:bz] (still a bit busy) (if a patch has no decent message, automatic r-) from comment #21)
> Hmm.  That might be something we can fix much more cheaply/easily than the
> real event case, yes...

Yea, but this seems to be trending toward benchmarketing fixes, right?
I mean, I guess its useful if it moves the bottleneck to something else real we can fix.
There are non-benchmarking use cases for things like element.click() no?

FWIW, what's happening here is that jQuery wraps the event in its own event-thing and uses a for-in loop to copy all properties from the event to its own object. That's where the offsetX/offsetY gets are coming from...

Fixing this will definitely be a win on Speedometer, it affects multiple tests.
> Yea, but this seems to be trending toward benchmarketing fixes, right?

I think click() is common enough that making that case not do stupid things is not just benchmarketing.  In particular, it can skip computing the offsets altogether, unlike real mouse events...
Ok.  I guess I didn't realize .click() was something used on real web sites. *mumbles something about c++*
(In reply to Jan de Mooij [:jandem] from comment #20)
> Looking closer at Speedometer, I wonder if the problem here is instead that
> Webkit short-circuits simulated events and we don't:
> 
> https://github.com/WebKit/webkit/blob/
> 8ff2aef949f9fb1857fe85990d8c5c831b99d872/Source/WebCore/dom/
> MouseRelatedEvent.cpp#L222-L223

What does simulated mean in webkit?
Flags: needinfo?(bugs)
But if other browsers return always 0 for offsetX/Y when using click() then we could try to do the same and get that spec'ed.
Let's try to do at least the click() hack. I wonder in which all cases webkit/blink have it.
Not when creating/dispatching new MouseEvent explicitly in JS.
Assignee: nobody → bugs
Attached file click() perf
Summary: Accessing offsetX/offsetY on MouseEvents slower than in other engines → Accessing offsetX/offsetY on MouseEvents dispatched from click() is slower than in other engines
Can't write wpt test for this since it can't send low level events.
This is trying to mimic blink's behavior. The spec bug is still open so tweaks may come.

Let's see what tryserver thinks about this
https://treeherder.mozilla.org/#/jobs?repo=try&revision=11888382da065122762f7425921929105fbbe9cb
Attachment #8850166 - Flags: review?(masayuki)
Comment on attachment 8850166 [details] [diff] [review]
positionless mouse events

> nsIntPoint
> UIEvent::GetLayerPoint() const
> {
>+  if (mEvent->mFlags.mIsPositionless) {
>+    return nsIntPoint(0, 0);
>+  }
>+
>   if (!mEvent ||
>       (mEvent->mClass != eMouseEventClass &&
>        mEvent->mClass != eMouseScrollEventClass &&
>        mEvent->mClass != eWheelEventClass &&
>        mEvent->mClass != ePointerEventClass &&
>        mEvent->mClass != eTouchEventClass &&
>        mEvent->mClass != eDragEventClass &&
>        mEvent->mClass != eSimpleGestureEventClass) ||

Hmm, like this usage, we cannot add common flags/data for pointing device events to MouseEventClassBase nor MouseEventClass (TouchEventClass doesn't inherit them).

>diff --git a/dom/events/test/test_bug1339758.html b/dom/events/test/test_bug1339758.html
>new file mode 100644
>--- /dev/null
>+++ b/dom/events/test/test_bug1339758.html
>@@ -0,0 +1,78 @@
>+<!DOCTYPE HTML>
>+<html>
>+<!--
>+https://bugzilla.mozilla.org/show_bug.cgi?id=1339758
>+-->
>+<head>
>+  <meta charset="utf-8">
>+  <title>Test for Bug 1339758</title>
>+  <script type="application/javascript" src="/tests/SimpleTest/SimpleTest.js"></script>
>+  <script type="text/javascript" src="/tests/SimpleTest/EventUtils.js"></script>
>+  <link rel="stylesheet" type="text/css" href="/tests/SimpleTest/test.css"/>
>+  <script type="application/javascript">
>+
>+  /** Test for Bug 1339758 **/
>+  var expectNonZeroCoordinates = false;
>+  SimpleTest.waitForExplicitFinish();
>+
>+  function testCoordinates(e) {
>+   var coordinateProperties =
>+     [ "screenX",
>+       "screenY",
>+       "clientX",
>+       "clientY",
>+       "offsetX",
>+       "offsetY",
>+       "movementX",
>+       "movementY",
>+       "layerX",
>+       "layerY",
>+       "x",
>+       "y" ];
>+    for (var i in coordinateProperties) {
>+      if (e[coordinateProperties[i]] != 0) {
>+        ok(expectNonZeroCoordinates, "Got at least some non-zero coordinate property: " + i);

Hmm, when this fails, it's difficult to distinguish what's tested. Could you add description with global variable before each test in runTests()?

>+        return;
>+      }
>+    }
>+    ok(!expectNonZeroCoordinates);

And I want description here too.

>+  }
>+
>+  function runTests() {
>+    document.getElementById("div_target").click();
>+    document.getElementById("a_target").focus();

focus()? not click()?

>+    sendKey("RETURN");
>+    document.getElementById("input_target").focus();
>+    sendKey("RETURN");
>+
>+    expectNonZeroCoordinates = true;
>+    // Test script created MouseEvents
>+    sendMouseEvent({ type: "click"}, document.getElementById("a_target"));
>+    sendMouseEvent({ type: "click"}, document.getElementById("input_target"));
>+
>+    // Test widget level mouse events
>+    synthesizeMouse(document.getElementById("a_target"), 2, 2, {});
>+    synthesizeMouse(document.getElementById("input_target"), 2, 2, {});
>+    SimpleTest.finish();
>+  }
>+
>+  SimpleTest.waitForFocus(runTests);
>+
>+
>+  </script>
>+</head>
>+<body>
>+<a target="_blank" href="https://bugzilla.mozilla.org/show_bug.cgi?id=1339758">Mozilla Bug 1339758</a>
>+<p id="display"></p>
>+
>+<div id="div_target" onclick="testCoordinates(event)">&nbsp;</div>
>+<a href="#" id="a_target" onclick="testCoordinates(event);">test link</a><br>
>+<input type="button" id="input_target" onclick="testCoordinates(event);" value="test button">

Or, event.target.id could be useful for the description to distinguish which test was running.

>diff --git a/widget/BasicEvents.h b/widget/BasicEvents.h
>--- a/widget/BasicEvents.h
>+++ b/widget/BasicEvents.h
>@@ -134,16 +134,19 @@ public:
>   bool mComposedInNativeAnonymousContent : 1;
>   // Set to true for events which are suppressed or delayed so that later a
>   // DelayedEvent of it is dispatched. This is used when parent side process
>   // the key event after content side, and may drop the event if the event
>   // was suppressed or delayed in contents side.
>   // It is also set to true for the events (in a DelayedInputEvent), which will
>   // be dispatched afterwards.
>   bool mIsSuppressedOrDelayed : 1;
>+  // Certain mouse events can be marked as positionless to return 0 from
>+  // coordinate related getters.
>+  bool mIsPositionless : 1;

Although, I don't like to add such event specific flag to EventFlags. But it's okay.
Attachment #8850166 - Flags: review?(masayuki) → review+
> >+
> >+  function runTests() {
> >+    document.getElementById("div_target").click();
> >+    document.getElementById("a_target").focus();
> 
> focus()? not click()?
focus() and not click() to send RETURN to the right element.

> >+  bool mIsPositionless : 1;
> 
> Although, I don't like to add such event specific flag to EventFlags. But
> it's okay.
I wanted to put it to flags since that way we don't end up increase size of any Widget event class
Pushed by opettay@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/bec20d74d93a
click() and similar can return 0 from various coordinate properties, r=masayuki
https://hg.mozilla.org/mozilla-central/rev/bec20d74d93a
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
I am seeing a 10% improvement on speedometer due to this landing. Awesome!
http://arewefastyet.com/#machine=17&view=single&suite=speedometer&start=1490146341&end=1490347719
Do we have a followup bug filed on offsetX for non-click() events?  The benchmark uses click(), which is all well and good, but real sites would have actual user clicks....
Blocks: 1350373
Component: DOM → DOM: Core & HTML
Performance Impact: --- → P3
Whiteboard: [qf:p3]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: