Accessing offsetX/offsetY on MouseEvents dispatched from click() is slower than in other engines

RESOLVED FIXED in Firefox 55

Status

()

defect
P1
normal
RESOLVED FIXED
2 years ago
a month ago

People

(Reporter: jandem, Assigned: smaug)

Tracking

(Blocks 3 bugs)

unspecified
mozilla55
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox55 fixed)

Details

(Whiteboard: [qf:p3])

Attachments

(4 attachments)

(Reporter)

Description

2 years ago
Posted file Test
In a Speedometer profile I noticed we were spending a lot of time under the MouseEvent offsetY getter, doing a reflow:

750.00 ms  mozilla::dom::MouseEvent::OffsetY()
750.00 ms  mozilla::dom::Event::GetOffsetCoords
748.00 ms  mozilla::PresShell::FlushPendingNotifications

Is this something we could optimize?

(Also, this was from an Ion IC so our Ion DOM paths didn't work, need to figure out why...)

I'm attaching a simple offsetX micro-benchmark, but it's probably not very representative. Still, we're slower than Chrome and Safari:

Safari:   2 ms
Chrome:  12 ms
Nightly: 48 ms

Maybe Safari is loop-hoisting or DCE-ing this micro-benchmark, but it would still be good to look into our own perf numbers.
(Assignee)

Comment 1

2 years ago
I guess the question is do they not flush layout.
(Assignee)

Comment 2

2 years ago
Webkit does call targetNode->document().updateLayoutIgnorePendingStylesheets(); once per event.

Comment 3

2 years ago
Blink does something similar.  We should be able to achieve parity with them in terms of the number of reflows if we cached what Event::GetOffsetCoords() computes...
(Assignee)

Comment 4

2 years ago
yeah, although that is against the current spec.
So a few things are going on here:

1)  Our flushing code is somewhat dumb in this case: it does a lot of no-op work.  The dumbness can be fixed now, though, due to bug 1334735 being fixed.  I filed bug 1339891 on this.  Fixing that gives us an immediate 30% speedup here, because the flush code becomes a very quick boolean check and then we know we don't need to flush.  Note that it's not clear whether this would help in the original speedometer testcase, because there maybe we _do_ have dirty layout!

2)  The remaining work is due to us doing a bunch of dynamic computation of offsets on every offsetX/Y get.  Specifically, lots of time under GetTransformToAncestor, FindNearestCommonAncestor, Matrix4x4 bits.  This is the part where Blink and WebKit cache the results of this computation.  I'm pretty sure they still have identical code there, based on a brief skim.

It's not clear to me, for #2, what the spec actually requires.  The spec text at https://drafts.csswg.org/cssom-view/#dom-mouseevent-offsetx is:

  If the event’s dispatch flag is set, return the x-coordinate of the position where the event occurred
  relative to the origin of the padding edge of the target node, ignoring the transforms that apply
  to the element and its ancestors, and terminate these steps.

Seems to me like the "correct" per-spec implementation would be neither our "compute it every time" nor Blink/WebKit's "compute it on first get" but a "compute it eagerly at event creation time".  And yes, the three approaches can all give different behavior in various cases...  :(
(Assignee)

Comment 6

2 years ago
I don't understand where you get the "compute it eagerly at event creation time" interpretation.
> I don't understand where you get the "compute it eagerly at event creation time" interpretation.

Well, it says "where the event occurred relative to the origin of the padding edge of the target node".  I guess the question is whether it means "the padding edge when the event occurred" or "the padding edge now"?  I read it as the former, but you're right that it's not super-clear.

Comment 8

2 years ago
Jan's test case shouldn't cause any reflows.  It's a bit hard to believe that we're slower *because* of the reflow cost in that test case at least.  I was curious so I profiled the micro benchmark: https://perfht.ml/2lkvQdf

30ms reflow that's all the cost of vsnprintf (profiler overhead).  Filed bug 1339897.
32ms nsLayoutUtils::TransformPoint() and a bunch of other overheads.

I think from this it's clear that the caching does make a difference, even disregarding reflow cost.
Sure.  Just to put numbers to this, on my hardware Safari is at ~10ns per iteration on this testcase if I up the iteration count some.  We're at ~380ns with the patch for bug 1339891.  All of that time, pretty much, is under TransformPoint.  I will claim it's impossible to do the work of computing all that transformation matrix stuff in ~10ns.  ;)

If I add the obvious cache bits, like so:

MouseEvent::OffsetX()
{
  if (!mCachedOffsetPoint) {
    mOffsetCoords = Event::GetOffsetCoords(mPresContext, mEvent, mEvent->mRefPoint,
                                           mClientPoint);
    mCachedOffsetPoint = true;
  }
  return mOffsetCoords.x;

we end up at ~7.5ns per iteration.

Comment 10

2 years ago
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #9)
> Sure.  Just to put numbers to this, on my hardware Safari is at ~10ns per
> iteration on this testcase if I up the iteration count some.  We're at
> ~380ns with the patch for bug 1339891.  All of that time, pretty much, is
> under TransformPoint.  I will claim it's impossible to do the work of
> computing all that transformation matrix stuff in ~10ns.  ;)

Fine.  :P

> If I add the obvious cache bits, like so:
> 
> MouseEvent::OffsetX()
> {
>   if (!mCachedOffsetPoint) {
>     mOffsetCoords = Event::GetOffsetCoords(mPresContext, mEvent,
> mEvent->mRefPoint,
>                                            mClientPoint);
>     mCachedOffsetPoint = true;
>   }
>   return mOffsetCoords.x;
> 
> we end up at ~7.5ns per iteration.

Nice.  So I guess one question that we need to answer is what the spec actually wants.  (I wonder what Edge does.)  Also, it's still quite possible that in Speedometer we're getting slow perf because of the reflow and not the caching, since if we actually have a dirty layout, flushing it can be much more expensive than the GetOffsetCoords overhead.
So in terms of it being the actual reflow... Caching would help a _lot_ with that if there are multiple accesses with interspersed mutation, since it makes sure there is at most one reflow, period.

In terms of what Edge does, I wrote some testcases:

1) https://jsfiddle.net/7x5e701m/1/ -- on this one we report a number near 0, then a number near -200.  Blink/WebKit report a number near 0 twice.  Edge 14 matches WebKit/Blink.  The actual testcase is:

  <div style="position: relative" id="mydiv">Click on the "C" of the first "Click" in this line</div>
  <script>
  document.getElementById("mydiv").onclick = function(e) {
    console.log(e.offsetX);
    document.querySelector("div").style.left = "200px";
    console.log(e.offsetX);
  };
  </script>

2) https://jsfiddle.net/7x5e701m/2/ -- on this one both Gecko and Blink/WebKit report a single number near -200.  Edge reports a number near 0.  The actual testcase is:

  <div style="position: relative" id="mydiv">Click on the "C" of the first "Click" in this line</div>
  <script>
  document.getElementById("mydiv").onclick = function(e) {
    document.querySelector("div").style.left = "200px";
    console.log(e.offsetX);
  };
  </script>

3) https://jsfiddle.net/7x5e701m/4/ -- on this one both Gecko and Blink/WebKit report a number near 200, then a number near -200.  Edge reports a number near 200, then a number near 0.  The actual testcase is:

  <div style="position: relative" id="mydiv">Click on the "C" of the first "Click" in this line</div>
  <script>
  document.getElementById("mydiv").onclick = function(e) {
    var d = document.querySelector("div");
    d.style.left = "200px";
    console.log(d.offsetLeft)
    console.log(e.offsetX);
  };
  </script>

So looks to me like Edge is doing something like my interpretation of the spec, where the offsetX is computed based on the layout when the event fires or something?  And yes, cached after that.

Comment 12

2 years ago
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #11)
> So in terms of it being the actual reflow... Caching would help a _lot_ with
> that if there are multiple accesses with interspersed mutation, since it
> makes sure there is at most one reflow, period.

Good point.

> So looks to me like Edge is doing something like my interpretation of the
> spec, where the offsetX is computed based on the layout when the event fires
> or something?  And yes, cached after that.

Technically from these results it's also possible to interpret that Edge just doesn't do a reflow, but that seems far fetched.
> Technically from these results it's also possible to interpret that Edge just doesn't do a reflow

No, that's what the third testcase is about.  Edge is reporting an offsetLeft of 200 for that div, so it did the reflow for the .style.left change, yes?

Comment 14

2 years ago
(In reply to Boris Zbarsky [:bz] (still a bit busy) from comment #13)
> > Technically from these results it's also possible to interpret that Edge just doesn't do a reflow
> 
> No, that's what the third testcase is about.  Edge is reporting an
> offsetLeft of 200 for that div, so it did the reflow for the .style.left
> change, yes?

Oh yes, you're right.

This is a terrible situation.  :(  We have three competing implementations.
Priority: -- → P1
So ok, we need to make a call here.  Do we follow IE and implement the spec, but with a performance penalty due to having to do this computation at firing time for every single mouse event?  Do we get the spec change to the somewhat incoherent but fast Blink/WebKit behavior?  Something else?

Paging events owner.  ;)
Flags: needinfo?(bugs)
(Assignee)

Comment 16

2 years ago
And the events owner reads the spec differently saying that IE/Edge doesn't follow the spec but we do.
But given that webkit/blink model gives the best performance, and at least it isn't easily visible to the user of offsetX/Y that at which point the value is cached, I think we should try that model.
(Assignee)

Comment 17

2 years ago
And we of course need a spec change/clarification
https://github.com/w3c/csswg-drafts/issues/1070


Though, how fast can we be with Edge model, assuming we don't flush anything, but just do computation based on the current layout state?
Flags: needinfo?(bugs)
(Assignee)

Comment 18

2 years ago
But still, I can live with webkit/blink model, as horrible as it is.
> https://github.com/w3c/csswg-drafts/issues/1070

You probably want to add "[cssom-view]" to the beginning of the issue title, so the CSS WG folks end up with it in the right bucket.

> how fast can we be with Edge model, assuming we don't flush anything

That's a good question.  It sort of depends on the depth of the frame tree and whatnot, right?  In the testcase attached to this bug it takes us about 300-400ns per call to do the computation; the flushing is not really a factor there because there is nothing to flush.

If I add ten levels of nesting, then it's closer to 500ns.  I expect if I added some transform styles and whatnot I could make it a bit slower still, but not sure about that.
Whiteboard: [qf:p3]
(Reporter)

Comment 20

2 years ago
Looking closer at Speedometer, I wonder if the problem here is instead that Webkit short-circuits simulated events and we don't:

https://github.com/WebKit/webkit/blob/8ff2aef949f9fb1857fe85990d8c5c831b99d872/Source/WebCore/dom/MouseRelatedEvent.cpp#L222-L223

At least some Speedometer tests trigger events by doing things like element.click().

Am I missing something?
Flags: needinfo?(bugs)
Hmm.  That might be something we can fix much more cheaply/easily than the real event case, yes...
(In reply to Boris Zbarsky [:bz] (still a bit busy) (if a patch has no decent message, automatic r-) from comment #21)
> Hmm.  That might be something we can fix much more cheaply/easily than the
> real event case, yes...

Yea, but this seems to be trending toward benchmarketing fixes, right?
I mean, I guess its useful if it moves the bottleneck to something else real we can fix.
(Reporter)

Comment 24

2 years ago
There are non-benchmarking use cases for things like element.click() no?

FWIW, what's happening here is that jQuery wraps the event in its own event-thing and uses a for-in loop to copy all properties from the event to its own object. That's where the offsetX/offsetY gets are coming from...

Fixing this will definitely be a win on Speedometer, it affects multiple tests.
> Yea, but this seems to be trending toward benchmarketing fixes, right?

I think click() is common enough that making that case not do stupid things is not just benchmarketing.  In particular, it can skip computing the offsets altogether, unlike real mouse events...
Ok.  I guess I didn't realize .click() was something used on real web sites. *mumbles something about c++*
(Assignee)

Comment 27

2 years ago
(In reply to Jan de Mooij [:jandem] from comment #20)
> Looking closer at Speedometer, I wonder if the problem here is instead that
> Webkit short-circuits simulated events and we don't:
> 
> https://github.com/WebKit/webkit/blob/
> 8ff2aef949f9fb1857fe85990d8c5c831b99d872/Source/WebCore/dom/
> MouseRelatedEvent.cpp#L222-L223

What does simulated mean in webkit?
Flags: needinfo?(bugs)
(Assignee)

Comment 28

2 years ago
But if other browsers return always 0 for offsetX/Y when using click() then we could try to do the same and get that spec'ed.
(Assignee)

Comment 29

2 years ago
Let's try to do at least the click() hack. I wonder in which all cases webkit/blink have it.
Not when creating/dispatching new MouseEvent explicitly in JS.
Assignee: nobody → bugs
(Assignee)

Comment 30

2 years ago
Posted file click() perf
(Assignee)

Updated

2 years ago
Summary: Accessing offsetX/offsetY on MouseEvents slower than in other engines → Accessing offsetX/offsetY on MouseEvents dispatched from click() is slower than in other engines
(Assignee)

Comment 31

2 years ago
Can't write wpt test for this since it can't send low level events.
This is trying to mimic blink's behavior. The spec bug is still open so tweaks may come.

Let's see what tryserver thinks about this
https://treeherder.mozilla.org/#/jobs?repo=try&revision=11888382da065122762f7425921929105fbbe9cb
Attachment #8850166 - Flags: review?(masayuki)
Comment on attachment 8850166 [details] [diff] [review]
positionless mouse events

> nsIntPoint
> UIEvent::GetLayerPoint() const
> {
>+  if (mEvent->mFlags.mIsPositionless) {
>+    return nsIntPoint(0, 0);
>+  }
>+
>   if (!mEvent ||
>       (mEvent->mClass != eMouseEventClass &&
>        mEvent->mClass != eMouseScrollEventClass &&
>        mEvent->mClass != eWheelEventClass &&
>        mEvent->mClass != ePointerEventClass &&
>        mEvent->mClass != eTouchEventClass &&
>        mEvent->mClass != eDragEventClass &&
>        mEvent->mClass != eSimpleGestureEventClass) ||

Hmm, like this usage, we cannot add common flags/data for pointing device events to MouseEventClassBase nor MouseEventClass (TouchEventClass doesn't inherit them).

>diff --git a/dom/events/test/test_bug1339758.html b/dom/events/test/test_bug1339758.html
>new file mode 100644
>--- /dev/null
>+++ b/dom/events/test/test_bug1339758.html
>@@ -0,0 +1,78 @@
>+<!DOCTYPE HTML>
>+<html>
>+<!--
>+https://bugzilla.mozilla.org/show_bug.cgi?id=1339758
>+-->
>+<head>
>+  <meta charset="utf-8">
>+  <title>Test for Bug 1339758</title>
>+  <script type="application/javascript" src="/tests/SimpleTest/SimpleTest.js"></script>
>+  <script type="text/javascript" src="/tests/SimpleTest/EventUtils.js"></script>
>+  <link rel="stylesheet" type="text/css" href="/tests/SimpleTest/test.css"/>
>+  <script type="application/javascript">
>+
>+  /** Test for Bug 1339758 **/
>+  var expectNonZeroCoordinates = false;
>+  SimpleTest.waitForExplicitFinish();
>+
>+  function testCoordinates(e) {
>+   var coordinateProperties =
>+     [ "screenX",
>+       "screenY",
>+       "clientX",
>+       "clientY",
>+       "offsetX",
>+       "offsetY",
>+       "movementX",
>+       "movementY",
>+       "layerX",
>+       "layerY",
>+       "x",
>+       "y" ];
>+    for (var i in coordinateProperties) {
>+      if (e[coordinateProperties[i]] != 0) {
>+        ok(expectNonZeroCoordinates, "Got at least some non-zero coordinate property: " + i);

Hmm, when this fails, it's difficult to distinguish what's tested. Could you add description with global variable before each test in runTests()?

>+        return;
>+      }
>+    }
>+    ok(!expectNonZeroCoordinates);

And I want description here too.

>+  }
>+
>+  function runTests() {
>+    document.getElementById("div_target").click();
>+    document.getElementById("a_target").focus();

focus()? not click()?

>+    sendKey("RETURN");
>+    document.getElementById("input_target").focus();
>+    sendKey("RETURN");
>+
>+    expectNonZeroCoordinates = true;
>+    // Test script created MouseEvents
>+    sendMouseEvent({ type: "click"}, document.getElementById("a_target"));
>+    sendMouseEvent({ type: "click"}, document.getElementById("input_target"));
>+
>+    // Test widget level mouse events
>+    synthesizeMouse(document.getElementById("a_target"), 2, 2, {});
>+    synthesizeMouse(document.getElementById("input_target"), 2, 2, {});
>+    SimpleTest.finish();
>+  }
>+
>+  SimpleTest.waitForFocus(runTests);
>+
>+
>+  </script>
>+</head>
>+<body>
>+<a target="_blank" href="https://bugzilla.mozilla.org/show_bug.cgi?id=1339758">Mozilla Bug 1339758</a>
>+<p id="display"></p>
>+
>+<div id="div_target" onclick="testCoordinates(event)">&nbsp;</div>
>+<a href="#" id="a_target" onclick="testCoordinates(event);">test link</a><br>
>+<input type="button" id="input_target" onclick="testCoordinates(event);" value="test button">

Or, event.target.id could be useful for the description to distinguish which test was running.

>diff --git a/widget/BasicEvents.h b/widget/BasicEvents.h
>--- a/widget/BasicEvents.h
>+++ b/widget/BasicEvents.h
>@@ -134,16 +134,19 @@ public:
>   bool mComposedInNativeAnonymousContent : 1;
>   // Set to true for events which are suppressed or delayed so that later a
>   // DelayedEvent of it is dispatched. This is used when parent side process
>   // the key event after content side, and may drop the event if the event
>   // was suppressed or delayed in contents side.
>   // It is also set to true for the events (in a DelayedInputEvent), which will
>   // be dispatched afterwards.
>   bool mIsSuppressedOrDelayed : 1;
>+  // Certain mouse events can be marked as positionless to return 0 from
>+  // coordinate related getters.
>+  bool mIsPositionless : 1;

Although, I don't like to add such event specific flag to EventFlags. But it's okay.
Attachment #8850166 - Flags: review?(masayuki) → review+
(Assignee)

Comment 33

2 years ago
> >+
> >+  function runTests() {
> >+    document.getElementById("div_target").click();
> >+    document.getElementById("a_target").focus();
> 
> focus()? not click()?
focus() and not click() to send RETURN to the right element.

> >+  bool mIsPositionless : 1;
> 
> Although, I don't like to add such event specific flag to EventFlags. But
> it's okay.
I wanted to put it to flags since that way we don't end up increase size of any Widget event class
(Assignee)

Comment 34

2 years ago

Comment 35

2 years ago
Pushed by opettay@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/bec20d74d93a
click() and similar can return 0 from various coordinate properties, r=masayuki

Comment 36

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/bec20d74d93a
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
I am seeing a 10% improvement on speedometer due to this landing. Awesome!
http://arewefastyet.com/#machine=17&view=single&suite=speedometer&start=1490146341&end=1490347719
Do we have a followup bug filed on offsetX for non-click() events?  The benchmark uses click(), which is all well and good, but real sites would have actual user clicks....
(Assignee)

Updated

2 years ago
Blocks: 1350373
Component: DOM → DOM: Core & HTML
Product: Core → Core
You need to log in before you can comment on or make changes to this bug.