speedometer v2 VueJS async subtests take about twice as long in Firefox vs. Chrome (mostly running JS)

NEW
Unassigned

Status


enhancement
P3
normal
2 years ago
4 months ago

People

(Reporter: dholbert, Unassigned)

Tracking

(Depends on 1 bug, Blocks 1 bug, {perf})

Trunk
Points:
---

Firefox Tracking Flags

(firefox57 wontfix, firefox58 wontfix, firefox59 ?)

Details

(Whiteboard: [qf:p3:responsiveness])

Reporter

Description

2 years ago
Here's AWFY for Speedometer v2 "VueJS-TodoMVC-Adding100Items-async" subtest:
https://arewefastyet.com/#machine=36&view=single&suite=speedometer-misc&subtest=VueJS-TodoMVC-Adding100Items-async

As you can see there, typical values for Firefox are ~75ms, whereas typical values for Chrome are ~39ms.  So we're taking nearly twice as long as Chrome (about 36ms more per run).

Here's a profile of latest nightly, with Acer reference hardware, zoomed to this "Adding100Items-async" subtest (with an 84ms score for the subtest):
https://perf-html.io/public/3265485ddea33710e1a98eda2af95a2ee5c77f5e/calltree/?hiddenThreads=&range=1.0308_1.1155&thread=3&threadOrder=0-2-3-4-1-5

This profile shows we're almost entirely in the JS-engine. (And IIRC, the durations reported in the Call Tree view are all undercounting by 50%, due to a profiler/perf.html bug.)
Reporter

Updated

2 years ago
Summary: speedometer v2 "VueJS-TodoMVC-Adding100Items-async" subtest takes about twice as long in Firefox, vs. Chrome (and mostly running JS) → speedometer v2 "VueJS-TodoMVC-Adding100Items-async" subtest takes about twice as long in Firefox vs. Chrome (mostly running JS)
Reporter

Comment 1

2 years ago
Actually, I'll broaden this bug slightly -- we take a good bit longer than Chrome on *all* async subtests for VueJS.

"Profile" links below are already zoomed in to the three "async" tests. (They're shown as large black bars if you view the full profile, with hover-text "UserTiming", since I applied ehsan's speedometer-benchmark-patch from bug 1373723.)

* speedometer-misc-VueJS-TodoMVC-Adding100Items-async (discussed in comment 0):
AWFY: https://arewefastyet.com/#machine=36&view=single&suite=speedometer-misc&subtest=VueJS-TodoMVC-Adding100Items-async
Typical recent score from AWFY:  Firefox: 75ms, Chrome: 39ms
Profile:
https://perf-html.io/public/3265485ddea33710e1a98eda2af95a2ee5c77f5e/calltree/?hiddenThreads=&range=1.0308_1.1155&thread=3&threadOrder=0-2-3-4-1-5

* speedometer-misc-VueJS-TodoMVC-CompletingAllItems-async:
https://arewefastyet.com/#machine=36&view=single&suite=speedometer-misc&subtest=VueJS-TodoMVC-CompletingAllItems-async
Typical recent score from AWFY:  Firefox: 58ms, Chrome: 26ms
Profile:
https://perf-html.io/public/3265485ddea33710e1a98eda2af95a2ee5c77f5e/calltree/?hiddenThreads=&range=1.1347_1.2050&thread=3&threadOrder=0-2-3-4-1-5

* speedometer-misc-VueJS-TodoMVC-DeletingAllItems-async:
https://arewefastyet.com/#machine=36&view=single&suite=speedometer-misc&subtest=VueJS-TodoMVC-DeletingAllItems-async
Typical recent score from AWFY:  Firefox: 20ms, Chrome: 7ms (though Firefox has 2 recent spikes above 60ms on AWFY)
Profile:
https://perf-html.io/public/3265485ddea33710e1a98eda2af95a2ee5c77f5e/calltree/?hiddenThreads=&range=1.2234_1.2603&thread=3&threadOrder=0-2-3-4-1-5
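
To put the three AWFY scores above side by side, here is a small Python sketch that computes the Firefox-to-Chrome ratio for each async subtest. The numbers are the "typical recent" values quoted above; the dictionary and function names are only for illustration and aren't part of AWFY or Speedometer.

```python
# Typical recent AWFY scores quoted above, in milliseconds: (Firefox, Chrome).
SCORES_MS = {
    "Adding100Items-async": (75, 39),
    "CompletingAllItems-async": (58, 26),
    "DeletingAllItems-async": (20, 7),
}


def slowdown(firefox_ms, chrome_ms):
    """Return how many times longer Firefox takes than Chrome."""
    return firefox_ms / chrome_ms


if __name__ == "__main__":
    for name, (fx, cr) in SCORES_MS.items():
        print(f"{name}: {slowdown(fx, cr):.2f}x Chrome, {fx - cr}ms extra")
```

By this measure all three subtests land at roughly 1.9x to 2.9x Chrome's time.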

I believe the intent of the "async" subtests is that they're supposed to measure a refresh-driver-tick, but the profile snapshots shown
Summary: speedometer v2 "VueJS-TodoMVC-Adding100Items-async" subtest takes about twice as long in Firefox vs. Chrome (mostly running JS) → speedometer v2 VueJS async subtests take about twice as long in Firefox vs. Chrome (mostly running JS)

Comment 2

2 years ago
Please note that the Acer machine is usually fairly useless for Speedometer profiling because of its very coarse 2ms sampling interval!  As you can see from these profiles, the sampling rate is so low that it is extremely hard to say where the time is going, since you have a total of only 16 interesting samples.

I usually profile these on Linux with a sampling interval of 0.1ms in order to get much better results.  Such short sampling intervals are easily achievable on OSX as well with the Gecko Profiler.  On Windows I have tried to find a native profiler that goes below 1ms, to no avail.
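
As a rough illustration of why the sampling interval matters, the sketch below estimates an upper bound on how many samples fit in a measured region. It assumes one sample per interval with no dropped samples, which real profilers don't guarantee, so actual counts can be lower. The 84ms window is the Adding100Items-async score from comment 0, and the function name is made up for this example.

```python
def max_samples(window_us, interval_us):
    """Upper bound on the samples a profiler can take in a window.

    Both arguments are in microseconds, to keep the arithmetic in integers.
    Assumes one sample per interval and no dropped samples.
    """
    if interval_us <= 0:
        raise ValueError("interval_us must be positive")
    return window_us // interval_us


if __name__ == "__main__":
    window_us = 84_000  # the 84ms Adding100Items-async region from comment 0
    for interval_us in (2_000, 1_000, 100):  # 2ms, 1ms, 0.1ms
        limit = max_samples(window_us, interval_us)
        print(f"{interval_us / 1000}ms interval: at most {limit} samples")
```

Going from a 2ms interval to 0.1ms raises that ceiling by a factor of 20.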
Whiteboard: [qf] → [qf:investigate:p1]
Reporter

Updated

2 years ago
Flags: needinfo?(dholbert)
Comment hidden (obsolete)
Comment hidden (obsolete)
Reporter

Comment 5

2 years ago
Here's a profile taken in the latest 64-bit Linux Nightly (2017-07-10) on fast hardware (Lenovo ThinkStation), with a 0.1ms sampling interval, running the VueJS-TodoMVC subtest 10 times, with ehsan's UserTiming benchmark-patch to label the async regions (as before). I triggered each test by manually clicking "Run()" on InteractiveRunner.html.
PROFILE: http://perfht.ml/2tHafxv

In that profile, zooming into just the first iteration, we can see these values (from UserTiming black bars):
 Adding100Items:     121.55ms
 CompletingAllItems: 127.93ms
 DeletingAllItems:   51.41ms
https://perf-html.io/public/78681b767181898509059170943d9dc569ce50af/calltree/?hiddenThreads=&range=0.9125_1.3821&thread=4&threadOrder=0-2-3-4-1

Focusing on that first Adding100Items for now (the very first UserTiming in the profile, 121.55ms), I see:
https://perf-html.io/public/78681b767181898509059170943d9dc569ce50af/calltree/?hiddenThreads=&range=0.9125_1.3821~0.9273_1.0488&thread=4&threadOrder=0-2-3-4-1
 - A 10.9ms HTMLInputElement::Focus call, which is mostly a style flush (& frame construction in particular)
 - A 16.4ms nsRefreshDriver::Tick call (which is about half reflow and half painting; around 8ms of each)
 - The rest of the time (nearly 100ms), it looks like we're running JavaScript.

And for the first CompletingAllItems (the second UserTiming in the profile, 127.93ms), I see:
https://perf-html.io/public/78681b767181898509059170943d9dc569ce50af/calltree/?hiddenThreads=&range=0.9125_1.3821~1.0766_1.2045&thread=4&threadOrder=0-2-3-4-1
 - A 25.7ms HTMLInputElement::Focus call, which is mostly a style flush (though not all frame construction)
 - A 13.8ms nsRefreshDriver::Tick call (which is 4.9ms reflow followed by 6.6ms painting)
 - And the rest of the time (nearly 90ms), it looks like we're running JavaScript.
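
The "rest of the time" figures above are just the region total minus the two native calls. A minimal Python sketch of that subtraction follows. It assumes the Focus and Tick calls don't overlap and that everything left over is JS, which the profile suggests but this arithmetic can't prove; the helper name is hypothetical.

```python
def remaining_ms(total_ms, *native_ms):
    """Time left in a UserTiming region after subtracting the listed native calls."""
    return round(total_ms - sum(native_ms), 2)


if __name__ == "__main__":
    # (region total, HTMLInputElement::Focus, nsRefreshDriver::Tick), all in ms,
    # taken from the two breakdowns above.
    adding = remaining_ms(121.55, 10.9, 16.4)
    completing = remaining_ms(127.93, 25.7, 13.8)
    print(f"Adding100Items: ~{adding}ms outside Focus/Tick")
    print(f"CompletingAllItems: ~{completing}ms outside Focus/Tick")
```

That leaves roughly 94ms and 88ms respectively, in line with the "nearly 100ms" and "nearly 90ms" estimates.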
Reporter

Comment 6

2 years ago
...and for the first DeletingAllItems (third UserTiming in the profile, 51.41ms), I see:
https://perf-html.io/public/78681b767181898509059170943d9dc569ce50af/calltree/?hiddenThreads=&range=0.9125_1.3821~1.2373_1.2887&thread=4&threadOrder=0-2-3-4-1
 - A 4.8ms nsRefreshDriver::Tick call, which is mostly (3.3ms) painting.
 - Negligible time spent in restyle/reflow (less than 1ms each).
 - 5.3ms spent in nsINode::RemoveChild
 - And the rest of the time, we seem to be running JS, I think.

The profile has 9 more iterations of the test, for additional information if needed, but I haven't looked at those.
Reporter

Comment 7

2 years ago
jonco, perhaps you (or someone from the JS team) could take a look at these profiles? (focusing on the UserTiming regions for the purposes of this bug)

As shown by the arewefastyet.com graphs above, we take about twice as much time as Chrome on the VueJS async subtests, and my profiles in comment 5 and comment 6 show that we're spending most of the benchmark's time running JS (as far as I can tell).
Flags: needinfo?(jcoppeard)
Jan, do you know who would be a good person to look at this?

From what I can see in the timeline view, we're spending most of the time running in Baseline, with only a little in Ion.
Flags: needinfo?(jcoppeard) → needinfo?(jdemooij)
I'll look into this after bug 1373672 has been fixed as it might be related.
Depends on: 1373672
Depends on: 1389159
Keywords: perf
Priority: -- → P3
Reporter

Comment 10

2 years ago
FWIW -- I just looked at the graph, and it seems like things are a bit better.  The graph has stayed mostly flat, but it looks like we added a new (better) configuration shortly after I filed this bug ("Firefox (Ion PGO)"), and it's faster than our previous best configuration (not surprisingly).

Our Ion-PGO test durations here are in the 60-65ms range, and Chrome's are around 40ms (AWFY's most recent Chrome measurement was 62.85ms, but that's just a single point and probably a fluke).  So our best configuration is now taking roughly 50% more time than Chrome here, rather than twice as much time (100% more).
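
For anyone checking the "50% more" vs. "100% more" wording, here is a small Python sketch of the percentage. The inputs are the approximate AWFY readings quoted in this bug, and the function name is only illustrative.

```python
def percent_more(ours_ms, theirs_ms):
    """Extra time we take relative to theirs, as a percentage."""
    return (ours_ms - theirs_ms) / theirs_ms * 100


if __name__ == "__main__":
    chrome_ms = 40
    for firefox_ms in (60, 65, 80):
        extra = percent_more(firefox_ms, chrome_ms)
        print(f"{firefox_ms}ms vs {chrome_ms}ms: {extra:.1f}% more time")
```

With Chrome at 40ms, the 60-65ms range works out to 50% to 62.5% more time, while 80ms would be the full 100% more.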
Reporter

Comment 11

2 years ago
(Sorry, I should've said -- Comment 10 is RE "VueJS-TodoMVC-Adding100Items-async", for this graph:
  https://arewefastyet.com/#machine=36&view=single&suite=speedometer-misc&subtest=VueJS-TodoMVC-Adding100Items-async
On the other hand, we still seem to take ~twice as long as Chrome (and PGO doesn't help us much) on the "CompletingAllItems" subtest, generally.)
Bug 1389159 might help a bit. Clearing NI for now as I have some other things on my plate.
Flags: needinfo?(jdemooij)
Reporter

Updated

4 months ago
Whiteboard: [qf:investigate:p1] → [qf:p3:responsiveness]
Reporter

Comment 14

4 months ago

FWIW, the speedometer-specific awfy page is now here: https://arewefastyet.com/win10/speedometer?numDays=60 , and there's a section there for VueJS-TodoMVC/Adding100Items/Async that still shows our time measurements being 1.5-2x Chrome's. (their recent measurements are in the 24-25ms range, and ours are in the 40-45ms range.)
