speedof.me reports download speeds for Geckoview_example and Fenix to be ~40% to 50% of Chrome's - Moto G5
Categories
(Core :: Graphics, task, P3)
Tracking
()
People
(Reporter: acreskey, Unassigned)
References
(Depends on 1 open bug, Blocks 2 open bugs)
Details
(Keywords: perf:pageload)
I collected speedof.me download speed results for geckoview_example, Firefox Preview, and Chrome on two Android devices: the Pixel 3 and the reference phone, the Moto G5.
(Based on Kamyar's observations)
On the Moto G5, download speeds for Geckoview_example and Fenix are ~40% and ~50% of Chrome's, respectively.
On the Pixel 3, download speeds for Geckoview_example and Fenix are ~84% and ~94% of Chrome's, respectively.
Latency also appears to be lower on Chrome.
There is a short writeup on speedof.me regarding how the tests work:
It downloads progressively larger contiguous files until they take longer than 8 seconds to download. The timing for the last one is used.
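As a rough illustration of the procedure described above (not speedof.me's actual code), the loop can be sketched as follows. The function name, the `fetch_seconds` callback, and the sample sizes are hypothetical; only the 8 second threshold comes from the site's description.

```python
from typing import Callable, Sequence

THRESHOLD_SECONDS = 8.0  # from the site's description of the test


def measure_download_mbps(
    sizes_bytes: Sequence[int],
    fetch_seconds: Callable[[int], float],
    threshold: float = THRESHOLD_SECONDS,
) -> float:
    """Download progressively larger samples and report the last one's speed.

    fetch_seconds(size) is a hypothetical callback that downloads a file of
    `size` bytes and returns the elapsed time in seconds (must be positive).
    """
    if not sizes_bytes:
        raise ValueError("need at least one sample size")
    mbps = 0.0
    for size in sizes_bytes:
        elapsed = fetch_seconds(size)
        mbps = (size * 8) / elapsed / 1_000_000  # bits per second -> Mbps
        if elapsed > threshold:
            break  # this sample took long enough, so its timing is the result
    return mbps


# Simulated link moving 5,000,000 bytes per second, for illustration only.
sizes = [1_000_000, 8_000_000, 32_000_000, 64_000_000, 128_000_000]
print(measure_download_mbps(sizes, lambda size: size / 5_000_000))
```

In this simulation the fourth sample is the first to exceed the threshold, so the largest sample is never requested.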
I also ran some tests where I modified network preferences.
Increasing network.http.max-connections doesn't appear to improve Gecko download speed.
This makes sense: from looking at what the site does, it downloads only a few resources, but they are quite large.
I tried increasing other prefs that I thought might impact this:
network.http.max-persistent-connections-per-server
network.http.spdy.default-hpack-buffer (set to 4k on Android)
network.http.spdy.push-allowance (set to 32k on Android)
These didn't appear to affect the reported speed.
The resources I saw in the dev tools did come through via http/2.
If anyone knows of other configurations that could impact this, let me know and I'm happy to try them.
Reporter
Comment 1•5 years ago
This is a simpleperf capture from the Moto G5 while the test is running:
http://bit.ly/2W44Tt4
Comment 2•5 years ago
I just did 4 runs on my Moto G5 with default settings.
GVE:
Download: 39M, 51M, 46M, 47M
Upload: 38M, 41M, 39M, 41M
Fenix:
Download: 53M, 47M, 48M, 56M
Upload: 44M, 39M, 43M, 43M
Chrome:
Download: 54M, 62M, 59M, 60M
Upload: 55M, 36M, 56M, 54M
It looks like we are still slower than Chrome, but not by that much?
Comment 3•5 years ago
Looks like the G5 is close to where the P3 was - 90%-ish of Chrome on download, though maybe 75% on upload. I think a new profile would also be in order, and a re-check on a P3 (and maybe a few more runs; the noise per run is high, so 4 runs has a wide error bar). Thanks Sean!
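To put a rough number on the noise, the download runs from comment 2 can be summarized with a mean and a standard error. This is only a sketch using Python's standard library; the helper name is made up, and four samples per browser are too few for a firm conclusion.

```python
from math import sqrt
from statistics import mean, stdev

# Download results from comment 2 (Moto G5, values as reported there).
runs = {
    "GVE": [39, 51, 46, 47],
    "Fenix": [53, 47, 48, 56],
    "Chrome": [54, 62, 59, 60],
}


def summarize(samples):
    """Return (mean, standard error of the mean) for a list of runs."""
    return mean(samples), stdev(samples) / sqrt(len(samples))


chrome_mean, _ = summarize(runs["Chrome"])
for name, samples in runs.items():
    m, sem = summarize(samples)
    print(f"{name}: {m:.1f} +/- {sem:.1f}, {m / chrome_mean:.0%} of Chrome")
```

More runs shrink the standard error roughly with the square root of the run count, which is why a few extra runs per browser help here.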
Reporter
Comment 4•5 years ago
Sean, I wonder if the network you are using is leading to these results?
I re-ran this test and my results still match Kamyar's comment 1:
GV_E (mozilla-central.nightly.2019.06.24)
Download (Mbps): 40.0, 39.9, 39.6
Upload (Mbps): 20.0, 19.4, 19.4
Chrome 74
Download (Mbps): 92.9, 97.8, 94.85
Upload (Mbps): 21.7, 20.7, 20.59
Do you have a faster network that you can test on?
Comment 5•5 years ago
FiOS 75/75 (but usually measures around 90/90), Moto G5, 3 feet from AP:
GVE: (local build): down: 33, Up 40. Odd, down is reliably less than up.
Chrome: down 91, Up 80
Reporter
Comment 6•5 years ago
Right, I'm on TekSavvy 250Mbps down, 20Mbps up (hitting max upload with both browsers)
Tests were also done about 3 feet from the access point.
Comment 7•5 years ago
I was on the Toronto office Wi-Fi. I don't know where the AP is, but I don't see anything AP-like within 3 feet of me.
I wonder if the difference is caused by our geographic locations. My tests always connected to 63.245.212.198 (New York 1), with latency somewhere between 20ms and 50ms.
A couple more runs:
Chrome 75
Download (Mbps): 46.81 (Max 55.03), 47.39 (Max 56.08), 53.78 (Max 63.73)
Upload (Mbps): 73.89 (Max 74.24), 70.09 (Max 85.11), 51.27 (Max 71.12)
GVE (mozilla-central.nightly.2019.06.24)
Download (Mbps): 48.92 (Max 51.52), 44.8 (Max 53.39), 45.63 (Max 52.95)
Upload (Mbps): 39.62 (Max 41.29), 40.83 (Max 45.28), 42.92 (Max 44.83)
My tests show we are at roughly 90% of Chrome for download and roughly 50% of Chrome for upload.
Comment 8•5 years ago
My latency is 17ms in Chrome and 30-110ms in Firefox, and latency strongly affects TCP speed. I'm now seeing 30-35Mbps down, 20-25Mbps up. Test server: "unknown" (?)
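The link between latency and TCP speed can be sketched with the usual bound that a single connection cannot move more than one receive window per round trip. The 256 KiB window below is an assumed value for illustration, not something measured on these devices, and the function name is made up.

```python
def max_tcp_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound on single-connection throughput: one window per round trip."""
    if rtt_ms <= 0:
        raise ValueError("rtt_ms must be positive")
    bits_per_second = window_bytes * 8 / (rtt_ms / 1000.0)
    return bits_per_second / 1_000_000


WINDOW = 256 * 1024  # assumed receive window, in bytes

for rtt in (17, 30, 110):  # latencies mentioned in this comment, in ms
    bound = max_tcp_throughput_mbps(WINDOW, rtt)
    print(f"RTT {rtt} ms -> at most {bound:.1f} Mbps")
```

The bound scales inversely with round-trip time, so the same window supports far less throughput at 110ms than at 17ms.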
We're strongly gated in Content on GFX: https://perfht.ml/2FF0hUy - 59% in Paint(). The Blob constructor is using one chunk in the middle(?) for about 5%; another 7 in between gfx chunks in OnDataAvailable, mostly memcpy.
SocketThread is using a LOT of time doing AES_Decrypt() - a total of over 55% in code called from WriteSegment/WritePipeSegments, almost all of it in AES_Decrypt - and another 20% in AES_Encrypt. m_kato's patches for AES on arm32 might help a TON here.
Comment 9•5 years ago
Markus - what are our options on gfx here? Or is the page just stupid?
m_kato: what sort of speedup do you expect for this set of calls shown in the profile for SocketThread on a Moto G5 (Arm32, 8 symmetric cores A53 1.4GHz == Qualcomm MSM8937 Snapdragon 430 (28 nm))? https://www.gsmarena.com/motorola_moto_g5-8454.php
Comment 10•5 years ago
(In reply to Randell Jesup [:jesup] (needinfo me) from comment #8)
We're strongly gated in Content on GFX: https://perfht.ml/2FF0hUy - 59% in Paint().
This is the drawing of the glow effect on the speedometer dial that they display during the test. They're using an SVG filter which does a blur. I wouldn't necessarily call it stupid, but it's certainly an expensive effect. Once we have WebRender on Android and complete SVG filter support in WebRender (bug 1409486), this should become better.
You could try checking if the score improves if you have an override CSS style that disables the effect. I'm not sure how to achieve that, though.
Comment 11•5 years ago
The patch from bug 1152625 doesn't seem to make a huge difference; it's probably the wrong cipher, or gfx is the blocker.
Comment 12•5 years ago
Leaf nodes (in AES_Decrypt/Encrypt) are rijndael_encryptBlock128(), gcm_HashMult_sftw32 and things called from it. sendto() is ~5%, recvfrom is around 4%.
Comment 13•5 years ago
The site also provides an API to run the tests. The free version allows only a limited number of requests, but it doesn't have that graphic effect.
https://speedof.me/api/doc/sample_advanced.html
So I did a few more runs!
GVE:
Download (Mbps): 45.35, 41.2, 47.55, 45.99
Upload (Mbps): 42.33, 38.59, 39.96, 41.43
Jitter (ms): 39, 45, 27, 40
Latency (ms): 26, 38, 46, 50
Profile: https://perfht.ml/2X8KlQz
Chrome 75:
Download (Mbps): 39.47, 46.93, 42.54, 49.43
Upload (Mbps): 77.06, 74.83, 73.45, 76.22
Jitter (ms): 24, 27, 17, 38
Latency (ms): 34, 23, 24, 23
Comment 14•5 years ago
Also, GCM code shows up in the profile. I filed bug 1562548 for GCM on aarch32.
Comment 15•5 years ago
(In reply to Randell Jesup [:jesup] (needinfo me) from comment #9)
Markus - what are our options on gfx here? Or is the page just stupid?
m_kato: what sort of speedup do you expect for this set of calls shown in the profile for SocketThread on a Moto G5 (Arm32, 8 symmetric cores A53 1.4GHz == Qualcomm MSM8937 Snapdragon 430 (28 nm))? https://www.gsmarena.com/motorola_moto_g5-8454.php
I don't have a Moto G5 now, but the chip supports AES. Some vendors may disable it in the Linux kernel's aarch32 mode, though. Browsing file:///proc/cpuinfo shows the current support. (To see the aarch32 support, you have to use a 32-bit program.)
Comment 16•5 years ago
I don't have a Moto G5 now, but the chip supports AES. Some vendors may disable it in the Linux kernel's aarch32 mode, though. Browsing file:///proc/cpuinfo shows the current support. (To see the aarch32 support, you have to use a 32-bit program.)
bogomips: 38.0 half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm aes pmull sha1 sha2 crc32
Reporter
Comment 17•5 years ago
It appears that we are indeed limited by the SVG filter.
Based on comment 8 and comment 10 I made nsSVGIntegrationUtils::PaintFilter a no-op and the reported download bandwidth almost doubled.
Baseline gv_example (local release build, Moto G5)
Download (Mbps): 37.1, 37.1, 37.8
Upload (Mbps): 19.8, 19.8, 17.7
gv_example (local release build, Moto G5, but skipping nsSVGIntegrationUtils::PaintFilter):
https://searchfox.org/mozilla-central/rev/11712bd3ce7454923e5931fa92eaf9c01ef35a0a/layout/svg/nsSVGIntegrationUtils.cpp#1057
Download (Mbps): 62.9, 63.0, 62.4
Upload (Mbps): 20.4, 20.3, 20.2
My home network is 250Mbps down and 20Mbps upload so I'm near the upload limit.
On Chrome I'm seeing ~100Mbps download.
I have a preference for independently-verified results, so if someone else who can repro the issue would like to repeat the test, that would be great.
But this makes me think:
- Is this an isolated problem? An expensive filter running during their test, or could this impact pageload of real sites? I'll start a tp6m job to test this hypothesis.
- It's great that we may be getting decryption optimizations out of this. But is there anything else we can do about the graphics side prior to WebRender + SVG?
Comment 18•5 years ago
(In reply to Randell Jesup [:jesup] (needinfo me) from comment #16)
I don't have a Moto G5 now, but the chip supports AES. Some vendors may disable it in the Linux kernel's aarch32 mode, though. Browsing file:///proc/cpuinfo shows the current support. (To see the aarch32 support, you have to use a 32-bit program.)
bogomips: 38.0 half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm aes pmull sha1 sha2 crc32
Hmm, I guess that the Moto G5's kernel doesn't return valid features via AT_HWCAP2. In bug 1562611, I will add telemetry for it and change the CPU detection in arm.h to use cpu-features on Android.
Reporter
Comment 19•5 years ago
This is a performance comparison of baseline (left) vs a build where nsSVGIntegrationUtils::PaintFilter() does nothing (Moto G5 - arm7, geckoview_example PGO):
(Please ignore the Pixel 2 results; it's not feasible to get enough retries on that device at the moment.)
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=84cc921108a44b6481e891b7b1bdfb05198bcfe6&newProject=try&newRevision=fdc0451a2a203c35b53e2ae35d689c7b56227830&framework=10
Thoughts:
• There are still some running jobs, but this one might show a real change because the SVG filter is painted 3 times during the test:
6% loadtime, 5% fnbp improvement on tp6m-amazon-search-geckoview over 20 runs
• This looked like a ~7% improvement on tp6m-bbc-geckoview except... from my logging the filter isn't even used in this page.
• When I test these PGO builds locally the reported download rate increases from ~40Mbps to ~62Mbps with the SVG filter disabled.
Comment 20•5 years ago
(In reply to Makoto Kato [:m_kato] from comment #18)
Hmm, I guess that the Moto G5's kernel doesn't return valid features via AT_HWCAP2. In bug 1562611, I will add telemetry for it and change the CPU detection in arm.h to use cpu-features on Android.
We need a workaround like https://bugs.chromium.org/p/boringssl/issues/detail?id=46. BoringSSL reads cpuinfo if AT_HWCAP2 returns 0.
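The fallback described in the BoringSSL issue can be sketched roughly as below. This is an illustration in Python, not the actual NSS or BoringSSL code; the function names are hypothetical, the bit value assumes the 32-bit ARM Linux definition of HWCAP2_AES (bit 0), and the "Features" label on the sample line is assumed.

```python
HWCAP2_AES = 1 << 0  # assumed: 32-bit ARM Linux value of HWCAP2_AES


def cpuinfo_features(cpuinfo_text: str) -> set:
    """Collect the flags listed on 'Features' lines of /proc/cpuinfo text."""
    features = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "features":
            features.update(value.split())
    return features


def has_aes(hwcap2: int, cpuinfo_text: str) -> bool:
    """Trust AT_HWCAP2 when it is non-zero, otherwise fall back to cpuinfo."""
    if hwcap2 != 0:
        return bool(hwcap2 & HWCAP2_AES)
    return "aes" in cpuinfo_features(cpuinfo_text)


# Flags as posted in comment 16 (the 'Features' label is assumed).
sample = (
    "Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva "
    "idivt vfpd32 lpae evtstrm aes pmull sha1 sha2 crc32"
)
print(has_aes(0, sample))
```

The cpuinfo text is only consulted when the kernel reports nothing at all, so a valid non-zero AT_HWCAP2 value still takes precedence.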
Comment 21•5 years ago
Should we move this bug to the "Core::Networking: HTTP" Bugzilla component?
Reporter
Comment 22•5 years ago
Chris, from what I can tell the graphics in the content process is the biggest bottleneck (although great to see improvements to encryption/decryption being made as well).
I'm not sure how the bug should be moved based on that.
Comment 23•5 years ago
Perhaps make this a bug on SVG and spin off a clone for the networking issue (assigned to m_kato). Or make it a meta bug (especially if we think there are more than these 2 issues) and spin off 2 clones.
Reporter
Comment 24•5 years ago
Moved this bug to Core::Graphics.
In comment 10 and comment 17 we saw that the SVG blur filter is significantly reducing the reported download bandwidth.
Not sure if anything can be done outside of WebRender.
Reporter
Comment 25•5 years ago
I had meant to try this earlier:
I enabled WebRender on the Moto G5 but the performance of this test did not improve.
Here's a profile:
https://perfht.ml/2o7OXuN
A lot of time in the content process in mozilla::dom::XMLHttpRequestMainThread::AppendToResponseText and nscstring_fallible_append_utf16_to_utf8_impl.
On the socket thread, very busy in GCM_DecryptUpdate.
Reporter
Comment 26•5 years ago
In Bug 1576617 we discussed how speedof.me is using random text for the large files that are download tested.
Not the most realistic scenario. I've explained and asked them to change the XHRs to "arraybuffer" mode but they did not respond.
Nonetheless, geckoview example is reporting ~35Mbps download while Chrome on the same device is at around 85Mbps.
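To illustrate why the response mode matters, here is a rough Python analogy (not Gecko code): accumulating a text response means decoding every chunk as it arrives, while a binary response only needs the bytes appended. The function names, payload, and chunk size are made up for the example.

```python
import codecs


def accumulate_as_text(chunks) -> str:
    """Decode each UTF-8 chunk incrementally, as a text-mode response must."""
    decoder = codecs.getincrementaldecoder("utf-8")()
    parts = [decoder.decode(chunk) for chunk in chunks]
    parts.append(decoder.decode(b"", final=True))  # flush any pending bytes
    return "".join(parts)


def accumulate_as_bytes(chunks) -> bytes:
    """Append raw bytes without inspecting them, as a binary response can."""
    buffer = bytearray()
    for chunk in chunks:
        buffer.extend(chunk)
    return bytes(buffer)


payload = ("random text " * 1000).encode("utf-8")
chunks = [payload[i:i + 4096] for i in range(0, len(payload), 4096)]
assert accumulate_as_text(chunks) == payload.decode("utf-8")
assert accumulate_as_bytes(chunks) == payload
```

Both paths end up with the same content, but the text path does per-chunk conversion work on top of the copy.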
Comment 27•5 years ago
When WebRender is turned off, filter processing takes a lot of time.
Comment 28•4 years ago
This seems to work now!