Closed Bug 857359 Opened 12 years ago Closed 6 years ago

Startup page load times 3 or 4 time slower than chrome or stock on eideticker nytimes test

Categories

(Firefox for Android Graveyard :: General, defect, P5)

x86_64
Linux
defect

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: wlach, Unassigned)

References

(Depends on 2 open bugs, )

Details

(Keywords: meta, Whiteboard: [platform-rel-NYTimes])

Attachments

(4 files)

So I finally got around to adding a cross-browser page load test to eideticker (and displaying the results on the dashboard), which simply tries to load a copy of the nytimes site from the local network and display the result. Unfortunately the results make us look really bad: http://eideticker.wrla.ch/#/samsung-gn/nytimes-load/timetostableframe We seem to be between 3-4x slower than the competition (latest chrome and stock on a galaxy nexus running jelly bean). I think most of the difference is in either loading resources over the network and rendering them, but I'm not totally sure. It looks like we have at least the page background color loaded around the 7 second mark (out of an 11 second runtime). If people need assistance reproducing the setup we used for eideticker I can provide it. I suspect simply serving a copy of any website over the local network would produce similar behaviour though.
Is this from a cold start for Fennec? Webkit browsers might have an advantage here if their libraries are already loaded in memory.
Pageload performance comes up in Google Play Store comments quite a bit. AaronMT also started an informal feedback thread in Yammer, and pageload speed was cited as a reason someone didn't use Fennec.
what kats said. Dumb questions: What does 'first stable frame' mean, and why does optimization for this matter?
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #1) > Is this from a cold start for Fennec? Webkit browsers might have an > advantage here if their libraries are already loaded in memory. These tests are run from the state of the process not running. Though of course if some other app has a shared library open then I assume that would impact things. It might be interesting to do another test which tests opening a new page when the app is already running. I could file another bug for that if it would be interesting/helpful. (In reply to Doug Turner (:dougt) from comment #3) > what kats said. Dumb questions: What does 'first stable frame' mean, and > why does optimization for this matter? This metric is actually explained on the dashboard. :) "The time to the first frame of the capture where the image is stable (i.e. mostly unchanging). This is a startup metric that indicates roughly when things have stopped loading. Lower values are better."
For those unfamiliar with eideticker, note that you can click on any of the data points on the graph, then click on the play control that will appear on the right to see the page load: the difference is striking. Would tcpdumps be helpful?
tracking-fennec: --- → ?
to confirm, this is loading data over wifi from an internal webserver hosting the data on the same network as the wifi? the chrome and fennec tests are on the same network, just different devices, right?
(In reply to Joel Maher (:jmaher) from comment #6) > to confirm, this is loading data over wifi from an internal webserver > hosting the data on the same network as the wifi? > > the chrome and fennec tests are on the same network, just different devices, > right? Correct on all counts.
(In reply to William Lachance (:wlach) from comment #4) > It might be interesting to do another test which tests opening a new page > when the app is already running. I could file another bug for that if it > would be interesting/helpful. I think that would be interesting and possibly helpful.
(In reply to Geoff Brown [:gbrown] from comment #8) > (In reply to William Lachance (:wlach) from comment #4) > > It might be interesting to do another test which tests opening a new page > > when the app is already running. I could file another bug for that if it > > would be interesting/helpful. > > I think that would be interesting and possibly helpful. Filed bug 860790 for this.
tracking-fennec: ? → +
(In reply to William Lachance (:wlach) from comment #9) > (In reply to Geoff Brown [:gbrown] from comment #8) > > (In reply to William Lachance (:wlach) from comment #4) > > > It might be interesting to do another test which tests opening a new page > > > when the app is already running. I could file another bug for that if it > > > would be interesting/helpful. > > > > I think that would be interesting and possibly helpful. > > Filed bug 860790 for this. Did so. Results seem to indicate that the problems are occurring after the application is launched: http://eideticker.wrla.ch/#/samsung-gn/nytimes-load-poststartup/timetostableframe
See Also: → 867690
I just added request logging to eideticker for these sorts of tests and did one sample on the galaxy nexus for the page-open-after-load test (nytimes-load-poststartup): http://eideticker.wrla.ch/httplogs/http-log-1368045939.61.json It looks like we're requesting the main page after 0.7714719772338867 seconds (there is probably a certain amount of latency between this action being triggered and firefox registering the event... up to a few hundred ms). So no *huge* surprises there. What is surprising is that it's only 3 seconds later that we're requesting the next resource. I wonder what's going on?
Blocks: 807322
Still an issue?
(In reply to Aaron Train [:aaronmt] from comment #12) > Still an issue? Yup, we are still substantially slower. http://eideticker.wrla.ch/#/samsung-gn/nytimes-load-poststartup/timetostableframe
filter on [mass-p5]
Priority: -- → P5
its not clear that this is going to be core networking.. but let's poke at it in some more helpful ways comment 11 seems like something to focus on - I don't think I'm in a good spot to pick this up personally right now (maybe valentin or bagder could), but the first set of hypothesis I would explore would be 1] necko gets the channel open for resource-2 from the dom but is slow to dispatch it for some reason (potentially threading related if it needs to ping pong to the main thread).. perhaps that can be optimized 2] the dom is not quickly consuming resource-1 (or it isn't being delivered to it quickly) and resoruce-2 is dependent on that happening 3] the dom generates resource-2 channel open but that event doesn't get to gecko for a long time due to threading and event queue issues involved in parsing the rest of resource-1 4] some other background resources are coming along and pushing the real NYT work aside. This seems unlikely, but might as well put it on the brainstorming board. an http_log from an effected load that shows the 3 second delay would be key in ruling #1 in or out (and shedding light on #4). https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging -- bonus points for indicating which urls constitute resource-1 and resource-2
That sounds like ni :wlach for an http log.
Flags: needinfo?(wlachance)
I've attached the NSPR log, as well as a SPS profile and Visual Event Tracer logs from loading the cached nytimes page. This was done on a Nexus 10. I am not sure if we see the 3s problem described above, but loading is definitely slower compared to Chrome on the same device.
I forgot, I think you need :froydnj's modified about:timeline addon in order to view the visual event trace. You can build it from here[0] or install an xpi I rolled from here[1] [0] https://github.com/froydnj/about-timeline [1] http://people.mozilla.org/~jwillcox/timeline.xpi
The event trace shows that http://192.168.1.136:8000/nytimes/css.nyt.com/css/0.1/screen/build/homepage/styles.css%3Fv=20120119.css waits for over a second. Similar wait time for http://192.168.1.136:8000/nytimes/css.nyt.com/css/0.1/print/styles.css as well. I am not sure I'm reading this right, but the name resolution looks pretty fishy. I'm also not sure why we only seem to see those two resources on the main thread. Maybe this trace is busted.
Alright, all of the other resources are in the Socket Thread, and there are even longer wait times there. It looks like the wait time for the net::http::transaction items you see there is the time it takes from nsHttpTransaction::Init to nsHttpTransaction::SetConnection.
Flags: needinfo?(wlachance)
According to the logs I was hitting the connection limit (6 per server by default). It was against a HTTP/1.0 server, so that makes sense. Upping the limit to 100 made the docshell::pageload number in the visual event trace go from 7630ms to 3877ms. It still doesn't feel *fast*, but it is clearly an improvement. Patrick, I think an increase here could certainly help. I'm going to rerun against a 1.1 server with pipelining (since we enable that in fennec) and see if we still have a problem. How was the current value settled on? With more capable networks and phones does it make sense to raise it?
Flags: needinfo?(mcmanus)
tracking-fennec: + → ?
The requests for styles.css waits for over a second. It looks like the cache fetches it immeditely, however. I'm investigating why it's taking so long for it to make it back to the main thread.
This is IETF week for me, so I haven't spent the sustained period of time with the log that it needs, trying to use f2f time effectively. I'll get there. hopefully someone else will pitch in too in the bug But from a very high level, a major issue is that the real website actually uses multiple origins and shards but the test in question has everything mapped to a single origin of http://192.168.1.136:8000 - that has a significant impact on how things are scheduled and the parallelism involved. snorp has identified that this is a problem, but in this case its a synthetic problem. the test isn't realistic for the real site you're trying to measure (which doesn't mean it is uninteresting!). The connection limit is 6 per origin, not 6 overall.. and the test uses a different number of origins than nyt. Further, the test server is not doing persistent connections. (it is returning http 1.0 responses without connection headers - those cannot be reused.) The NYT servers do persistent connections, and its pretty unusual to find any site of any scale on the Internet that doesn't... so again, this will impact the relevance of your test (which doesn't mean it is uninteresting in the abstract) The connection limit per origin is ~6. This is a balance between the need for parallelism and the need for priority and congestion semantics. Its more or less an industry standard between ie/chrome/ff (it might differ in a very tight range - like 8 instead of 6). Origins can effectively override this through sharding - whether or not that is appropriate to do depends on your content and tcp server settings, so is something that makes sense to control on the server end. The real fix to that problem is H2 and we're encouraging folks to deploy it - there is no single right answer for that constant in an H1 world. is it plausible to tweak the test so that we're measuring something closer to what you want?
Flags: needinfo?(mcmanus)
all of the resources being returned are poorly cachable - i.e. they have no explicit cache control semantics and have very very recent last modified times on them. I suspect that means they go into the cache but are not cache hits going forward (we try to revalidate them, but the server returns 200 instead of 304). This might actually maximize the IO we do. honza should confirm. Again, that's going to be pretty different than what is really happening on the NYT origin. Again, that doesn't mean its uninteresting :)
2015-03-25 20:11:55.736827 UTC - -1191690528[af959080]: uri=http://192.168.1.136:8000/nytimes/css.nyt.com/css/0.1/screen/build/homepage/styles.css%3Fv=20120119.css 2015-03-25 20:11:55.736925 UTC - -1191690528[af959080]: nsHttpChannel::Init [this=90cc7c00] 2015-03-25 20:11:55.737855 UTC - -1191690528[af959080]: nsHttpChannel::AsyncOpen [this=90cc7c00] 2015-03-25 20:11:55.739247 UTC - -1191690528[af959080]: nsHttpChannel::OnProxyAvailable [this=90cc7c00 pi=0 status=0 mStatus=0] 2015-03-25 20:11:55.745652 UTC - -1191690528[af959080]: nsHttpChannel::BeginConnect [this=90cc7c00] prefetching 2015-03-25 20:11:55.752616 UTC - -1191690528[af959080]: nsHttpChannel::Connect 90cc7c00 AwaitingCacheCallbacks forces async 2015-03-25 20:11:55.753105 UTC - -1191690528[af959080]: nsHttpChannel::Suspend [this=90cc7c00] 2015-03-25 20:11:55.754192 UTC - -1189336704[97864000]: Opened cache input stream without buffering [channel=90cc7c00, mCacheEntry=907fb5e0, stream=9007a1f0] 2015-03-25 20:11:56.364687 UTC - -1191690528[af959080]: nsHttpChannel::Resume [this=90cc7c00]
comment 29 shows a pretty long delay in between looking a css up in the cache, finding out that we have something that needs to be revalidated, and starting the actual http transaction.. in between the main thread is pretty flooded with other new requests that are no doubt also putting new requests into the cache engine. it might be interesting to rerun the test with the cache disabled - just to get data points. Its certainly not helping you here with the way the server is configured (we always do i/o but we never get a hit or a reval) - but we could see just how much overhead it is.
26, 28-30
Flags: needinfo?(honzab.moz)
So far, it looks like the main thread is blocked mostly due to nsHtml5TreeOpExecutor::RunFlushLoop(), with HTTP_on_modify_request() in UserAgentOverride.jsm being roughly 25% of that. Not too sure what the rest is doing yet.
We also spend a pretty significant amount of time messing with the plugin blocklist. It has to actually read the blocklist as part of that since this is the first usage of it.
The second load of this page looks dramatically better. The whole thing is done in about a second.
(In reply to Patrick McManus [:mcmanus] from comment #28) > all of the resources being returned are poorly cachable - i.e. they have no > explicit cache control semantics and have very very recent last modified > times on them. I suspect that means they go into the cache but are not cache > hits going forward (we try to revalidate them, but the server returns 200 > instead of 304). This might actually maximize the IO we do. honza should > confirm. If there is Last-Modified in the response we send If-Modified-Since. And if the test server (or whatever) cannot handle it with 304, we get 200 and a new (same) content again from the server. > > Again, that's going to be pretty different than what is really happening on > the NYT origin. Again, that doesn't mean its uninteresting :) (In reply to Patrick McManus [:mcmanus] from comment #29) > 2015-03-25 20:11:55.736827 UTC - -1191690528[af959080]: > uri=http://192.168.1.136:8000/nytimes/css.nyt.com/css/0.1/screen/build/ > homepage/styles.css%3Fv=20120119.css > 2015-03-25 20:11:55.736925 UTC - -1191690528[af959080]: nsHttpChannel::Init > [this=90cc7c00] > 2015-03-25 20:11:55.737855 UTC - -1191690528[af959080]: > nsHttpChannel::AsyncOpen [this=90cc7c00] > 2015-03-25 20:11:55.739247 UTC - -1191690528[af959080]: > nsHttpChannel::OnProxyAvailable [this=90cc7c00 pi=0 status=0 mStatus=0] > 2015-03-25 20:11:55.745652 UTC - -1191690528[af959080]: > nsHttpChannel::BeginConnect [this=90cc7c00] prefetching > 2015-03-25 20:11:55.752616 UTC - -1191690528[af959080]: > nsHttpChannel::Connect 90cc7c00 AwaitingCacheCallbacks forces async > 2015-03-25 20:11:55.753105 UTC - -1191690528[af959080]: > nsHttpChannel::Suspend [this=90cc7c00] > 2015-03-25 20:11:55.754192 UTC - -1189336704[97864000]: Opened cache input > stream without buffering [channel=90cc7c00, mCacheEntry=907fb5e0, > stream=9007a1f0] > 2015-03-25 20:11:56.364687 UTC - -1191690528[af959080]: > nsHttpChannel::Resume [this=90cc7c00] Sounds like classification suspension to me, but it can well be delayed just by main thread flood.
Flags: needinfo?(honzab.moz)
clearing and adding the meta keyword
tracking-fennec: ? → ---
Keywords: meta
platform-rel: --- → ?
Whiteboard: [platform-rel-NYTimes]
platform-rel: ? → ---
Site moved. But no data on this chart since 2014.
Re-triaging per https://bugzilla.mozilla.org/show_bug.cgi?id=1473195 Needinfo :susheel if you think this bug should be re-triaged.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → INCOMPLETE
The test case for this bug is no longer available
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: