Closed Bug 1196592 Opened 9 years ago Closed 9 years ago

Fetch API significantly slower than XMLHttpRequest

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla43
Tracking Status
firefox43 --- fixed

People

(Reporter: zbraniecki, Assigned: khuey)

References

Details

Attachments

(12 files)

While experimenting with moving Gaia library to Fetch API I encountered performance penalty that according to :nms should not be there.

The exact scenario in which I'm operating is loading a resource file during app initialization trying to get the file loaded before firstPaint.
The file is loaded from the app resources.

That means that in the most crucial condition, the I/O request is competing with other page loading code paths like HTML parsing, CSS etc. but I was able to reproduce the same performance penalty in other scenario that are described below.

Original numbers come from Firefox OS Settings app bootstrapping.

During the app initialization the JS code is launched which loads a single JSON file using one of the two methods, XHR or Fetch. The results are like this:

https://pastebin.mozilla.org/8843279

In both scenarios the load function is executed around 1080ms after the app has been launched, which is consistent with other measurements (like raptor).

Next, the two scenarios diverge. While in the XHR case the resource is loaded and its JSON available 200 ms later, in the Fetch API scenario the response object is available over 350ms later and JSON object is available another 150ms later which means that XHR takes ~200ms, and Fetch ~500ms.

In order to to isolate testing I modified gaia's dev_apps/template app to see if the behavior is the same in a much lighter app. It is.

I'll attach the Template diff here as a patch. 

I tested in two conditions. During bootstrap and on button click. This allows me to see the different between how XHR vs. Fetch performance is different when I/O competes with Gecko bootstrapping the webapp and when not.

The results are:

Bootstrap:
XHR: 12-20ms
Fetch: 140-200ms

OnClick:
XHR: 6-7ms
Fetch: 100-125ms
Attached patch template.diffSplinter Review
Patch to Gaia dev_apps/template app
Attached file index.en-US.json
Example JSON file used, although I expect that any file would work.
Nikhil, can you take a look at this?

I have access to Firefox DevTools so I can use them to profile that and I guess that the on button click scenario should be available for profiling this way. But the bootstrap scenario, I'm not sure how to profile.

I also noticed that I can use the same example JS code to test it on desktop, but then I'm in the zone of 1-2ms vs 2-3ms which is hard to use as a statistical proof, so I guess that using Flame device is the best to reproduce it (and closest to the environment in which I'd like to use Fetch API).
Flags: needinfo?(nsm.nikhil)
Zibi, as an experiment, can you try varying the size of the JSON file?  I'm curious if its a constant time penalty or scales with the body.
Flags: needinfo?(gandalf)
Gecko profiler should work for this case.

My guess, without knowing Fetch implementation at all, is that we're busy doing other stuff in the main loop and Fetch implementation does things way more asynchronously (posting more Runnables to event loop than XHR ... XHR in fact just relies on necko to provide the data and passes it to JS asap)

But note, your testcase doesn't actually ever retrieve the json from XHR, but does from Fetch.
> Zibi, as an experiment, can you try varying the size of the JSON file?  I'm curious if its a constant time penalty or scales 
> with the body.

Sure! Will do in the morning.

(In reply to Olli Pettay [:smaug] from comment #5)
> Gecko profiler should work for this case.

Is the Gecko profiler the one available in Dev Tools? If not, how do I plug it into gaia app bootstrap?
 
> My guess, without knowing Fetch implementation at all, is that we're busy
> doing other stuff in the main loop and Fetch implementation does things way
> more asynchronously (posting more Runnables to event loop than XHR ... XHR
> in fact just relies on necko to provide the data and passes it to JS asap)

That's what I thought as well as a possibility, but I'm really surprised that in a heavy concurrent app it can take up to 500ms to load one file. That feels like the priority must be very low and also it means that we would not be able to use it in JS that is concurrent :(

So I wanted to test it against other possibilities.

I'm also not sure if your hypothesis would be a possible explanation of the onclick scenario where we launch XHR/Fetch as the only operation in the thread. Why would it still be 10x slower?

> But note, your testcase doesn't actually ever retrieve the json from XHR,
> but does from Fetch.

Good point! I'm always forgetting how lazy DOM is :) I'll update the testcase in the morning.

But notice that in my first pastebin link, the Response object in the Fetch case is returned much much slower than the response object in XHR case, even ignoring the surprising 150ms that response.json() takes.
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #6)
> Is the Gecko profiler the one available in Dev Tools? If not, how do I plug
> it into gaia app bootstrap?
I meant Gecko profiler as the addon in
https://developer.mozilla.org/en-US/docs/Performance/Profiling_with_the_Built-in_Profiler 
(I would profile all the processes in FFOS)
but perhaps the devtools profiler can show something useful too.


> That's what I thought as well as a possibility, but I'm really surprised
> that in a heavy concurrent app it can take up to 500ms to load one file.
> That feels like the priority must be very low and also it means that we
> would not be able to use it in JS that is concurrent :(
There isn't really any priority in event loop. It is just a FIFO queue.
See also bug 1164539



> But notice that in my first pastebin link, the Response object in the Fetch
> case is returned much much slower than the response object in XHR case, even
> ignoring the surprising 150ms that response.json() takes.
Right, that hints that event loop is busy doing other stuff.
(In reply to Olli Pettay [:smaug] from comment #7)
> I meant Gecko profiler as the addon in
> https://developer.mozilla.org/en-US/docs/Performance/
> Profiling_with_the_Built-in_Profiler 
> (I would profile all the processes in FFOS)
> but perhaps the devtools profiler can show something useful too.

I'll try that.

> There isn't really any priority in event loop. It is just a FIFO queue.
> See also bug 1164539

Wondering if https://w3c.github.io/resource-hints/ would help with that if we allowed Fetch to define priority?

> > But notice that in my first pastebin link, the Response object in the Fetch
> > case is returned much much slower than the response object in XHR case, even
> > ignoring the surprising 150ms that response.json() takes.
> Right, that hints that event loop is busy doing other stuff.

Even if I launch it after Nuwa is loaded, and there's nothing else going on in the phone (power usage is minimal, CPU is at 0)? What can make it busy?

Do you think it's unlikely that there's a bug in the implementation that causes the slowdown?
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #8)
> Even if I launch it after Nuwa is loaded, and there's nothing else going on
> in the phone (power usage is minimal, CPU is at 0)? What can make it busy?
All the cpu cores at 0? that is then surprising.

Btw, console.timeStamp can be used to add markers to the Gecko profiler results
bug 1170314
That can be useful with profiling.


> Do you think it's unlikely that there's a bug in the implementation that
> causes the slowdown?
well, the implementation is possibly too async heavy, but profiler would tell if there is also something else.
Flags: needinfo?(gandalf)
Attached image call-tree-fetch.png
I couldn't even fit the whole call tree here. It's huge.
I plugged DevTools into Flame and tried my test using the onclick scenario.

Mind you, that means that I launched XHR/Fetch at the time when no other process was busy, nuwa was stable and generally nothing was going on the phone. I imagine that the reason the same Fetch request may take up to 500ms during Settings start is that it then compete with other things Gecko does when it bootstraps an app.

The difference in the profile is between 34ms and 147ms as measured by console.time() and console.timeEnd().
You can import the json log for both into your own devtools and inspect.

Olli, is that any helpful or should I learn Gecko Profiler?
Flags: needinfo?(bugs)
Well, for starters we're not always successfully retargeting Fetch ODAs to the STS thread.  There's an nsCORSListenerProxy in the listener chain when loading from an nsJARChannel for example.  That seems wrong ...
The same problem would exist for http,though.  We could make the CORS proxy retargetable.  I'm not convinced that's it, though.  Do we see similar slowdowns for xhr with cors enabled?
Before doing that, can we try fetch() with "no-Cora" mode to see if it's the problem?
Sorry, "no-cors" mode.
Flags: needinfo?(gandalf)
It's not just CORS, FetchDriver itself doesn't support retargeting.
Flags: needinfo?(gandalf)
Yeah, I added no-cors and got 120ms again. No big difference, but the profile looks different.
Attached patch PatchSplinter Review
This makes the off-main-thread code in Fetch actually work.  Want to give it a whirl?
Attachment #8652078 - Flags: feedback?(gandalf)
So, in my isolated test scenario the patch does help.

Fetch onclick case goes down from ~130ms to ~65ms.

Unfortunately, under 'load', so during Settings app bootstrap, the cost is still high:

XHR: 247ms
Fetch: 515ms

Mind you, there are two pieces in play - fetch() itself and response.json(). The numbers for them are respectively:

XHR: 1) resource load: 216ms 2) e.target.response: 31ms
Fetch: 1) fetch(): 329ms 2) resource.json(): 186ms

Of course e.target.response is synchronous while resource.json() is async, but it still seems that both operation are significantly more costly for Fetch API than XHR API.
does resource.json() happen while we're still loading something else? If so, it probably just means
the other stuff is processed while json-runnable is in the event queue waiting for to be processed.

What kind of e.target.response vs. resource.json() results do you get when the app isn't loading (but some time later after the load)?
Flags: needinfo?(bugs)
Attachment #8652078 - Flags: feedback?(gandalf) → feedback+
(In reply to Olli Pettay [:smaug] (high review load) from comment #28)
> does resource.json() happen while we're still loading something else?

Yes, the scenario in which I'm currently using XHR and I'd like to use Fetch is a mock library that is loading l10n resources during app bootstrapping, so at the same time as CSS/JS resources are loaded/initialized and HTML is parsed.

The XHR is currently the major contributor to how long it takes to load the l10n resource and the initialization of XMLHttpRequest is costly memory wise.

It seems that Fetch retains less memory, but in order to be able to use it in place of XHR it has to be at least on par with XHR.

> What kind of e.target.response vs. resource.json() results do you get when
> the app isn't loading (but some time later after the load)?

When I load the same file after the app is loaded (onclick), I get:

Fetch: 1) fetch(): ~65ms, 2) response.json(): 22ms
XHR: 1) resource load: 130ms, 2) e.response.target: 15ms

which means that the difference between heavy load and no load is as follows:

              XHR   |   Fetch
-----------------------------------------
light load |   40ms |   65ms
heavy load |  185ms |  328ms
light json |   15ms |   35ms
heavy json |   15ms |  127ms

total light|   55ms |  100ms  
total heavy|  200ms |  455ms

(that's without khuey's patch because I can't build gecko today on mac due to bug 1198237)

> If so, it probably just means
> the other stuff is processed while json-runnable is in the event queue
> waiting for to be processed.

Is that per-design and does it make the Fetch not applicable here and we have to stick to XHR?
Comment on attachment 8652078 [details] [diff] [review]
Patch

Review of attachment 8652078 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/security/nsCORSListenerProxy.cpp
@@ +664,4 @@
>  
>  NS_IMETHODIMP
>  nsCORSListenerProxy::OnDataAvailable(nsIRequest* aRequest,
>                                       nsISupports* aContext, 

Nit: please remove the trailing whitespace while you are here.

@@ +1267,5 @@
>  nsCORSPreflightListener::GetInterface(const nsIID & aIID, void **aResult)
>  {
>    return QueryInterface(aIID, aResult);
>  }
>  

Nit: Extra newline.
Attachment #8652078 - Flags: review?(nsm.nikhil) → review+
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #29)
> > If so, it probably just means
> > the other stuff is processed while json-runnable is in the event queue
> > waiting for to be processed.
> 
> Is that per-design and does it make the Fetch not applicable here and we
> have to stick to XHR?
Well it means other stuff gets processed earlier when Fetch is used. So the total app load time probably shouldn't change
blocking v2 meta bug. Thanks!
(In reply to Olli Pettay [:smaug] (high review load) from comment #32)
> > Is that per-design and does it make the Fetch not applicable here and we
> > have to stick to XHR?
> Well it means other stuff gets processed earlier when Fetch is used. So the
> total app load time probably shouldn't change

Unfortunately that doesn't work this way. I'm not sure if it's just the sync+async combo that happens at the same time, but I know that the whole app bootstrap (CSS+JS+HTML) happens within ~400ms and we try to squeeze in l10n there which takes around 300ms during that time.
We're racing against firstPaint.

The crucial piece is loading the resource because it has to happen asynchronously and before firstPaint so that we can translate HTML before it.

So since with XHR the resource loading takes ~200ms, the earlier we can start it, the better. And we unfortunately can't prefetch because we need to use URL Template and resolve it in JS. And the earlier we can get the resource, the more cushion we have to avoid flashes of untranslated content.

If being more async puts our codepath on lower priority that gets executed after firstPaint because everything else is more synchronous, then we're at a disadvantage.

What I'm trying to understand is if there's a chance to get Fetch to behave comparably to XHR in terms of execution.
This bug is going to close when that patch merges.  Can you file a new bug for what remains now that we got the low-hanging fruit out of the way?
Flags: needinfo?(gandalf)
Yeah! I'll test my condition with the patch in and report in a new bug cloned from this one.
Flags: needinfo?(gandalf)
https://hg.mozilla.org/mozilla-central/rev/fa3fabfe2b02
Assignee: nobody → khuey
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
I think bug 1206079 could explain some of the other slowness under load.  I didn't see the other follow bug if it was filed.
See Also: → 1206079
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: