Open Bug 580571 Opened 14 years ago Updated 2 years ago

Try to keep preloads from saturating our connection limit

Categories

(Core :: DOM: Core & HTML, defect, P5)

defect

Tracking

()

Tracking Status
blocking2.0 --- -

People

(Reporter: scoobidiver, Unassigned)

References

()

Details

User-Agent:       Mozilla/5.0 (Windows; Windows NT 6.1; WOW64; rv:2.0b2) Gecko/20100720 Firefox/4.0b2
Build Identifier: Mozilla/5.0 (Windows; Windows NT 6.1; WOW64; rv:2.0b2) Gecko/20100720 Firefox/4.0b2

The ref URL page load takes about 3 min to display something and 7 min to be completely done.
Before the test, the history is emptied.

Idem for FF 4.0b1.
No problem in FF 3.6 or ie 8.

Reproducible: Always
Status: UNCONFIRMED → NEW
Component: General → HTML: Parser
Ever confirmed: true
OS: Windows 7 → All
Product: Firefox → Core
QA Contact: general → parser
Hardware: x86_64 → All
Version: unspecified → Trunk
Disabling the html5 parser prevents the hang.
blocking2.0: --- → ?
At least this page causes a lot of resources to be loaded speculatively from the wrong URLs due to bug 534293.
Hmm.  Do the speculative script loads block things or something?  In particular, http://www.airbus.com/en/aircraftfamilies/fileadmin/templates/airbus.com/lib/scripts/layer_top.js (just the first thing we'd try to speculate) takes forever to actually time out instead of returning a 404 or something.

Are we blocking the parser on pending speculations?  Are we blocking the scriptloader on pending speculations?  Either one is wrong, imo.
Er, I lied.  That url does return a 404.. eventually.  It takes about a minute to do that.
Ok, neither of the things in comment 3 happens, in fact.  What does happen is that we get 10 script preload requests.  Then we get a single script load request.  Then the parser blocks until that script is loaded, of course.

Now 10 > 6, and 6 is the maximum number of persistent connections we open to a server.  So all the connections are taken up by the slow scripts and we don't load the real (non-preloaded) script until we get a connection (so until at least 5 of the loads have timed out).  And during that time the parser is blocked.

If I just raise my connection limit to something greater than 10, the issue disappears.

So in addition to fixing bug 534293 we should consider the following changes:

1)  Timing out preloads if they're taking too long and we have pending
    real-script loads.
2)  Limiting the number of preloads in the script loader so we don't saturate
    the connection limit.  But note that other preloads could still saturate it.
3)  Flagging all preloads (via nsISupportsPriority) as preloads and having
    necko never give the last available connection to a preload.
4)  Bumping our connection limits.  ;)

Thoughts?
Blocks: 534293
(In reply to comment #5)
> Ok, neither of the things in comment 3 happens, in fact.  What does happen is
> that we get 10 script preload requests.  Then we get a single script load
> request.  Then the parser blocks until that script is loaded, of course.
>
> Now 10 > 6, and 6 is the maximum number of persistent connections we open to a
> server.  So all the connections are taken up by the slow scripts and we don't
> load the real (non-preloaded) script until we get a connection (so until at
> least 5 of the loads have timed out).  And during that time the parser is
> blocked.

So why didn't this happen with the old parser? Were we just not as good at preloading stuff?


> 1)  Timing out preloads if they're taking too long and we have pending
>     real-script loads.

This seems like a good idea. Though we should be careful about a not timing out preloads that are now real loads. Also, we should only do this if there are other loads pending. Though maybe we don't need to limit ourselves to script loads. Could we check with teh loadgroup how many other things we're loading and which ones aren't script-preloads?

If we pause a request, does that mean it's no longer counted towards the 6 limit?

Would be awesome if necko could handle this. There has been talk about adding priorities to necko. Maybe the lowest priority should allow requests to be paused if higher priority requests get blocked by it.

> 2)  Limiting the number of preloads in the script loader so we don't saturate
>     the connection limit.  But note that other preloads could still saturate
>     it.

Possibly a good idea. Though I like 1 more.

> 3)  Flagging all preloads (via nsISupportsPriority) as preloads and having
>     necko never give the last available connection to a preload.

This is a good idea. Could possibly be done using the priority system too if we add it.

> 4)  Bumping our connection limits.  ;)

I don't think this is helpful. There's always going to be a site with more preloads out there.
Maybe combo of 1 and 3?   For DNS we already do #3--only 3 out of 8 queries outstanding can be speculative.  I'm surprised we don't have anything similar for HTTP connections.

Should we open a separate bug for this?

Is #2 a fast to implement stopgap measure in the mean time?
> So why didn't this happen with the old parser?

The preloader possibly took <base> into account?  Not sure.

> Could we check with teh loadgroup how many other things we're loading

Yes but what matters is loads to the site... we could assume worst-case, I guess.

> and which ones aren't script-preloads?

The loadgroup has no idea about that.

> If we pause a request, does that mean it's no longer counted towards the 6
> limit?

I doubt it.  I'm not even sure that _canceling_ a request immediately makes it stop counting.  :(

> There has been talk about adding priorities to necko.

Necko knows about priorities.  You can set them on a per-channel basis.  We'd need to add some back end to use them in a smart way here, of course.  This was idea #3.

#2 would indeed be a stopgap measure if none of the other solutions are looking likely (and that includes fixing the <base> thing in the parser, mind you).
(In reply to comment #8)
> > So why didn't this happen with the old parser?
> 
> The preloader possibly took <base> into account?  Not sure.

It looked to me (by code inspection) like it didn't. I deliberately made the first iteration of speculative loading in the HTML5 bugwards compatible with the old parser to make it easier to recognize perf effects from the HTML5 move vs. specific improvements.
Ok, what probably changed from the HTML4 parser was that we used to call document->GetBaseURI(). This would take <base> into account as soon as the <base> element was parsed, which would be before any speculation-causing <script>s if the <base> appears above the <script>s in the page. So speculative parsing was likely to take <base> into account.

In the HTML5 parser we don't affect document->GetBaseURI as soon as we see the <base>. All we do is create a treeoperation.

So it seems like we definitely need to make the HTML5 parser speculative code take <base> into account in order to prevent problems like the ones in this page.
blocking2.0: ? → betaN+
No longer blocks: 534293
Depends on: 534293
(In reply to comment #8)
> #2 would indeed be a stopgap measure if none of the other solutions are looking
> likely (and that includes fixing the <base> thing in the parser, mind you).

The patch that's now in bug 534293 fixes the load time of the page reported here.
Priority: -- → P2
Who owns this?  Need an owner ASAP.
I think once bug 534293 is fixed we should either make this not a blocker or spin off a separate non-blocker bug and resolve this fixed....
I pushed the fix for bug 534293, so I think this bug can now be downgraded to non-blocker if we keep this one open.

Do we want to resolve this as a duplicate of bug 534293 or do we want to keep this one open for tracking the additional fixes suggested in comment 5?
The latter, I think.
blocking2.0: betaN+ → -
Component: HTML: Parser → DOM
QA Contact: parser → general
Summary: Page load takes more than 5 min in FF 4.0b2 wrt 1 sec in FF 3.6 → Try to keep preloads from saturating our connection limit
Bug 819734 will probably help to prevent these bursts.
https://bugzilla.mozilla.org/show_bug.cgi?id=1472046

Move all DOM bugs that haven't been updated in more than 3 years and has no one currently assigned to P5.

If you have questions, please contact :mdaly.
Priority: P2 → P5
Component: DOM → DOM: Core & HTML
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.