Note: There are a few cases of duplicates in user autocompletion which are being worked on.

CSS not loading in iframes with pipelining on

RESOLVED FIXED in mozilla16

Status

()

Core
Networking: HTTP
--
major
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: mayhemer, Assigned: mcmanus)

Tracking

Trunk
mozilla16
Points:
---

Firefox Tracking Flags

(firefox15 affected)

Details

Attachments

(2 attachments)

(Reporter)

Description

5 years ago
Created attachment 634590 [details]
logs

I have created a small tool that searches where all a bug has landed at http://www.janbambas.cz/links/ that displays results in iframes.

STR:
0. set network.http.pipelining to true
1. load http://www.janbambas.cz/links/
2. wait a bit (3 - 5 secs) to load the version board (initially a blank space under "Releases" link
3. enter some random bug # to the "Search bug landings" field (needs to be 6 numbers)
4. click the Go button, 5 iframes will be added (initially just blank) that load query to hg.mozilla.org for m-i, m-c up to m-r repos with the search string entered
5. wait some time, may be even a minute - queries take a long time

The bug:
Sometimes (relatively often) it happens that CSS for the first (m-i) iframe doesn't load.  I've also seen failed load also for m-c.

Notes:
- I can reproduce while all the pages has already been loaded in the past (are cached)
- I can reproduce with Nigthly, Aurora
- seems unreproducible with Beta and Release channels, but that could be a red herring

There are two logs attached, one from a debug Nightly and one from a release Aurora.
I didn't check deeper, but try to search for http://hg.mozilla.org/integration/mozilla-inbound/static/style-gitweb.css that probably failed to load in both cases.
(Assignee)

Comment 1

5 years ago
love a good testcase - thanks!
Assignee: nobody → mcmanus
(Assignee)

Comment 2

5 years ago
I can say with some confidence that this is related to reschedule-timeout (the time at which head of line blocked transactions are moved from 1 connection to another). I'm still looking for the bug - but setting that timeout quite low repros the issue and making it high sees it go away.

I've fixed a couple issues related to read-timeout which is the more serious condition where the front-of-the-line transaction is aborted and restarted in a context that does not include pipelines. However, this does not appear to be the issue according to the logs (that code is not running - it has a much larger timeout value that is not reached).

From the logs the rescheduled transaction appears to run ok and receive a 304. I wonder if we cache the wrong thing on a restart somehow. too soon to say.
(Assignee)

Comment 3

5 years ago
Here's what's happening.

Conn 1 has pipelined onto it A, B

We determine that HOL blocking has occurred and B gets rescheduled onto Conn 2.. but we don't give up on Conn 1 - we're still waiting for A. However, we can't take back the request for B that was written to conn 1 - so conn 1 has to be marked "DontReuse" when rescheduling that blocked pipeline.

And we do that :)

However, when the response for A comes back the values in the response header end up overwriting the Keep-Alive member variables that DontReuse was implemented with.. in effect, undoing the DontReuse(). Ouch!

Now that the connection is marked as reusable a new transaction ("C") can be placed on it like any other reusable persistent connection - in this case its the css file request. The response being read first is actually for "B", and we get ourselves a mixup.

The fix is easy - create a non-reversible bool for dontreuse().
(Assignee)

Comment 4

5 years ago
despite the bug, this is actually a vindiction for the pipelining algorithm.. it detected a real head of line blocking problem and routed around it dynamically!
(Assignee)

Comment 5

5 years ago
Created attachment 637504 [details] [diff] [review]
patch 0
Attachment #637504 - Flags: review?(honzab.moz)
(Reporter)

Comment 6

5 years ago
(In reply to Patrick McManus [:mcmanus] from comment #4)
> despite the bug, this is actually a vindiction for the pipelining
> algorithm.. it detected a real head of line blocking problem and routed
> around it dynamically!

I'd more say it's sad we are confirming a functionality's working randomly when a bug occurs rather then by automated tests...
(Assignee)

Comment 7

5 years ago
(In reply to Honza Bambas (:mayhemer) from comment #6)
> (In reply to Patrick McManus [:mcmanus] from comment #4)
> > despite the bug, this is actually a vindiction for the pipelining
> > algorithm.. it detected a real head of line blocking problem and routed
> > around it dynamically!
> 
> I'd more say it's sad we are confirming a functionality's working randomly
> when a bug occurs rather then by automated tests...

not mutually exclusive thoughts!
(Reporter)

Comment 8

5 years ago
Comment on attachment 637504 [details] [diff] [review]
patch 0

Review of attachment 637504 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the investigation.  This fixes this bug.  I don't immediately see any regressions, but real usage will show.
Attachment #637504 - Flags: review?(honzab.moz) → review+
(Assignee)

Comment 9

5 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/d5bc4a1251b6
Target Milestone: --- → mozilla16
https://hg.mozilla.org/mozilla-central/rev/d5bc4a1251b6
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED

Updated

5 years ago
status-firefox16: affected → ---
You need to log in before you can comment on or make changes to this bug.