Last Comment Bug 766312 - CSS not loading in iframes with pipelining on
: CSS not loading in iframes with pipelining on
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: Networking: HTTP (show other bugs)
: Trunk
: All All
: -- major (vote)
: mozilla16
Assigned To: Patrick McManus [:mcmanus] PTO until Sep 6
:
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2012-06-19 14:24 PDT by Honza Bambas (:mayhemer)
Modified: 2012-07-01 02:08 PDT (History)
5 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
affected


Attachments
logs (179.18 KB, application/x-zip)
2012-06-19 14:24 PDT, Honza Bambas (:mayhemer)
no flags Details
patch 0 (3.18 KB, patch)
2012-06-28 06:57 PDT, Patrick McManus [:mcmanus] PTO until Sep 6
honzab.moz: review+
Details | Diff | Splinter Review

Description Honza Bambas (:mayhemer) 2012-06-19 14:24:45 PDT
Created attachment 634590 [details]
logs

I have created a small tool that searches where all a bug has landed at http://www.janbambas.cz/links/ that displays results in iframes.

STR:
0. set network.http.pipelining to true
1. load http://www.janbambas.cz/links/
2. wait a bit (3 - 5 secs) to load the version board (initially a blank space under "Releases" link
3. enter some random bug # to the "Search bug landings" field (needs to be 6 numbers)
4. click the Go button, 5 iframes will be added (initially just blank) that load query to hg.mozilla.org for m-i, m-c up to m-r repos with the search string entered
5. wait some time, may be even a minute - queries take a long time

The bug:
Sometimes (relatively often) it happens that CSS for the first (m-i) iframe doesn't load.  I've also seen failed load also for m-c.

Notes:
- I can reproduce while all the pages has already been loaded in the past (are cached)
- I can reproduce with Nigthly, Aurora
- seems unreproducible with Beta and Release channels, but that could be a red herring

There are two logs attached, one from a debug Nightly and one from a release Aurora.
I didn't check deeper, but try to search for http://hg.mozilla.org/integration/mozilla-inbound/static/style-gitweb.css that probably failed to load in both cases.
Comment 1 Patrick McManus [:mcmanus] PTO until Sep 6 2012-06-19 14:41:37 PDT
love a good testcase - thanks!
Comment 2 Patrick McManus [:mcmanus] PTO until Sep 6 2012-06-27 11:44:44 PDT
I can say with some confidence that this is related to reschedule-timeout (the time at which head of line blocked transactions are moved from 1 connection to another). I'm still looking for the bug - but setting that timeout quite low repros the issue and making it high sees it go away.

I've fixed a couple issues related to read-timeout which is the more serious condition where the front-of-the-line transaction is aborted and restarted in a context that does not include pipelines. However, this does not appear to be the issue according to the logs (that code is not running - it has a much larger timeout value that is not reached).

From the logs the rescheduled transaction appears to run ok and receive a 304. I wonder if we cache the wrong thing on a restart somehow. too soon to say.
Comment 3 Patrick McManus [:mcmanus] PTO until Sep 6 2012-06-27 14:14:22 PDT
Here's what's happening.

Conn 1 has pipelined onto it A, B

We determine that HOL blocking has occurred and B gets rescheduled onto Conn 2.. but we don't give up on Conn 1 - we're still waiting for A. However, we can't take back the request for B that was written to conn 1 - so conn 1 has to be marked "DontReuse" when rescheduling that blocked pipeline.

And we do that :)

However, when the response for A comes back the values in the response header end up overwriting the Keep-Alive member variables that DontReuse was implemented with.. in effect, undoing the DontReuse(). Ouch!

Now that the connection is marked as reusable a new transaction ("C") can be placed on it like any other reusable persistent connection - in this case its the css file request. The response being read first is actually for "B", and we get ourselves a mixup.

The fix is easy - create a non-reversible bool for dontreuse().
Comment 4 Patrick McManus [:mcmanus] PTO until Sep 6 2012-06-28 06:41:07 PDT
despite the bug, this is actually a vindiction for the pipelining algorithm.. it detected a real head of line blocking problem and routed around it dynamically!
Comment 5 Patrick McManus [:mcmanus] PTO until Sep 6 2012-06-28 06:57:41 PDT
Created attachment 637504 [details] [diff] [review]
patch 0
Comment 6 Honza Bambas (:mayhemer) 2012-06-28 10:09:37 PDT
(In reply to Patrick McManus [:mcmanus] from comment #4)
> despite the bug, this is actually a vindiction for the pipelining
> algorithm.. it detected a real head of line blocking problem and routed
> around it dynamically!

I'd more say it's sad we are confirming a functionality's working randomly when a bug occurs rather then by automated tests...
Comment 7 Patrick McManus [:mcmanus] PTO until Sep 6 2012-06-28 10:18:39 PDT
(In reply to Honza Bambas (:mayhemer) from comment #6)
> (In reply to Patrick McManus [:mcmanus] from comment #4)
> > despite the bug, this is actually a vindiction for the pipelining
> > algorithm.. it detected a real head of line blocking problem and routed
> > around it dynamically!
> 
> I'd more say it's sad we are confirming a functionality's working randomly
> when a bug occurs rather then by automated tests...

not mutually exclusive thoughts!
Comment 8 Honza Bambas (:mayhemer) 2012-06-28 11:47:03 PDT
Comment on attachment 637504 [details] [diff] [review]
patch 0

Review of attachment 637504 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the investigation.  This fixes this bug.  I don't immediately see any regressions, but real usage will show.
Comment 9 Patrick McManus [:mcmanus] PTO until Sep 6 2012-06-28 15:23:33 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/d5bc4a1251b6
Comment 10 Ed Morley [:emorley] 2012-06-29 00:46:19 PDT
https://hg.mozilla.org/mozilla-central/rev/d5bc4a1251b6

Note You need to log in before you can comment on or make changes to this bug.