Closed Bug 1228822 Opened 4 years ago Closed 4 years ago

HTTP/2 pushed resource > 128KiB hangs indefinitely


(Core :: Networking: HTTP, defect)

42 Branch
Not set



Tracking Status
firefox46 --- fixed


(Reporter: tatsuhiro.t, Assigned: u408661)



(Whiteboard: [spdy])


(1 file, 1 obsolete file)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:42.0) Gecko/20100101 Firefox/42.0
Build ID: 20151029151421

Steps to reproduce:

1. Configure web server to push resource of length more than 128KiB.
2. Browse the page where server pushes resource we configured in step 1.

Actual results:

Page does not load.
Server could not send data because stream window is exhausted.
Firefox's initial stream level window size is 128KiB (131072 bytes).
But it does not send any WINDOW_UPDATE for that stream.
When server pushed resource of length more than 128KiB, it stalls because there is no window left.

I tested changing resource file size.
* 128KiB - 1: OK
* 128KiB: OK
* 128KiB + 1: Fail

Expected results:

Page should be loaded.  Firefox should send WINDOW_UPDATE to the pushed stream.
Component: Untriaged → Networking: HTTP
Product: Firefox → Core
See Also: → 1167851
Whiteboard: [spdy]
firefox will not accept more than 128KB of push until it sees a reason to pull (i.e use) the resource. At that point it will open up the window and continue the transfer.
Closed: 4 years ago
Resolution: --- → WONTFIX
I forget to tell you the important condition to cause this issue.
This issue happens when I add 100ms latency using tc command.
Without latency, firefox sends WINDOW_UPDATE to pushed streams, after received some bytes, which backs up your explanation.
For some reason, if there is some latency, no WINDOW_UPDATE is sent.  Very strange.
does firefox instead pull the stream? (there has to be a use of that uri somewhere, right?)
No, it doesn't.  From HTTP log, firefox recognized pushed resource was really linked from HTML (Pushed Stream Match located id=0x2 key=[http2.2]/bug1228822/assets/css/bootstrap.css), and continued to read, and no download request was issued from firefox.  But somehow firefox did not emit WINDOW_UPDATE to pushed resource, and stream was stalled because there was no window left.  Again, page was not rendered at all.  Spinner kept spinning.

I setup test site to reproduce this issue:

Please add at least 100ms latency to reproduce this bug.
thanks tatsuhiro - nick, want to take a crack?

also, tatsuhiro - was firefox using e10s?
Ever confirmed: true
Resolution: WONTFIX → ---
Assignee: nobody → hurley
"Enable multi-process Nightly" was disabled.
Here we go. We weren't updating pushed stream h2 state properly (it was always RESERVED_BY_REMOTE). The state we were stuck in does not allow WINDOW_UPDATE frames to be sent, so we never sent one. Oops :)
Attachment #8696110 - Flags: review?(mcmanus)
xpcshell tests passed locally, but for the sake of sanity:
the pushed stream should leave RESERVED_BY_REMOTE (changing to OPEN) when the response headers are read (when setallheadersreceived() is called on the push stream).

so it appears that the source/sink matchup that calls adjustinitialwindow() is happening between PP (as PP is necessary to get into the matching table) and HEADERS - in which case it will return early because RESERVED_BY_REMOTE is the right state.

There is a bug here - the window won't get bumped up from the push value to the pull value when this thing eventually goes to OPEN because you only get 1 crack at adjustinitialwindow. we should fix that.

However, there is still a 128KB window (I would think) because we haven't seen any data frames yet (they might have been sent - but not recvd, because we know we haven't seen HEADERS yet because we are in RESERVED_BY_REMOTE).. as those DATA frames do come in we should send ACK^H^H^HWINDOW_UPDATE for some fraction of them in the usual way which would keep the stream flowing albeit perhaps at a suboptimal rate. And the bug reported is a total stall.

What do you think is going on? Is it as simple as a bug in SetAllHeadersRecevied()? maybe start by seeing what the localwindow is when you see reserved_by_remote? I would think we could actually assert that it is 128KB in that state as HEADERS data is (famously) not flow controlled.
Attachment #8696110 - Flags: review?(mcmanus) → review-
OK, so here's where we are (I discussed this with Pat just now in Orlando) - we actually *do* create a window bump... and then never send it over the wire, because we never flush our ourput. Oops! (I was looking too early in the log last time, the effect of the new coming patch is the same, but we send the bump at a more appropriate time.) Tests pass locally, running through try:
Attached patch patchSplinter Review
OK, here's a new version, that properly sends out the window bump as discussed in person. I'm not totally happy with the state fudging here, but with Orlando-brain, I'm finding it difficult to come up with anything better at this moment. At least I can get validation (or not) of the appropriateness of this tactic.
Attachment #8697331 - Flags: review?(mcmanus)
Attachment #8696110 - Attachment is obsolete: true
Comment on attachment 8697331 [details] [diff] [review]

Review of attachment 8697331 [details] [diff] [review]:

r+ with comment addressed

::: netwerk/protocol/http/Http2Stream.cpp
@@ +1394,5 @@
> +      if (NS_SUCCEEDED(rv)) {
> +        mSentPushWindowBump = true;
> +      }
> +    } else {
> +      MOZ_ASSERT(false, "Http2Stream::OnReadSegment unexpected state");

it seems like the point of the mSentPushWindowBump flag is to prevent doing this multiple times.. in which case you probably don't want the assert if it is called multiple times, right?

if you do want the assert.. then I would restructure the block - instead of 
if (foo) {
} else {
assert (false)

do it as
assert (foo)
Attachment #8697331 - Flags: review?(mcmanus) → review+
Closed: 4 years ago4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla46 is still broken for me.. which matches a bug report I've got in email about
Depends on: 1249379
I think I've got this worked out.. but I'm blocked on uploading it because the test tickles another smaller bug (which I think I can workaround in the test I will attach to this bug while fixing it separately.)
You need to log in before you can comment on or make changes to this bug.