Closed Bug 1381016 Opened 2 years ago Closed 2 years ago

Youtube Live video stops after about 5-10 minutes

Categories

(Core :: Networking: HTTP, defect, P1)

55 Branch
x86
Windows 10
defect

Tracking


VERIFIED FIXED
mozilla57
Tracking Status
firefox-esr52 55+ verified
firefox54 --- wontfix
firefox55 + verified
firefox56 + fixed
firefox57 --- verified

People

(Reporter: alice0775, Assigned: u408661)

References

Details

(Keywords: regression, Whiteboard: [necko-active][spdy])

Attachments

(6 files, 1 obsolete file)

Attached file about:support
Build Identifier:
https://hg.mozilla.org/releases/mozilla-beta/rev/91e10e2411762dea81d5df70d9fefe96fe619353
Mozilla/5.0 (Windows NT 10.0; WOW64; rv:55.0) Gecko/20100101 Firefox/55.0 ID:20170713130618


Reproducible: Yes

Steps To Reproduce:
1. Open https://www.youtube.com/live
2. Click a live video. e.g. Sky News - Live https://www.youtube.com/watch?v=y60wDzZt8yg
3. Change to Theater mode

Actual Results:
The video stops after about 5 minutes

Expected Results:
The video should not stop.
Priority: -- → P1
Summary: Youtube Live video stops after about 5 minutes → Youtube Live video stops after about 5-10 minutes
Flags: needinfo?(jyavenard)
I can also reproduce on Nightly56.0a1 without e10s.
Whiteboard: [beta/release build only]
What about if you disable WebM (so that YouTube sends mp4)? Set media.webm.enabled to false
(In reply to Jean-Yves Avenard [:jya] from comment #4)
> What about if you disable webm? (So that YouTube send mp4)
> media.webm.enabled to false

I tested 10 times on Beta55.0b9.
I could reproduce the problem 1 time out of 10.
The failure rate seems lower than with WebM.
October 2016? That sounds unlikely for a 55 regression. That was around Firefox 52.
Flags: needinfo?(jyavenard)
(In reply to Alice0775 White from comment #7)
> Although I am not confident 100% I got the following regression window.
> 
> Regression window(beta channel):
> https://hg.mozilla.org/releases/mozilla-beta/
> pushloghtml?fromchange=ca14ab5a7f99b1d7caf841868ee46188fb44f5db&tochange=075f
> 767684d1c02759e72b69d0b565fbbc002a4e


Aha, the range is wrong; the good build also reproduces the problem.
The following preference fixes the problem.....

user_pref("network.http.spdy.enabled.http2", false);
Component: Audio/Video: Playback → Networking: HTTP
Blocks: 1097320
Keywords: regression
Whiteboard: [http2]
Great find thank you...
Hi Patrick,
Would you take a look at this bug?
Thanks!
Flags: needinfo?(mcmanus)
I'll ask nick to look into it.
Flags: needinfo?(mcmanus) → needinfo?(hurley)
I can't reproduce here.
I'm on a pretty poor mobile phone connection.. I've listened to this live stream for over 2 hours without any disconnection. I did have the occasional rebuffering, but it always resumed just fine.

I have:
network.http.spdy.enabled.http2=true

Alice0775, are you using IPv4 or IPv6?
Flags: needinfo?(alice0775)
(In reply to Jean-Yves Avenard [:jya] from comment #15)
> I can't reproduce here.
> I'm on a pretty poor mobile phone connection.. I've listened to this live
> stream for over 2 hours without any disconnection. I did have the occasional
> rebuffering, but it always resumed just fine.
> 
> I have:
> network.http.spdy.enabled.http2=true
> 
> Alice0075, are you using IPv4 or IPv6?

IPv4, approx. 8 Mbps, YouTube classic style, and not logged in.
If the problem does not reproduce the first time, restart the browser and try again.
It should occur within 15 minutes.


FYI,
Other people have also reported similar problems in the Japanese forum.
Flags: needinfo?(alice0775)
I was able to (I think) reproduce this once with http logging, but nothing stood out in the logs as suspicious. Data just... stopped flowing (WriteSegments returned WOULD_BLOCK). I thought I reproduced again, under rr with logging, but when I was about to close the browser to start looking at the replay, the video started back up. That behavior made me wonder if I just didn't wait long enough on my first "reproduction" to see the video start back up (my connection was under pretty heavy load at the time, so things could've been backed up there rather than a bug in necko).

I'll keep playing around with this to see if I can get any more data. If not, at least I have a nice calming live video of bears fishing for salmon going all the time :)
Nick, are you willing to send me the log for examination?
Honza, log from what I think is a reproduction with a clean profile on my mac is at http://files.nwgh.fastmali.fm/youtube-h2-fail.log.gz
Argh, I mistyped that (copying from a different computer): http://files.nwgh.fastmail.fm/youtube-h2-fail.log.gz
Flags: needinfo?(hurley)
I've been trying to reproduce the problem for several hours with no luck. Whenever playback appeared to stall, it typically resumed after about a minute (which isn't surprising in my current setup as I'm on a tethered mobile phone with sporadic coverage)
I also continue to be unable to reproduce this, which makes me think it even more likely that my "reproduction" mentioned in comment 17 was due to my congested network and not to any bug.

Given that this appears to be mostly (entirely?) reproduced by people in Japan, I wonder if youtube deployed something slightly broken to their Japanese frontends that isn't affecting those of us on the other side of the world.
Attached file log2.7z
>set MOZ_LOG=timestamp,nsHttp:5,nsSocketTransport:5,nsStreamPump:5,nsHostResolver:5,sync,MediaDecoder:5,MediaSource:5,MediaPromise:5,MP4Demuxer:5,nsMediaElement:5,nsMediaElementEvents:5

buffering stops at around 19:04:20
video stops at around 19:05:40
(In reply to Alice0775 White from comment #23)
> Created attachment 8888034 [details]
> log2.7z
> 
> >set MOZ_LOG=timestamp,nsHttp:5,nsSocketTransport:5,nsStreamPump:5,nsHostResolver:5,sync,MediaDecoder:5,MediaSource:5,MediaPromise:5,MP4Demuxer:5,nsMediaElement:5,nsMediaElementEvents:5
> 
> buffering stops at around 19:04:20
> video stops at around 19:05:40

ni Nick to have a look on the log file. Thanks.
Flags: needinfo?(hurley)
So I've taken a look. I'm sad to say, I won't be able to get any farther on this without a log (same as above), plus a packet dump and (since this is all tls encrypted) a key log. (I might be able to get a bit farther without the key log, but not much).

Basically what I see happen is... things generally go along fine. The last normal operation I see is that we get a PING frame from youtube, and immediately send out the PING with the ack flag set. The socket responds with SENDING_TO and then... that's it. All subsequent attempts to read from the socket return WOULD_BLOCK, and eventually we send a GOAWAY with no error, as the browser is shutting down. So I don't know if youtube never sees our PING ack and assumes we're dead, or if it keeps trying to send us data that gets stuck somewhere else. Either way, this does not appear to be an h2 bug as things currently stand, and I can't figure out anything further without more low-level information.
Flags: needinfo?(hurley)
Flags: needinfo?(alice0775)
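For reference, the PING/ACK exchange described in the comment above can be sketched as follows. This is a minimal illustration of RFC 7540 PING framing, not Firefox's actual necko implementation:

```python
import struct

def ping_frame(opaque: bytes, ack: bool = False) -> bytes:
    """Build an HTTP/2 PING frame (RFC 7540, section 6.7).

    Frame header: 24-bit length, 8-bit type, 8-bit flags, 31-bit stream id.
    The PING payload is always 8 opaque bytes; the ACK reply echoes them back.
    """
    assert len(opaque) == 8
    length = 8
    frame_type = 0x6              # PING
    flags = 0x1 if ack else 0x0   # 0x1 = ACK
    stream_id = 0                 # PING is a connection-level frame
    header = struct.pack(">I", length)[1:]  # low 3 bytes = 24-bit length
    header += bytes([frame_type, flags]) + struct.pack(">I", stream_id)
    return header + opaque

# The exchange seen in the logs: the server pings, the client echoes with ACK set.
server_ping = ping_frame(b"\x00" * 8)
client_ack = ping_frame(b"\x00" * 8, ack=True)
```

The stall observed here happens immediately after `client_ack` goes out: the TCP ACK for it arrives, then nothing else flows on the connection in either direction.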
I can still reproduce the problem on Nightly55.a1 as well as Beta55.0b12.
However, we have a solid workaround (disable http2).
No need to waste more time on investigation.
I strongly disagree that disabling http/2 is a "solid workaround". It may be "solid" in that it reliably fixes the problem, but it does so at the expense of disabling an incredibly crucial and useful piece of code - http/2 is quickly becoming more and more of the web, and disabling it will make sites that assume a modern browser supports http/2 slower. Also, if this is, in fact, a bug in Firefox's http/2 implementation (and not Google's frontend as deployed in Japan), then I want to fix it! Either way, this is absolutely worth investigating; unfortunately, I am unable to reproduce it, so I need assistance to get the information I need.
Flags: needinfo?(alice0775)
Attached image bug.png
HR to manifest.googlevideo.com (an http/2.0-capable server) suddenly stalls when buffering stops.
Flags: needinfo?(alice0775)
So thanks to :swu, I was able to get a reproduction of this issue with a pcap so I can see what happens on the wire (no TLS keys, so I couldn't decrypt the packets, but that turned out to be irrelevant for the behavior I'm seeing).

What I see happen is this - h2 sessions truck along fine. We have a few sessions to different google properties, but every single one that lasts to the point when the issue happens behaves the same way. Then, when the issue occurs (whatever it is), traffic just stops. In every single case, the last thing that happens on the session is:

(1) Google FE sends a PING frame
(2) We send a PING with the ACK flag set
(3) Google FE IP sends a TCP ACK for the TCP packet containing our PING frame (identified by matching timestamps in the log and the pcap).
(4) No other traffic is seen on the TCP stream corresponding to the h2 session in either direction.

Like I said, every single failed session behaves the same way. Sessions that finish before the issue occurs close after our timeout expires in a clean fashion.

One thing I've noticed is that both :swu and the reporter are using IPv4 (I have a native IPv6 connection at home). I will attempt to reproduce after disabling v6 locally to see if that gets me anything, but if the behavior is the same, I won't really be able to get any farther.

:swu - one question for you - what OS were you using? Looking at the logs, it looks like you were using Linux, correct? I just want to rule out something potentially Windows-specific happening here.
Flags: needinfo?(swu)
(In reply to Nicholas Hurley [:nwgh][:hurley] (also hurley@todesschaf.org) from comment #29)
> So thanks to :swu, I was able to get a reproduction of this issue with a
> pcap so I can see what happens on the wire (no TLS keys, so I couldn't
> decrypt the packets, but that turned out to be irrelevant for the behavior
> I'm seeing).

You figured out what happened in a half-blind condition! Sorry for the missing SSL key file; it's now uploaded to the link I sent you, in case it's still helpful.

> One thing I've noticed is that both :swu and the reporter are using IPv4 (I
> have a native IPv6 connection at home). I will attempt to reproduce after
> disabling v6 locally to see if that gets me anything, but if the behavior is
> the same, I won't really be able to get any farther.
> 
> :swu - one question for you - what OS were you using? Looking at the logs,
> it looks like you were using Linux, correct? I just want to rule out
> something potentially Windows-specific happening here.

Yes, I am using Linux with IPv4, and IPv6 is enabled too.  The Firefox version is 2017-07-25 nightly.
Flags: needinfo?(swu)
This patch works for me. Nick, could you take a look whether it makes sense?
Flags: needinfo?(hurley)
Copy/pasting from email by :swu, so we have all the info in one place (and I can try to clarify my understanding):

> Hi,
>
> According to bug comments, the recent H2 issue of gmail[1] and youtube[2] seems the same root cause.
>
> I can reproduce the youtube issue on my Linux with rr (reproducible no matter whether e10s is on or off). When the issue happens, mInputFrameDataSize is 1 in [3], which resulted in discardCount being 0 in [4] and NS_BASE_STREAM_WOULD_BLOCK being returned, causing this issue to happen.
>
> I am not familiar with H2, but if any help I can do, please let me know.
>
> [1] https://bugzilla.mozilla.org/show_bug.cgi?id=1380896
> [2] https://bugzilla.mozilla.org/show_bug.cgi?id=1381016
> [3] https://searchfox.org/mozilla-central/rev/09c065976fd4f18d4ad764d7cb4bbc684bf56714/netwerk/protocol/http/Http2Session.cpp#2695
> [4] https://searchfox.org/mozilla-central/rev/09c065976fd4f18d4ad764d7cb4bbc684bf56714/netwerk/protocol/http/Http2Session.cpp#3034

First question - :swu, do you have the log for this? (And a pointer to the appropriate lines, so I can see what's going on around the steps mentioned above?)

If I'm understanding correctly, it looks like we're receiving a 1-byte DATA frame that does *not* have the PADDED flag set, but we still are somehow ending up with 1 byte of data read, and then we try to discard the rest, returning WOULD_BLOCK?

The reason I say a frame without the padded flag is that your patch is designed to change the unpadded case, so that's the only time it would make a difference.

If my understanding above is correct, then what we should try to figure out is *why* we're starting the discard path with one byte of data already read... there's a bug somewhere else in the code, and this is papering over it, I believe.

I'll continue trying to reproduce locally. Unfortunately, rr is currently broken on my machine, as I have one of the kernel versions that breaks rr, and Fedora hasn't yet updated the kernel to a working version. I suppose if it comes to it, I could try compiling my own, though it's been a loooooooong time since I've done that... :/ So, until I get a working kernel, I'm stuck trying to debug with just logs.
Flags: needinfo?(hurley) → needinfo?(swu)
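The discard-path behavior :swu describes (discardCount of 0 leading to an early WOULD_BLOCK return), together with the fix proposed later in the bug (only short-circuit in the DISCARDING_DATA_FRAME state), can be modeled roughly like this. All names here are illustrative simplifications, not the actual Http2Session code:

```python
from enum import Enum, auto

class State(Enum):
    DISCARDING_DATA_FRAME = auto()          # stream already closed locally
    DISCARDING_DATA_FRAME_PADDING = auto()  # stream still open; frame is all padding

WOULD_BLOCK = "NS_BASE_STREAM_WOULD_BLOCK"

def discard_path(state, frame_size, frame_read, fin, fixed):
    """Toy model of the short-circuit. Returns (result, fin_processed)."""
    discard_count = frame_size - frame_read
    if discard_count == 0:
        if not fixed:
            # Buggy behavior: short-circuit unconditionally, so the FIN on an
            # all-padding final frame never reaches the stream (no CleanupStream).
            return WOULD_BLOCK, False
        if state is State.DISCARDING_DATA_FRAME:
            # Fixed behavior: only short-circuit when the stream is already
            # closed on our end, so no FIN handling can be skipped.
            return WOULD_BLOCK, False
    # Fall through: process the frame, including any FIN -> CleanupStream.
    return "OK", fin

# A 1-byte all-padding frame with FIN: mInputFrameDataSize == mInputFrameDataRead == 1.
buggy = discard_path(State.DISCARDING_DATA_FRAME_PADDING, 1, 1, True, fixed=False)
fixed = discard_path(State.DISCARDING_DATA_FRAME_PADDING, 1, 1, True, fixed=True)
```

In this toy model, `buggy` drops the FIN while `fixed` delivers it, which is the half-closed-stream leak hypothesized below.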
Also, I want to note that this discarding code path is unchanged since our first landing of the h2 code. Not that that means it's not buggy, but since it's worked for a few years just fine with Google, they almost certainly changed something, and if we can get info from them about what changed, that might go a long way to helping us figure out what the issue is (since we'd know what to look for).

Patrick, do you know anyone to reach out to at Google to try to figure out what changed around the time this bug started happening?
Flags: needinfo?(mcmanus)
Note: I gave ekr links to both this bug and bug 1380896 and he's let some google folks know about the issue.
google does mess with padding now and then as a defense against a few BEAST like attacks.. they won't acknowledge more than that they mess with padding strategies now and then :).. so if it looks legal on the wire, we pretty much have to work it from there.
So... I have a theory about what could be happening. Whether it's correct or not is still up in the air (I'll need someone who can reproduce to verify my theory). But here goes. I *think* what's going on is that google is sending the FIN frame in a stream as a frame with length == 1 and the PADDED flag set. That one byte is the padding control byte, which is set to 0. That's how we get to the point where mInputFrameDataSize == mInputFrameDataRead == 1 with our internal state of DISCARDING_DATA_FRAME_PADDING. In that case, we do that short-circuit code path, which skips the bits later on where we'll pass knowledge of the FIN to the stream. So, we never close our half of the stream. If google keeps track of the fact that, eventually, we have a lot of these half-open streams hanging around on their end to prevent some kind of DoS against them, I could easily see them disallowing us to open more streams until we close some. So, eventually, even though we send out some requests, they never respond to them.

If I'm correct in my assessment above, the reason :swu's patch works is that it skips the short-circuit in this edge padding case. However, we should be able to lock the short-circuit down even more - only run it if the internal state is DISCARDING_DATA_FRAME - at that point, we know for a fact that the stream is already closed on our end (that's why we enter that state), so we know we'll never be accidentally skipping the CleanupStream call if it's necessary.

My only worry with my assessment is... if google thinks we're DoSing them, why aren't they just closing the socket? Either way, we are most definitely skipping the CleanupStream on this half-closed stream, so we are leaving stuff hanging around that we shouldn't. I can see this in my own logs of stuff against google properties where we go almost 10 seconds between receiving this single-byte-padded frame and doing our close (which is happening because I shut down the browser). Perhaps my inability to reproduce is that I just don't use that many google things, so I'm not using nearly as many streams as others are - I suspect that, given enough time, I would eventually hit whatever limit there is and experience the issue myself.

I'll attach a modified version of :swu's patch with my proposal in the second paragraph above, I'm hoping he can test the patch to verify that it works (as I suspect it will) so we can get this landed and fixed within a day or so.
Flags: needinfo?(swu)
Flags: needinfo?(mcmanus)
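The frame this theory predicts shows up in :swu's log below as the header bytes `00 00 01 00 09 00 00 03 33`: a 1-byte DATA frame whose flags byte sets both END_STREAM (0x1) and PADDED (0x8), and whose only payload byte is the Pad Length (0). A small sketch decoding it (an illustration of RFC 7540 frame-header layout, not the necko parser):

```python
import struct

END_STREAM = 0x1
PADDED = 0x8

def parse_frame_header(buf: bytes):
    """Parse a 9-byte HTTP/2 frame header (RFC 7540, section 4.1)."""
    length = int.from_bytes(buf[0:3], "big")          # 24-bit payload length
    frame_type, flags = buf[3], buf[4]
    stream_id = struct.unpack(">I", buf[5:9])[0] & 0x7FFFFFFF  # clear R bit
    return length, frame_type, flags, stream_id

# Header bytes as captured in the log: 00 00 01 00 09 00 00 03 33
header = bytes([0x00, 0x00, 0x01, 0x00, 0x09, 0x00, 0x00, 0x03, 0x33])
length, frame_type, flags, stream_id = parse_frame_header(header)

pad_length = 0                        # the single payload byte
data_bytes = length - 1 - pad_length  # 0: the frame carries only padding
```

Decoded: length 1, type 0 (DATA), flags END_STREAM|PADDED, stream 0x333, zero actual data bytes - exactly the all-padding FIN frame whose END_STREAM flag the short-circuit was skipping.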
FWIW, I've been running a local build with this patch applied for the last 4 hours or so without a single GMail hang where I'd been getting them pretty regularly before.
linux64 try build to sanity check h2 xpcshell tests with my modified patch: https://treeherder.mozilla.org/#/jobs?repo=try&revision=0c6a4134d4b13d7819377a892b4709d303e2af58

and builds for the other desktop platforms for anyone willing to take them for a spin to ensure I got it right: https://treeherder.mozilla.org/#/jobs?repo=try&revision=3cfeb2c98f5a03bea8dd4311b70fcd294404ad07
:swu - if you could (in addition to others who are already helping with testing) verify that my modified patch fixes the issue for you as well, I'd greatly appreciate that (and your r+ on the patch once it's verified!) Thanks!
Flags: needinfo?(swu)
Nick, I've tested the new patch and it worked well, thank you!

The log shows that when receiving the single-byte-padded frame, CleanupStream() is called as expected with the patch.

2017-08-02 01:17:34.950115 UTC - [Socket Thread]: I/nsHttp Http2Session::WriteSegments 0x7fe31c143000 InternalState 1
2017-08-02 01:17:34.950118 UTC - [Socket Thread]: D/nsSocketTransport nsSocketInputStream::Read [this=0x7fe32fc26e58 count=9]
2017-08-02 01:17:34.950131 UTC - [Socket Thread]: D/nsSocketTransport   calling PR_Read [count=9]
2017-08-02 01:17:34.950141 UTC - [Socket Thread]: D/nsSocketTransport   PR_Read returned [n=9]
2017-08-02 01:17:34.950144 UTC - [Socket Thread]: D/nsSocketTransport JIMB: ReleaseFD_Locked: mFDref = 2
2017-08-02 01:17:34.950147 UTC - [Socket Thread]: D/nsSocketTransport nsSocketTransport::SendStatus [this=0x7fe32fc26c00 status=804b0006]
2017-08-02 01:17:34.950150 UTC - [Socket Thread]: V/nsHttp Http2Session::LogIO 0x7fe31c143000 stream=(nil) id=0x0 [Reading Frame Header]
2017-08-02 01:17:34.950153 UTC - [Main Thread]: D/nsStreamPump nsInputStreamPump::OnInputStreamReady [this=0x7fe3134a4500]
2017-08-02 01:17:34.950168 UTC - [Main Thread]: D/nsStreamPump   OnStateStart [this=0x7fe3134a4500]
2017-08-02 01:17:34.950153 UTC - [Socket Thread]: V/nsHttp 00000000: 00 00 01 00 09 00 00 03 33 
2017-08-02 01:17:34.950176 UTC - [Main Thread]: D/nsHttp nsHttpChannel::OnStartRequest [this=0x7fe31c735000 request=0x7fe3134a4500 status=0]
2017-08-02 01:17:34.950183 UTC - [Socket Thread]: I/nsHttp Http2Session::WriteSegments[0x7fe31c143000::56] Frame Header Read type 0 data len 1 flags 9 id 0x333
2017-08-02 01:17:34.950192 UTC - [Socket Thread]: I/nsHttp Http2Session::ChangeDownstreamState() 0x7fe31c143000 from 1 to 3
2017-08-02 01:17:34.950197 UTC - [Socket Thread]: D/nsSocketTransport nsSocketInputStream::Read [this=0x7fe32fc26e58 count=1]
2017-08-02 01:17:34.950202 UTC - [Socket Thread]: D/nsSocketTransport   calling PR_Read [count=1]
2017-08-02 01:17:34.950183 UTC - [Main Thread]: D/nsHttp nsHttpChannel::ProcessResponse [this=0x7fe31c735000 httpStatus=200]
2017-08-02 01:17:34.950216 UTC - [Socket Thread]: D/nsSocketTransport   PR_Read returned [n=1]
2017-08-02 01:17:34.950221 UTC - [Socket Thread]: D/nsSocketTransport JIMB: ReleaseFD_Locked: mFDref = 2
2017-08-02 01:17:34.950224 UTC - [Socket Thread]: D/nsSocketTransport nsSocketTransport::SendStatus [this=0x7fe32fc26c00 status=804b0006]
2017-08-02 01:17:34.950228 UTC - [Socket Thread]: V/nsHttp Http2Session::LogIO 0x7fe31c143000 stream=(nil) id=0x0 [Reading Data Frame Padding Control]
2017-08-02 01:17:34.950232 UTC - [Socket Thread]: V/nsHttp 00000000: 00 
2017-08-02 01:17:34.950235 UTC - [Socket Thread]: I/nsHttp Http2Session::WriteSegments 0x7fe31c143000 stream 0x333 mPaddingLength=0
2017-08-02 01:17:34.950238 UTC - [Socket Thread]: I/nsHttp Http2Session::WriteSegments 0x7fe31c143000 stream 0x333 frame with only padding
2017-08-02 01:17:34.950240 UTC - [Socket Thread]: I/nsHttp Http2Session::ChangeDownstreamState() 0x7fe31c143000 from 3 to 5
2017-08-02 01:17:34.950243 UTC - [Socket Thread]: I/nsHttp Start Processing Data Frame. Session=0x7fe31c143000 Stream ID 0x333 Stream Ptr 0x7fe330096100 Fin=1 Len=1
2017-08-02 01:17:34.950246 UTC - [Socket Thread]: I/nsHttp Http2Session::UpdateLocalSessionWindow this=0x7fe31c143000 newbytes=1 localWindow=11845871
2017-08-02 01:17:34.950249 UTC - [Socket Thread]: I/nsHttp Http2Session::WriteSegments 0x7fe31c143000 trying to discard 0 bytes of data
2017-08-02 01:17:34.950251 UTC - [Socket Thread]: V/nsHttp Http2Session::LogIO 0x7fe31c143000 stream=(nil) id=0x0 [Discarding Frame]
2017-08-02 01:17:34.950254 UTC - [Socket Thread]: I/nsHttp Http2Session::ResetDownstreamState() 0x7fe31c143000
2017-08-02 01:17:34.950256 UTC - [Socket Thread]: I/nsHttp Http2Session::ChangeDownstreamState() 0x7fe31c143000 from 5 to 1
2017-08-02 01:17:34.950258 UTC - [Socket Thread]: I/nsHttp   SetRecvdFin id=0x333
2017-08-02 01:17:34.950261 UTC - [Socket Thread]: I/nsHttp MaybeDecrementConcurrent 0x7fe31c143000 id=0x333 concurrent=1 active=1
2017-08-02 01:17:34.950263 UTC - [Socket Thread]: I/nsHttp Http2Session::CleanupStream 0x7fe31c143000 0x7fe330096100 0x333 0
2017-08-02 01:17:34.950266 UTC - [Socket Thread]: I/nsHttp Http2Session::CloseStream 0x7fe31c143000 0x7fe330096100 0x333 0
Flags: needinfo?(swu)
Comment on attachment 8892662 [details]
Bug 1381016 - Ensure we process FIN flags on all-padding final frames.

https://reviewboard.mozilla.org/r/163648/#review169096
Attachment #8892662 - Flags: review?(swu) → review+
:swu :hurley - awesome. thanks.

gmail is notoriously stingy on streams, so it's not terribly surprising that this would show up there.
Attachment #8892662 - Flags: feedback+
Pushed by hurley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/1a5e6ebcf961
Ensure we process FIN flags on all-padding final frames. r=swu
Excellent, thanks :swu and :mcmanus. I'll request uplift to all the branches tomorrow when this reaches m-c. For now, I must go eat dinner :)
Thank you for your work!
Please don't forget firefox-esr52: it's not in the tracking flags at the moment, but it is also affected (see https://bugzilla.mozilla.org/show_bug.cgi?id=1380896#c61).
Assignee: nobody → hurley
Status: NEW → ASSIGNED
Whiteboard: [http2] → [necko-active][spdy]
Attachment #8892442 - Attachment is obsolete: true
Comment on attachment 8892662 [details]
Bug 1381016 - Ensure we process FIN flags on all-padding final frames.

(Submitting before this hits m-c per jcristau's request via irc)

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined: gmail (and other google properties) will hang at random intervals, requiring a browser restart
Fix Landed on Version: 57
Risk to taking this patch (and alternatives if risky): low/none
String or UUID changes made by this patch: none

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.

Approval Request Comment
[Feature/Bug causing the regression]: http/2
[User impact if declined]: gmail (and other google properties) will hang at random intervals, requiring a browser restart
[Is this code covered by automated tests?]: no
[Has the fix been verified in Nightly?]: yes
[Needs manual test from QE? If yes, steps to reproduce]: no
[List of other uplifts needed for the feature/fix]: none
[Is the change risky?]: no
[Why is the change risky/not risky?]: targeted 1-line fix to ensure proper state handling for http/2 streams that end in a particular way
[String changes made/needed]: none
Attachment #8892662 - Flags: approval-mozilla-release?
Attachment #8892662 - Flags: approval-mozilla-esr52?
Attachment #8892662 - Flags: approval-mozilla-beta?
Comment on attachment 8892662 [details]
Bug 1381016 - Ensure we process FIN flags on all-padding final frames.

a+ all the things.

should be in build2 for 52.3esr and 55.0, as well as 56.0b1 next week
Attachment #8892662 - Flags: approval-mozilla-release?
Attachment #8892662 - Flags: approval-mozilla-release+
Attachment #8892662 - Flags: approval-mozilla-esr52?
Attachment #8892662 - Flags: approval-mozilla-esr52+
Attachment #8892662 - Flags: approval-mozilla-beta?
Attachment #8892662 - Flags: approval-mozilla-beta+
https://hg.mozilla.org/releases/mozilla-release/rev/d8c080ddb32c

Will land on Beta56 once all the post-merge bustage is cleaned up.
https://hg.mozilla.org/mozilla-central/rev/1a5e6ebcf961
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Flags: qe-verify+
I have managed to reproduce the issue described in comment 0 using Firefox 56.0a1 (Build Id:201707140320205) on Windows 10 32bit.

It seems that this issue is no longer reproducible on Firefox 57.0a1 (Build Id:20170803134456), Firefox 55.0 (build 2 , Build Id:20170802111421) and Firefox esr 52.3.0 (build 2 , Build Id:20170802111520).
Status: RESOLVED → VERIFIED
Flags: qe-verify+
(In reply to Patrick McManus [:mcmanus] from comment #35)
> so if it looks legal on the wire, we pretty much have to work it from there.

Is there any kind of fuzzing we could do to look for more of these legal but unexpected situations before we get bitten in the wild by them?
Flags: needinfo?(hurley)
Duplicate of this bug: 1386302
(In reply to Ryan VanderMeulen [:RyanVM] from comment #54)
> (In reply to Patrick McManus [:mcmanus] from comment #35)
> > so if it looks legal on the wire, we pretty much have to work it from there.
> 
> Is there any kind of fuzzing we could do to look for more of these legal but
> unexpected situations before we get bitten in the wild by them?

I'm fairly certain we already did/have done fuzzing of the h2 stack. I'm certainly open to having it fuzzed more, but I don't know what (if anything) we're missing, as I don't know exactly how it's been fuzzed in the past. That's probably more a question for someone on the fuzzing team.
Flags: needinfo?(hurley)
Flags: qe-verify+