Closed
Bug 1492484
Opened 6 years ago
Closed 6 years ago
Switching tabs may send an HTTP/2 PRIORITY frame with stream ID == 0
Categories
(Core :: Networking: HTTP, defect, P2)
Tracking
()
VERIFIED
FIXED
mozilla64
People
(Reporter: iippolitov, Assigned: u408661)
References
Details
(Keywords: regression, Whiteboard: [necko-triaged])
Attachments
(1 file)
46 bytes,
text/x-phabricator-request
|
kershaw
:
review+
pascalc
:
approval-mozilla-beta+
RyanVM
:
approval-mozilla-esr60+
|
Details | Review |
User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:61.0) Gecko/20100101 Firefox/61.0 Build ID: 20180704192850 Steps to reproduce: 1) create a long living http2 connection to a site (I post images in 4 streams in that connection) 2) open another http2 site (livejournal feed) and wait for it to fully load 3) switch back to a tab with long running connection Actual results: From server logs (nginx) I can see that firefox sends PRIORITY http2 frame with sid:0. Nginx closes connection and firefox silently opens another one Similar workflow in chromium works well. Older firefox version also worked well. Expected results: The PRIORITY frame always identifies a stream. If a PRIORITY frame is received with a stream identifier of 0x0, the recipient MUST respond with a connection error (Section 5.4.1) of type PROTOCOL_ERROR. Firefox should not send PRIORITY frame with sid == 0.
Updated•6 years ago
|
Component: Untriaged → Networking: HTTP
Product: Firefox → Core
Hi reporter, you say older firefox versions work properly. Can you use mozregression (https://mozilla.github.io/mozregression/) to figure out where, exactly, things went wrong for firefox? That can help us track down the change that introduced this behaviour. Thanks.
Flags: needinfo?(iippolitov)
Reporter | ||
Comment 2•6 years ago
|
||
Utility told me it has not enough data to bissect. But the first faulty build is 2017-08-31 and the last good is 2017-08-30. So it should be easy to find faulty change. Also, I noticed, 2017-08-31 sends sid=0 when I simply switch tabs. Looks like no need for another http2 site. Hope this helps.
Flags: needinfo?(iippolitov)
Updated•6 years ago
|
Keywords: regression
OK, this was caused by bug 1377208 - my best guess is there's a pushed stream somewhere in the mix, as that should be the only way we could accidentally do this. Definitely our bug, though. I just have to figure out which makes more sense - reprioritising the pushed stream, or not.
Reporter, can you try the build at https://index.taskcluster.net/v1/task/gecko.v2.try.revision.2c7f502b2dd5df495f8c53d3de5293840baedc30.firefox.linux64-opt/artifacts/public/build/target.tar.bz2 and see if this fixes the issue? You'll want to use a new profile, as nightly can't share profiles with release versions. If you're not using linux (I guessed based on your bug report), let me know what platform you're using, and I can point you to the appropriate build for that OS. Thanks.
Flags: needinfo?(iippolitov)
Reporter | ||
Comment 5•6 years ago
|
||
Couldn't reproduce the issue, so I believe it is fixed. Thank you very much!
Flags: needinfo?(iippolitov)
Great, thanks! I'll prepare this patch to land, and hopefully get it uplifted to beta, as well.
Summary: A long running HTTP2 connection may send PRIORITY frame with stream ID == 0 → Switching tabs may send an HTTP/2 PRIORITY frame with stream ID == 0
Right now, pushed streams don't get put in the priority hierarchy anyway, and with the code as it was, we were sending an illegal PRIORITY frame for stream 0. This prevents that from happening. We should probably think about putting pushed streams in the priority hierarchy at some point.
Updated•6 years ago
|
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 8•6 years ago
|
||
Comment on attachment 9010983 [details]
Do not reprioritize pushed streams on tab switch.
Kershaw Chang [:kershaw] has approved the revision.
Attachment #9010983 -
Flags: review+
Pushed by hurley@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/6341eaf530a7 Do not reprioritize pushed streams on tab switch. r=kershaw
Comment 10•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/6341eaf530a7
Status: NEW → RESOLVED
Closed: 6 years ago
status-firefox64:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
Comment 11•6 years ago
|
||
What's the impact of this bug? Does it need to be considered for backport?
status-firefox62:
--- → wontfix
status-firefox63:
--- → affected
status-firefox-esr60:
--- → affected
Flags: needinfo?(hurley)
Assignee | ||
Comment 12•6 years ago
|
||
Comment on attachment 9010983 [details] Do not reprioritize pushed streams on tab switch. Approval Request Comment [Feature/Bug causing the regression]: bug 1377208 [User impact if declined]: wasted tcp connections/potentially slower operation on long-running uploads or downloads [Is this code covered by automated tests?]: no [Has the fix been verified in Nightly?]: yes [Needs manual test from QE? If yes, steps to reproduce]: no [List of other uplifts needed for the feature/fix]: none [Is the change risky?]: no [Why is the change risky/not risky?]: does a 0-check for http/2 stream ids, and avoids sending priority frames on them (which breaks spec). [String changes made/needed]: none
Flags: needinfo?(hurley)
Attachment #9010983 -
Flags: approval-mozilla-beta?
Assignee | ||
Comment 13•6 years ago
|
||
If this makes it through to beta, then we can try pushing the envelope to see how far up the trees (esr) we can get it :)
Comment 14•6 years ago
|
||
Comment on attachment 9010983 [details]
Do not reprioritize pushed streams on tab switch.
Low risk minimal patch fixing an old P2 regression. Fix verified by the reporter, uplift accepted for 63 beta 10, thanks.
Attachment #9010983 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 15•6 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/92308e03be83
Assignee | ||
Comment 16•6 years ago
|
||
Comment on attachment 9010983 [details] Do not reprioritize pushed streams on tab switch. [Approval Request Comment] If this is not a sec:{high,crit} bug, please state case for ESR consideration: see user impact if declined User impact if declined: potentially slower http/2 downloads, potentially slower or even stalled/killed http/2 uploads Fix Landed on Version: 64 Risk to taking this patch (and alternatives if risky): no/very little risk String or UUID changes made by this patch: none See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.
Attachment #9010983 -
Flags: approval-mozilla-esr60?
Assignee | ||
Comment 17•6 years ago
|
||
Comment on attachment 9010983 [details]
Do not reprioritize pushed streams on tab switch.
Should've requested this same time as I requested esr60. See above for approval request info.
Attachment #9010983 -
Flags: approval-mozilla-esr52?
Comment 18•6 years ago
|
||
Comment on attachment 9010983 [details]
Do not reprioritize pushed streams on tab switch.
ESR52 is EOL. I honestly have no idea why that flag is still active in Bugzilla.
Attachment #9010983 -
Flags: approval-mozilla-esr52?
Comment 19•6 years ago
|
||
Comment on attachment 9010983 [details]
Do not reprioritize pushed streams on tab switch.
Fix for intermittent HTTP2 connection issues. Approved for ESR 60.3.
Attachment #9010983 -
Flags: approval-mozilla-esr60? → approval-mozilla-esr60+
Updated•6 years ago
|
Flags: qe-verify+
Comment 20•6 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-esr60/rev/1b3e6759cf3a
Comment 21•6 years ago
|
||
Hi iippolitov, Can you please explain the steps to reproduce from comment 0 in a more easily understandable process for a low technical knowledge tester? I need to be able to reproduce it and then verify the fix myself and I have the impression you'll need to put it is really simple terms for me. Thank you!
Flags: needinfo?(iippolitov)
Reporter | ||
Comment 22•6 years ago
|
||
Daniel, First of all, steps in comment 0 are a bit too complex. The issue can be reproduced a bit easier: just switch tabs waiting for a reply in an HTTP/2.0 connection. You'll need any web site opened in a tab and an HTTP/2 request which is processed for a long time. For example, with a form being sent to a server which replies in 20-30 seconds. You send the form and while the browser is waiting for a reply you switch tabs back and forth. If the bug is in effect the long waiting connection will be closed earlier than it should (and you'll see error logs in the web server logs). I think, I can compose a simple self contained nginx configuration for you to test this. Or a less simple shell script. Feel free to ask additional questions if my explanation looks vague to you.
Flags: needinfo?(iippolitov)
Comment 23•6 years ago
|
||
The explanation is not that vague, but I'll need you to correct me if I'm wrong. These are my steps taken, on Ubuntu 16 OS: 1. Downloaded a build from a date prior to the push of the fix in mozilla-central: http://archive.mozilla.org/pub/firefox/nightly/2018/09/2018-09-20-22-01-02-mozilla-central/firefox-64.0a1.en-US.linux-x86_64.tar.bz2 2. Opened e random http page in a tab and a http2 page in another Tab1: https://ro.yahoo.com/?p=us Tab2: https://http2.akamai.com/ 3. I opened the "about:networking#logging" page in a 3rd tab and clicked on "Start Logging" button. 4. I refresh tab 2. 5. I switch between the tabs for 20-30 seconds. 6. I went to the 3rd tab and clicked on the "Stop Logging" button. 7. Retrieved the log: log.txt-main.9202 from where it was saved. Download it from here: https://drive.google.com/file/d/13-x9V6U-_TRVWdBGQKjl_yYuf1ewlZPP/view?usp=sharing I do not have the knowledge to read the log, unfortunately. @iippolitov: A. Did I perform the steps correctly? B. If yes, did the issue reproduce? C. If not, can you correct my steps or provide a suitable test case that I could use? Thank you!
Flags: needinfo?(iippolitov)
Reporter | ||
Comment 24•6 years ago
|
||
Daniel, I'm not really familiar with mozilla network logs, please give me some time to look through it with issue reproduced and without it. Strangely, I can't reproduce the issue right now, while I'm 100% sure it breaks my uploads every day. It may take me a couple of day, but I'll get back to you.
Reporter | ||
Comment 25•6 years ago
|
||
The message we are looking for looks like this: >LOG3(("Http2Session::SendPriorityFrame %p Frame 0x%X depends on 0x%X weight %d\n", this, streamID, dependsOn, weight)); In logs with a problem reproduce it looks like this: > 2018-10-10 12:31:06.978614 UTC - [Parent 2253: Socket Thread]: I/nsHttp Http2Session::SendPriorityFrame 0x7fa9e13be800 Frame 0x31 depends on 0x7 weight 21 > 2018-10-10 12:31:06.978667 UTC - [Parent 2253: Socket Thread]: I/nsHttp Http2Session::SendPriorityFrame 0x7fa9e13be800 Frame 0x0 depends on 0x7 weight 21 > 2018-10-10 12:31:06.978725 UTC - [Parent 2253: Socket Thread]: I/nsHttp Http2Session::SendPriorityFrame 0x7fa9e13be800 Frame 0x35 depends on 0x7 weight 21 Your log doesn't have any 'SendPriorityFrame' messages. Most likely, because your connection is way to fast. Here is step-by-step guide I succeeded with: Open http2.akamai.com, open dev tools (ctrl-shift-i in my environment), in dev tools go to network tab, on the top right, set throttle mode to gprs (slowest) open about:netowrking#logging, start logging Reload http2.akamai.com, it will take some time. Don't wait till it's loaded! While it is loading, click several tabs back and forth. Grep logs with something like: grep 'SendPriorityFrame [x[:xdigit:]]* Frame 0x0 depends' /tmp/log2.txt-main.2253 If there is no such entry - try a couple more times. The issue is reproduced like 7 out of 10 attempts with http2.akamai.com throttling. It is reproduced better if you have several long running requests, I think. Feel free to ask for additional info.
Flags: needinfo?(iippolitov)
Comment 26•6 years ago
|
||
I have used your steps from comment 25. I think I have reproduced the issue on Windows 10 x64 on Nightly v64.0a1 from 2018-09-15. Also, it seems to me that the issue is also reproducing on Nightly v64.0a1 from 2018-10-10, the latest Nightly, so the issue may not be fixed. Can you confirm this? These are the logs: https://drive.google.com/drive/folders/1F1qts-DgZZQb2N0WJZQJ7Nvlf_hclpZj?usp=sharing
Flags: needinfo?(iippolitov)
Reporter | ||
Comment 27•6 years ago
|
||
Daniel,
According to logs, a frame for a stream with zero ID was sent in the 'before fix' log:
>2018-10-11 12:11:50.664000 UTC - [Parent 11524: Socket Thread]: I/nsHttp Http2Session::SendPriorityFrame 0000018AB0308800 Frame 0x0 depends on 0x7 weight 20
But there is no such frame in 'after fix' log file. As far as I can see everything is OK.
Could you please share with me, how did you make the conclusion that you have reproduced the bug? What seems to be faulty behaviour?
Flags: needinfo?(iippolitov) → needinfo?(daniel.bodea)
Comment 28•6 years ago
|
||
I do not really understand what 'a frame for a stream with zero ID' means so I just compared the examples you gave me in comment 25 with similar ones in the logs. In any case, this is good news! I will set the firefox64 tag as verified and I'll post other logs extracted from fixed Beta and ESR60 builds (done with the exact same repro steps at least 2 times each) so that you can confirm the fix yourself.
Flags: needinfo?(daniel.bodea)
Reporter | ||
Comment 29•6 years ago
|
||
Daniel, I'm not really an HTTP/2 professional, but this looks to be a simple issue. Let me explain: HTTP/2 can send data in several streams. Each stream consists of frames. Streams are identified by an ID. Each frame has that streamID inside. By design, that ID can't be zero. In case a browser sends a frame for a stream with a zero ID the server will drop this connection. A client may set different priority for streams. This can be done in initial frames or later with a 'set priority' frame. Firefox is able to send this 'priority frame'. This kind of packet (frame) is logged with following C statement: >LOG3(("Http2Session::SendPriorityFrame %p Frame 0x%X depends on 0x%X weight %d\n", this, streamID, dependsOn, weight)); I don't really know what LOG3 does, but it looks like it logs a string for the third level of verbosity. From other C function I guess the text in quotes is a template: >"Http2Session::SendPriorityFrame %p Frame 0x%X depends on 0x%X weight %d\n" Then go values to populate those '%'-something places: 'this', 'streamID', 'dependsOn' and 'weight'. Obviously, streamID looks like a stream ID mentioned above =) if 'streamID' is zero, than the template looks like this: >"Http2Session::SendPriorityFrame %p Frame 0x0 depends on 0x%X weight %d\n" %p will be replaces with a hex number, %X and %d will also be replaced with hex and decimal numbers. But 'Frame 0x0' is a problem indicator. Then you just have to find 'Frame 0x0' string in the log file. From user perspective this error looks like a network problem: something can't be loaded (because the server shuts connection mid "flight"). Reload and everything works well. Hope this helps.
Comment 30•6 years ago
|
||
I think it did help and it seems (to me) that the Beta v63.0b13 and Release ESR60 build (from taskcluster, that contains the changeset), are now fixed. For certainty, I would like you to have a look at the logs and confirm that the issue does not occur. The logs are in the same folder like the ones mentioned in comment 26: https://drive.google.com/drive/u/2/folders/1F1qts-DgZZQb2N0WJZQJ7Nvlf_hclpZj Thank you!
Flags: needinfo?(iippolitov)
Reporter | ||
Comment 31•6 years ago
|
||
Daniel, Looks good to me. No traces of the issue
Flags: needinfo?(iippolitov)
Comment 32•6 years ago
|
||
Considering the comment above, I will mark this issue as verified in firefox63 and firefox-esr60. Thank you for your contribution, iippolitov!
Status: RESOLVED → VERIFIED
Flags: qe-verify+
You need to log in
before you can comment on or make changes to this bug.
Description
•