Requests fail for 10 seconds on network switch (macOS / Linux)
Categories
(Core :: Networking: HTTP, defect, P2)
Tracking
()
People
(Reporter: ruihildt, Assigned: kershaw)
References
(Blocks 1 open bug)
Details
(Whiteboard: [necko-triaged][necko-priority-queue])
Attachments
(7 files, 1 obsolete file)
User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:87.0) Gecko/20100101 Firefox/87.0
Steps to reproduce:
Scenario 1:
0 - Load a page (In videos linked, https://mozilla.org)
1 - Change my network or vpn status (enable/disable)
2 - Reload the page
Scenario 2:
0 - Load a page (In videos linked, https://mozilla.org)
1 - Change my network or vpn status (enable/disable)
2 - Wait for more than 10 seconds
3 - Reload the page
Actual results:
Scenario 1:
- The page doesn't reload, and no errors are returned
- A second refresh loads the page correctly
Scenario 2:
- The page loads correctly
Expected results:
The page should always reload instantly.
Please note that this behavior is reproducible in Firefox on macOS and Linux, but NOT on Windows. This behavior is not observable in Chromium at all.
I first encountered this behavior when I was making requests in a WebExtension popup.
Comment 2•4 years ago
The Bugbug bot thinks this bug should belong to the 'WebExtensions::Untriaged' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.
I reverted the change from Bugbug, this is not specific to webextension.
Comment 4•4 years ago
Hi, I was able to reproduce this issue as well in Firefox Release 88, Beta 89.0b2 and our latest Nightly build 90.0a1 (2021-04-21), using NordVPN to connect and disconnect and then reloading the page in Firefox.
Comment 5•3 years ago
I was able to reproduce this with ExpressVPN.
Depending on when the request happens, it could be:
- Before the DNS settings have stabilized, and getaddrinfo returns an error (shows error page)
- Before the old connections have stopped working (triggers this bug)
- After the old connection has stopped working - in which case we create a new one, and it works.
I'll take a look to see if we can improve the pruning of dead connections.
Comment 8•2 years ago
Sorry for missing the needinfo.
We recently fixed a similar issue in bug 1647985, but that one was only for DoH connections.
From what I can tell, the main issue is that the H2 connection is broken, but it takes a while to detect that.
    // The amount of idle seconds on a http2 connection before initiating a
    // server ping. 0 will disable.
    if (PREF_CHANGED(HTTP_PREF("http2.ping-threshold"))) {
      mSpdyPingThreshold = PR_SecondsToInterval((uint16_t)clamped(
          StaticPrefs::network_http_http2_ping_threshold(), 0, 0x7fffffff));
    }
    // The amount of seconds to wait for a http2 ping response before
    // closing the session.
    if (PREF_CHANGED(HTTP_PREF("http2.ping-timeout"))) {
      mSpdyPingTimeout = PR_SecondsToInterval((uint16_t)clamped(
          StaticPrefs::network_http_http2_ping_timeout(), 0, 0x7fffffff));
    }
The ping threshold seems a bit large, at 58 seconds, but it does get trimmed down to 5 seconds here.
The problem is that the trimming only seems to happen when VerifyTraffic is called, which we only seem to do when a network change occurs. However, with VPNs we might miss network change events, or the network changes might occur later than our event actually fires.
Kershaw, am I missing something here? Do we really have a bug in the sense that we're only sending the pings when triggered by a network change event?
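To make the timing concrete, here is a minimal, self-contained sketch of idle-ping liveness checking. It is a simplified model, not Gecko code: PingConfig, PingMonitor, Action and WorstCaseDetectionSec are hypothetical names. The 58-second and 5-second thresholds come from this comment, and the 8-second timeout is the value quoted later in the thread. The sketch assumes a ping is sent only once the idle threshold has elapsed, and that the session is closed if no response arrives within the timeout.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of HTTP/2 ping-based liveness checking (not Gecko code).
struct PingConfig {
  uint32_t thresholdSec;  // idle seconds before a ping is sent (0 disables pings)
  uint32_t timeoutSec;    // seconds to wait for the ping response
};

enum class Action { None, SendPing, CloseSession };

class PingMonitor {
 public:
  explicit PingMonitor(PingConfig cfg) : mCfg(cfg) {}

  // Record that data arrived from the peer at time `now` (seconds).
  void OnActivity(uint32_t now) {
    mLastActivity = now;
    mPingOutstanding = false;
  }

  // Decide what to do at time `now` (seconds on a monotonic clock).
  Action Tick(uint32_t now) {
    assert(now >= mLastActivity);
    if (mCfg.thresholdSec == 0) {
      return Action::None;  // pings disabled
    }
    if (mPingOutstanding) {
      // A ping is in flight: give up once the timeout has elapsed.
      if (now - mPingSentAt >= mCfg.timeoutSec) {
        return Action::CloseSession;
      }
      return Action::None;
    }
    if (now - mLastActivity >= mCfg.thresholdSec) {
      mPingOutstanding = true;
      mPingSentAt = now;
      return Action::SendPing;
    }
    return Action::None;
  }

 private:
  PingConfig mCfg;
  uint32_t mLastActivity = 0;
  uint32_t mPingSentAt = 0;
  bool mPingOutstanding = false;
};

// Worst-case delay before a silently dead connection is closed in this model.
constexpr uint32_t WorstCaseDetectionSec(PingConfig cfg) {
  return cfg.thresholdSec + cfg.timeoutSec;
}
```

Under these assumptions, a silently dead connection can take up to threshold + timeout seconds to be closed: 66 seconds with a 58-second threshold, or 13 seconds once the threshold has been trimmed to 5 seconds.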
Updated•2 years ago
Assignee
Comment 9•2 years ago
The problem is that we only seem to call it when VerifyTraffic is called which we only seem to do when a network change occurs. However, with VPNs we might miss network change events, or the network changes might occur later than our event actually fires.
I think a network change event is necessary to trigger VerifyTraffic. I think there is nothing we can do if we can't detect network changes reliably.
However, I did find a problem when using Firefox with Mozilla VPN. When Mozilla VPN is enabled or disabled, we do detect a network change event, but the NS_NETWORK_LINK_DATA is up, not changed. In the end, VerifyTraffic is not called, so we are left with a broken h2 connection. To fix this, I think we should perform VerifyTraffic every time we receive a network change event, regardless of NS_NETWORK_LINK_DATA.
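The difference between the current behavior and the proposed one can be sketched as a small decision function. This is only an illustration: Policy and ShouldVerifyTraffic are hypothetical names, not Gecko APIs, and `data` stands for the NS_NETWORK_LINK_DATA payload of the event ("up" or "changed" in the cases discussed here).

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch; names are illustrative and not taken from Gecko.
enum class Policy {
  OnlyOnChanged,  // behavior described above: verify only when data is "changed"
  OnEveryEvent    // proposed: verify on every network change event
};

// `data` models the payload of a network link event, e.g. "up" or "changed".
bool ShouldVerifyTraffic(const std::string& data, Policy policy) {
  assert(!data.empty());
  if (policy == Policy::OnEveryEvent) {
    return true;
  }
  return data == "changed";
}
```

With Policy::OnlyOnChanged, an "up" event (what Mozilla VPN produces here) never triggers verification. With Policy::OnEveryEvent it always does.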
I am not sure whether the real network change can happen after the event fires, but maybe we can start another timer to perform VerifyTraffic after a certain number of seconds.
Another thing we could do is reduce http2.ping-timeout. It's currently 8s, which seems a bit long. Maybe 3s or 5s would be better.
Kershaw, am I missing something here? Do we really have a bug in the sense that we're only sending the pings when triggered by a network change event?
It seems that we didn't have this kind of bug before, but I might be wrong.
Comment 10•2 years ago
I wonder if http2.ping-threshold might also be a problem. It's currently at 58 seconds, which means absent any other events or traffic, we wait for 58 seconds before sending a ping.
We could reduce it (at least for desktop) to something more reasonable, such as 15 or 20 seconds, at the expense of reduced battery life, though I'm not too worried about that.
Unfortunately I can't test this at the moment, as mozvpn doesn't work on the latest Ubuntu :(
@Kershaw, would you be able to take this?
Assignee
Updated•2 years ago
Assignee
Comment 11•2 years ago
For mozvpn, the data received in the network change event is "up", not "changed", so to be safe we should call VerifyTraffic for every event.
This patch also reduces http2.ping-timeout and http2.ping-threshold, since the original values are too long.
Comment 12•1 year ago
Comment 13•1 year ago
Backed out for wpt failure on 001.html
Backout link: https://hg.mozilla.org/integration/autoland/rev/4cc8e322f8832e379a9c7f48a29878cf0963d6b6
Log link: https://treeherder.mozilla.org/logviewer?job_id=418296116&repo=autoland&lineNumber=2291
Assignee
Comment 14•1 year ago
(In reply to Narcis Beleuzu [:NarcisB] from comment #13)
Backed out for wpt failure on 001.html
Backout link: https://hg.mozilla.org/integration/autoland/rev/4cc8e322f8832e379a9c7f48a29878cf0963d6b6
Log link: https://treeherder.mozilla.org/logviewer?job_id=418296116&repo=autoland&lineNumber=2291
This is caused by the pref change to network.http.http2.ping-threshold. It seems that changing this value could break something, so we need to be very careful. I'll revert the pref changes and file another bug to investigate whether we can adjust these values.
Comment 15•1 year ago
Comment 16•1 year ago
bugherder
Updated•1 year ago
Updated•1 year ago
Comment 17•1 year ago
Hi @ruihildt, can you please try our latest Firefox 116 build again and see if the issue still occurs on your end? My installed NordVPN keeps returning ".sock not found", and I can't reproduce the issue with the add-on version of NordVPN for Firefox.
Here is where you can find the Firefox 116 BETA build: https://www.mozilla.org/en-US/firefox/channel/desktop/
Please let us know if the issue still occurs on your end.
Reporter
Comment 18•1 year ago
(In reply to Rares Doghi, Desktop QA from comment #17)
Hi @ruihildt can you please try our latest Firefox 116 Build again and see if the issue still occurs on your end ? My installed NordVpn keeps returning .sock not found and I cant reproduce the issue with the Addon version of nordVPN for Firefox.
Here is where you can find the Firefox 116 BETA build: https://www.mozilla.org/en-US/firefox/channel/desktop/
Please let us know if the issue still occurs on your end.
You mention you used the VPN extension, but the error here is for system wide VPN and not the extension.
I tried with Firefox Beta 116.0b6 (using the Developer Edition) and I still have the same issue. Should I reopen the issue?
Comment 19•1 year ago
I tried using the system version of NordVPN, but it keeps returning ".sock not found". I tried uninstalling and reinstalling it, but it still won't work, which is why I tried the extension version. We need someone with a working system VPN installation to test this. @Kershaw Chang, can you please take a look? It seems that the issue still occurs in our latest Beta Developer Edition.
Assignee
Comment 20•1 year ago
(In reply to ruihildt from comment #18)
(In reply to Rares Doghi, Desktop QA from comment #17)
Hi @ruihildt can you please try our latest Firefox 116 Build again and see if the issue still occurs on your end ? My installed NordVpn keeps returning .sock not found and I cant reproduce the issue with the Addon version of nordVPN for Firefox.
Here is where you can find the Firefox 116 BETA build: https://www.mozilla.org/en-US/firefox/channel/desktop/
Please let us know if the issue still occurs on your end.
You mention you used the VPN extension, but the error here is for system wide VPN and not the extension.
I tried with Firefox Beta 116.0b6 (using the Developer Edition) and I still have the same issue. Should I reopen the issue?
Hi, could you try to record an HTTP log for this?
Please use the steps below:
- Start logging (make sure you select "Logging to a file").
- Load https://mozilla.org
- Change network status by VPN
- Load the same page again
- Stop logging
Thanks.
Reporter
Comment 21•1 year ago
Reporter
Comment 22•1 year ago
Reporter
Comment 23•1 year ago
Sorry, I'm not familiar with the interface and can't edit my previous comment.
Loading mozilla.org regularly: https://bugzilla.mozilla.org/attachment.cgi?id=9345740
Loading mozilla.org after disconnecting VPN: https://bugzilla.mozilla.org/attachment.cgi?id=9345741
Assignee
Comment 24•1 year ago
Hi reporter,
Thanks for the log.
However, it seems that the log is not complete. I only saw the HTTP request to load https://www.mozilla.org/en-US/ once.
Based on the steps in comment #0, there should be another request to load https://www.mozilla.org/en-US/ after the VPN change, but I didn't see it in the log.
Could you try to record a log again?
Thanks.
Reporter
Comment 25•1 year ago
Here it is.
Assignee
Comment 26•1 year ago
The basic concept here is to add a pending list to ConnectionEntry and to put connections into it when VerifyTraffic() is called.
By doing this, we will always create new connections after a network change event. The old connections, which might still be alive after the network change, are put into the pending list.
The connections in the pending list will keep working until their transactions are done.
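The idea can be illustrated with a minimal, self-contained sketch. This is not the actual patch: only the names ConnectionEntry and VerifyTraffic come from this comment, while the Connection struct, the other methods and the list handling are hypothetical. It also assumes that an old connection with no transactions in flight is simply dropped instead of being kept in the pending list.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for an HTTP/2 connection.
struct Connection {
  int id;
  int activeTransactions;
};

// Simplified model of the pending-list idea (not the real Gecko class).
class ConnectionEntry {
 public:
  // Dispatch a new request: reuse an active connection or create a new one.
  std::shared_ptr<Connection> GetConnectionForNewRequest() {
    if (mActive.empty()) {
      mActive.push_back(std::make_shared<Connection>(Connection{mNextId++, 0}));
    }
    std::shared_ptr<Connection> conn = mActive.front();
    ++conn->activeTransactions;
    return conn;
  }

  // Called after a network change: old connections stop taking new requests.
  // Those with transactions in flight move to the pending list (assumption:
  // idle ones are dropped right away).
  void VerifyTraffic() {
    for (const auto& conn : mActive) {
      if (conn->activeTransactions > 0) {
        mPending.push_back(conn);
      }
    }
    mActive.clear();
  }

  // A transaction finished; pending connections with no work left are released.
  void OnTransactionDone(const std::shared_ptr<Connection>& conn) {
    assert(conn->activeTransactions > 0);
    --conn->activeTransactions;
    mPending.erase(
        std::remove_if(mPending.begin(), mPending.end(),
                       [](const std::shared_ptr<Connection>& c) {
                         return c->activeTransactions == 0;
                       }),
        mPending.end());
  }

  std::size_t ActiveCount() const { return mActive.size(); }
  std::size_t PendingCount() const { return mPending.size(); }

 private:
  std::vector<std::shared_ptr<Connection>> mActive;
  std::vector<std::shared_ptr<Connection>> mPending;
  int mNextId = 0;
};
```

In this model, a request issued after VerifyTraffic() never lands on a connection that predates the network change, while transactions already running on an old connection are allowed to finish.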
Comment 27•1 year ago
Comment 28•1 year ago
bugherder
Comment 29•1 year ago
Hello! I have tried to reproduce the issue on Ubuntu 22.04 LTS with Firefox 114.0 and 90.0a1 (2021-04-20), but unfortunately I wasn't able to do so.
Ruihildt, could you please check whether the issue is fixed in the latest Nightly? Here is a link: https://www.mozilla.org/en-US/firefox/channel/desktop/
Reporter
Comment 30•1 year ago
(In reply to Negritas Sergiu, Desktop QA from comment #29)
Hello! I have tried to reproduce the issue on Ubuntu 22.04 LTS with Firefox 114.0, 90.0a1(2021-04-20) unfortunately I wasn't able to do so.
Ruihildt could you please take a look if the issue is fixed in the latest nightly? Here is a link: https://www.mozilla.org/en-US/firefox/channel/desktop/
It is not fixed in Firefox 121.0a1. Reopening the issue.
I have just tried on Ubuntu 22.04 LTS/Fedora 38 with Mullvad VPN, and on macOS 13.5 with both Mullvad VPN and NordVPN, with the same errors as always.
Assignee
Comment 31•1 year ago
Hi Reporter,
Could you try to flip the pref network.http.http2.move_to_pending_list_after_network_change to true and see if you can still reproduce this issue?
If yes, may I ask you to record an HTTP log again?
Thanks.
Reporter
Comment 32•1 year ago
(In reply to Kershaw Chang [:kershaw] from comment #31)
Hi Reporter,
Could you try to flip this pref network.http.http2.move_to_pending_list_after_network_change to true and see if you still can reproduce this issue?
If yes, may I ask you to record a http log again?
Thanks.
Happy to report that flipping the pref to true fixes the issue.
Assignee
Comment 33•1 year ago
Comment 34•1 year ago
Comment 35•1 year ago
bugherder
Comment 36•11 months ago
Kershaw, at which point can we say that it's safe to let the pref ride the trains to release?
Assignee
Comment 37•10 months ago
(In reply to Valentin Gosu [:valentin] (he/him) from comment #36)
Kershaw, at which point can we say that it's safe to let the pref ride the trains to release?
I think it's about time to ship this. I filed bug 1876045 for enabling it.
Updated•10 months ago
Comment 38•6 months ago
Set release status flags based on info from the regressing bug 1884349
Updated•6 months ago
Updated•6 months ago
Updated•6 months ago
Updated•6 months ago