Closed Bug 1385116 Opened 8 years ago Closed 8 years ago

Firefox doesn't show error page when server doesn't respond.

Categories

(Core :: Networking, defect)

x86_64
Unspecified
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox-esr52 --- unaffected
firefox55 --- unaffected
firefox56 --- disabled
firefox57 --- fixed

People

(Reporter: tcampbell, Assigned: dragana)

References

Details

(Keywords: regression, Whiteboard: [necko-active])

Attachments

(5 files)

STR:
- Start 64-bit nightly (Win10, maybe Linux)
- Try to navigate to 0.0.0.0

Expected:
- Error page saying server unreachable

Actual:
- URL restores to original address and no navigation occurs

Regression range:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=235ab635d17254e70b629bfe106334442a9a728f&tochange=211800000a636c336b90b1c2712319448259b4e0
Does this look related to Bug 1377004?
Flags: needinfo?(dd.mozilla)
This only occurs on 64-bit. The 32-bit builds are unaffected.
Blocks: 1377004
Keywords: regression
Hardware: Unspecified → x86_64
I think this is the reason for Bug 1384957, which I am experiencing.
Blocks: 1384957
I also see 64-bit only network issues in devtools network reporting starting roughly a week later. Could this somehow be related?
See Also: → 1384679
Sorry, I just looked at Bug 1377004 and see that it was temporary and has since been reverted. I'll try to update the regression ranges.
Flags: needinfo?(dd.mozilla)
Second regression range is https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=a897a38aad5c4b2dbe7d880d13bf39d079b734a8&tochange=bdcd3e78b4c051dd383a26ea4172b0ccdb62feaf which is also more TCP FastOpen experimentation. These TFO experiments both break gmail for me when I switch networks and sleep. Hopefully this is resolved once the backout lands.
Can you make me an HTTP log? See https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
I would really like to fix this. Thank you.
Assignee: nobody → dd.mozilla
Status: NEW → ASSIGNED
Whiteboard: [necko-active]
Attached file HTTPLog_BadServer.zip
Here is a full HTTP log using |timestamp,rotate:200,nsHttp:5,nsSocketTransport:5,nsStreamPump:5,nsHostResolver:5|. In the trace I open a fresh nightly 2017-07-27 and attempt to navigate to |0.0.0.0|. I have forced FastOpen on (which should already be the default in that nightly).
I also tried to get a regression range with FastOpen forced on and got the following.

./mach mozregression --good 2017-01-01 --bits 64 --pref network.tcp.tcp_fastopen_enable:true

Regression range:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=7c902365910c2b0b4e0d47335a78e55bd4102eb3&tochange=352f22e37dc37d0a3f613a5498f90e9825f6ec60
Can you make another log with an additional parameter:

cd c:\
set MOZ_LOG=timestamp,nsHttp:5,nsSocketTransport:5,nsStreamPump:5,nsHostResolver:5
set MOZ_LOG_FILE=%TEMP%\log.txt
set NSPR_LOG_MODULES=io:5,sync
set NSPR_LOG_FILE=%TEMP%\nsprlog.txt
cd "Program Files\Mozilla Firefox"
.\firefox.exe

I need to see which error Windows is returning. Thank you.
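For anyone scripting this, the environment setup above can be sketched in Python. This is only an illustration: the helper name `build_log_env` and its parameters are made up for this example, and only the four variable names and the module lists come from the commands above. The helper rejects identical file names because the MOZ_LOG and NSPR logs have to be written to different files.

```python
import os


def build_log_env(moz_log_file, nspr_log_file, base_env=None):
    """Return a copy of the environment with the logging variables set.

    build_log_env is a hypothetical helper; the variable names and
    module lists are the ones requested in this bug.
    """
    # MOZ_LOG and NSPR logging must be directed to different files.
    same_file = (
        os.path.normcase(os.path.abspath(moz_log_file))
        == os.path.normcase(os.path.abspath(nspr_log_file))
    )
    if same_file:
        raise ValueError("MOZ_LOG_FILE and NSPR_LOG_FILE must differ")

    # Copy so the caller's environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env.update({
        "MOZ_LOG": "timestamp,nsHttp:5,nsSocketTransport:5,"
                   "nsStreamPump:5,nsHostResolver:5",
        "MOZ_LOG_FILE": moz_log_file,
        "NSPR_LOG_MODULES": "io:5,sync",
        "NSPR_LOG_FILE": nspr_log_file,
    })
    return env
```

The returned dictionary could then be passed as the `env` argument of `subprocess.Popen` when starting firefox.exe from its install directory.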
Flags: needinfo?(tcampbell)
Attached file Network Logs (64-bit)
Here is the MOZ_LOG + NSPR_LOG you requested. This is run on Nightly 2017-07-29 (64-bit).
Flags: needinfo?(tcampbell)
Additionally, this is the same experiment for 32-bit nightly which works correctly and doesn't show the bug.
The NSPR log and the MOZ_LOG must have different file names. I see only moz_log_64.txt (and the child process logs).
Attached file NSPR Logs (32+64-bit)
Oops. Here they are.
With https://hg.mozilla.org/mozilla-central/rev/3e01e416c117 landed to turn off TFO, the issue no longer affects FF56. Leaving bug open since TFO will eventually be turned back on and this issue will reoccur.
Sorry for the silence. Around the same time you submitted this bug, we turned on some new tests on our test server for Windows 10 and they are failing. I figured out that they are failing for the same reason as in your case. I am trying to debug the issue on the test server, and I will post here when I have an explanation and a fix.
Thanks for the update :) These bugs aren't blocking anything for me, I just wanted to make sure regressions were tracked for when TFO is eventually shipped by default. (FYI: Issues seem to happen with e10s disabled, if that makes things easier for you to debug)
(In reply to Ted Campbell [:tcampbell] from comment #17)
> Thanks for the update :)
>
> These bugs aren't blocking anything for me, I just wanted to make sure
> regressions were tracked for when TFO is eventually shipped by default.
>
> (FYI: Issues seem to happen with e10s disabled, if that makes things easier
> for you to debug)

Thanks. I want to fix all of these TFO bugs. I am keeping an eye on all of them, and the feature will not be shipped before they are fixed.

In the last 10 minutes I figured out what the problem is. It is actually my bug :( I did test that exact case on 2 Windows computers and it worked. I tested it because a connection getting refused is an obvious error that can happen. On my 2 computers, getsockopt with SO_ERROR returned the correct error, ERROR_CONNECTION_REFUSED (number 1225), and everything worked fine. I am 100% sure I have tested that case 100s of times...

In your case, and in the case of our test server, it returns no error, which we translate into a generic error (this is not really helpful :( ), and only by querying the Overlapped structure do we get the correct error, ERROR_CONNECTION_REFUSED.
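The failure mode described above can be sketched as a small decision function. This is a simplified Python model, not the NSPR code: `pick_connect_error` and its arguments are hypothetical names, and the only value taken from this bug is ERROR_CONNECTION_REFUSED (1225). It shows the idea of falling back to the overlapped result when SO_ERROR reports no error.

```python
ERROR_SUCCESS = 0
ERROR_CONNECTION_REFUSED = 1225  # value quoted in this bug


def pick_connect_error(so_error, overlapped_error):
    """Choose which error code to report after a connect attempt.

    Hypothetical model: so_error stands for what getsockopt(SO_ERROR)
    reported, overlapped_error for the error obtained by querying the
    OVERLAPPED structure. 0 means "no error" in both cases.
    """
    # On some machines SO_ERROR already carries the real error.
    if so_error != ERROR_SUCCESS:
        return so_error
    # Otherwise fall back to the overlapped result, which may still
    # hold the real failure (for example 1225) or 0 if there was none.
    return overlapped_error
```

With this model, the two working machines correspond to `pick_connect_error(1225, 0)` and the failing ones to `pick_connect_error(0, 1225)`; both report 1225 instead of a generic error.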
Hi Ted, can you try Firefox from this link: https://queue.taskcluster.net/v1/task/D6nQmHBbRMeObNpZo-uh6Q/runs/0/artifacts/public/build/target.zip
You do not need to install it; just unpack it and run firefox.exe. This build includes the patch from bug 1386719. I think that patch should resolve your problem too. Thank you!
Flags: needinfo?(tcampbell)
Attached file Logs for test build
Here's output when I run your experiment build. I included console output as well.
Flags: needinfo?(tcampbell)
Just confirming that this problem only occurs on Windows for me. As a result, this seems unrelated to Bug 1384679.
I will need to extend the patch from bug 1386719 to deal with other non-WSA errors. (Background: ConnectEx returns non-WSA error codes from the WSAGetLastError function.)
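As a rough illustration of what handling non-WSA errors could involve, here is a Python sketch that maps a few Win32 system error codes onto their Winsock counterparts. The table and the name `to_wsa_error` are assumptions made for this example, not the contents of the patch in bug 1386719; the numeric values are the standard Windows constants.

```python
# Win32 system error code -> Winsock error code.
# Illustrative subset only; which codes the real patch handles is not
# stated in this bug.
NON_WSA_TO_WSA = {
    1225: 10061,  # ERROR_CONNECTION_REFUSED  -> WSAECONNREFUSED
    1231: 10051,  # ERROR_NETWORK_UNREACHABLE -> WSAENETUNREACH
    1232: 10065,  # ERROR_HOST_UNREACHABLE    -> WSAEHOSTUNREACH
}


def to_wsa_error(code):
    """Translate a non-WSA code if it is known, else return it unchanged.

    to_wsa_error is a hypothetical helper used only for illustration.
    """
    return NON_WSA_TO_WSA.get(code, code)
```

Codes that are already Winsock errors, or that are not in the table, pass through unchanged, so callers can apply the translation unconditionally.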
https://treeherder.mozilla.org/#/jobs?repo=try&revision=4e5ba1e3925bf81c9c0d4ea62f460915e7286326

Can you try this build: https://queue.taskcluster.net/v1/task/NeoweOPIT7WQGv5c2j-bPA/runs/0/artifacts/public/build/target.zip

Can you try all the scenarios that are failing (this bug and bug 1384957)? If they still fail, can you post the NSPR log (it should not contain any private info), or at least look for the line "SocketConnectContinue GetOverlappedResult failed"? I am interested in the number at the end of that line. Thank you!
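Pulling that number out of a large NSPR log can be automated. The Python sketch below assumes the matching lines end with a decimal integer right after the message; the exact log format is not shown in this bug, so the pattern and the function name `overlapped_error_codes` are assumptions.

```python
import re

# Assumed shape: "<prefix>SocketConnectContinue GetOverlappedResult
# failed<non-digits><number>" with the number at the end of the line.
_FAILED_LINE = re.compile(
    r"SocketConnectContinue GetOverlappedResult failed\D*?(-?\d+)\s*$"
)


def overlapped_error_codes(lines):
    """Return the trailing numbers from all matching log lines, in order."""
    codes = []
    for line in lines:
        match = _FAILED_LINE.search(line)
        if match:
            codes.append(int(match.group(1)))
    return codes
```

Because an open file iterates line by line, the function can be called directly on the log, for example `overlapped_error_codes(open(path))` with `path` pointing at the file named in NSPR_LOG_FILE.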
Flags: needinfo?(tcampbell)
Please check whether TFO is turned on.
This fixes the problem in this bug for me! :) I'll check bug 1384957 this week. I didn't have simpler steps than switching between my home and office wifi networks.
Flags: needinfo?(tcampbell)
(In reply to Ted Campbell [:tcampbell] from comment #25)
> This fixes the problem in this bug for me! :)
>
> I'll check bug 1384957 this week. I didn't have simpler steps than switching
> between my home and office wifi networks.

Thank you, I will wait for you to test.
(In reply to Ted Campbell [:tcampbell] from comment #25)
> This fixes the problem in this bug for me! :)
>
> I'll check bug 1384957 this week. I didn't have simpler steps than switching
> between my home and office wifi networks.

Have you had time to try it?
Flags: needinfo?(tcampbell)
I've not been able to reproduce Bug 1384957 at all, even with older nightlies. So, let's ignore it for now as it may be unrelated to TFO and/or may have been fixed in Bug 1380896.
Flags: needinfo?(tcampbell)
(In reply to Ted Campbell [:tcampbell] from comment #28)
> I've not been able to reproduce Bug 1384957 at all, even with older
> nightlies. So, let's ignore it for now as it may be unrelated to TFO and/or
> may have been fixed in Bug 1380896.

It can also be that your network provider has updated or fixed its network. TFO is a new TCP feature that has not been used much until now (currently it is mostly used on mobile). The error you saw is probably the network refusing TCP packets with TFO. We had a bug in the NSPR code that was not interpreting this correctly. This is fixed by bug 1380896, and the proper fix should land in NSPR in bug 1386719.
From the log I am sure that this is fixed by bug 1386719. I will close the bug. If the problem reappears, please reopen the bug.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Depends on: 1386719
Target Milestone: --- → mozilla57