Open Bug 1896216 Opened 1 year ago Updated 6 months ago

ns_error_net_interrupt error when accessing l-tike.com

Categories

(Core :: Networking, defect, P3)

defect

Tracking

()

People

(Reporter: mostlygeek, Unassigned, NeedInfo)

References

(Blocks 1 open bug)

Details

(Whiteboard: [necko-triaged])

Attachments

(5 files)

Attached image successful result

Context:

  • On the Japanese site: https://l-tike.com/st1/ghibli-pk-en4/sitetop, looking to book Ghibli Park tickets
  • The site was under high load since ticket booking just opened. It's working now (low load).
  • Using Firefox v125.0.3 (OSX)
  • Going through the ordering forms, got warnings that the site was not secure
  • On the console, saw lots of ns_error_net_interrupt.
  • Clicking the [Try Again] button on the Security Warning page, which repeated the POST request, would eventually work. Each additional step (POST) brought the warnings back. It happened often enough that I couldn't complete the whole process with Firefox.
  • It worked on my iPhone: no security or network errors, normal latency, and no retries.

This is a difficult problem to replicate. It appears l-tike.com has some issues under load and Firefox does not handle the connection issues gracefully. I've attached some screenshots with information from the console. I also tried it with curl and the issue could be reproduced. I attached the CLI to this bug as a text file.

When I tested with copy as cURL I got this error:

> Upgrade-Insecure-Requests: 1
> Sec-Fetch-Dest: document
> Sec-Fetch-Mode: navigate
> Sec-Fetch-Site: same-origin
> Sec-Fetch-User: ?1
>
* HTTP/2 stream 1 was not closed cleanly: INTERNAL_ERROR (err 2)
* Connection #0 to host l-tike.com left intact
curl: (92) HTTP/2 stream 1 was not closed cleanly: INTERNAL_ERROR (err 2)

I have the full dump, which contains some cookies that I can send privately.

Severity: -- → S3
Priority: -- → P3
Whiteboard: [necko-triaged][necko-priority-new]
  • idea:
    • retry the channel once if it is a GET request and it failed with a reset or another recoverable error.
    • check what safari is doing?
    • see if any of the headers we're setting might be causing the failure (with copy as curl?)
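
The first idea above can be sketched as a small policy function. This is only an illustration in Python, not Necko code: the Result type, the send callback, and the set of errors treated as recoverable are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Assumption: which failures count as "recoverable" is not settled in this bug.
RECOVERABLE_ERRORS = {"NS_ERROR_NET_RESET", "NS_ERROR_NET_INTERRUPT"}


@dataclass
class Result:
    """Hypothetical outcome of one attempt to load a channel."""
    ok: bool
    error: Optional[str] = None


def load_with_single_retry(method: str, send: Callable[[], Result]) -> Result:
    """Run send() and repeat it at most once for a failed GET."""
    first = send()
    if first.ok:
        return first
    # Only GET is retried: replaying a POST could submit a form twice.
    if method.upper() == "GET" and first.error in RECOVERABLE_ERRORS:
        return send()
    return first
```

Note that the failing requests in the original report were POSTs, so a GET-only retry would not have covered them.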

(In reply to Benson Wong [:mostlygeek] from comment #3)

> I have the full dump, which contains some cookies that I can send privately.

What kind of dump is it? The curl output?
Is the request setting any special headers? (apart from cookies?)

Flags: needinfo?(bwong)

Attaching the curl dump. I reviewed it a bit closer and doesn’t appear to contain any private info as the site did not require a log in.

Flags: needinfo?(bwong)

It's not easy to figure out why it's failing.
The curl command does close with INTERNAL_ERROR, but in Firefox the website redirects.

I think we still need more investigation to confirm if it is a firefox issue.
Moving it out of priority new and triaged list.

Whiteboard: [necko-triaged][necko-priority-new]

It's hard to tell if there's a real problem here. Most likely the connection got reset during the TLS handshake.
Without a reproducible test case I don't think we can do much about it.

Whiteboard: [necko-triaged]

Is this bug related to HTTP3?
Mentioned here: https://www.reddit.com/r/computerhelp/comments/18cea3i/network_problems_on_specific_sites_that_are_fixed/
and here https://stackoverflow.com/questions/76370134/ns-error-net-interrupt-in-firefox-but-not-in-chrome

I had this problem a year ago and I don't remember how I fixed it.
Now I have it again; I disabled HTTP/3 and now it works.

On the server side it's nginx/1.25.5.

Forgot to add: Firefox Beta 128.0b4

The same issue has been occurring on my website:

https://mithras.vincejv.com -> HTTP/3 website
https://legacy.vincejv.com -> HTTP/2 only domain

There is a link on mithras.vincejv.com pointing to legacy. Sometimes, clicking that link from the mithras subdomain produces the error.

I seem to notice this behavior more frequently since I enabled OCSP Stapling and OCSP Must-Staple on nginx. I'm using the latest mainline of the freenginx fork, freenginx-1.27.2, with openssl-3.3.1 statically linked. I can't reproduce it at all in Chrome.

Using stable Firefox 128.0.3

Hi all, I see the same issue - a small percentage of requests ending with NS_ERROR_NET_INTERRUPT. Some details:

I maintain several web sites, and my users started to see these errors after I moved these sites behind haproxy (I was using Linux+IPVS as a load balancer in front of Apache HTTPD back-ends before). So it apparently correlates with clients starting to use newer protocols than HTTP/1.1.

My Haproxy is 2.9.7 compiled with quictls 3.1.5 (https://github.com/quictls/openssl) on AlmaLinux 9.

Neither I nor my users were able to reproduce this in Chrome. It happens only with Firefox (Linux and Windows) for us.

The easiest way for us to reproduce this is to repeatedly click the logo on our title page, which reloads the page. The error happens after a while (once in several tens of reloads). The title page is about 15 KB and is sent with a Cache-Control: no-cache header, so it is reloaded every time a user accesses its location.

When I disable HTTP/3 (either by unbinding UDP port 443 and removing the alt-svc header in haproxy, or by using haproxy compiled against the system openssl from AlmaLinux, which does not support HTTP/3 at all), it works, or at least I was not able to reproduce this.

When the error occurs, there is no line added to the haproxy log, so it dies very early in the request.

So it might be some H3 protocol corner case. How can I help to debug this? H3 is all-encrypted, so packet captures with tcpdump are not very useful.

Reading the whole thread again, I would like to add more clues:

I also use OCSP stapling (with a letsencrypt.org certificate),
I have Firefox 127.0.2 on Fedora Linux 40.

Thanks!

Hi Jan,

Could you try to capture an HTTP log?
In about:logging, please set the logging preset to HTTP/3 and select logging to a file. Then you could send the file to necko@mozilla.com.

Thanks.

Flags: needinfo?(kas)

Hi Kershaw,

log file sent as requested (from a newly created Firefox session).

It looks like there are no private data there (maybe except the session cookie), so I also put the log here:

https://www.fi.muni.cz/~kas/tmp/h3-log.txt

(It will get deleted in a month or so)

-Yenya

Flags: needinfo?(kas)

When testing this, I noticed that Firefox sometimes switches to HTTP/2 and does not make further requests over HTTP/3, even though there is an alt-svc header in all responses of that site pointing to HTTP/3. Can I reset this somehow? It would make testing a bit easier. Thanks!

You can open the browser console with Ctrl-Shift-J and execute Services.obs.notifyObservers(null, "network:reset-http3-excluded-list"); to clear the HTTP/3 exclusion list. You might need to set devtools.chrome.enabled to true in about:config to be able to execute code in the Browser console.

(In reply to Kershaw Chang [:kershaw] from comment #15)

> Could you try to capture an HTTP log?
> In about:logging, please set the logging preset to HTTP/3 and select logging to a file. Then you could send the file to necko@mozilla.com.

Hi Kershaw,

was the log file I sent to necko@m.o sufficient? Do you need more logs?

It is hard to say, but it seems I am able to reproduce this problem only on some sites I have recently moved behind haproxy, but apparently not on all of them. So it might as well be a problem of haproxy or quictls. But other reports here say that they see the problem on freenginx and openssl-3.3, so it might be a problem of Firefox after all.

-Yenya

Valentin Gosu: How can I enter something to the Ctrl-Shift-J pop-up window? There seems to be only one input field for filtering messages. And in the ordinary per-page console (F12 -> Console) the above command does not work, as there is no object named Services in its context. Thanks in advance for clarification.

(In reply to Jan Kasprzak from comment #20)

> Valentin Gosu: How can I enter something to the Ctrl-Shift-J pop-up window? There seems to be only one input field for filtering messages. And in the ordinary per-page console (F12 -> Console) the above command does not work, as there is no object named Services in its context. Thanks in advance for clarification.

I think you need to set the devtools.chrome.enabled pref to true in about:config

Valentin Gosu: thanks!

So, did anybody manage to examine the logs I sent? Thanks!

Ping? Is there anything new regarding this problem?

Thank you for the ping.
I do see the stream being reset in the logs:

2024-08-06 11:12:53.558863 UTC - [Parent 860623: Socket Thread]: V/nsHttp Http3Session::ProcessInput writer=7f31a85282e0 [this=7f319e57c500 state=2]
2024-08-06 11:12:53.558925 UTC - [Parent 860623: Socket Thread]: I/neqo_http3::* [neqo_http3::connection] [Http3 connection] Handle a stream reset stream_id=32 app_err=0
2024-08-06 11:12:53.558938 UTC - [Parent 860623: Socket Thread]: V/nsHttp Http3Session::ProcessInput received=31
2024-08-06 11:12:53.558944 UTC - [Parent 860623: Socket Thread]: V/nsHttp Http3Session::ProcessEvents [this=7f319e57c500]
2024-08-06 11:12:53.558949 UTC - [Parent 860623: Socket Thread]: V/nsHttp Http3Session::ProcessEvents 7f319e57c500 - Reset
2024-08-06 11:12:53.558958 UTC - [Parent 860623: Socket Thread]: I/neqo_http3::* [neqo_http3::connection_client] [Http3 client] reset_stream 32 error=268.
2024-08-06 11:12:53.558964 UTC - [Parent 860623: Socket Thread]: I/neqo_http3::* [neqo_http3::connection] [Http3 connection] cancel_fetch 32 error=268.
2024-08-06 11:12:53.558971 UTC - [Parent 860623: Socket Thread]: D/nsHttp nsHttpTransaction::Close [this=7f31af3b5000 reason=804b0047]
2024-08-06 11:12:53.558984 UTC - [Parent 860623: Socket Thread]: V/nsHttp nsHttpConnectionMgr::RemoveActiveTransaction t=7f31af3b5000 tabid=10(1) thr=0
2024-08-06 11:12:53.558989 UTC - [Parent 860623: Socket Thread]: V/nsHttp Active transactions -[0,0,0,0]
2024-08-06 11:12:53.559019 UTC - [Parent 860623: Socket Thread]: D/nsHttp nsHttpTransaction::ShouldRestartOn0RttError [this=7f31af3b5000, mEarlyDataWasAvailable=0 error=804b0047]
2024-08-06 11:12:53.559026 UTC - [Parent 860623: Socket Thread]: V/nsHttp HttpTrafficAnalyzer::AccumulateHttpTransferredSize [Y1_N1] rb=0 sb=0 [this=7f31cc39752d]
2024-08-06 11:12:53.559034 UTC - [Parent 860623: Socket Thread]: D/nsHttp nsHttpTransaction::RemoveDispatchedAsBlocking this=7f31af3b5000 not blocking
2024-08-06 11:12:53.559041 UTC - [Parent 860623: Socket Thread]: D/nsHttp nsHttpTransaction 7f31af3b5000 request context set to null in ReleaseBlockingTransaction() - was 7f319d3ae430
2024-08-06 11:12:53.559061 UTC - [Parent 860623: Socket Thread]: I/nsHttp Http3Session::CloseStreamInternal 7f319e57c500 7f319dc21dd0 0x804b0047
2024-08-06 11:12:53.559073 UTC - [Parent 860623: Socket Thread]: V/nsHttp Http3Session::SendData [this=7f319e57c500]
2024-08-06 11:12:53.559079 UTC - [Parent 860623: Socket Thread]: V/nsHttp Http3Session::ProcessOutput reader=7f31a85282e0, [this=7f319e57c500]
2024-08-06 11:12:53.559473 UTC - [Parent 860623: Main Thread]: D/nsHttp nsHttpChannel::OnStartRequest [this=7f31a9e3ba00 request=7f319cb56e80 status=804b0047]

error=268 corresponds to Self::HttpRequestCancelled => 0x10c.
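
To make the mapping explicit: 268 in decimal is 0x10c, which RFC 9114 names H3_REQUEST_CANCELLED, and the close reason 804b0047 is the nsresult NS_ERROR_NET_INTERRUPT. A short Python sketch for pulling both kinds of code out of log lines like the ones above follows. The regular expressions are assumptions based only on the excerpt in this comment, and the nsresult table lists just the two values mentioned in this bug.

```python
import re

# HTTP/3 application error codes from RFC 9114, section 8.1.
H3_ERRORS = {
    0x100: "H3_NO_ERROR",
    0x101: "H3_GENERAL_PROTOCOL_ERROR",
    0x102: "H3_INTERNAL_ERROR",
    0x103: "H3_STREAM_CREATION_ERROR",
    0x104: "H3_CLOSED_CRITICAL_STREAM",
    0x105: "H3_FRAME_UNEXPECTED",
    0x106: "H3_FRAME_ERROR",
    0x107: "H3_EXCESSIVE_LOAD",
    0x108: "H3_ID_ERROR",
    0x109: "H3_SETTINGS_ERROR",
    0x10A: "H3_MISSING_SETTINGS",
    0x10B: "H3_REQUEST_REJECTED",
    0x10C: "H3_REQUEST_CANCELLED",
    0x10D: "H3_REQUEST_INCOMPLETE",
    0x10E: "H3_MESSAGE_ERROR",
    0x10F: "H3_CONNECT_ERROR",
    0x110: "H3_VERSION_FALLBACK",
}

# Deliberately partial: only the nsresult values that appear in this bug.
NSRESULTS = {
    0x804B0014: "NS_ERROR_NET_RESET",
    0x804B0047: "NS_ERROR_NET_INTERRUPT",
}

# Assumed patterns: decimal "error=268." and 8-digit hex "reason=804b0047".
H3_RE = re.compile(r"error=(\d+)\b")
NS_RE = re.compile(r"(?:reason|status)=([0-9a-f]{8})\b")


def describe(line: str) -> list:
    """Return human-readable notes for the error codes found in one log line."""
    notes = []
    for match in H3_RE.finditer(line):
        code = int(match.group(1))
        notes.append(f"{code} = {code:#x} = {H3_ERRORS.get(code, 'unknown')}")
    for match in NS_RE.finditer(line):
        code = int(match.group(1), 16)
        notes.append(f"{code:#010x} = {NSRESULTS.get(code, 'unknown')}")
    return notes
```

Calling describe() on each line of a saved log and printing the non-empty results gives a quick summary of which streams were reset and how the transactions were closed.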

Max, when you get a chance could you take a look at whether this is a recoverable error? I'm looking at this comment here and wondering if it's correct to cancel the stream with NS_ERROR_NET_INTERRUPT

Apart from the HTTP/3 error we're seeing here, what I noticed in the logs is that the network request is actually for a cache revalidation.
I think it would be a much better user experience if cache revalidation channel errors would be treated the same as 304 Not modified. I'll file a different bug to investigate this approach.
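
A rough sketch of that idea, in Python rather than Necko code: if the revalidation request fails at the network level, serve the cached copy just as a 304 would. The function and parameter names are hypothetical, and whether this is safe for every response is exactly what the follow-up bug would need to investigate.

```python
from typing import Optional, Tuple


def finish_revalidation(
    cached_body: bytes,
    status: Optional[int],
    network_error: Optional[str],
    fresh_body: Optional[bytes] = None,
) -> Tuple[bytes, str]:
    """Pick the body to show after a conditional request.

    status is the HTTP status if a response arrived, otherwise None.
    network_error is an error name if the channel failed, otherwise None.
    """
    if network_error is not None:
        # Proposed behavior: a failed revalidation falls back to the cache.
        return cached_body, "cache (revalidation failed)"
    if status == 304:
        # Existing HTTP semantics: Not Modified means the cached copy is valid.
        return cached_body, "cache (304)"
    return (fresh_body if fresh_body is not None else b""), "network"
```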

Blocks: QUIC
Flags: needinfo?(mail)

So, it seems part of the problem is that the remote end (haproxy in my case) resets the connection after some timeout? I am not very familiar with H3/QUIC, but that can probably happen in normal traffic, can't it?

I am asking whether there can be a problem also on haproxy side. Thanks!

(In reply to Valentin Gosu [:valentin] (he/him) from comment #24)

> Max, when you get a chance could you take a look at whether this is a recoverable error? I'm looking at this comment here and wondering if it's correct to cancel the stream with NS_ERROR_NET_INTERRUPT

Are you suggesting a change along the lines of:

modified   netwerk/protocol/http/Http3Session.cpp
@@ -1451,7 +1451,7 @@ void Http3Session::ResetOrStopSendingRecvd(uint64_t aStreamId, uint64_t aError,
     httpStream->Transaction()->DisableHttp3(false);
     httpStream->Transaction()->DisableSpdy();
     CloseStream(stream, NS_ERROR_NET_RESET);
-  } else if (aError == HTTP3_APP_ERROR_REQUEST_REJECTED) {
+  } else if (aError == HTTP3_APP_ERROR_REQUEST_REJECTED || aError == HTTP3_APP_ERROR_REQUEST_CANCELLED) {
     // This request was rejected because server is probably busy or going away.
     // We can restart the request using alt-svc. Without calling
     // DoNotRemoveAltSvc the alt-svc route will be removed.

As far as I can tell, HttpRequestCancelled is indeed recoverable; see e.g. the cancellation of an already cancelled push stream, or this unit test, which handles a HttpRequestCancelled event but asserts that the connection stays alive.

See also RFC9114 H3_REQUEST_CANCELLED.
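
The effect of the one-line change can be modelled outside the tree. The Python sketch below is not the Necko implementation: it covers only the else-if branch visible in the diff, the action labels are made up for illustration, and the fallthrough result is inferred from the log earlier in this bug, where a reset with error 268 ended in 0x804b0047.

```python
# Values from RFC 9114, section 8.1.
H3_REQUEST_REJECTED = 0x10B
H3_REQUEST_CANCELLED = 0x10C


def action_on_stream_reset(app_error: int, patched: bool) -> str:
    """Return a label for what the modelled branch does with app_error."""
    restartable = {H3_REQUEST_REJECTED}
    if patched:
        # The proposed change adds the cancelled code to the same branch.
        restartable.add(H3_REQUEST_CANCELLED)
    if app_error in restartable:
        return "restart request"
    # Inferred from the log in this bug, not from the diff itself.
    return "close with NS_ERROR_NET_INTERRUPT"
```

With patched set to False, error 268 falls through to the interrupt; with patched set to True, it takes the same restart path as a rejected request.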

Flags: needinfo?(mail) → needinfo?(valentin.gosu)

I think that seems right but I would defer to Kershaw regarding this change.

Flags: needinfo?(valentin.gosu)
Attached file log.txt
Hi, I have the same (?) issue with Firefox and invoiceninja behind Caddy with HTTP/3. The app generates and displays a PDF on Chromium-based browsers but fails with Firefox.

I can confirm that similar behavior happened on Nightly 140.0a1 (2025-05-05). I got an NS_ERROR_NET_INTERRUPT under HTTP/3 while trying to book an airline ticket. Disabling HTTP/3 via about:config (falling back to HTTP/2) didn’t fix the problem.

The same site loads perfectly under Stable 138.0.1, so I suggest this might be a regression.

Do you mind running mozregression to figure out when the regression was introduced?
https://mozilla.github.io/mozregression/quickstart.html

Flags: needinfo?(poren)

@valentin Ah! Should’ve done that.

I couldn’t reproduce the issue on any of the Nightly or mozregression builds today—their servers aren’t producing the crashing RST_STREAM reliably now. I will see if the problem reemerges in the next few days.
