Closed Bug 1980812 Opened 2 months ago Closed 29 days ago

Websites take way too long to load

Categories

(Core :: Networking, defect, P2)

Firefox 141
Desktop
Windows 11
defect

Tracking

()

RESOLVED FIXED
144 Branch
Performance Impact medium
Tracking Status
firefox-esr115 --- unaffected
firefox-esr140 143+ fixed
firefox143 + affected
firefox144 + fixed

People

(Reporter: anandindraganti, Assigned: kershaw)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [necko-triaged][necko-priority-queue])

Attachments

(3 files, 1 obsolete file)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:141.0) Gecko/20100101 Firefox/141.0

Steps to reproduce:

I opened one of the websites.

Actual results:

It took way too long to open/load, while all other browsers on my PC did it within 5 seconds or less.

Expected results:

It should've performed on par with other browsers as well.
Firefox Profiler network capture: https://share.firefox.dev/45iPyuj

The Bugbug bot thinks this bug should belong to the 'Core::Networking' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Networking
Product: Firefox → Core
Type: enhancement → defect
OS: Unspecified → Windows 11
Hardware: Unspecified → Desktop
Summary: Webites take way too long to load → Websites take way too long to load

Looking at the profile, it reminds me of bug 1929479 and bug 1939941, where we try to request a large number of resources within a short period of time.
We need to verify this once we have fixed the other bugs.

See Also: → 1929479, 1939941
Severity: -- → S3
Priority: -- → P2
Whiteboard: [necko-triaged]

(In reply to Sunil Mayya from comment #2)

So, is there any solution available?

Performance Impact: --- → ?

One improvement we just landed that will improve performance on pages with a large number of requests is part of bug 1976964, specifically this one, https://phabricator.services.mozilla.com/D257991

However, looking at this particular profile, the requests all appear to be HTTP/1.1, where we limit the number of connections to a given host to 6.
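
To illustrate why the 6-connection limit matters on request-heavy pages, here is a rough sketch. It assumes each HTTP/1.1 connection serves one request at a time and that every request takes about the same time; the function name and the request counts are made up for illustration and do not come from the profile.

```python
import math


def min_request_waves(num_requests: int, max_connections: int = 6) -> int:
    """Lower bound on the number of sequential 'waves' of requests.

    Assumption: one in-flight request per connection (no pipelining),
    so at most `max_connections` requests to a host run in parallel.
    """
    if num_requests < 0 or max_connections <= 0:
        raise ValueError("num_requests must be >= 0 and max_connections > 0")
    return math.ceil(num_requests / max_connections)


# Hypothetical page with 60 resources on one host:
print(min_request_waves(60))  # 10 waves with the default limit of 6
```

With HTTP/2, requests are multiplexed over one connection, so this per-host queueing does not apply in the same way.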

Anand - can you share the name of the site so we can take a closer look?

Flags: needinfo?(anandindraganti)

(In reply to Andrew Creskey [:acreskey] from comment #4)

Hey, thanks for responding.
I have been facing this issue with multiple websites and at random times, so I can't pinpoint a single website. What I can say is that, looking at the Network tab while a website is loading, most of the slow requests are stuck in one of three states: 1) DNS resolution, 2) connecting, 3) blocked.
Any new website that I haven't visited before won't load until I click refresh.
I have made a help post in r/firefox on Reddit as well. You can see the slow loading in the video there.
https://www.reddit.com/r/firefox/comments/1mfswo8/firefox_feels_unusually_slow_for_me_and_i_cant/

Flags: needinfo?(anandindraganti)

Thanks Anand.

One step that could help pinpoint the problem is a capture with logs obtained via about:logging

As this can capture a lot of personal information, please only do so from a new profile loading a site without credentials, etc.

Go to about:logging
Press "Start Logging"
Navigate until you encounter the problem.
Press "Stop Logging"

A new performance profile with logs will appear in a new tab.
Press "Upload Local Profile" and check "Include additional data that may be identifiable".
Then share the URL.

(In reply to Andrew Creskey [:acreskey] from comment #6)

Here you go.
https://share.firefox.dev/45j2YGr

Thanks -- we might have enough information here to figure this out.

But if you can, one more capture would be helpful, this time with DNS over HTTPS and uBlock Origin disabled or removed.
Other than uBlock Origin and DNS over HTTPS, are you on a VPN or a corporate network?

https://www.instagram.com/ is a perfect test.

Some findings:
Looking at this problem resource:
Load 10: https://static.cdninstagram.com/rsrc.php/v4/y5/r/cKfMRAHc07D.js
Interestingly, the profiler says that this resource loads over HTTP/1.1, which should not happen.

We've already made a successful connection to static.cdninstagram.com,
and yet I'm not seeing any signs of connection reuse.

Flags: needinfo?(anandindraganti)

(In reply to Andrew Creskey [:acreskey] from comment #8)

Hello again!
The profile below was captured with both DNS over HTTPS and uBlock Origin disabled. And no, I'm not on a VPN or a corporate network, just my home Wi-Fi.
However, Instagram seemingly loaded really quickly this time, while a different site had trouble. As I said, it's almost like random websites at random times.
https://share.firefox.dev/45DVqiT

Flags: needinfo?(anandindraganti)

(In reply to Andrew Creskey [:acreskey] from comment #8)

Hello, any updates yet?

Hi,

Yes, I can see at least what's going wrong in this profile:

https://share.firefox.dev/45DVqiT

Looking to connect to this host: Scrimba, 104.18.23.196

LogMessages — (nsSocketTransport)   trying address: 104.18.23.196

LogMessages — (nsHttp) DnsAndConnectSocket::OnOutputStreamReady [this=26b77a9fea0 ent=scrimba.com primary]


4.386s LogMessages — (nsHttp) GetH2orH3ActiveConn() request for ent 26b721b0ca0 .S......F.[tlsflags0x00000000]scrimba.com:443 {NPN-TOKEN h2}^partitionKey=%28https%2Cscrimba.com%29 did not find an active connection

4.425s LogMessages — (nsHttp) nsHttpConnection::Activate [this=26b74e67c00 trans=26b6b3f38c0 caps=201001]

LogMessages — (nsHttp) nsHttpConnection::OnSocketWritable [this=26b74e67c00] host=scrimba.com

4.385 LogMessages — (nsHttp) ConnectionEntry::ConnectionEntry this=26b721b0ca0 key=.S......F.[tlsflags0x00000000]scrimba.com:443 {NPN-TOKEN h2}^partitionKey=%28https%2Cscrimba.com%29

4.425s LogMessages — (nsHttp) TlsHandshaker::SetupSSL 26b74e67c00 caps=0x201001 .S......F.[tlsflags0x00000000]scrimba.com:443 {NPN-TOKEN h2}^partitionKey=%28https%2Cscrimba.com%29

4.481s LogMessages — (nsHttp) nsHttpConnection::HandshakeDoneInternal [this=26b74e67c00]
LogMessages — (nsHttp) negotiatedNPN:  - transactionNPN: h2
LogMessages — (nsHttp) Resetting connection due to mismatched NPN token
Counter::add - network.alpn_mismatch_count : 1

So we've completed the TLS handshake but then we abort due to the mismatched NPN token.
I'll ask the protocol experts why that might be happening here.
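
For anyone checking their own about:logging capture for the same symptom, a small sketch like the following could flag the relevant lines. It assumes the log messages have been exported as plain text lines; the function name is made up, and pairing each mismatch with the most recent `host=` value seen is only a heuristic, not something the log format guarantees.

```python
import re

# Message text taken from the log excerpt above.
MISMATCH = "Resetting connection due to mismatched NPN token"
HOST_RE = re.compile(r"host=(\S+)")


def find_alpn_mismatches(lines):
    """Return (line_index, last_seen_host) for every mismatch message.

    Heuristic assumption: the most recent 'host=' value before the
    mismatch belongs to the connection that was reset.
    """
    last_host = None
    hits = []
    for index, line in enumerate(lines):
        match = HOST_RE.search(line)
        if match:
            last_host = match.group(1)
        if MISMATCH in line:
            hits.append((index, last_host))
    return hits


# Lines from the excerpt above, with the "LogMessages" prefix trimmed.
sample = [
    "(nsHttp) nsHttpConnection::OnSocketWritable [this=26b74e67c00] host=scrimba.com",
    "(nsHttp) nsHttpConnection::HandshakeDoneInternal [this=26b74e67c00]",
    "(nsHttp) negotiatedNPN:  - transactionNPN: h2",
    "(nsHttp) Resetting connection due to mismatched NPN token",
]
print(find_alpn_mismatches(sample))
```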

Based on the log, here’s what happens:

  1. Firefox first tries to connect to the site using h3, but that fails—likely because UDP is blocked.
  2. It then attempts to fall back to h2.
  3. The transaction’s NPN token is set to h2, but the server doesn’t negotiate h2.
  4. As a result, Firefox can only wait for the h3 connection to time out (30s) before trying the next connection attempt.

For the fallback, it might not be correct to assume that the server will accept h2, even if the HTTPS record indicates it could. Due to load balancers or other configuration issues, the server may only accept HTTP/1.1 at the moment. A possible fix would be to avoid forcing the NPN token to h2 during fallback, and instead let the server decide which protocol to use.
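
The difference between the current behavior and the proposed fix can be pictured with a toy model. This is not Firefox code: `after_handshake` and its return values are invented for illustration, and it assumes that an empty negotiated protocol means the connection can be treated as HTTP/1.1.

```python
from typing import Optional


def after_handshake(negotiated: str, required_token: Optional[str]) -> str:
    """Toy model of the post-handshake check described above.

    negotiated:      protocol the server selected ("" if none).
    required_token:  protocol the transaction insists on, or None to
                     accept whatever the server chose.
    """
    if required_token is not None and negotiated != required_token:
        # Mirrors "Resetting connection due to mismatched NPN token".
        return "reset"
    # Assumption: no negotiated protocol means plain HTTP/1.1.
    return negotiated or "http/1.1"


# Behavior seen in the log: h2 is forced, the server negotiates nothing.
print(after_handshake("", "h2"))   # reset
# Proposed direction: do not force h2, let the server decide.
print(after_handshake("", None))   # http/1.1
```

In the forced case the only remaining option is to wait for the h3 attempt to time out, which matches the 30 s stall in step 4.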

Whiteboard: [necko-triaged] → [necko-triaged][necko-priority-queue]

(In reply to Kershaw Chang [:kershaw] from comment #12)

Hey, thanks for the support.
Can you please suggest what I can do to avoid the issue? Is there something that I can or need to do to solve it from my end?
Thank you.

Unfortunately, no. I'll try to fix this issue soon, so the fix should be in Firefox Nightly within days.

The Performance Impact Calculator has determined this bug's performance impact to be medium. If you'd like to request re-triage, you can reset the Performance Impact flag to "?" or needinfo the triage sheriff.

Platforms: [x] Windows [x] macOS [x] Linux [x] Android
Impact on site: Renders site effectively unusable
Configuration: Rare
Page load impact: Severe

Performance Impact: ? → medium
Assignee: nobody → kershaw
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true

Hi, regarding the attached patch "Bug 1980812 - Avoid setting explicit h2 ALPN for connections, r=#necko" (https://phabricator.services.mozilla.com/D262000):
Is the issue fixed?
Can you please explain what steps (if any) I need to follow to apply the fix?

Probably the easiest fix would be to make sure your server supports HTTP/2.
If you are not in control of the server, temporarily disabling HTTP/2 in Firefox should avoid the delay (set network.http.http2.enabled to false in about:config)
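
If you do control the server, one way to check what it negotiates is a quick ALPN probe with Python's standard `ssl` module. This is only a sketch: the function names are made up, and a single probe may not be conclusive if a load balancer answers differently from one connection to the next.

```python
import socket
import ssl
from typing import Optional


def negotiated_alpn(host: str, port: int = 443, timeout: float = 10.0) -> Optional[str]:
    """Connect over TLS, offer h2 and http/1.1, and return the server's pick.

    Returns None when the server does not select any ALPN protocol.
    """
    context = ssl.create_default_context()
    context.set_alpn_protocols(["h2", "http/1.1"])
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.selected_alpn_protocol()


def supports_h2(selected: Optional[str]) -> bool:
    """True only if the server explicitly selected HTTP/2."""
    return selected == "h2"
```

Usage would look like `supports_h2(negotiated_alpn("your.server.example"))`, where the host name is a placeholder. A `False` result suggests the server would hit the fallback path described in comment #12.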

Pushed by smolnar@mozilla.com: https://github.com/mozilla-firefox/firefox/commit/5a1fbd9afa6f https://hg.mozilla.org/integration/autoland/rev/747d59705e12 Revert "Bug 1980812 - Avoid setting explicit h2 ALPN for connections, r=necko-reviewers,valentin" for causing xpc failures @ test_https_rr_ech_prefs.js
Flags: needinfo?(kershaw)
Status: ASSIGNED → RESOLVED
Closed: 29 days ago
Resolution: --- → FIXED
Target Milestone: --- → 144 Branch
Duplicate of this bug: 1985094

(In reply to Cosmin Sabou [:CosminS] from comment #23)

https://hg.mozilla.org/mozilla-central/rev/f591dd77a5c9
In what release can I be expecting this fix to be included?

QA Whiteboard: [qa-triage-done-c145/b144]

(In reply to Anand from comment #25)

In what release can I be expecting this fix to be included?

According to the target milestone and status flags, the fix will be in Firefox 144, to be released on October 14.

We're getting lots of reports of something similar to this on ESR 140.3.

Do we expect it to happen there?

This is starting to seem urgent enough to uplift.

Adding a needinfo to :kershaw for Comment 27.
This defect is not tracked as a regression. As Mike already asked, is ESR140 affected?

Flags: needinfo?(kershaw)
Attachment #9514396 - Flags: approval-mozilla-esr140?

(In reply to Donal Meehan [:dmeehan] from comment #28)

Yes, I think ESR140 is affected, since this issue has existed for a while.
I'll create a patch for uplift.

Flags: needinfo?(kershaw)

Hi Donal,

Sorry, I don't know how to submit the uplift form when using moz-phab to submit the patch.
Let me know if you need more information from me.

Thanks.

Flags: needinfo?(dmeehan)

firefox-esr140 Uplift Approval Request

  • User impact if declined: Some websites might take a long time to load.
  • Code covered by automated testing: yes
  • Fix verified in Nightly: yes
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing: N/A
  • Risk associated with taking this patch: low
  • Explanation of risk level: This fix is straightforward and is already verified.
  • String changes made/needed: N/A
  • Is Android affected?: yes
Attachment #9514398 - Flags: approval-mozilla-esr140?

Thanks for the request. Clearing the needinfo since this is now in the uplift request queue.
At a minimum it will be uplifted for 140.4esr, scheduled to go live on 2025-10-14.

Flags: needinfo?(dmeehan)

Thanks Donal. What are the options for a patch release for this bug, which already seems to affect a few enterprise users?
My understanding is that the workaround is setting network.http.http3.enable=false, but a patch release would help reduce the risk of more users hitting the bug as they update to ESR 140 between now and October 14th.

(In reply to Romain Testard [:RT] from comment #35)

:RT, if it's urgent enough to warrant a 140.3.1esr, then I suggest spinning up an incident (internal Slack channel etc). There are a couple of moving parts here and some discussions around the fix.

Attachment #9514396 - Attachment is obsolete: true
Attachment #9514396 - Flags: approval-mozilla-esr140?

I created the internal incident and confirm we'd like the patch to be released because of the high severity of the issue that is impacting several enterprise deployments already.

firefox-release Uplift Approval Request

  • User impact if declined: Some websites may take a long time to load.
  • Code covered by automated testing: yes
  • Fix verified in Nightly: yes
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing: N/A
  • Risk associated with taking this patch: low
  • Explanation of risk level: This fix is straightforward and is already verified.
  • String changes made/needed: N/A
  • Is Android affected?: yes
Attachment #9514759 - Flags: approval-mozilla-release?
Attachment #9514398 - Flags: approval-mozilla-esr140? → approval-mozilla-esr140+