Waiting for socket thread takes 3 s
Categories
(Core :: Networking, defect, P2)
Tracking
()
People
(Reporter: pmenzel+bugzilla.mozilla.org, Unassigned)
References
(Blocks 1 open bug)
Details
(Whiteboard: [necko-triaged])
Attachments
(1 file)
49.93 KB,
text/plain
|
Details |
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:126.0) Gecko/20100101 Firefox/126.0
Steps to reproduce:
Debian sid/unstable with Firefox Nightly
Actual results:
Loading simple pages takes several seconds.
Network profile shows several entries with Waiting for socket thread greater than three seconds.
Expected results:
Loading should be instant, that means as fast as possible.
Reporter | ||
Comment 1•10 months ago
|
||
Comment 2•10 months ago
|
||
The Bugbug bot thinks this bug should belong to the 'Core::Networking' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.
Comment 3•10 months ago
|
||
One comment about "Waiting for Socket Thread..." -- that's somewhat misleading, as if we're using Http1.x and are limited on the sharding limit of 6 connections/site, other connections to the server will be waiting for socket thread. There are other cases where things will be "waiting for socket thread" that don't actually involve waiting for the SocketThread.
However, something is very odd about this profile... especially if you look at SocketThread
There are many LongTaskSocketProcessing markers, up to 2s long. Most of the time in these markers is spent in IO_file_write(), called from cycle collection (Collect()), which is bizarrely called from an atomic read in ShouldLog (which shouldn't call anything). Perhaps something is wrong with the symbols?
Is this a Mozilla build, or a Debian build? Anything else unusual in the setup/config/settings?
Can you repro with a fresh profile (via about:profiles)? Can you repro with a mozilla-built Firefox, if you're using a Debian build?
Thanks!
Updated•10 months ago
|
Reporter | ||
Comment 4•10 months ago
|
||
(In reply to Randell Jesup [:jesup] (needinfo me) from comment #3)
One comment about "Waiting for Socket Thread..." -- that's somewhat misleading, as if we're using Http1.x and are limited on the sharding limit of 6 connections/site, other connections to the server will be waiting for socket thread. There are other cases where things will be "waiting for socket thread" that don't actually involve waiting for the SocketThread.
I do not fully understand. Do you see that many parallel requests?
However, something is very odd about this profile... especially if you look at SocketThread
There are many LongTaskSocketProcessing markers, up to 2s long. Most of the time in these markers is spent in IO_file_write(), called from cycle collection (Collect()), which is bizarrely called from an atomic read in ShouldLog (which shouldn't call anything). Perhaps something is wrong with the symbols?
Is this a Mozilla build, or a Debian build? Anything else unusual in the setup/config/settings?
I install the package from https://packages.mozilla.org/apt. Nothing special. Maybe an updated package was already installed in the background? Would that mess up the symbols?
Can you repro with a fresh profile (via about:profiles)?
It only happens sometimes, and I have to try to then create a fresh problem.
Reporter | ||
Comment 5•10 months ago
|
||
At home everything worked fine, coming to work pages in the same Firefox instance take long to load. On the console everything works fine. Profile: https://share.firefox.dev/3xLNBtj
Comment 6•10 months ago
|
||
The profile seems to indicate that a HTTP request to www.mozilla.org
is delayed because it's waiting for the resolution of HTTPS record.
Could you go to about:config
and disable network.dns.force_waiting_https_rr
or network.dns.native_https_query
and see either of it helps?
Thanks.
Comment 7•10 months ago
|
||
These lines from the profile are interesting:
2024-04-22 09:29:02.480 UTC - [Parent Process 4642 DNS Resolver #4] D/nsHostResolver DNS lookup thread - Calling getaddrinfo for host [www.mozilla.org].
2024-04-22 09:29:02.480 UTC - [Parent Process 4642 DNS Resolver #4] D/nsHostResolver DNS lookup thread - starting execution.
2024-04-22 09:29:02.480 UTC - [Parent Process 4642 DNS Resolver #4] D/nsHostResolver byType on DNS thread
2024-04-22 09:29:12.487 UTC - [Parent Process 4642 DNS Resolver #4] D/nsHostResolver FillBuffer response len < 0
2024-04-22 09:29:12.487 UTC - [Parent Process 4642 DNS Resolver #4] D/nsHostResolver nsHostResolver::CompleteLookupByType record 7f34901f0cd0 [www.mozilla.org] status 8000ffff
It shows that we wait 10s for the HTTPS record.
Reporter | ||
Comment 8•10 months ago
|
||
It shows that we wait 10s for the HTTPS record.
On the command line, DNS works instantly.
PS: After restarting Firefox, it currently works. I am going to try the other suggestions next time this happens.
Comment 9•10 months ago
|
||
(In reply to Paul Menzel from comment #8)
It shows that we wait 10s for the HTTPS record.
On the command line, DNS works instantly.
Are you resolving IP or HTTPS records?
Does dig HTTPS cloudflare.com
work for you?
Reporter | ||
Comment 10•10 months ago
|
||
(In reply to Valentin Gosu [:valentin] (he/him) from comment #9)
(In reply to Paul Menzel from comment #8)
It shows that we wait 10s for the HTTPS record.
On the command line, DNS works instantly.
Are you resolving IP or HTTPS records?
Right, IP records with host www.mozilla.org
.
Does
dig HTTPS cloudflare.com
work for you?
Yes, it does:
$ dig HTTPS cloudflare.com
; <<>> DiG 9.19.21-1+b1-Debian <<>> HTTPS cloudflare.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 30838
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;cloudflare.com. IN HTTPS
;; ANSWER SECTION:
cloudflare.com. 300 IN HTTPS 1 . alpn="h3,h2" ipv4hint=104.16.132.229,104.16.133.229 ipv6hint=2606:4700::6810:84e5,2606:4700::6810:85e5
;; Query time: 4 msec
;; SERVER: 141.14.16.1#53(141.14.16.1) (UDP)
;; WHEN: Mon Apr 22 14:09:54 CEST 2024
;; MSG SIZE rcvd: 116
Comment 12•10 months ago
|
||
So the basic cause of the profiles showing long delays in socket processing is the logging -- logging is turned on, and worse yet it's set to sync which writes data immediate, so it's spending a lot of time in OS write functions from within the socket polls. Also it's logging more than it needs to.
The earlier log which showed calls from logging into CycleCollection (and eventually back to io writes) was clearly a problem symbolicating the stacks. The new profile correctly showed the logging.
Comment 13•9 months ago
|
||
Hi Paul,
Could you try again with the latest nightly?
I think bug 1895226 might have fixed this issue.
Updated•9 months ago
|
Reporter | ||
Comment 14•9 months ago
|
||
Thank you. I haven’t seen this issue in over a week. I am currently using 128.0a1 (20240523205926). No idea how to properly close this issue. It’d be great if you did it.
Comment 15•9 months ago
|
||
That's great. Let us know if you see it again.
Reporter | ||
Comment 16•8 months ago
|
||
I noticed slowness again today, but not knowing if it’s related I created bug 1903975.
Description
•