[network-markers] Too much time attributed to "waiting for socket thread", need to gather more information about what happens before the start of a request
Categories
(Core :: Networking, defect, P3)
Tracking
()
People
(Reporter: mstange, Unassigned)
References
(Depends on 2 open bugs, Blocks 1 open bug)
Details
(Whiteboard: [necko-triaged])
Profile: https://perfht.ml/3cdKr25
In this profile, the first three requests spend 20ms, 12ms, and 92ms in a state which the UI exposes as "Waiting for socket thread".
This label is probably misleading. Matt Woodrow and Randell Jesup have suggested the following pieces that we might be spending this waiting time in:
- SSL setup time
- waiting on a slot (6 simultaneous http1 connections/server)
- waiting for the http2 channel
If that's what these requests are waiting for, then we should collect timing data for those things and put the data into the marker.
Comment 1•5 years ago
•
|
||
Ah yeah, this is something we already discussed with mayhemer. Definitely we need to make it better. At the very least change the label (that would be a patch to the UI part), but we also discussed about using another structure to gather more timings than the ones we already do.
Thanks for filing !
Comment 2•5 years ago
|
||
Julien, do you know who should own this bug? Thanks!
Comment 3•5 years ago
|
||
Hey Valentin,
Clearly we'll need to work cross-team to solve this.
I believe the profiler team will eventually work on this, but this isn't planned very soon. The priority also depends on how badly our users want this. I expect that we'll be able to look at network markers again some time this year, but rather H2.
Because it blocks our meta bug it's on our radar anyway. Would you prefer that we have it in the profiler component?
Comment 4•5 years ago
|
||
(In reply to Julien Wajsberg [:julienw] from comment #3)
Because it blocks our meta bug it's on our radar anyway. Would you prefer that we have it in the profiler component?
I think it's fine to stay in Networking if it's Necko/Necko-related work. I just wasn't aware of the plan for this - whether it was immediate work or something else.
hello !
I would like to work on this .
Valentin could you please elaborate what I need to do in this ?
Thanks :)
Comment 6•5 years ago
|
||
(In reply to Mahak from comment #5)
Valentin could you please elaborate what I need to do in this ?
I think this task might be a little more complex. I'll let someone else take the lead on this.
Reporter | ||
Comment 7•5 years ago
•
|
||
https://share.firefox.dev/3aWNkV0 (from bug 1684703) is another case where 260ms are attributed to "waiting for socket thread" but all threads are idle.
Edit: In that profile it was the extension suspend that caused the delay.
Reporter | ||
Updated•5 years ago
|
Comment 8•3 years ago
|
||
Here's another profile: https://share.firefox.dev/3GG2ttj
Comment 9•3 years ago
|
||
If I understand the code correctly, waiting for socket thread
is reporting the time spent from 'aStart' on first network marker call to the first entry in the network timing structure, TimingStruct
. This could be to domainLookupStart
or requestStart
, or possibly others. Depending on what timing is populated in the TimingStruct.
On the front end: the startTime
to 'waiting for socket thread' mapping.
We could increase the granularity by adding more steps to the network timing structure, but I don't think its intention is to capture internal timings (like time from asyncOpen in the child process to asyncOpen in the parent process).
But from my perspective, those would be the things I would want to see. i.e. time spent between processes or waiting on the event queue.
Also URL classification, which can slow down loads and doesn't appear on this list.
Maybe the addition of an internal timing structure parameter to the profiler_add_network_marker()
function could address this?
Comment 10•3 years ago
|
||
As you've seen, the current support reuses the network timing structure which isn't ideal. The timing structure is used when outputting the "end marker" (there's also a "start marker" that's a bit weird compared to the other start/end markers we have).
The long-term plan is to support marker sets (bug 1661114), that is grouping several markers together using the same index. This would make it easier to add new phases, and each phase would have its marker. However this is quite some work.
The intermediate solution would be adding new markers for the missing information. This way this would already provide useful information to developers (even if not integrated with the current markers), but in the future when we'll implement the marker sets functionality, we would also easily add the index to them.
What do you think?
Comment 11•3 years ago
|
||
Julien, that sounds very reasonable.
With this approach, the work of adding the new measurements is purely marker instrumentation to the platform code, (i.e. no profiler front end edits), which will make it easier for me and other devs working on this.
As we find missing measurements in Bug 1777538, we will add them as profiler markers. Thank you!
Updated•3 years ago
|
Comment 12•2 years ago
|
||
The current strategy is to add more markers (which include the channel id), and then find those markers to segment the "Waiting for Socket Thread" timing.
This can be done without marker set support.
Comment 13•2 years ago
|
||
Adding a new marker for when asyncOpen is received in the parent in Bug 1835709
Comment 14•2 years ago
|
||
In bug 1877525 I'm adding channelId to key events on the socket thread: transaction arrival and dispatch.
With these in place, we may be able to break up "Waiting for socket thread"
Comment 15•1 year ago
|
||
"Waiting for socket thread" is the human-readable name for 'startTime'
https://github.com/firefox-devtools/profiler/blob/de99cf0fce81211574e48ee90240033c18cd6033/src/components/tooltip/NetworkMarker.js#L103
Comment 17•1 year ago
|
||
Andrew has added a marker and added channelId's to other markers. Two questions: first, what other things fall into "startTime" currently (delays due to the sharding limit, classification, etc) can we add channelId's for? (I.e. what else can happen to a request that falls into startTime?) Second, can we get the profiler UI to grab these other markers to break up startTime/Waiting for Socket Thread?
Comment 18•1 year ago
|
||
Second, can we get the profiler UI to grab these other markers to break up startTime/Waiting for Socket Thread?
We can look into it, I filed a bug in https://github.com/firefox-devtools/profiler/issues/5092.
One issue we will face, I think, is that the new markers are (I presume) in the main process only or even maybe in different threads, but our marker handling pipeline is very one-thread-oriented, with network markers currently duplicated in both the content process and the main process. We'll have to think quite a bit about this issue and how we want things to look. Would that be good enough to have this breakdown in the main process only?
Comment 19•1 year ago
|
||
(In reply to Julien Wajsberg [:julienw] from comment #18)
Second, can we get the profiler UI to grab these other markers to break up startTime/Waiting for Socket Thread?
We can look into it, I filed a bug in https://github.com/firefox-devtools/profiler/issues/5092.
One issue we will face, I think, is that the new markers are (I presume) in the main process only or even maybe in different threads, but our marker handling pipeline is very one-thread-oriented, with network markers currently duplicated in both the content process and the main process. We'll have to think quite a bit about this issue and how we want things to look. Would that be good enough to have this breakdown in the main process only?
Yes, the new markers are on the socket thread of parent process.
Having just the main process broken down this way would definitely be good enough as the first step as that's where we have a lot of activity.
Thanks for logging this on the profiler side. If there's anything else needed with the markers, don't hesitate to ask.
Comment 20•1 year ago
|
||
Would it make sense to move these markers to the parent process? This would be done by adding an option to the marker function such as in [1]. It generally makes sense if we'd like the marker to show up even though the specific thread isn't captured. (and for this bug this could make our life easier too :-) ).
Comment 21•1 year ago
|
||
(In reply to Julien Wajsberg [:julienw] from comment #20)
Would it make sense to move these markers to the parent process? This would be done by adding an option to the marker function such as in [1]. It generally makes sense if we'd like the marker to show up even though the specific thread isn't captured. (and for this bug this could make our life easier too :-) ).
Yes; I didn't realize we had this capability and this seems like a good use as profiling the socket thread explicitly is less common.
Patch incoming.
Comment 22•1 year ago
|
||
(In reply to Julien Wajsberg [:julienw] from comment #20)
Would it make sense to move these markers to the parent process? This would be done by adding an option to the marker function such as in [1]. It generally makes sense if we'd like the marker to show up even though the specific thread isn't captured. (and for this bug this could make our life easier too :-) ).
Bug 1915700 adds the MarkerThreadId::MainThread()
marker option to the three markers that include an int property, channelId
.
Comment 23•1 year ago
|
||
Tangentially related, bug 1915717, Expose http protocol version for a request in the profiler network tab
Comment 24•11 months ago
|
||
(In reply to Randell Jesup [:jesup] (needinfo me) from comment #17)
Andrew has added a marker and added channelId's to other markers. Two questions: first, what other things fall into "startTime" currently (delays due to the sharding limit, classification, etc) can we add channelId's for? (I.e. what else can happen to a request that falls into startTime?) Second, can we get the profiler UI to grab these other markers to break up startTime/Waiting for Socket Thread?
I think there's a bunch of things that we do before we actually go to the socket thread - such as classification or waiting for a HTTPS record, CORS, etc. I don't think we mark when these are done (at least for some of them) but we probably should.
Comment 25•11 months ago
|
||
(In reply to Valentin Gosu [:valentin] (he/him) from comment #24)
I think there's a bunch of things that we do before we actually go to the socket thread - such as classification or waiting for a HTTPS record, CORS, etc. I don't think we mark when these are done (at least for some of them) but we probably should.
Agreed - additional markers around these steps could be very useful.
One I just logged -- Expose class of service for each request in the profiler network tab.
Reporter | ||
Comment 26•10 months ago
|
||
Here's another profile where I spend 2 seconds waiting for something and I don't know for what: https://share.firefox.dev/4dN4SBs
Description
•