[network-markers] Too much time attributed to "waiting for socket thread", need to gather more information about what happens before the start of a request
Categories
(Core :: Networking, defect, P3)
Tracking
()
People
(Reporter: mstange, Unassigned)
References
(Depends on 2 open bugs, Blocks 1 open bug)
Details
(Whiteboard: [necko-triaged])
Profile: https://perfht.ml/3cdKr25
In this profile, the first three requests spend 20ms, 12ms, and 92ms in a state which the UI exposes as "Waiting for socket thread".
This label is probably misleading. Matt Woodrow and Randell Jesup have suggested the following pieces that we might be spending this waiting time in:
- SSL setup time
- waiting on a slot (6 simultaneous http1 connections/server)
- waiting for the http2 channel
If that's what these requests are waiting for, then we should collect timing data for those things and put the data into the marker.
Comment 1•4 years ago
•
|
||
Ah yeah, this is something we already discussed with mayhemer. Definitely we need to make it better. At the very least change the label (that would be a patch to the UI part), but we also discussed about using another structure to gather more timings than the ones we already do.
Thanks for filing !
Comment 2•4 years ago
|
||
Julien, do you know who should own this bug? Thanks!
Comment 3•4 years ago
|
||
Hey Valentin,
Clearly we'll need to work cross-team to solve this.
I believe the profiler team will eventually work on this, but this isn't planned very soon. The priority also depends on how badly our users want this. I expect that we'll be able to look at network markers again some time this year, but rather H2.
Because it blocks our meta bug it's on our radar anyway. Would you prefer that we have it in the profiler component?
Comment 4•4 years ago
|
||
(In reply to Julien Wajsberg [:julienw] from comment #3)
Because it blocks our meta bug it's on our radar anyway. Would you prefer that we have it in the profiler component?
I think it's fine to stay in Networking if it's Necko/Necko-related work. I just wasn't aware of the plan for this - whether it was immediate work or something else.
hello !
I would like to work on this .
Valentin could you please elaborate what I need to do in this ?
Thanks :)
Comment 6•4 years ago
|
||
(In reply to Mahak from comment #5)
Valentin could you please elaborate what I need to do in this ?
I think this task might be a little more complex. I'll let someone else take the lead on this.
Reporter | ||
Comment 7•3 years ago
•
|
||
https://share.firefox.dev/3aWNkV0 (from bug 1684703) is another case where 260ms are attributed to "waiting for socket thread" but all threads are idle.
Edit: In that profile it was the extension suspend that caused the delay.
Reporter | ||
Updated•3 years ago
|
Comment 8•2 years ago
|
||
Here's another profile: https://share.firefox.dev/3GG2ttj
Comment 9•2 years ago
|
||
If I understand the code correctly, waiting for socket thread
is reporting the time spent from 'aStart' on first network marker call to the first entry in the network timing structure, TimingStruct
. This could be to domainLookupStart
or requestStart
, or possibly others. Depending on what timing is populated in the TimingStruct.
On the front end: the startTime
to 'waiting for socket thread' mapping.
We could increase the granularity by adding more steps to the network timing structure, but I don't think its intention is to capture internal timings (like time from asyncOpen in the child process to asyncOpen in the parent process).
But from my perspective, those would be the things I would want to see. i.e. time spent between processes or waiting on the event queue.
Also URL classification, which can slow down loads and doesn't appear on this list.
Maybe the addition of an internal timing structure parameter to the profiler_add_network_marker()
function could address this?
Comment 10•2 years ago
|
||
As you've seen, the current support reuses the network timing structure which isn't ideal. The timing structure is used when outputting the "end marker" (there's also a "start marker" that's a bit weird compared to the other start/end markers we have).
The long-term plan is to support marker sets (bug 1661114), that is grouping several markers together using the same index. This would make it easier to add new phases, and each phase would have its marker. However this is quite some work.
The intermediate solution would be adding new markers for the missing information. This way this would already provide useful information to developers (even if not integrated with the current markers), but in the future when we'll implement the marker sets functionality, we would also easily add the index to them.
What do you think?
Comment 11•2 years ago
|
||
Julien, that sounds very reasonable.
With this approach, the work of adding the new measurements is purely marker instrumentation to the platform code, (i.e. no profiler front end edits), which will make it easier for me and other devs working on this.
As we find missing measurements in Bug 1777538, we will add them as profiler markers. Thank you!
Updated•1 year ago
|
Comment 12•10 months ago
|
||
The current strategy is to add more markers (which include the channel id), and then find those markers to segment the "Waiting for Socket Thread" timing.
This can be done without marker set support.
Comment 13•10 months ago
|
||
Adding a new marker for when asyncOpen is received in the parent in Bug 1835709
Comment 14•1 month ago
|
||
In bug 1877525 I'm adding channelId to key events on the socket thread: transaction arrival and dispatch.
With these in place, we may be able to break up "Waiting for socket thread"
Description
•