1618687 - [network-markers] Too much time attributed to "waiting for socket thread", need to gather more information about what happens before the start of a request

Reporter

Description

•

5 years ago

In this profile, the first three requests spend 20ms, 12ms, and 92ms in a state which the UI exposes as "Waiting for socket thread".
This label is probably misleading. Matt Woodrow and Randell Jesup have suggested the following pieces that we might be spending this waiting time in:

SSL setup time
waiting on a slot (6 simultaneous http1 connections/server)
waiting for the http2 channel

If that's what these requests are waiting for, then we should collect timing data for those things and put the data into the marker.

Julien Wajsberg [:julienw]

Comment 1

•

5 years ago

•

Edited

Ah yeah, this is something we already discussed with mayhemer. Definitely we need to make it better. At the very least change the label (that would be a patch to the UI part), but we also discussed about using another structure to gather more timings than the ones we already do.
Thanks for filing !

Valentin Gosu [:valentin] (he/him)

Comment 2

•

5 years ago

Julien, do you know who should own this bug? Thanks!

Flags: needinfo?(felash)

Priority: -- → P3

Whiteboard: [necko-triaged]

Julien Wajsberg [:julienw]

Comment 3

•

5 years ago

Hey Valentin,
Clearly we'll need to work cross-team to solve this.
I believe the profiler team will eventually work on this, but this isn't planned very soon. The priority also depends on how badly our users want this. I expect that we'll be able to look at network markers again some time this year, but rather H2.
Because it blocks our meta bug it's on our radar anyway. Would you prefer that we have it in the profiler component?

Flags: needinfo?(felash)

Valentin Gosu [:valentin] (he/him)

Comment 4

•

5 years ago

(In reply to Julien Wajsberg [:julienw] from comment #3)

Because it blocks our meta bug it's on our radar anyway. Would you prefer that we have it in the profiler component?

I think it's fine to stay in Networking if it's Necko/Necko-related work. I just wasn't aware of the plan for this - whether it was immediate work or something else.

Florian Quèze [:florian]

Updated

•

5 years ago

Comment 5

•

5 years ago

hello !
I would like to work on this .
Valentin could you please elaborate what I need to do in this ?
Thanks :)

Flags: needinfo?(valentin.gosu)

Valentin Gosu [:valentin] (he/him)

Comment 6

•

5 years ago

(In reply to Mahak from comment #5)

Valentin could you please elaborate what I need to do in this ?

I think this task might be a little more complex. I'll let someone else take the lead on this.

Flags: needinfo?(valentin.gosu)

Markus Stange [:mstange]

Reporter

Updated

•

5 years ago

Depends on: 1652515

Markus Stange [:mstange]

Reporter

Updated

•

5 years ago

Depends on: 1652517

Markus Stange [:mstange]

Reporter

Comment 7

•

4 years ago

•

Edited

https://share.firefox.dev/3aWNkV0 (from bug 1684703) is another case where 260ms are attributed to "waiting for socket thread" but all threads are idle.
Edit: In that profile it was the extension suspend that caused the delay.

Blocks: 1690235

Markus Stange [:mstange]

Reporter

Updated

•

4 years ago

Blocks: 1684703
No longer blocks: 1690235

Markus Stange [:mstange]

Reporter

Updated

•

4 years ago

No longer blocks: 1684703

Jeff Muizelaar [:jrmuizel]

Comment 8

•

3 years ago

Here's another profile: https://share.firefox.dev/3GG2ttj

Andrew Creskey [:acreskey]

Updated

•

3 years ago

Comment 9

•

3 years ago

If I understand the code correctly, waiting for socket thread is reporting the time spent from 'aStart' on first network marker call to the first entry in the network timing structure, TimingStruct. This could be to domainLookupStart or requestStart, or possibly others. Depending on what timing is populated in the TimingStruct.

On the front end: the startTime to 'waiting for socket thread' mapping.

And an example profiler marker, profiler_add_network_marker() call in HttpChannelChild::AsyncOpenInternal.

We could increase the granularity by adding more steps to the network timing structure, but I don't think its intention is to capture internal timings (like time from asyncOpen in the child process to asyncOpen in the parent process).

But from my perspective, those would be the things I would want to see. i.e. time spent between processes or waiting on the event queue.
Also URL classification, which can slow down loads and doesn't appear on this list.

Maybe the addition of an internal timing structure parameter to the profiler_add_network_marker() function could address this?

Julien Wajsberg [:julienw]

Comment 10

•

3 years ago

As you've seen, the current support reuses the network timing structure which isn't ideal. The timing structure is used when outputting the "end marker" (there's also a "start marker" that's a bit weird compared to the other start/end markers we have).

The long-term plan is to support marker sets (bug 1661114), that is grouping several markers together using the same index. This would make it easier to add new phases, and each phase would have its marker. However this is quite some work.

The intermediate solution would be adding new markers for the missing information. This way this would already provide useful information to developers (even if not integrated with the current markers), but in the future when we'll implement the marker sets functionality, we would also easily add the index to them.

What do you think?

Andrew Creskey [:acreskey]

Comment 11

•

3 years ago

Julien, that sounds very reasonable.
With this approach, the work of adding the new measurements is purely marker instrumentation to the platform code, (i.e. no profiler front end edits), which will make it easier for me and other devs working on this.
As we find missing measurements in Bug 1777538, we will add them as profiler markers. Thank you!

BMO Automation

Updated

•

3 years ago

Severity: normal → S3

Andrew Creskey [:acreskey]

Comment 12

•

2 years ago

The current strategy is to add more markers (which include the channel id), and then find those markers to segment the "Waiting for Socket Thread" timing.
This can be done without marker set support.

Flags: needinfo?(acreskey)

Andrew Creskey [:acreskey]

Comment 13

•

2 years ago

Adding a new marker for when asyncOpen is received in the parent in Bug 1835709

Flags: needinfo?(acreskey)

Andrew Creskey [:acreskey]

Updated

•

2 years ago

Comment 14

•

1 year ago

In bug 1877525 I'm adding channelId to key events on the socket thread: transaction arrival and dispatch.
With these in place, we may be able to break up "Waiting for socket thread"

Andrew Creskey [:acreskey]

Comment 15

•

1 year ago

"Waiting for socket thread" is the human-readable name for 'startTime'
https://github.com/firefox-devtools/profiler/blob/de99cf0fce81211574e48ee90240033c18cd6033/src/components/tooltip/NetworkMarker.js#L103

Andrew Creskey [:acreskey]

Updated

•

1 year ago

Updated

•

1 year ago

Duplicate of this bug: 1913417

Randell Jesup [:jesup] (needinfo me)

Comment 17

•

1 year ago

Andrew has added a marker and added channelId's to other markers. Two questions: first, what other things fall into "startTime" currently (delays due to the sharding limit, classification, etc) can we add channelId's for? (I.e. what else can happen to a request that falls into startTime?) Second, can we get the profiler UI to grab these other markers to break up startTime/Waiting for Socket Thread?

Flags: needinfo?(valentin.gosu)

Flags: needinfo?(felash)

Flags: needinfo?(acreskey)

Julien Wajsberg [:julienw]

Comment 18

•

1 year ago

Second, can we get the profiler UI to grab these other markers to break up startTime/Waiting for Socket Thread?

We can look into it, I filed a bug in https://github.com/firefox-devtools/profiler/issues/5092.

One issue we will face, I think, is that the new markers are (I presume) in the main process only or even maybe in different threads, but our marker handling pipeline is very one-thread-oriented, with network markers currently duplicated in both the content process and the main process. We'll have to think quite a bit about this issue and how we want things to look. Would that be good enough to have this breakdown in the main process only?

Flags: needinfo?(felash)

Andrew Creskey [:acreskey]

Comment 19

•

1 year ago

(In reply to Julien Wajsberg [:julienw] from comment #18)

Second, can we get the profiler UI to grab these other markers to break up startTime/Waiting for Socket Thread?

We can look into it, I filed a bug in https://github.com/firefox-devtools/profiler/issues/5092.

One issue we will face, I think, is that the new markers are (I presume) in the main process only or even maybe in different threads, but our marker handling pipeline is very one-thread-oriented, with network markers currently duplicated in both the content process and the main process. We'll have to think quite a bit about this issue and how we want things to look. Would that be good enough to have this breakdown in the main process only?

Yes, the new markers are on the socket thread of parent process.
Having just the main process broken down this way would definitely be good enough as the first step as that's where we have a lot of activity.

Thanks for logging this on the profiler side. If there's anything else needed with the markers, don't hesitate to ask.

Flags: needinfo?(acreskey)

Julien Wajsberg [:julienw]

Comment 20

•

1 year ago

Would it make sense to move these markers to the parent process? This would be done by adding an option to the marker function such as in [1]. It generally makes sense if we'd like the marker to show up even though the specific thread isn't captured. (and for this bug this could make our life easier too :-) ).

[1] https://searchfox.org/mozilla-central/rev/490a1df802d8872f996f8ef4093d54e3c854c8f9/tools/profiler/core/platform.cpp#5377

Andrew Creskey [:acreskey]

Comment 21

•

1 year ago

(In reply to Julien Wajsberg [:julienw] from comment #20)

Would it make sense to move these markers to the parent process? This would be done by adding an option to the marker function such as in [1]. It generally makes sense if we'd like the marker to show up even though the specific thread isn't captured. (and for this bug this could make our life easier too :-) ).

[1] https://searchfox.org/mozilla-central/rev/490a1df802d8872f996f8ef4093d54e3c854c8f9/tools/profiler/core/platform.cpp#5377

Yes; I didn't realize we had this capability and this seems like a good use as profiling the socket thread explicitly is less common.
Patch incoming.

Andrew Creskey [:acreskey]

Comment 22

•

1 year ago

(In reply to Julien Wajsberg [:julienw] from comment #20)

Would it make sense to move these markers to the parent process? This would be done by adding an option to the marker function such as in [1]. It generally makes sense if we'd like the marker to show up even though the specific thread isn't captured. (and for this bug this could make our life easier too :-) ).

[1] https://searchfox.org/mozilla-central/rev/490a1df802d8872f996f8ef4093d54e3c854c8f9/tools/profiler/core/platform.cpp#5377

Bug 1915700 adds the MarkerThreadId::MainThread() marker option to the three markers that include an int property, channelId.

Depends on: 1915700

Andrew Creskey [:acreskey]

Comment 23

•

1 year ago

Tangentially related, bug 1915717, Expose http protocol version for a request in the profiler network tab

Valentin Gosu [:valentin] (he/him)

Comment 24

•

1 year ago

(In reply to Randell Jesup [:jesup] (needinfo me) from comment #17)

Andrew has added a marker and added channelId's to other markers. Two questions: first, what other things fall into "startTime" currently (delays due to the sharding limit, classification, etc) can we add channelId's for? (I.e. what else can happen to a request that falls into startTime?) Second, can we get the profiler UI to grab these other markers to break up startTime/Waiting for Socket Thread?

I think there's a bunch of things that we do before we actually go to the socket thread - such as classification or waiting for a HTTPS record, CORS, etc. I don't think we mark when these are done (at least for some of them) but we probably should.

Flags: needinfo?(valentin.gosu)

Andrew Creskey [:acreskey]

Comment 25

•

1 year ago

(In reply to Valentin Gosu [:valentin] (he/him) from comment #24)

I think there's a bunch of things that we do before we actually go to the socket thread - such as classification or waiting for a HTTPS record, CORS, etc. I don't think we mark when these are done (at least for some of them) but we probably should.

Agreed - additional markers around these steps could be very useful.

One I just logged -- Expose class of service for each request in the profiler network tab.

Markus Stange [:mstange]

Reporter

Comment 26

•

1 year ago

Here's another profile where I spend 2 seconds waiting for something and I don't know for what: https://share.firefox.dev/4dN4SBs

Markus Stange [:mstange]

Reporter

Updated

•

1 year ago

Blocks: 1907982