NS_NewNamedThread blocks the main thread
Categories
(Core :: XPCOM, enhancement, P3)
Tracking
()
| Tracking | Status | |
|---|---|---|
| firefox87 | --- | fixed |
People
(Reporter: florian, Assigned: alexical)
References
Details
(Keywords: perf:pageload, Whiteboard: [bhr:NS_NewNamedThread])
Attachments
(1 file, 1 obsolete file)
| Reporter | ||
Updated•6 years ago
|
Comment 1•6 years ago
|
||
Comment 2•6 years ago
|
||
| Reporter | ||
Comment 3•6 years ago
|
||
| Reporter | ||
Comment 4•6 years ago
|
||
Comment 5•6 years ago
|
||
| Reporter | ||
Comment 6•6 years ago
|
||
Comment 7•6 years ago
|
||
Updated•6 years ago
|
| Reporter | ||
Comment 8•6 years ago
|
||
Comment 9•6 years ago
|
||
| Reporter | ||
Comment 10•6 years ago
|
||
Comment 11•6 years ago
|
||
| Reporter | ||
Comment 12•6 years ago
|
||
Updated•6 years ago
|
| Reporter | ||
Comment 13•6 years ago
|
||
Here is another profile where this wastes a lot of main-thread time in both the parent and content processes: https://perfht.ml/2DixnYF
In the "Privileged Content" track, it visibly happens for:
mozilla::net::nsStreamTransportService::Dispatch,
mozilla::net::nsSocketTransportService::Init,
mozilla::image::DecodePoolImpl::CreateThread,
nsHtml5Module::GetStreamParserThread and
mozilla::VideoDecoderManagerChild::InitializeThread.
I think these are things that happen during page load (most of these are before the end of the TFFI marker).
Comment 14•6 years ago
|
||
Putting this back into the [qf] triage queue, since it looks like this can also affect page load.
Updated•6 years ago
|
| Reporter | ||
Comment 15•6 years ago
|
||
https://perfht.ml/2In0Cz1 shows that we encounter this several times when attempting to load about:home after startup.
| Assignee | ||
Comment 16•6 years ago
|
||
(Taking a look at this per Florian's nudge)
(In reply to Nathan Froyd [:froydnj] from comment #1)
I'm open to ideas here, but I'm pretty sure we rely on this behavior and
there's not really a better way to implement it. (In general, it's not safe
to just fire off the thread without some sort of synchronization when you're
passing data into the newly-started thread.) Inclined to WONTFIX this.OTOH, the profile says that the thread pool underlying the stream transport
service is starting up a new thread. It's possible we could tune the
heuristics for the stream transport service's thread pool so it didn't shut
down threads so eagerly.
Nathan, maybe you can short-circuit my digging - can you explain exactly how we rely on this behavior? It's not really clear to me from a theory perspective why we can't queue something up to be run whether the thread has finished starting up or not.
Comment 17•6 years ago
|
||
(In reply to Doug Thayer [:dthayer] from comment #16)
(Taking a look at this per Florian's nudge)
(In reply to Nathan Froyd [:froydnj] from comment #1)
I'm open to ideas here, but I'm pretty sure we rely on this behavior and
there's not really a better way to implement it. (In general, it's not safe
to just fire off the thread without some sort of synchronization when you're
passing data into the newly-started thread.) Inclined to WONTFIX this.Nathan, maybe you can short-circuit my digging - can you explain exactly how we rely on this behavior? It's not really clear to me from a theory perspective why we can't queue something up to be run whether the thread has finished starting up or not.
We already queue things to run before we're sure whether the thread has started, e.g.:
https://searchfox.org/mozilla-central/source/xpcom/threads/nsThread.cpp#669-675
The safety concern comes from the initialization data we pass in: when we initialize threads, we put ThreadInitData on the stack:
https://searchfox.org/mozilla-central/source/xpcom/threads/nsThread.cpp#659
and pass a pointer to said ThreadInitData into PR_CreateThread:
https://searchfox.org/mozilla-central/source/xpcom/threads/nsThread.cpp#662
and our ThreadFunc accesses that data:
https://searchfox.org/mozilla-central/source/xpcom/threads/nsThread.cpp#411-444
So that ThreadInitData needs to live until we receive some notification that the thread we wanted to start has actually started. If we don't wait, ThreadInitData might be destroyed before the new thread has a chance to start running.
I guess we could allocate ThreadInitData on the heap instead of on the stack, which would eliminate that concern. I would be a little concerned about code that assumes that NS_NewNamedThread is, in effect, a blocking call, though (e.g. the code in nsThreadPool.
Other alternatives are to have some other thread for thread pools that starts threads, or to make the case in PutEvent where we have to start a new thread a runnable that runs in the thread pool itself to start a new thread and then services the original runnable.
| Assignee | ||
Comment 18•6 years ago
|
||
Apologies, I just noticed the conversation above regarding the BHR data primarily featuring PR_CreateThread rather than the wait. This is interesting. In the profiles Florian links, the time is spend primarily in the wait. In the BHR data, however, PR_CreateThread does indeed take more of the share, but the share of NS_NewNamedThread in general is incredibly small compared to overal BHR share.
Florian, can you ensure that your data lines up with the data as reported by arewesmoothyet? It seems like quite an insignificant portion spend creating threads in that view.
| Reporter | ||
Comment 19•6 years ago
|
||
(In reply to Doug Thayer [:dthayer] from comment #18)
Apologies, I just noticed the conversation above regarding the BHR data primarily featuring PR_CreateThread rather than the wait.
Sorry for not replying sooner. I just poked at awsy a bit. If I focus on the nsThread::Init function, then 88.8% is PR_Wait and 7.7% is PR_CreateThread. Does this answer your question?
In any case, the profiles I got even on slow machines had wait times that were typically below the BHR 128ms threshold, so I expect this issue to be underreported by BHR.
| Assignee | ||
Comment 20•6 years ago
|
||
Ah, okay - I was focusing on NS_NewNamedThread, in which >25% is in beginthreadex/_beginthreadex (https://arewesmoothyet.com/?category=all&durationSpec=128_512&invertCallstack&mode=explore&payloadID=8b3d90fab13f434983feb5f4520ab134&search=ns_newnamedth&thread=1&transforms=mf-3626~mf-3627~mf-3628).
I'm trying to assess whether we want to do something goofy like a thread-creating thread, which wouldn't reduce any of the time in beginthreadex, or just allocate the ThreadInitData on the heap, which is substantially simpler and I suspect less likely to have unintended consequences, but only covers 75% of the observed cost of NS_NewNamedThread. In either case NS_NewNamedThread is a fairly tiny portion of the overall BHR data as shown in the awsy graph, though for what could be quite a small change it's likely worth it anyway.
What I'm more curious about is why you're seeing thread creation in profiles at an apparently reliable rate given that it's such a small part of BHR data overall. I'm wondering if there's something in the 2018(?) reference hardware that makes thread creation comparatively more expensive.
| Reporter | ||
Comment 21•6 years ago
|
||
(In reply to Doug Thayer [:dthayer] from comment #20)
What I'm more curious about is why you're seeing thread creation in profiles at an apparently reliable rate given that it's such a small part of BHR data overall.
But in my profiles I'm never seeing it be long enough to be reported by BHR.
I'm wondering if there's something in the 2018(?) reference hardware that makes thread creation comparatively more expensive.
Maybe McAfee?
| Reporter | ||
Comment 22•6 years ago
|
||
Another recent profile showing how much this impacts startup on the 2018 reference hardware: https://perfht.ml/2RqmbQj
| Assignee | ||
Comment 23•6 years ago
|
||
GetCurrentPhysicalThread and GetCurrentVirtualThread are, in practice,
identical, as the TLS override that GetCurrentVirtualThread depends on
is never actually set. This simply removes that and renames some things/
deletes some comments.
| Assignee | ||
Comment 24•6 years ago
|
||
To remove the blocking inside nsThread::Init, two things needed
to happen:
- Switch the ThreadInitData value passed as the argument for
ThreadFunc to a heap allocation, so that it can outlive the call
to nsThread::Init. - Initialize mThread and mEventTarget->mThread to the return
value of PR_CreateThread, so that to the callers, checks which
depend on these values being set can continue to function.
Depends on D41247
| Assignee | ||
Comment 25•6 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #17)
I guess we could allocate
ThreadInitDataon the heap instead of on the stack, which would eliminate that concern. I would be a little concerned about code that assumes thatNS_NewNamedThreadis, in effect, a blocking call, though (e.g. the code innsThreadPool.
Can you clarify in what way the code in nsThreadPool depends on this? I could only anywhere dependencies on mThread / mVirtualThread being set, which we can do from nsThread::Init.
Is there anything in particular you would like to see to be convinced that we can safely do this?
Comment 26•6 years ago
|
||
This bug seems to have stalled out. Did you have more feedback for dthayer given comment 25, froydnj?
Comment 27•6 years ago
|
||
(In reply to Mike Conley (:mconley) (:⚙️) from comment #26)
This bug seems to have stalled out. Did you have more feedback for dthayer given comment 25, froydnj?
Doug and I have been chatting in the phab issue (yay separated comments...). Is there anything there that I need to respond to?
Comment 28•6 years ago
|
||
Bah, how embarrassing. I hadn't checked Phabricator for the active conversation, and just assumed that we'd stalled out because of the lack of activity in the bug. Sorry about the noise - carry on!
Updated•6 years ago
|
| Assignee | ||
Comment 29•6 years ago
|
||
Sorry, I should have been a bit more clear in the Phabricator comments. I was hoping for you to just quickly look over my analysis of the possible side effects of this work and let me know if it all sounds reasonable. To be clear: from what I am seeing the only thing that remains to be addressed is the technical UB around nsThread::InitCommons static variable. The AddRef / Release issue turns out to be covered by an extra AddRef / Release pair in Init / ThreadFunc.
Comment 30•6 years ago
|
||
(In reply to Doug Thayer [:dthayer] from comment #29)
Sorry, I should have been a bit more clear in the Phabricator comments. I was hoping for you to just quickly look over my analysis of the possible side effects of this work and let me know if it all sounds reasonable. To be clear: from what I am seeing the only thing that remains to be addressed is the technical UB around nsThread::InitCommons static variable. The AddRef / Release issue turns out to be covered by an extra AddRef / Release pair in Init / ThreadFunc.
Commented in phab.
Comment 31•6 years ago
|
||
Comment 32•6 years ago
•
|
||
Backed out 2 changesets (Bug 1510226) for causing xpcshell crashes and xpcshell failures in test_TelemetrySession.js
Backout link: https://hg.mozilla.org/integration/autoland/rev/09e240c6177f504797dd75be356bc537d66afb05
Failure logs:
Comment 33•6 years ago
|
||
There are some r+ patches which didn't land and no activity in this bug for 2 weeks.
:dthayer, could you have a look please?
For more information, please visit auto_nag documentation.
| Assignee | ||
Comment 34•6 years ago
|
||
I'm paused on this for the time being. The issue that manifested the failures in comment 32 seems, as far as I can tell, to be an existing problem, which is just made significantly more likely due to timing differences with this patch applied. However, I wanted to chat with Nathan about it, and he's out until the 28th.
Comment 35•5 years ago
|
||
Comment on attachment 9084080 [details]
Bug 1510226 - Remove vestigial references to cooperative scheduling r?froydnj
Revision D41247 was moved to bug 1602646. Setting attachment 9084080 [details] to obsolete.
| Assignee | ||
Updated•5 years ago
|
| Reporter | ||
Updated•5 years ago
|
Updated•4 years ago
|
| Assignee | ||
Comment 36•4 years ago
|
||
Hey Kris, did you get a chance to take a look at the patch at all? Just pinging you here in case it fell off your radar.
Comment 37•4 years ago
|
||
(In reply to Doug Thayer [:dthayer] (he/him) from comment #36)
Hey Kris, did you get a chance to take a look at the patch at all? Just pinging you here in case it fell off your radar.
Sorry about that! I was looking into this then all the holidays happened and it got buried in my queue. About halfway through right now.
Comment 38•4 years ago
|
||
Comment 39•4 years ago
|
||
| bugherder | ||
Updated•3 years ago
|
Description
•