Make TaskController select pool thread deterministically
Categories
(Core :: XPCOM, task)
Tracking
()
Tracking | Status | |
---|---|---|
firefox130 | --- | fixed |
People
(Reporter: jonco, Assigned: jonco)
References
(Blocks 1 open bug, Regressed 1 open bug)
Details
(Keywords: perf-alert)
Attachments
(2 files)
TaskController can execute tasks off main thread using a pool of worker threads. Currently the thread used to run a given task is non-deterministic and ends up being selected by the host OS (TaskController notifies a single waiter on a condition variable and whichever thread is scheduled picks up a task to run).
It may be better to allow TaskController to select the thread itself. On heterogeneous systems this would allow for selecting a thread tied to a specific class of CPU that is appropriate for the specific task.
Also it may result in better cache behaviour to select threads in order, e.g. so the first pool thread is always used first if it is free, then the second and so on. In general the OS will try to keep a thread running on the same core, so this should result in a better chance of the thread having relevant data in its caches.
Comment 1•8 months ago
|
||
(In reply to Jon Coppeard (:jonco) from comment #0)
Also it may result in better cache behaviour to select threads in order, e.g. so the first pool thread is always used first if it is free, then the second and so on. In general the OS will try to keep a thread running on the same core, so this should result in a better chance of the thread having relevant data in its caches.
We recently did something similar for thread pools using a MRU approach, see bug 1891664. Bug 1897072 tracks some possible further enhancements we decided to not look into soon, but might be food for thought also here.
Assignee | ||
Comment 2•8 months ago
|
||
This passes a pointer to the current pool thread to RunPoolThread, removing the
need for thread local storage. It also does some refactoring, moving the
PoolThread definition out of the header file.
This also changes the thread vector to use UniquePtr which will be needed by
the next patch (it adds a CondVar to PoolThread which .
Assignee | ||
Comment 3•8 months ago
|
||
This replaces TaskController::mThreadPoolCV with an individual condition
variable for each thread, and uses these to wake individual threads. The task
for each thread is assigned beforehand by setting mCurrentTask.
The counter mIdleThreadCount is used to quickly tell if there are any threads
free.
This splits TaskController::RunPoolThread to move the task selection part to
SelectTaskToRun. MaybeDispatchOneTask is called whenever there may be a new
task to dispatch (this runs from both the main thread and pool threads).
I tried to keep the same behaviour with mEffectiveTaskPriority but I don't
fully understand what this is being used for.
Assignee | ||
Comment 4•8 months ago
|
||
(In reply to Jens Stutte [:jstutte] from comment #1)
Interesting, this does have similar aims to bug 1897072.
Assignee | ||
Comment 5•8 months ago
|
||
Benchmark results with the attached patches show some sp3 subtest improvements on Windows at high confidence:
- 19% improvement on React-Stockcharts-SVG/ZoomTheChart/Async
- 2-5% improvement on a bunch of TodoMVC, Charts, React-Stockcharts subtests
Updated•8 months ago
|
Updated•8 months ago
|
Comment 7•7 months ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/6506dfdce9d8
https://hg.mozilla.org/mozilla-central/rev/5b138611e81c
Comment 8•7 months ago
|
||
(In reply to Pulsebot from comment #6)
Pushed by jcoppeard@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6506dfdce9d8
Part 1: Pass current thread pointer to TaskController::RunPoolThread and
refactor r=xpcom-reviewers,nika
https://hg.mozilla.org/integration/autoland/rev/5b138611e81c
Part 2: Make TaskController select threads for tasks deterministically
r=xpcom-reviewers,nika
Perfherder has detected a browsertime performance change from push 5b138611e81c1e98528f36cc4a52d81240f5b84d.
Improvements:
Ratio | Test | Platform | Options | Absolute values (old vs new) | Performance Profiles |
---|---|---|---|---|---|
6% | bing-search-restaurants ContentfulSpeedIndex | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 608.30 -> 571.69 | Before/After |
4% | speedometer Flight-TodoMVC/CompletingAllItems/Sync | windows11-64-nightlyasrelease-qr | fission webrender | 15.46 -> 14.86 | Before/After |
4% | bing-search-restaurants PerceptualSpeedIndex | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 686.95 -> 661.44 | Before/After |
3% | bing-search-restaurants LastVisualChange | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 1,525.46 -> 1,480.03 | Before/After |
3% | bing-search-restaurants SpeedIndex | android-hw-a51-11-0-aarch64-shippable-qr | warm webrender | 673.00 -> 653.45 | Before/After |
Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.
If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.
You can run these tests on try with ./mach try perf --alert 1444
For more information on performance sheriffing please see our FAQ.
Updated•7 months ago
|
Description
•