vsync stops being notified while counting MIDI devices for telemetry
Categories
(Core :: DOM: Core & HTML, defect, P2)
Tracking
()
People
(Reporter: florian, Assigned: bholley)
References
(Regression)
Details
(Keywords: perf:responsiveness, perf:startup, regression)
Attachments
(7 files)
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review | |
48 bytes,
text/x-phabricator-request
|
Details | Review |
While profiling bug 1789268 to verify the fix, I noticed a pause during the animation. Here is the profile: https://share.firefox.dev/3uZMm4S
There's a 738ms gap during which no refresh driver tick happened on the content process.
When looking at what's happening in the parent process at the same time, there's a long (793ms) "startupLateIdleTask" marker. The main thread seems idle during that time.
At the beginning of that idle startup task, there's a ChromeUtils.importESModule - resource://gre/modules/PrivateBrowsingUtils.sys.mjs marker with this stack:
ChromeUtils::IdleDispatch handler []
_scheduleBestEffortUserIdleTasks/< [resource:///modules/BrowserGlue.sys.mjs:2936:37]
_scheduleBestEffortUserIdleTasks/idleTasks< [resource:///modules/BrowserGlue.sys.mjs:2919:12]
getTopWindow [resource:///modules/BrowserWindowTracker.jsm:164:14]
This points to this code added in bug 1797019.
The part of the profile at the end of the startupLateIdleTask marker looks related to audio devices: https://share.firefox.dev/3FH0qVL
It looks like even though the main thread is not blocked, the operating system stops notifying our process of vsync until we are done counting audio devices. Markus, any idea of why this could happen?
Comment 1•1 year ago
|
||
Can you capture a profile that includes the WindowsVsyncThread?
Assignee | ||
Comment 2•1 year ago
|
||
So this telemetry isn't super important and we could of course remove it. Unfortunately, various sites out there silently request MIDI access to fingerprint users, and so to avoid annoying users with prompts, we internally enumerate the devices to determine whether the user has any connected. So even if we remove this telemetry, some users will still hit this jank in the wild during ordinary unrelated browsing.
Assignee | ||
Comment 3•1 year ago
|
||
(unless it's the case that this jank is specifically related to requesting MIDI access from the parent process, in which case we can indeed solve it by removing the telemetry).
Reporter | ||
Comment 4•1 year ago
|
||
(In reply to Markus Stange [:mstange] from comment #1)
Can you capture a profile that includes the WindowsVsyncThread?
I don't know yet if I can reproduce.
Reporter | ||
Comment 5•1 year ago
|
||
(In reply to Markus Stange [:mstange] from comment #1)
Can you capture a profile that includes the WindowsVsyncThread?
This thread seems to only exist on Windows. Here is a profile with all registered threads: https://share.firefox.dev/3YzTLWd
I improved the startup idle task markers to make the profiles easier to read (apply the patch from bug 1806519 if you would like to have these markers locally).
(In reply to Florian Quèze [:florian] from comment #4)
I don't know yet if I can reproduce.
I can actually reproduce most of the time. The "startupLateIdleTask - reportHasMIDIDevices" marker typically takes 700 to 800ms on my machine. I once so it take only 25ms, and that wasn't long enough to make vsync skip a frame.
Comment 6•1 year ago
|
||
Oh, oops, this is on macOS, for some reason I was thinking Windows.
The all-threads profile shows the reason: MIDIPlatformService::Get()
is running on the IPDL background thread, blocking it for the duration of the call to MidiClientCreate
: https://share.firefox.dev/3WxSA7I
This is not good; long-running or blocking tasks should not run on the PBackground thread, see this now-deleted documentation of PBackground:
The background thread is designed to be responsive (nobody is allowed to do long running computation or file I/O on it) to guarantee better latency
Vsync is delivered via the IPDL background thread.
Comment 7•1 year ago
|
||
Set release status flags based on info from the regressing bug 1797019
:nchevobbe, since you are the author of the regressor, bug 1797019, could you take a look? Also, could you set the severity field?
For more information, please visit auto_nag documentation.
Comment 8•1 year ago
|
||
(In reply to Release mgmt bot [:suhaib / :marco/ :calixte] from comment #7)
:nchevobbe, since you are the author of the regressor, bug 1797019, could you take a look? Also, could you set the severity field?
Looks like folks more knowledgeable that me already started looking into this, but let me know if there's anything I can do to help
Assignee | ||
Updated•1 year ago
|
Assignee | ||
Comment 9•1 year ago
|
||
My plan here is the following:
(1) At startup, explicitly initialize a StaticRefPtr<TaskQueue> gMIDITaskQueue on top of the background thread pool.
(2) Bind PMIDIManager to that task queue using the same pattern as CreateFileSystemManagerParent.
(3) Update RecvHasMIDIDevice to explicitly proxy over to the task queue.
(4) Update all the assertions to ensure that all the parent MIDI stuff runs on the task queue.
Updated•1 year ago
|
Assignee | ||
Comment 10•1 year ago
|
||
Assignee | ||
Comment 11•1 year ago
|
||
Assignee | ||
Comment 12•1 year ago
|
||
Assignee | ||
Comment 13•1 year ago
|
||
Assignee | ||
Comment 14•1 year ago
|
||
Assignee | ||
Comment 15•1 year ago
|
||
Assignee | ||
Comment 16•1 year ago
|
||
Assignee | ||
Comment 17•1 year ago
|
||
Assignee | ||
Comment 18•1 year ago
|
||
Assignee | ||
Comment 19•1 year ago
|
||
The patches here should fix this bug properly, but they're far too involved to backport. As a mitigation, I propose we back bug 1797019 out on beta (or even a dot release if we do one), which will substantially reduce the frequency with which users encounter this stutter. Ryan, WDYT?
Comment 20•1 year ago
|
||
Fine by me! CCing Dianna as well for the 108 side of things.
Comment 21•1 year ago
|
||
backout |
Fixed by backout for 109.0b7.
https://hg.mozilla.org/releases/mozilla-beta/rev/3a2cd0b0fdf0
Comment 22•1 year ago
|
||
The bug is marked as tracked for firefox108 (release) and tracked for firefox110 (nightly). However, the bug still has low severity.
:hsinyi, could you please increase the severity for this tracked bug? If you disagree with the tracking decision, please talk with the release managers.
For more information, please visit auto_nag documentation.
Comment 23•1 year ago
|
||
Fixed by backout for 108.0.2
https://hg.mozilla.org/releases/mozilla-release/rev/db4544e2e40a
Updated•1 year ago
|
Comment 24•1 year ago
|
||
Pushed by bholley@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/84ef3b5a18a6 Make PMIDIManager refcounted. r=nika https://hg.mozilla.org/integration/autoland/rev/ba226492e8fd Use a helper class to null out the weak reference in MIDIPortChild. r=nika https://hg.mozilla.org/integration/autoland/rev/fe0bd3672dbd Make PMIDIPort refcounted. r=nika https://hg.mozilla.org/integration/autoland/rev/4ba470d9baca Make PMIDIManager a top-level protocol. r=nika https://hg.mozilla.org/integration/autoland/rev/132feb5df539 Make PMIDIPort a top-level protocol. r=nika https://hg.mozilla.org/integration/autoland/rev/4cbb0ea04af1 Stand up a background task queue for the MIDI service. r=nika https://hg.mozilla.org/integration/autoland/rev/06a0a304a757 Hoist the MIDI service onto a dedicated background task queue. r=nika,gsvelto
Comment 25•1 year ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/84ef3b5a18a6
https://hg.mozilla.org/mozilla-central/rev/ba226492e8fd
https://hg.mozilla.org/mozilla-central/rev/fe0bd3672dbd
https://hg.mozilla.org/mozilla-central/rev/4ba470d9baca
https://hg.mozilla.org/mozilla-central/rev/132feb5df539
https://hg.mozilla.org/mozilla-central/rev/4cbb0ea04af1
https://hg.mozilla.org/mozilla-central/rev/06a0a304a757
Assignee | ||
Comment 26•1 year ago
|
||
Florian, is it easy to confirm the performance issue here is resolved? It should be (we moved it to a different thread), so if it's more than 10 minutes don't bother.
Reporter | ||
Comment 27•1 year ago
|
||
(In reply to Bobby Holley (:bholley) from comment #26)
Florian, is it easy to confirm the performance issue here is resolved? It should be (we moved it to a different thread), so if it's more than 10 minutes don't bother.
"startupLateIdleTask - reportHasMIDIDevices" still takes about 800ms, but animations continue during that time: https://share.firefox.dev/3QJLUBV
So yes, FIXED, thanks!
Description
•