Mark background tabs that are playing media as "active" for the Process Priority Manager
Categories
(Core :: DOM: Content Processes, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox67 | --- | fixed |
People
(Reporter: mconley, Assigned: mconley)
References
(Blocks 2 open bugs)
Details
Attachments
(2 files, 1 obsolete file)
In the event that background tabs are playing audio, assuming that audio is being decoded in the same process as that background tab (which I think still happens these days), I think lowering the priority of those processes might result in greater chances for stuttering and lag.
We should probably ensure that any processes that are playing media stay "active".
Comment 1•6 years ago
|
||
This is a complex issue, and requires breaking down the current and future setups that Firefox can use to playback audio, and also to categorize what type of things is being played back and why. I think there are three categories we care about:
Media playback can mean to simply listen to a media stream that has an audible content.
This works like this in Gecko:
- A thread fills a buffer of compressed data, this can be:
- xhr or fetch calls getting compressed data from an endpoint (this touches the main thread)
- HTTP streaming, via a URL (this is off-main-thread)
- Another thread takes this compressed data and decodes it into decoded buffers. This can be:
- In the same process, in software
- In another (dedicated, called Remote Data Decoder process) process
- This data ends up in a queue. The size of this queue is configurable, it's a trade-off between latency, resilience against load and memory usage, roughly
- Every now and then (in an isochronous fashion), a super high-priority thread (the highest on the system, even using different scheduler altogether sometimes) comes and get some data off this queue, sometimes processes things a bit, and hands the data to the system for play back.
This is the simple case of watching a movie or listening to a music track on youtube or spotify. We can buffer a lot ahead, there is no real real-time constraints: the content is there and we can grab a lot of it, and not glitch even if the threads are not being scheduled often enough and for long enough. We can lower the priority of most things, and it will work as long as the queues are big enough. There is however quite a lot of thread hops throughout this process. Lowering thread priorities of some threads will widen the standard deviation of a typical response time of those threads, so we must take care of bumping the buffer length, or we risk data under-runs.
However, sometimes, the buffer lengths are not in our control, sometimes, the app itself tries to have very low latency, while using high-latency constructs (say, a Facebook or Twitch live stream), and Firefox does not know (for now), that a particular site is trying to do this. This is hard, and had us not do any type of setTimeout
backtround-tab throttling in the past for any tab that is doing anything with audio: setTimeout
were only running once a second, and glitch were very frequent.
Playback can mean real time media playback:
This uses an AudioContext to process the audio in various ways, and requires tight scheduling on the main thread and worker processes. It's more or less the same setup as above, although the output latency is extremely small so when things starts going bad, it's extremely bad much faster. This is SoundCloud for example.
Media playback can be a WebRTC call. In this case, it's we have hard-real-time constraints:
- WebCam and microphone must be recorded, encoded and sent out as fast as possible
- Audible content must be received, decoded and played out as fast as possible
This involves a variety of processes and threads: some for networking, some for decoding, some for capturing the webcam, some for playing back audio, etc. This type of application has a very low tolerance for scheduling issues, and consumes lots of resources (encoding X times a video, decoding X times multiple videos, mixing a few audio streams with effects, recording a microphone while processing it for noise cancellation and echo cancellation, encoding and sending it out, etc.). It's however rather common to switch to other tabs during a call.
A safe policy would be to not touch the priority of a process that has an output stream opened. If it's not good enough, maybe we want to figure out something smarter.
Assignee | ||
Comment 2•6 years ago
|
||
cc'ing gsvelto, who I believe has an active interest in this area as well.
I noted this on the distribution list, but my understanding is that we we stop decoding video frames in background videos after X seconds, while the audio, which requires fewer CPU resources, continues.
Bug 1224973 is the one that enabled this functionality.
Comment 5•6 years ago
|
||
I've double-checked the code in the process priority manager and it still assigns a higher-than-background priority to tabs that are holding "cpu" or "high-priority" wake-locks [1]. Unfortunately this doesn't apply anymore to tabs playing audio or video, those use "audio-playing" [2] and "video-playing" [3] wake-locks. That code should be changed to reflect that and the old wake-lock topics should be removed since they're unused (at least from what I can tell). Once that's done we'll then need to change the mapping for the PROCESS_BACKGROUND_PERCEIVABLE_PRIORITY in the HAL [4]. We might use Windows' BELOW_NORMAL_PRIORITY_CLASS for background tabs that are playing music and IDLE_PRIORITY_CLASS for those that aren't.
Once this is done we might try using the PROCESS_MODE_BACKGROUND_BEGIN/PROCESS_MODE_BACKGROUND_END class [5] for background tabs. Compared to IDLE_PRIORITY_CLASS it lowers both CPU and I/O resource usage of the process. IDLE_PRIORITY_CLASS only lowers CPU priority.
[1] https://searchfox.org/mozilla-central/rev/78cd247b5d7a08832f87d786541d3e2204842e8e/dom/ipc/ProcessPriorityManager.cpp#732
[2] https://searchfox.org/mozilla-central/rev/78cd247b5d7a08832f87d786541d3e2204842e8e/dom/html/HTMLMediaElement.cpp#3808
[3] https://searchfox.org/mozilla-central/rev/78cd247b5d7a08832f87d786541d3e2204842e8e/dom/html/HTMLVideoElement.cpp#296
[4] https://searchfox.org/mozilla-central/rev/78cd247b5d7a08832f87d786541d3e2204842e8e/hal/windows/WindowsProcessPriority.cpp#27
[5] https://docs.microsoft.com/en-us/windows/desktop/api/processthreadsapi/nf-processthreadsapi-setpriorityclass
Comment 6•6 years ago
|
||
It looks like this is being seen in the wild (bug 1524193).
Comment 7•6 years ago
|
||
(In reply to Aaron Klotz [:aklotz] from comment #6)
It looks like this is being seen in the wild (bug 1524193).
It's not surprising... With DOM timers throttling (as Paul mentioned in comment 1) over the years several times we have made changes to throttle more aggressively that didn't take audio playback into consideration assuming things will hopefully work out well and every time users ran into issues on sites/machines we didn't test. :-)
Assignee | ||
Comment 8•6 years ago
|
||
(In reply to Gabriele Svelto [:gsvelto] from comment #5)
That all sounds generally reasonable... so is it fair to say that, since the wake-lock topics aren't being used, there's currently nothing directly informing the ProcessPriorityManager that a tab has media playing, and that we'd have to build that in?
Comment 9•6 years ago
|
||
(In reply to Mike Conley (:mconley) (:⚙️) from comment #8)
(In reply to Gabriele Svelto [:gsvelto] from comment #5)
That all sounds generally reasonable... so is it fair to say that, since the wake-lock topics aren't being used, there's currently nothing directly informing the ProcessPriorityManager that a tab has media playing, and that we'd have to build that in?
Yes, since it has been unused for almost three years its tabs-playing-audio detection has bit-rotted. Fortunately fixing it should be trivial.
Assignee | ||
Comment 10•6 years ago
•
|
||
(In reply to Gabriele Svelto [:gsvelto] from comment #9)
Fortunately fixing it should be trivial.
Well, that's good to hear. :)
From a preliminary scan of the wake lock and process priority manager code, it looks like we set the wake locks for audio and video in the content process, and map that to the window which requested the wake lock.
I see us going pretty deep into Hal once we register that wake lock, but I don't see anything sending a message up to the parent to actually put the lock in up there (presumably, the parent needs to know about the wake locks for them to have any effect).
Is that plumbing just missing? Or am I not seeing it?
Assignee | ||
Comment 11•6 years ago
|
||
I think I'm wrong in comment 10 - I think there is glue proxying wakelock information up to the parent:
So we might have the pieces we need here already, and we'll need to update the Process Priority Manager to handle those media locks.
Comment 12•6 years ago
|
||
I've just double-checked and the wakelocks are being properly proxy'd to the parent process. We just need to wire it up. I can write a patch for this on Monday if it's not super urgent (it's getting late over here).
Assignee | ||
Comment 13•6 years ago
|
||
I've got a patch that seems to do the job. Coming up.
Assignee | ||
Comment 14•6 years ago
|
||
Assignee | ||
Comment 15•6 years ago
|
||
Depends on D18393
Assignee | ||
Comment 16•6 years ago
|
||
I didn't tear out the high-priority WakeLock support from ProcessPriorityManager, since it looks like there's more in ProcessPriorityManager that cares about high-priority processes (like mHighPriorityChildIDs). For "cpu", I noticed that Android seems to reference it: https://searchfox.org/mozilla-central/rev/e62b311b2e0e02633afda2004ba4056b09fbcaa4/mobile/android/geckoview/src/main/java/org/mozilla/gecko/GeckoAppShell.java#578
So I went for the simple thing, and just added video and audio WakeLock support. I imagine we can think about removing the others in a follow-up.
I went with BELOW_NORMAL_PRIORITY_CLASS for PROCESS_PRIORITY_BACKGROUND_PERCEIVABLE as per your recommendation.
Comment 17•6 years ago
|
||
Do we know the interaction between the Multimedia Class Scheduler Service-per thread priority settings (that is essential for any type of audio work) and SetPriorityClass
process-wide priority setting?
In particular, Firefox always uses the class "Audio" for the audio processing thread. Not doing this results in unbearable artifacts even when the machine is very fast and idle.
How do we plan to validate what we're doing here? In general, saturating all CPU/disks/memory bandwidth with a special program (stress on osx and Linux for example, I don't know on Windows), playing back something in Firefox and listening for glitches is good enough to detect issues, but I can provide more advanced tools that we built to do regression testing and real-time code validation if need be.
Comment 18•6 years ago
|
||
The patches are looking good, I believe that we can tear out the "high-priority" wakelock support because it's only use was to lower the priority of all processes which had not requested it in Firefox OS (ironically it was introduced as a workaround for an Android bug that was later solved, but we never removed it). The cpu wakelock seems to be still supported on Fennec but I'm not sure if it's used. I'll open separate bugs to investigate.
Assignee | ||
Comment 19•6 years ago
|
||
(In reply to Paul Adenot (:padenot) from comment #17)
Do we know the interaction between the [Multimedia Class Scheduler Service]-per thread priority settings (that is essential for any type of audio work) and
SetPriorityClass
process-wide priority setting?
I don't, and the documentation doesn't make it super clear, but I can make an educated guess. It seems as if the Multimedia Class Scheduler Service is ultimately a way of setting thread priority though, and the Windows scheduler seems to compute which thread gets the next task by looking at both thread and process priority.
This would suggest that a background tab with BELOW_NORMAL_PRIORITY_CLASS using the MCSS priority of "Audio" would have less priority than a tab with NORMAL_PRIORITY_CLASS and an MCSS priority of "Audio".
(In reply to Paul Adenot (:padenot) from comment #17)
How do we plan to validate what we're doing here? In general, saturating all CPU/disks/memory bandwidth with a special program (stress on osx and Linux for example, I don't know on Windows), playing back something in Firefox and listening for glitches is good enough to detect issues, but I can provide more advanced tools that we built to do regression testing and real-time code validation if need be.
I think testing out some of these tools on the reference hardware and ensuring similar playback characteristics sounds like a fine idea. How can I get my hands on them?
Assignee | ||
Comment 21•6 years ago
|
||
Hey gsvelto - I may have spotted a bug in my patches here... correct me if I'm wrong, but suppose we have an <audio> element and a <video> element playing in a tab, and that tab goes into the background.
The patches will put that background tab into BELOW_NORMAL_PRIORITY_CLASS. So far so good.
Now suppose the video ends, and it relinquishes its audio and video locks.
From what I can tell, an update will be broadcast up to the parent to update the wakelock states... the parent will hear that there's no video wakelock, but still an audio wakelock.
If the notification of the video wakelock going away comes second, won't my patch for ParticularProcessPriorityManager::Notify set mHoldsPlayingMediaWakeLock to false, and we'll set the background tab to PROCESS_PRIORITY_BACKGROUND?
If so, maybe I shouldn't have mHoldsPlayingMediaWakeLock bool that's the || of audio / video locks, but individual bools for audio and video.
Or is this not a problem for reasons I don't see?
Comment 22•6 years ago
|
||
(In reply to Mike Conley (:mconley) (:⚙️) from comment #21)
Hey gsvelto - I may have spotted a bug in my patches here... correct me if I'm wrong, but suppose we have an <audio> element and a <video> element playing in a tab, and that tab goes into the background.
[...]
If the notification of the video wakelock going away comes second, won't my patch for ParticularProcessPriorityManager::Notify set mHoldsPlayingMediaWakeLock to false, and we'll set the background tab to PROCESS_PRIORITY_BACKGROUND?
Now that you make me think about it I guess it's possible... I'm not 100% but better be safe than sorry and split the logic to use two booleans.
BTW we had tests that covered the PPM logic but I can't find them anymore so I guess they're gone. I'll try to dig them up and see if I can restore them; they would be useful to cover this kind of potential corner-cases.
Assignee | ||
Comment 23•6 years ago
|
||
Okay, thanks - I've switched to a pair of bools then. I've also addressed your review comment. Does this still look okay to land?
Comment 24•6 years ago
|
||
(In reply to Mike Conley (:mconley) (:⚙️) from comment #23)
Okay, thanks - I've switched to a pair of bools then. I've also addressed your review comment. Does this still look okay to land?
Yep! Keeping my fingers crossed about BELOW_NORMAL_PRIORITY_CLASS
being enough to fix the audio glitches.
Comment 25•6 years ago
|
||
I'm using a build without these patches, and it's rather easy to reproduce these glitches when the process priority manager is enabled, and also programmatically detect them, on an XPS15 with a Core i7 7700HQ, by loading the CPU and memory a bit, and without stressing the IOs.
I'm dusting off a machine that's more like something one of our users would use to have another look, and I'm preparing a little patch and instructions on how measure this kind of things.
Comment 26•6 years ago
|
||
This is pretty rough but gives good data.
Comment 27•6 years ago
|
||
To use this, apply the patch, and run Firefox (in opt mode, etc.), like so:
MOZ_DISABLE_CONTENT_SANDBOX=1 MOZ_LOG=MSGTracing:5,raw,sync MOZ_LOG_FILE=tracing.log ./mach run
and then do something that does audio but only once (the code is using globals for ergonomics). For example, this is one of the scenarios I've tested:
- Run Firefox with the command above
- Open a YouTube video that is a music video (kind of important, we don't want a video that has silence in it, because we would miss some glitches, a music video is often good enough)
- Load the machine (I open tabs, run load testing tools, etc.)
- After some time, close Firefox
- Find the files named "tracing.log.child*", there is one that contains lots of data (it's json)
- Load it into https://padenot.github.io/msg-load-analyzer/, this will compute a bunch of data, show graphs. If the curve crosses 1, the audio thread blew its budget.
The "Load" is a metric that is the ratio of the time it took to do the rendering of a buffer of audio, to the length of this buffer of audio. For example, if it takes more than 10ms to render 10ms of audio, there's going to be trouble.
When doing playback, we have lots of headroom (there is a buffer queue in the system, which means that if we miss even a couple deadline, it's not going to glitch), however some sites use the Web Audio API to do their output. When this is the case, this safety net does not exist, and it's easier to glitch under load.
A good example of a website that uses the Web Audio API for output is SoundCloud. Youtube can be use as a test website for HTMLMediaElement.
Comment 28•6 years ago
|
||
Comment 29•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/4a12df5a18f0
https://hg.mozilla.org/mozilla-central/rev/f77433e6354c
Comment 30•6 years ago
|
||
Reopening this to not scatter the information and measurements everywhere.
Mike, I can run some measurements here if you want, did you have time to gather some data yourself ?
Assignee | ||
Comment 31•6 years ago
|
||
(In reply to Paul Adenot (:padenot) from comment #30)
Mike, I can run some measurements here if you want, did you have time to gather some data yourself ?
I haven't gotten to it yet - if you have a few moments, that'd be very helpful! Thank you!
Updated•6 years ago
|
Comment 32•6 years ago
|
||
FYI I'm going to finish cleaning up the other wakelocks in bug 1358005.
Comment 33•6 years ago
|
||
We can close this, a better behaviour has been landed on autoland in https://bugzilla.mozilla.org/show_bug.cgi?id=1524193.
Updated•6 years ago
|
Updated•6 years ago
|
Description
•