This is a complex issue. It requires breaking down the current and future setups that Firefox can use to play back audio, and categorizing what is being played back and why. I think there are three categories we care about:
Media playback can simply mean listening to a media stream that has audible content.
This works as follows in Gecko:
- A thread fills a buffer of compressed data. This can come from:
  - XHR or fetch calls getting compressed data from an endpoint (this touches the main thread)
  - HTTP streaming, via a URL (this is off-main-thread)
- Another thread takes this compressed data and decodes it into decoded buffers. This can happen:
  - In the same process, in software
  - In another, dedicated process (called the Remote Data Decoder process)
- The decoded data ends up in a queue. The size of this queue is configurable; it's roughly a trade-off between latency, resilience against load, and memory usage.
- Every now and then (in an isochronous fashion), a very high-priority thread (the highest on the system, sometimes even using a different scheduler altogether) comes and gets some data off this queue, sometimes processes it a bit, and hands it to the system for playback.
This is the simple case of watching a movie or listening to a music track on YouTube or Spotify. We can buffer far ahead, and there are no real real-time constraints: the content is there and we can grab a lot of it, so we do not glitch even if the threads are not scheduled often enough or for long enough. We can lower the priority of most things, and it will work as long as the queues are big enough. There are, however, quite a lot of thread hops throughout this process. Lowering the priority of some threads widens the standard deviation of their typical response time, so we must take care to increase the buffer length, or we risk data under-runs.
However, sometimes the buffer lengths are not in our control. Sometimes the app itself tries to have very low latency while using high-latency constructs (say, a Facebook or Twitch live stream), and Firefox does not know (for now) that a particular site is trying to do this. This is hard, and it is why, in the past, we did not do any type of setTimeout background-tab throttling for any tab that was doing anything with audio: with throttling, setTimeout callbacks only ran once a second, and glitches were very frequent.
Playback can mean real-time media playback:
This uses an AudioContext to process the audio in various ways, and requires tight scheduling on the main thread and worker processes. It's more or less the same setup as above, although the output latency is extremely small, so when things start going bad, they get extremely bad much faster. SoundCloud is an example.
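As a rough order-of-magnitude illustration of "extremely small": the Web Audio API renders in blocks of 128 frames, and the time budget for one block follows directly from the sample rate. The 48 kHz rate and the one-second media queue used for comparison below are assumptions, not measured Firefox values.

```python
def buffer_duration_ms(frames, sample_rate_hz):
    """Duration of `frames` audio frames at `sample_rate_hz`, in milliseconds."""
    return 1000.0 * frames / sample_rate_hz


SAMPLE_RATE_HZ = 48_000  # assumed output sample rate

# One Web Audio render block (128 frames).
render_block_ms = buffer_duration_ms(128, SAMPLE_RATE_HZ)

# A hypothetical one-second decoded-audio queue for plain media playback.
media_queue_ms = buffer_duration_ms(48_000, SAMPLE_RATE_HZ)

print(f"render block: {render_block_ms:.2f} ms")  # render block: 2.67 ms
print(f"media queue:  {media_queue_ms:.0f} ms")   # media queue:  1000 ms
```

A thread that is descheduled for a few milliseconds is invisible behind the second buffer but can already miss a deadline with the first.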
Media playback can be a WebRTC call. In this case, we have hard real-time constraints:
- Webcam and microphone input must be recorded, encoded and sent out as fast as possible
- Audible content must be received, decoded and played out as fast as possible
This involves a variety of processes and threads: some for networking, some for decoding, some for capturing the webcam, some for playing back audio, etc. This type of application has a very low tolerance for scheduling issues and consumes lots of resources (encoding a video X times, decoding multiple videos X times, mixing a few audio streams with effects, recording a microphone while processing it for noise cancellation and echo cancellation, encoding it and sending it out, etc.). It is, however, rather common to switch to other tabs during a call.
A safe policy would be to not touch the priority of a process that has an output stream open. If that's not good enough, maybe we want to figure out something smarter.
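The safe policy can be sketched as a single predicate. This is only an illustration: `ProcessState`, its fields, and `may_lower_priority` are hypothetical names and do not correspond to any existing Gecko API.

```python
from dataclasses import dataclass


@dataclass
class ProcessState:
    """Hypothetical snapshot of a content process, for illustration only."""
    in_foreground: bool
    open_output_streams: int  # number of audio output streams currently open


def may_lower_priority(proc: ProcessState) -> bool:
    """Conservative policy: never deprioritize a process that is outputting audio."""
    if proc.open_output_streams > 0:
        return False
    # Without audio output, only background processes are candidates.
    return not proc.in_foreground


# A background tab in a WebRTC call keeps its priority.
print(may_lower_priority(ProcessState(in_foreground=False, open_output_streams=1)))  # False
# A silent background tab can be deprioritized.
print(may_lower_priority(ProcessState(in_foreground=False, open_output_streams=0)))  # True
```

This errs on the side of never glitching: it treats a buffered movie and a live call the same way, which is where a smarter policy could later distinguish the three categories above.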