Closed Bug 1383580 Opened 7 years ago Closed 7 years ago

EME video playback fails due to hitting CDM IPC shmem limit

Categories

(Core :: Audio/Video: GMP, defect, P1)

55 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla56
Tracking Status
firefox56 --- fixed

People

(Reporter: cpearce, Assigned: cpearce)

Details

Attachments

(2 files)

In some situations playback of EME video fails due to us hitting the limit on the number of shmems we pre-allocate for transferring video frames from the CDM to Firefox.

Telemetry shows that we do hit the 50-shmem limit in Beta sometimes (the 50-infinity line in the table):
https://mzl.la/2vO60Ab

This is affecting Beta and Nightly.

This failure appears to happen basically every time in Nightly while watching Amazon Prime Video. I believe this started happening more in Nightly after we pushed the 1.4.8.970 CDM update, as that has a new video decoder which will have new behaviour.

While playing First Contact (1997) on Amazon Prime Video, I see up to 75 (!) shmems being allocated. That means we're putting 75 video frames into the decoder before we're getting anything out. It would seem the new decoder pre-buffers a lot more frames than the old CDM's decoder.

We need to use a new strategy for managing the shmems we use for transferring video frames between the CDM and Firefox.
We should fix this in Beta 55 as this failure happens there, though thankfully not often.
The problem is that we're underestimating the size of the buffer required. We increase the number of shmems in our pool if we receive a video frame from the CDM which is not in a shmem. The CDM process allocates a frame which is not a shmem if it doesn't have a shmem big enough. When we switch bitrates, we purge the shmems, and estimate the size of shmem we'll need. The parent then sends the CDM process enough shmems of that size to fill the pool. When the CDM tries to allocate a frame, it has a full pool of shmems that are all too small, so it returns the frame in a non-shmem buffer. The parent sees this, increases the pool size, purges all shmems, and resends the CDM process a (now larger) pool of shmems of the correct size.

If we switch bitrates enough, we'll increase the pool size like this until we hit the pool size limit, whereupon we error.
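
To make the failure mode above concrete, here is a minimal Python sketch of the pool growth. It is not Firefox code: the function names, the starting pool size, and the assumption that the pool grows by exactly one shmem per undersized estimate are all hypothetical. Only the 50-shmem limit comes from this bug.

```python
# Hypothetical model of the pool growth described above.
# None of these names come from the Firefox source.

POOL_LIMIT = 50  # the 50-shmem limit mentioned in this bug


class PoolLimitExceeded(Exception):
    """Raised when the pool would have to grow past POOL_LIMIT."""


def switch_bitrate(pool_size, estimated_size, actual_frame_size):
    """Return the pool size after one bitrate switch.

    The parent purges the pool and refills it with shmems of
    `estimated_size`. If that is too small for the frame, the CDM returns
    the frame in a non-shmem buffer, and the parent reacts by growing the
    pool (assumed here: by one shmem) and resending larger shmems.
    """
    if estimated_size >= actual_frame_size:
        return pool_size  # the frame fits in a shmem; the pool is unchanged
    if pool_size + 1 > POOL_LIMIT:
        raise PoolLimitExceeded(f"pool would exceed {POOL_LIMIT} shmems")
    return pool_size + 1


def simulate(initial_pool_size, switches):
    """Apply a sequence of (estimated_size, actual_frame_size) switches."""
    pool_size = initial_pool_size
    for estimated_size, actual_frame_size in switches:
        pool_size = switch_bitrate(pool_size, estimated_size, actual_frame_size)
    return pool_size
```

In this model the pool never needed to grow, since no frame was ever waiting on a free shmem. Every undersized estimate is misread as "pool too small", so the pool only ever ratchets upward until `PoolLimitExceeded` is raised.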
(In reply to Chris Pearce (:cpearce) from comment #0)
> While playing First Contact (1997) on Amazon Prime Video, I see up to 75 (!)
> shmems being allocated. That means we're putting 75 video frames into the
> decoder before we're getting anything out. It would seem the new decoder
> pre-buffers a lot more frames than the old CDM's decoder.

For the record, my claim here that we're putting 75 frames into the decoder before getting output is incorrect; we are pre-allocating too many frames as I describe in comment 2.
Comment on attachment 8889265 [details]
Bug 1383580 - Add an explicit message to increase CDM-Firefox shmem pool.

https://reviewboard.mozilla.org/r/160316/#review165590

::: commit-message-d544d:7
(Diff revision 1)
> +
> +The strategy we were using to expand the pool of shmems we use to shuffle video
> +frames between the CDM and content processses is to increase the size of the
> +pool if the content process receives a video frame in a non-shmem. However the
> +CDM process will send a frame in a non-shmem if none of the shmems in the pool
> +are bug enough to fit the frame the CDM produces. This happens if we

'bug' -> 'big'
Attachment #8889265 - Flags: review?(gsquelart) → review+
Comment on attachment 8889266 [details]
Bug 1383580 - Pad estimated CDM frame sizes.

https://reviewboard.mozilla.org/r/160318/#review165594
Attachment #8889266 - Flags: review?(gsquelart) → review+
Pushed by cpearce@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/25b1e61ea394
Add an explicit message to increase CDM-Firefox shmem pool. r=gerald
https://hg.mozilla.org/integration/autoland/rev/8ac6475952a1
Pad estimated CDM frame sizes. r=gerald
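
The patches themselves are not quoted in this bug, so the following Python sketch is only one plausible reading of the two commit titles above. It assumes the CDM can report "every shmem is in use" separately from "no shmem is big enough", and that the size estimate is padded by a fixed fraction. The signal names, the 25% padding, and the grow-by-one step are hypothetical and are not taken from the Firefox source.

```python
from enum import Enum, auto


class CdmSignal(Enum):
    """Hypothetical outcomes of the CDM trying to allocate a frame."""
    FRAME_IN_SHMEM = auto()    # a suitable shmem was available
    SHMEM_TOO_SMALL = auto()   # shmems were free, but none was big enough
    POOL_EXHAUSTED = auto()    # explicit request: all shmems are in use


def padded_estimate(estimated_size):
    """Pad the estimated frame size (assumed here: by 25%) so that a slight
    underestimate still fits in the pre-allocated shmems."""
    return estimated_size + estimated_size // 4


def next_pool_size(pool_size, signal):
    """Grow the pool only on an explicit request, never on a size mismatch."""
    if signal is CdmSignal.POOL_EXHAUSTED:
        return pool_size + 1
    return pool_size
```

With the two cases separated, a frame that does not fit triggers a resize of the shmems rather than a larger pool, so repeated bitrate switches no longer push the pool toward its limit.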
Must remember to uplift.
Flags: needinfo?(cpearce)
Actually, I think we can get away without this in 55. We only hit this case regularly in Nightly because the new 970 Widevine CDM doesn't request buffers of the size we expect, but the 903 CDM in 33 does. So we won't hit this degenerate case.

The telemetry shows that while we do somehow hit the limit in beta 55, we do so in 0% of cases. So we're probably ok without this change.
Flags: needinfo?(cpearce)