Closed Bug 1670817 Opened 5 years ago Closed 3 years ago

Have crash recovery policy with RDD process

Categories

(Core :: Audio/Video: Playback, task)

task

Tracking

()

RESOLVED FIXED
99 Branch
Tracking Status
firefox99 --- fixed

People

(Reporter: jya, Assigned: stransky)

References

(Blocks 1 open bug)

Details

Attachments

(5 files)

The GPU process was designed to recover smoothly should it crashed. If multiple crashes occur the content's MediaFormatReader will disable HW decoding and request a software decoder instead.

The RDD process doesn't have this policy. If it crashes the decoding will fail and be fatal (the video will stop playing).

It was very unlikely for the RDD process to crash in the past, but following bug 1595994 it is going to be almost as likely to crash as the GPU process.

For now, the issue is downplayed that we will always prefer the GPU process to run the remote decoder for HW accelerated decoders.

I'm going to take this one.

Assignee: nobody → jyavenard
Assignee: jya-moz → nobody
See Also: → 1752493

Recently we set NS_ERROR_DOM_MEDIA_FATAL_ERR when decoder fails to init. That leads to broken media playback even when we're requested to create a new decoder (with possibility of disabled HW acceleration.

In this patch we pass proper error code from decoder to media control code to allow decoder recovery.

Assignee: nobody → stransky
Status: NEW → ASSIGNED

With the patch here we'll restart the RDD process after every crash and try again. I wonder if we should count the crashes and throw the "unable to play" error after say three crashes for the same clip?

The issue with MediaFormatReader check is that it doesn't work with RDD:
https://searchfox.org/mozilla-central/rev/4615b544a0f7166233c409c619b426c4025467a7/dom/media/MediaFormatReader.cpp#2369-2385

  1. RDD overrides decoder error to NS_ERROR_DOM_MEDIA_FATAL_ERR which is not checked so we don't do any recovery from RDD crashes
  2. decoder.mDecoder->IsHardwareAccelerated(error) does not work as expected for RDD. As we initialize mIsHardwareAccelerated after mChild->Init() here:

https://searchfox.org/mozilla-central/rev/b697834e78a3ef7613e2fa57c07624b1d9d1c909/dom/media/ipc/RemoteMediaDataDecoder.cpp#49

mIsHardwareAccelerated has undefined value when decoder crash in mChild->Init() (decoder init). That crash happens now due RDD sandbox but can also happen for broken drivers/vaapi library or so.

OTOH: decoder.mHardwareDecodingDisabled flag is properly delivered to decoders in RDD so we can use it.

So we need to:

  1. return different error code from RDD or update the 'first decoded frame failure' check.
  2. fix decoder.mDecoder->IsHardwareAccelerated(error)

According to https://searchfox.org/mozilla-central/rev/b697834e78a3ef7613e2fa57c07624b1d9d1c909/dom/media/platforms/PlatformDecoderModule.h#482 we don't have to call Init() to call IsHardwareAccelerated() but actual implementation is different. That comment says the IsHardwareAccelerated() actually means 'HW is enabled' but it's implemented as 'HW is used'.

Attachment #9262616 - Attachment description: Bug 1670817 Don't override error code from decoder init r?alwu → Bug 1670817 Don't override error code from decoder init in MediaChangeMonitor::CreateDecoderAndInit r?alwu
Attachment #9263032 - Attachment description: Bug 1670817 Disable HW decoding on Linux in case of RDD/GPU HW decoder crash / decoding error r?alwu → Bug 1670817 [Linux] Disable HW decoding in case of RDD/GPU HW decoder crash & decoding error r?alwu

With the patches here we have control over RDD crashes in MediaFormatReader so we can add some check how many RDD process restart we allow.
(the patch here always asks for new decoder in case of NS_ERROR_DOM_MEDIA_REMOTE_DECODER_CRASHED_ERR).

Pushed by stransky@redhat.com: https://hg.mozilla.org/integration/autoland/rev/8bb5a205517f Don't override error code from decoder init in MediaChangeMonitor::CreateDecoderAndInit r=alwu https://hg.mozilla.org/integration/autoland/rev/745394cb6014 Add a new NS_ERROR_DOM_MEDIA_REMOTE_DECODER_CRASHED_ERR error type r=alwu https://hg.mozilla.org/integration/autoland/rev/308808c4817c Throw NS_ERROR_DOM_MEDIA_REMOTE_DECODER_CRASHED_ERR error when RDD/GPU decoder crashes r=alwu https://hg.mozilla.org/integration/autoland/rev/2f72afc50db7 Make NS_ERROR_DOM_MEDIA_REMOTE_DECODER_CRASHED_ERR non-fatal r=alwu https://hg.mozilla.org/integration/autoland/rev/5f321c9a0add [Linux] Disable HW decoding in case of RDD/GPU HW decoder crash & decoding error r=alwu
Regressions: 1757470
No longer regressions: 1757470
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: