Encrypted subsamples with clear ranges near UINT16_MAX can overflow clear ranges during AnnexB conversion
Categories
(Core :: Audio/Video: Playback, defect, P2)
Tracking
()
People
(Reporter: janliska, Assigned: bryce)
Details
Attachments
(2 files)
47 bytes,
text/x-phabricator-request
|
jcristau
:
approval-mozilla-beta+
jcristau
:
approval-mozilla-esr78+
|
Details | Review |
47 bytes,
text/x-phabricator-request
|
jcristau
:
approval-mozilla-beta+
jcristau
:
approval-mozilla-esr78+
|
Details | Review |
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36
Steps to reproduce:
- Go to https://drm-test.livesporttv.cloud/issue/index.php (can I send you credentials via email? The stream can't be public...)
- Play video stream and try to move the seek bar
Actual results:
Most of the times video stops playing
Expected results:
Video should start playing at given point.
Comment 1•5 years ago
|
||
Bugbug thinks this bug should belong to this component, but please revert this change in case of error.
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 2•5 years ago
•
|
||
We've updated handling around certain streams and it should improve compat with protected streams. I've retested the link in the first comment and believe it's now working. Could you retest (with Firefox nightly) and let me know if you see the same?
Thanks for update. I've tested nightly build with given stream and I am sorry to say that but the issue persists. Sometimes it looks good for a long time, but if you seek frequently eventually it stops. And sometimes it breaks right at the beginning.
Can I help you somehow with debugging? I can send you screen capture with issue, would it be helpful to you?
Assignee | ||
Comment 4•5 years ago
|
||
(In reply to janliska from comment #3)
Thanks for update. I've tested nightly build with given stream and I am sorry to say that but the issue persists. Sometimes it looks good for a long time, but if you seek frequently eventually it stops. And sometimes it breaks right at the beginning.
Can I help you somehow with debugging? I can send you screen capture with issue, would it be helpful to you?
If you have any logging that we could use to trace the specific series of actions that leads to the stall, that would be helpful. E.g. if there a specific point that fails when seeking because Fx fails to parse the data at that point, it would be useful to know about that and it could help us produce a more minimal and deterministic test case.
First of all, I would like to apologize for the delay in communication, nevertheless finding the exact error trigger took us longer than we expected.
We were able to find out that error happens in the moment of change of m4v init file. This occurs when changing ABR profile or during seeking over DASH <Period>.
For testing you can use the previously sent https://drm-test.livesporttv.cloud/issue/index.php (login and password was sent by 'janliska' 28 February 2020, in case of need I can send again).
Latest tests on 77.0.1 (64-bit) and Nightly 79.0a1 (2020-06-08) (64-bit) both for Windows (tested also on earlier versions since inception of this issue).
Would like to receive clarification of anything else? Please let me know.
Hello Bryce, is there any news on your side regarding this? As M pointed out some time ago, we trace down the issue to the change of m4v init files. Do you need any additional information from us to investigate?
Assignee | ||
Comment 7•5 years ago
|
||
If you're able to identify a the files that can be appended to trigger the issue it would be useful if you could either send them through so we can reproduce the issue or to set up a site that just serves those files to repro the error. In either case this would allow us to better isolate the specific issue.
(In reply to Bryce Seager van Dyk (:bryce) from comment #7)
If you're able to identify a the files that can be appended to trigger the issue it would be useful if you could either send them through so we can reproduce the issue or to set up a site that just serves those files to repro the error. In either case this would allow us to better isolate the specific issue.
Can I send you a demo via email? I cannot post this publicly. Thanks
(In reply to Bryce Seager van Dyk (:bryce) from comment #7)
If you're able to identify a the files that can be appended to trigger the issue it would be useful if you could either send them through so we can reproduce the issue or to set up a site that just serves those files to repro the error. In either case this would allow us to better isolate the specific issue.
I've just sent you details to bvandyk@mozilla.com
Assignee | ||
Comment 10•5 years ago
|
||
Thanks, confirming I received the details and will investigate.
Assignee | ||
Comment 11•5 years ago
|
||
I'm reproing by switching resolution of the streams. It would be useful to have a deterministic testcase if possible. E.g. one that served an init segment for a certain resolution, gave some media at that resolution, then switched and reliably fails.
In the mean time, the issue is that the Widevine CDM is failing to decrypt and decode the data provided after a resolution switch. The CDM itself is largely a black box, but it's giving the error code indicating a general purpose decryption failure. This suggests the CDM thinks it has the key to decrypt the data it receives, but is failing to decrypt when it actually tries.
Comment 12•5 years ago
|
||
(In reply to Bryce Seager van Dyk (:bryce) from comment #11)
I'm reproing by switching resolution of the streams. It would be useful to have a deterministic testcase if possible. E.g. one that served an init segment for a certain resolution, gave some media at that resolution, then switched and reliably fails.
In the mean time, the issue is that the Widevine CDM is failing to decrypt and decode the data provided after a resolution switch. The CDM itself is largely a black box, but it's giving the error code indicating a general purpose decryption failure. This suggests the CDM thinks it has the key to decrypt the data it receives, but is failing to decrypt when it actually tries.
I've just sent you an updated link via email.
Comment 13•5 years ago
|
||
(In reply to Bryce Seager van Dyk (:bryce) from comment #11)
I'm reproing by switching resolution of the streams. It would be useful to have a deterministic testcase if possible. E.g. one that served an init segment for a certain resolution, gave some media at that resolution, then switched and reliably fails.
In the mean time, the issue is that the Widevine CDM is failing to decrypt and decode the data provided after a resolution switch. The CDM itself is largely a black box, but it's giving the error code indicating a general purpose decryption failure. This suggests the CDM thinks it has the key to decrypt the data it receives, but is failing to decrypt when it actually tries.
Hello Bryce, was the last video nay helpful? It keeps failing quite reliably. Let me know if I can provide any more information.
Assignee | ||
Comment 14•5 years ago
|
||
Think I've located the source of the issue.
ISO/IEC 23001-7 specifies that the # clear bytes for subsamples are stored in a uint16 and we read that into a uint16. However, we end up expanding that range while performing AnnexB conversion -- which typically takes place on the first sample after we (re) init. If the number of clear bytes is near UINT16_MAX then this expansion can lead to an overflow, which is what is happening here.
The Widevine interface accepts a uint32_t for the # clear bytes values, possible to mitigate this issue. We should be able to mitigate this by storing the value as a uint32_t internally.
Comment 15•5 years ago
|
||
Related issue on Shaka Player's issue tracker: https://github.com/google/shaka-player/issues/2427
Thanks, Mozilla, for your work on this!
Assignee | ||
Comment 16•5 years ago
|
||
Add test code to ensure AnnexB conversions behave as we expect. This adds some
coverage for non-encrypted conversions that we only tested with broader tests
until now. It also adds a test to ensure we don't overflow our subsample sizes
when dealing with encrypted media with very large subsamples. This latter test
covers the issue seen in bug 1618529.
Assignee | ||
Comment 17•5 years ago
|
||
This avoids us risking an overflow when we convert encrypted media with
subsamples to AnnexB (since that conversion can grow the clear sizes of the
sample). See the test in the preceding patch for an example of how and why this
happens.
Depends on D92299
Comment 18•5 years ago
|
||
Comment 19•5 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/a9ede779b827
https://hg.mozilla.org/mozilla-central/rev/c1a849c9d7bf
Updated•5 years ago
|
Comment 20•5 years ago
|
||
What's the impact of this bug? Do we need to uplift to branches? (I'm going to guess no at least for 82 given the S3 severity and where we are in the cycle.)
Assignee | ||
Comment 21•5 years ago
|
||
(In reply to Julien Cristau [:jcristau] from comment #20)
What's the impact of this bug? Do we need to uplift to branches? (I'm going to guess no at least for 82 given the S3 severity and where we are in the cycle.)
We'll fail to play certain encrypted media streams. I'm not sure about the prevalence of these streams. The packaging/encrypting software needs to behave in a certain way (using very large clear ranges on samples). I hadn't seen this before, and the packagers I'm aware of do the opposite and keep the clear ranges small[0]. However, there's clearly some packagers out there doing this.
We should uplift to ESR. I'd been thinking about requesting uplift for beta. The bug and fix itself are well understood and the fix should be a low regression risk (we widen some data from 16 bits -> 32 bits that would be widened later in our media pipeline anyway). I'll request uplift.
[0] Which makes creating test media for end to end tests of this tough, I'd need to do a custom build of shaka packager or bento4's encryptor.
Assignee | ||
Comment 22•5 years ago
|
||
Comment on attachment 9179374 [details]
Bug 1618529 - Store clear crypto subsample info in uint32 rather than uint16. r?jolin
Beta/Release Uplift Approval Request
- User impact if declined: Certain encrypted media streams will fail to playback. Specifically encrypted media that uses large clear (unencrypted) ranges. It's difficult to know the prevalence of such media, but given this bug report as well as reports on Google's shaka packager github (one is linked in the bug), people are encountering this issue.
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: 1668099 - I think this will be needed to get the tests in this chain working.
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): The changes made are fairly small in scope and well understood -- we widen and integer (16 -> 32 bits) that would be widened later anyway. We do it earlier to avoid an overflow. The changes have been verified both manually, and by adding a test which which allowed for a red green refactor.
- String changes made/needed: None.
ESR Uplift Approval Request
- If this is not a sec:{high,crit} bug, please state case for ESR consideration: Fixes us failing to play back encrypted media.
- User impact if declined: Certain encrypted media streams will fail to playback. Specifically encrypted media that uses large clear (unencrypted) ranges. It's difficult to know the prevalence of such media, but given this bug report as well as reports on Google's shaka packager github (one is linked in the bug), people are encountering this issue.
- Fix Landed on Version: 83
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): The changes made are fairly small in scope and well understood -- we widen and integer (16 -> 32 bits) that would be widened later anyway. We do it earlier to avoid an overflow. The changes have been verified both manually, and by adding a test which which allowed for a red green refactor.
Note: I'm fairly sure bug 1668099 will be needed to uplift the gtest on this bug.
- String or UUID changes made by this patch: None
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Updated•5 years ago
|
Updated•5 years ago
|
Comment 23•5 years ago
|
||
Comment on attachment 9179374 [details]
Bug 1618529 - Store clear crypto subsample info in uint32 rather than uint16. r?jolin
approved for 82.0b9 and 78.4.0esr
Updated•5 years ago
|
Comment 24•5 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/6b03f7b5d4da
https://hg.mozilla.org/releases/mozilla-beta/rev/5d7abb683531
Comment 25•5 years ago
|
||
bugherder uplift |
Comment 26•5 years ago
|
||
@Janliska Could you please confirm the fix on beta and esr? We don't have any credentials for the provided website, so if you could help us it would be greatly appreciated.
You can get the builds here:
Beta - https://archive.mozilla.org/pub/firefox/candidates/82.0b9-candidates/build1/
ESR - https://treeherder.mozilla.org/#/jobs?repo=mozilla-esr78&revision=e2fade7a3527980ce662d28adb015c943017284d&selectedTaskRun=eVLWRSGfR3i3gEfFxNfQ8g.0
This https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/eVLWRSGfR3i3gEfFxNfQ8g/runs/0/artifacts/public/build/target.dmg would be for macOS specifically.
Assignee | ||
Comment 27•5 years ago
|
||
I've reached out via email to confirm the fix with the organization that reported the issue. They've confirmed that this is fixed in nightly + beta.
Comment 28•5 years ago
|
||
I confirm here as well. We've tested intensively on both nightly and beta. It's running perfectly fine. The issue has been solved.
Comment 29•5 years ago
|
||
Would you be so kind to confirm that the bug is also fixed on 78.4.0ESR (https://archive.mozilla.org/pub/firefox/candidates/78.4.0esr-candidates/build2/)?
Thanks!
Comment 30•5 years ago
|
||
(In reply to Camelia Badau [:cbadau], Release Desktop QA from comment #29)
Would you be so kind to confirm that the bug is also fixed on 78.4.0ESR (https://archive.mozilla.org/pub/firefox/candidates/78.4.0esr-candidates/build2/)?
Thanks!
Yes. I confirm the bug is also fixed here.
Assignee | ||
Updated•5 years ago
|
Description
•