Closed Bug 1773551 Opened 2 years ago Closed 2 years ago

Implement the new autoplay detection spec `getAutoplayPolicy`

Categories: Core :: Audio/Video: Playback, task, P2
Status: RESOLVED FIXED (110 Branch)
Tracking Status: firefox110 --- fixed
People: Reporter: alwu, Assigned: alwu
References: Blocks 2 open bugs
Details: Keywords: dev-doc-complete
Attachments: 7 files

We will remove this old experimental feature and implement a new one based on the standard.

Keywords: dev-doc-needed

Filed an issue on Mozilla Standards Position for this API.

Severity: -- → S3
Priority: -- → P2

Depends on D164750

Depends on D164751

Attachment #9308404 - Attachment description: WIP: Bug 1773551 - part1 : move AutoplayPolicy to media namespace in order to prevent naming collision. → Bug 1773551 - part1 : move AutoplayPolicy to media namespace in order to prevent naming collision.
Attachment #9308406 - Attachment description: WIP: Bug 1773551 - part2 : implement the navigator autoplay policy API. → Bug 1773551 - part2 : implement the navigator autoplay policy API.
Attachment #9308407 - Attachment description: WIP: Bug 1773551 - part3 : add wpt idl test. → Bug 1773551 - part3 : add wpt idl test.
Attachment #9308408 - Attachment description: WIP: Bug 1773551 - part4 : add autoplay policy value wpt test. → Bug 1773551 - part4 : add autoplay policy value wpt test.
Attachment #9308409 - Attachment description: WIP: Bug 1773551 - part5 : add basic media element behavior wpt test. → Bug 1773551 - part5 : add basic media element behavior wpt test.
Attachment #9308582 - Attachment description: WIP: Bug 1773551 - part6 : fix incorrect result for checking autoplay policy for document. → Bug 1773551 - part6 : fix incorrect result for checking autoplay policy for document.
Attachment #9308583 - Attachment description: WIP: Bug 1773551 - part7 : add mochitests for checking autoplay policy. → Bug 1773551 - part7 : add mochitests for checking autoplay policy.
Pushed by alwu@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/94f645c223b9
part1 : move AutoplayPolicy to media namespace in order to prevent naming collision. r=media-playback-reviewers,padenot
https://hg.mozilla.org/integration/autoland/rev/3926274bd6d2
part2 : implement the navigator autoplay policy API. r=media-playback-reviewers,webidl,smaug,padenot
https://hg.mozilla.org/integration/autoland/rev/e75d6d87dc95
part3 : add wpt idl test. r=media-playback-reviewers,padenot
https://hg.mozilla.org/integration/autoland/rev/8d0e9396fbfe
part4 : add autoplay policy value wpt test. r=media-playback-reviewers,padenot
https://hg.mozilla.org/integration/autoland/rev/1ee2814ca94f
part5 : add basic media element behavior wpt test. r=media-playback-reviewers,padenot
https://hg.mozilla.org/integration/autoland/rev/464822ed96c0
part6 : fix incorrect result for checking autoplay policy for document. r=media-playback-reviewers,padenot
https://hg.mozilla.org/integration/autoland/rev/0d36cae8c5c6
part7 : add mochitests for checking autoplay policy. r=media-playback-reviewers,padenot
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/37584 for changes under testing/web-platform/tests
Upstream PR merged by moz-wptsync-bot
Regressions: 1806637

MDN work can be tracked here https://github.com/mdn/content/issues/23676

Blocks: 1812189
Blocks: 1812190
No longer blocks: 1812190
No longer blocks: 1812189

@Alastor, as per https://github.com/w3c/autoplay/issues/36#issue-1559134902 (parts 2/3), I'm trying to work out how well you can trust a getAutoplayPolicy(type) check, and based on that when/if you would use such a check (if ever) or instead just always check every element.

The spec indicates in the note here https://www.w3.org/TR/autoplay-detection/#query-by-a-media-type that if a check on type returns disallowed it may be that some elements/contexts might still allow autoplay. But then the disallowed result is "None of media, corresponding with the given type, are allowed to autoplay."

My "guess" is that

  • if the method returns allowed (or allowed-muted) for a type you're saying that you always expect it to be allowed - it is not going to be the case that you'll change from allowed to disallowed due to some user interaction. So basically if you get those values back you can just assume them true in the session.
  • if the value returned is disallowed then you're saying that this is the default or "the policy" that applies generally to the page. However it might change or might not apply to all elements in the type, e.g. one of several elements might have been whitelisted through user interaction, so even though generally the page is disallowed, some elements might be allowed.
  • In other words you don't need to recheck each element / context to be sure of its autoplay value except for the disallowed case.
  1. Is that about right - i.e. you can trust allowed/allowed-muted but not disallowed for above reasons.
  2. So it is possible for different elements/contexts in a document to have a different return value?
  3. If my assumptions above are all wrong, what is the point of having the type - presumably you'd always have to check all individual elements/contexts to be sure that you have set their behaviour correctly if you can't trust the single value returned.
  4. In example 2 the spec talks about how you might have an element that is disallowed, but then the user clicks on it, so it is blessed, and could autoplay.
    • But at this point presumably the page has loaded so it would already have been blocked from autoplay.
    • So is this saying that if you click the element anywhere (not on some play control) that might be enough to say it is blessed, so it will autoplay automatically at that point. However the policy is still disallowed, even though the particular element is allowed?
    • But is there any similar "exception" for allowed or allowed-muted cases? I.e. are they "rough" like disallowed, or reliable?
Flags: needinfo?(alwu)

Thanks, Hamish, I will reply to your comment on the issue later as well. Back to your questions here: first we need to know that the actual mechanism to block autoplay varies between user agents, and it's hard to get all user agents to agree on the same mechanism. E.g. Chrome uses its own media engagement index (chrome://media-engagement/), which is not available to other browsers. Therefore, the Autoplay Policy Detection API aims to act as an interface that generalizes over these different mechanisms by returning a simple answer.

if the method returns allowed (or allowed-muted) for a type you're saying that you always expect it to be allowed - it is not going to be the case that you'll change from allowed to disallowed due to some user interaction. So basically if you get those values back you can just assume them true in the session.

This assumption holds for most mechanisms, as far as I know. The only exception is Firefox's transient user gesture policy (optional, not default), which only allows autoplay within a certain time window. After that, autoplay is blocked again, which means the result will become disallowed later.

As that is the only case resulting in this situation and it's not common, we could discuss whether Firefox needs to keep this policy, or remove it in order to stay consistent with other user agents' policies.

if the value returned is disallowed then you're saying that this is the default or "the policy" that applies generally to the page. However it might change or might not apply to all elements in the type, e.g. one of several elements might have been whitelisted through user interaction, so even though generally the page is disallowed, some elements might be allowed.

Yes, the policy can change later, and some elements in the page can be exceptions, as example 2 describes. This behavior currently happens with Firefox's click-to-play policy (optional, not default).

In other words you don't need to recheck each element / context to be sure of its autoplay value except for the disallowed case.

In general, yes, if the getAutoplayPolicy(type) returns allowed/allowed-muted, web developers should expect that the only possible change for the result would be allowed-muted -> allowed. The only exceptional case is what I mentioned above, Firefox's transient user gesture policy.
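
To make the three return values concrete, here is a minimal TypeScript sketch of how a page might turn the result of a type query into a start-up decision. The `StartPlan` shape and the `planFromTypePolicy` name are hypothetical and exist only for illustration; the three policy strings are the ones discussed in this bug.

```typescript
// The three values getAutoplayPolicy() can return, as discussed in this bug.
type AutoplayPolicy = "allowed" | "allowed-muted" | "disallowed";

// Hypothetical shape describing what the page intends to do at start-up.
interface StartPlan {
  attemptAutoplay: boolean; // call play() without waiting for a user gesture?
  startMuted: boolean; // start the media muted?
}

// Map the result of a type query to a start-up plan (illustrative only).
function planFromTypePolicy(policy: AutoplayPolicy): StartPlan {
  if (policy === "allowed") {
    return { attemptAutoplay: true, startMuted: false };
  }
  if (policy === "allowed-muted") {
    return { attemptAutoplay: true, startMuted: true };
  }
  // "disallowed": individual elements may still become allowed later, so
  // this is only the starting assumption, not a permanent answer.
  return { attemptAutoplay: false, startMuted: true };
}
```

The plan is only a snapshot: as noted above, the result can later move from allowed-muted to allowed, and on Firefox's optional transient policy it can also fall back to disallowed.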

So it is possible for different elements/contexts in a document to have a different return value?

Yes, eg. example 2.

If my assumptions above are all wrong, what is the point of having the type - presumably you'd always have to check all individual elements/contexts to be sure that you have set their behaviour correctly if you can't trust the single value returned.

Your assumptions are basically correct, apart from the exceptions I mentioned above.

So is this saying that if you click the element anywhere (not on some play control) that might be enough to say it is blessed, so it will autoplay automatically at that point. However the policy is still disallowed, even though the particular element is allowed?

No. What example 2 is meant to express is the case where play is triggered by user activation, e.g. clicking a button or an interface that triggers the element's play method.

Here is one example you can check on Firefox.

  1. go to about:config and set media.autoplay.blocking_policy to 2
  2. go to https://alastor0325.github.io/htmltests/autoplay_tests/iframe_a1_a2_a3.html (there will be three videos on the page)
  3. click play on ONLY one of those three videos
  4. you can see only the video you clicked is allowed to play, but other videos still remain blocked

But is there any similar "exception" for allowed or allowed-muted cases? I.e. are they "rough" like disallowed, or reliable?

Yes, the above example also applies to the situation where getAutoplayPolicy(type) returns allowed-muted.

However, during testing I found that Firefox's current implementation of getAutoplayPolicy(type) is not correct. After step 3, the result of getAutoplayPolicy(type) shouldn't change, but currently it always changes to allowed. I will file a bug to fix it.

Feel free to ping/NI me if you have more questions. Thank you so much.

Flags: needinfo?(alwu) → needinfo?(hamishwillee)
See Also: → 1813822

Thanks very much Alastor. I think I understand this. But please confirm my paraphrasing:

  1. getAutoplayPolicy(type) returns the "default policy" for all items of a type assuming there is no user interaction.
    • This should not change across the life of a session even if individual elements become allowed due to specific user interaction
    • (I guess we don't need to consider that it might change if the policy for the whole site changed mid-session - say due to a permission-policy change)
  2. But on Firefox (only, currently) you might change the status for a particular element/context by transient user interaction. This might cause the particular element associated with the interaction to go from autoplay disallowed to allowed-muted to allowed - right? At that point it would play.
  3. So on most browsers you could just rely on getAutoplayPolicy(type) and apply the result for all the items of the type. But on FF you might have the case where later on a particular element could still autoplay because of transient user interaction, so it would return allowed.
  4. This last case doesn't feel like "autoplay", which is why it is so confusing. The page has already loaded so it hasn't autoplayed - you've basically touched the control to cause it to play.

The use cases for autoplay blocking are generally to avoid surprising audio on page load, and the point of this API is to allow users to modify their content to play muted if they know it is going to be blocked (or some other strategy). So getAutoplayPolicy(type) allows you to catch that particular content is disallowed or only will play if muted on page load, and format your content appropriately to allow autoplay.

Later on someone touches a particular control in FF changing autoplay policy for that element from allowed-muted to allowed. At this point the content could autoplay with sound? I guess the "bad case" here is that you might have made the content not have an audio track so the user will no longer be able to play the audio. They'd be fine if your action had just been to mute the content.

  1. Ultimately the question is: what do I recommend that developers do?
    • just do the type check
    • always check every single element/item
    • check for allowed using type, but if not allowed, then check every single element
    • something else?

My gut feeling is that on Firefox at least, you type check for a policy of allowed, and for any other value you would always need to check every element. At least this would be the case if your content should have audio but you removed it. Was this your thinking?

Flags: needinfo?(hamishwillee) → needinfo?(alwu)

(In reply to Hamish Willee from comment #17)

Thanks very much Alastor. I think I understand this. But please confirm my paraphrasing:

  1. getAutoplayPolicy(type) returns the "default policy" for all items of a type assuming there is no user interaction.
    • This should not change across the life of a session even if individual elements become allowed due to specific user interaction
    • (I guess we don't need to consider that it might change if the policy for the whole site changed mid-session - say due to a permission-policy change)

Let me explain more about which user interaction allows a page to autoplay, since that differs between user agents.

E.g. Chrome uses the Media Engagement Index (how often users watch media on a specific website) plus sticky user activation (whenever users click or type something on the website, that is regarded as activating the whole page and grants it permission to start autoplay). Edge uses sticky user activation but no Media Engagement Index.

Firefox has several policies, but the default one is also sticky user activation, without a Media Engagement Index.

Safari uses its own hybrid mechanism, which is similar to sticky user activation plus the click-to-play policy on Firefox. Unlike normal sticky user activation, which activates the page wherever users click, Safari only allows pages to autoplay when users explicitly trigger a media element's play method via event handlers, e.g. by clicking the play button or the video control interface. Then it allows all media elements in the same page to play.

Let's use the example I mentioned in comment 16 to demonstrate the difference. If you follow the same steps on Safari, the videos won't start playing when you click on arbitrary places on the page; a video can only be started when you click on it. In addition, when one video starts playing, all the other videos also start playing (unlike Firefox, where only the one you clicked starts playing).

Then back to your question,

This should not change across the life of a session even if individual elements become allowed due to specific user interaction

No, it can change over the life of a session. E.g. it could return disallowed at first, then return allowed once the user agent's condition is met.

  2. But on Firefox (only, currently) you might change the status for a particular element/context by transient user interaction. This might cause the particular element associated with the interaction to go from autoplay disallowed to allowed-muted to allowed - right? At that point it would play.

Yes. The situation where the result for the particular element associated with the interaction differs from the result for the overall window currently only happens on Firefox.

Btw, when you said context, were you referring to AudioContext or the window context? For AudioContext, the spec explicitly defines that the AudioContext should only be affected by sticky user activation. That means Firefox only blocks web audio when users are using the sticky user activation policy. If users use other policies, web audio won't be blocked at all and the related policy results would all be allowed.

  3. So on most browsers you could just rely on getAutoplayPolicy(type) and apply the result for all the items of the type. But on FF you might have the case where later on a particular element could still autoplay because of transient user interaction, so it would return allowed.

Yes, you are right. I already filed bug 1813822 to discuss whether we should remove the transient user activation policy to reduce the confusion and the complexity of all the possible scenarios.

  4. This last case doesn't feel like "autoplay", which is why it is so confusing. The page has already loaded so it hasn't autoplayed - you've basically touched the control to cause it to play.

Are you referring to my last example in comment 17 for https://alastor0325.github.io/htmltests/autoplay_tests/iframe_a1_a2_a3.html?

That page is still an autoplay test page, because the script keeps calling video.play(). If you use the default policy on Firefox, or use Chrome or Edge, clicking anywhere on the page results in all three videos starting to play immediately. (That is because the page has been activated by sticky user activation. It is also the biggest drawback of sticky user activation: it can't reflect whether users really intend to play media or are simply interacting with the page.)

The use cases for autoplay blocking are generally to avoid surprising audio on page load, and the point of this API is to allow users to modify their content to play muted if they know it is going to be blocked (or some other strategy). So getAutoplayPolicy(type) allows you to catch that particular content is disallowed or only will play if muted on page load, and format your content appropriately to allow autoplay.

Yes.

Later on someone touches a particular control in FF changing autoplay policy for that element from allowed-muted to allowed. At this point the content could autoplay with sound?

If the element becomes allowed then it can autoplay with sound.

I guess the "bad case" here is that you might have made the content not have an audio track so the user will no longer be able to play the audio. They'd be fine if your action had just been to mute the content.

Sorry, could you elaborate on this part? I'm not sure I get your point. Why wouldn't the user be able to play audio?

  1. Ultimately the question is: what do I recommend that developers do?
    • just do the type check
    • always check every single element/item
    • check for allowed using type, but if not allowed, then check every single element
    • something else?

My gut feeling is that on Firefox at least, you type check for a policy of allowed, and for any other value you would always need to check every element. At least this would be the case if your content should have audio but you removed it. Was this your thinking?

From past discussion in the Media WG, the use case web developers gave for the type check is that they want to know whether they're allowed to autoplay before creating any media element instance.

Based on that, I'd suggest that web developers use getAutoplayPolicy(type) if they haven't created any media element/audio context instance. Otherwise, use getAutoplayPolicy(element/context) to get the most accurate result. And always remember to handle a rejected play promise for the media element.
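
As a rough sketch of that suggestion (not Firefox's implementation), the helper below queries by type when no instance exists yet and by instance once one has been created. `AutoplayNavigator` and `queryAutoplayPolicy` are hypothetical names; the interface is a minimal stand-in for the browser's `navigator`, so the sketch does not depend on DOM typings. The type strings "mediaelement" and "audiocontext" come from the Autoplay Policy Detection spec.

```typescript
type AutoplayPolicy = "allowed" | "allowed-muted" | "disallowed";
type AutoplayMediaType = "mediaelement" | "audiocontext";

// Minimal structural stand-in for `navigator`. The method is optional
// because not every browser implements it.
interface AutoplayNavigator {
  getAutoplayPolicy?: (target: AutoplayMediaType | object) => AutoplayPolicy;
}

// Prefer the instance query when an element/context already exists;
// otherwise fall back to the type query. Returns undefined when the API
// is missing, in which case the caller can only rely on play() rejecting.
function queryAutoplayPolicy(
  nav: AutoplayNavigator,
  type: AutoplayMediaType,
  instance?: object,
): AutoplayPolicy | undefined {
  if (typeof nav.getAutoplayPolicy !== "function") {
    return undefined;
  }
  return nav.getAutoplayPolicy(instance ?? type);
}
```

In a page this might be called as `queryAutoplayPolicy(navigator, "mediaelement")` before creating a video and `queryAutoplayPolicy(navigator, "mediaelement", video)` afterwards, with a cast on `navigator` if the DOM typings in use do not yet declare the method.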

Thank you!

Flags: needinfo?(alwu) → needinfo?(hamishwillee)

Hi @Alastor

Thanks again - that's very comprehensive. In particular you have now answered the bit that was stumping me most - why do you need the method with type if the value can differ for elements in the page, and if it can differ over time. All the rest is great background too.

Below I clarify the bits where my questions were confusing.

I guess the "bad case" here is that you might have made the content not have an audio track so the user will no longer be able to play the audio. They'd be fine if your action had just been to mute the content.

Sorry could you elaborate this part more? Not sure if I get your point. Why user won't be able to play audio?

So imagine I want to autoplay a video, but first I use getAutoplayPolicy(type) and get allowed-muted. I could create a media element that has no audio track at all or I could create one that has the track and mute it - either case will autoplay the video with the allowed-muted policy.

Now imagine the individual element changes to allowed (for whatever reason). If I created the original element using a video with no audio source it still can't play audio - even though it is now allowed. That's the bad case!

The only way I can catch this is by looking at the individual elements/media. Now that you have clarified that this is pretty much something I have to do anyway after creating media elements, all is good.

This last case doesn't feel like "autoplay", which is why it is so confusing. The page has already loaded so it hasn't autoplayed - you've basically touched the control to cause it to play.

Do you refer to my last example in comment 17 for https://alastor0325.github.io/htmltests/autoplay_tests/iframe_a1_a2_a3.html?

This is not something you need to worry about.

No. What I meant is that "as an English word" I think of autoplay as meaning "something that automatically plays without user interaction". I was confused by the fact that the interaction does not have to be with the element, or with any element (except maybe on Safari, or on FF with transient user interaction). Essentially you're doing something that changes the policy so that content that might previously not autoplay now can. That might be interaction with the media element, but it could just be site interaction as you indicated.

Flags: needinfo?(hamishwillee)

(In reply to Hamish Willee from comment #19)

So imagine I want to autoplay a video, but first I use getAutoplayPolicy(type) and get allowed-muted. I could create a media element that has no audio track at all or I could create one that has the track and mute it - either case will autoplay the video with the allowed-muted policy.

For this part, yes, it's OK to play a video without an audio track, a muted video, or a video with zero volume.

Now imagine the individual element changes to allowed (for whatever reason). If I created the original element using a video with no audio source it still can't play audio - even though it is now allowed. That's the bad case!

Sorry, I still don't follow this part. For those media elements which have been playing, they can keep playing muted if the policy is allowed-muted. If the policy changes to allowed, then all media should be able to autoplay.

Flags: needinfo?(hamishwillee)

Sorry, I still don't follow this part. For those media elements which have been playing, they can keep playing muted if the policy is allowed-muted. If the policy changes to allowed, then all media should be able to autoplay.

Yes it can autoplay but it will remain silent because as the result of the previous policy "allowed-muted" you created a video media with no audio track. Make sense?

So now because you handled the initial policy in a particular way, you can no longer autoplay with sound using that media. You have no choice but to reload new media. Which is why you need to know that the policy on this individual element changed, rather than the page-wide policy.

Flags: needinfo?(hamishwillee)

Actually I have thought of another question.

Based on that, I'd suggest that web developers use getAutoplayPolicy(type) if they haven't created any media element/audio context instance. Otherwise, use getAutoplayPolicy(element/context) to get the most accurate result.

You call getAutoplayPolicy(type) when you first render the page to decide the media format you need to get it to autoplay. You call getAutoplayPolicy(element/context) to get a more accurate result.

But how do you know that you need to call getAutoplayPolicy(element/context) again - i.e. how do you know it has changed?

I mean there doesn't appear to be a signal to say "autoplay policy changed" when someone does a media engagement or whatever. So the element is likely to autoplay if it can, be blocked if it can't following this change - but you won't know.

And always remember to handle a rejected play promise for the media element.

So is this just general advice because playing of any media might fail, or are you suggesting it is part of what might let me know that autoplay failed?
Or to put it another way, my code would be like:

  1. getAutoplayPolicy(type) on first page load
  2. watch the play promise. If I get a failure, as part of handling it check the reason and the autoplay policy for the element, and from that infer whether I need to change the element so that it can autoplay?
Flags: needinfo?(alwu)

(In reply to Hamish Willee from comment #21)

Yes it can autoplay but it will remain silent because as the result of the previous policy "allowed-muted" you created a video media with no audio track. Make sense?

So now because you handled the initial policy in a particular way, you can no longer autoplay with sound using that media. You have no choice but to reload new media. Which is why you need to know that the policy on this individual element changed, rather than the page-wide policy.

Interesting, I never thought of that. If so, I'd suggest that instead of loading a source without an audio track (or with a silent audio track), they should just load the normal audible resource and set the element to muted or volume=0. If you expect users to interact with audible media, it makes more sense to load that media directly.
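
A minimal sketch of that approach, assuming the page has one audible resource: the source is always the audible one and only the muted flag depends on the policy. `MutableMedia` lists just the two `HTMLMediaElement` properties used here, and `prepareForAutoplay` is a hypothetical helper name.

```typescript
type AutoplayPolicy = "allowed" | "allowed-muted" | "disallowed";

// The two HTMLMediaElement properties this sketch touches.
interface MutableMedia {
  src: string;
  muted: boolean;
}

// Always load the normal audible resource; only the muted flag depends on
// the policy. If the element later becomes allowed, sound can be restored
// by clearing `muted` instead of reloading different media.
function prepareForAutoplay(
  media: MutableMedia,
  audibleUrl: string,
  policy: AutoplayPolicy,
): void {
  media.src = audibleUrl;
  media.muted = policy !== "allowed";
}
```

Because the audio track is still present, the "bad case" described above (media that stays silent even after the element becomes allowed) does not arise.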

I mean there doesn't appear to be a signal to say "autoplay policy changed" when someone does a media engagement or whatever. So the element is likely to autoplay if it can, be blocked if it can't following this change - but you won't know.

That is intended. During the initial discussion of this API, I remember people proposing some kind of event or attribute to notify websites when autoplay is allowed. That was rejected because it might lead websites to abuse autoplay, which is something we don't want to see.

So is this just general advice because playing of any media might fail, or are you suggesting it is part of what might let me know that autoplay failed?

It's just general advice that applies even without this API. I think the example will be similar to the one in the spec.
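
For illustration only (this is not the spec's example), handling the rejected play promise could look like the sketch below. It relies on the HTML-specified behaviour that `play()` rejects with a `NotAllowedError` `DOMException` when the user agent does not allow playback to start; the outcome labels and function names are hypothetical.

```typescript
// Outcome labels are hypothetical names for this sketch.
type PlayOutcome = "playing" | "blocked-by-autoplay" | "failed";

// play() rejects with a "NotAllowedError" DOMException when the user agent
// does not allow playback to start; any other error is treated as a
// generic failure (e.g. unsupported source).
function classifyPlayError(error: unknown): PlayOutcome {
  const name =
    typeof error === "object" && error !== null && "name" in error
      ? String((error as { name: unknown }).name)
      : "";
  return name === "NotAllowedError" ? "blocked-by-autoplay" : "failed";
}

// Anything with a promise-returning play(), e.g. an HTMLMediaElement.
interface Playable {
  play(): Promise<void>;
}

// Try to start playback and report what happened instead of throwing.
async function tryPlay(media: Playable): Promise<PlayOutcome> {
  try {
    await media.play();
    return "playing";
  } catch (error) {
    return classifyPlayError(error);
  }
}
```

A caller that gets "blocked-by-autoplay" back could, for example, show a play button rather than retrying in a loop.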

Flags: needinfo?(alwu)

@Alastor, thank you for the clarifications.

I mean there doesn't appear to be a signal to say "autoplay policy changed" when someone does a media engagement or whatever. So the element is likely to autoplay if it can, be blocked if it can't following this change - but you won't know.

That is intended. During the initial discussion of this API,

Sadly though, this very much limits the use of the API, and means that the previously given advice actually cannot be followed.

  • The part of the advice that can be followed is that you can check the autoplay policy on page load and format your use appropriately.
  • The part of the advice that can't be followed is that you can't later check whether the policy for individual elements has changed. Or more specifically you can check, but I can't see that you would ever do so except by polling, and that would be bad for the general operation of the page.

It doesn't make any sense to me to provide the ability to check the autoplay policy for individual elements unless you can realistically use that feature (I don't count "polling" as realistic).

Am I missing something?

Interesting, I never thought of that. If so, I'd suggest that instead of loading a source without an audio track (or with a silent audio track), they should just load the normal audible resource and set the element to muted or volume=0. If you expect users to interact with audible media, it makes more sense to load that media directly.

I completely agree. But the spec suggests that loading a source without the audio track is a "reasonable thing to do to conserve resources", but since there is no way to recover from that decision (i.e. to know that the autoplay policy has changed), I don't see a place where this can be anything other than bad advice (unless there is a caveat about the likely problem).

Not entirely sure how to present this all to end users, but really appreciate you helping me understand this!

Flags: needinfo?(alwu)

(In reply to Hamish Willee from comment #24)

Sadly though, this very much limits the use of the API, and means that the previously given advice actually cannot be followed.

  • The part of the advice that can be followed is that you can check the autoplay policy on page load and format your use appropriately.
  • The part of the advice that can't be followed is that you can't later check whether the policy for individual elements has changed. Or more specifically you can check, but I can't see that you would ever do so except by polling, and that would be bad for the general operation of the page.

It doesn't make any sense to me to provide the ability to check the autoplay policy for individual elements unless you can realistically use that feature (I don't count "polling" as realistic).

Am I missing something?

I don't think polling is something a website should do either (unless the website really wants to annoy users with unexpected audible autoplay, which would be bad UX).

The purpose of this API is not to let websites autoplay whenever the page becomes allowed, which would likely cause more unexpected audible autoplay. Websites should use this API before they want to play something, so that they don't need to rely on waiting for the result of the async promise returned by play(). For web audio, there currently isn't even anything they can check, other than waiting indefinitely for the AudioContext state to change to running. In addition, there are already many workarounds in the wild for developers to detect autoplay, which also shows that providing them with an easier API is a good thing.
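
A small sketch of that usage for web audio, under the assumption that the page asks once at the moment it wants to start audio rather than polling. `AutoplayNavigator`, `WebAudioStart` and `decideWebAudioStart` are hypothetical names, and the interface is a minimal stand-in for the browser's `navigator`.

```typescript
type AutoplayPolicy = "allowed" | "allowed-muted" | "disallowed";

// Minimal stand-in for `navigator`; the method is optional because the
// API is not implemented everywhere.
interface AutoplayNavigator {
  getAutoplayPolicy?: (type: "audiocontext") => AutoplayPolicy;
}

type WebAudioStart = "start-now" | "wait-for-user-gesture" | "unknown";

// Ask once, right before starting audio, instead of polling for changes.
function decideWebAudioStart(nav: AutoplayNavigator): WebAudioStart {
  if (typeof nav.getAutoplayPolicy !== "function") {
    // API unavailable: the page is back to watching AudioContext.state.
    return "unknown";
  }
  return nav.getAutoplayPolicy("audiocontext") === "allowed"
    ? "start-now"
    : "wait-for-user-gesture";
}
```

When the answer is "wait-for-user-gesture", the page could defer creating or resuming its AudioContext until a click or key handler runs.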

I completely agree. But the spec suggests that loading a source without the audio track is a "reasonable thing to do to conserve resources", but since there is no way to recover from that decision (i.e. to know that the autoplay policy has changed), I don't see a place where this can be anything other than bad advice (unless there is a caveat about the likely problem).

Not entirely sure how to present this all to end users, but really appreciate you helping me understand this!

I think the only paragraph in the spec that mentions loading inaudible media is this part.

then web developers can replace audible media with inaudible media to keep media playing,

This doesn't suggest that web developers load a media source that can only be used for inaudible playback; it just says they can make the media inaudible. If this causes any confusion, I can make it clearer.

Flags: needinfo?(alwu)
See Also: → 1814985

@Alastor Absolutely agree that what you have done is a good thing, and that it makes it easier for developers to detect the autoplay policy on page load (or at some later point that they want to play their content that might otherwise be blocked).

My only concern was that autoplay policy can change after that point, and since you don't know about that, you can't reevaluate whether any configuration you did to make the content autoplay is still appropriate. I guess that is very much an edge case.

And yes, you're right about the last point; I'm not sure where I drew the inference that you might have video with no audio track.

The docs are pretty simple - you can see them here: https://github.com/mdn/content/pull/23978

Thanks again.
