Bug 629350 - (webvtt) Implement the track element
Status: REOPENED
[parity-ie] [parity-chrome] [lang=c++]
: access, dev-doc-needed, feature, meta, verifyme
Product: Core
Classification: Components
Component: Audio/Video: Playback
: Trunk
: All All
: -- enhancement with 52 votes
: ---
Assigned To: Rick Eyre (:reyre)
: Alexandra Lucinet, QA Mentor [:adalucinet]
Mentors: Ralph Giles (:rillian) needinfo me
http://www.w3.org/wiki/HTML/Elements/...
Depends on
Blocks: html5 690737 html5test 1275492 663647 880711
Reported: 2011-01-27 06:38 PST by antistress
Modified: 2016-05-26 19:45 PDT
93 users
cdiehl: sec‑review+
disabled
disabled
fixed
31+


Attachments
work in progress dump (80.57 KB, patch)
2012-03-16 12:22 PDT, Ralph Giles (:rillian) needinfo me
no flags
Ralph's "work in progress dump" updated to Jan 21, 2012 (71.61 KB, patch)
2013-01-21 16:25 PST, David Humphrey (:humph)
no flags

Description antistress 2011-01-27 06:38:49 PST
User-Agent:       Mozilla/5.0 (Windows NT 5.1; rv:2.0b10) Gecko/20100101 Firefox/4.0b10
Build Identifier: 

The track element (within HTML5) makes it possible, in particular, to subtitle videos, which is quite important, especially for non-English speakers, since a lot of videos on the Web are in English.
As a French speaker I'm particularly interested in subtitling web videos to spread their messages.
At present the best option is to use JavaScript (Universal subtitles widget, JQuery-srt...), which is not optimal since Planet websites and RSS feed readers don't allow JavaScript.
Being able to use an HTML element would be a great improvement for i18n and a11y.
The track HTML5 element seems perfect for that.
Thanks

http://blog.gingertech.net/2010/10/02/state-of-media-accessibility-in-html5/
http://www.w3.org/TR/html5/video.html#the-track-element


Reproducible: Always

Steps to Reproduce:
Use the track HTML5 element in a web page
Actual Results:  
track element is not taken into account

Expected Results:  
it should allow displaying (in particular) subtitles without the need for JavaScript
Comment 1 Kyle Huey [:khuey] (khuey@mozilla.com) 2011-01-27 12:05:25 PST
Couldn't find an existing bug on this, so confirming.
Comment 2 antistress 2011-06-08 11:06:56 PDT
seems to be a dup of Bug 620664
Comment 3 Matthew Gregan [:kinetik] 2011-06-08 21:47:37 PDT
Bug 620664 was about adding support for the element to the HTML5 parser.  This bug is about implementing the element's functionality.
Comment 4 antistress 2011-06-09 06:21:45 PDT
Thanks.
Could we expect preliminary support for WEBVTT files in Firefox this year (Firefox 6 or 7), now that Bug 620664 has already done its part?

That would help open video on the web succeed. As a French user, and considering that a lot of videos are in English, I would really like the feature to be implemented ASAP. This is about adding a new possibility to video on the web, which seems more important to me than only making existing things faster (which is also great :-)
Comment 5 antistress 2011-09-03 06:48:38 PDT
I'm glad to see that someone (thank you, Ralph Giles!) picked up this bug.
I would love to help but I don't have the skills, sorry :-/

I don't know if it helps, but WebKit has made some progress there:
https://bugs.webkit.org/show_bug.cgi?id=62882
https://bugs.webkit.org/show_bug.cgi?id=62881
Comment 6 alexander :surkov 2011-11-15 21:11:51 PST
Ralph, are you going to take care of accessibility in this bug, or should I file a new one for it?
Comment 7 Scoobidiver (away) 2012-01-12 08:29:41 PST
IE 10 supports the track element (see http://msdn.microsoft.com/en-us/library/hh673566%28v=vs.85%29.aspx).
For an example, see http://ie.microsoft.com/testdrive/Graphics/VideoCaptions/
Comment 8 antistress 2012-02-26 06:07:16 PST
Hi, is there a roadmap for implementing this feature?
Comment 9 Ralph Giles (:rillian) needinfo me 2012-02-27 22:07:47 PST
Right. Roadmap.

I'm slowly implementing this feature. There's a branch on github if you want to contribute patches, but I don't quite have something useful yet. The plan:

* Write a toy parser, just enough for the first demo
  (done, https://github.com/rillian/webrtc)
* Implement the HTML5 track element in the parser
  (already done by hsivonen in bug 620664)
* Implement the track element as an XPCOM interface
  (partially done. Patches on https://github.com/rillian/firefox/commits/webvtt)
* Add an anonymous content div to nsVideoFrame for caption display
  (done, patches on the same branch)

- Hook the parser up to a texttrack decoder and display captions
  (not done; this will be the first demo)
- Rewrite the parser library
  (will require security review)
- Probably rewrite the patch set in response to review comments
- Attempt to land basic webvtt caption support
- Support text track enable/disable and language preference matching in the default controls
- Support for rendering instructions, javascript interfaces, accessibility hooks
- Support text tracks encapsulated in Ogg and WebM media files
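
For readers following along, the first roadmap item (a toy parser, just enough for a demo) could look roughly like the sketch below. This is not the code from the linked repository or from any patch on this bug; the function names `parseTimestamp` and `parseWebVTT` are hypothetical, and the sketch assumes only the basic file shape (a `WEBVTT` signature, then blank-line-separated cues with `start --> end` timing lines). Cue settings and cue markup are deliberately ignored.

```javascript
// Toy WebVTT parser sketch: signature check plus simple cues only.
// parseTimestamp and parseWebVTT are hypothetical names for illustration.

// Convert "hh:mm:ss.ttt" or "mm:ss.ttt" to seconds; null if malformed.
function parseTimestamp(text) {
  const match = /^(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})$/.exec(text.trim());
  if (!match) return null;
  const [, hours = "0", minutes, seconds, millis] = match;
  return (
    Number(hours) * 3600 +
    Number(minutes) * 60 +
    Number(seconds) +
    Number(millis) / 1000
  );
}

function parseWebVTT(source) {
  // Normalize line endings, then split into blank-line-separated blocks.
  const blocks = source.replace(/\r\n?/g, "\n").split(/\n{2,}/);
  const header = blocks.shift() || "";
  if (!/^\uFEFF?WEBVTT(?:[ \t\n]|$)/.test(header)) {
    throw new Error("missing WEBVTT signature");
  }
  const cues = [];
  for (const block of blocks) {
    const lines = block.split("\n").filter((line) => line !== "");
    // An optional cue identifier may precede the timing line.
    const timingIndex = lines.findIndex((line) => line.includes("-->"));
    if (timingIndex === -1) continue; // not a cue block, skip it
    const [start, rest = ""] = lines[timingIndex].split("-->");
    const startTime = parseTimestamp(start);
    // Anything after the end timestamp (cue settings) is ignored here.
    const endTime = parseTimestamp(rest.trim().split(/\s+/)[0]);
    if (startTime === null || endTime === null) continue;
    cues.push({
      id: timingIndex > 0 ? lines[timingIndex - 1] : "",
      startTime,
      endTime,
      text: lines.slice(timingIndex + 1).join("\n"),
    });
  }
  return cues;
}
```

Skipping malformed blocks instead of aborting keeps the demo forgiving; a real parser, as the roadmap says, would need a rewrite and a security review.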
Comment 10 Silvia Pfeiffer 2012-03-01 14:25:53 PST
Surely you mean: https://github.com/rillian/webvtt . Also, have you seen https://bitbucket.org/annevk/webvtt/src/4f0cac5f64eb/parser.html ?
Comment 11 Ralph Giles (:rillian) needinfo me 2012-03-01 20:01:22 PST
Indeed, I did. Thanks for the correction.

I had seen annevk's validator. Very cool!
Comment 12 Ralph Giles (:rillian) needinfo me 2012-03-16 12:22:27 PDT
Created attachment 606673 [details] [diff] [review]
work in progress dump

Demo patch. This isn't in any shape for use, it's just me figuring out how things work.

It does show (short!) webvtt overlays. I've been testing with https://people.xiph.org/~giles/2012/sample.html
Comment 13 Ralph Giles (:rillian) needinfo me 2012-03-16 12:25:05 PDT
(In reply to alexander :surkov from comment #6)
> Ralph, are you going to take care about accessibility in this bug or should
> I file new one for this?

I'd like to support accessibility as I go along, but I need help with what to do. I tried the demo patch with a screen reader, but it didn't seem to see the captions. Is there an aria-role I can supply to make them visible? Some nsFrame attribute?
Comment 14 Silvia Pfeiffer 2012-03-16 14:40:19 PDT
With WebVTT cues, if they were rendered into the normal dom, I'd suggest adding an aria-live attribute. That would get the screen reader to read it out as the text appears on screen (thus solving accessibility for type=description at the same time as for other types). Since I assume you are rendering the text into the shadow dom, you will have to figure out if you can make the elements in the shadow dom just as accessible.

And, btw, the video controls are not accessible either - they would need a @tabindex to be reachable by keyboard and then roles on them such as "button" and @label or @aria-label to provide a short announcement text.
Comment 15 alexander :surkov 2012-03-20 01:21:49 PDT
(In reply to Silvia Pfeiffer from comment #14)
> Since I assume you
> are rendering the text into the shadow dom, you will have to figure out if
> you can make the elements in the shadow dom as accessible.

accessibility should pick it up since nsVideoFrame::AppendAnonymousContentTo is in place. You can check the accessible with DOM Inspector (Accessible Tree view). So all you need to do is put an aria-live attribute on that anonymous div.

Silvia, does aria-live="assertive" sound reasonable?

also it'd be great if you could add a11y mochitests:
1) fix tree/test_media.html - http://mxr.mozilla.org/mozilla-central/source/accessible/tests/mochitest/tree/test_media.html?force=1
2) add elm/test_media_track.html to see if show/hide events are fired for changed captions and the container-live object attribute is exposed on event targets.

please let me know if you need more details

(In reply to Silvia Pfeiffer from comment #14)
> And, btw, the video controls are not accessible either - they would need a
> @tabindex to be reachable by keyboard and then roles on them such as
> "button" and @label or @aria-label to provide a short announcement text.

well, they aren't reachable by tabbing and we have a bug for that, but it sounds like a different issue, no?
Comment 16 Silvia Pfeiffer 2012-03-20 17:32:14 PDT
(In reply to alexander :surkov from comment #15)
> 
> Silvia, does aria-live="assertive" sound reasonable?

Absolutely. You don't want "polite" because then you might miss some text.


> (In reply to Silvia Pfeiffer from comment #14)
> > And, btw, the video controls are not accessible either - they would need a
> > @tabindex to be reachable by keyboard and then roles on them such as
> > "button" and @label or @aria-label to provide a short announcement text.
> 
> well, they aren't reachable by tabbing and we have a bug for that but it
> sounds as different issue, no?

Fair enough. :-)
Comment 17 Alex Vincent [:WeirdAl] 2012-08-04 11:04:23 PDT
I notice there hasn't been much work on this lately.  I'm rather interested in the metadata "kind" of track.  Might it be simpler to implement that first (before subtitles, etc.), and work on the rest later?
Comment 18 Ralph Giles (:rillian) needinfo me 2012-08-04 22:54:05 PDT
I don't think so. I mean, the actual rendering is a separate piece, but most of the work to be done between here and there is writing a better parser.

I'm not working on this at the moment though, so if you're interested in continuing the work in the current patch, feel free.
Comment 19 Ralph Giles (:rillian) needinfo me 2012-08-08 14:54:42 PDT
I'm happy to offer guidance if someone else wants to work on this in the meantime.
Comment 20 David Bolter [:davidb] 2012-08-13 09:48:31 PDT
Ralph do you have any bite sized pieces that could get reviewed and landed? (Prefed or #ifdef'd off I guess?)
Comment 21 Ralph Giles (:rillian) needinfo me 2012-08-13 13:19:28 PDT
The plumbing to the video element and the overlay div for displaying the captions should be ready for review and can land without needing a pref since there's no way to feed them from web content.

The next pieces that need work are a non-toy webvtt parser, and the TextTrack dom interface, which IIRC needs some cleanup before it's ready for review.
Comment 22 Ralph Giles (:rillian) needinfo me 2012-10-17 16:58:06 PDT
BTW, a class at Seneca College is working on this bug during the current semester.
Comment 23 Ralph Giles (:rillian) needinfo me 2013-01-09 13:36:41 PST
Ok, we want to get this going again, and :humphd's Seneca College class is going to help.

First step is to get this patch up to date. Several class and file names have changed, so it isn't going to apply cleanly.

Then, we need to hide it behind a pref, split it into logical pieces, get them reviewed by the appropriate folks, and landed.

The current patch assumes you've checked out the webvtt parser into media/webvtt, but that's not going to work for in-tree code. One of the questions for review-time is how we should resolve that. Probably import the current release in a separate bug, then rely on the runtime pref to block access until we're more confident in the implementation.

We can land the display part separately, though, since it doesn't depend on the parser.
Comment 24 David Humphrey (:humph) 2013-01-09 13:39:53 PST
> Then, we need to hide it behind a pref, split it into logical pieces, get
> them reviewed by the appropriate folks, and landed.

I'd be interested to hear more about how you want to split it up.
Comment 25 Ralph Giles (:rillian) needinfo me 2013-01-09 16:27:38 PST
The three obvious pieces are: the nsVideoFrame changes to add the display div, the import and build support for the parser library in media/webvtt, and the WebVTTDecoder (would be better as TextDecoder?) implementation in content/media/webvtt.

The TextTrack, TextTrackCue, etc. stubs should be rewritten to use the new webidl compiler.
Comment 26 David Humphrey (:humph) 2013-01-11 11:53:44 PST
I started converting the IDL in Ralph's patch to use our new webidl bindings, and there is an issue with TextTrackCue as currently defined, which uses a union of primitive types for the line attribute.  I talked to bz and this is not allowed per the WebIDL spec.  I filed https://www.w3.org/Bugs/Public/show_bug.cgi?id=20651.

Given:

enum AutoKeyword { "auto" };

[Constructor(double startTime, double endTime, DOMString text)]
interface TextTrackCue : EventTarget {
...
           attribute (long or AutoKeyword) line;



Traceback (most recent call last):
  File "/Users/dave/Sites/repos/mozilla-central/config/pythonpath.py", line 56, in <module>
    main(sys.argv[1:])
  File "/Users/dave/Sites/repos/mozilla-central/config/pythonpath.py", line 48, in main
    execfile(script, frozenglobals)
  File "/Users/dave/Sites/repos/mozilla-central/dom/bindings/GlobalGen.py", line 78, in <module>
    main()
  File "/Users/dave/Sites/repos/mozilla-central/dom/bindings/GlobalGen.py", line 56, in main
    parserResults = parser.finish()
  File "/Users/dave/Sites/repos/mozilla-central/dom/bindings/parser/WebIDL.py", line 4148, in finish
    production.finish(self.globalScope())
  File "/Users/dave/Sites/repos/mozilla-central/dom/bindings/parser/WebIDL.py", line 552, in finish
    member.finish(scope)
  File "/Users/dave/Sites/repos/mozilla-central/dom/bindings/parser/WebIDL.py", line 2104, in finish
    t = self.type.complete(scope)
  File "/Users/dave/Sites/repos/mozilla-central/dom/bindings/parser/WebIDL.py", line 1376, in complete
    [self.location, t.location, u.location])
WebIDL.WebIDLError: error: Flat member types of a union should be distinguishable, Long is not distinguishable from AutoKeyword (Wrapper), TextTrackCue.webidl line 22:21
           attribute (long or AutoKeyword) line;
                     ^
<builtin type>

TextTrackCue.webidl line 22:30
           attribute (long or AutoKeyword) line;
Comment 27 Silvia Pfeiffer 2013-01-12 01:14:20 PST
Canvas has the same union approach: http://www.whatwg.org/specs/web-apps/current-work/#2dcontext . Might be worth checking how that got resolved.
Comment 28 David Humphrey (:humph) 2013-01-12 05:40:58 PST
We have webidl for CanvasRenderingContext2D using the new bindings now, but it does all of its unions without using primitive types.  That's the issue here.  We just need that fixed.
Comment 29 Chris Pearce (:cpearce) 2013-01-14 19:37:15 PST
(In reply to Ralph Giles (:rillian) from comment #25)
> The three obvious pieces are: the nsVideoFrame changes to add the display
> div, the import and build support for the parser library in media/webvtt,
> and the WebVTTDecoder (would be better as TextDecoder?) implementation in
> content/media/webvtt.

I don't think we need WebVTTDecoder to inherit from BuiltinDecoder (which has been renamed to MediaDecoder since this patch was first written).

MediaDecoder is designed to be the only decoder owned by the nsHTMLMediaElement, and includes a state machine and so on that assumes one contained audio and video track. We'd be better off having some kind of custom TextTrackDecoder object that manages the libwebvtt decoder and co-operates with the MediaDecoderStateMachine (or the nsHTMLMediaElement if we can do it at that level). Then we don't pull in all the unnecessary cruft that comes in with being a MediaDecoder subclass.
Comment 30 Mardeg 2013-01-14 19:58:29 PST
(In reply to Chris Pearce (:cpearce) from comment #29)
> We'd be better off having some kind of
> custom TextTrackDecoder object that manages the libwebvtt decoder and
> co-operates with the MediaDecoderStateMachine (or the nsHTMLMediaElement if
> we can do it at that level). Then we don't pull in all the unnecessary cruft
> that comes in with being a MediaDecoder subclass.
Is that for better combining of <audio> elements with webvtt also?
Comment 31 Chris Pearce (:cpearce) 2013-01-14 20:16:56 PST
I think not inheriting from Builtin/MediaDecoder would make it easier and simpler to implement.

And if the TextTrackDecoder (or whatever we call it) co-operates with MediaDecoderStateMachine or nsHTMLMediaElement we'll have support for both <video> and <audio> elements, since both nsHTMLVideoElement and nsHTMLAudioElement inherit from nsHTMLMediaElement.

Does that answer your question?
Comment 32 Mardeg 2013-01-14 20:33:47 PST
(In reply to Chris Pearce (:cpearce) from comment #31)
> Does that answer your question?
Yes, thank you. My interest is in the placement of subtitles in relation to the <audio> element with and without the "controls" attribute, but I think that's more for bug 515898
Comment 33 David Humphrey (:humph) 2013-01-21 16:25:15 PST
Created attachment 704716 [details] [diff] [review]
Ralph's "work in progress dump" updated to Jan 21, 2012

I've rewritten Ralph's original patch (thanks for doing so much ground work) to use the new WebIDL bindings--I'm positive they aren't 100% correct yet, but this is hopefully not far off--among other things.  We'll fix it in post, as they say.  I just want to put this here for reference.  In order to build this patch, you have to also apply https://bug830879.bugzilla.mozilla.org/attachment.cgi?id=702428, see bug 830879.

At this point I want to hand the patch off to my students, so we'll be using this bug as a tracking bug, and filing smaller tickets in order to parallelize development.  We'll break bits of this patch out into those separate bugs.

NOTE: there is also work happening in https://github.com/mozilla/webvtt/pull/1 to get the libwebvtt C/C++ parser reviewed.
Comment 34 Curtis Koenig [:curtisk-use curtis.koenig+bzATgmail.com]] 2013-02-13 14:10:23 PST
:cdiehl - would peach be good for this? If so then :rforbes can work on it
[moved from 833403]
Comment 35 Christoph Diehl [:posidron] 2013-02-13 14:18:04 PST
Yes, I will mentor rforbes in fuzzing WebVTT.
Comment 36 David Humphrey (:humph) 2013-03-22 12:13:04 PDT
A note for us to follow-up on...

While rebasing today I hit an issue that required me to change the WebIDL for HTMLMediaElement for addTextTrack(), which is going to need a spec bug filed:

11:32 < humph> bz: the HTMLMediaElement has AddTextTrack( string, [optional] 
               string, [optional] string )
11:32 < humph> bz: which needs to call TextTrack() with strings
11:33 < bz> humph: that IDL is bogus
11:33 < bz> If the label argument was omitted, let label be the empty string.
11:33 < bz> If the language argument was omitted, let language be the empty string.
11:33 < bz> That's in the prose
11:33 < bz> should just be in the IDL and be done with it
11:33 < bz> So: TextTrack addTextTrack(DOMString kind, optional DOMString label = 
            "", optional DOMString language = "");
11:33 < bz> File spec bugs?
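
As a rough illustration of what bz's proposed IDL means for callers: with defaults declared in the signature, an omitted label or language simply arrives as the empty string, so no separate prose rule is needed. The sketch below models only that argument handling in plain JavaScript. The function is a hypothetical stand-in, not the DOM method or the Gecko binding, and the returned object is an assumption made for the example.

```javascript
// Hypothetical stand-in for HTMLMediaElement.addTextTrack(); only the
// argument defaulting from the proposed IDL is modelled here.
function addTextTrack(kind, label = "", language = "") {
  if (kind === undefined) {
    throw new TypeError("addTextTrack requires a kind argument");
  }
  // Assumed shape of the result, for illustration only.
  return {
    kind: String(kind),
    label: String(label),
    language: String(language),
  };
}

// Omitted arguments fall back to the empty string.
const captions = addTextTrack("captions");
const subtitles = addTextTrack("subtitles", "English", "en");
```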
Comment 37 Silvia Pfeiffer 2013-03-23 01:14:15 PDT
Yes, sure, file a bug: https://www.w3.org/Bugs/Public/enter_bug.cgi?product=HTML%20WG or https://www.w3.org/Bugs/Public/enter_bug.cgi?product=WHATWG or both. :-)

"optional" is still relatively new to IDL, so we're slowly bringing it into the HTML spec.
Comment 38 dc.loco 2013-03-26 15:33:52 PDT
I'm a devotee of FOSS and working at Gallaudet University, the world's only accredited liberal arts university for deaf students (and we have an elementary school and secondary school for deaf students on the same campus), I'm planning to follow this thread pretty closely. I just started using WEBVTT with the misunderstanding that support was broader.  I'm not sure what I can do to help, other than test periodically...
Comment 39 Ralph Giles (:rillian) needinfo me 2013-03-26 15:42:09 PDT
(In reply to dc.loco from comment #38)
> I'm a devotee of FOSS and working at Gallaudet University

Hi there, thanks for your interest. Right now we most need coding and testing, so please do follow along if you're able to do either of those things!
Comment 40 wind 2013-04-10 04:37:13 PDT Comment hidden (off-topic)
Comment 41 vulcain 2013-04-10 07:53:16 PDT Comment hidden (off-topic)
Comment 42 :Ms2ger 2013-04-11 23:50:20 PDT
Duping forward to the bug with patches...

*** This bug has been marked as a duplicate of bug 833385 ***
Comment 43 Robert Kaiser (not working on stability any more) 2013-04-12 05:23:23 PDT
(In reply to :Ms2ger from comment #42)
> Duping forward to the bug with patches...
> 
> *** This bug has been marked as a duplicate of bug 833385 ***

Umm, shouldn't all the dependencies of this one be added to the other one as well, then? I did that for the html5 and html5test bugs, but it probably needs to be done for the rest as well.
Comment 44 Ralph Giles (:rillian) needinfo me 2013-04-12 08:21:44 PDT
We were using this as a tracking bug. Please leave it open.
Comment 45 Alexander Farkas 2013-05-17 00:29:03 PDT
I have developed a very simple track/texttrack polyfill (http://jsfiddle.net/trixta/QZJTM/) and also a script for styleable controls (https://github.com/aFarkas/jMediaelement).

My problems with the current Track spec and implementations, as a web developer, are the following:

1. We need a way to shrink the rectangle in which the cues are displayed using CSS.
The webvtt features for positioning are not suitable for all use cases. As soon as we develop custom styleable controls, which are placed over the video element, we need a way to "reserve" this space for those controls. Because this depends on the style of our webpage and our media player, and is not related to the content of the vtt file, it has to be defined using CSS. For example:

::cuedisplay {
    top: 0;
    left: 0;
    right: 0;
    bottom: 40px; /* space is needed for overlaying custom styleable controls at the bottom of the video */
}

2. There is no 'trackmode' change event.
While we have a lot of special events for adding/removing tracks and cuechange/cueexit/cueenter, we do not have an event for the case where a user changes the mode of a track. Again, this is needed, for example, for custom styleable controls. If the user changes the mode using the context menu from showing/disabled to disabled/showing, we need an event for this to update, for example, the visual state of our controls. For example:

track.addEventListener('modechange', function(e){
    if(this.mode == 'showing'){
        //do something
    } else {
        //do something else    
    }
});

Currently all custom styleable media players with track support do not rely on the native implementations and handle the "texttrack" display with script. This is a shame. :-(

I know this is mainly spec related, but you guys are bringing the web forward :-D
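
To make the second request concrete, here is a minimal sketch of the behaviour being asked for: a track-like object whose `mode` setter fires an event whenever the value actually changes. `ObservableTrack` is a hypothetical class invented for this example, and `modechange` is only the event name proposed in this comment; neither is claimed to be part of the spec or of any browser. It assumes an environment with the standard `EventTarget` and `Event` globals.

```javascript
// Hypothetical track-like object that announces mode changes.
// "modechange" is the event name proposed above, not an existing event.
class ObservableTrack extends EventTarget {
  constructor() {
    super();
    this._mode = "disabled";
  }

  get mode() {
    return this._mode;
  }

  set mode(value) {
    // Ignore unknown values and no-op assignments.
    if (!["disabled", "hidden", "showing"].includes(value)) return;
    if (value === this._mode) return;
    this._mode = value;
    this.dispatchEvent(new Event("modechange"));
  }
}

// Custom controls could then stay in sync with the track state.
const track = new ObservableTrack();
const seen = [];
track.addEventListener("modechange", () => seen.push(track.mode));
track.mode = "showing";
track.mode = "showing"; // unchanged value, no event
track.mode = "disabled";
```

Firing only on real changes keeps custom controls from redrawing needlessly when a script reassigns the current mode.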
Comment 46 Caitlin Potter (:caitp) 2013-05-17 04:08:59 PDT
That's a great point, Alexander Farkas. It would be good for you to file a bug on http://dev.w3.org/html5/webvtt/ WRT this (and if possible, CC :rillian, :caitp, :reyre, and whoever else)
Comment 47 Caitlin Potter (:caitp) 2013-05-17 04:47:29 PDT
It might actually be better suited for http://www.w3.org/html/wg/drafts/html/master/embedded-content-0.html#the-track-element, since that requirement would probably be desirable for any timed text format
Comment 48 Alexander Farkas 2013-05-17 15:18:13 PDT
@Caitlin

Opened two bugs:
- https://www.w3.org/Bugs/Public/show_bug.cgi?id=22076
- https://www.w3.org/Bugs/Public/show_bug.cgi?id=22075

Hope I did everything right :-)
Comment 49 Silvia Pfeiffer 2013-05-17 19:43:52 PDT
Cool, I've cc-ed them to the WHATWG, so Ian can work on it, too (i.e. whoever gets to it faster).
Comment 50 Ralph Giles (:rillian) needinfo me 2014-01-15 10:14:53 PST
This is clearly your bug now, Rick. :-)
Comment 51 Rick Eyre (:reyre) 2014-01-15 10:41:19 PST
Thanks Ralph :-).
Comment 52 Alexandra Lucinet, QA Mentor [:adalucinet] 2014-03-21 01:42:23 PDT
WebVTT is missing from Beta release notes (please see http://www.mozilla.org/en-US/firefox/29.0beta/releasenotes/), although it's enabled by default on Firefox 29 beta 1. Any thoughts?
Comment 53 Kohei Yoshino [:kohei] 2014-04-07 13:05:13 PDT
Adding status flags as per Bug 981280.
Comment 54 Sylvestre Ledru [:sylvestre] 2014-04-09 07:35:05 PDT
WebVTT is going to be disabled for 29 and 30 (cf bug 981280).
Until we know which release it is going to ship in, I cannot update the tracking flags accordingly...
Comment 55 Sylvestre Ledru [:sylvestre] 2014-05-07 02:15:36 PDT
Rick, can you confirm, as suggested in bug 981280, that we are going to ship it for 31?
Thanks
Comment 56 Rick Eyre (:reyre) 2014-05-07 07:11:52 PDT
Yep, we will be shipping in FF31. Finally! :)
Comment 57 Sylvestre Ledru [:sylvestre] 2014-05-07 07:19:33 PDT
Excellent! Added back to the release notes.
