Closed Bug 485288 Opened 11 years ago Closed 9 years ago

Update load algorithm again

Categories

(Core :: Audio/Video, defect)

x86
All
defect
Not set

Tracking

()

RESOLVED FIXED
Tracking Status
blocking2.0 --- betaN+

People

(Reporter: cpearce, Assigned: cpearce)

References

Details

Attachments

(5 files, 4 obsolete files)

The load algorithm has changed slightly again, due in part to our feedback. We should update our implementation.

http://lists.whatwg.org/pipermail/whatwg-whatwg.org/2009-March/019012.html
Flags: blocking1.9.1?
I'm only making this wanted, but we should still definitely do it.
Flags: blocking1.9.1? → wanted1.9.1+
Can we still do something here? I guess Chris P knows what needs to be done?
Reminder: when we fix this, someone needs to update the document at https://developer.mozilla.org/En/Using_audio_and_video_in_Firefox to reflect the new error event behaviour.
Blocks: 505158
Blocks: 478805
Duplicate of this bug: 581355
I'm going to work on this. We should get this into Firefox 4, otherwise we'll be non-compliant until Firefox 5 comes out. Requesting blocking.
Assignee: nobody → chris
blocking2.0: --- → ?
roc & bz (or anyone else!): I would very much appreciate some feedback on my approach to solving this problem here please...

In the resource selection algorithm [1], there's the concept of "asynchronously awaiting a stable state" [2], before continuing executing a "synchronous" section of the algorithm. A synchronous section never mutates the DOM, runs any script, or has any other side-effects, it just jumps to the front of its task queue. When we "asynchronously await a stable state", we're supposed to allow the current task to finish executing (e.g. completely execute the JS stack/context which called HTMLMediaElement.load() and return to the event loop) and then instead of executing the next task in the task queue, we're supposed to execute the synchronous sections of any methods "asynchronously awaiting a stable state" before other tasks in the task queue.

This is basically equivalent to having the algorithm push an event to continue its execution onto the *front* of the main thread event queue, except that we need to ensure that JS that was running in the context which triggered the load() call is finished before we run the sychronous section. This includes waiting until a modal alert() dialog has been dismissed, and so on.

I think I can implement this as follows:
* Create a function which uses an nsIJSContextStackIterator to get the lowest JSContext* in the JS stack. We consider this the "JS entry point".
* When we must asynchronously await a stable state, add an nsIThreadObserver on the main thread. This observer will also proxy all events to the previous observer. In the observer's nsIThreadObserver::AfterProcessNextEvent() implementation (which is called immediately after an event finishes execution on the observer's thread), if the JS entry point is null, then restore the old observer, and begin executing the synchronous section of the resource selection algorithm. If the JS entry point is null, we know that we've (possibly *just*) finished the task in which some JS called load(), and so we can begin executing the synchronous section of the resource selection algorithm. We're also not running any JS, so we can be sure that we can run other synchronous sections from other parts of the load algorithm if need be.
* The observer would probably have to be shared amongst all the media elements which were loading, so that if multiple elements load()ed in the same JS context, we'd not overwrite the observer and forget to resume some of the synchronous load algorithm sections, and so that we could resume the different synchronous sections from multiple load algorithm instances which are at different stages.

Does that sound reasonable? Thanks very much!

[1] http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#concept-media-load-algorithm
[2] http://www.whatwg.org/specs/web-apps/current-work/multipage/webappapis.html#await-a-stable-state
> except that we need to ensure that JS that was running in the context which
> triggered the load() call is finished

Do we?  That's what I tried to clarify with my followup mails to the list...

> This includes waiting until a modal alert() dialog has been dismissed

Does that not spin the event queue per spec?  It _definitely_ spins the Gecko event queue.

> * Create a function which uses an nsIJSContextStackIterator to get the lowest
> JSContext* in the JS stack. We consider this the "JS entry point".

fwiw, I think we have functions like that around already.  Though which do you mean by "lowest", the oldest place where JS happened, or the most recent?  That is, which way is your stack oriented?
But the short story, I suspect, is that as long as our event queue setup doesn't _exactly_ match the spec's (which it doesn't), complete with per-page event queues, etc, we're not likely to be able to get this right in all cases....
Thanks for your comments bz!

(In reply to comment #8)
> 
> > This includes waiting until a modal alert() dialog has been dismissed
> 
> Does that not spin the event queue per spec?  It _definitely_ spins the Gecko
> event queue.
> 

In the case of an alert(), we're supposed to have "paused" the task and the event loop, but our user agent UI is still supposed to be responsive. See:
http://www.whatwg.org/specs/web-apps/current-work/multipage/timers.html#dom-alert
http://www.whatwg.org/specs/web-apps/current-work/multipage/webappapis.html#pause

This is different from spinning the event loop, which you asked about on the WHATWG mailing list. It sounds like we're supposed to run synchronous tasks when a task spins the event loop, but not when a task pauses the event loop. My implementation currently (incorrectly) treats these two cases as the same.


> > except that we need to ensure that JS that was running in the context which
> > triggered the load() call is finished
> 
> Do we?  That's what I tried to clarify with my followup mails to the list...

I think in the "paused event loop" case we're supposed wait for the JS to complete. In the "spin the event loop" case we don't have to. I asked on WHATWG, we'll find out soon.

> > * Create a function which uses an nsIJSContextStackIterator to get the lowest
> > JSContext* in the JS stack. We consider this the "JS entry point".
> 
> fwiw, I think we have functions like that around already.  Though which do you
> mean by "lowest", the oldest place where JS happened, or the most recent?  That
> is, which way is your stack oriented?

Sorry, by lowest I meant oldest, my stack conceptually grows upwards. I assume the oldest is the last non-null JSContext* in nsIJSContextStackIterator's sequence.


(In reply to comment #9)
> But the short story, I suspect, is that as long as our event queue setup
> doesn't _exactly_ match the spec's (which it doesn't), complete with per-page
> event queues, etc, we're not likely to be able to get this right in all
> cases....

Ok, I'm not sure how to distinguish between the "spin the event loop" and the "pause the event loop" case...
> My implementation currently (incorrectly) treats these two cases as the same.

The thing is... we have no differentiation between them in Gecko.

> I think in the "paused event loop" case we're supposed wait for the JS to
> complete. In the "spin the event loop" case we don't have to.

I believe this is correct.

> Ok, I'm not sure how to distinguish between the "spin the event loop" and the
> "pause the event loop" case...

We have no such distinction in Gecko.  We do have some instances of spinning the event loop that suppress certain kinds of event delivery to the web page... but I think we use those even in cases in which the spec calls for actually spinning the event loop.  Not sure, though.
I think we implement our spinning by having the spinning thread pump NS_ProcessNextEvent() until its goal is met.

Since we can't distinguish between the spin and pause cases, we can't implement this in a spec compliant way, so perhaps we should not try to wait until all JS is finished before running synchronous sections? Then at least we'll be compliant in the spin case, and we'll be non-compliant in the pause case in a beneficial way; the load will start quicker.
> I think we implement our spinning by having the spinning thread pump
> NS_ProcessNextEvent() until its goal is met.

That's correct (usually; sync xpcom proxies are a separate issue).

I think that just running the sync sections as soon as we're in the event loop makes total sense for now, with a bug filed to implement the spec's stuff at some point maybe.
(In reply to comment #7)
[...]
> * When we must asynchronously await a stable state, add an nsIThreadObserver on
> the main thread. This observer will also proxy all events to the previous
> observer. In the observer's nsIThreadObserver::AfterProcessNextEvent()
> implementation (which is called immediately after an event finishes execution
> on the observer's thread), if the JS entry point is null, then restore the old
> observer, and begin executing the synchronous section of the resource selection
> algorithm. If the JS entry point is null, we know that we've (possibly *just*)
> finished the task in which some JS called load(), and so we can begin executing
> the synchronous section of the resource selection algorithm. We're also not
> running any JS, so we can be sure that we can run other synchronous sections
> from other parts of the load algorithm if need be.

I've realised that I actually need to run the synchronous sections in the nsIThreadObserver::OnProcessNextEvent() implementation, not in nsIThreadObserver::AfterProcesNextEvent(). nsThread::ProcessNextEvent() creates a reference to its nsIThreadObserver when it starts, and it dispatches the notifications using that reference only; if the thread's nsIThreadObserver changes during execution an event, nsThread::ProcessNextEvent() won't notice, and it won't call AfterProcessNextEvent() on the new nsIThreadObserver, and so we won't call our synchronous sections until *after* the execution of the *next* event, which is not what we want. We need to run the synchronous sections in nsIThreadObserver::OnProcessNextEvent(), so that they're run before the next event is run.

Is it reasonable to assume that there will always be another event? Do I need to dispatch a dummy event just to ensure we always run the OnProcessNextEvent()?
> Is it reasonable to assume that there will always be another event?

Probably not reasonable to assume that it'll be any particular time... (though in practice I would think it would always happen quickly).
Attached patch Patch v1: Update load algorithm (obsolete) — Splinter Review
Updates load algorithm to match current spec (WHATWG r5290). We're not quite spec compliant WRT "awaiting a stable state" as per discussion above with bz, but we're otherwise compliant.

This is based on top of the preload attribute patches from bug 548523.

I also changed some Windows line endings to be UNIX line endings.
Attachment #466214 - Flags: review?(roc)
+  // URI of the resource we're attempting to load. When the decoder is
+  // successfully initialized, we rely on it to record the URI we're playing,
+  // and clear mCurrentSrc. Use GetCurrentSrc() to access the currentSrc
+  // attribute.
+  nsCOMPtr<nsIURI> mCurrentSrc;

Call this something else so it doesn't get confused with currentSrc? Maybe mLoadingSrc?

Seems slightly simpler to merge SyncSection with AsyncStableStateWaiter and use multiple daisy-chained AsyncStableStateWaiters.

I'm a bit worried about the way we manage mOldObserver here though. Whatever that observer is, if it decided to remove itself while an AsyncStableStateWaiter is attached, it can't and will either hang around or wipe out our AsyncStableStateWaiter.

I think it would be a lot less fragile to add a generic API to nsIAppShell, say RunInStableState(nsIRunnable* aRunnable), to run things in a stable state. Then we don't need to daisy-chain observers, the appshell can maintain an nsCOMArray of callbacks. That could also take care of shutdown; no need to post a dummy event, because we can run the stable-state callbacks at the end of nsBaseAppShell::Run. (RunInStableState can run the callback immediately if !mRunning, instead of adding it to the array.)

The rest of the actual load algorithm looks great.
Attached patch Patch v2 (obsolete) — Splinter Review
Patch using nsIAppShell::runInStableState instead of an nsIThreadObserver for the synchronous sections.
Attachment #466214 - Attachment is obsolete: true
Attachment #466505 - Flags: review?(roc)
Attachment #466214 - Flags: review?(roc)
Comment on attachment 466505 [details] [diff] [review]
Patch v2

+  if (mSyncSections.Count() > 0 && !mExiting) {

I don't think we should check mExiting here. We can and should still run these sections if we are processing events during shutdown.

Shouldn't we be running the sections in AfterProcessNextEvent? As-is, if a section is dispatched in the last event of the event loop, it won't be run.

+      nsCOMPtr<nsIRunnable> r = do_QueryInterface(mSyncSections[i]);

I think you can just do mSyncSections[i]->Run();
Actually I guess for the case where someone changes the video DOM and then calls alert(), we need to ensure that the video DOM stuff happens before the first event of the nested event loop.

So I think you need to process the mSyncSections both before and after each event. Probably should have a little helper method for that.
Attached patch Patch v3Splinter Review
Attachment #466505 - Attachment is obsolete: true
Attachment #466530 - Flags: review?(roc)
Attachment #466505 - Flags: review?(roc)
(In reply to comment #17)
> RunInStableState can run the callback immediately if
> !mRunning, instead of adding it to the array.

It seems mRunning is never true (at least on MacOSX) when we call RunInStableState(). This condition causes us to fail mochitests on MacOSX, so I'm going to have to remove it.
I see. We still need to make sure that at some point any stable-state callbacks are flushed and no more added.
(In reply to comment #23)
> I see. We still need to make sure that at some point any stable-state callbacks
> are flushed

How about posting a dummy event to the thread in RunInStableState() if !NS_HasPendingEvents(theThread)? nsBaseAppShell is doing something similar at the end of OnProcessNextEvent(), though I don't think we need to store a reference to the event in the mDummyEvent field like OnProcessNextEvent() does.

> and no more added.

Could we refuse to add a callback in RunInStableState() if mExiting is true? That gets set to true when the nsBaseAppShell gets a shutdown observation/notification. Or is there a better way we should decide to refuse any more callbacks?
(In reply to comment #24)
> How about posting a dummy event to the thread in RunInStableState() if
> !NS_HasPendingEvents(theThread)?

That's a great plan. If posting the event fails, the thread is probably done processing events so just run the callback synchronously.

> nsBaseAppShell is doing something similar at
> the end of OnProcessNextEvent(), though I don't think we need to store a
> reference to the event in the mDummyEvent field like OnProcessNextEvent() does.

Right.

> Could we refuse to add a callback in RunInStableState() if mExiting is true?
> That gets set to true when the nsBaseAppShell gets a shutdown
> observation/notification. Or is there a better way we should decide to refuse
> any more callbacks?

I think we'll be fine with the above. Add an assertion that the callback array is empty when the appshell is finally destroyed.
Changes the synchronous sections code as discussed above. Based on top of previous patch.
Attachment #467282 - Flags: review?(roc)
The appshell always runs on the main thread so we don't need mThread, and you can use IsMainThread etc. In fact I'd just remove the mThread assertion and mThread completely.
Revised sync section fix. I use/dispatch on the current thread, as is the convention in that file in general, but I'm asserting that we're on the main thread.
Attachment #467282 - Attachment is obsolete: true
Attachment #467303 - Flags: review?(roc)
Attachment #467282 - Flags: review?(roc)
I had test failures on TryServer in browser/base/content/test/test_contextMenu.html. Some video context menu items were not being disabled in case 11. This is because the new media load algorithm doesn't set the HTMLMediaElement.error field when none of the child source elements have valid resources, it moves networkState to NETWORK_NO_SOURCE and waits until the next source child is added whereupon it restarts the load.

This patch changes the context menu so that it also considers an element with no valid source children (i.e. in networkState NETWORK_NO_SOURCE) as not having a resource, so the context menu items for "Play" etc won't be enabled.
Attachment #470356 - Flags: review?(dolske)
Equivalent patch for SeaMonkey context menu.
I admit I don't fully understand this one. When I run 3 or more instances of test_media_selection at once, I get multiple warnings "Not same origin error!" (nsJSEnvironment.cpp:470), and error events passed to the test harness and thus test failures. It looks like the videocontrols loadedmetadata handler is running, calling startFadeIn, which calls startFade, which tries to use scrubber.valueChanged function, but that reference is null (videcontrols.xml:647). 

I think we're running the event handler after we've started to destroy the controls? If I alter the videoControls handleEvent() function to abort if the bindings have been detached, the error goes away. I don't understand what the bindings being detached means, but this change fixes my error.

I think we this must be tickled by the changes to the events and their dispatch made in my earlier patches.
Attachment #470645 - Flags: review?(dolske)
Attachment #470356 - Flags: review?(dolske) → review+
Comment on attachment 470645 [details] [diff] [review]
Patch 4: videocontrols - Don't process events when bindings detached

Huh, I don't know why I didn't have this |return| when this code was added in bug 462114. It seems pretty obvious in hindsight -- dumb mistake?

FWIW, there's a trivial little manual test there to exercise this case, would be good to see if controls still work with that.
Attachment #470645 - Flags: review?(dolske) → review+
Depends on: 593305
As this has now landed on m-c, need to fix for SM and TB
Attachment #470401 - Attachment is obsolete: true
Attachment #472137 - Flags: review?(neil)
Comment on attachment 472137 [details] [diff] [review]
Patch 3 - context menu fix for SM and TB [Checked In: Comment 36]

r=me for SeaMonkey.
Attachment #472137 - Flags: review?(neil) → review+
Attachment #472137 - Flags: review?(bugzilla)
Attachment #472137 - Flags: review?(bugzilla) → review+
Comment on attachment 472137 [details] [diff] [review]
Patch 3 - context menu fix for SM and TB [Checked In: Comment 36]

http://hg.mozilla.org/comm-central/rev/c7759a95da8f
Attachment #472137 - Attachment description: Patch 3 - context menu fix for SM and TB → Patch 3 - context menu fix for SM and TB [Checked In: Comment 36]
Depends on: 600020
You need to log in before you can comment on or make changes to this bug.