Closed Bug 1145052 Opened 10 years ago Closed 10 years ago

[RTSP] Seeking takes too much time to complete.

Categories

(Firefox OS Graveyard :: RTSP, defect)

x86_64
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(firefox40 fixed)

RESOLVED FIXED
Tracking Status
firefox40 --- fixed

People

(Reporter: jhao, Assigned: jhao)

References

Details

Attachments

(1 file, 4 obsolete files)

If you do seeking by dragging the cursor to desired position. It takes far too long for the seeking to complete, a few tens of seconds or even minutes.
Now even seeking by clicking takes very long time. It seems that to seek from time x to y, it would take time |y-x|. Benjamin said this happened before. Decoder state machine mistakenly thinks the time is still x, so it waited until the timer goes to y. Replay from the end of stream is also affected, since it is also a seeking action.
Summary: [RTSP] It takes too long to seek if dragging the cursor. → [RTSP] Seeking takes too much time to complete.
This bug is absolutely a regression. And its symptom heavily impacts on user experience. We should pay attention to this bug.
It seems that after seeking, AudioThread is stopped and never started again. It should be restarted in MaybeStartPlayback(), but MaybeStartPlayback() always returns early because mAudioPrerolling is true. |mAudioPrerolling| is true because DonePrerolling() is false. DonePrerolling() is false because GetDecodedDuration() is very small. GetDecodedDuration() is very small because mAudioEndTime != -1 mAudioEndTime was set to -1 in Reset() called by InitiateSeek(), but it was updated to its original value because AudioSink dispatched a task OnAudioEndTimeUpdate before it ended. Before bug 1148571, OnAudioEndTimeUpdate was synchronous, so we didn't have this problem.
Hi Bobby, May I ask for your advice about how to deal with the timing issue of asynchronous OnAudioEndTimeUpdate?
Flags: needinfo?(bobbyholley)
How about call DecodeTaskQueue().flush in Reset() function?
Is this the same issue as bug 1153344?
Flags: needinfo?(bobbyholley)
(In reply to Benjamin Chen [:bechen] from comment #5) > How about call DecodeTaskQueue().flush in Reset() function? It's a bad idea to flush task queues. You may end up flushing a task you needed. We've had bugs caused by this. I want to remove task queue flushing ASAP, so please don't add any more.
Also, DecodeTaskQueue().Flush() would not compile, because I made it non-flushable. Please do not change that. :-)
Also note - a better way to accomplish a lot of what you can do with flushing is use MediaPromises and Disconnect() them when you want to "flush". The tasks still execute, but the notification is dropped.
Hi Bobby, This bug severely affects RTSP function, so we want to solve it as soon as possible. I've discussed with JW, and he thought that using MediaPromise is a little bit weird. Promises usually resolves or rejects only once, but OnAudioEndTimeUpdate is more like an event handler, which will constantly be called. JW said he would replace the dispatched Runnable with a customized one, which holds an attribute e.g. mCancel and is remembered by AudioSink. Before AudioSink is about to shut down, it will change that Runnable's mCancel to true. Then, when the Runnable is executed, it can check mCancel to decide whether or not to actually call OnAudioEndTimeUpdate(). Do you think we should use MediaPromise? If so, what do you think it should look like?
Flags: needinfo?(bobbyholley)
Could you do a similar fix to what we did in bug 1153344?
Flags: needinfo?(jwwang)
Flags: needinfo?(jhao)
I don't think it will work for the runnable (wrapping MediaDecoderStateMachine::OnAudioSinkComplete) is already in the task queue after AudioSink::Shutdown returns.
Flags: needinfo?(jwwang)
Ah I see. The proper machinery to use here is the state-mirroring and state-watching machinery (bug 1144481 and bug 1144486), but that probably won't land for another week or so. JW's solution seems fine for the interim.
Flags: needinfo?(bobbyholley)
Thanks to Chris and Bobby for your advice. We will try JW's solution before the state mirroring machinery is done.
Assignee: nobody → jhao
Flags: needinfo?(jhao)
Bug 1144486 already has a patch to be reviewed. I'm not sure whether to wait for that patch, or try to get this workaround reviewed and landed.
Well, you'd need bug 1144481 really. And even then, we'd need to do some amount of additional work to make state mirroring work properly on the audio thread (which is kind of a special snowflake), at which point we may want to consider just removing the audio thread (bug 750596). So I think it's probably smart to do the workaround for now.
Comment on attachment 8594593 [details] [diff] [review] A cancelable runnable for OnAudioEndTimeUpdate Thanks for the update, Bobby :) JW said he'll give me feedback first before I ask Chris for review.
Attachment #8594593 - Flags: feedback?(jwwang)
Comment on attachment 8594593 [details] [diff] [review] A cancelable runnable for OnAudioEndTimeUpdate Review of attachment 8594593 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/media/AudioSink.cpp @@ +34,5 @@ > +} > + > +NS_IMETHODIMP > +AudioSink::OnAudioEndTimeUpdateTask::Run() { > + if (!mCanceled) { Need a lock to synchronize read/write of |mCanceled|. Also you need to clear reference to |mStateMachine| when canceled in order not to extend the life span of the state machine unexpectedly. @@ +219,5 @@ > mWritten += PlayFromAudioQueue(); > } > int64_t endTime = GetEndTime(); > if (endTime != -1) { > + mOnAudioEndTimeUpdateTask->mCanceled = false; mCanceled is initially false and will only set to true in AudioSink::Shutdown(). You don't need to reset it each time updating audio end time. @@ +220,5 @@ > } > int64_t endTime = GetEndTime(); > if (endTime != -1) { > + mOnAudioEndTimeUpdateTask->mCanceled = false; > + mOnAudioEndTimeUpdateTask->mEndTime = endTime; Too much detail exposed to AudioSink. You can have something like OnAudioEndTimeUpdateTask::Dispatch(int64_t aEndTime) to encapsulate the detail.
Attachment #8594593 - Flags: feedback?(jwwang) → feedback-
I discussed with JW personally. He reconsidered and thought that locks weren't needed.
Attachment #8594593 - Attachment is obsolete: true
Attachment #8595122 - Flags: feedback?(jwwang)
Comment on attachment 8595122 [details] [diff] [review] A cancelable runnable for OnAudioEndTimeUpdate Review of attachment 8595122 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/media/AudioSink.cpp @@ +35,5 @@ > +NS_IMETHODIMP > +AudioSink::OnAudioEndTimeUpdateTask::Run() { > + // Although this will be dispatched to state machine thread, > + // Cancel() is only called by Shutdown() which ensures the order for us > + if (mStateMachine) { There is no race in reading |mStateMachine| here because writes to |mStateMachine| always happen on the state machine thread as OnAudioEndTimeUpdateTask::Run() does. @@ +43,5 @@ > +} > + > +void > +AudioSink::OnAudioEndTimeUpdateTask::Dispatch(int64_t aEndTime) { > + mEndTime = aEndTime; It would be a data race since |mEndTime| is read/written on different threads. Make it an atomic should fix it. @@ +51,5 @@ > +void > +AudioSink::OnAudioEndTimeUpdateTask::Cancel() { > + // Although mStateMachine will be read in state machine thread, > + // Cancel() is only called by Shutdown() which ensures the order for us > + mStateMachine = nullptr; Since |mStateMachine| is reset after audio thread shutdown, there is no race here. ::: dom/media/AudioSink.h @@ +143,5 @@ > + > + class OnAudioEndTimeUpdateTask : public nsRunnable { > + public: > + int64_t mEndTime; > + nsRefPtr<MediaDecoderStateMachine> mStateMachine; These members should be private. @@ +145,5 @@ > + public: > + int64_t mEndTime; > + nsRefPtr<MediaDecoderStateMachine> mStateMachine; > + > + OnAudioEndTimeUpdateTask(MediaDecoderStateMachine *aStateMachine); Coding style: T* aFoo;
Attachment #8595122 - Flags: feedback?(jwwang) → feedback+
Hi Chris, Could you review this patch? Thanks very much.
Attachment #8595122 - Attachment is obsolete: true
Attachment #8595271 - Flags: review?(cpearce)
Note that I'm adding a generalized mechanism for cancelable runnables in bug 1155433. It might be a little while because I'm busy with other stuff, so let's get this patch landed. When I land it, I'll convert this stuff to use that infrastructure as prototype.
Comment on attachment 8595271 [details] [diff] [review] A cancelable runnable for OnAudioEndTimeUpdate Review of attachment 8595271 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/media/AudioSink.cpp @@ +142,5 @@ > { > mThread->Shutdown(); > mThread = nullptr; > MOZ_ASSERT(!mAudioStream); > + mOnAudioEndTimeUpdateTask->Cancel(); I think you should do the Cancel() before the mThread->Shutdown() call above, as nsThread::Shutdown() spins the thread's event loop while waiting for the thread to shutdown. The event loop spinning means that the OnAudioEndTimeUpdateTask could run before you have a chance to cancel it. In general, whenever the event loop spins you need to be careful, as unexpected things like this can and do happen. However if you Cancel() before the thread Shutdown(), you'll add a race when reading mStateMachine. I think your code is trying to be too clever, you're relying on the ordering of things. You should just make the code simpler, and use a mutex to prevent the race from hurting you. So please add a mutex to OnAudioEndTimeUpdateTask and hold it when reading/writing OnAudioEndTimeUpdateTask::mStateMachine and mEndTime.
Attachment #8595271 - Flags: review?(cpearce) → review-
I agree with Chris. Note that the mechanism in bug 1155433 will be mutex-based, and will allow "opportunistic" (i.e. non-deterministic) cancelling from threads other than the target thread. Cancellation can only be guaranteed from the target thread, but there are still use-cases for "cancel this if it hasn't run already" - specifically, it's a safer way to implement task queue flushing.
Thanks Chris, you're right. I shouldn't rely on the order of things. Please review this updated patch. Thanks in advance.
Attachment #8595271 - Attachment is obsolete: true
Attachment #8595726 - Flags: review?(cpearce)
Attachment #8595726 - Flags: feedback?(jwwang)
Comment on attachment 8595726 [details] [diff] [review] A cancelable runnable for OnAudioEndTimeUpdate Review of attachment 8595726 [details] [diff] [review]: ----------------------------------------------------------------- Thanks.
Attachment #8595726 - Flags: review?(cpearce) → review+
Attachment #8595726 - Flags: feedback?(jwwang) → feedback+
(In reply to Chris Pearce (:cpearce) from comment #23) > I think you should do the Cancel() before the mThread->Shutdown() call > above, as nsThread::Shutdown() spins the thread's event loop while waiting > for the thread to shutdown. The event loop spinning means that the > OnAudioEndTimeUpdateTask could run before you have a chance to cancel it. Interesting. I thought nsIThread::Shutdown is only about joining the target thread. So if our state machine thread is an nsIThread, OnAudioEndTimeUpdateTask could run before AudioSink::Shutdown() returns even though it is queued after the current task which calls AudioSink::Shutdown(), right? However, it is also possible for the audio thread to finish shutdown without running OnAudioEndTimeUpdateTask which is already in the event queue. On the other hand, our MediaTaskQueue contains a thread pool under the hood. It appears to me that event loop spinning doesn't change the execution order of tasks pushed to MediaTaskQueue and deadlock could happen if the audio thread is waiting for some task pending in MediaTaskQueue to be finished.
Flags: needinfo?(cpearce)
(In reply to JW Wang [:jwwang] from comment #27) > So if our state machine thread is an nsIThread State machine is now MediaTaskQueue running on top of the shared thread pool. > , OnAudioEndTimeUpdateTask > could run before AudioSink::Shutdown() returns even though it is queued > after the current task which calls AudioSink::Shutdown(), right? However, it > is also possible for the audio thread to finish shutdown without running > OnAudioEndTimeUpdateTask which is already in the event queue. > > On the other hand, our MediaTaskQueue contains a thread pool under the hood. > It appears to me that event loop spinning doesn't change the execution order > of tasks pushed to MediaTaskQueue and deadlock could happen if the audio > thread is waiting for some task pending in MediaTaskQueue to be finished.
Static check build failed on try server and reminded me that I should add keyword "explicit" before constructor.
Attachment #8595726 - Attachment is obsolete: true
Attachment #8595752 - Flags: review+
(In reply to JW Wang [:jwwang] from comment #27) > On the other hand, our MediaTaskQueue contains a thread pool under the hood. > It appears to me that event loop spinning doesn't change the execution order > of tasks pushed to MediaTaskQueue and deadlock could happen if the audio > thread is waiting for some task pending in MediaTaskQueue to be finished. Never mind. Such code pattern is just a footgun.
(In reply to JW Wang [:jwwang] from comment #27) > (In reply to Chris Pearce (:cpearce) from comment #23) Basically what bholley said; the statemachine thread is now a task queue, which has machinery in place to ensure that won't run stuff for the same queue at once. So spinning here should not be a problem now. I forgot that bholley made the state machine a task queue. This should largely make us immune to nsIThread::Shutdown() spinning its event loop. It also means my comments about switching to a lock aren't strictly needed, though I do still think it makese this patch/code clearer.
Flags: needinfo?(cpearce)
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Whiteboard: [backout-asap]
This is causing a smoke test blocking crash so we need this backed out asap. Please see bug 1158692.
Sotaro is fixing the problem; https://bugzilla.mozilla.org/show_bug.cgi?id=1158692#c9 no need for backout.
Whiteboard: [backout-asap]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: