Closed Bug 592833 Opened 14 years ago Closed 14 years ago

Reduce thread usage of media elements

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla8

People

(Reporter: bjacob, Assigned: cpearce)

References

Details

(Whiteboard: [MemShrink:P1])

Attachments

(13 files, 12 obsolete files)

Backtraces for all threads when crash occurs 14 years ago Benoit Jacob [:bjacob] (mostly away) 481.88 KB, text/plain		Details
tracing thread creation 14 years ago Benoit Jacob [:bjacob] (mostly away) 47.74 KB, text/plain		Details
Patch 1 v1: Split thread lifetime control 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 17.48 KB, patch	roc : review+	Details \| Diff \| Splinter Review
Patch 2 v1: Move metadata decoding to decode thread 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 23.36 KB, patch	roc : review+	Details \| Diff \| Splinter Review
Patch 3 v1: Move seeking over to the decode thread 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 36.00 KB, patch	roc : review+	Details \| Diff \| Splinter Review
Patch 4 v1: Remove reader monitor 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 27.32 KB, patch	roc : review+	Details \| Diff \| Splinter Review
Patch 5 v1: Move mAudioStream creation to audio thread 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 25.48 KB, patch		Details \| Diff \| Splinter Review
Patch 6 v1: Run state machine in events 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 40.74 KB, patch		Details \| Diff \| Splinter Review
Patch 7 v1: Unify state machine threads 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 53.18 KB, patch		Details \| Diff \| Splinter Review
Patch 8 v1: Shutdown idle decode threads 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 18.54 KB, patch		Details \| Diff \| Splinter Review
Patch 9 v1: Update comments/documentation 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 22.81 KB, patch		Details \| Diff \| Splinter Review
Patch 1 v2: Split thread lifetime control 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 17.45 KB, patch	cpearce : review+	Details \| Diff \| Splinter Review
Patch 2 v2: Move metadata decoding to decode thread 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 22.97 KB, patch	cpearce : review+	Details \| Diff \| Splinter Review
Patch 3 v2: Move seeking over to the decode thread 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 35.91 KB, patch	cpearce : review+	Details \| Diff \| Splinter Review
Patch 4 v2: Remove reader monitor 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 27.35 KB, patch	cpearce : review+	Details \| Diff \| Splinter Review
Patch 5 v2: Move mAudioStream creation to audio thread 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 25.48 KB, patch	roc : review+	Details \| Diff \| Splinter Review
Patch 6 v2: Run state machine in events 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 41.33 KB, patch		Details \| Diff \| Splinter Review
Patch 6 v2 -w: Run state machine in events 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 28.77 KB, patch		Details \| Diff \| Splinter Review
Patch 7 v2: Unify state machine threads 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 51.36 KB, patch		Details \| Diff \| Splinter Review
Patch 8 v2: Shutdown idle decode threads 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 18.56 KB, patch	roc : review+	Details \| Diff \| Splinter Review
Patch 9 v2: Update comments/documentation 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 22.73 KB, patch	roc : review+	Details \| Diff \| Splinter Review
Patch 6 v3: Run state machine in events 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 41.37 KB, patch		Details \| Diff \| Splinter Review
Patch 6 v3 -w: Run state machine in events 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 28.81 KB, patch	roc : review+	Details \| Diff \| Splinter Review
Patch 7 v3: Unify state machine threads 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 51.76 KB, patch	roc : review+	Details \| Diff \| Splinter Review
Patch 10: Don't hold decoder monitor while audio stream does sync dispatch to main thread (on Android). 14 years ago Chris Pearce [:cpearce (Not reading bugmail)] 8.02 KB, patch	roc : review+	Details \| Diff \| Splinter Review

Benoit Jacob [:bjacob] (mostly away)

Reporter

Description

•

14 years ago

Attached file Backtraces for all threads when crash occurs — Details

Hi, This is coming from bug 557423. We are trying to understand why the Javascript port of Quake2 is running slowly and sometimes crashing. It's a page with a lot of <audio> elements. It turns out to have hundreds of threads running concurrently. Attaching 2 files: 1. backtraces for all threads when a crash occurs (does not occur everytime). Thread 1 is the active thread. The absence of debug info for it suggests that it is JITted code. 2. tracing thread creation. I just let firefox run until it started creating an unusually high number of threads, interrupted, set a breakpoint on nsThread::Init, got some backtraces.

Benoit Jacob [:bjacob] (mostly away)

Reporter

Comment 1

•

14 years ago

Attached file tracing thread creation — Details

Benoit Jacob [:bjacob] (mostly away)

Reporter

Updated

•

14 years ago

Attachment #471265 - Attachment description: tr → tracing thread creation

Benoit Jacob [:bjacob] (mostly away)

Reporter

Updated

•

14 years ago

Component: Web Services → Video/Audio

QA Contact: web-services → video.audio

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 2

•

14 years ago

We currently use 3 threads per media element: 1. The decode thread, which decodes ahead of the current playback position. 3. State machine thread, which manages what state we're in, and pushes decoded frames off to the layers system. 3. Audio 'push' thread, which pushes audio samples to the audio hardware. Linux's pulseaudio also creates a thread for each audio element. We need the decode on a separate thread for performance reasons. We're planing of rewriting the audio subsystem so that it's callback based, this will remove the audio thread (but not pulseaudio's thread on Linux I think). We're doing YUV conversion upon paint in the layers system right? So the state machine thread is really only setting the imagedata on the video element and managing time updates? If so it may be possible to refactor this thread away, but it would be a major refactoring though. Or perhaps we can pool the state machine thread across multiple media elements, as Brendan suggested.

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Updated

•

14 years ago

Summary: Audio-heavy page causes us to spawn hundreds of threads → Reduce thread usage of media elements

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Updated

•

14 years ago

Depends on: 580531

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Updated

•

14 years ago

Blocks: 598870

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 3

•

14 years ago

We chatted about the relationship between this bug and bug 598870. IPC will require that a thread which has access to decoded frames+audio run an event loop, and it sounds like the new single state-machine thread will. Some other things that should make remoting easier - have logically distinct decode and playback thread pools. For single-process FF, they would be the same pool. For multi-process, the decode pool would live in content and the playback thread in the compositor. - so far, we've set things up so that all large allocations are done in content so that it bears most of the OOM risk. So when we have to do YUV->RGB conversion for basic layers, it would be nice to farm out the conversion task to the decode pool that would live in content, then forward converted frames to the compositor. It sounded like this was pretty close to how things work currently. Does that sound reasonable?

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 4

•

14 years ago

Yup, sounds reasonable.

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Updated

•

14 years ago

Assignee: nobody → chris

Status: NEW → ASSIGNED

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 5

•

14 years ago

The plan of attack is to remove all operations that can block off the state machine thread, and fold all state machine threads into a single thread whose main purpose is to (basically) do the AdvanceFrame() logic for every media element at the appropriate time. Specifically: 1. Move the metadata decoding to the decode thread. 2. Move the seeking implementation to the decode thread. We may then be able to remove the reader monitor. 3. We may need to move the nsAudioStream creation and destruction off the state machine thread. I think we an leave it there, but I need to verify that. When bug 623444 lands, we'll be able to remove the audio push thread per element, and then can probably change to creating the audio stream when playback starts and destroying it when either audio playback ends, or when we destroy the media element. 4. Factor out the DECODING, BUFFERING, COMPLETED, and SHUTDOWN states from nsBuiltinDecoderStateMachine::Run() into their own events, which we can schedule using timers on the state machine thread instead of having the thread run as a loop. 5. Change to using a single state machine thread which services all active media elements. This will reduce our thread usage by one third. This thread will then be responsible for advancing the current time, setting the current image on the image container at appropriate times, and triggering invalidation when it does so. Note YCbCr->RGB conversion (for the non HWACCEL case) will still happen on the decode thread. 6. Change to shutdown the decode thread when playback is paused and decode queues are full. We'll remain with one thread for decoding per media element, and consider changing that in future if necessary.

Timothy B. Terriberry (:derf)

Comment 6

•

14 years ago

(In reply to comment #5) > when it does so. Note YCbCr->RGB conversion (for the non HWACCEL case) will > still happen on the decode thread. It may be worth thinking about how to do this in a separate thread from decode (which I realize is the opposite of the stated intent of this bug). It's probably the easiest parallelization win you can get, and it'll become increasingly important on mobile as more dual-core devices become available if we don't get our HWACCEL story together there.

John O'Duinn [:joduinn] (please use "needinfo?" flag)

Updated

•

14 years ago

Blocks: 643171

Benoit Jacob [:bjacob] (mostly away)

Reporter

Updated

•

14 years ago

Blocks: 657115

Nicholas Nethercote [inactive]

Updated

•

14 years ago

Whiteboard: [MemShrink:P1]

Katelyn Gadd (:kael)

Comment 8

•

14 years ago

In the short term, it might be worthwhile to adjust the stack size ( using an API like http://msdn.microsoft.com/en-us/library/ms682453%28v=vs.85%29.aspx on windows, not sure if this can be done on other OSes ) for these threads so that they consume less address space. Doing that might provide improvements on bug 657115 for minimal risk.

Whiteboard: [MemShrink:P1]

(no longer active)

Updated

•

14 years ago

Whiteboard: [MemShrink:P1]

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 9

•

14 years ago

I'm almost ready to post a patch to merge all the state machine threads into one, which will cut our thread usage per media element by 1/3 on OSX and Windows, and 1/4 on Linux. It should be ready for review this or next week depending on how shaking out the last few bugs goes... Reducing the thread stack size is a good idea, as we reserve 10MB per thread stack on Linux, which kills us on 32bit Linux. IIRC, we use 1MB thread stacks on Windows and OSX, so the problem isn't so pronounced there, but we can probably still make some easy gains by reducing stack size there anyway.

Nicholas Nethercote [inactive]

Comment 10

•

14 years ago

(In reply to comment #9) > I'm almost ready to post a patch to merge all the state machine threads into > one, which will cut our thread usage per media element by 1/3 on OSX and > Windows, and 1/4 on Linux. Great! > Reducing the thread stack size is a good idea, as we reserve 10MB per thread > stack on Linux, which kills us on 32bit Linux. IIRC, we use 1MB thread > stacks on Windows and OSX, so the problem isn't so pronounced there, but we > can probably still make some easy gains by reducing stack size there anyway. I filed bug 664341 for this.

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 11

•

14 years ago

Attached patch Patch 1 v1: Split thread lifetime control (obsolete) — Details — Splinter Review

* Splits control of lifetime of audio and decode threads up, so they can be controlled separately.

Attachment #540383 - Flags: review?(roc)

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 12

•

14 years ago

Attached patch Patch 2 v1: Move metadata decoding to decode thread (obsolete) — Details — Splinter Review

* Moves metadata decoding over to the decode thread.

Attachment #540384 - Flags: review?(roc)

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 13

•

14 years ago

Attached patch Patch 3 v1: Move seeking over to the decode thread (obsolete) — Details — Splinter Review

* Moves seeking over to the decode thread.

Attachment #540385 - Flags: review?(roc)

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 14

•

14 years ago

Attached patch Patch 4 v1: Remove reader monitor (obsolete) — Details — Splinter Review

* Now that all the data which the reader monitor protects is only used on the decode thread, remove the reader monitor.

Attachment #540386 - Flags: review?(roc)

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 15

•

14 years ago

Attached patch Patch 5 v1: Move mAudioStream creation to audio thread (obsolete) — Details — Splinter Review

* Destroying the audio stream on the state machine thread can block waiting on acquiring the audio monitor. With a shared state machine, we can't afford this, so move the creation and usage of audio stream over to the audio thread. This enables us to remove the audio monitor as well, provided we always acquire the decoder monitor whenever we create or destroy the audio stream on the audio thread, or whenever we use it off the audio thread. As long as we hold the decoder monitor on the state machine thread, we know the audio stream won't be destroyed while we're using it.

Attachment #540387 - Flags: review?(roc)

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 16

•

14 years ago

Attached patch Patch 6 v1: Run state machine in events (obsolete) — Details — Splinter Review

* Refactor nsBuiltinDecoderStateMachine::Run() to run as a series of events rather than as a single synchronous loop. * Instead of doing a NotifyAll() on the decoder monitor to wake up the state machine, we schedule an event to run instead.

Attachment #540388 - Flags: review?(roc)

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 17

•

14 years ago

Attached patch Patch 7 v1: Unify state machine threads (obsolete) — Details — Splinter Review

* Unify the state machine threads, so that all state machine events runners run on a single thread. * Each nsBuiltinDecoderStateMachine holds a refcount to the state machine thread, so the thread should be destroyed when the last state machine is. * We have to be very careful to ensure that we don't run more than one event for any given state machine at once, else we'll get inconsistent state. * The state machine holds a reference to the decoder, so that we can guarantee the decoder stays alive long enough for the state machine to shut down. This creates a cycle which we break when we shutdown.

Attachment #540389 - Flags: review?(roc)

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 18

•

14 years ago

Attached patch Patch 8 v1: Shutdown idle decode threads (obsolete) — Details — Splinter Review

* Destroy decode thread when it's decode buffers are full, and playback is paused. * Hold off creating the audio thread until playback begins. * Note we can't shutdown the audio thread when paused yet, as if we do so we'll end up discarding the audio which has been written to the nsAudioStream but not played yet. We should be able to refactor again to achieve this when libcubeb lands.

Attachment #540391 - Flags: review?(roc)

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 19

•

14 years ago

In general mochitests should pass at every patch in the queue, though I've not tried each patch individually on tryserver, only with them all applied. Tests are green on Try with all patches applied: http://tbpl.mozilla.org/?tree=Try&rev=01f17e830017

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 20

•

14 years ago

Comment on attachment 540383 [details] [diff] [review] Patch 1 v1: Split thread lifetime control Review of attachment 540383 [details] [diff] [review]: -----------------------------------------------------------------

Attachment #540383 - Flags: review?(roc) → review+

cajbir (:cajbir)

Comment 21

•

14 years ago

Which patch updates the documentation in nsBuiltinDecoder.h so I can get an overview of how the new threading system works?

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 22

•

14 years ago

(In reply to comment #21) > Which patch updates the documentation in nsBuiltinDecoder.h so I can get an > overview of how the new threading system works? The patch that I'll write and post this afternoon updates the documentation. ;) In general I followed the plan I outlined in comment 5.

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 23

•

14 years ago

Attached patch Patch 9 v1: Update comments/documentation (obsolete) — Details — Splinter Review

Updates comments in nsBuiltinDecoder*.h to match new architecture.

Attachment #540921 - Flags: review?(roc)

antistress

Comment 24

•

14 years ago

isn't that bug related ? Bug 617852 - Excessively large mmapped regions created by PulseAudio for each active stream

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 25

•

14 years ago

Comment on attachment 540384 [details] [diff] [review] Patch 2 v1: Move metadata decoding to decode thread Review of attachment 540384 [details] [diff] [review]: -----------------------------------------------------------------

Attachment #540384 - Flags: review?(roc) → review+

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 26

•

14 years ago

Comment on attachment 540385 [details] [diff] [review] Patch 3 v1: Move seeking over to the decode thread Review of attachment 540385 [details] [diff] [review]: -----------------------------------------------------------------

Attachment #540385 - Flags: review?(roc) → review+

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 27

•

14 years ago

Comment on attachment 540386 [details] [diff] [review] Patch 4 v1: Remove reader monitor Review of attachment 540386 [details] [diff] [review]: -----------------------------------------------------------------

Attachment #540386 - Flags: review?(roc) → review+

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 28

•

14 years ago

Comment on attachment 540387 [details] [diff] [review] Patch 5 v1: Move mAudioStream creation to audio thread Review of attachment 540387 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/media/nsBuiltinDecoderStateMachine.cpp @@ +692,5 @@ > } > return mReader->Init(cloneReader); > } > > +void nsBuiltinDecoderStateMachine::StopPlayback() StopPlayback doesn't cause the audio stream to be recreated anymore. If we start playing, then seek and resume playing, what causes the stream's pending samples to be flushed, now that we're not recreating the audio stream?

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 29

•

14 years ago

We call StopAudioThread() in DecodeSeek(), which destroys the old audio thread/stream. We'll start playing when the state machine thread runs in DECODING state again.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 30

•

14 years ago

Comment on attachment 540388 [details] [diff] [review] Patch 6 v1: Run state machine in events Review of attachment 540388 [details] [diff] [review]: ----------------------------------------------------------------- I think this patch would be much easier to review if it was diffed with -w. ::: content/media/nsBuiltinDecoderStateMachine.h @@ +411,5 @@ > // to call. > void DecodeThreadRun(); > > + nsresult ScheduleStateMachine(); > + nsresult ScheduleStateMachine(PRInt64 aUsecs); Need to document what this does, and especially what aUsecs does.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 31

•

14 years ago

Comment on attachment 540389 [details] [diff] [review] Patch 7 v1: Unify state machine threads Review of attachment 540389 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/media/nsBuiltinDecoderStateMachine.cpp @@ +183,5 @@ > + NS_ASSERTION(NS_IsMainThread(), "Should be on main thread."); > + // Equivalent to MOZ_COUNT_CTOR > + NS_LogCtor((void*)gStateMachineThread, > + "MediaStateMachineThread", > + sizeof(*gStateMachineThread)); I think this is unnecessary. Threads are already tracked, and this is confusing since the thread object has its own refcount internally. Instead of this refcount, I would just have a global counter of the number of nsBuiltinDecoderStateMachine instances, tracked in the nsBuiltinDecoderStateMachine constructor/destructor, and have CreateStateMachineThread/DestroyStateMachineThread methods that get called when the counter rises above zero and reaches zero. @@ +1038,5 @@ > } > + // Equivalent to MOZ_COUNT_CTOR > + NS_LogCtor((void*)mDecodeThread, > + "MediaDecodeThread", > + sizeof(mDecodeThread)); These naked NS_LogCtors are scary! Are they really needed? They're confusing since MediaDecodeThread is not a class. @@ +1856,5 @@ > + mIsRunningStateMachine = PR_FALSE; > + > + if (mRunStateMachineAgain && !mDispatchedStateMachineRun) { > + mDispatchedStateMachineRun = PR_TRUE; > + return NS_DispatchToCurrentThread(this); Is mRunStateMachineAgain really needed? Why not have ScheduleStateMachine just dispatch this event and set mDispatchedStateMachineRun to true? ::: content/media/nsBuiltinDecoderStateMachine.h @@ +219,5 @@ > + > + // The decoder object that created this state machine. We hold a strong > + // reference to the decoder to ensure it stays alive after it's been shutdown > + // by the media element so that the state machine to keep using the decoder's > + // monitor until the state machine has finished shutdown. This is accessed Grammar error somewhere in this sentence @@ +579,5 @@ > PRPackedBool mQuickBuffering; > > + // PR_TRUE if the shared state machine thread is currently running this > + // state machine. > + PRPackedBool mIsRunningStateMachine; Just mIsRunning. @@ +583,5 @@ > + PRPackedBool mIsRunningStateMachine; > + > + // PR_TRUE if we should run the state machine again once the current > + // state machine run has finished. > + PRPackedBool mRunStateMachineAgain; mRunAgain. Document that this doesn't cause us to run again immediately, just post another event to run this state machine later. @@ +590,5 @@ > + // imperative that we don't dispatch multiple events to run the state > + // machine at the same time, as our code assume all events are synchronous. > + // If we dispatch multiple events, the second event can run while the > + // first is shutting down a thread, causing inconsistent state. > + PRPackedBool mDispatchedStateMachineRun; mDispatchedRunEvent?

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 32

•

14 years ago

(In reply to comment #30) > Comment on attachment 540388 [details] [diff] [review] [review] > Patch 6 v1: Run state machine in events > > Review of attachment 540388 [details] [diff] [review] [review]: > ----------------------------------------------------------------- > > I think this patch would be much easier to review if it was diffed with -w. > > ::: content/media/nsBuiltinDecoderStateMachine.h > @@ +411,5 @@ > > // to call. > > void DecodeThreadRun(); > > > > + nsresult ScheduleStateMachine(); > > + nsresult ScheduleStateMachine(PRInt64 aUsecs); > > Need to document what this does, and especially what aUsecs does. I add a comment for ScheduleStateMachine() in the next patch.

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 33

•

14 years ago

(In reply to comment #31) > Comment on attachment 540389 [details] [diff] [review] [review] > Patch 7 v1: Unify state machine threads > > Review of attachment 540389 [details] [diff] [review] [review]: > ----------------------------------------------------------------- > > ::: content/media/nsBuiltinDecoderStateMachine.cpp > @@ +183,5 @@ > Instead > of this refcount, I would just have a global counter of the number of > nsBuiltinDecoderStateMachine instances, tracked in the > nsBuiltinDecoderStateMachine constructor/destructor, and have > CreateStateMachineThread/DestroyStateMachineThread methods that get called > when the counter rises above zero and reaches zero. Ah yes, I can do this now. In a previous iteration I needed to factor the release code out, but I don't need to now. > > @@ +1038,5 @@ > > } > > + // Equivalent to MOZ_COUNT_CTOR > > + NS_LogCtor((void*)mDecodeThread, > > + "MediaDecodeThread", > > + sizeof(mDecodeThread)); > > These naked NS_LogCtors are scary! Are they really needed? They're confusing > since MediaDecodeThread is not a class. They make it easy to distinguish *which* thread is leaking. I found this useful. I added this because on my system the video mochitests leak an nsIThread (and a bunch of other stuff) about 40% of the time anyway (with or without my patches) so there's no way to distinguish that nsIThread from the one of the decode threads (or the state machine thread for that matter) unless I do this. Is there a better way to distinguish leaking objects? > > @@ +1856,5 @@ > > + mIsRunningStateMachine = PR_FALSE; > > + > > + if (mRunStateMachineAgain && !mDispatchedStateMachineRun) { > > + mDispatchedStateMachineRun = PR_TRUE; > > + return NS_DispatchToCurrentThread(this); > > Is mRunStateMachineAgain really needed? Why not have ScheduleStateMachine > just dispatch this event and set mDispatchedStateMachineRun to true? If we have more than one event to run a state machine dispatched at the same time, and the first event shuts down one of the helper threads, that event's execution will block, but the other event can still run. The other event will put the state machine into an inconsistent state. Ensuring that only one event is active for a state machine at one time ensures that our logic can assume everything for that state machine is synchronous, which is simpler.

Nicholas Nethercote [inactive]

Comment 34

•

14 years ago

> I added this because on my system the video mochitests leak an nsIThread > (and a bunch of other stuff) about 40% of the time anyway (with or without > my patches) Is there a bug filed about that?

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 35

•

14 years ago

Attached patch Patch 1 v2: Split thread lifetime control — Details — Splinter Review

Patch 1 unbitrotten.

Attachment #540383 - Attachment is obsolete: true

Attachment #542992 - Flags: review+

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 36

•

14 years ago

Attached patch Patch 2 v2: Move metadata decoding to decode thread — Details — Splinter Review

Unbitrotten.

Attachment #540384 - Attachment is obsolete: true

Attachment #542993 - Flags: review+

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 37

•

14 years ago

Attached patch Patch 3 v2: Move seeking over to the decode thread — Details — Splinter Review

Unbitrotten...

Attachment #540385 - Attachment is obsolete: true

Attachment #542994 - Flags: review+

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 38

•

14 years ago

Attached patch Patch 4 v2: Remove reader monitor — Details — Splinter Review

Unbitrotten

Attachment #540386 - Attachment is obsolete: true

Attachment #542995 - Flags: review+

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 39

•

14 years ago

Attached patch Patch 5 v2: Move mAudioStream creation to audio thread — Details — Splinter Review

Patch 5 unbitrotten. Roc, see my comment 29 for a response to the question you asked about this patch. Why doesn't bugzilla add those who post review comments to the CC list?!?

Attachment #540387 - Attachment is obsolete: true

Attachment #542996 - Flags: review?(roc)

Attachment #540387 - Flags: review?(roc)

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 40

•

14 years ago

Attached patch Patch 6 v2: Run state machine in events (obsolete) — Details — Splinter Review

Patch 6 unbitrotten. I'll post an `diff -w` patch for easier review.

Attachment #540388 - Attachment is obsolete: true

Attachment #540388 - Flags: review?(roc)

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 41

•

14 years ago

Attached patch Patch 6 v2 -w: Run state machine in events (obsolete) — Details — Splinter Review

Patch 6 v2, `diff -w` for easier review. I removed the COUNT_CTOR equivalence, it's not precisely what I want, it doesn't track the object's lifetime, just our reference to it, which isn't actually that useful.

Attachment #542998 - Flags: review?(roc)

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 42

•

14 years ago

Attached patch Patch 7 v2: Unify state machine threads (obsolete) — Details — Splinter Review

Patch 7 unbitrotten, and review comments addressed.

Attachment #540389 - Attachment is obsolete: true

Attachment #542999 - Flags: review?(roc)

Attachment #540389 - Flags: review?(roc)

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 43

•

14 years ago

Attached patch Patch 8 v2: Shutdown idle decode threads — Details — Splinter Review

Patch 8 unbitrotten.

Attachment #540391 - Attachment is obsolete: true

Attachment #543000 - Flags: review?(roc)

Attachment #540391 - Flags: review?(roc)

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 44

•

14 years ago

Attached patch Patch 9 v2: Update comments/documentation — Details — Splinter Review

Patch 9 unbitrotten. Yes, it was bitrotted. That's a real word. Look it up.

Attachment #540921 - Attachment is obsolete: true

Attachment #543002 - Flags: review?(roc)

Attachment #540921 - Flags: review?(roc)

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Updated

•

14 years ago

Attachment #543002 - Attachment is patch: true

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 45

•

14 years ago

Comment on attachment 542996 [details] [diff] [review] Patch 5 v2: Move mAudioStream creation to audio thread Review of attachment 542996 [details] [diff] [review]: -----------------------------------------------------------------

Attachment #542996 - Flags: review?(roc) → review+

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 46

•

14 years ago

Comment on attachment 543002 [details] [diff] [review] Patch 9 v2: Update comments/documentation Review of attachment 543002 [details] [diff] [review]: -----------------------------------------------------------------

Attachment #543002 - Flags: review?(roc) → review+

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 47

•

14 years ago

Comment on attachment 542998 [details] [diff] [review] Patch 6 v2 -w: Run state machine in events Review of attachment 542998 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/media/nsBuiltinDecoder.cpp @@ +252,5 @@ > "Must have state machine to start state machine thread"); > if (mStateMachineThread) { > return NS_OK; > } > + nsresult res = NS_NewThread(getter_AddRefs(mStateMachineThread)); We're probably going to want a thread pool here, but this'll do for now. ::: content/media/nsBuiltinDecoder.h @@ +570,5 @@ > // Ensures the state machine thread is running, starting a new one > // if necessary. > nsresult StartStateMachineThread(); > > + nsresult CreateStateMachineThread(); Need comments to explain the difference between this and StartStateMachineThread, and why both are needed. ::: content/media/nsBuiltinDecoderStateMachine.cpp @@ +1678,5 @@ > stream->Unpin(); > return res; > } > + > +void RunStateMachine(nsITimer *aTimer, void *aClosure) { static? @@ +1714,5 @@ > + // Ensure the state machine thread is alive; we'll be running on it! > + nsresult res = mDecoder->CreateStateMachineThread(); > + if (NS_FAILED(res)) return res; > + > + nsCOMPtr<nsIEventTarget> target = do_QueryInterface(mDecoder->mStateMachineThread); Why do you need a do_QI here? An implicit static cast should work fine.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 48

•

14 years ago

Comment on attachment 542999 [details] [diff] [review] Patch 7 v2: Unify state machine threads Review of attachment 542999 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/media/nsBuiltinDecoderStateMachine.cpp @@ +223,4 @@ > { > MOZ_COUNT_CTOR(nsBuiltinDecoderStateMachine); > + NS_ASSERTION(NS_IsMainThread(), "Should be on main thread."); > + if (gStateMachineRefCount == 0) { Call it gStateMachineCount. @@ +1826,5 @@ > + if (mIsRunning) { > + mRunAgain = PR_TRUE; > + } else if (!mDispatchedRunEvent) { > + mDispatchedRunEvent = PR_TRUE; > + gStateMachineThread->Dispatch(this, NS_DISPATCH_NORMAL); Why not just run an iteration of the state machine right here now?

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 49

•

14 years ago

Comment on attachment 543000 [details] [diff] [review] Patch 8 v2: Shutdown idle decode threads Review of attachment 543000 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/media/nsBuiltinDecoderStateMachine.h @@ +600,5 @@ > // If we dispatch multiple events, the second event can run while the > // first is shutting down a thread, causing inconsistent state. > PRPackedBool mDispatchedRunEvent; > > + // PR_TRUE if the decode thread has gone filled its buffers and is now "has gone filled its buffers"?

Attachment #543000 - Flags: review?(roc) → review+

Robert O'Callahan (:roc) (email my personal email if necessary)

Updated

•

14 years ago

Blocks: 668449

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 50

•

14 years ago

Attached patch Patch 6 v3: Run state machine in events — Details — Splinter Review

Patch 6 with review comments addressed.

Attachment #542997 - Attachment is obsolete: true

Attachment #542998 - Attachment is obsolete: true

Attachment #542998 - Flags: review?(roc)

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 51

•

14 years ago

Attached patch Patch 6 v3 -w: Run state machine in events — Details — Splinter Review

Patch 6 `diff -w` for easier review.

Attachment #543279 - Flags: review?(roc)

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 52

•

14 years ago

Attached patch Patch 7 v3: Unify state machine threads — Details — Splinter Review

Patch 7 with review comments addressed.

Attachment #542999 - Attachment is obsolete: true

Attachment #543280 - Flags: review?(roc)

Attachment #542999 - Flags: review?(roc)

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 53

•

14 years ago

Comment on attachment 543279 [details] [diff] [review] Patch 6 v3 -w: Run state machine in events Review of attachment 543279 [details] [diff] [review]: -----------------------------------------------------------------

Attachment #543279 - Flags: review?(roc) → review+

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 54

•

14 years ago

Comment on attachment 543280 [details] [diff] [review] Patch 7 v3: Unify state machine threads Review of attachment 543280 [details] [diff] [review]: -----------------------------------------------------------------

Attachment #543280 - Flags: review?(roc) → review+

cajbir (:cajbir)

Comment 55

•

14 years ago

The patch 6 and patch 6 -w versions seem to to result in different nesting of some closing '}' amongst other things. Which is the correct patch that should be applied? For example, see the 'case DECODER_STATE_DECODING_METADATA' in nsBuiltinDecoderStateMachine::Run. In the '-w' version the closing '}' for the case is nested incorrectly.

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 56

•

14 years ago

I will be landing "patch 6". Roc asked me to upload a diff -w patch to make it easier for him to review. diff -w patches don't include whitespace changes, which is why you're not seeing correct indentation in some instances with the -w patch applied.

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 58

•

14 years ago

Whiteboard: [MemShrink:P1] → [MemShrink:P1][inbound]

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Updated

•

14 years ago

Target Milestone: --- → mozilla8

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 59

•

14 years ago

Backed out of inbound due to suspected android crashtests permaorange. http://hg.mozilla.org/integration/mozilla-inbound/rev/e95d8bef125f http://hg.mozilla.org/integration/mozilla-inbound/rev/fcf2cf152e13 This will be fun to solve. http://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=04c5bfe94282

Mounir Lamouri (:mounir)

Updated

•

14 years ago

Whiteboard: [MemShrink:P1][inbound] → [MemShrink:P1]

Target Milestone: mozilla8 → ---

:Ms2ger (he/him; ⌚ UTC+1/+2)

Comment 60

•

14 years ago

Btw, the style guide demands "rv" rather than "res" for nsresults.

Benoit Jacob [:bjacob] (mostly away)

Reporter

Comment 61

•

14 years ago

There's something I still don't understand about the general approach here. If I understand correctly, these patches reduce the number of threads per media element to one, but that's still one thread per media element so if a page has 300 media elements, that's 300 threads. What's the point of using 300 threads on a 4-core machine? Wouldn't it be better to limit the number of concurrent media threads to the number of logical CPUs and use a queue?

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 62

•

14 years ago

Yes, an excellent question, ideally we'd use a thread pool for the decoders. But one step at a time eh? With these patches we now have one less thread per media element, and we shutdown threads while we're not decoding (basically when not playing). But we still have 2 threads per *playing* media element (3 on Linux). We should be able to get down to one thread per element once bug 623444 has landed. The hard part is our decoders assume they run synchronously. But decoders can block while reading data when they run out of downloaded data to decode. Their reads will wait for some data to download before returning and continuing. If a decoder's download stalls, such a blocked read could take a long time to return, so in that case we need to switch decode to another decoder. Then the question becomes, how can we switch to another decoder safely? Do we just call the other decoder (conceptually) on top of the existing call stack of the previous decode? What if that decode also blocks, how do we get back to the first decode? What if a malicious website creates enough videos with blocking connections that we end up running out of stack space? These are hard questions, which I don't yet have a good enough answer for. One option is to change our underlying decode to being non-blocking, but that would require us to mutilate libnestegg (our WebM demuxer), so we'd need to consider it carefully. Regardless, we should consider that in another bug.

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 63

•

14 years ago

I think once we get down to one thread per playing element, we'll be in a reasonable good place. We can make further progress relatively easily by giving up the decoder thread when our buffers are full. As Chris says, getting beyond that --- to handle cases where a lot of elements are playing, but stalled --- is hard, but hopefully that isn't the common case, because it would mean the user is getting a terrible experience already.

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 64

•

14 years ago

(In reply to comment #59) > Backed out of inbound due to suspected android crashtests permaorange. The crashtest hang was introduced in "Patch 5 v2: Move mAudioStream creation to audio thread".

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Updated

•

14 years ago

Blocks: 668973

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 65

•

14 years ago

The crashtest which is timing out on Android is content/media/tests/crashtests/481136-1.html http://tbpl.mozilla.org/?tree=Try&rev=942ebc7636c3

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 66

•

14 years ago

content/media/test/crashtests/459439-1.html is timing out too. http://tbpl.mozilla.org/?tree=Try&rev=080d2c0c9eec

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 67

•

14 years ago

Attached patch Patch 10: Don't hold decoder monitor while audio stream does sync dispatch to main thread (on Android). — Details — Splinter Review

The problem is that I was holding the decoder monitor while calling certain audio stream methods. On Android some nsAudioStreamRemote methods do synchronous event dispatch to the main thread, and the main thread sometimes needs to acquire the decoder monitor, which can cause a deadlock. This patch fixes that... http://tbpl.mozilla.org/?tree=Try&rev=f689e1455cfc

Attachment #545313 - Flags: review?(roc)

Robert O'Callahan (:roc) (email my personal email if necessary)

Comment 68

•

14 years ago

Comment on attachment 545313 [details] [diff] [review] Patch 10: Don't hold decoder monitor while audio stream does sync dispatch to main thread (on Android). Review of attachment 545313 [details] [diff] [review]: -----------------------------------------------------------------

Attachment #545313 - Flags: review?(roc) → review+

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Comment 69

•

14 years ago

Whiteboard: [MemShrink:P1] → [MemShrink:P1][inbound]

Target Milestone: --- → mozilla8

Mounir Lamouri (:mounir)

Comment 70

•

14 years ago

Status: ASSIGNED → RESOLVED

Closed: 14 years ago

Flags: in-testsuite?

Resolution: --- → FIXED

Whiteboard: [MemShrink:P1][inbound] → [MemShrink:P1]

Chris Pearce [:cpearce (Not reading bugmail)]

Assignee

Updated

•

14 years ago

Depends on: 671978

Matthew Gregan [:kinetik]

Updated

•

13 years ago

Depends on: 682593

Alice0775 White

Updated

•

12 years ago

Depends on: 816041

You need to log in before you can comment on or make changes to this bug.