Closed Bug 599089 Opened 14 years ago Closed 14 years ago

Remote audio

Categories

(Core :: IPC, defect)

All
Android
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
fennec 2.0b3+ ---

People

(Reporter: mwu, Assigned: dougt)

References

Details

Attachments

(1 file, 4 obsolete files)

The child process can't play audio on Android, so we need to remote it.
tracking-fennec: --- → ?
tracking-fennec: ? → 2.0+
Assignee: nobody → mwu
tracking-fennec: 2.0+ → 2.0b3+
Just wondering, what is the technical reason why child process can't play
audio?
(In reply to comment #1)
> Just wondering, what is the technical reason why child process can't play
> audio?

The audio api is implemented in a java api. The child process has no access to java.
Are there no other audio APIs we can use directly from the child process?
(In reply to comment #3)
> Are there no other audio APIs we can use directly from the child process?

None that we're allowed to use. That doesn't necessarily stop us, but we do avoid the use of undocumented/unstable APIs unless there is no other choice.
Assignee: mwu → doug.turner
The approach is kinda tough.  Hopefully, we can get access to JNI and call directly to avoid remoting.

The general approach is to provide a stub implementation of nsAudioStream for the child process.  This new child implementation, we will forward to the parent the following.  Something like:


protocol PAudio
{
  PAudio(PRInt32 aNumChannels, PRInt32 aRate, SampleFormat aFormat)
  __delete__() 
  Write(nsCString data, PRUint32 count)
  sync WriteSync(nsCString data, PRUint32 count)
  SetVolume(float aVolume);
  Drain()
  Pause()
  Resume()
};


The constructor and destructor correspond to the init and shutdown of the nsAudioStream.  SampleFormat can be a int, i suppose.

We need two versions of write.  One to suppose the blocking call, and the other to support the normal non blocking call.

Then some control functions that directly map to the methods in nsAudioStream.

The things that worry me most are the GetSampleOffset, and Available methods.  These things are called often and need to return very accurate data or you will see synchronization problems.  For example, GetSampleOffset needs to hold 10ms precision.

A naive approach would be to make these methods sync, but in doing so you will lock up the child very often.

We could make it so that on each Write the parent does, we make these calls on the parent and set some shared memory (avoiding IPC messages) so that the child can simple look at these values when required (no locking, ideally).

cjones, any better ideas?

Matthew, would updating the sample offset and available on write() be good enough?
After some discussion, I think we tentatively agreed that audio playback should done in the compositor process, for the sake of a/v sync.  We likely won't get far enough along in the 4.0 timeframe that it's going to matter too much, but that's the direction.

So, with that in mind we agreed to this remoting solution for android (instead of trying to access the underlying audio libs directly or setting up JNI access from content).  AIUI things will be set up as
 - video/audio decoder thread unchanged
 - audio playback thread will forward data to the content-process main thread, which will send it through PAudio or whatever to compositor
 - the code that previously ran on the content-process audio thread will run in a new thread in the compositor process
 - the compositor-process main thread will forward PAudio data to the compositor-playback thread, which will play it using the Java APIs

We'll be schlepping over the IPC socket somewhere in the neighborhood of 172KB per second of decoded audio.

It's not quite so clear how the "director" thread will access the audio clock.  Some options are
 - synchronously proxy request from director-->content-main thread, then through a |sync| PAudio or whatever message
 - just guess in content and push correcting updates from compositor-->content at XHz
 - (something exotic with shmem?)

We also tentatively agreed to make this implementation ifdef ANDROID so that we don't regress n900 perf.  We'll want this ifdef MOZ_IPC once perf is comparable.

(It's kind of funny how much ping-ponging through various threads and processes the audio data will take going from network --> audio device.  This would be a good design point for architecture optimization in the future.)
Attached patch WIP v.1 (obsolete) — Splinter Review
The basic implementation is as follows:

1) factor the nsAudioStream into two classes.

* The first class is the existing implementation and will run in the parent process (or the default process when using Firefox on the desktop).

* The second class is a new implementation and will be created in the child process.

* Instead of new'ing nsAudioStream(), we can just add a static method to return the correct implementation based on who is calling and from where.

2) The remoting of nsAudioStream

* IPC can only be access from the main thread.  Because of this restriction, the nsAudioStream running in the child will post messages to the main thread.  These messages will be for Init, Shutdown, Write, SetVolume, Pause, Resume, Drain.

* These events currently hold references to the nsAudioStream and that does go away during shutdown, page transisition.  I think I will just add reference counting on these objects.

* Sync functions

** Sample Offset and Avaiable are called from both the main thread and the decoding thread.  The callers expect an immediate response.

** Posting a SYNC message from the decoding thread to the main thread causes deadlocking.

** Proposal was to run a timer in the parent and share that value with the child.

*** Currently using filesystem to share this BAH!

* Parent Process

** Currently runs a 30Hz timer to update the sample offset.

** Implementation of the PAudio basically just calls directly into the nsAudioStream.  It might be possible to just leave it this way.  Otherwise, we can throw this on another thread.

Lots of work and polish needed!
Attachment #489232 - Flags: feedback?(jones.chris.g)
Comment on attachment 489232 [details] [diff] [review]
WIP v.1

I only scanned the audio-clock bits, and we discussed that on IRC.  No specific feedback on overall design, seems fine to me.

(In reply to comment #7)
> ** Implementation of the PAudio basically just calls directly into the
> nsAudioStream.  It might be possible to just leave it this way.  Otherwise, we
> can throw this on another thread.

This needs to go on a separate thread or we'll have another places-like incident on our hands.

> Lots of work and polish needed!

Dziękuję!
Attachment #489232 - Flags: feedback?(jones.chris.g)
Attached patch WIP v.2 (obsolete) — Splinter Review
Attachment #489232 - Attachment is obsolete: true
The formula I gave you for estimating the sample offset was wrong:

+  PRInt64 result = offset + (mRate * (PR_IntervalNow() - time) / MILLISECONDS_PER_SECOND);

This should be:

+  PRInt64 result = offset + (mRate * mChannels * (PR_IntervalNow() - time) / MILLISECONDS_PER_SECOND);

With this change, it works surprisingly well on my desktop machine!
Just a couple of quick comments:

1) dom/ipc/AudioChild.cpp~ should probably not be part of this patch.

2) I know it's trivial compared to the amount of memory allocation, copying, and context switching going on, but recommend making bytesPerSample a member of nsAudioStreamRemote and setting it in the constructor when mFormat is set, to remove the switch in Write().
My comments so far:

1. The child's Drain() call needs to block until the drain operation completes in the parent.

2. Maintain the pause state (mPaused) in the child because we use this to avoid calling Write() when the stream has been paused, so it's not safe to updated mPaused asynchronously from the event processing.

3. As discussed on IRC, you can probably hook the gAudioPlaybackThread stuff into nsAudioStream::InitLibrary/ShutdownLibrary, which is called by nsLayoutStatics... however, #1 (and the blocking Write() calls) will make using a single thread painful/impossible I think.
+    result = (nsAudioStream*) new nsAudioStreamRemote();
+  }
+  else
+#endif
+    result = (nsAudioStream*) new nsAudioStreamLocal();

These casts shouldn't be necessary as both subclasses inherit from nsAudioStream.

+class AudioWriteEvent : public nsRunnable
+{
+ public:
+  AudioWriteEvent(nsAudioStreamRemote* owner,
+                  const void* aBuf,
+                  PRUint32 aBufSize,
+                  PRUint32 aCount,
+                  PRBool blocking)
+  {
+    mOwner = owner;
+    mBlocking = blocking;
+    mCount = aCount;

Can you please rename aCount to aBytes here (and anywhere else that is using bytes rather than samples)?
> Can you please rename aCount to aBytes here (and anywhere else that is
> using bytes rather than samples)?

I'll see if I can get rid of aBufSize and just use mBytesPerSample * aCount.
Attached patch patch v.3 (obsolete) — Splinter Review
this seems to work fine on the n900 and desktop.  i ran into unrelated android streaming problems, so I can't say that audio works great there with this patch, but it works no worse.

kinetik, can you take a look?
Attachment #489659 - Attachment is obsolete: true
Attachment #490162 - Flags: review?
Attachment #490162 - Flags: review? → review?(kinetik)
+static nsIThread *gAudioPlaybackThread = nsnull;

I'm confused about what this is for.  It seems to only be used for the timing 
update stuff, but why can't that run on the main thread (i.e. directly from the timer callback)?  Fetching the timing information is pretty fast.

The other events are running on the main thread, but this 
will cause problems because some of them can block for a long time
(specifically, writes and drain may block for over a second).  They need to be moved to their own thread, and because a blocking write or drain will hold up events for other streams.

In this case, it might make sense to shunt the timing update calls over to the per-stream thread, but I'm not sure... either way seems fine to me.

+    mOwner->mAudioChild = (AudioChild*) cpc->SendPAudioConstructor(mOwner->mChannels,

Is this cast necessary?  If it is, prefer static_cast.

+nsAudioStream* nsAudioStream::AllocateStream()
+{
+#ifdef MOZ_IPC
+  if (XRE_GetProcessType() == GeckoProcessType_Content) {
+    return new nsAudioStreamRemote();
  }
#endif
  return new nsAudioStreamLocal();
}

Simplify AllocateStream to this?

+void
+nsAudioStreamRemote::Pause()
+{
+  nsCOMPtr<nsIRunnable> event = new AudioPauseEvent(this, PR_TRUE);
+  NS_DispatchToMainThread(event);
+}
+
+void
+nsAudioStreamRemote::Resume()
+{
+  nsCOMPtr<nsIRunnable> event = new AudioPauseEvent(this, PR_FALSE);
+  NS_DispatchToMainThread(event);
+}

Set and clear mPaused here and do away with setting it in the AudioPauseEvent::Run method per comment 12.
> Set and clear mPaused here and do away with setting it
> in the AudioPauseEvent::Run method per comment 12.

is it okay to set mPaused before pending write events complete?
Attached patch patch v.4 (obsolete) — Splinter Review
address above concerns.
Attachment #490162 - Attachment is obsolete: true
Attachment #490779 - Flags: review?(kinetik)
Attachment #490162 - Flags: review?(kinetik)
Attached patch patch v.5Splinter Review
Attachment #490779 - Attachment is obsolete: true
Attachment #490802 - Flags: review?(kinetik)
Attachment #490779 - Flags: review?(kinetik)
Comment on attachment 490802 [details] [diff] [review]
patch v.5

+PRInt64 nsAudioStreamRemote::GetPosition()
+{
+  PRInt64 sampleOffset = GetSampleOffset();
+  if (sampleOffset >= 0) {
+    return ((MILLISECONDS_PER_SECOND * sampleOffset) / mRate / mChannels);
+  }
+  return -1;
+}
+
+PRInt64
+nsAudioStreamRemote::GetSampleOffset()
+{
+  if(!mAudioChild)
+    return -1;
+
+  PRInt64 offset = mAudioChild->GetLastKnownSampleOffset();
+  if (offset == -1)
+    return offset;
+
+  PRInt64 time   = mAudioChild->GetLastKnownSampleOffsetTime();
+  PRInt64 result = offset + (mRate * mChannels * (PR_IntervalNow() - time) / MILLISECONDS_PER_SECOND);
+
+  return result;
+}

These should only return -1 on error, so once the audio stream remoting is up and running they should return 0 if the child hasn't received a timing update yet.

Two other issues that can be fixed in followup bugs since you want to get this landed soon:

1. Per comment 16, we need a thread per stream to avoid problems with blocking drain/write calls affecting other AudioStream's operations.

2. We need to work out how to signal errors in the child AudioStream.  If the parent stream sets mInError, it should send an event to the child to mark it as in-error as well... otherwise the child has no way to know the AudioStream is failing.  Events sent between the parent entering error state and the child receiving the event will be ignored by the mInError early return tests in the parent, so this should all work without too much pain.
Attachment #490802 - Flags: review?(kinetik) → review+
Blocks: 612798
Blocks: 612799
followup bugs:

Filed bug 612798 to track issue 1
Filed bug 612799 to track issue 2
http://hg.mozilla.org/mozilla-central/rev/908443327ad3
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Just got timeouts and eventually a crash in nsAudioStreamLocal on Rev3 Fedora 12x64 mozilla-central debug test mochitests-1/5, thread stacks are in here:
http://tinderbox.mozilla.org/1290032771.1290035995.24230.gz

Starred as bug 589595, bug 610931, bug 610930, and bug 591790.

There are thread stacks blocked in nsAudioStreamLocal::Available and in nsAudioStreamLocal::Write.

We just managed to quell a lot of the orange on tinderbox...
Chris,

Bad *.gz url.


The significant changes in non IPC are:

1) We changed the nsAudioStream to be refcount.

2) We call shutdown from nsBuiltinDecoderStateMachine destructor because msAudioStream (in the IPC cause) has cycles that need to be broken. 


     6.1 --- a/content/media/nsBuiltinDecoderStateMachine.cpp
     6.2 +++ b/content/media/nsBuiltinDecoderStateMachine.cpp
     6.3 @@ -161,6 +161,12 @@ nsBuiltinDecoderStateMachine::nsBuiltinD
     6.4  nsBuiltinDecoderStateMachine::~nsBuiltinDecoderStateMachine()
     6.5  {
     6.6    MOZ_COUNT_DTOR(nsBuiltinDecoderStateMachine);
     6.7 +
     6.8 +  if (mAudioStream) {
     6.9 +    MonitorAutoEnter mon(mDecoder->GetMonitor());
    6.10 +    mAudioStream->Shutdown();
    6.11 +    mAudioStream = nsnull;
    6.12 +  }
    6.13  }
chris,
it looked like some of those crashes happened on the 9th (the wave test, with drain() on each of the decoding threads)
(In reply to comment #24)
> 2) We call shutdown from nsBuiltinDecoderStateMachine destructor because
> msAudioStream (in the IPC cause) has cycles that need to be broken.

I don't think ~nsBuiltinDecoderStateMachine() could be called without nsBuiltinDecoderStateMachine::StopPlayback(AUDIO_SHUTDOWN) being called first. This calls shutdown() on the audio stream, and happens in the SHUTDOWN and COMPLETED cases of nsBuiltinDecoderStateMachine::Run().

Does your new call to shutdown() in ~nsBuiltinDecoderStateMachine() ever get run? I don't have an android device or environment setup to check for myself. If not we can remove this code.

If the code is being run, it should be holding the audio monitor rather than the decoder monitor.
yeah, definitely gets called.  Basically if you go to an ogg file, then go to a new page, nsBuiltinDecoderStateMachine::StopPlayback(AUDIO_SHUTDOWN) is never called.
The old code used to call nsAudioStream::Shutdown in the nsAudioStream dtor.  This would be run automatically when nsBuiltinDecoderStateMachine was destroyed.  I've verified in a debugger that this happens when browsing to a new page (StopPlayback(AUDIO_SHUTDOWN) is not called, however).  So the existing code was fine.

The new code removed this Shutdown call, and added an explicit call in the nsBuiltinDecoderStateMachine dtor.  This change shouldn't have been necessary.  I missed it in the last patch I reviewed, after we discussed it I thought it was going to be reverted, but perhaps I misunderstood.

If there's a cycle here, it couldn't be resolved by this change anyway, because nsBuiltinDecoderStateMachine's dtor would not run.  But I can't see a cycle here anyway.
matthew, do you want to back out that part?  I can verify any problems in fennec/ipc tomorrow.
Depends on: 613289
these problems were addressed in bug 613982.
Depends on: 614136
Depends on: 614160
Depends on: 640117
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: