Closed Bug 1051118 Opened 5 years ago Closed 5 years ago

SpeechRecognitionService needs to receive the audio rate from SpeechStreamListener

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla35

People

(Reporter: anatal, Assigned: anatal)

References

Details

Attachments

(1 file, 5 obsolete files)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:31.0) Gecko/20100101 Firefox/31.0 (Beta/Release)
Build ID: 20140716183446

Steps to reproduce:

SpeechRecognitionService need to know the audiorate to resample the audio before decode the speech. I changed the interfaces properly.
OS: Mac OS X → All
Hardware: x86 → All
Blocks: 1049931
Flags: needinfo?(ggoncalves)
Comment on attachment 8470368 [details]
0001-Passing-the-sample-rate-from-SpeechStreamListener-to.patch

To give people some context: some speech recognition services will need to resample the audio
coming from SpeechRecognition; in order to do so, they'll need the sample rate at which the
audio was recorded.

Looks good! There are, however, some style issues. I'll point out some of them, but you'll
probably want to double-check that you got them all before requesting a review please :)

>-  mRecognitionService->ProcessAudioSegment(aSegment);
>+  mRecognitionService->ProcessAudioSegment(aTrackRate,aSegment);

Please add a space after the comma: aTrackRate, aSegment

>-                                 MediaStreamListener* aProvider)
>+                                 MediaStreamListener* aProvider , TrackRate aTrackRate )

Extra whitespace:
MediaStreamListener* aProvider, TrackRate aTrackRate)

>   event->mProvider = aProvider;
>+  event->aTrackRate = aTrackRate;

This should be mTrackRate.

>-  void FeedAudioData(already_AddRefed<SharedBuffer> aSamples, uint32_t aDuration, MediaStreamListener* aProvider);
>+  void FeedAudioData(already_AddRefed<SharedBuffer> aSamples, uint32_t aDuration, MediaStreamListener* aProvider , TrackRate aTrackRate ) ;
>+

Extra whitespace:
(...) MediaStreamListener* aProvider, TrackRate aTrackRate);

>index d8f77c3..bcb78cd
>--- a/content/media/webspeech/recognition/SpeechStreamListener.cpp
>+++ b/content/media/webspeech/recognition/SpeechStreamListener.cpp
>@@ -50,7 +50,8 @@ SpeechStreamListener::NotifyQueuedTrackChanges(MediaStreamGraph* aGraph,
>     if (iterator->IsNull()) {
>       nsTArray<int16_t> nullData;
>       PodZero(nullData.AppendElements(duration), duration);
>-      ConvertAndDispatchAudioChunk(duration, iterator->mVolume, nullData.Elements());
>+       ConvertAndDispatchAudioChunk(duration, iterator->mVolume, nullData.Elements(),aTrackRate);
>+

Please fix the indentation, and add a space after the comma before aTrackRate.

>-                                     static_cast<const int16_t*>(iterator->mChannelData[0]));
>+              static_cast<const int16_t*>(iterator->mChannelData[0]),aTrackRate);

Indentation.

>-                                     static_cast<const float*>(iterator->mChannelData[0]));
>+                                     static_cast<const float*>(iterator->mChannelData[0]),aTrackRate);

Space after comma.

>-    void processAudioSegment(in AudioSegmentPtr aAudioSegment);
>+	void processAudioSegment(in long aSampleRate , in AudioSegmentPtr aAudioSegment);

Can you make the arguments consistent with SpeechRecognition::ProcessAudioSegment?

void processAudioSegment(in AudioSegmentPtr aAudioSegment, in long aSampleRate);
Attachment #8470368 - Flags: feedback+
Flags: needinfo?(ggoncalves)
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: SpeechRecognitionService need to receive the audio rate from SpeechStreamListener → SpeechRecognitionService needs to receive the audio rate from SpeechStreamListener
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1051148
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
This patch replaces the last one, as our previous discussion.
Attachment #8470368 - Attachment is obsolete: true
Attachment #8481630 - Flags: review+
Assignee: nobody → anatal
Flags: needinfo?(ggoncalves)
Attachment #8481630 - Flags: review+ → review?(ggoncalves)
Attachment #8481630 - Attachment is obsolete: true
Attachment #8481630 - Flags: review?(ggoncalves)
Attachment #8481704 - Flags: review?(ggoncalves)
Comment on attachment 8481704 [details] [diff] [review]
0002-Speech-recognition-services-will-need-to-resample-th.patch

Review of attachment 8481704 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, but I found a few more style issues. Once those are fixed, this should be ready for :smaug's review.

::: content/media/webspeech/recognition/SpeechRecognition.cpp
@@ +828,4 @@
>  void
>  SpeechRecognition::FeedAudioData(already_AddRefed<SharedBuffer> aSamples,
>                                   uint32_t aDuration,
> +                                 MediaStreamListener* aProvider, TrackRate aTrackRate )

Nit: please remove the extra space after |aTrackRate|.

::: content/media/webspeech/recognition/SpeechStreamListener.cpp
@@ +50,5 @@
>      if (iterator->IsNull()) {
>        nsTArray<int16_t> nullData;
>        PodZero(nullData.AppendElements(duration), duration);
> +      ConvertAndDispatchAudioChunk(duration, iterator->mVolume, nullData.Elements(), aTrackRate);
> +

Please remove this extra empty line.

@@ +57,5 @@
>  
>        MOZ_ASSERT(format == AUDIO_FORMAT_S16 || format == AUDIO_FORMAT_FLOAT32);
>  
>        if (format == AUDIO_FORMAT_S16) {
> +        ConvertAndDispatchAudioChunk(duration, iterator->mVolume, static_cast<const int16_t*>(iterator->mChannelData[0]), aTrackRate);

Please move some of the arguments to a separate line to keep the length of this line under 80 characters (have a look at what this line looked like before your change).

@@ +62,2 @@
>        } else if (format == AUDIO_FORMAT_FLOAT32) {
> +        ConvertAndDispatchAudioChunk(duration, iterator->mVolume, static_cast<const float*>(iterator->mChannelData[0]), aTrackRate);

Same here.

@@ +70,4 @@
>  
>  template<typename SampleFormatType> void
>  SpeechStreamListener::ConvertAndDispatchAudioChunk(int aDuration, float aVolume,
> +                                                   SampleFormatType* aData , TrackRate aTrackRate )

Please remove the extra space after |aData| and |aTrackRate|, and move |aTrackRate| to a separate line to keep this under 80 characters.

@@ +79,4 @@
>    int16_t* to = static_cast<int16_t*>(samples->Data());
>    ConvertAudioSamplesWithScale(aData, to, aDuration, aVolume);
>  
> +  mRecognition->FeedAudioData(samples.forget(), aDuration, this,aTrackRate);

Please add a space after |this,|

::: content/media/webspeech/recognition/nsISpeechRecognitionService.idl
@@ +16,3 @@
>  interface nsISpeechRecognitionService : nsISupports {
>      void initialize(in SpeechRecognitionWeakPtr aSpeechRecognition);
> +	void processAudioSegment(in AudioSegmentPtr aAudioSegment, in long aSampleRate);

Replace this tab with spaces.
Attachment #8481704 - Flags: review?(ggoncalves)
Flags: needinfo?(ggoncalves)
Also, please update the commit message so that it refers to this bug's number, and try to make it shorter and descriptive of the change you've made. Something like:

Bug 1051118 - Pass the sample rate of captured audio to speech recognition services r=smaug.
Attachment #8481704 - Attachment is obsolete: true
Attachment #8487760 - Flags: review?(ggoncalves)
(In reply to Guilherme Gonçalves [:ggp] from comment #6)
> Also, please update the commit message so that it refers to this bug's
> number, and try to make it shorter and descriptive of the change you've
> made. Something like:
> 
> Bug 1051118 - Pass the sample rate of captured audio to speech recognition
> services r=smaug.

Thanks. I fixed the mentioned issues
Comment on attachment 8487760 [details] [diff] [review]
Bug 1051118 - Pass the sample rate of captured audio to speech recognition services

Review of attachment 8487760 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good aside from the style nits, let's get :smaug's word.

::: content/media/webspeech/recognition/SpeechRecognition.h
@@ +130,5 @@
>    uint32_t FillSamplesBuffer(const int16_t* aSamples, uint32_t aSampleCount);
>    uint32_t SplitSamplesBuffer(const int16_t* aSamplesBuffer, uint32_t aSampleCount, nsTArray<nsRefPtr<SharedBuffer>>& aResult);
>    AudioSegment* CreateAudioSegment(nsTArray<nsRefPtr<SharedBuffer>>& aChunks);
> +  void FeedAudioData(already_AddRefed<SharedBuffer> aSamples, uint32_t aDuration, MediaStreamListener* aProvider, TrackRate aTrackRate ) ;
> +

Please remove the extra space after aTrackRate.

::: content/media/webspeech/recognition/SpeechStreamListener.cpp
@@ +58,4 @@
>  
>        if (format == AUDIO_FORMAT_S16) {
>          ConvertAndDispatchAudioChunk(duration, iterator->mVolume,
> +			static_cast<const int16_t*>(iterator->mChannelData[0]), aTrackRate);

Please use spaces instead of tabs for indentation, and align the parameters: you can put |static_cast ...| starting right below |duration|, and |aTrackRate| in the following line.

@@ +62,3 @@
>        } else if (format == AUDIO_FORMAT_FLOAT32) {
>          ConvertAndDispatchAudioChunk(duration, iterator->mVolume,
> +			static_cast<const float*>(iterator->mChannelData[0]), aTrackRate);

Replace tabs with spaces.

@@ +71,5 @@
>  
>  template<typename SampleFormatType> void
>  SpeechStreamListener::ConvertAndDispatchAudioChunk(int aDuration, float aVolume,
> +                                                   SampleFormatType* aData,
> +						    TrackRate aTrackRate)

Replace tabs with spaces.
Attachment #8487760 - Flags: review?(ggoncalves) → review?(bugs)
Ok, fixed the last style nits
Attachment #8487760 - Attachment is obsolete: true
Attachment #8487760 - Flags: review?(bugs)
Attachment #8487901 - Flags: review?(ggoncalves)
Attachment #8487901 - Flags: review?(ggoncalves) → review?(bugs)
Comment on attachment 8487901 [details] [diff] [review]
Bug 1051118 - Pass the sample rate of captured audio to speech recognition services

> .../webspeech/recognition/SpeechRecognition.cpp    |   15 ++++++++-------
> .../webspeech/recognition/SpeechRecognition.h      |    6 ++++--
> .../webspeech/recognition/SpeechStreamListener.cpp |   17 ++++++++++-------
> .../webspeech/recognition/SpeechStreamListener.h   |    2 +-
> .../recognition/nsISpeechRecognitionService.idl    |    4 ++--
> .../test/FakeSpeechRecognitionService.cpp          |    2 +-
> 6 files changed, 26 insertions(+), 20 deletions(-)
> mode change 100644 => 100755 content/media/webspeech/recognition/SpeechRecognition.cpp
> mode change 100644 => 100755 content/media/webspeech/recognition/SpeechRecognition.h
> mode change 100644 => 100755 content/media/webspeech/recognition/SpeechStreamListener.cpp
> mode change 100644 => 100755 content/media/webspeech/recognition/SpeechStreamListener.h
> mode change 100644 => 100755 content/media/webspeech/recognition/nsISpeechRecognitionService.idl
> mode change 100644 => 100755 content/media/webspeech/recognition/test/FakeSpeechRecognitionService.cpp
Why these mode changes? please undo. (-x)
Attachment #8487901 - Flags: review?(bugs) → review+
Ok. Mode change undone
Attachment #8487901 - Attachment is obsolete: true
Attachment #8488069 - Flags: review?(bugs)
Comment on attachment 8488069 [details] [diff] [review]
Bug 1051118 - Pass the sample rate of captured audio to speech recognition services

No need to re-ask review if I've given a conditional r+ and you fix the issue I've pointed.
Attachment #8488069 - Flags: review?(bugs) → review+
Depends on: 1051146
Depends on: 1051148
No longer depends on: 1051146
Depends on: 1060659
No longer depends on: 1051148
https://hg.mozilla.org/mozilla-central/rev/98f4486d16e4
Status: REOPENED → RESOLVED
Closed: 5 years ago5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
No longer depends on: 1060659
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.