Closed Bug 1597204 Opened 5 years ago Closed 5 years ago

WebSpeech API GetUserMedia request should use some optimized audio settings

Status:

RESOLVED WONTFIX

People

(Reporter: gerard-majax, Assigned: gerard-majax)

Attachments

(1 obsolete file)

Bug 1597204 - Leverage GetUserMedia audio quality improvements r=pehrsons,andrenatal (5 years ago, :gerard-majax, 47 bytes, text/x-phabricator-request)

:gerard-majax

Assignee

Description

5 years ago

We should enable noise cancellation at the very least, and maybe others.
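
As a rough illustration of what "optimized audio settings" could mean at the getUserMedia level, the sketch below builds a constraints object with the standard audio processing flags (`noiseSuppression`, `echoCancellation`, `autoGainControl`) switched on. The helper name `buildSpeechCaptureConstraints`, the local interfaces, and the choice to enable all three flags are assumptions made for this example; the actual patch lives inside Gecko's Web Speech code and is not reproduced here.

```typescript
// Local stand-ins for the boolean audio processing constraints defined by
// the Media Capture and Streams spec. They are declared here so the sketch
// compiles without the DOM type library.
interface AudioProcessingConstraints {
  noiseSuppression: boolean;
  echoCancellation: boolean;
  autoGainControl: boolean;
}

interface SpeechCaptureConstraints {
  audio: AudioProcessingConstraints;
  video: false;
}

// Hypothetical helper (not part of any browser API): returns audio-only
// constraints with every processing flag enabled unless overridden.
// Enabling all three by default is an assumption of this example.
function buildSpeechCaptureConstraints(
  overrides: Partial<AudioProcessingConstraints> = {}
): SpeechCaptureConstraints {
  return {
    audio: {
      noiseSuppression: true,
      echoCancellation: true,
      autoGainControl: true,
      ...overrides,
    },
    video: false,
  };
}
```

In a web page, an object of this shape could be passed to `navigator.mediaDevices.getUserMedia(...)`; passing `{ autoGainControl: false }` as the override would turn off a single flag while leaving the others on.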

:gerard-majax

Assignee

Comment 1

5 years ago

Attached file Bug 1597204 - Leverage GetUserMedia audio quality improvements r=pehrsons,andrenatal (obsolete)

:gerard-majax

Assignee

Updated

5 years ago

Assignee: nobody → lissyx+mozillians

:gerard-majax

Assignee

Updated

5 years ago

Blocks: 1597220

Phabricator Automation

Updated

5 years ago

Attachment #9109385 - Attachment description: Bug 1597204 - Leverage GetUserMedia audio quality improvements r=pehrsons → Bug 1597204 - Leverage GetUserMedia audio quality improvements r=pehrsons,andrenatal

Daniel Bodea [:danibodea]

Comment 2

5 years ago

I have attempted to verify this implementation with the following procedure:

I had two builds open at the same time to compare how well Google Translate recognizes speech on each of them.
Build A: Nightly v72.0a1 from 2019-11-25
Build B: Nightly v72.0a1 from https://treeherder.mozilla.org/#/jobs?repo=try&revision=b5bb8a18c9e522895110785c741684eaa9478bc9
I used a generic webcam as the audio input because its microphone delivers the rawest audio.
I used a mobile phone to record samples of my voice, which I then played back as audio input.
On another mobile phone, I played different YouTube noise videos.
For each test I looked for the most useful set-up (one in which the samples are barely understood) by adjusting the sample or noise volume, or by moving the phones closer to or further from the webcam's microphone. These set-ups changed from one sample to another and from one noise video to another.

These are the test results: https://docs.google.com/spreadsheets/d/145pulOOjma7gfjLlxZXXC_txzRWcQesyVmCCybE16VM/edit?usp=sharing

In conclusion, I can definitely say that there is no noticeable improvement on the build with the audio quality improvements compared to the latest Nightly build. Please have your own opinion based on the test results. Thank you.

P.S. If this is a feature that is only intended to be activated during testing, then it is not needed. I wanted to use noise reduction/cancellation hardware/software only to diversify the testing procedures for the Web Speech API feature, not because recognition (translating voice into a string) was not working well enough.

:gerard-majax

Assignee

Comment 3

5 years ago

(In reply to Bodea Daniel [:danibodea] from comment #2)

[...]

In conclusion, I can definitely say that there is no noticeable improvement on the build with the audio quality improvements compared to the latest Nightly build. Please have your own opinion based on the test results. Thank you.

So you tested that only against Google STT, right? That is very likely to be much more robust to noise, for now, than the DeepSpeech implementation. If it's not regressing and it can help with QA of DeepSpeech, that's good for me.

:gerard-majax

Assignee

Updated

5 years ago

Status: NEW → RESOLVED

Closed: 5 years ago

Resolution: --- → WONTFIX

kdavis

Comment 4

5 years ago

So are we ignoring any benefit this could have for non-Google engines?

Andreas Pehrson [:pehrsons]

Comment 5

5 years ago

A better solution is https://github.com/WICG/speech-api/issues/66, which would give the application full control (and where these settings are on by default).

The API won't ship before that's in the spec and implemented anyway, AIUI.

Phabricator Automation

Updated

3 years ago

Attachment #9109385 - Attachment is obsolete: true

Categories

(Core :: Web Speech, enhancement)
