As a user, I would like support for voice input on my FxOS device for executing various actions on the device, as well as fetching info from the cloud.
As a user, I would like support for voice input on my FxOS device for executing various actions on the device, as well as fetching info from the cloud.

The current thinking is to break it down into 2 phases:
Phase 1: Grammar-based, local actions
Phase 2: Natural language, server-assisted actions

This will be used as a user story meta bug to track dependent work items.
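To make the Phase 1 idea concrete, here is a minimal sketch of what "grammar based, local actions" could look like: a small fixed set of phrases is turned into a JSGF grammar string (a format PocketSphinx can load), and a recognized transcript is mapped to a local action. The phrases, action names, and function names are all hypothetical and are not taken from this bug.

```python
# Hypothetical Phase 1 sketch: a small fixed grammar mapped to local actions.
# The command phrases and action names are illustrative assumptions.

COMMANDS = {
    "open camera": "launch_camera",
    "call home": "dial_home",
    "play music": "start_music",
}


def build_jsgf(name, phrases):
    """Return a JSGF grammar string that accepts exactly the given phrases."""
    alternatives = " | ".join(sorted(phrases))
    return f"#JSGF V1.0; grammar {name}; public <command> = {alternatives} ;"


def resolve_action(transcript, commands=COMMANDS):
    """Map a recognized transcript to an action name, or None if out of grammar."""
    # Lowercase and collapse whitespace so minor decoder formatting differences
    # do not prevent a match.
    normalized = " ".join(transcript.lower().split())
    return commands.get(normalized)


if __name__ == "__main__":
    print(build_jsgf("commands", COMMANDS))
    print(resolve_action("Open  Camera"))
```

Because the grammar is closed, anything outside the listed phrases resolves to no action. Phase 2 (natural language, server assisted) would need something more than a lookup like this.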
Currently we have it running on Firefox. Steven, can we take the same approach on FxOS that we used on Desktop, linking against the pre-compiled lib, just to start testing?
(In reply to anatal from comment #1)
> Currently we have it running on Firefox. Steven, Can we do same approach
> linking against the pre-compiled lib as we did on Desktop at FxOS only to
> start testing?

Sure, we can do that. :)
Let's sync up on a more convenient time for us to work on it together.
This is the current status of Web Speech API support in browsers: http://caniuse.com/web-speech
Update: A lot of progress has been made since we last met; here is the summary.

First, Web Speech API integration is complete in a test build, and we have prepared a couple of demos. This implementation uses acoustic/language models and a decoder from PocketSphinx (all open source):
Desktop demo (Firefox Nightly on Mac): http://youtu.be/UcBvsU0fCPs
B2G (Flame) demo: https://www.youtube.com/watch?v=0zqBbDmQlQ4

Completed Items
1. Code the integration of the PocketSphinx API with the Web Speech API layer in Gecko
2. Modify the gUM C++ layer to return PCM at 8 kHz
3. Test the API with the speech decoder
3.1 Adjust PocketSphinx parameters to improve accuracy
3.2 Define which languages we'll support initially -> focused on English at this time
4. Include the PocketSphinx sources in Gecko and write the moz.build files for each library so that they are multi-platform and compile with ./mach
5. Integrate gecko-dev with b2g and compile them together to support FxOS (OK)
6. Test build images ready for Mac and Flame (b2g) -- (please send a note to ANatal@gmail.com)

Next Steps
1. Make minor adjustments to the API implementation; code reviews
2. Write mochitests (discussing with QA/Jonathan)
3. Write the prototype (grammar-based) app integrated with Gaia (codenamed "Vaani")
4. Create remaining desktop images for Windows (and Android)
5. Plan integration into the baselines (Gecko, b2g)

Thanks,
Sandip Kamat & Andre Natal

From: "Sandip Kamat" <firstname.lastname@example.org>
To: email@example.com, firstname.lastname@example.org, "email@example.com" <firstname.lastname@example.org>
Cc: "André Natal" <email@example.com>, "Dietrich Ayala" <firstname.lastname@example.org>, "Josh Carpenter" <email@example.com>, "Larissa Shapiro" <firstname.lastname@example.org>
Sent: Tuesday, July 1, 2014 12:46:26 PM
Subject: Enabling Voice Input in Open Web / Firefox OS "Many Voices, One Mozilla"

Hi All,

Here is a summary of the high-level draft plans we are starting with for enabling Voice Input in Open Web / Firefox OS.

One of our Firefox OS contributors, Andre Natal (Brazil community), has done lots of preparatory work and is currently continuing it as a GSOC (Google Summer of Code) project. The proposed 2 phases of the plan are in the email below. Please note that the releases and estimates marked below are *all tentative* (they will change) and will be refined over the next several months.

We will continue adding updates here:
Wiki: https://wiki.mozilla.org/SpeechRTC_-_Speech_enabling_the_open_web
Bugzilla: Bug 1032964 - [B2G][SpeechRTC][User Story]: Enabling Voice input in Firefox OS
Trello board to track status: https://trello.com/b/UWXblmKb/webspeech-api
Github: https://github.com/andrenatal/gecko-dev

There is lots to do here and we are just starting, so if you are interested, please watch this wiki and help with the dependent bugs being added to the meta-bug above. This kind of project could use great community participation: contributing code, collecting and testing voice samples with various accents (remember "Many Voices, One Mozilla") to improve the acoustic/language models, creating fun gamifications to achieve that, and yes, we will need tons and tons of testing!
Phase 1 is now tracked in a separate bug, 1049931.
Phase 2 is now tracked in a separate bug, 1049937.
This is part 1 of 8 for this bug.

This patch introduces the B2G-specific build flags (all enabled initially):

* MOZ_WEBSPEECH - Enables/disables the STT API
* MOZ_WEBSPEECH_MODELS - Enables/disables the model installations
* MOZ_WEBSPEECH_POCKETSPHINX - Compiles/doesn't compile Pocketsphinx, Sphinxbase, and relevant XPCOM models

The try run for this patch is here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=1f8a8598fb48
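As a rough illustration of how three flags like these could gate what gets built, here is a small self-contained sketch written in plain Python (moz.build files use Python syntax). It is not the patch itself: the config dictionary stands in for the build configuration, the directory names are made up, and nesting the two sub-flags under MOZ_WEBSPEECH is an assumption about how they relate.

```python
# Hypothetical sketch of flag-gated directory selection.
# `config` stands in for the build configuration; the directory names and the
# nesting of the sub-flags under MOZ_WEBSPEECH are illustrative assumptions.

def select_dirs(config):
    """Return the list of directories to build for the given flag settings."""
    dirs = []
    if config.get("MOZ_WEBSPEECH"):
        # The STT API itself.
        dirs.append("webspeech/recognition")
        if config.get("MOZ_WEBSPEECH_POCKETSPHINX"):
            # The decoder libraries.
            dirs += ["sphinxbase", "pocketsphinx"]
        if config.get("MOZ_WEBSPEECH_MODELS"):
            # Acoustic/language model installation.
            dirs.append("webspeech/models")
    return dirs


if __name__ == "__main__":
    all_enabled = {
        "MOZ_WEBSPEECH": True,
        "MOZ_WEBSPEECH_MODELS": True,
        "MOZ_WEBSPEECH_POCKETSPHINX": True,
    }
    print(select_dirs(all_enabled))
```

Under this assumed layout, turning off MOZ_WEBSPEECH alone would drop everything, while the other two flags would let a build ship the API without the bundled decoder or models.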
(In reply to kdavis from comment #9)
> Created attachment 8604626 [details] [diff] [review]
> Part 1 of 8: Introduces the B2G specific build flags, initially enabled.
>
> This is the part 1 of 8 for this bug.
>
> This patch that introduces the B2G specific build flags (all enabled
> initially):
>
> * MOZ_WEBSPEECH - Enables/Disables the STT API
> * MOZ_WEBSPEECH_MODELS - Enables/Disables the model installations
> * MOZ_WEBSPEECH_POCKETSPHINX - Compiles/Doesn't Compile Pocketsphinx,
> Sphinxbase, and relevant XPCOM models
>
> The try for this patch is running here
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=1f8a8598fb48

This is the metabug. This patch should be included in its appropriate bug inside the tree.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → WONTFIX