Open Bug 1549096 Opened 5 years ago Updated 2 years ago

Make Firefox's narration in Reader Mode better in Linux

Categories

(Toolkit :: Reader Mode, enhancement)

66 Branch
x86_64
Linux
enhancement

Tracking

()

UNCONFIRMED

People

(Reporter: daft.goon, Unassigned)

Details

Attachments

(1 file)

Steps to Reproduce:

  1. In a fresh Firefox profile, browse to any webpage that is a blog post. We can use [0] as a reference.
  2. Click on the Reader Mode button that can be found on the right side in the Awesome Bar (the url bar).
  3. Click on the Narrate button (on the left side, third one from top) and press Play.

Expected Results:

If you have used this feature on Windows before, you expect narration on Linux to sound just as good.

Actual Results:

The Linux version sounds very robotic compared to the Windows one. In fact, it sounds exactly as if the text had been passed to espeak.

Suggested Fix:

We could integrate MycroftAI's Mimic2 engine as the TTS backend for narration in Reader Mode. Mimic2 [1] is an open source implementation of Google's Tacotron, described in the research paper "Tacotron: Towards End-to-End Speech Synthesis" [2]. Anything else that improves the TTS quality of the narration would also be welcome.

[0] https://blog.mozilla.org/firefox/introducing-firefox-multi-account-containers/
[1] https://github.com/MycroftAI/mimic2
[2] https://arxiv.org/pdf/1703.10135.pdf

eeejay, the reporter of this bug reached out on #introduction and is interested in contributing the above change if possible. What do you think?

Flags: needinfo?(eitan)

Very interesting! We have our own speech ML stack, I'll copy Kelly Davis who leads that.

As for these specific proposals, I think we will quickly run into a myriad of issues, such as size and licensing (software, model, and talent), if we choose to bundle this in Firefox.

Daft, if you are interested in helping here I think something worthwhile would be to make TTS extensible with WebExtensions. This will allow both experimentation and availability of high quality voices. The big idea is that a user can install other speech engines that would then be available to content via WebSpeech. Narrate uses WebSpeech and would benefit.

Something like this: https://developer.chrome.com/extensions/ttsEngine
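For illustration, here is a rough TypeScript sketch of what an extension-provided engine looks like under Chrome's `ttsEngine` model linked above. Firefox does not expose an equivalent API, so this only shows the shape such an API might take. `makeSpeakListener` and the `synthesize` callback are hypothetical names, and the `TtsEvent` interface is a trimmed-down stand-in for Chrome's event object rather than its full definition.

```typescript
// Trimmed-down event shape modeled on Chrome's TtsEvent (only the fields used here).
interface TtsEvent {
  type: "start" | "word" | "end" | "error";
  charIndex?: number;
  errorMessage?: string;
}

type SendTtsEvent = (event: TtsEvent) => void;

// Hypothetical hook: a real engine would synthesize and play audio here.
type Synthesize = (text: string) => void;

// Builds a listener with the (utterance, options, sendTtsEvent) signature
// that chrome.ttsEngine.onSpeak passes to its listeners.
function makeSpeakListener(synthesize: Synthesize) {
  return (utterance: string, _options: unknown, sendTtsEvent: SendTtsEvent): void => {
    sendTtsEvent({ type: "start", charIndex: 0 });
    try {
      synthesize(utterance);
      sendTtsEvent({ type: "end", charIndex: utterance.length });
    } catch (err) {
      sendTtsEvent({ type: "error", errorMessage: String(err) });
    }
  };
}

// Register only when running inside a Chrome extension that declares a TTS engine.
const ext = (globalThis as any).chrome;
if (ext?.ttsEngine) {
  ext.ttsEngine.onSpeak.addListener(makeSpeakListener((_text) => { /* play audio */ }));
  ext.ttsEngine.onStop.addListener(() => { /* stop audio */ });
}
```

Keeping the synthesizer behind a callback means the same listener could wrap any engine, which is the kind of experimentation the comment above has in mind.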

Flags: needinfo?(eitan)

I'm surprised to see that Mimic (as in the original proposal) is also what's used in Mozilla's speech synthesis ML stack!

Okay, please correct me if I'm wrong. As I understand it, what we should try to do now is implement a way to provide TTS engines through Mozilla's WebExtensions, and that implementation should plug into the WebSpeech API?

Should we open a new bug for that?

Could someone guide me in implementing that feature?

I am really interested in trying to do this. Since I have no prior experience with WebExtensions or WebSpeech, I will try to make an easy first add-on today and will also try out WebSpeech so I know how both are used.
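As a starting point for that experiment, a minimal TypeScript sketch of the Web Speech API might look like the following. `pickVoice` and `VoiceLike` are hypothetical helpers introduced only for this example. The browser globals are looked up through `globalThis` so the snippet does nothing outside a browser, and the preference for the platform default voice is just one possible policy.

```typescript
// Minimal structural type covering the SpeechSynthesisVoice fields used below.
interface VoiceLike {
  name: string;
  lang: string; // BCP 47 tag, e.g. "en-US"
  default: boolean;
}

// Pick a voice whose language starts with the given prefix,
// preferring the platform default when it matches.
function pickVoice<V extends VoiceLike>(voices: V[], langPrefix: string): V | undefined {
  const prefix = langPrefix.toLowerCase();
  const matches = voices.filter((v) => v.lang.toLowerCase().startsWith(prefix));
  return matches.find((v) => v.default) ?? matches[0];
}

// Browser-only part: skipped when the Web Speech API is unavailable.
const synth = (globalThis as any).speechSynthesis;
const Utterance = (globalThis as any).SpeechSynthesisUtterance;
if (synth && Utterance) {
  const speak = (): void => {
    const utterance = new Utterance("Hello from the Web Speech API.");
    const voice = pickVoice(synth.getVoices() as VoiceLike[], "en");
    if (voice) utterance.voice = voice;
    synth.speak(utterance);
  };
  // The voice list can be empty until the "voiceschanged" event fires.
  if (synth.getVoices().length > 0) speak();
  else synth.addEventListener("voiceschanged", speak, { once: true });
}
```

Logging the names returned by `getVoices()` is also a quick way to see which voices the platform actually offers.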

Hi! I have gone through a few tutorials on how to make a WebExtension for Firefox, as I felt that knowing this was necessary. I have also become familiar with the common terminology. Have there been any updates on this? I would still like to work on it, but I am unsure how to proceed, as we have yet to agree on how to implement it.

Thanks!

Severity: normal → S3