Closed Bug 1316794 Opened 6 years ago Closed 1 year ago

Reader Mode narrate: inserts pauses at line breaks in HTML content

Categories

(Toolkit :: Reader Mode, defect, P3)

Firefox 92
Unspecified
Windows
defect

Tracking

()

VERIFIED FIXED
95 Branch
Tracking Status
firefox95 --- verified

People

(Reporter: me, Assigned: Gijs, NeedInfo)

Attachments

(2 files)

User Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0
Build ID: 20161019084923

Steps to reproduce:

1. Go to a book on the Gutenberg website (e.g. http://www.gutenberg.ca/ebooks/lewiscs-screwtapeletters/lewiscs-screwtapeletters-00-h.html)
2. Turn Reader Mode on
3. Turn Narrator on


Actual results:

The narrator pauses at what appears to be arbitrary points. If you view the source code you will see that there are line breaks at these points.


Expected results:

All white space in HTML should be treated equally; the narrator should only pause if it actually finds a <br> tag.
Component: Untriaged → Reader Mode
Product: Firefox → Toolkit
> The narrator pauses at what appears to be arbitrary points. If you view the source code you will see that there are line breaks at these points.

Apologies for taking considerable time in getting back to this report. I'm trying to reproduce, but it's not clear to me where you mean, specifically. It's possible the bug has gone away, or simply doesn't reproduce on OS X (we use platform-specific speech engines provided by the OS). Can you pick a specific paragraph where you're seeing, err, hearing this and point to specific points at which you're hearing pauses? Thank you!
Flags: needinfo?(me)
Priority: -- → P3
Yeah, it's probably platform specific, since I can still reproduce it on MS Windows. I can reproduce it on the following paragraph on the page:

“The best way to drive out the devil, if he will not yield to texts of Scripture, is to jeer and flout him, for he cannot bear scorn.”—Luther

There is a pause after the word "not".

One solution would be to scrub the line breaks before passing it to the speech engine.
Flags: needinfo?(me)
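
The scrubbing idea suggested above could look roughly like the following. This is a minimal sketch, not the actual Reader Mode code: the function name `scrubLineBreaks` is hypothetical, and it assumes the text has already been extracted into a plain string before it is handed to the speech engine.

```typescript
// Hypothetical helper, not taken from Firefox: replaces every run of
// whitespace that contains at least one CR or LF with a single space,
// so line wrapping in the HTML source cannot show up as a pause.
function scrubLineBreaks(text: string): string {
  return text.replace(/\s*[\r\n]+\s*/g, " ").trim();
}

// The wrapped paragraph from the report, with one LF and one CRLF.
const wrapped =
  "if he will not\nyield to texts of Scripture, is to jeer and flout him, for\r\nhe cannot bear scorn.";

console.log(scrubLineBreaks(wrapped));
// prints: if he will not yield to texts of Scripture, is to jeer and flout him, for he cannot bear scorn.
```

Spaces and tabs that are not next to a line break are left alone, so only the source wrapping is affected.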
Eitan, should we just always normalize newlines in the textContent? Kind of wondering if we should have an exception for things in <pre> (and/or <code> or other things?) or not.
Flags: needinfo?(eitan)
(In reply to Alan Trick from comment #2)
> Yeah, it's probably platform specific, since I can still reproduce it on MS
> Windows. I can reproduce it on the following paragraph on the page:
> 
> “The best way to drive out the devil, if he will not yield to texts of
> Scripture, is to jeer and flout him, for he cannot bear scorn.”—Luther
> 
> There is a pause after the word "not".
> 
> One solution would be to scrub the line breaks before passing it to the
> speech engine.

Do you see a linebreak after "not"? I don't. Are you sure this is a linebreak issue?

(In reply to :Gijs from comment #3)
> Eitan, should we just always normalize newlines in the textContent? Kind of
> wondering if we should have an exception for things in <pre> (and/or <code>
> or other things?) or not.

Yeah, might be worth normalizing and adding exceptions for those cases, and for explicit <br>s.
Flags: needinfo?(eitan)
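
To make the "normalize, with exceptions" option concrete, here is a rough sketch under stated assumptions. The `SimpleNode` tree is a hypothetical, simplified stand-in for the DOM, and `textForSpeech` is an invented name; the real code works on actual DOM nodes and may be structured quite differently. Only `<pre>` is treated as an exception here; whether `<code>` or other elements belong on that list is exactly the open question above.

```typescript
// Hypothetical, simplified stand-in for a DOM tree (assumption for this sketch).
type TextNode = { kind: "text"; value: string };
type ElementNode = { kind: "element"; tag: string; children: SimpleNode[] };
type SimpleNode = TextNode | ElementNode;

// Tags whose line breaks are assumed to be meaningful.
const PRESERVE_TAGS = ["pre"];

function textForSpeech(node: SimpleNode, preserve: boolean = false): string {
  if (node.kind === "text") {
    // Outside preserved elements, a source newline is ordinary whitespace.
    return preserve ? node.value : node.value.replace(/\s+/g, " ");
  }
  if (node.tag === "br") {
    // An explicit <br> is kept as a real break.
    return "\n";
  }
  const keep = preserve || PRESERVE_TAGS.indexOf(node.tag) !== -1;
  return node.children.map((child) => textForSpeech(child, keep)).join("");
}

const paragraph: SimpleNode = {
  kind: "element",
  tag: "p",
  children: [
    { kind: "text", value: "he will not\nyield" },
    { kind: "element", tag: "br", children: [] },
    { kind: "element", tag: "pre", children: [{ kind: "text", value: "a\nb" }] },
  ],
};

console.log(JSON.stringify(textForSpeech(paragraph)));
// prints: "he will not yield\na\nb"
```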
(In reply to Eitan Isaacson [:eeejay] from comment #4)
> Do you see a linebreak after "not"? I don't. Are you sure this is a
> linebreak issue?

Pretty sure. When you look at the source, it looks like this:

<p class="tb">&ldquo;The best way to drive out the devil, if he will not
yield to texts of Scripture, is to jeer and flout him, for
he cannot bear scorn.&rdquo;&mdash;<i>Luther</i></p>

The text-to-speech also pauses after "for". I haven't tested if this is \r or \n or both.
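
One way to settle the \r versus \n question would be to count the line endings in the raw page source. The helper below is only an illustrative sketch; `countLineEndings` is a hypothetical name and is not part of Firefox.

```typescript
// Hypothetical diagnostic helper: counts CRLF pairs, lone LFs and lone CRs.
function countLineEndings(text: string): { crlf: number; lf: number; cr: number } {
  const crlf = (text.match(/\r\n/g) || []).length;
  const lf = (text.match(/\n/g) || []).length - crlf; // LFs not preceded by CR
  const cr = (text.match(/\r/g) || []).length - crlf; // CRs not followed by LF
  return { crlf, lf, cr };
}

console.log(countLineEndings("he will not\nyield, for\r\nhe cannot"));
// prints: { crlf: 1, lf: 1, cr: 0 }
```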
I want to take back what I said earlier. I think we should remove *all* newlines from textContent. This way it will be consistent with other platforms that don't insert pauses at newlines.
See Also: → 1294761
Attached audio recording-file.m4a

This is an example of the narrator in Reader Mode aberrantly pausing at line feeds that have no syntactic meaning in HTML and are invisible to a reader of the text.

This is demonstrated by
http://www.mauve.plus.com/pauses.html
which consists of repeated text containing:

This is how this sentance sounds normally with no abberant carriage returns or line feeds.<p>
This
is
how
it
sounds
with
line
feeds
instead
of
spaces.
<p>

Added audio file, confirming the presence of this bug as of 92.0.1, on Win 10 Pro, OS build 19043.1237.
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:92.0) Gecko/20100101 Firefox/92.0

This is a significant issue when reading files that may have originated somewhere that does not support reflowable lines of unlimited length.

As a result, flat text files converted to HTML typically have this issue and are almost impossible to listen to.
I also tested CRLF line endings, with identical results.

Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Unspecified → Windows
Version: 49 Branch → Firefox 92


Assignee: nobody → gijskruitbosch+bugs
Severity: normal → S3
Status: NEW → ASSIGNED
Pushed by gijskruitbosch@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/ded9e061f6df
fix reader mode narrate pausing for new lines with some speech back-ends, r=eeejay
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 95 Branch

Ian, could you confirm if this is fixed for you on current nightly? ( https://nightly.mozilla.org/ )

Flags: needinfo?(bugzilla)
Flags: qe-verify+

Reproduced the issue using Firefox 94.0a1 (20211003201113) on Windows 10 x64 with the websites from comment 0 and comment 7, with the
https://addons.mozilla.org/en-US/firefox/addon/activate-reader-view/ extension installed to enable Reader Mode.
The issue is no longer reproducible on the above website with Firefox 95.0b2 (20211102190739) on Windows 10x64.

Status: RESOLVED → VERIFIED
Flags: qe-verify+