Open Bug 1635353 Opened 5 years ago Updated 5 years ago

Reader mode omits text on ncurses manpage style documents when deeply nested nodes have a lot of text

Status:

NEW

People

(Reporter: alkersh.omar, Unassigned)

Details

(Whiteboard: [reader-mode-readability-algorithm])

Attachments

(1 file)

bug_img.png (150.43 KB, image/png), attached 5 years ago by alkersh.omar

alkersh.omar

Reporter

Description

5 years ago

Attached image bug_img.png

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:75.0) Gecko/20100101 Firefox/75.0

Steps to reproduce:

Entered reader mode on affected sites. A specific example is http://tldp.org/HOWTO/NCURSES-Programming-HOWTO/windows.html

Actual results:

Some paragraphs are not showing in reader mode, but they show up in normal mode.

Expected results:

The same text that shows up in normal mode also shows up in reader mode.

alkersh.omar

Reporter

Comment 1

5 years ago

I noticed that the page is made up of three divs. Reader mode shows only the content of the very last div inside div 2; all other elements are ignored. Note: the rendered div is the last div inside the last div of div 2.

BugBot [:suhaib / :marco/ :calixte]

Comment 2

5 years ago

Bugbug thinks this bug should belong to this component, but please revert this change in case of error.

Component: Untriaged → Reader Mode

Product: Firefox → Toolkit

:Gijs (he/him)

Comment 3

5 years ago

This is a result of the algorithm giving weight to large amounts of text: the code example has by far the most text of any node on the page. Ancestor nodes get scored at fractions of their child nodes' scores (otherwise we'd just always pick all of <body> as the container of the article text), and clearly here we do not score anything else high enough for the overall container to end up with a higher score than the code example. There are also no class names or anything else to help clue reader mode into what's happening, and there's some div soup going on in the article structure, which isn't helping either.

I'm not sure how best to fix this. Given that this kind of page is not the primary target for reader mode (that's news articles and other frequently visited webpages, which have very different DOM structures), I'm marking this P5.
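The scoring behaviour described in this comment can be illustrated with a toy model. This is a sketch under stated assumptions, not Readability's actual code: the `Node` class, the "one point per 100 characters" weight, and the halving of the score at each ancestor level are all hypothetical choices made only to show the effect. The tree mirrors the structure from comment 1 (three divs, with the code example in the last div inside the last div of div 2).

```python
class Node:
    """A minimal stand-in for a DOM element (hypothetical, for illustration)."""

    def __init__(self, name, text=""):
        self.name = name
        self.text = text
        self.parent = None
        self.children = []
        self.score = 0.0

    def add(self, child):
        child.parent = self
        self.children.append(child)
        return child


def walk(node):
    """Yield a node followed by all of its descendants."""
    yield node
    for child in node.children:
        yield from walk(child)


def score_tree(root):
    """Score every node and return the highest-scoring one."""
    nodes = list(walk(root))
    for node in nodes:
        # Assumed weight: one point per 100 characters of the node's own text.
        base = len(node.text) // 100
        if base == 0:
            continue
        node.score += base
        # Each ancestor receives a shrinking fraction of the child's score
        # (assumed here: half per level), so <body> does not always win.
        ancestor, level = node.parent, 1
        while ancestor is not None:
            ancestor.score += base / (2 ** level)
            ancestor, level = ancestor.parent, level + 1
    return max(nodes, key=lambda n: n.score)


body = Node("body")
body.add(Node("div-1-header"))
article = body.add(Node("div-2-article"))
for i in range(3):
    article.add(Node(f"paragraph-{i}", text="word " * 60))  # 300 characters
wrapper = article.add(Node("div-2-last-child"))
code = wrapper.add(Node("code-example", text="x" * 5000))
body.add(Node("div-3-footer"))

best = score_tree(body)
print(best.name, best.score, article.score)
```

In this model the deeply nested code node wins: the article container sits two levels above it and receives only a quarter of its score, and the short paragraphs do not add enough to make up the difference.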

Severity: normal → S3

Status: UNCONFIRMED → NEW

Ever confirmed: true

OS: Unspecified → All

Priority: -- → P5

Hardware: Unspecified → All

Summary: reader mode skipping text → Reader mode omits text on ncurses manpage style documents when deeply nested nodes have a lot of text

Whiteboard: [reader-mode-readability-algorithm]

alkersh.omar

Reporter

Comment 4

5 years ago

Perhaps add an option to use the whole of <body> as input, without any scoring or algorithms. This is not a blanket solution, but if it were allowed to exist alongside the current setup, it could provide a brute-force workaround for this 'bug' while a more sophisticated solution is developed.
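A rough sketch of what such an option might look like, assuming a page represented as nested dictionaries. The `extract` function, its `use_whole_body` flag, and the `pick_candidate` stand-in for the scoring step are hypothetical names for illustration only; none of them is part of reader mode.

```python
def all_text(node):
    """Concatenate the text of a node and all of its descendants."""
    parts = [node.get("text", "")]
    parts.extend(all_text(child) for child in node.get("children", []))
    return " ".join(part for part in parts if part)


def pick_candidate(node):
    """Stand-in for the scoring step: pick the node with the most own text."""
    best = node
    for child in node.get("children", []):
        candidate = pick_candidate(child)
        if len(candidate.get("text", "")) > len(best.get("text", "")):
            best = candidate
    return best


def extract(body, use_whole_body=False):
    """Return article text, optionally bypassing candidate selection."""
    target = body if use_whole_body else pick_candidate(body)
    return all_text(target)


page = {
    "tag": "body",
    "children": [
        {"tag": "p", "text": "Short introduction."},
        {
            "tag": "div",
            "children": [
                {"tag": "pre", "text": "int main(void) { return 0; } " * 20},
            ],
        },
    ],
}

default_view = extract(page)                      # only the code example
forced_view = extract(page, use_whole_body=True)  # everything under <body>
```

With the flag set, the introduction is kept at the cost of also including any navigation or footer text that lives under <body>.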

:Gijs (he/him)

Comment 5

5 years ago

(In reply to alkersh.omar from comment #4)

Perhaps add an option to use the whole of <body> as input, without any scoring or algorithms. This is not a blanket solution, but if it were allowed to exist alongside the current setup, it could provide a brute-force workaround for this 'bug' while a more sophisticated solution is developed.

Options aren't solutions; most users will have no idea whether they should use this option, plus then we'd have to provide UI for the option, or the option would be unused by the vast majority of users and they'd still get a broken experience.

Categories

(Toolkit :: Reader Mode, defect, P5)


Security

(public)
