Open Bug 1325999 Opened 7 years ago Updated 2 years ago

Reader mode does not display full content (MS Word generated HTML with <table>s for a comment/footnote column)

Categories

(Toolkit :: Reader Mode, defect, P3)

50 Branch
defect

People

(Reporter: lord.of.the.flies.0, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [reader-mode-readability-algorithm])

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:50.0) Gecko/20100101 Firefox/50.0
Build ID: 20161208153507

Steps to reproduce:

I accessed the article on the following web page: 

http://lolajournal.com/7/lola_montes.html

On this page, the Reader Mode button appeared in the address bar, as expected for an article. I clicked the button to activate Reader Mode.


Actual results:

When Reader Mode was activated, only three paragraphs of the article were displayed: those starting with "Although Lola's..." and ending with "capitalist industry." These paragraphs are in the middle of the page; they are neither the first few nor the last few paragraphs.


Expected results:

The full contents of the article should have been displayed in Reader Mode, from beginning to end, instead of a few paragraphs from the middle of the page.
Component: Untriaged → RSS Discovery and Preview
OS: Unspecified → Mac OS X
Component: RSS Discovery and Preview → General
Component: General → Reader Mode
Product: Firefox → Toolkit
Here is another example of the bug occurring:

https://choppedupjazz.squarespace.com/news/2016/12/31/kamasi-washington-helps-rtj-deliver-a-christmas-miracle

On this page, the Reader Mode output starts with the paragraph that begins with "A bold continuation...", ignoring the three paragraphs above it.
Unfortunately, the second example is quite different in terms of markup, and will require a different solution. I will split it out to a separate bug, and make this bug specifically about the first one.
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Mac OS X → All
Priority: -- → P3
Hardware: Unspecified → All
Summary: Reader mode does not display full content → Reader mode does not display full content (MS Word generated HTML with <table>s for a comment/footnote column)
Whiteboard: [reader-mode-readability-algorithm]
Ah. What about this one? From a NYTimes article: https://www.nytimes.com/2014/11/30/books/review/nothing-is-true-and-everything-is-possible-by-peter-pomerantsev.html

Activating Reader mode omits the first four paragraphs and begins with "“TV is the only force..." This is a pretty major site, so it's a bit problematic that Firefox can't parse it properly.
(In reply to lord.of.the.flies.0 from comment #3)
> Ah. What about this one? From a NYTimes article:
> https://www.nytimes.com/2014/11/30/books/review/nothing-is-true-and-
> everything-is-possible-by-peter-pomerantsev.html
> 
> Activating Reader mode omits the first four paragraphs and begins with "“TV
> is the only force..." This is a pretty major site, so it's a bit problematic
> that Firefox can't parse it properly.

Yes, the NYT issues are tracked in bug 1300697. Again, the cause of things not working is often different from site to site. Sites don't expose "hey, here's my main content" in any structured way. We have to parse the page and then try to guess. That's hard, so in some cases it does not work correctly.
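
To illustrate the "parse and then guess" problem, here is a minimal sketch of a text-length heuristic. It is not the actual Reader Mode algorithm, which is considerably more elaborate, and it is not a confirmed diagnosis of this bug. The names `CANDIDATE_TAGS`, `CandidateScorer`, and `guess_main_content` are hypothetical, and the sketch assumes well-formed nesting. It only shows how a guesser that picks the single highest-scoring container can drop the rest of an article when the text is split across table cells.

```python
from html.parser import HTMLParser

# Hypothetical set of container tags treated as candidate "main content" blocks.
CANDIDATE_TAGS = {"div", "td", "article", "section"}


class CandidateScorer(HTMLParser):
    """Sums the length of <p> text per innermost candidate container."""

    def __init__(self):
        super().__init__()
        self.stack = []   # indices of currently open candidate containers
        self.scores = []  # scores[i] = characters of <p> text in candidate i
        self.labels = []  # labels[i] = tag name plus id attribute, if any
        self.in_p = 0     # depth of open <p> elements

    def handle_starttag(self, tag, attrs):
        if tag in CANDIDATE_TAGS:
            ident = dict(attrs).get("id") or ""
            self.labels.append(f"{tag}#{ident}")
            self.scores.append(0)
            self.stack.append(len(self.scores) - 1)
        elif tag == "p":
            self.in_p += 1

    def handle_endtag(self, tag):
        # Assumes well-formed markup, so the closing tag matches the stack top.
        if tag in CANDIDATE_TAGS and self.stack:
            self.stack.pop()
        elif tag == "p" and self.in_p:
            self.in_p -= 1

    def handle_data(self, data):
        # Only paragraph text inside a candidate container counts.
        if self.in_p and self.stack:
            self.scores[self.stack[-1]] += len(data.strip())


def guess_main_content(html):
    """Return the label of the highest-scoring container, or None."""
    scorer = CandidateScorer()
    scorer.feed(html)
    scorer.close()
    if not scorer.scores:
        return None
    best = max(range(len(scorer.scores)), key=scorer.scores.__getitem__)
    return scorer.labels[best]


page = """
<table>
  <tr><td id="intro"><p>Short opening paragraph.</p></td></tr>
  <tr><td id="middle"><p>A much longer run of body text that dominates the score.</p>
      <p>Another long paragraph in the same cell adds to it.</p></td></tr>
  <tr><td id="end"><p>Brief closing paragraph.</p></td></tr>
</table>
"""
print(guess_main_content(page))  # td#middle
```

In this toy page the article is spread over three cells, but the heuristic returns only the middle one, so the opening and closing paragraphs would be lost. A more robust guesser would need to consider sibling or ancestor containers as well, and the right fix is likely to differ from one site's markup to another.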
Blocks: 1329358
Severity: normal → S3