Closed Bug 1259763 Opened 8 years ago Closed 7 years ago

Reader mode omits opening paragraph on CNN articles

Categories

(Toolkit :: Reader Mode, defect, P3)

RESOLVED FIXED

People

(Reporter: abr, Assigned: evanxd)

References

(Blocks 3 open bugs)

Details

(Whiteboard: [reader-mode-readability-algorithm])

Articles on CNN frequently have an opening paragraph that is styled differently from the rest of the article. Reader mode omits this lead paragraph from its view.

See, for example, http://money.cnn.com/2016/02/01/news/economy/poverty-inequality-united-states/index.html
Thanks for the report. This sounds like an issue with the Readability library, so I filed an issue here: https://github.com/mozilla/readability/issues/281
Priority: -- → P3
Whiteboard: [reader-mode-readability-algorithm]
Blocks: 1286221
I can no longer reproduce this. The webpage[1] seems to have changed: there is no opening paragraph there, and the reader mode result is good.

[1]: http://money.cnn.com/2016/02/01/news/economy/poverty-inequality-united-states/index.html
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME
(In reply to Evan Tseng [:evanxd][:愛聞插低] from comment #2)
> Cannot reproduce anymore. The webpage[1] seems already changed. There is no
> opening paragraph there

I see:

The U.S. has long been heralded as a land of opportunity -- a place where anyone can succeed regardless of the economic class they were born into.

on the page in a different font, and that paragraph does not make it into the reader mode result. Are you seeing something else?

> and the reader mode result is good.
> 
> [1]:
> http://money.cnn.com/2016/02/01/news/economy/poverty-inequality-united-
> states/index.html
Flags: needinfo?(evan)
Gijs, I see the same. I do know of cases where large sites serve different markup to different regions.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Status: REOPENED → NEW
Blocks: 1324630
I can reproduce the issue mentioned in comment 3. The `<h2>The U.S. has long been heralded as a land of opportunity -- a place where anyone can succeed regardless of the economic class they were born into.</h2>` node is being removed, for a reason I have not identified yet.

The good news is that the algorithm chooses the correct `topCandidate` (`<div id="storytext">`).

I will continue investigating the issue.
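
For anyone who wants to check a page for the same symptom, here is a minimal sketch of a regression check. It does not call Readability itself. It only takes the original HTML and whatever plain text an extractor produced, and reports whether the first `<h2>` inside the container survived. The container id `storytext` and the `<h2>` tag come from the observation above; the names `LeadHeadingFinder`, `find_lead_heading`, and `lead_survives` are hypothetical helpers for illustration, and the depth tracking assumes reasonably well-formed markup.

```python
from html.parser import HTMLParser


class LeadHeadingFinder(HTMLParser):
    """Collects the text of the first <h2> nested inside the element with a given id."""

    # Void elements never get a closing tag, so they are ignored for depth tracking.
    VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "source", "track", "wbr"}

    def __init__(self, container_id):
        super().__init__()
        self.container_id = container_id
        self.depth = 0           # 0 means we are outside the container
        self.in_heading = False  # True while inside the first <h2>
        self.done = False        # True once the first <h2> has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.VOID_TAGS:
            return
        if self.depth == 0:
            # Still outside: only the container element itself starts tracking.
            if dict(attrs).get("id") == self.container_id:
                self.depth = 1
            return
        self.depth += 1
        if tag == "h2" and not self.done:
            self.in_heading = True

    def handle_endtag(self, tag):
        if tag in self.VOID_TAGS or self.depth == 0:
            return
        if tag == "h2" and self.in_heading:
            self.in_heading = False
            self.done = True
        self.depth -= 1

    def handle_data(self, data):
        if self.in_heading:
            self.parts.append(data)


def normalize(text):
    """Collapse all whitespace runs so line wrapping does not affect comparison."""
    return " ".join(text.split())


def find_lead_heading(html, container_id="storytext"):
    """Return the normalized text of the first <h2> in the container, or None."""
    finder = LeadHeadingFinder(container_id)
    finder.feed(html)
    finder.close()
    return normalize("".join(finder.parts)) if finder.done else None


def lead_survives(original_html, extracted_text, container_id="storytext"):
    """True if the lead heading text appears in the extracted text.

    Also returns True when the page has no lead heading, since nothing can be lost.
    """
    lead = find_lead_heading(original_html, container_id)
    if lead is None:
        return True
    return lead in normalize(extracted_text)
```

Calling `lead_survives(page_html, reader_text)` with the saved page source and the reader mode text returns `False` when the lead `<h2>` has been dropped, which is the behavior reported here.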
Flags: needinfo?(evan)
Assignee: nobody → evan
Status: NEW → ASSIGNED
Sent a PR[1] with the solution. Let's discuss it there.

[1]: https://github.com/mozilla/readability/pull/347/commits/a0f94b1869b5188dfad66d1d5cc8b6270c5bc4f2
Updated the patch to use a new solution to fix the issue[1].

[1]: https://github.com/mozilla/readability/pull/347/commits/64e97fead34ed567025109c2b6df0ae2d8a40db4
We'll land it in m-c via the MozReview patch[1].

[1]: https://reviewboard.mozilla.org/r/109976/diff/2#index_header
Status: ASSIGNED → RESOLVED
Closed: 8 years ago → 7 years ago
Resolution: --- → FIXED
Blocks: 1329358