Closed Bug 1285543 Opened 9 years ago Closed 9 years ago

Different Yahoo News article title and content is displayed when using Reader Mode

Categories

(Toolkit :: Reader Mode, defect, P2)

x86_64
All
defect

Tracking

()

VERIFIED FIXED
Tracking Status
firefox50 --- affected

People

(Reporter: cmuresan, Assigned: evanxd)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [reader-mode-readability-algorithm])

Attachments

(1 file)

[Affected versions]:
- Nightly 50.0a1

[Affected Platforms]:
- Win 8.1 x64, Win 10 x64
- Ubuntu 14.04 x64, 16.04 x64

[Prerequisites]:
- Set the "print.use_simplify_page" pref to true

[Steps to reproduce]:
1. Go to https://www.yahoo.com/news/?ref=gs
2. Click on an article from the page. (It should open in an overlay.)
3. From the Browser Menu Bar go to File->Print Preview and click the Simplify Page checkbox.
4. Observe the page title and content in the Print Preview.

[Expected result]:
The title and contents of the article should be the same as seen without simplifying the page.

[Actual result]:
"Yahoo News - Latest News & Headlines" is displayed instead of the true article title. The content of the page is replaced with that of a different article.

[Notes]:
- The issue is not reproducible if you open the article in a new tab.
- Attached a screen recording of the issue.
Thanks for finding this, cmuresan. I think this bug is in the Reader Mode component. If you follow steps 1 and 2, and then click on the Reader Mode button in the URL bar, you don't get to see the article - you see "Yahoo News - Latest News & Headlines" and a chunk from the original page. Going to update the bug.
Component: Printing → Reader Mode
Summary: Different Yahoo News article title and content is displayed when using Simplify Page → Different Yahoo News article title and content is displayed when using Reader Mode
Priority: -- → P3
Priority: P3 → --
Priority: -- → P2
Whiteboard: [reader-mode-readability-algorithm]
Blocks: 1286221
I would say this bug might be a WONTFIX issue because the failure is caused by the webpage itself. After finishing step 2 of the steps to reproduce (the article opens in an overlay), the title-related `<meta>` nodes and the `<title>` node in the DOM tree[1] look like this:

```
<meta name="twitter:title" content="Yahoo News - Latest News &amp; Headlines" />
<meta property="og:title" content="Yahoo News - Latest News &amp; Headlines" />
<title>Russia: Space ship malfunctions, breaks up over Siberia</title>
```

A human reader can tell that `<title>` is a more accurate title than the other two. But our algorithm, `var articleTitle = metadata.title || this._getArticleTitle();`[2], takes the title from the `<meta>` nodes first. The `<title>` node is only a second choice, used when the `<meta>` nodes provide no title.

Since we probably can't do AI-style analysis in our algorithm to identify a better title, we may need to WONTFIX this. Alternatively, we could accommodate Yahoo's articles (Yahoo is one of the top 6 websites in the world) and swap the priorities of the two ways of getting a page title. We could do something like `var articleTitle = this._getArticleTitle() || metadata.title;`[3].

What do you think?

[1]: https://raw.githubusercontent.com/evanxd/readability/5ae2e69df3fd7e2b150b7faf57d615d9e961f60b/test/test-pages/yahoo-2/source.html
[2]: https://github.com/mozilla/readability/blob/master/Readability.js#L1853
[3]: https://github.com/evanxd/readability/blob/bug-1285543/Readability.js#L1853
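
For illustration, the ordering problem described in the comment above can be reproduced in isolation. This is a minimal sketch, not Readability.js code: the two title strings come from the markup quoted in the comment, while `pickTitleMetadataFirst` and `pickTitleDocumentFirst` are hypothetical helper names that stand in for the two `||` expressions.

```javascript
// Values taken from the Yahoo overlay markup quoted in the comment.
// In the real DOM the &amp; entity is already decoded to "&".
const metadata = { title: "Yahoo News - Latest News & Headlines" }; // og:title / twitter:title
const documentTitle = "Russia: Space ship malfunctions, breaks up over Siberia"; // <title>

// Hypothetical helper mirroring `metadata.title || this._getArticleTitle()`:
// the <title> text is used only when the metadata has no title.
function pickTitleMetadataFirst(meta, docTitle) {
  return meta.title || docTitle;
}

// Hypothetical helper mirroring `this._getArticleTitle() || metadata.title`:
// the metadata is used only when the <title> text is empty.
function pickTitleDocumentFirst(meta, docTitle) {
  return docTitle || meta.title;
}

console.log(pickTitleMetadataFirst(metadata, documentTitle)); // the stale site-wide title
console.log(pickTitleDocumentFirst(metadata, documentTitle)); // the article title
```

With the metadata-first order, the stale site-wide title wins whenever the page leaves its `<meta>` tags unchanged after opening an article in an overlay, which matches the reported behavior.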
Flags: needinfo?(gijskruitbosch+bugs)
I won't have time to look at this today, and maybe not until (after?) Hawaii. Sorry!
We talked about this in Hawaii. It seems like PageMetadata.jsm (which we'd like to eventually switch to, see bug 1141782) relies on the document title over the metadata. We should probably do the same. To fix this, we can call _getArticleTitle from inside _getArticleMetadata, and only use the metadata "og:title" stuff if _getArticleTitle does not return a valid title. Then at https://github.com/mozilla/readability/blob/master/Readability.js#L1853 we can just use the thing _getArticleMetadata returns (which will be from _getArticleTitle if that returned something valid).
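
The approach described in the comment above could look roughly like the following. This is only a sketch under stated assumptions, not the patch that landed: a plain object stands in for the DOM, the function names drop the leading underscore to mark them as stand-ins for `_getArticleTitle` and `_getArticleMetadata`, and "valid title" is assumed to mean "non-empty after trimming", which the comment does not define.

```javascript
// Stand-in for _getArticleTitle: returns the trimmed <title> text.
// Assumption: the page is modelled as { title: string, metas: [...] }.
function getArticleTitle(doc) {
  return (doc.title || "").trim();
}

// Stand-in for _getArticleMetadata: prefers the document title and only
// consults og:title / twitter:title when that title is not valid.
function getArticleMetadata(doc) {
  const metadata = {};

  const title = getArticleTitle(doc);
  if (title) {
    // Assumption: a non-empty string counts as a valid title.
    metadata.title = title;
    return metadata;
  }

  for (const meta of doc.metas) {
    const key = meta.property || meta.name;
    if ((key === "og:title" || key === "twitter:title") && meta.content) {
      metadata.title = meta.content.trim();
      break;
    }
  }
  return metadata;
}

// Model of the Yahoo overlay page from this bug.
const yahooOverlay = {
  title: "Russia: Space ship malfunctions, breaks up over Siberia",
  metas: [
    { name: "twitter:title", content: "Yahoo News - Latest News & Headlines" },
    { property: "og:title", content: "Yahoo News - Latest News & Headlines" },
  ],
};

console.log(getArticleMetadata(yahooOverlay).title);
```

The caller can then use `metadata.title` directly, since it already carries the document title whenever one is available.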
Flags: needinfo?(gijskruitbosch+bugs)
Assignee: nobody → evan
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Forgot to land the code in m-c.
This bug can be fixed after Bug 1142312's patch has landed.
I had to back out bug 1142312's patch, for the record.
Flags: needinfo?(evan)
Updated patch for fixing the test failure. https://reviewboard.mozilla.org/r/98730/diff/3#index_header
Flags: needinfo?(evan)
Should this get closed since https://hg.mozilla.org/mozilla-central/rev/c3d23c29c47f landed yesterday?
(In reply to Guilherme Lima from comment #13)
> Should this get closed since
> https://hg.mozilla.org/mozilla-central/rev/c3d23c29c47f landed yesterday?

Yes. Thanks for the reminder, Guilherme.
Status: REOPENED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Blocks: 1324630

Marking this as Verified as the issue is no longer reproducible on Firefox builds from 2016-12-20 or the latest Nightly 117.0a1 (BuildID 20230717160307).

Status: RESOLVED → VERIFIED