Open Bug 1424032 Opened 7 years ago Updated 2 years ago

Reader View chooses <title> tag over title displayed on webpage (in <h1> or other in-page content)

Categories

(Toolkit :: Reader Mode, defect, P3)

57 Branch

People

(Reporter: ph, Unassigned)

Details

(Whiteboard: [reader-mode-readability-algorithm])

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:57.0) Gecko/20100101 Firefox/57.0
Build ID: 20171128222554

Steps to reproduce:

Navigate to a webpage whose displayed title differs from its <title> tag, then click on the Reader View button. A couple of examples:

https://www.palmspringslife.com/broadways-best-franks-place/
article title: Broadway’s Best at Frank’s Place
<title> tag: Frank's Place Delivers Feel-Good Costumed Cabaret Performance

https://www.vogue.com/article/meghan-markle-biracial-identity-politics-personal-essay
article title: The Problem With Calling Meghan Markle the “First Black Princess”
<title> tag: On Meghan Markle, Race, and Royalty - Vogue


Actual results:

The title displayed in Reader View is the <title> tag.


Expected results:

The title displayed in Reader View should be the title displayed on the webpage.
Summary: Reader View chooses <title> tag over article title → Reader View chooses <title> tag over title displayed on webpage
Component: Untriaged → Reader Mode
Product: Firefox → Toolkit
TBH, I seem to recall we used to try to do this and then people complained because we made the inverse mistake: we picked a subheading (because the "real" heading and the subheading were badly marked up) over the <title> tag (which would have been a better fallback). Anyway, in principle this is a valid complaint, so I'll file it in the backlog to look into at some point...
Status: UNCONFIRMED → NEW
Ever confirmed: true
Priority: -- → P3
Summary: Reader View chooses <title> tag over title displayed on webpage → Reader View chooses <title> tag over title displayed on webpage (in <h1> or other in-page content)
Whiteboard: [reader-mode-readability-algorithm]
In general – and this comment applies to Bug 1424036 as well – rather than relying on metadata and markup to determine what gets included in Reader View (and thereby sometimes omitting important information), I'd prefer that Reader View err on the side of what's actually shown on the webpage (even if that sometimes includes unimportant information). But maybe I'm in the minority there.
(In reply to Patrick Hubenthal from comment #2)
> In general – and this comment applies to Bug 1424036 as well – rather than
> relying on metadata and markup to determine what gets included in Reader
> View (and thereby sometimes omitting important information), I'd prefer that
> Reader View err on the side of what's actually shown on the webpage (even if
> that sometimes includes unimportant information). But maybe I'm in the
> minority there.

Well, that in itself is a perfectly reasonable point. The problem is working out *which* things on the website are the article title and which are just subheadings. Picking the "highest" header (i.e. <h1> over <h2>, <h2> over <h3>) sadly often won't work because people don't use the headers appropriately (just look at https://bugzilla.mozilla.org/show_bug.cgi?id=1198731 (multiple <h1>) and https://bugzilla.mozilla.org/show_bug.cgi?id=1259763 (using <h2> for article leader text which isn't a header)), and even when they do, the top-level header might contain the website title ("CNN", "The Atlantic", ...) and not the article title, which is the bit we want.

Just keeping everything means we get a cluttered display because we don't know which header tag contains the article title, and that loses some of the nice minimalist styling we have and potentially duplicates information.

In practice, the document title is a reasonable "guess", but unfortunately as you found, it is not perfect. I'm happy to take patches to improve things, but getting that right won't be easy.
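
To make the trade-off above concrete, here is a rough sketch of one conservative heuristic: trust the <h1> only when the page has exactly one and it is long enough not to be just a site name, and fall back to the document title otherwise. This is not what Readability does; the names `TitleCandidates` and `guess_article_title` and the `min_words` threshold are made up for illustration, and the threshold is arbitrary.

```python
from html.parser import HTMLParser


class TitleCandidates(HTMLParser):
    """Collect the text of <title> and of every <h1> in a document."""

    def __init__(self):
        super().__init__()
        self.doc_title = ""
        self.h1_texts = []
        self._current = None  # tag whose text is currently being captured
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1"):
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._current:
            return
        # Collapse runs of whitespace so line breaks in markup don't matter.
        text = " ".join("".join(self._buffer).split())
        if tag == "title":
            self.doc_title = text
        elif text:
            self.h1_texts.append(text)
        self._current = None


def guess_article_title(html, min_words=3):
    """Hypothetical heuristic: prefer a single, sufficiently long <h1>.

    Falls back to <title> when there are zero or several <h1> elements,
    or when the only <h1> is so short it is probably a site name.
    """
    parser = TitleCandidates()
    parser.feed(html)
    parser.close()
    if len(parser.h1_texts) == 1:
        candidate = parser.h1_texts[0]
        if len(candidate.split()) >= min_words:
            return candidate
    return parser.doc_title
```

Even this cautious version would still get it wrong on pages that mark up a subheading as their only <h1>, which is the inverse mistake mentioned in comment 1, so it only illustrates why getting this right won't be easy.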
Severity: normal → S3