Open Bug 1453810 Opened 6 years ago Updated 2 years ago

Some articles at audiotorrentz.org offer Reader Mode (and the "Simplify Page" checkbox), while others do not

Categories

(Toolkit :: Reader Mode, defect, P3)

People

(Reporter: alex_mayorga, Unassigned)

Details

(Keywords: nightly-community, Whiteboard: [reader-mode-readability-algorithm])

Hello!

Filing this bug on behalf of the SuMo user at https://support.mozilla.org/questions/1213327

The "Simplify Page" check box is greyed out for this page:

https://audiotorrentz.org/2017/09/11/refx-nexus-2-pack-vsti-presets-expansiones-torrent/

Kindly look into this issue.

Thanks!
Alex
Hello!

To clarify: this happens while doing a "Print Preview" of the URL mentioned above.

Thanks!
Alex
Hmm, I'm seeing this grayed out even when print-previewing simple pages like:
 http://www.example.org/
 about:blank
 data:text/plain,abc

I have yet to find a page where it's *not* grayed out.  And, using example.org as my testcase, it seems to be grayed out as far back as 2016-07-12 when this feature was first enabled in Nightly (via bug 1285607).

mconley / jaws, do you know what's going on here?  Are you seeing behavior that's better than what I'm seeing (checkbox universally disabled all the way back to the day this was first enabled)?
Depends on: 1285607
Flags: needinfo?(mconley)
The Simplify Page feature uses Readability.js / Reader Mode stuff under the hood to do the simplification. If the checkbox is greyed out, this means that Readability.js failed to extract an article from the page that could potentially be put into Reader Mode (or Simplify Page).

There are certain cases that Readability.js just doesn't handle too well these days. example.org is one, it seems. The mostly empty document with just "abc" in it is another. I think Readability.js is mostly looking for things like <article>, <section>, and other HTML element cues for it to extract the relevant text.
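To illustrate why a near-empty page ends up with a greyed-out checkbox, here is a rough TypeScript sketch of a "does this page look like an article?" gate. It is not the actual Readability.js or Firefox code: the names `TextBlock`, `looksReaderable`, and `simplifyPageEnabled`, the candidate tags, and both thresholds are assumptions made up for this example.

```typescript
// Hypothetical sketch only: NOT the real Readability.js implementation.
// A page is modelled as a flat list of text blocks instead of a DOM.
interface TextBlock {
  tag: string;  // element name, e.g. "p", "pre", "article", "div"
  text: string; // text content of that element
}

// Illustrative thresholds (assumptions for this sketch).
const MIN_BLOCK_LENGTH = 140; // shorter blocks are ignored entirely
const MIN_SCORE = 20;         // total score needed to count as an article
const CANDIDATE_TAGS = new Set(["p", "pre", "article"]);

// Returns true when enough long, article-like text blocks are present.
function looksReaderable(blocks: TextBlock[]): boolean {
  let score = 0;
  for (const block of blocks) {
    if (!CANDIDATE_TAGS.has(block.tag.toLowerCase())) continue;
    const length = block.text.trim().length;
    if (length < MIN_BLOCK_LENGTH) continue;
    // Longer blocks contribute more, with diminishing returns.
    score += Math.sqrt(length - MIN_BLOCK_LENGTH);
    if (score > MIN_SCORE) return true;
  }
  return false;
}

// In this sketch the "Simplify Page" checkbox simply mirrors the gate.
function simplifyPageEnabled(blocks: TextBlock[]): boolean {
  return looksReaderable(blocks);
}

// A page like data:text/plain,abc has almost no text to score.
const nearlyEmpty: TextBlock[] = [{ tag: "pre", text: "abc" }];
// A page with a couple of long paragraphs passes the gate.
const articleLike: TextBlock[] = [
  { tag: "p", text: "x".repeat(400) },
  { tag: "p", text: "x".repeat(400) },
];

console.log(simplifyPageEnabled(nearlyEmpty)); // false
console.log(simplifyPageEnabled(articleLike)); // true
```

Under these assumed rules, two pages from the same site can land on opposite sides of the threshold purely because of how their text is split across elements.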

Anyhow, does that help explain what's happening here?
Flags: needinfo?(mconley) → needinfo?(alex_mayorga)
Ah right -- it's non-obvious, but "simplify page" is based on whether or not we think we can offer a "reader view" for the page.

And from the SuMo report, here's one page that does offer Reader View (and simplify page in print-preview as a result):
https://audiotorrentz.org/2017/09/07/rob-papen-predator-2-vsti-aax-x86-x64/
...but here's another that does *not* offer reader view (or simplify page):
https://audiotorrentz.org/2017/09/11/refx-nexus-2-pack-vsti-presets-expansiones-torrent/

So I think this bug is basically asking the question: why does one of those pages get reader view [and the benefits that come with it] whereas the other does not?  --> Transferring needinfo to jaws who IIRC was involved with reader mode.

(The reporter on SuMo seems to have the impression that this regressed recently, but I don't think it actually did -- my guess is that the reporter only just stumbled on a page that has trouble, and/or maybe this particular site just changed something in its markup in a way that trips up our Reader-Mode heuristics for some of its pages.)
(In reply to Daniel Holbert [:dholbert] (away 4/24 - 5/11) from comment #4)
> --> Transferring needinfo to jaws who IIRC was involved with reader mode.

(Er, I forgot to set needinfo, but I probably won't after all)

So I think this is basically a "Readability.js can't handle every article in the world" bug (and it can't handle all articles on the same site in some cases).  I'm not sure how actively Readability.js is being worked on at this moment, but to the extent that there's something to be done here, the task would be to investigate why Readability.js likes one of the links from comment 4 but not the other.

(Also: I think we should consider hiding this UI rather than graying it out, for pages that Readability.js doesn't support. I made a note of that in Bug 1440643 comment 3, and I think that'd kinda help here as well.)
Component: Print Preview → Reader Mode
OS: Windows 10 → All
Product: Core → Toolkit
Hardware: x86 → All
Summary: The "Simplify Page" check box is greyed out → Some articles at audiotorrentz.org offer Reader Mode (and the "Simplify Page" checkbox), while others do not
Version: 59 Branch → Trunk
Priority: -- → P3
Whiteboard: [reader-mode-readability-algorithm]
Hello Mike!

I don't think I am in a position to answer this n?, so I'll just clear it out for now.

You might want to n? somebody more knowledgeable on this.

I'm sorry =(

Thanks!
Alex
Flags: needinfo?(alex_mayorga)
Severity: normal → S3