Closed Bug 782285 Opened 12 years ago Closed 12 years ago

Reader Mode is offered for inappropriate websites

Tracking

(firefox16 verified, firefox17 fixed)

Status:

VERIFIED FIXED

Milestone:

Firefox 17

Tracking Flags:

Tracking

Status

firefox16

---

verified

firefox17

---

fixed

People

(Reporter: xti, Unassigned)

References

Details

Attachments

(2 files, 2 obsolete files)

screenshot 12 years ago Cristian Nicolae (:xti) 330.88 KB, image/png		Details
Don't offer reader in pages with too much reading competition 12 years ago Lucas Rocha (:lucasr) 5.75 KB, patch	bnicholson : feedback-	Details \| Diff \| Splinter Review
Don't offer reader in pages with too much reading competition * * 12 years ago Lucas Rocha (:lucasr) 5.75 KB, patch	mfinkle : review+ bnicholson : feedback+ akeybl : approval-mozilla-aurora+	Details \| Diff \| Splinter Review
Break line one log messages in Readability 12 years ago Lucas Rocha (:lucasr) 984 bytes, patch		Details \| Diff \| Splinter Review

Cristian Nicolae (:xti)

Reporter

Description

•

12 years ago

Attached image screenshot — Details

Firefox 17.0a1 (2012-08-13)
Device: Galaxy Nexus
OS: Android 4.1.1

Steps to reproduce:
1. Open Fennec
2. Go to http://www.youtube.com/watch?v=hu2rRqFazuA (YouTube desktop site)
3. Tap on the video to play it
4. Long tap on the video
5. Select "Pop up" from the context menu
6. Check if Reader Mode icon is displayed for the new opened tab

Expected result:
The Reader Mode is not offered for inappropriate pages.

Actual result:
The Reader Mode icon is displayed in URL Bar for the tab opened at step 5 as you can see in the attached screenshot.

Cristian Nicolae (:xti)

Reporter

Updated

•

12 years ago

Blocks: 779796

Kevin Brosnan [Ex-Mozilla]

Updated

•

12 years ago

Blocks: reader
No longer blocks: 779796

Lucas Rocha (:lucasr)

Comment 1

•

12 years ago

Attached patch Don't offer reader in pages with too much reading competition (obsolete) — Details — Splinter Review

Attachment #653757 - Flags: review?

Lucas Rocha (:lucasr)

Updated

•

12 years ago

Attachment #653757 - Flags: review?(mark.finkle)

Attachment #653757 - Flags: review?

Attachment #653757 - Flags: feedback?(bnicholson)

Lucas Rocha (:lucasr)

Comment 2

•

12 years ago

This patch greatly improves the readability check results for pages that are not articles. Pages like nytimes.com/technology or http://edition.cnn.com/WORLD should not have reader enabled. Here's a test build:

http://dl.dropbox.com/u/1187037/better-reader-check.apk

Brian Nicholson (:bnicholson)

Comment 3

•

12 years ago

Comment on attachment 653757 [details] [diff] [review]
Don't offer reader in pages with too much reading competition

+        for (let t = 0; t < this.N_TOP_CANDIDATES; t++) {
+          let aTopCandidate = topCandidates[t];
+
+          if (!aTopCandidate || candidateScore > aTopCandidate.readability.contentScore) {
+            if (t + 1 < this.N_TOP_CANDIDATES && aTopCandidate)
+              topCandidates[t + 1] = aTopCandidate;
+
+            topCandidates[t] = candidate;
+            break;
+          }

Wouldn't it be better to shift the list of top candidates rather than just replacing the next candidate? Otherwise, if you have a list of scores [20,15,3,2,1], and you try to add a candidate with score 25, you'd end up with [25,20,3,2,1], whereas I think you'd want [25,20,15,3,2]. Or if you read a page of candidates 1,2,3,4,5, you'd end up with [5,4], but I think you'd want [5,4,3,2,1].


It looks like this breaks some pages that should be articles, e.g.:
* http://en.m.wikipedia.org/wiki/Mozilla
* http://mobile.slate.com/posts/2012/06/22/new_law_threatens_mississippi_s_last_abortion_clinic.html

It also doesn't filter out Google search queries, which I would have expected to fail, e.g.:
http://www.google.com/search?q=query&ie=utf-8&oe=utf-8&channel=fs

For readability, I think it's better to err on the side of having more false positives as opposed to more false negatives. Showing the icon for unreadable pages is annoying, but not showing the icon for readable pages reduces defeats the purpose of reader mode. Having a competition algorithm is a good idea, but it's probably better to tweak it more first to make sure we have as few false negatives as possible.

Attachment #653757 - Flags: feedback?(bnicholson) → feedback-

Lucas Rocha (:lucasr)

Comment 4

•

12 years ago

(In reply to Brian Nicholson (:bnicholson) from comment #3)
> Comment on attachment 653757 [details] [diff] [review]
> Don't offer reader in pages with too much reading competition
> 
> +        for (let t = 0; t < this.N_TOP_CANDIDATES; t++) {
> +          let aTopCandidate = topCandidates[t];
> +
> +          if (!aTopCandidate || candidateScore >
> aTopCandidate.readability.contentScore) {
> +            if (t + 1 < this.N_TOP_CANDIDATES && aTopCandidate)
> +              topCandidates[t + 1] = aTopCandidate;
> +
> +            topCandidates[t] = candidate;
> +            break;
> +          }
> 
> Wouldn't it be better to shift the list of top candidates rather than just
> replacing the next candidate? Otherwise, if you have a list of scores
> [20,15,3,2,1], and you try to add a candidate with score 25, you'd end up
> with [25,20,3,2,1], whereas I think you'd want [25,20,15,3,2]. Or if you
> read a page of candidates 1,2,3,4,5, you'd end up with [5,4], but I think
> you'd want [5,4,3,2,1].
> 
> 
> It looks like this breaks some pages that should be articles, e.g.:
> * http://en.m.wikipedia.org/wiki/Mozilla

Known issue but we're not doing the right thing with Wikipedia pages anyway (see bug 782423). Wikipedia is very unique kind of page. We'll have to sort out the best way to handle it in that separate bug.

I personally prefer to not offer reader mode at all instead of offering and showing something totally unexpected. So, in a way, this is an improvement :-)

> http://mobile.slate.com/posts/2012/06/22/
> new_law_threatens_mississippi_s_last_abortion_clinic.html

This will be fixed once we properly convert DIVs to Ps. This is probably unrelated to this change. I'll double-check.

> It also doesn't filter out Google search queries, which I would have
> expected to fail, e.g.:
> http://www.google.com/search?q=query&ie=utf-8&oe=utf-8&channel=fs

Interesting. I tested it on a few Google search queries and it worked fine. Might be a mobile site thing. Will have a look.

> For readability, I think it's better to err on the side of having more false
> positives as opposed to more false negatives. Showing the icon for
> unreadable pages is annoying, but not showing the icon for readable pages
> reduces defeats the purpose of reader mode. Having a competition algorithm
> is a good idea, but it's probably better to tweak it more first to make sure
> we have as few false negatives as possible.

Agree that we should err on the false positives and I tweaked the maximum contrast ratio to do exactly that. There's no 100% accurate way of doing this. From my tests on a long list of websites (nytimes, cnn, arstechnica, latimes, tons of blogs, and much more) the results have always err'd on the false positive side.

Anyway, I posted the test build exactly for that: to get an idea of where it fails when it really shouldn't. I wouldn't block on things like google.com case though (i.e. this is not a regression). We should be incrementally tweaking the readability code to improve its accuracy.

Lucas Rocha (:lucasr)

Comment 5

•

12 years ago

Attached patch Don't offer reader in pages with too much reading competition * * — Details — Splinter Review

Attachment #653850 - Flags: review?

Lucas Rocha (:lucasr)

Updated

•

12 years ago

Attachment #653757 - Attachment is obsolete: true

Attachment #653757 - Flags: review?(mark.finkle)

Lucas Rocha (:lucasr)

Updated

•

12 years ago

Attachment #653850 - Flags: review?(mark.finkle)

Attachment #653850 - Flags: review?

Attachment #653850 - Flags: feedback?(bnicholson)

Brian Nicholson (:bnicholson)

Comment 6

•

12 years ago

I'm still worried about http://mobile.slate.com/posts/2012/06/22/new_law_threatens_mississippi_s_last_abortion_clinic.html; it looks like it parses fine before this patch, but doesn't work with it applied. Can you confirm the DIV->P patch fixes it?

Lucas Rocha (:lucasr)

Comment 7

•

12 years ago

(In reply to Brian Nicholson (:bnicholson) from comment #6)
> I'm still worried about
> http://mobile.slate.com/posts/2012/06/22/
> new_law_threatens_mississippi_s_last_abortion_clinic.html; it looks like it
> parses fine before this patch, but doesn't work with it applied. Can you
> confirm the DIV->P patch fixes it?

Yes, just confirmed that it fixes it. The reason is simple: without the DIV->P patch, the DIVs-with-P elements in mobile.slate.com were being considered candidates on their own right.

Lucas Rocha (:lucasr)

Comment 8

•

12 years ago

Attached patch Break line one log messages in Readability (obsolete) — Details — Splinter Review

Attachment #654199 - Flags: review?(bnicholson)

Brian Nicholson (:bnicholson)

Comment 9

•

12 years ago

(In reply to Lucas Rocha (:lucasr) from comment #8)
> Created attachment 654199 [details] [diff] [review]
> Break line one log messages in Readability

Is this just for the Readability XULRunner test? I didn't think this happened with Readability.js on Android.

Brian Nicholson (:bnicholson)

Updated

•

12 years ago

Attachment #653850 - Flags: feedback?(bnicholson) → feedback+

Mark Finkle (:mfinkle) (use needinfo?)

Comment 10

•

12 years ago

Comment on attachment 653850 [details] [diff] [review]
Don't offer reader in pages with too much reading competition  * *

nit: aTopCandidate looks a bit odd since we use "a" + "Something" as a function param. Maybe just use "topCandidate" ?

Attachment #653850 - Flags: review?(mark.finkle) → review+

Lucas Rocha (:lucasr)

Comment 11

•

12 years ago

(In reply to Mark Finkle (:mfinkle) from comment #10)
> Comment on attachment 653850 [details] [diff] [review]
> Don't offer reader in pages with too much reading competition  * *
> 
> nit: aTopCandidate looks a bit odd since we use "a" + "Something" as a
> function param. Maybe just use "topCandidate" ?

I didn't use "topCandidate" there to avoid confusion with the important variable called topCandidate below.

Lucas Rocha (:lucasr)

Comment 12

•

12 years ago

Pushed: https://hg.mozilla.org/integration/mozilla-inbound/rev/55c4a3f3a6a9

Lucas Rocha (:lucasr)

Comment 13

•

12 years ago

Comment on attachment 653850 [details] [diff] [review]
Don't offer reader in pages with too much reading competition  * *

[Approval Request Comment]
User impact if declined: We're offering reader mode on pages which are more like lists of articles instead of actual readable content. We want to avoid these false positive cases as much as possible in reader.
Testing completed (on m-c, etc.): Tested locally with a long list of major and minor websites and blogs. The change didn't cause any false negative in my tests. There are still some remaining false positives which I intent to cover in a follow-up bugs though.
Risk to taking this patch (and alternatives if risky): Medium. There's no 100% accurate way of predicting how actually readable a page should be. This patch sets a higher bar for pages to be considered reader mode material. There's a risk of adding false negative cases but the best way to find out about those is to have this patch in the wild.
String or UUID changes made by this patch: None.

Attachment #653850 - Flags: approval-mozilla-aurora?

Lucas Rocha (:lucasr)

Comment 14

•

12 years ago

Comment on attachment 654199 [details] [diff] [review]
Break line one log messages in Readability

No big deal, dropping this one.

Attachment #654199 - Attachment is obsolete: true

Attachment #654199 - Flags: review?(bnicholson)

Ryan VanderMeulen [:RyanVM]

Comment 15

•

12 years ago

https://hg.mozilla.org/mozilla-central/rev/55c4a3f3a6a9

Status: NEW → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED

Target Milestone: --- → Firefox 17

Alex Keybl [:akeybl]

Comment 16

•

12 years ago

(In reply to Lucas Rocha (:lucasr) from comment #13)
> Comment on attachment 653850 [details] [diff] [review]
> Don't offer reader in pages with too much reading competition  * *
> 
> [Approval Request Comment]
> User impact if declined: We're offering reader mode on pages which are more
> like lists of articles instead of actual readable content. We want to avoid
> these false positive cases as much as possible in reader.

^ Given the user impact here

> Risk to taking this patch (and alternatives if risky): Medium. There's no
> 100% accurate way of predicting how actually readable a page should be. This
> patch sets a higher bar for pages to be considered reader mode material.
> There's a risk of adding false negative cases but the best way to find out
> about those is to have this patch in the wild.

and the fallout being more false negatives (which wouldn't feel like a regression as it's a new feature), let's get this onto Aurora as soon as possible.

Alex Keybl [:akeybl]

Updated

•

12 years ago

Attachment #653850 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+

Lucas Rocha (:lucasr)

Comment 17

•

12 years ago

Pushed to aurora: https://hg.mozilla.org/releases/mozilla-aurora/rev/e978f8eda190

Aaron Train [:aaronmt]

Updated

•

12 years ago

status-firefox16: --- → fixed

status-firefox17: --- → fixed

Aaron Train [:aaronmt]

Updated

•

12 years ago

Status: RESOLVED → VERIFIED

status-firefox16: fixed → verified

henryfhchan

Comment 18

•

11 years ago

What should I do if reader mode icon never shows up even on news sites (e.g. CNN)?

Aaron Train [:aaronmt]

Comment 19

•

11 years ago

(In reply to henryfhchan from comment #18)
> What should I do if reader mode icon never shows up even on news sites (e.g.
> CNN)?

Depending on your Android device you might be yielding the behaviour added in bug 828124 where we've set a memory threshold for using reader-mode

BMO Automation

Updated

•

3 years ago

Product: Firefox for Android → Firefox for Android Graveyard

You need to log in before you can comment on or make changes to this bug.