Closed
Bug 782285
Opened 12 years ago
Closed 12 years ago
Reader Mode is offered for inappropriate websites
Categories
(Firefox for Android Graveyard :: Reader View, defect)
Tracking
(firefox16 verified, firefox17 fixed)
VERIFIED
FIXED
Firefox 17
People
(Reporter: xti, Unassigned)
References
Details
Attachments
(2 files, 2 obsolete files)
330.88 KB,
image/png
|
Details | |
5.75 KB,
patch
|
mfinkle
:
review+
bnicholson
:
feedback+
akeybl
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
Firefox 17.0a1 (2012-08-13) Device: Galaxy Nexus OS: Android 4.1.1 Steps to reproduce: 1. Open Fennec 2. Go to http://www.youtube.com/watch?v=hu2rRqFazuA (YouTube desktop site) 3. Tap on the video to play it 4. Long tap on the video 5. Select "Pop up" from the context menu 6. Check if Reader Mode icon is displayed for the new opened tab Expected result: The Reader Mode is not offered for inappropriate pages. Actual result: The Reader Mode icon is displayed in URL Bar for the tab opened at step 5 as you can see in the attached screenshot.
Updated•12 years ago
|
Comment 1•12 years ago
|
||
Attachment #653757 -
Flags: review?
Updated•12 years ago
|
Attachment #653757 -
Flags: review?(mark.finkle)
Attachment #653757 -
Flags: review?
Attachment #653757 -
Flags: feedback?(bnicholson)
Comment 2•12 years ago
|
||
This patch greatly improves the readability check results for pages that are not articles. Pages like nytimes.com/technology or http://edition.cnn.com/WORLD should not have reader enabled. Here's a test build: http://dl.dropbox.com/u/1187037/better-reader-check.apk
Comment 3•12 years ago
|
||
Comment on attachment 653757 [details] [diff] [review] Don't offer reader in pages with too much reading competition + for (let t = 0; t < this.N_TOP_CANDIDATES; t++) { + let aTopCandidate = topCandidates[t]; + + if (!aTopCandidate || candidateScore > aTopCandidate.readability.contentScore) { + if (t + 1 < this.N_TOP_CANDIDATES && aTopCandidate) + topCandidates[t + 1] = aTopCandidate; + + topCandidates[t] = candidate; + break; + } Wouldn't it be better to shift the list of top candidates rather than just replacing the next candidate? Otherwise, if you have a list of scores [20,15,3,2,1], and you try to add a candidate with score 25, you'd end up with [25,20,3,2,1], whereas I think you'd want [25,20,15,3,2]. Or if you read a page of candidates 1,2,3,4,5, you'd end up with [5,4], but I think you'd want [5,4,3,2,1]. It looks like this breaks some pages that should be articles, e.g.: * http://en.m.wikipedia.org/wiki/Mozilla * http://mobile.slate.com/posts/2012/06/22/new_law_threatens_mississippi_s_last_abortion_clinic.html It also doesn't filter out Google search queries, which I would have expected to fail, e.g.: http://www.google.com/search?q=query&ie=utf-8&oe=utf-8&channel=fs For readability, I think it's better to err on the side of having more false positives as opposed to more false negatives. Showing the icon for unreadable pages is annoying, but not showing the icon for readable pages reduces defeats the purpose of reader mode. Having a competition algorithm is a good idea, but it's probably better to tweak it more first to make sure we have as few false negatives as possible.
Attachment #653757 -
Flags: feedback?(bnicholson) → feedback-
Comment 4•12 years ago
|
||
(In reply to Brian Nicholson (:bnicholson) from comment #3) > Comment on attachment 653757 [details] [diff] [review] > Don't offer reader in pages with too much reading competition > > + for (let t = 0; t < this.N_TOP_CANDIDATES; t++) { > + let aTopCandidate = topCandidates[t]; > + > + if (!aTopCandidate || candidateScore > > aTopCandidate.readability.contentScore) { > + if (t + 1 < this.N_TOP_CANDIDATES && aTopCandidate) > + topCandidates[t + 1] = aTopCandidate; > + > + topCandidates[t] = candidate; > + break; > + } > > Wouldn't it be better to shift the list of top candidates rather than just > replacing the next candidate? Otherwise, if you have a list of scores > [20,15,3,2,1], and you try to add a candidate with score 25, you'd end up > with [25,20,3,2,1], whereas I think you'd want [25,20,15,3,2]. Or if you > read a page of candidates 1,2,3,4,5, you'd end up with [5,4], but I think > you'd want [5,4,3,2,1]. > > > It looks like this breaks some pages that should be articles, e.g.: > * http://en.m.wikipedia.org/wiki/Mozilla Known issue but we're not doing the right thing with Wikipedia pages anyway (see bug 782423). Wikipedia is very unique kind of page. We'll have to sort out the best way to handle it in that separate bug. I personally prefer to not offer reader mode at all instead of offering and showing something totally unexpected. So, in a way, this is an improvement :-) > http://mobile.slate.com/posts/2012/06/22/ > new_law_threatens_mississippi_s_last_abortion_clinic.html This will be fixed once we properly convert DIVs to Ps. This is probably unrelated to this change. I'll double-check. > It also doesn't filter out Google search queries, which I would have > expected to fail, e.g.: > http://www.google.com/search?q=query&ie=utf-8&oe=utf-8&channel=fs Interesting. I tested it on a few Google search queries and it worked fine. Might be a mobile site thing. Will have a look. > For readability, I think it's better to err on the side of having more false > positives as opposed to more false negatives. Showing the icon for > unreadable pages is annoying, but not showing the icon for readable pages > reduces defeats the purpose of reader mode. Having a competition algorithm > is a good idea, but it's probably better to tweak it more first to make sure > we have as few false negatives as possible. Agree that we should err on the false positives and I tweaked the maximum contrast ratio to do exactly that. There's no 100% accurate way of doing this. From my tests on a long list of websites (nytimes, cnn, arstechnica, latimes, tons of blogs, and much more) the results have always err'd on the false positive side. Anyway, I posted the test build exactly for that: to get an idea of where it fails when it really shouldn't. I wouldn't block on things like google.com case though (i.e. this is not a regression). We should be incrementally tweaking the readability code to improve its accuracy.
Comment 5•12 years ago
|
||
Attachment #653850 -
Flags: review?
Updated•12 years ago
|
Attachment #653757 -
Attachment is obsolete: true
Attachment #653757 -
Flags: review?(mark.finkle)
Updated•12 years ago
|
Attachment #653850 -
Flags: review?(mark.finkle)
Attachment #653850 -
Flags: review?
Attachment #653850 -
Flags: feedback?(bnicholson)
Comment 6•12 years ago
|
||
I'm still worried about http://mobile.slate.com/posts/2012/06/22/new_law_threatens_mississippi_s_last_abortion_clinic.html; it looks like it parses fine before this patch, but doesn't work with it applied. Can you confirm the DIV->P patch fixes it?
Comment 7•12 years ago
|
||
(In reply to Brian Nicholson (:bnicholson) from comment #6) > I'm still worried about > http://mobile.slate.com/posts/2012/06/22/ > new_law_threatens_mississippi_s_last_abortion_clinic.html; it looks like it > parses fine before this patch, but doesn't work with it applied. Can you > confirm the DIV->P patch fixes it? Yes, just confirmed that it fixes it. The reason is simple: without the DIV->P patch, the DIVs-with-P elements in mobile.slate.com were being considered candidates on their own right.
Comment 8•12 years ago
|
||
Attachment #654199 -
Flags: review?(bnicholson)
Comment 9•12 years ago
|
||
(In reply to Lucas Rocha (:lucasr) from comment #8) > Created attachment 654199 [details] [diff] [review] > Break line one log messages in Readability Is this just for the Readability XULRunner test? I didn't think this happened with Readability.js on Android.
Updated•12 years ago
|
Attachment #653850 -
Flags: feedback?(bnicholson) → feedback+
Comment 10•12 years ago
|
||
Comment on attachment 653850 [details] [diff] [review] Don't offer reader in pages with too much reading competition * * nit: aTopCandidate looks a bit odd since we use "a" + "Something" as a function param. Maybe just use "topCandidate" ?
Attachment #653850 -
Flags: review?(mark.finkle) → review+
Comment 11•12 years ago
|
||
(In reply to Mark Finkle (:mfinkle) from comment #10) > Comment on attachment 653850 [details] [diff] [review] > Don't offer reader in pages with too much reading competition * * > > nit: aTopCandidate looks a bit odd since we use "a" + "Something" as a > function param. Maybe just use "topCandidate" ? I didn't use "topCandidate" there to avoid confusion with the important variable called topCandidate below.
Comment 12•12 years ago
|
||
Pushed: https://hg.mozilla.org/integration/mozilla-inbound/rev/55c4a3f3a6a9
Comment 13•12 years ago
|
||
Comment on attachment 653850 [details] [diff] [review] Don't offer reader in pages with too much reading competition * * [Approval Request Comment] User impact if declined: We're offering reader mode on pages which are more like lists of articles instead of actual readable content. We want to avoid these false positive cases as much as possible in reader. Testing completed (on m-c, etc.): Tested locally with a long list of major and minor websites and blogs. The change didn't cause any false negative in my tests. There are still some remaining false positives which I intent to cover in a follow-up bugs though. Risk to taking this patch (and alternatives if risky): Medium. There's no 100% accurate way of predicting how actually readable a page should be. This patch sets a higher bar for pages to be considered reader mode material. There's a risk of adding false negative cases but the best way to find out about those is to have this patch in the wild. String or UUID changes made by this patch: None.
Attachment #653850 -
Flags: approval-mozilla-aurora?
Comment 14•12 years ago
|
||
Comment on attachment 654199 [details] [diff] [review] Break line one log messages in Readability No big deal, dropping this one.
Attachment #654199 -
Attachment is obsolete: true
Attachment #654199 -
Flags: review?(bnicholson)
Comment 15•12 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/55c4a3f3a6a9
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 17
Comment 16•12 years ago
|
||
(In reply to Lucas Rocha (:lucasr) from comment #13) > Comment on attachment 653850 [details] [diff] [review] > Don't offer reader in pages with too much reading competition * * > > [Approval Request Comment] > User impact if declined: We're offering reader mode on pages which are more > like lists of articles instead of actual readable content. We want to avoid > these false positive cases as much as possible in reader. ^ Given the user impact here > Risk to taking this patch (and alternatives if risky): Medium. There's no > 100% accurate way of predicting how actually readable a page should be. This > patch sets a higher bar for pages to be considered reader mode material. > There's a risk of adding false negative cases but the best way to find out > about those is to have this patch in the wild. and the fallout being more false negatives (which wouldn't feel like a regression as it's a new feature), let's get this onto Aurora as soon as possible.
Updated•12 years ago
|
Attachment #653850 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 17•12 years ago
|
||
Pushed to aurora: https://hg.mozilla.org/releases/mozilla-aurora/rev/e978f8eda190
Updated•12 years ago
|
status-firefox16:
--- → fixed
status-firefox17:
--- → fixed
Updated•12 years ago
|
Status: RESOLVED → VERIFIED
Comment 18•11 years ago
|
||
What should I do if reader mode icon never shows up even on news sites (e.g. CNN)?
Comment 19•11 years ago
|
||
(In reply to henryfhchan from comment #18) > What should I do if reader mode icon never shows up even on news sites (e.g. > CNN)? Depending on your Android device you might be yielding the behaviour added in bug 828124 where we've set a memory threshold for using reader-mode
Updated•3 years ago
|
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•