Closed
Bug 760554
Opened 12 years ago
Closed 2 years ago
Reader Mode: Support multi-page articles
Categories
(Toolkit :: Reader Mode, enhancement, P3)
Toolkit
Reader Mode
Tracking
()
RESOLVED
WONTFIX
People
(Reporter: lucasr, Unassigned)
References
(Blocks 1 open bug)
Details
(Whiteboard: [reader-mode-firefox-integration])
Attachments
(2 obsolete files)
If the article in the site is split into multiple pages, we have to incrementally load the rest of the content and append to the initially loaded article content. The readability script does support it but I've disabled it to be able to land the first reader mode patches.
Reporter | ||
Updated•12 years ago
|
Component: General → Reader Mode
Reporter | ||
Comment 1•12 years ago
|
||
Multi-page articles are quite common on some major websites (e.g. nytimes, wired, ...).
Priority: -- → P2
Comment 2•12 years ago
|
||
Oftentimes the printable version of the article contains the full text as well. That's consistent enough among major websites it might be useful? Some also provide a full-text/one-page button, but my feeling is that the nomenclature isn't very consistent.
Attachment #715159 -
Flags: feedback?(lucasr.at.mozilla)
Reporter | ||
Comment 4•11 years ago
|
||
Comment on attachment 715159 [details] [diff] [review] first patch for multipage articles in reader mode Review of attachment 715159 [details] [diff] [review]: ----------------------------------------------------------------- Looking promising but needs work. Readability.js contains the code to fetch the next page link for you (with _findNextPageLink() and all). Why did you need to hardcode specific URLs? ::: mobile/android/chrome/content/Readability.js @@ +194,5 @@ > * @return void > **/ > + _prepDocument: function(document) { > + //let doc = this._doc; > + let doc = document; How's this change related to the main point of the patch? @@ +873,5 @@ > _findBaseUrl: function() { > let uri = this._uri; > + //let noUrlParams = uri.path.split("?")[0]; > + let noUrlParams = uri.spec.split("?")[0]; > + noUrlParams = noUrlParams.replace(uri.prePath, ""); Some context about this change would be nice :-) What's this about? @@ +921,5 @@ > cleanedSegments.push(segment); > } > > // This is our final, cleaned, base article URL. > + //return uri.scheme + "://" + uri.host + cleanedSegments.reverse().join("/"); Cleanup the commented code? @@ +956,5 @@ > + if(!linkHref.match(/www\..+\.com/)) { > + > + linkHref = uri.prePath + linkHref; > + } > + } What's this about? Hardcoding pages here won't scale well. Doesn't the original algorithm cover these things? @@ +1072,5 @@ > + // if the URl contins things like ad,ads,adn,adt, for sure is not a next page link, give a huge decrease. > + // example : "http://www.nytimes.com/adx/bin/adx_click.html?type=goto&opzn&page=www.nytimes.com/yr/mo/day/world/africa&pos=Box1&sn2=5be1d228/22bac537&sn1=4db0b17/d3b11b59&camp=IHT2012-Mktg-336x280-US_ROS-Intl-Mktg_module&ad=28Jan13_Opinion_IndiaInk_336x280&goto=http://global.nytimes.com/" > + //also for things like slideshow(for some reason a link for some slideshow images got a bigger score then the NextPage link so we penalize it) > + if (linkHref.match(/\/ad\w{1,4}|slideshow/i)) > + linkObj.score -= 100; Where's this code coming from?
Attachment #715159 -
Flags: feedback?(lucasr.at.mozilla) → feedback-
I tested this patch mainly on www.nytimes.com (and a little on www.phoronix.com, here I run in to another problem...) 1."Why did you need to hardcode specific URLs?" I had to hardcode specific URLs because on www.nytimes.com the next page link from the <a> tag does not contain the whole URL for the next page but only the path example: <a class="next" href="/2013/03/01/us/politics/house-republicans-cheer-boehners-refusal-to-negotiate-on-cuts.html?pagewanted=2&hp" title="Next Page" onclick="s_code_linktrack('Article-MultiPage-Next');"></a> so because of this code: http://mxr.mozilla.org/mozilla-central/source/mobile/android/chrome/content/Readability.js#956 next page link was ignored 2."::: mobile/android/chrome/content/Readability.js @@ +194,5 @@" Now it is not related to this patch (I made this change at the beginning of my attempt of solving this bug) 3."@@ +873,5 @@" uri.path gave me undefined for some reason so I had to make an workaround. Also if I recall correctly uri.scheme gave me undefined too. 4."@@ +921,5 @@" ok :-) 5."@@ +956,5 @@" this is just for www.nytimes.com and for www.phoronix.com for the reason explained in point 1. 6."@@ +1072,5 @@" This code is coming from me :-). I added it because for some unknown reason some links(add ons, slide show) still got a higher score then the next page link.
Attachment #715159 -
Attachment is obsolete: true
Attachment #721181 -
Flags: review?(lucasr.at.mozilla)
Reporter | ||
Updated•11 years ago
|
Reporter | ||
Comment 7•10 years ago
|
||
Comment on attachment 721181 [details] [diff] [review] second patch of multipage articles in reader mode Thanks a lot for the patch, Alex. Apologies for the huge delay here :-/ The reader mode has changed a lot since you submitted this patch. Let me know if you'd like to discuss how to complete this work. Clearing the review flag for now.
Attachment #721181 -
Flags: review?(lucasr.at.mozilla)
Comment 8•10 years ago
|
||
Just a friendly ping to see how things are going in this bug? :)
Updated•10 years ago
|
Attachment #721181 -
Attachment is obsolete: true
Updated•10 years ago
|
Updated•10 years ago
|
Component: Reader Mode → Readability
Updated•10 years ago
|
Component: Readability → Reader Mode
Product: Firefox for Android → Toolkit
Comment 9•9 years ago
|
||
Out of scope for v1 work.
OS: Android → All
Priority: P2 → P3
Whiteboard: [readinglist-v2]
Updated•8 years ago
|
Whiteboard: [readinglist-v2] → [reader-mode-firefox-integration]
Updated•5 years ago
|
Type: defect → enhancement
Updated•2 years ago
|
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WONTFIX
You need to log in
before you can comment on or make changes to this bug.
Description
•