Closed Bug 760554 Opened 12 years ago Closed 2 years ago

Reader Mode: Support multi-page articles

Tracking

()

Status:

RESOLVED WONTFIX

People

(Reporter: lucasr, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [reader-mode-firefox-integration])

Attachments

(2 obsolete files)

first patch for multipage articles in reader mode 11 years ago alex 9.32 KB, patch	lucasr : feedback-	Details \| Diff \| Splinter Review
second patch of multipage articles in reader mode 11 years ago alex 8.78 KB, patch		Details \| Diff \| Splinter Review

Lucas Rocha (:lucasr)

Reporter

Description

•

12 years ago

If the article in the site is split into multiple pages, we have to incrementally load the rest of the content and append to the initially loaded article content.

The readability script does support it but I've disabled it to be able to land the first reader mode patches.

Lucas Rocha (:lucasr)

Reporter

Updated

•

12 years ago

Component: General → Reader Mode

Lucas Rocha (:lucasr)

Reporter

Comment 1

•

12 years ago

Multi-page articles are quite common on some major websites (e.g. nytimes, wired, ...).

Priority: -- → P2

Tim Martin

Comment 2

•

12 years ago

Oftentimes the printable version of the article contains the full text as well.  That's consistent enough among major websites it might be useful?

Some also provide a full-text/one-page button, but my feeling is that the nomenclature isn't very consistent.

alex

Comment 3

•

11 years ago

Attached patch first patch for multipage articles in reader mode (obsolete) — Details — Splinter Review

Attachment #715159 - Flags: feedback?(lucasr.at.mozilla)

Lucas Rocha (:lucasr)

Reporter

Comment 4

•

11 years ago

Comment on attachment 715159 [details] [diff] [review]
first patch for multipage articles in reader mode

Review of attachment 715159 [details] [diff] [review]:
-----------------------------------------------------------------

Looking promising but needs work. Readability.js contains the code to fetch the next page link for you (with _findNextPageLink() and all). Why did you need to hardcode specific URLs?

::: mobile/android/chrome/content/Readability.js
@@ +194,5 @@
>     * @return void
>     **/
> +  _prepDocument: function(document) {
> +    //let doc = this._doc;
> +    let doc = document;

How's this change related to the main point of the patch?

@@ +873,5 @@
>    _findBaseUrl: function() {
>      let uri = this._uri;
> +    //let noUrlParams = uri.path.split("?")[0];
> +    let noUrlParams = uri.spec.split("?")[0];
> +    noUrlParams = noUrlParams.replace(uri.prePath, "");

Some context about this change would be nice :-) What's this about?

@@ +921,5 @@
>          cleanedSegments.push(segment);
>      }
>  
>      // This is our final, cleaned, base article URL.
> +    //return uri.scheme + "://" + uri.host + cleanedSegments.reverse().join("/");

Cleanup the commented code?

@@ +956,5 @@
> +       if(!linkHref.match(/www\..+\.com/)) {
> +
> +        linkHref = uri.prePath + linkHref;
> +       }
> +      }

What's this about? Hardcoding pages here won't scale well. Doesn't the original algorithm cover these things?

@@ +1072,5 @@
> +      // if the URl contins things like ad,ads,adn,adt, for sure is not a next page link, give a huge decrease.
> +      // example : "http://www.nytimes.com/adx/bin/adx_click.html?type=goto&opzn&page=www.nytimes.com/yr/mo/day/world/africa&pos=Box1&sn2=5be1d228/22bac537&sn1=4db0b17/d3b11b59&camp=IHT2012-Mktg-336x280-US_ROS-Intl-Mktg_module&ad=28Jan13_Opinion_IndiaInk_336x280&goto=http://global.nytimes.com/"
> +      //also for things like slideshow(for some reason a link for some slideshow images got a bigger score then the NextPage link so we penalize it)
> +      if (linkHref.match(/\/ad\w{1,4}|slideshow/i))
> +       linkObj.score -= 100;

Where's this code coming from?

Attachment #715159 - Flags: feedback?(lucasr.at.mozilla) → feedback-

alex

Comment 5

•

11 years ago

I tested this patch mainly on www.nytimes.com (and a little on www.phoronix.com, here I run in to another problem...) 
1."Why did you need to hardcode specific URLs?"
I had to hardcode specific URLs because on www.nytimes.com the next page link from the <a> tag does not contain the whole URL for the next page but only the path 
example: <a class="next" href="/2013/03/01/us/politics/house-republicans-cheer-boehners-refusal-to-negotiate-on-cuts.html?pagewanted=2&hp" title="Next Page" onclick="s_code_linktrack('Article-MultiPage-Next');"></a>
so because of this code: 
http://mxr.mozilla.org/mozilla-central/source/mobile/android/chrome/content/Readability.js#956
next page link was ignored

2."::: mobile/android/chrome/content/Readability.js
  @@ +194,5 @@"
Now it is not related to this patch (I made this change at the beginning of my attempt of solving this bug)

3."@@ +873,5 @@"
uri.path gave me undefined for some reason so I had to make an workaround.
Also if I recall correctly uri.scheme gave me undefined too.

4."@@ +921,5 @@"
ok :-)

5."@@ +956,5 @@"
this is just for www.nytimes.com and for www.phoronix.com for the reason explained in point 1.

6."@@ +1072,5 @@"
This code is coming from me :-). I added it because for some unknown reason some links(add ons, slide show) still got a higher score then the next page link.

alex

Comment 6

•

11 years ago

Attached patch second patch of multipage articles in reader mode (obsolete) — Details — Splinter Review

Attachment #715159 - Attachment is obsolete: true

Attachment #721181 - Flags: review?(lucasr.at.mozilla)

Lucas Rocha (:lucasr)

Reporter

Updated

•

11 years ago

Blocks: readerv2
No longer blocks: reader

Lucas Rocha (:lucasr)

Reporter

Comment 7

•

10 years ago

Comment on attachment 721181 [details] [diff] [review]
second patch of multipage articles in reader mode

Thanks a lot for the patch, Alex. Apologies for the huge delay here :-/ The reader mode has changed a lot since you submitted this patch. Let me know if you'd like to discuss how to complete this work. Clearing the review flag for now.

Attachment #721181 - Flags: review?(lucasr.at.mozilla)

Ian Barlow (:ibarlow)

Comment 8

•

10 years ago

Just a friendly ping to see how things are going in this bug? :)

:Margaret Leibovic

Updated

•

10 years ago

Attachment #721181 - Attachment is obsolete: true

:Margaret Leibovic

Updated

•

10 years ago

Blocks: fix-readability
No longer blocks: readerv2

:Margaret Leibovic

Updated

•

10 years ago

Component: Reader Mode → Readability

:Margaret Leibovic

Updated

•

10 years ago

Component: Readability → Reader Mode

Product: Firefox for Android → Toolkit

Justin Dolske [:Dolske]

Comment 9

•

9 years ago

Out of scope for v1 work.

OS: Android → All

Priority: P2 → P3

Whiteboard: [readinglist-v2]

:Gijs (he/him)

Updated

•

8 years ago

Whiteboard: [readinglist-v2] → [reader-mode-firefox-integration]

Asif Youssuff

Updated

•

5 years ago

Type: defect → enhancement

Micah [:mtigley] (she/her)

Updated

•

2 years ago

Status: NEW → RESOLVED

Closed: 2 years ago

Resolution: --- → WONTFIX

You need to log in before you can comment on or make changes to this bug.