Closed Bug 1113454 Opened 10 years ago Closed 10 years ago

Download and cache reader mode content in the background

Tracking

(firefox38 fixed)

Status:

RESOLVED FIXED

Milestone:

Firefox 38

Tracking Flags:

Tracking

Status

firefox38

---

fixed

People

(Reporter: Margaret, Assigned: Margaret)

References

Details

Attachments

(1 file, 2 obsolete files)

MozReview Request: bz://1113454/margaret 10 years ago :Margaret Leibovic 39 bytes, text/x-review-board-request	rnewman : review+	Details
WIP - Download and cache reader mode content in the background 10 years ago :Margaret Leibovic 16.49 KB, patch	rnewman : feedback+	Details \| Diff \| Splinter Review
MozReview Request: Bug 1113454 - Download and cache reader mode content in the background. r=rnewman 9 years ago :Margaret Leibovic 39 bytes, text/x-review-board-request	rnewman : review+	Details

:Margaret Leibovic

Assignee

Description

•

10 years ago

This builds on top of bug 1093172. We should find a time to periodically go through the status UNFETCHED items in the reading list provider and attempt to download and cache their content.

:Margaret Leibovic

Assignee

Comment 1

•

10 years ago

We'll also need bug 1107588 if we want this to work for URLs that redirect.

Depends on: 1107588

:Margaret Leibovic

Assignee

Updated

•

10 years ago

Blocks: 1117830
No longer blocks: readerv2

:Margaret Leibovic

Assignee

Updated

•

10 years ago

Assignee: nobody → margaret.leibovic

Richard Newman [:rnewman]

Updated

•

10 years ago

Blocks: 1123101

Richard Newman [:rnewman]

Updated

•

10 years ago

No longer blocks: 1117830

:Margaret Leibovic

Assignee

Comment 2

•

10 years ago

Attached file MozReview Request: bz://1113454/margaret (obsolete) — Details

/r/2985 - Bug 1113454 - Download and cache reader mode content in the background

Pull down this commit:

hg pull review -r 2b1d7e59ecb8a2b070e4e947886eb491948fc069

:Margaret Leibovic

Assignee

Comment 3

•

10 years ago

Posted a WIP. Outstanding issues:

* This patch only observes content changes while BrowserApp is running. We should probably add some delayed update logic to check for unfetched items some amount of time after startup.
* We should batch requests to fetch content, so that we don't try downloading dozens of articles in the background (in the case of first encounter with this logic, or after a sync).
* We should send a message back to Java with the updated meta data (excerpt, title, etc). I think we could re-purpose the "Reader:AddToList" message to be more of a "Reader:InsertOrUpdate", although we'd have to split out some of the logic if we still want the "This item is already in your reading list" toast.
* I just realized this patch doesn't actually update the CONTENT_STATUS after we attempt to fetch an article, so we'll need to to that :) (also, updating the status for error cases).
* A test. I think I could add something to testReadingListCache to add an item to the content provider, then check to make sure it ended up in the cache.

Follow-up bugs:

* We should allow the user to specify a maximum cache size and purge old articles as necessary.
* Should we maybe only do this over wifi? Or have that as an option?


Am I missing anything?

Flags: needinfo?(rnewman)

Richard Newman [:rnewman]

Updated

•

10 years ago

Status: NEW → ASSIGNED

Version: Firefox 35 → Trunk

Richard Newman [:rnewman]

Comment 4

•

10 years ago

For clarity's sake: "phase one" = Gecko fetches content. "phase two" = Java fetches content, Gecko readerifies it. (Phase three might be that we ditch Readability.js altogether.)

> * This patch only observes content changes while BrowserApp is running. We
> should probably add some delayed update logic to check for unfetched items
> some amount of time after startup.

I'm not sure this is necessary during phase one. We can't fetch unless the Gecko thread is alive, and that'll only happen after browser startup.

If you want to catch items that are added while BrowserApp is still in memory but paused (e.g., via the share overlay), how 'bout having the share overlay check the GeckoThread's state?

Things can either arrive while BA is alive or not. If the latter, they'll be fetched on next launch. If the former, trigger the fetch immediately?

Or are you concerned about downloading when on on wi-fi etc.?

> * We should batch requests to fetch content, so that we don't try
> downloading dozens of articles in the background (in the case of first
> encounter with this logic, or after a sync).

You mean have a queue, and only work through it if each fetch doesn't fail for some kind of painful network reason?

> * We should allow the user to specify a maximum cache size and purge old
> articles as necessary.

Agreed. (Needs a new DB state, too -- otherwise we'll re-fetch, purge, re-fetch, purge.)

Probably also max size per article, some kind of sane bandwidth limitation, ...


> * Should we maybe only do this over wifi? Or have that as an option?

Having it as an option makes sense to me. That gets easier when we have a real fetching service (phase two) -- we can listen for network state and fetch as soon as we have Wi-Fi, regardless of whether Gecko's running.

Next up: file bugs for these things?

Richard Newman [:rnewman]

Updated

•

10 years ago

Flags: needinfo?(rnewman)

Attachment #8554628 - Flags: feedback?(rnewman)

:Margaret Leibovic

Assignee

Updated

•

10 years ago

Depends on: 1126244

:Margaret Leibovic

Assignee

Updated

•

10 years ago

Depends on: 1126246

:Margaret Leibovic

Assignee

Comment 5

•

10 years ago

(In reply to Richard Newman [:rnewman] from comment #4)
> For clarity's sake: "phase one" = Gecko fetches content. "phase two" = Java
> fetches content, Gecko readerifies it. (Phase three might be that we ditch
> Readability.js altogether.)
> 
> > * This patch only observes content changes while BrowserApp is running. We
> > should probably add some delayed update logic to check for unfetched items
> > some amount of time after startup.
> 
> I'm not sure this is necessary during phase one. We can't fetch unless the
> Gecko thread is alive, and that'll only happen after browser startup.
> 
> If you want to catch items that are added while BrowserApp is still in
> memory but paused (e.g., via the share overlay), how 'bout having the share
> overlay check the GeckoThread's state?
> 
> Things can either arrive while BA is alive or not. If the latter, they'll be
> fetched on next launch. If the former, trigger the fetch immediately?
> 
> Or are you concerned about downloading when on on wi-fi etc.?

I think I wasn't clear here. With the current version of the patch here, we only fetch content when a ContentObserver that lives in BrowserApp notices a change. So if BrowserApp isn't alive, we won't do a check to fetch content until the next time a content change event happens. I was suggesting here that we need another time besides content change to decide to fetch content (e.g. some random amount of time after the browser/gecko starts up).

> > * We should batch requests to fetch content, so that we don't try
> > downloading dozens of articles in the background (in the case of first
> > encounter with this logic, or after a sync).
> 
> You mean have a queue, and only work through it if each fetch doesn't fail
> for some kind of painful network reason?

Catching network errors is a good reason for this, but I'm also a bit concerned about the memory/performance implications of kicking off a lot of workers to be parsing content in the background (since the ReaderMode APIs for this are asynchronous).

:Margaret Leibovic

Assignee

Updated

•

10 years ago

Depends on: 1117226

:Margaret Leibovic

Assignee

Comment 6

•

10 years ago

Attached patch WIP - Download and cache reader mode content in the background (obsolete) — Details — Splinter Review

A new updated version built on top of bug 1117226.

Attachment #8554628 - Attachment is obsolete: true

Attachment #8554628 - Flags: feedback?(rnewman)

Attachment #8555314 - Flags: feedback?(rnewman)

Richard Newman [:rnewman]

Comment 7

•

10 years ago

Comment on attachment 8555314 [details] [diff] [review]
WIP - Download and cache reader mode content in the background

Review of attachment 8555314 [details] [diff] [review]:
-----------------------------------------------------------------

This looks great. Pretty much just your own TODO left!

::: mobile/android/base/ReadingListHelper.java
@@ +36,1 @@
>      public ReadingListHelper(Context context) {

I think this is a good opportunity to have BrowserApp pass the profile as an argument to RLH.

BA has the profile immediately before it calls this constructor, and we fetch it a few times throughout this file.

The profile (and, consequently, the DB) shouldn't change for the lifetime of this instance.

Let's just keep a final member for the DB, grab the name in the constructor, and kill another few singleton accesses!

Attachment #8555314 - Flags: feedback?(rnewman) → feedback+

:Margaret Leibovic

Assignee

Comment 8

•

10 years ago

Comment on attachment 8554628 [details]
MozReview Request: bz://1113454/margaret

/r/2985 - Bug 1113454 - Download and cache reader mode content in the background. r=rnewman

Pull down this commit:

hg pull review -r 8eca44e4cfc16e2e3805564c4c371f823d52caa2

Attachment #8554628 - Attachment is obsolete: false

Attachment #8554628 - Flags: review?(rnewman)

:Margaret Leibovic

Assignee

Updated

•

10 years ago

Attachment #8555314 - Attachment is obsolete: true

:Margaret Leibovic

Assignee

Updated

•

10 years ago

Depends on: 1129725

Richard Newman [:rnewman]

Comment 9

•

10 years ago

https://reviewboard.mozilla.org/r/2985/#review2733

Good enough to land, but read through the issues.

::: mobile/android/base/ReadingListHelper.java
(Diff revision 2)
> +        final BrowserDB db = profile.getDB();

I think I commented on this before, but can we replace the `profile` member with a `db` member, and just do this once?

::: mobile/android/chrome/content/Reader.js
(Diff revision 2)
> +        // TODO: Distinguish between temorary and permanent failures.

Nit: temporary.

::: mobile/android/chrome/content/Reader.js
(Diff revision 2)
> +          status: this.STATUS_FETCH_FAILED_UNSUPPORTED_FORMAT,

I worry a little that this is overly broad, and that a null returned from `_dACA` doesn't imply an unsupported format. Keep your eye on this.

::: mobile/android/base/ReadingListHelper.java
(Diff revision 2)
> +                                GeckoEvent.createBroadcastEvent("Reader:FetchContent", json.toString()));

What stops us calling fetchContent() twice, resulting in duplicate messages, resulting in duplicate fetches?

:Margaret Leibovic

Assignee

Comment 10

•

10 years ago

(In reply to Richard Newman [:rnewman] from comment #9)
> https://reviewboard.mozilla.org/r/2985/#review2733
> 
> Good enough to land, but read through the issues.
> 
> ::: mobile/android/base/ReadingListHelper.java
> (Diff revision 2)
> > +        final BrowserDB db = profile.getDB();
> 
> I think I commented on this before, but can we replace the `profile` member
> with a `db` member, and just do this once?

Sure, I can update that.

> ::: mobile/android/chrome/content/Reader.js
> (Diff revision 2)
> > +        // TODO: Distinguish between temorary and permanent failures.
> 
> Nit: temporary.

I'll just get rid of this comment, I filed bug 1129725 about this.

> ::: mobile/android/chrome/content/Reader.js
> (Diff revision 2)
> > +          status: this.STATUS_FETCH_FAILED_UNSUPPORTED_FORMAT,
> 
> I worry a little that this is overly broad, and that a null returned from
> `_dACA` doesn't imply an unsupported format. Keep your eye on this.

I believe that a null returned from _dACA does indeed mean this isn't an article, and right now that's the format we support (it throws an exception if something else goes wrong). But yes, we can look out for this and update it in the future when we start supporting different kinds of things.

> ::: mobile/android/base/ReadingListHelper.java
> (Diff revision 2)
> > +                                GeckoEvent.createBroadcastEvent("Reader:FetchContent", json.toString()));
> 
> What stops us calling fetchContent() twice, resulting in duplicate messages,
> resulting in duplicate fetches?

I suppose this makes an assumption that content changes won't happen too frequently, but that's probably not an assumption we should make. I'll file a follow-up bug to throttle this somehow.

Richard Newman [:rnewman]

Comment 11

•

10 years ago

Comment on attachment 8554628 [details]
MozReview Request: bz://1113454/margaret

r+ with the two suggested changes.

Attachment #8554628 - Flags: review?(rnewman) → review+

:Margaret Leibovic

Assignee

Comment 12

•

10 years ago

https://hg.mozilla.org/integration/fx-team/rev/be61d1a3818e

Carsten Book [:Tomcat]

Comment 13

•

10 years ago

https://hg.mozilla.org/mozilla-central/rev/be61d1a3818e

Status: ASSIGNED → RESOLVED

Closed: 10 years ago

status-firefox38: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → Firefox 38

:Margaret Leibovic

Assignee

Comment 14

•

9 years ago

Attachment #8554628 - Attachment is obsolete: true

Attachment #8618937 - Flags: review+

:Margaret Leibovic

Assignee

Comment 15

•

9 years ago

Attached file MozReview Request: Bug 1113454 - Download and cache reader mode content in the background. r=rnewman — Details

csims460

Comment 16

•

9 years ago

How difficult would it be to modify the behavior of the caching to cache both the 'reader mode' view and the original web view, rather than only the reader mode view?

The problem with only caching the reader mode view is that some saved webpages render better in the original web view rather than the reader mode view. Some webpages are even unreadable in the reader mode view (e.g. reddit comment pages).

When a user has added a bunch of articles to their reading list and then goes offline, it would be a bad experience to open an article and find out you can't actually read it because the reader mode view didn't render the page well. It would be great if the user could tap the reader mode icon to turn reader mode off and get the cached original web view instead.

It seems that Firefox on Android already caches open tabs to some extent, so hopefully adding this functionality wouldn't require a ton of new code.

I'm interested in contributing if others think this is a good idea, but don't have the time to implement this themselves. Should I create a new bug for this? (I'm new here)

BMO Automation

Updated

•

4 years ago

Product: Firefox for Android → Firefox for Android Graveyard

You need to log in before you can comment on or make changes to this bug.