Add CONTENT_STATUS column to reading list provider to keep track of whether or not content is cached

RESOLVED FIXED in Firefox 37

Status

()

RESOLVED FIXED
4 years ago
3 years ago

People

(Reporter: aaronmt, Assigned: Margaret)

Tracking

(Blocks: 1 bug, {reproducible})

37 Branch
Firefox 37
All
Android
reproducible
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox35 ?, firefox36 affected, fennec37+)

Details

MozReview Requests

()

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(4 attachments, 8 obsolete attachments)

64.06 KB, image/png
Details
39 bytes, text/x-review-board-request
rnewman
: review+
Details | Review
39 bytes, text/x-review-board-request
rnewman
: review+
Details | Review
39 bytes, text/x-review-board-request
rnewman
: review+
Details | Review
(Reporter)

Description

4 years ago
Created attachment 8516049 [details]
screenshot.png

Occasionally, I will notice that articles I add to the reading list are missing their summaries. This might be happening on adding articles to the reading list before the pages are done loading. This feature does not always work. The TextView has no text value at all and the use just sees a padded LinearLayout cell.

See screenshot.

--
Nexus 5 (Android 4.4.4)
Nightly (11/03)
(Reporter)

Updated

4 years ago
Blocks: 917884
(Reporter)

Comment 1

4 years ago
Easy way to reproduce

  Visit http://www.neowin.net/news/google-ceo-larry-page-says-company-should-change-its-dont-be-evil-mantra
  While the page is loading, invoke our share overlay from the browser menu and 'Add to Reading List'
  Check the new entry in the reading list pane
The share overlay doesn't provide excerpts. (It doesn't even supply a title if the intent sender -- in this case, Fennec -- doesn't know one.)

I think that means this bug is either INVALID or can be morphed into "Reading List should fetch and process new items in the background".
(Assignee)

Comment 3

4 years ago
Yeah, I think we can work on this after we land bug 1007409.

This should only affect the share overlay, since the code path to add items from reader mode or from the reader icon both already have the processed article.

However, we probably should file a new bug about not showing "1 min" in the UI if we don't actually have a time estimate.
Summary: Missing summaries for newly added reading list items → Reading List should fetch and process new items in the background
Assignee: nobody → margaret.leibovic
tracking-fennec: ? → 37+
(Assignee)

Comment 4

4 years ago
So, right now, we just add a URL/title to the database from the share overlay, and we would need to actually parse the article (which we do in JS) to get data for additional fields (as well as the article content itself to store in our content cache).

Since Gecko isn't necessarily running when we want to add URLs to the reading list, I think we need to keep our current logic to store whatever we have, but then we should send an event to Gecko (or create some queue for the next time Gecko is running) to download the article content, store it in the cache, and update the items in the content provider appropriately. The current addReadingListItem implementation in LocalBrowserDB actually does an insert-if-needed-update, so we can probably just call that with the new data (and maybe rename the method).

I'm also thinking about how to approach bug 1093869, and for that bug we may want to also go through existing reading list items to re-populate the cache (rather than migrating over the data that's in indexedDB). So creating some sort of "download and update reading list items for these URLs in the background" functionality might help us there as well.

rnewman: sanity check, what do you think of this plan?
Flags: needinfo?(rnewman)
I don't think we can rely on a message to Gecko -- it might not start before our process finishes, and we might not successfully get through the work before dying. (This is fine as an additional signal that there are new items to fetch, but we can't rely on it.)

So yeah, a queue for next time.

That can essentially be the content provider itself, assuming it can track state like "failed" or "no content" or "not an article": Gecko starts, asks for items to readerify, readerifies what it can, and reports back with what it fetched and/or failures.

We should be tracking attempts, too, so that we can stop trying for content that repeatedly kills Firefox.

For Bug 1093869, we can have a second initial fetch: look for the indexedDB, and if it's there, batch our way through that, too. Take that code out after a release or two.
Flags: needinfo?(rnewman)
(Assignee)

Comment 6

4 years ago
Created attachment 8529348 [details] [diff] [review]
(Part 1) Add STATUS column to reading list table

I still need to write a patch to do the actual content fetching once Gecko is running, but I have patches here to keep track of the state of the reading list items, as suggested in comment 5.

It may be worth just landing the patches I have here and filing a separate bug for the actual fetching, so that users updating their Nightly/Aurora will have this state included in the migration that was added in bug 1093869.

Totally open to feedback on the naming here. Also I should update testReadingListProvider as well.
Attachment #8529348 - Flags: review?(rnewman)
(Assignee)

Comment 7

4 years ago
Created attachment 8529349 [details] [diff] [review]
(Part 2) Add status when adding reading list item from Reader.js
Attachment #8529349 - Flags: review?(rnewman)
Comment on attachment 8529348 [details] [diff] [review]
(Part 1) Add STATUS column to reading list table

Review of attachment 8529348 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/db/BrowserContract.java
@@ +404,5 @@
> +
> +        public static final int STATUS_NONE = 0;
> +        public static final int STATUS_ARTICLE = 1;
> +        public static final int STATUS_NO_CONTENT = 2;
> +        public static final int STATUS_FAILED = 3;

Can you explain why you picked these particular fields?

Looking at these, it looks like the status represents the results of a fetch attempt, starting at NONE (= unfetched).

If we succeed, the results could be ARTICLE, NO_CONTENT, or presumably NON_ARTICLE_CONTENT.

If we fail, it might be a temporary failure (continue to use NONE?) or a permanent failure (404 etc.).

So perhaps we want to steal a leaf from the RL etherpad:

  ** This is really a type: UNFETCHED, FETCH_FAILED_TEMPORARY, FETCH_FAILED_PERMANENT, FETCHED_ARTICLE, FETCHED_VIDEO, FETCHED_AUDIO, FETCHED_NON_ARTICLE.

and stick a STATUS_ prefix before each of those?

::: mobile/android/base/db/BrowserDatabaseHelper.java
@@ +774,5 @@
>                      ReadingListItems.TITLE + " TEXT, " +
>                      ReadingListItems.EXCERPT + " TEXT, " +
>                      ReadingListItems.READ + " TINYINT DEFAULT 0, " +
>                      ReadingListItems.IS_DELETED + " TINYINT DEFAULT 0, " +
> +                    ReadingListItems.STATUS + " TINYINT DEFAULT 0, " +

Let's move this to be the last item, so that both new and upgraded tables have the same schema.

@@ +1412,5 @@
>  
> +    private void upgradeDatabaseFrom21to22(SQLiteDatabase db) {
> +        debug("Adding STATUS column to reading list table.");
> +        db.execSQL("ALTER TABLE " + TABLE_READING_LIST
> +                + " ADD COLUMN " + ReadingListItems.STATUS + " TINYINT DEFAULT 0");

Nit: + at end of previous line, not start.

I'm assuming that we consider an index on STATUS to be unnecessary, because we expect this table to be small, and/or we expect our queries to be walking DATE_MODIFIED instead.

If that's not true, then this is a good time to add an index.
Attachment #8529348 - Flags: review?(rnewman) → feedback+
Comment on attachment 8529349 [details] [diff] [review]
(Part 2) Add status when adding reading list item from Reader.js

Review of attachment 8529349 [details] [diff] [review]:
-----------------------------------------------------------------

This might need some reworking to capture more granular status, but r+ for now!

::: mobile/android/chrome/content/Reader.js
@@ +15,5 @@
>  
> +  STATUS_NONE: 0,
> +  STATUS_ARTICLE: 1,
> +  STATUS_NO_CONTENT: 2,
> +  STATUS_FAILED: 3,

Add a comment noting that this should agree with the Java-side definition.
Attachment #8529349 - Flags: review?(rnewman) → review+
(Assignee)

Comment 10

4 years ago
(In reply to Richard Newman [:rnewman] from comment #8)
> Comment on attachment 8529348 [details] [diff] [review]
> (Part 1) Add STATUS column to reading list table
> 
> Review of attachment 8529348 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: mobile/android/base/db/BrowserContract.java
> @@ +404,5 @@
> > +
> > +        public static final int STATUS_NONE = 0;
> > +        public static final int STATUS_ARTICLE = 1;
> > +        public static final int STATUS_NO_CONTENT = 2;
> > +        public static final int STATUS_FAILED = 3;
> 
> Can you explain why you picked these particular fields?
> 
> Looking at these, it looks like the status represents the results of a fetch
> attempt, starting at NONE (= unfetched).

Correct.

> If we succeed, the results could be ARTICLE, NO_CONTENT, or presumably
> NON_ARTICLE_CONTENT.
> 
> If we fail, it might be a temporary failure (continue to use NONE?) or a
> permanent failure (404 etc.).
> 
> So perhaps we want to steal a leaf from the RL etherpad:
> 
>   ** This is really a type: UNFETCHED, FETCH_FAILED_TEMPORARY,
> FETCH_FAILED_PERMANENT, FETCHED_ARTICLE, FETCHED_VIDEO, FETCHED_AUDIO,
> FETCHED_NON_ARTICLE.
> 
> and stick a STATUS_ prefix before each of those?

That sounds good to me. UNFETCHED is clearer than NONE. However, I think I'll leave out the video/audio/non-article for now, since we don't actually support any of those, and perhaps that will change once we start supporting those new things.

> ::: mobile/android/base/db/BrowserDatabaseHelper.java
> @@ +774,5 @@
> >                      ReadingListItems.TITLE + " TEXT, " +
> >                      ReadingListItems.EXCERPT + " TEXT, " +
> >                      ReadingListItems.READ + " TINYINT DEFAULT 0, " +
> >                      ReadingListItems.IS_DELETED + " TINYINT DEFAULT 0, " +
> > +                    ReadingListItems.STATUS + " TINYINT DEFAULT 0, " +
> 
> Let's move this to be the last item, so that both new and upgraded tables
> have the same schema.

Will do.

> @@ +1412,5 @@
> >  
> > +    private void upgradeDatabaseFrom21to22(SQLiteDatabase db) {
> > +        debug("Adding STATUS column to reading list table.");
> > +        db.execSQL("ALTER TABLE " + TABLE_READING_LIST
> > +                + " ADD COLUMN " + ReadingListItems.STATUS + " TINYINT DEFAULT 0");
> 
> Nit: + at end of previous line, not start.
> 
> I'm assuming that we consider an index on STATUS to be unnecessary, because
> we expect this table to be small, and/or we expect our queries to be walking
> DATE_MODIFIED instead.
> 
> If that's not true, then this is a good time to add an index.

If we're going to re-try for FETCH_FAILED_TEMPORARY items, I don't think we can count on just walking DATE_MODIFIED. I think this is your way of just suggesting I add an index ;)
(Assignee)

Comment 11

4 years ago
Created attachment 8530678 [details] [diff] [review]
(Part 1 v2) Add STATUS column to reading list table

Addressed feedback.
Attachment #8529348 - Attachment is obsolete: true
Attachment #8530678 - Flags: review?(rnewman)
(Assignee)

Comment 12

4 years ago
Created attachment 8530679 [details] [diff] [review]
(Part 2 v2) Add status when adding reading list item from Reader.js

I updated this patch to match what I changed in the first patch.

I was a bit uncertain as to what to make the status for the case where we failed to find an article. I don't think it should be STATUS_NON_ARTICLE, since we're not actually storing any content for it. I decided to go with STATUS_FETCH_FAILED_TEMPORARY, hoping that eventually we'll re-try when we have better logic in place for storing other kinds of content. But maybe it should really be STATUS_FETCH_FAILED_PERMANENT if we never want to re-try getting content for this article.

https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=178ba5454fbc
Attachment #8529349 - Attachment is obsolete: true
Attachment #8530679 - Flags: review?(rnewman)
(In reply to :Margaret Leibovic from comment #12)

> I was a bit uncertain as to what to make the status for the case where we
> failed to find an article. I don't think it should be STATUS_NON_ARTICLE,
> since we're not actually storing any content for it. I decided to go with
> STATUS_FETCH_FAILED_TEMPORARY, hoping that eventually we'll re-try when we
> have better logic in place for storing other kinds of content. But maybe it
> should really be STATUS_FETCH_FAILED_PERMANENT if we never want to re-try
> getting content for this article.

What about introducing STATUS_FETCH_FAILED_UNSUPPORTED_FORMAT? Whenever we add code to support a new type, we'll bump the version number, and in the migration we can automatically turn all of those into STATUS_FETCH_FAILED_TEMPORARY so that they'll be refetched.

UNSUPPORTED_FORMAT is kinda what I intended NON_ARTICLE to be ("it ain't an article or one of the other formats we support"), so maybe it just needs a better name.
(Assignee)

Comment 14

4 years ago
(In reply to Richard Newman [:rnewman] from comment #13)
> (In reply to :Margaret Leibovic from comment #12)
> 
> > I was a bit uncertain as to what to make the status for the case where we
> > failed to find an article. I don't think it should be STATUS_NON_ARTICLE,
> > since we're not actually storing any content for it. I decided to go with
> > STATUS_FETCH_FAILED_TEMPORARY, hoping that eventually we'll re-try when we
> > have better logic in place for storing other kinds of content. But maybe it
> > should really be STATUS_FETCH_FAILED_PERMANENT if we never want to re-try
> > getting content for this article.
> 
> What about introducing STATUS_FETCH_FAILED_UNSUPPORTED_FORMAT? Whenever we
> add code to support a new type, we'll bump the version number, and in the
> migration we can automatically turn all of those into
> STATUS_FETCH_FAILED_TEMPORARY so that they'll be refetched.
> 
> UNSUPPORTED_FORMAT is kinda what I intended NON_ARTICLE to be ("it ain't an
> article or one of the other formats we support"), so maybe it just needs a
> better name.

This makes sense. Right now we only store articles in the cache, but I imagine in the future we'll still only store some sort of structured data. Maybe one day we'll want to store the raw response from the URL, even if we can't parse it into one of our supported formats, but in that case we would probably still want to label those items as STATUS_FETCHED_RAW or something like that.
(Assignee)

Comment 15

4 years ago
Created attachment 8530921 [details] [diff] [review]
(Part 1 v3) Add STATUS column to reading list table

Updated to address new comments (and also fixed problem where previous patch tried creating a unique index on a non-unique column).
(Assignee)

Updated

4 years ago
Attachment #8530921 - Flags: review?(rnewman)
(Assignee)

Comment 16

4 years ago
Created attachment 8530922 [details] [diff] [review]
(Part 2 v3) Add status when adding reading list item from Reader.js
Attachment #8530678 - Attachment is obsolete: true
Attachment #8530679 - Attachment is obsolete: true
Attachment #8530678 - Flags: review?(rnewman)
Attachment #8530679 - Flags: review?(rnewman)
Attachment #8530922 - Flags: review?(rnewman)
(Assignee)

Comment 17

4 years ago
Created attachment 8530923 [details] [diff] [review]
(Part 3) Add test to check default status of inserted items

There isn't really that much to test with these two patches, so I just decided to add a new check to testReadingListProvider to make sure the default status is STATUS_UNFETCHED.

Passed locally, but here's a try push:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=f604a3f91f77
Attachment #8530923 - Flags: review?(rnewman)
Comment on attachment 8530921 [details] [diff] [review]
(Part 1 v3) Add STATUS column to reading list table

Review of attachment 8530921 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/db/BrowserDatabaseHelper.java
@@ +773,4 @@
>                      ReadingListItems.URL + " TEXT NOT NULL, " +
>                      ReadingListItems.TITLE + " TEXT, " +
>                      ReadingListItems.EXCERPT + " TEXT, " +
>                      ReadingListItems.READ + " TINYINT DEFAULT 0, " +

While we're here, doing a DB migration, we should consider separating the concept of read/unread, new (i.e., never read), and archived.

It's something we'll want to do for the reading list service, and eventually on clients to manage sheer volume, so might be wise to avoid the need for two migrations.

@@ +777,5 @@
>                      ReadingListItems.IS_DELETED + " TINYINT DEFAULT 0, " +
>                      ReadingListItems.GUID + " TEXT UNIQUE NOT NULL, " +
>                      ReadingListItems.DATE_MODIFIED + " INTEGER NOT NULL, " +
>                      ReadingListItems.DATE_CREATED  + " INTEGER NOT NULL, " +
> +                    ReadingListItems.LENGTH + " INTEGER DEFAULT 0, " +

Incidentally, is this length in words or characters?

@@ +778,5 @@
>                      ReadingListItems.GUID + " TEXT UNIQUE NOT NULL, " +
>                      ReadingListItems.DATE_MODIFIED + " INTEGER NOT NULL, " +
>                      ReadingListItems.DATE_CREATED  + " INTEGER NOT NULL, " +
> +                    ReadingListItems.LENGTH + " INTEGER DEFAULT 0, " +
> +                    ReadingListItems.STATUS + " TINYINT DEFAULT 0 ); ");

Consider

  DEFAULT " + ReadingListItems.STATUS_UNFETCHED + " ); "

for clarity.
Attachment #8530921 - Flags: review?(rnewman) → review+
Status: NEW → ASSIGNED
Hardware: ARM → All
Version: Firefox 36 → Firefox 37
Comment on attachment 8530922 [details] [diff] [review]
(Part 2 v3) Add status when adding reading list item from Reader.js

Review of attachment 8530922 [details] [diff] [review]:
-----------------------------------------------------------------

r+ without the ContentValues change.

::: mobile/android/base/ReadingListHelper.java
@@ +95,5 @@
>                      values.put(ReadingListItems.URL, url);
>                      values.put(ReadingListItems.TITLE, message.optString("title"));
>                      values.put(ReadingListItems.LENGTH, message.optInt("length"));
>                      values.put(ReadingListItems.EXCERPT, message.optString("excerpt"));
> +                    values.put(ReadingListItems.STATUS, message.optInt("status", ReadingListItems.STATUS_UNFETCHED));

We probably don't want to do this.

addReadingListItem is an update-or-insert.

That means if you add a reading list item, read it, then try to add the URL again, we'll clear its download status.

We should probably just trust to the DB's default column value (which is UNFETCHED) and the rest of our state management.
Attachment #8530922 - Flags: review?(rnewman) → review+
Attachment #8530923 - Flags: review?(rnewman) → review+
(Assignee)

Comment 20

4 years ago
(In reply to Richard Newman [:rnewman] from comment #18)
> Comment on attachment 8530921 [details] [diff] [review]
> (Part 1 v3) Add STATUS column to reading list table
> 
> Review of attachment 8530921 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: mobile/android/base/db/BrowserDatabaseHelper.java
> @@ +773,4 @@
> >                      ReadingListItems.URL + " TEXT NOT NULL, " +
> >                      ReadingListItems.TITLE + " TEXT, " +
> >                      ReadingListItems.EXCERPT + " TEXT, " +
> >                      ReadingListItems.READ + " TINYINT DEFAULT 0, " +
> 
> While we're here, doing a DB migration, we should consider separating the
> concept of read/unread, new (i.e., never read), and archived.
> 
> It's something we'll want to do for the reading list service, and eventually
> on clients to manage sheer volume, so might be wise to avoid the need for
> two migrations.

We currently don't do anything with this column, so there's not actually anything to migrate. Given the reading list service cloud storage schema you proposed, I think we should actually drop this READ column and replace it with a STATUS column that reflects new/archived/deleted. And then I would rename this local content status column to something like CONTENT_STATUS.
 
> @@ +777,5 @@
> >                      ReadingListItems.IS_DELETED + " TINYINT DEFAULT 0, " +
> >                      ReadingListItems.GUID + " TEXT UNIQUE NOT NULL, " +
> >                      ReadingListItems.DATE_MODIFIED + " INTEGER NOT NULL, " +
> >                      ReadingListItems.DATE_CREATED  + " INTEGER NOT NULL, " +
> > +                    ReadingListItems.LENGTH + " INTEGER DEFAULT 0, " +
> 
> Incidentally, is this length in words or characters?

Characters: http://mxr.mozilla.org/mozilla-central/source/mobile/android/chrome/content/Readability.js#1520

I think this makes sense from an l10n perspective, since the concept of "words" might vary, but consumers should be able to combine character count and locale to come up with a rough sense of how "long" the article is. 

> @@ +778,5 @@
> >                      ReadingListItems.GUID + " TEXT UNIQUE NOT NULL, " +
> >                      ReadingListItems.DATE_MODIFIED + " INTEGER NOT NULL, " +
> >                      ReadingListItems.DATE_CREATED  + " INTEGER NOT NULL, " +
> > +                    ReadingListItems.LENGTH + " INTEGER DEFAULT 0, " +
> > +                    ReadingListItems.STATUS + " TINYINT DEFAULT 0 ); ");
> 
> Consider
> 
>   DEFAULT " + ReadingListItems.STATUS_UNFETCHED + " ); "
> 
> for clarity.

Good suggestion.
(Assignee)

Comment 21

4 years ago
(In reply to :Margaret Leibovic from comment #20)
> (In reply to Richard Newman [:rnewman] from comment #18)
> > Comment on attachment 8530921 [details] [diff] [review]
> > (Part 1 v3) Add STATUS column to reading list table
> > 
> > Review of attachment 8530921 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > ::: mobile/android/base/db/BrowserDatabaseHelper.java
> > @@ +773,4 @@
> > >                      ReadingListItems.URL + " TEXT NOT NULL, " +
> > >                      ReadingListItems.TITLE + " TEXT, " +
> > >                      ReadingListItems.EXCERPT + " TEXT, " +
> > >                      ReadingListItems.READ + " TINYINT DEFAULT 0, " +
> > 
> > While we're here, doing a DB migration, we should consider separating the
> > concept of read/unread, new (i.e., never read), and archived.
> > 
> > It's something we'll want to do for the reading list service, and eventually
> > on clients to manage sheer volume, so might be wise to avoid the need for
> > two migrations.
> 
> We currently don't do anything with this column, so there's not actually
> anything to migrate. Given the reading list service cloud storage schema you
> proposed, I think we should actually drop this READ column and replace it
> with a STATUS column that reflects new/archived/deleted. And then I would
> rename this local content status column to something like CONTENT_STATUS.

Looking at the schema proposal more closely, I see you did add an "unread" column. But I still think I should rename the STATUS column to CONTENT_STATUS to avoid conflicting with the status column in the spec.

After looking at this proposed schema a bit more, I think it would be best to just land the patches we have here and add a new migration in the future as we incorporate the reading list service into the client. Given how many new columns there may be, I worry that we'll change our minds about *something* and need a new migration anyway, so I'd rather hold off until we have a reason to change things.
(Assignee)

Comment 22

4 years ago
Created attachment 8534050 [details]
MozReview Request: bz://1093172/margaret
Attachment #8534050 - Flags: review?(rnewman)
(Assignee)

Comment 23

4 years ago
/r/1373 - Bug 1093172 - Add CONTENT_STATUS column to reading list table. r=rnewman
/r/1375 - Bug 1093172 - Add status when adding reading list item from Reader.js. r=rnewman
/r/1377 - Bug 1093172 - Add test to check default status of inserted items. r=rnewman

Pull down these commits:

hg pull review -r e4fbd6cd0f0d9b7b4ed6dec263601b8ec5da55c2
(Assignee)

Updated

4 years ago
Attachment #8530921 - Attachment is obsolete: true
(Assignee)

Updated

4 years ago
Attachment #8530922 - Attachment is obsolete: true
(Assignee)

Updated

4 years ago
Attachment #8530923 - Attachment is obsolete: true
(In reply to :Margaret Leibovic from comment #21)

> Looking at the schema proposal more closely, I see you did add an "unread"
> column.

Yeah, making the read/unread state distinct from new/non-new, archived, deleted -- much like email. You can archive an item without reading it, and it can be in your inbox and marked as unread.


> After looking at this proposed schema a bit more, I think it would be best
> to just land the patches we have here and add a new migration in the future
> as we incorporate the reading list service into the client. Given how many
> new columns there may be, I worry that we'll change our minds about
> *something* and need a new migration anyway, so I'd rather hold off until we
> have a reason to change things.

That reasoning works for me :)


(In reply to :Margaret Leibovic from comment #20)

> I think this makes sense from an l10n perspective, since the concept of
> "words" might vary, but consumers should be able to combine character count
> and locale to come up with a rough sense of how "long" the article is. 

Aye. I wonder if it's worth (eventually) also using the guesstimated locale of the content to choose a tokenizer, and store a token count to match, to save each client doing the work again?

(Do we know the content locale at this point?)

As you note, using characters alone -- without knowledge of content locale -- will dramatically under-estimate for ideographic scripts.
(Assignee)

Comment 25

4 years ago
(In reply to Richard Newman [:rnewman] from comment #24)

> > I think this makes sense from an l10n perspective, since the concept of
> > "words" might vary, but consumers should be able to combine character count
> > and locale to come up with a rough sense of how "long" the article is. 
> 
> Aye. I wonder if it's worth (eventually) also using the guesstimated locale
> of the content to choose a tokenizer, and store a token count to match, to
> save each client doing the work again?
> 
> (Do we know the content locale at this point?)
> 
> As you note, using characters alone -- without knowledge of content locale
> -- will dramatically under-estimate for ideographic scripts.

See also: bug 1106380.
See Also: → bug 1106380
Margaret, I've lost track of whether I'm waiting for an updated patch, or you decided to land this as-is. Lemme know! :D
Flags: needinfo?(margaret.leibovic)
(Assignee)

Comment 27

4 years ago
(In reply to Richard Newman [:rnewman] from comment #26)
> Margaret, I've lost track of whether I'm waiting for an updated patch, or
> you decided to land this as-is. Lemme know! :D

I believe I decided to land the first approach, but I updated it since it was previously reviewed, so I would appreciate you giving the currently posted patches a once-over before I land them.
Flags: needinfo?(margaret.leibovic) → needinfo?(rnewman)
https://reviewboard.mozilla.org/r/1371/#review785

I found this old review laying around, so I think it's all still valid :D

---

Pretty much my only concern is about what happens in two situations: the initial migration (where we want to set the state to match whether we have a cached article) and during addition (where we want to make sure we don't clobber an existing entry by mistake).

Other than that, looks great!

::: mobile/android/base/ReadingListHelper.java
(Diff revision 1)
> +                        values.put(ReadingListItems.CONTENT_STATUS, message.optInt("status"));

I think you can switch this to `getInt`, given that it's wrapped in a `has` check. Alternatively, specify an explicit fallback for the `optInt` check if we're willing to live with malformed messages.

::: mobile/android/chrome/content/Reader.js
(Diff revision 1)
>        yield this.storeArticleInCache(article);

Doesn't this status change also need to be propagated to the RL DB?

::: mobile/android/chrome/content/Reader.js
(Diff revision 1)
> +      status: article.status

Is this correct if article.status is UNFETCHED, but the DB already contains the article?

Also, trailing comma.

::: mobile/android/base/db/BrowserDatabaseHelper.java
(Diff revision 1)
> +

Should we add another block here like (pseudo-SQL):

```
UPDATE readinglist SET status = STATUS_FETCHED_ARTICLE WHERE id IN (?, ?, ?, ...)
```

passing in the list of IDs for which we know we have articles?

Or will we automatically update this state to the correct value when we try to fetch the article content, checking the cache regardless of the DB value?
(Assignee)

Comment 29

4 years ago
https://reviewboard.mozilla.org/r/1371/#review997

> I think you can switch this to `getInt`, given that it's wrapped in a `has` check. Alternatively, specify an explicit fallback for the `optInt` check if we're willing to live with malformed messages.

We don't want to have a fallback, since we don't necessarily always want to update the CONTENT_STATUS, such as when we're updating a subset of columns on an existing item. However, I may have misinterpreted some of our previous discussions to come to this conclusion, since the messages we pass to handleAddToList always have a status property.

> Should we add another block here like (pseudo-SQL):
> 
> ```
> UPDATE readinglist SET status = STATUS_FETCHED_ARTICLE WHERE id IN (?, ?, ?, ...)
> ```
> 
> passing in the list of IDs for which we know we have articles?
> 
> Or will we automatically update this state to the correct value when we try to fetch the article content, checking the cache regardless of the DB value?

I envisioned us updating this state in another patch where we actually try to do the fetching and updating. We also don't have an id for articles in the reader mode cache, so the best we could do would be to pass URLs.

> Is this correct if article.status is UNFETCHED, but the DB already contains the article?
> 
> Also, trailing comma.

If the reader mode cache has the article, we would never end up passing UNFETCHED in here, so I don't think that would ever happen.

> Doesn't this status change also need to be propagated to the RL DB?

Yes, I just noticed that myself. Must have slipped my mind. I feel like we can actually just omit this bit for now, and deal with it the first time we try to fetch content in the background. We can just look in the cache before trying to re-download articles.
(Assignee)

Updated

4 years ago
Summary: Reading List should fetch and process new items in the background → Add CONTENT_STATUS column to reading list provider to keep track of whether or not content is cached
(Assignee)

Updated

4 years ago
Component: General → Data Providers
(Assignee)

Updated

4 years ago
Blocks: 1113454
(Assignee)

Comment 30

4 years ago
/r/1373 - Bug 1093172 - Add CONTENT_STATUS column to reading list table. r=rnewman
/r/1375 - Bug 1093172 - Add status when adding reading list item from Reader.js. r=rnewman
/r/1377 - Bug 1093172 - Add test to check default status of inserted items. r=rnewman

Pull down these commits:

hg pull review -r bca2e7843c2f5cb1ef9fc32c8004c7115aa65872
Attachment #8534050 - Flags: review?(rnewman) → review+
Looks like there's a missing

  Cu.import("resource://gre/modules/ReaderMode.jsm");

in there somewhere; this bug spanned the move into toolkit, I think. Will test and reland.

(No, I don't know why I'm working at 6:30am on a Saturday.)
Two small errors; retrospective review if you please, Margaret.

https://hg.mozilla.org/integration/fx-team/rev/9acd85c8081c

Failing test now passes locally for me.
Flags: needinfo?(margaret.leibovic)
(Assignee)

Comment 37

4 years ago
Sorry about that Ryan, I had a green try run before, but as rnewman pointed out, the tree has changed since I made that try run.

rnewman, you didn't need to add the Cu and the import in there, since Reader.js is loaded as a subscript, and we already have that import in browser.js:
http://mxr.mozilla.org/mozilla-central/source/mobile/android/chrome/content/browser.js#112

It looks like what actually happened is that I did a bad rebase :/
https://hg.mozilla.org/integration/fx-team/rev/ec992bcb95a6#l2.79

I can clean up the extra imports later, since Reader.js is going to see more changes soon as I work to make about:reader work on desktop.
Flags: needinfo?(margaret.leibovic)
(In reply to :Margaret Leibovic from comment #37)

> It looks like what actually happened is that I did a bad rebase :/
> https://hg.mozilla.org/integration/fx-team/rev/ec992bcb95a6#l2.79

Yeah, that was the second thing I fixed -- didn't double-check that the first thing I 'fixed' wasn't necessary after doing the second :D

> I can clean up the extra imports later, since Reader.js is going to see more
> changes soon as I work to make about:reader work on desktop.

Thanks!
Depends on: 1128523
(Assignee)

Comment 40

3 years ago
Comment on attachment 8534050 [details]
MozReview Request: bz://1093172/margaret
Attachment #8534050 - Attachment is obsolete: true
Attachment #8618540 - Flags: review+
Attachment #8618541 - Flags: review+
Attachment #8618542 - Flags: review+
(Assignee)

Comment 41

3 years ago
Created attachment 8618540 [details]
MozReview Request: Bug 1093172 - Add test to check default status of inserted items. r=rnewman
(Assignee)

Comment 42

3 years ago
Created attachment 8618541 [details]
MozReview Request: Bug 1093172 - Add CONTENT_STATUS column to reading list table. r=rnewman
(Assignee)

Comment 43

3 years ago
Created attachment 8618542 [details]
MozReview Request: Bug 1093172 - Add status when adding reading list item from Reader.js. r=rnewman
You need to log in before you can comment on or make changes to this bug.