Closed Bug 1396282 Opened 2 years ago Closed 2 years ago

Add query for getting Highlights (recent bookmarks and recent history with metadata)

Categories

(Firefox :: New Tab Page, enhancement)

enhancement
Not set

Tracking

()

RESOLVED FIXED
Firefox 57
Tracking Status
firefox57 --- fixed

People

(Reporter: Mardak, Assigned: Mardak)

References

Details

Attachments

(1 file)

This query will be used by Activity Stream Highlights section to be implemented https://github.com/mozilla/activity-stream/issues/3147

The desired functionality there is to show up to 9 highlights with the most recent bookmarks from the last 5 days followed by recent history that has metadata from bug 1393924.

mak, I see there's `Bookmarks.getRecent` that calls `fetchRecentBookmarks`:
https://searchfox.org/mozilla-central/source/toolkit/components/places/Bookmarks.jsm#1684-1712

We could just call that to get 9 bookmarks and filter out those that are older than 5 days, but it looks like that query isn't optimized:
> 0|0|0|SCAN TABLE moz_bookmarks AS b
> 0|1|1|SEARCH TABLE moz_bookmarks AS p USING INTEGER PRIMARY KEY (rowid=?)
> 0|2|2|SEARCH TABLE moz_places AS h USING INTEGER PRIMARY KEY (rowid=?)
> 0|0|0|USE TEMP B-TREE FOR ORDER BY

It looks like it might just want an index on dateAdded for moz_bookmarks? But we might put this behind a cache/timer anyway (top sites only updates once every 15 minutes unless there's deletions), so performance might not be totally critical. And we'll use notifications to detect added bookmarks.

Activity stream would still need the metadata for those bookmarks if available, so that will require re-querying or having a dedicated query for activity stream in the first place.

So, does it make sense to just have a custom query somewhere in NewTabUtils's ActivityStreamLinks/ActivityStreamProvider to get bookmarks and history (by last_visit_date) with `description` and `preview_image_url`?
Flags: needinfo?(mak77)
Blocks: 1394533
Depends on: 1389125
mak, the bookmarks query isn't using all indices because there is none for dateAdded, so it behaves:

0|0|2|SEARCH TABLE moz_places USING INDEX moz_places_frecencyindex (frecency>?)
0|1|0|SEARCH TABLE moz_bookmarks AS b USING INDEX moz_bookmarks_itemlastmodifiedindex (fk=?)
0|2|1|SEARCH TABLE moz_bookmarks AS p USING INTEGER PRIMARY KEY (rowid=?)
0|0|0|USE TEMP B-TREE FOR ORDER BY

If we create an index, it would behave as below:
CREATE INDEX moz_bookmarks_dateAdded ON moz_bookmarks (dateAdded);

0|0|0|SEARCH TABLE moz_bookmarks AS b USING INDEX moz_bookmarks_dateAdded (dateAdded>?)
0|1|1|SEARCH TABLE moz_bookmarks AS p USING INTEGER PRIMARY KEY (rowid=?)
0|2|2|SEARCH TABLE moz_places USING INTEGER PRIMARY KEY (rowid=?)


The history query would normally use frecency without the "unnecessary" NB line to get sqlite to use last_visit_date (even though the ORDER BY is with last_visit_date):

0|0|0|SEARCH TABLE moz_places USING INDEX moz_places_lastvisitdateindex (last_visit_date>?)
0|1|1|SEARCH TABLE moz_bookmarks AS b USING INDEX moz_bookmarks_itemlastmodifiedindex (fk=?)
0|2|2|SEARCH TABLE moz_bookmarks AS p USING INTEGER PRIMARY KEY (rowid=?)
Note, this builds on top of bug 1389125 that pulls out some shared queries parts. The full queries look like:

Recent Bookmarks:

      SELECT
        b.guid AS bookmarkGuid,
        description,
        moz_places.guid,
        preview_image_url,
        b.title,
        url
      FROM moz_bookmarks b
      JOIN moz_bookmarks p
        ON p.id = b.parent
      JOIN moz_places 
        ON moz_places.id = b.fk
      WHERE b.dateAdded >= :dateAddedThreshold
        AND b.title NOT NULL
        AND frecency >= :frecencyThreshold
        AND hidden = 0
        AND (url_hash BETWEEN hash("https", "prefix_lo") AND hash("https", "prefix_hi")
          OR url_hash BETWEEN hash("http",  "prefix_lo") AND hash("http",  "prefix_hi"))
        AND (b.type   IS NULL OR b.type = :bookmarkType)
        AND (p.parent IS NULL OR p.parent <> :tagsFolderId)
      ORDER BY b.dateAdded DESC
      LIMIT :limit


Recent History:

      SELECT
        b.guid AS bookmarkGuid,
        description,
        moz_places.guid,
        preview_image_url,
        moz_places.title,
        url
      FROM moz_places
      LEFT JOIN moz_bookmarks b
        ON b.fk = moz_places.id
      LEFT JOIN moz_bookmarks p
        ON p.id = b.parent
      WHERE last_visit_date > 0 /* NB: Use this index instead of frecency's */
        AND b.id IS NULL
        AND description NOT NULL
        AND preview_image_url NOT NULL
        AND frecency >= :frecencyThreshold
        AND hidden = 0
        AND (url_hash BETWEEN hash("https", "prefix_lo") AND hash("https", "prefix_hi")
          OR url_hash BETWEEN hash("http",  "prefix_lo") AND hash("http",  "prefix_hi"))
        AND (b.type   IS NULL OR b.type = :bookmarkType)
        AND (p.parent IS NULL OR p.parent <> :tagsFolderId)
      ORDER BY last_visit_date DESC
      LIMIT :limit
r1cky, do you know what fields/columns you'll want from the Highlights query. Notably I did not include frecency or lastVisitDate from the TopSites query. It also doesn't include a "bookmark" or "history" type as that can be inferred from having a bookmarkGuid or not. Looks like the UI doesn't show any times.
Flags: needinfo?(rrosario)
I'll update the patch with the dateAdded index. From IRC:

> it depends on how often this runs, in general we don't have lots of bookmarks and dateAdded is an integer, so I'd not be too scary of indexing it
> Anyway I expect even adding thousands of bookmarks won't get much worse
> At a maximum I'd try to get a guess of how much a db may grow
> with say 1000 bookmarks, each date is going to take 8bytes, I suppose
> shouldn't be too bad :)
(In reply to Ed Lee :Mardak from comment #4)
> r1cky, do you know what fields/columns you'll want from the Highlights
> query. Notably I did not include frecency or lastVisitDate from the TopSites
> query. It also doesn't include a "bookmark" or "history" type as that can be
> inferred from having a bookmarkGuid or not. Looks like the UI doesn't show
> any times.

Sounds good. The columns you are selecting should be enough to implement the UI as designed.
Flags: needinfo?(rrosario)
Assignee: nobody → edilee
Status: NEW → ASSIGNED
Flags: needinfo?(mak77)
(In reply to Ed Lee :Mardak from comment #3)
>         AND b.title NOT NULL

Is there a reason to exclude bookmarks without a title, rather than using the url as title? It may exclude more than valid bookmarks, even if I understand they may not look "nice" in the UI.

>         AND frecency >= :frecencyThreshold

Due to decay, I think that frecency thresholds don't work very well, because if you take one user that uses the browser every day (decays every day) vs one that uses the browser every 7 days, they will have their numbers shifted and thus it would require a personalize threshold for each user. It will surely exclude something, it could just be too strict or too wide, it's a common problem we bring on.

>         AND hidden = 0

Since you're limiting your search to bookmarks with a certain scheme, I doubt this is necessary at all.

>         AND (b.type   IS NULL OR b.type = :bookmarkType)
>         AND (p.parent IS NULL OR p.parent <> :tagsFolderId)

I'm not sure why there are these NULL checks? once you limit the query to b.type = TYPE_BOOKMARK you should be fine for your needs?
As well as parent, we should really never have a null parent.

> Recent History:
> 
>       SELECT
>         b.guid AS bookmarkGuid,

I think extracting this with a subselect would be more efficient, since you care about history here, you should not care about excluding tags, thus you can save 2 JOINS that are more expensive. The subselects will only run on the few items you care about, rather than all of them.

>       WHERE last_visit_date > 0 /* NB: Use this index instead of frecency's
> */

last_visit_date NOTNULL is more correct
Note that you can give hints to the query optimizer by adding a + in front of things you don't want to be indexed.
Btw, considered you are ordering and conditioning on it, it's high likely the index will be used.

>         AND b.id IS NULL

Why do you want to exclude recently visited bookmarks?

>         AND (b.type   IS NULL OR b.type = :bookmarkType)
>         AND (p.parent IS NULL OR p.parent <> :tagsFolderId)

I think these are pointless here
Comment on attachment 8904054 [details]
Bug 1396282 - Add query for getting Highlights (recent bookmarks and recent history with metadata).

https://reviewboard.mozilla.org/r/175792/#review181294

Clearing because of the previous comment about the queries, the approach looks ok.

::: toolkit/components/places/tests/migration/xpcshell.ini:27
(Diff revision 2)
>    places_v34.sqlite
>    places_v35.sqlite
>    places_v36.sqlite
>    places_v37.sqlite
>    places_v38.sqlite
> +  places_v39.sqlite

Just to be sure, this file should have been VACUUMed (so it should be a bit more than 1MB)

::: toolkit/modules/tests/xpcshell/test_NewTabUtils.js:380
(Diff revision 2)
> +  Assert.equal(links.length, 0, "empty history yields empty links");
> +
> +  // Add bookmarks
> +  let bookmarks = [
> +    {url: "https://mozilla1.com/0", title: "foo", parentGuid: PlacesUtils.bookmarks.unfiledGuid, type: PlacesUtils.bookmarks.TYPE_BOOKMARK},
> +    {url: "https://mozilla1.com/1", title: "foo", parentGuid: PlacesUtils.bookmarks.unfiledGuid, type: PlacesUtils.bookmarks.TYPE_BOOKMARK}

type is optional and can be omitted when url is provided.

::: toolkit/modules/tests/xpcshell/test_NewTabUtils.js:438
(Diff revision 2)
> +  links = await getHighlights({excludeBookmarks: true, excludeHistory: true});
> +  Assert.equal(links.length, 0, "requested nothing, get nothing");
> +
> +  // Add a visit to the older bookmark
> +  await PlacesTestUtils.addVisits(bookmarks[0].url);
> +  links = await getHighlights({bookmarkFrecency: 150});

Please check out for intermittents, frecency is updates asynchronously and you're using the read-only concurrent connection, that is faster but may read from an outdated db snapshot.
I'd suggest to run the test multiple times on Try. In case you could use something like waitForCondition.
Attachment #8904054 - Flags: review?(mak77)
Sounds like generally, I'll avoid the WHERE sharing to better optimize the queries.

(In reply to Marco Bonardo [::mak] from comment #8)
> (In reply to Ed Lee :Mardak from comment #3)
> > Recent Bookmarks:
> >         AND b.title NOT NULL
> Is there a reason to exclude bookmarks without a title, rather than using
> the url as title?
UX confirmed skipping bookmarks without title.

> >         AND frecency >= :frecencyThreshold
> Due to decay, I think that frecency thresholds don't work very well
On one hand, we're looking for recent stuff anyway. But sounds like we can just be more explicit on excluding unvisited bookmarks, and not do frecency stuff at all.

> >         AND hidden = 0
> Since you're limiting your search to bookmarks with a certain scheme, I
> doubt this is necessary at all.
This was part of the common WHERE. Although looking at my own moz_places (which maybe is a good ? or bad ? test case):

url: https://smile.amazon.com/exec/obidos/external-search/?field-keywords=%s&mode=blended&tag=mozilla-20&sourceid=Mozilla-search
title:
visit_count: 0
hidden: 1
typed: 0
frecency: 20
last_visit_date: 
foreign_count: 2

url: https://maps.google.com/
title:
visit_count: 2
hidden: 1
typed: 1
frecency: 49
last_visit_date: 1496695126209519
foreign_count: 3

Should those be excluded with "AND hidden = 0" … ? dunno… I suppose if someone just bookmarked a page that moz_places some reason considered hidden, maybe just show it?

> >         AND (b.type   IS NULL OR b.type = :bookmarkType)
> >         AND (p.parent IS NULL OR p.parent <> :tagsFolderId)
> I'm not sure why there are these NULL checks?
Part of the common WHERE again. I'll just remove.

> > Recent History:
> >       SELECT
> >         b.guid AS bookmarkGuid,
> I think extracting this with a subselect would be more efficient, since you
> care about history here, you should not care about excluding tags
Depending on if we allow bookmarks or not from this query, we'll want the correct bookmark guid to remove the bookmark from the Highlight card's context menu. A subselect could work (which would require selecting one bookmark guid of there's multiple bookmarks for the same url…)

> >       WHERE last_visit_date > 0 /* NB: Use this index instead of frecency's
> last_visit_date NOTNULL is more correct
> Note that you can give hints to the query optimizer by adding a + in front
> of things you don't want to be indexed.
Good to know. I'll play around with that.

> Btw, considered you are ordering and conditioning on it, it's high likely
> the index will be used.
As in the output of EXPLAIN QUERY PLAN is not accurate and something different will happen at run/query-time?

> >         AND b.id IS NULL
> Why do you want to exclude recently visited bookmarks?
This was partially to avoid duplicates between recent bookmarks and recent history. Also, it was unclear if the user excludes Bookmarks, should History also exclude Bookmarks? But at least for "show both bookmarks and history" case, UX says an old bookmarked page can be recently visited and should be shown. This means we'll need to fetch up to the original amount of `numItems` because potentially each history item could be a dupe of a recent bookmark.

> >         AND (b.type   IS NULL OR b.type = :bookmarkType)
> >         AND (p.parent IS NULL OR p.parent <> :tagsFolderId)
> I think these are pointless here
Okay. I might just need to move these to the subselect to pick out the correct bookmarkGuid.

(In reply to Marco Bonardo [::mak] from comment #9)
> > +  places_v39.sqlite
> Just to be sure, this file should have been VACUUMed (so it should be a bit
> more than 1MB)
Okay.

> > +    {url: "https://mozilla1.com/1", title: "foo", parentGuid: PlacesUtils.bookmarks.unfiledGuid, type: PlacesUtils.bookmarks.TYPE_BOOKMARK}
> type is optional and can be omitted when url is provided.
Okay.

> Please check out for intermittents, frecency is updates asynchronously and
Looks like try decided to run more times than I expected :p
https://treeherder.mozilla.org/#/jobs?repo=try&revision=3a44064f2fe7&group_state=expanded&filter-searchStr=(x
Updated query for Recent Bookmarks:

      SELECT
        b.guid AS bookmarkGuid,
        description,
        moz_places.guid,
        preview_image_url,
        b.title, 
        url
      FROM moz_bookmarks b
      JOIN moz_bookmarks p
        ON p.id = b.parent
      JOIN moz_places
        ON moz_places.id = b.fk
      WHERE b.dateAdded >= :dateAddedThreshold
        AND b.title NOTNULL
        AND b.type = :bookmarkType
        AND p.parent <> :tagsFolderId
        AND hidden = 0
        AND last_visit_date NOTNULL
        AND (url_hash BETWEEN hash("https", "prefix_lo") AND hash("https", "prefix_hi")
          OR url_hash BETWEEN hash("http",  "prefix_lo") AND hash("http",  "prefix_hi"))
      ORDER BY b.dateAdded DESC
      LIMIT :limit

0|0|0|SEARCH TABLE moz_bookmarks AS b USING INDEX moz_bookmarks_dateaddedindex (dateAdded>?)
0|1|1|SEARCH TABLE moz_bookmarks AS p USING INTEGER PRIMARY KEY (rowid=?)
0|2|2|SEARCH TABLE moz_places USING INTEGER PRIMARY KEY (rowid=?)


Updated query for Recent History:

      SELECT
        (
          SELECT guid
          FROM moz_bookmarks b
          WHERE fk = moz_places.id
            AND type = :bookmarkType
            AND (
              SELECT id
              FROM moz_bookmarks p
              WHERE p.id = b.parent
                AND p.parent <> :tagsFolderId
          ) NOTNULL
        ) AS bookmarkGuid,
        description,
        guid,
        preview_image_url,
        title,
        url
      FROM moz_places
      WHERE description NOTNULL
        AND preview_image_url NOTNULL
        AND hidden = 0
        AND last_visit_date NOTNULL
        AND (url_hash BETWEEN hash("https", "prefix_lo") AND hash("https", "prefix_hi")
          OR url_hash BETWEEN hash("http",  "prefix_lo") AND hash("http",  "prefix_hi"))
      ORDER BY last_visit_date DESC
      LIMIT :limit

0|0|0|SCAN TABLE moz_places USING INDEX moz_places_lastvisitdateindex
0|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 1
1|0|0|SEARCH TABLE moz_bookmarks AS b USING INDEX moz_bookmarks_itemindex (fk=? AND type=?)
1|0|0|EXECUTE CORRELATED SCALAR SUBQUERY 2
2|0|0|SEARCH TABLE moz_bookmarks AS p USING INTEGER PRIMARY KEY (rowid=?)

Of note, the first line says SCAN … USING INDEX instead of SEARCH. (Should be okay as it's using an index?) If it was `last_visit_date > 0` it would be `SEARCH TABLE moz_places USING INDEX moz_places_lastvisitdateindex (last_visit_date>?)`
We played around with measuring the performance of SCAN … USING INDEX (last_visit_date NOTNULL) vs SEARCH … USING INDEX (last_visit_date > 0), and the performance seems comparable to each other and slightly faster with SEARCH:

$ time sqlite3 places.sqlite "SELECT 1 FROM moz_places WHERE last_visit_date NOTNULL ORDER BY last_visit_date DESC LIMIT 1"
real	0m0.009s
user	0m0.003s
sys	0m0.003s

$ time sqlite3 places.sqlite "SELECT 1 FROM moz_places WHERE last_visit_date > 0 ORDER BY last_visit_date DESC LIMIT 1"
real	0m0.009s
user	0m0.003s
sys	0m0.003s

This is on my main profile with ~120k moz_places rows.

And checking the EXPLAIN output for opcodes shows SEARCH has 1 fewer opcode in its loop (IdxLE vs Column+IsNull):

sqlite> explain SELECT 1 FROM moz_places WHERE last_visit_date NOTNULL ORDER BY last_visit_date DESC LIMIT 1;
addr  opcode         p1    p2    p3    p4             p5  comment      
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     13    0                    00               
1     Noop           1     3     0                    00               
2     Integer        1     1     0                    00               
3     OpenRead       2     25    0     k(2,,)         00               
4     Last           2     11    2     0              00               
5       Column         2     0     2                    00               
6       IsNull         2     10    0                    00               
7       Integer        1     3     0                    00               
8       ResultRow      3     1     0                    00               
9       DecrJumpZero   1     11    0                    00               
10    Prev           2     5     0                    01               
11    Close          2     0     0                    00               
12    Halt           0     0     0                    00               
13    Transaction    0     0     126   0              01               
14    TableLock      0     2     0     moz_places     00               
15    Goto           0     1     0                    00               

sqlite> explain SELECT 1 FROM moz_places WHERE last_visit_date > 0 ORDER BY last_visit_date DESC LIMIT 1;
addr  opcode         p1    p2    p3    p4             p5  comment      
----  -------------  ----  ----  ----  -------------  --  -------------
0     Init           0     13    0                    00               
1     Noop           1     3     0                    00               
2     Integer        1     1     0                    00               
3     OpenRead       2     25    0     k(2,,)         00               
4     Last           2     11    2     0              00               
5     Integer        0     2     0                    00               
6       IdxLE          2     11    2     1              00               
7       Integer        1     3     0                    00               
8       ResultRow      3     1     0                    00               
9       DecrJumpZero   1     11    0                    00               
10    Prev           2     6     0                    00               
11    Close          2     0     0                    00               
12    Halt           0     0     0                    00               
13    Transaction    0     0     126   0              01               
14    TableLock      0     2     0     moz_places     00               
15    Goto           0     1     0                    00               

So this seems to point to using "> 0" instead of "NOTNULL"?
(In reply to Ed Lee :Mardak from comment #12)
> So this seems to point to using "> 0" instead of "NOTNULL"?
Better than static analysis.. actual runtime results! I copied over my usual profile with ~120k moz_places and ran a number of query variations 100x each and averaging:
- sub SELECT bookmarkGuid vs LEFT JOIN moz_bookmarks for guid
- last_visit_date > 0 vs last_visit_date NOTNULL
- LIMIT 1 vs LIMIT 10 (just to compare short circuit behavior)

Here are the results where the 3 columns are for 1 page vs 153 pages vs 1673 pages with description/preview_image_url (I added desc/image to pages from one site that I visit daily):

sub SELECT > 0 LIMIT 1 :  5.13ms vs  5.13ms vs  0.56ms
sub SELECT > 0 LIMIT 10: 63.37ms vs 21.45ms vs  2.31ms
sub SELECT NOTNULL L 1 : 15.39ms vs 15.47ms vs 16.91ms
sub SELECT NOTNULL L 10: 63.68ms vs 21.92ms vs  2.22ms
LEFT JOIN  > 0 LIMIT 1 :  5.35ms vs  5.22ms vs  0.54ms
LEFT JOIN  > 0 LIMIT 10: 63.80ms vs 21.52ms vs  2.23ms
LEFT JOIN  NOTNULL L 1 :  5.28ms vs  5.13ms vs  0.57ms
LEFT JOIN  NOTNULL L 10: 65.78ms vs 21.83ms vs  2.24ms

The sub SELECT query looks to be the winner when compared against LEFT JOIN.
Conditioning on last_visit_date > 0 looks to be the winner when compared against last_visit_date NOTNULL.

Sqlite is indeed short circuiting when it reaches its LIMIT… most of the time. In particular sub SELECT NOTNULL LIMIT 1 *increases* in time while LIMIT 10 correctly decreases.. maybe sqlite is getting confused and decides to run the sub SELECT on all pages with descriptions only later deciding to sort and limit to best 1.

In the common case going forwards, I would expect moz_places to be relatively dense with descriptions as opposed to having just a few pages with descriptions, so rarely should we be hitting the first column "1 page with description" case resulting in a full scan.


Code for sub select:

(async() => {
  const RUNS = 100;
  let total = 0;
  for (let i = 0; i < RUNS; i++) {
    const start = Date.now();
    await (await PlacesUtils.promiseDBConnection()).execute(`
      SELECT
        (
          SELECT guid
          FROM moz_bookmarks b
          WHERE fk = moz_places.id
            AND type = 1
            AND (
              SELECT id
              FROM moz_bookmarks p
              WHERE p.id = b.parent
                AND p.parent <> 4
              ) NOTNULL
        ) AS bookmarkGuid,
        description,
        guid,
        preview_image_url,
        title,
        url
      FROM moz_places
      WHERE description NOTNULL
        AND preview_image_url NOTNULL
        AND hidden = 0
        AND last_visit_date > 0
      ORDER BY last_visit_date DESC
      LIMIT 10`);
    total += Date.now() - start;
  }
  console.log(total / RUNS);
})()

Query for JOIN:

(async() => {
  const RUNS = 100;
  let total = 0;
  for (let i = 0; i < RUNS; i++) {
    const start = Date.now();
    await (await PlacesUtils.promiseDBConnection()).execute(`
      SELECT
        b.guid AS bookmarkGuid,
        description,
        moz_places.guid,
        preview_image_url,
        moz_places.title,
        url
      FROM moz_places
      LEFT JOIN moz_bookmarks b
        ON b.fk = moz_places.id
      LEFT JOIN moz_bookmarks p
        ON p.id = b.parent
      WHERE description NOTNULL
        AND preview_image_url NOTNULL
        AND (b.type ISNULL OR b.type = 1)
        AND (p.parent ISNULL OR p.parent <> 4)
        AND hidden = 0
        AND last_visit_date > 0
      ORDER BY last_visit_date DESC
      LIMIT 10`);
    total += Date.now() - start;
  }
  console.log(total / RUNS);
})()
Behavior differences from revision 2 to revision 3:

- uses optimized queries as per comment 13 (last_visit_date > 0 and sub SELECT)
- no longer takes frecency values for Highlights (only for topFrecentSites)
- a "type" is associated to Highlight results based on the query not of the result, e.g., an old bookmark (has bookmarkGuid) recently visited is type "history" and a recently bookmarked page (must be visited) is type "bookmark"
- because "history" can include bookmarks, recently bookmarked pages that are also recently visited are only included once
Bookmarks:
      SELECT
        b.guid AS bookmarkGuid,
        description,
        moz_places.guid,
        preview_image_url,
        b.title,
        url
      FROM moz_bookmarks b
      JOIN moz_bookmarks p
        ON p.id = b.parent
      JOIN moz_places
        ON moz_places.id = b.fk
      WHERE b.dateAdded >= :dateAddedThreshold
        AND b.title NOTNULL
        AND b.type = :bookmarkType
        AND p.parent <> :tagsFolderId
        AND hidden = 0
        AND last_visit_date > 0
        AND (SUBSTR(url, 1, 6) == "https:"
          OR SUBSTR(url, 1, 5) == "http:")
      ORDER BY b.dateAdded DESC
      LIMIT :limit

History:
      SELECT
        (
          SELECT guid
          FROM moz_bookmarks b
          WHERE fk = moz_places.id
            AND type = :bookmarkType
            AND (
              SELECT id
              FROM moz_bookmarks p
              WHERE p.id = b.parent
                AND p.parent <> :tagsFolderId
            ) NOTNULL
        ) AS bookmarkGuid,
        description,
        guid,
        preview_image_url,
        title,
        url
      FROM moz_places
      WHERE description NOTNULL
        AND preview_image_url NOTNULL
        AND hidden = 0
        AND last_visit_date > 0
        AND (SUBSTR(url, 1, 6) == "https:"
          OR SUBSTR(url, 1, 5) == "http:")
      ORDER BY last_visit_date DESC
      LIMIT :limit
Comment on attachment 8904054 [details]
Bug 1396282 - Add query for getting Highlights (recent bookmarks and recent history with metadata).

https://reviewboard.mozilla.org/r/175792/#review181906

::: toolkit/modules/NewTabUtils.jsm:989
(Diff revision 3)
> +
> +    const sqlQuery = `
> +      SELECT
> +        b.guid AS bookmarkGuid,
> +        description,
> +        moz_places.guid,

just to be consistent with the other places queries, please assign the h alias to moz_places and use h.guid, h.id and so on.

::: toolkit/modules/NewTabUtils.jsm:1030
(Diff revision 3)
> +      numItems: ACTIVITY_STREAM_DEFAULT_LIMIT,
> +    }, aOptions || {});
> +
> +    const sqlQuery = `
> +      SELECT
> +        ${this._commonBookmarkGuidSelect},

I was looking at _commonBookmarkGuidSelect in the other patch, did you measure the subselect to be faster than a left join in that case?
Hopefully in 58 we'll have time to remove tags from the bookmarks table.

::: toolkit/modules/NewTabUtils.jsm:1038
(Diff revision 3)
> +        preview_image_url,
> +        title,
> +        url
> +      FROM moz_places
> +      WHERE description NOTNULL
> +        AND preview_image_url NOTNULL

please check in a debug build if any of these queries causes a warning for missing indices, and eventually add to them the /* do not warn comment
Attachment #8904054 - Flags: review?(mak77) → review+
Comment on attachment 8904054 [details]
Bug 1396282 - Add query for getting Highlights (recent bookmarks and recent history with metadata).

https://reviewboard.mozilla.org/r/175792/#review181906

> just to be consistent with the other places queries, please assign the h alias to moz_places and use h.guid, h.id and so on.

Fixed here and in the refactoring.

> I was looking at _commonBookmarkGuidSelect in the other patch, did you measure the subselect to be faster than a left join in that case?
> Hopefully in 58 we'll have time to remove tags from the bookmarks table.

Yup. After 50,000 runs, on average, subselect was 14 microseconds faster.

> please check in a debug build if any of these queries causes a warning for missing indices, and eventually add to them the /* do not warn comment

I tried the debug build, and it didn't warn. (But it seems like it didn't warn for any bad ORDER BY either…) But manual verification of explain query plan shows INDEX. Interestingly, the LEFT JOIN uses moz_bookmarks_itemlastmodifiedindex (fk=?) while subselect uses moz_bookmarks_itemindex (fk=? AND type=?)
Pushed by edilee@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/fc91951971ec
Add query for getting Highlights (recent bookmarks and recent history with metadata). r=mak
https://hg.mozilla.org/mozilla-central/rev/fc91951971ec
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 57
Component: Activity Streams: Newtab → New Tab Page
You need to log in before you can comment on or make changes to this bug.