Closed Bug 1350377 Opened 7 years ago Closed 7 years ago

Implement History.fetch

Categories

(Toolkit :: Places, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: mak, Assigned: milindl)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

fetch should probably replace promisePlaceInfo and getPlacesInfo (or wrap this one if that can be more efficient).

it should be able to retrieve visits data, the referrer, frecency. Though data that requires a join with history may want a param or an alternative API so we don't pay useless costs.
Blocks: 1354531
I am working on this, I need some clarification. I am following the signature given in History.jsm::fetch and PlacesUtils.jsm::promisePlaceInfo, which makes `fetch` a method which takes a single uri or guid and returns a PageInfo. But the History.cpp GetPlacesInfo takes gets info for an array of the same. Please tell me if I am using the correct option.

Thanks!
It doesn't really depend much on the current signatures, more on the usage of the current APIs.
The new fetch() APIs will replace are nsINavHistoryService::GetPageTitle and promisePlaceInfo. The only production use of getPlacesInfo is promisePlaceInfo. There is no usage of the array API, and it may be trivial to add it in the future.
Also GetPlacesInfo internally is just running a loop, so it doesn't make much sense as-is.

To sum up, I think for now fetch should just fetch info about a single entry. In the future it should be trivial to make it accept an array, but we don't have a use-case for now.

Additionally, I think we should remove GetPlacesInfo completely. The only advantage of it is that it is reusing FetchPageInfo, but the overhead to cross XPCOM and convert arguments from jsval, plus the upside of making easy for contributors to touch this code (in Jistory.jsm instead of a complex cpp file) makes that more expensive than writing new js code. By removing the external API we could be able to specialize more the internal History.cpp implementation.

Thus, rather than doing what we did for updatePlaces, where insert and insertMany are barely wrappers, in this case I'd prefer a complete js rewrite of fetch() and the removal of the old GetPlacesInfo.  promisePlaceInfo should be rewritten as a wrapper around history.fetch and it should use Deprecated.jsm to issue a deprecation warning (there are a couple 30k+ users add-ons that I don't want to break yet). The removal of GetPageTitle will instead happen in bug 885247 since it requires deeper changes to the UI code.
The existing tests for GetPlacesInfo can be ported to a new test for fetch.

I hope this covers most of the design, if you have further questions feel free to ask them.
Assignee: nobody → i.milind.luthra
Status: NEW → ASSIGNED
I've come up with a new implementation. I will follow this up with a commit for the tests if this is looking OK. 

Also, if you have any ideas on how to get the visits in a better way than using `group_concat` followed by a javascript split(","), please tell me (I used group concat since I wanted to minimize my sql queries).

Thanks! :)
Attachment #8867713 - Flags: review?(mak77)
Comment on attachment 8867713 [details]
Bug 1350377 - Implement `History.fetch` and accordingly change `promisePlaceInfo`,

https://reviewboard.mozilla.org/r/139286/#review143422

I think we should also (maybe in a separate changeset):
* Remove GetPlacesInfo from mozIAsyncHistory.idl
* Remove its implementation from History.cpp (History::GetPlacesInfo and class GetPlaceInfo)
* Remove toolkit/components/places/tests/unit/test_getPlacesInfo.js (Also from xpcshell.ini) and port any tests that may be relevant from it to the new test for fetch
* Update the comment here: http://searchfox.org/mozilla-central/rev/ae24a3c83d22e0e35aecfd9049c2b463ca7e045b/toolkit/components/places/nsINavHistoryService.idl#1319 to point to PlacesUtils.history.fetch
* Replace this call http://searchfox.org/mozilla-central/rev/9a7fbdee1d54f99cd548af95b81231d80e5f9ad1/browser/components/migration/tests/marionette/test_refresh_firefox.py#175 with a call to fetch
* Replace this http://searchfox.org/mozilla-central/rev/9a7fbdee1d54f99cd548af95b81231d80e5f9ad1/browser/base/content/browser-places.js#485
* Replace this http://searchfox.org/mozilla-central/rev/9a7fbdee1d54f99cd548af95b81231d80e5f9ad1/browser/components/places/content/bookmarkProperties.js#663
* Replace this http://searchfox.org/mozilla-central/rev/9a7fbdee1d54f99cd548af95b81231d80e5f9ad1/toolkit/components/places/tests/unit/test_history.js#151

::: toolkit/components/places/History.jsm:117
(Diff revision 1)
>  
>  this.History = Object.freeze({
>    /**
>     * Fetch the available information for one page.
>     *
>     * @param guidOrURI: (URL or nsIURI)

Should accept either URL, nsIURI or href. I think normalizeToURLOrGUID is already doing the right thing here.

::: toolkit/components/places/History.jsm:121
(Diff revision 1)
>     *
>     * @param guidOrURI: (URL or nsIURI)
>     *      The full URI of the page.
>     *            or (string)
>     *      Either the full URI of the page or the GUID of the page.
> +   * @param getVisits: (boolean) (optional, default = false)

let's instead accept an optional options object, and one of the options should be (bool) includeVisits
so we can add more options in the future.

::: toolkit/components/places/History.jsm:137
(Diff revision 1)
>     *      that may be parsed neither as a valid URL nor as a valid GUID.
>     */
> -  fetch(guidOrURI) {
> -    throw new Error("Method not implemented");
> +  fetch(guidOrURI, getVisits = false) {
> +    // First, normalize to guid or string, and throw if not possible
> +    guidOrURI = PlacesUtils.normalizeToURLOrGUID(guidOrURI);
> +    return PlacesUtils.withConnectionWrapper(

let's use promiseDBConnection here instead. The advantage is that promiseDBConnection is concurrent, so it can read even if something else is writing. For history it's less critical to have properly read/write serialization but more critical to have better perf. So I think we should try with the R/O connection first.

::: toolkit/components/places/History.jsm:836
(Diff revision 1)
> +// Inner implementation of History.fetch
> +var fetch = async function(db, guidOrURL, getVisits) {
> +  let selectionQueryFragment = "";
> +  let params = {};
> +  if (typeof guidOrURL !== "string") {
> +    // guidOrURL is a URL

instanceof URL would be more readable and avoid the comment since it's self explanatory.

::: toolkit/components/places/History.jsm:845
(Diff revision 1)
> +    // guidOrURL is a guid
> +    selectionQueryFragment = "WHERE guid = :guid";
> +    params.guid = guidOrURL;
> +  }
> +
> +  // Now, we need to add a join if we need to get visits also

very nit: per coding style we usually end comments with a period. not a big deal, mostly for consistency.

::: toolkit/components/places/History.jsm:851
(Diff revision 1)
> +  let concatQueryFragment = "";
> +  let optionalJoin = "";
> +  let optionalJoinWhereClause = "";
> +  if (getVisits) {
> +    concatQueryFragment = `, group_concat(v.visit_date, ",") AS visit_date
> +                           , group_concat(v.from_visit, ",") AS from_visit

There is a problem with this, the problem is that we are trying to remove visit ids from APIs since it's an internal implementation detail.
We should replace from_visit with the URL of the originating visit...
That requires a further JOIN or subquery and makes all of this quite expensive.

So, considering none of the current consumers does anything with visits, for now I'd suggest to either:
* leave the visits implementation as a TODO for a future bug
* for now just return date and transition and leave referred as a TODO

::: toolkit/components/places/History.jsm:852
(Diff revision 1)
> +  let optionalJoin = "";
> +  let optionalJoinWhereClause = "";
> +  if (getVisits) {
> +    concatQueryFragment = `, group_concat(v.visit_date, ",") AS visit_date
> +                           , group_concat(v.from_visit, ",") AS from_visit
> +                           , group_concat(v.visit_type, ",") AS visit_type`;

The only risk with group_concat is that is skips NULLs, and from_visit is not defined NOTNULL (even if we always write 0), so there is a small risk of wrong assignment if something would ever write nulls there.

The alternative is to not group the join and handle all the results in a for loop. The first iteration would fill all the page values, next ones just append to visits. If there's no join you just iterate once so it should do.

::: toolkit/components/places/History.jsm:867
(Diff revision 1)
> +  await db.executeCached(
> +    query,
> +    params,
> +    row => {
> +      // There should be atmost 1 row for any place, so we just process one
> +      if (pageInfo !== {}) {

I don't think this works ({} !== {}) => true

::: toolkit/components/places/History.jsm:872
(Diff revision 1)
> +      if (pageInfo !== {}) {
> +        return;
> +      }
> +      pageInfo = {
> +        guid: row.getResultByName("guid"),
> +        url: row.getResultByName("url"),

should be a URL

::: toolkit/components/places/History.jsm:874
(Diff revision 1)
> +      }
> +      pageInfo = {
> +        guid: row.getResultByName("guid"),
> +        url: row.getResultByName("url"),
> +        frecency: row.getResultByName("frecency"),
> +        title: row.getResultByName("title")

could be null, should add `|| ""`;

::: toolkit/components/places/History.jsm:898
(Diff revision 1)
> +      }
> +    });
> +  if (pageInfo === {}) {
> +    pageInfo = null;
> +  }
> +  return Promise.resolve(pageInfo);

Why the Promise.resolve here?
Attachment #8867713 - Flags: review?(mak77) → review-
> The only risk with group_concat is that is skips NULLs, and from_visit is not defined NOTNULL (even if we always write 0), so > there is a small risk of wrong assignment if something would ever write nulls there.
> 
> The alternative is to not group the join and handle all the results in a for loop. The first iteration would fill all the 
> page values, next ones just append to visits. If there's no join you just iterate once so it should do.

I am going with removing from_visit from my implementation and keeping the other two, so I'll keep the group_concat query. However neither visit_date nor visit_type are marked NOTNULL - are there instances where they might be NULL?

I switched to a single statement in the first place since I wanted to minimize sql queries, if the cost of a query is low enough I will be ok with a loop as well (I'm unsure how to test this however).

I've been working on a preliminary test toay, I'll do the other things and the final test in the next changeset.

Thanks!
(In reply to Milind L (:milindl) from comment #7)
> I am going with removing from_visit from my implementation and keeping the
> other two, so I'll keep the group_concat query. However neither visit_date
> nor visit_type are marked NOTNULL - are there instances where they might be
> NULL?

Not in Firefox code, but add-ons could do whatever, yet.

> I switched to a single statement in the first place since I wanted to
> minimize sql queries, if the cost of a query is low enough I will be ok with
> a loop as well (I'm unsure how to test this however).

The cost doing group_concat or a for loop shouldn't differ much, it's still running a single query. The only overhead is building an array of results, but you are paying a cost for split, so I don't expect much difference globally.
So yeah, I'd prefer to loop than the group_concat, since we don't have schema protection against NULL insertions.
I've updated it to get rid of group concats now. I am using the callback to collect visitsInfos in an array, and appending to the pageInfo object later if required. 

I had some confusion regarding this till you said that the #queries remains the same (I was thinking that we would remove the join and run multiple queries, which would increase the time taken).

I've also fixed the files using getPlacesInfo or promisePlaceIfo to use fetch.

Thanks!
Comment on attachment 8867713 [details]
Bug 1350377 - Implement `History.fetch` and accordingly change `promisePlaceInfo`,

https://reviewboard.mozilla.org/r/139286/#review144034

::: toolkit/components/places/History.jsm:120
(Diff revision 5)
> -   * @param guidOrURI: (URL or nsIURI)
> +   * @param guidOrURI: (string) or (URL, nsIURI or href)
> -   *      The full URI of the page.
> -   *            or (string)
>     *      Either the full URI of the page or the GUID of the page.
> +   * @param options: (object)
> +   *      This object may contain the following:

please add [optional] to the javadoc and change this text to
"An optional object whose properties describe options:"

::: toolkit/components/places/History.jsm:123
(Diff revision 5)
>     *      Either the full URI of the page or the GUID of the page.
> +   * @param options: (object)
> +   *      This object may contain the following:
> +   *        - `includeVisits` (boolean) set this to true if `visits` in the
> +   *           PageInfo needs to contain VisitInfo. By default, `visits`
> +   *           is undefined inside the returned `PageInfo`.

Let's annotate that for now VisitInfo are missing the referrer property with a TODO reference to a new bug # filed for that.

this comment should also specify the order in which visits are returned.
that also means in the includeVisits case we need to add an ORDER BY clause to the query.
We'll surely order by visit_date, we must just decide the order. I think we should go coherent with the WebExtension API that returns visits in reverse chronological order (from newer to older). This also requires a test.

::: toolkit/components/places/History.jsm:140
(Diff revision 5)
> -    throw new Error("Method not implemented");
> +    // First, normalize to guid or string, and throw if not possible
> +    guidOrURI = PlacesUtils.normalizeToURLOrGUID(guidOrURI);
> +
> +    // See if options exists and make sense
> +    if (typeof options !== "object") {
> +      throw new TypeError("options should be an object");

the only problem is typeof null. let's change this to if (!options || typeof options != "object")

also please add a test passing null.

::: toolkit/components/places/History.jsm:150
(Diff revision 5)
> +      throw new TypeError("includeVisits should be a boolean if exists");
> +    }
> +
> +    if (!hasIncludeVisits) {
> +      options.includeVisits = false;
> +    }

doesn't seem necessary. does checking "if (options.includeVisits)" warn otherwise?

::: toolkit/components/places/History.jsm:883
(Diff revision 5)
> +      if (options.includeVisits) {
> +        // On every row (not just the first), we need to collect visit data.
> +        let date = PlacesUtils.toDate(row.getResultByName("visit_date"));
> +        let transition = row.getResultByName("visit_type");
> +
> +        // TODO: add referrer URL to the `VisitInfo` data as well.

here as well, please file a bug for this and add the bug # to the comment

::: toolkit/components/places/History.jsm:888
(Diff revision 5)
> +        // TODO: add referrer URL to the `VisitInfo` data as well.
> +        visitInfos.push({ date, transition });
> +      }
> +    });
> +  if (pageInfo && options.includeVisits) {
> +    pageInfo.visits = visitInfos;

you could do this in the loop,
 if (options.includeVisits) {
   if (!("visits" in pageInfo))
     pageInfo.visits = [];
   ...
   pageInfo.visits.push...
Attachment #8867713 - Flags: review?(mak77) → review-
Comment on attachment 8868715 [details]
Bug 1350377 - Remove `getPlacesInfo` and change associated files and tests,

https://reviewboard.mozilla.org/r/140304/#review144042

it looks good overall, good job!
I suggested some additional tests in the other review, and a few details here.

::: browser/components/migration/tests/marionette/test_refresh_firefox.py:174
(Diff revision 2)
>          """, script_args=[self._bookmarkURL])
>          self.assertEqual(titleInBookmarks, self._bookmarkText)
>  
>      def checkHistory(self):
> -        historyResults = self.runAsyncCode("""
> +        historyResult = self.runAsyncCode("""
>            let placeInfos = [];

this looks unused now, and could be removed?

::: browser/components/migration/tests/marionette/test_refresh_firefox.py:175
(Diff revision 2)
>          self.assertEqual(titleInBookmarks, self._bookmarkText)
>  
>      def checkHistory(self):
> -        historyResults = self.runAsyncCode("""
> +        historyResult = self.runAsyncCode("""
>            let placeInfos = [];
> -          PlacesUtils.asyncHistory.getPlacesInfo(makeURI(arguments[0]), {
> +          let pageInfoPromise = PlacesUtils.history.fetch(makeURI(arguments[0]));

we don't need the makeURI call, I assume, since we accept hrefs too.

::: browser/components/migration/tests/marionette/test_refresh_firefox.py:176
(Diff revision 2)
>  
>      def checkHistory(self):
> -        historyResults = self.runAsyncCode("""
> +        historyResult = self.runAsyncCode("""
>            let placeInfos = [];
> -          PlacesUtils.asyncHistory.getPlacesInfo(makeURI(arguments[0]), {
> -            handleError(resultCode, place) {
> +          let pageInfoPromise = PlacesUtils.history.fetch(makeURI(arguments[0]));
> +          pageInfoPromise.then(pageInfo => {

why not directly appending the .then to the fetch call, instead of assigning the promise to a temp var?

::: toolkit/components/places/mozIAsyncHistory.idl
(Diff revision 2)
>  
>  [scriptable, uuid(1643EFD2-A329-4733-A39D-17069C8D3B2D)]
>  interface mozIAsyncHistory : nsISupports
>  {
>    /**
> -   * Gets the available information for the given array of places, each

while here, please add a comment like

/*
  This interface contains APIs for cpp consumers.
  javascript consumers should look at History.jsm instead,
  that is exposed through PlacesUtils.history.

  If you're evaluating adding a new history API, it should
  usually go to History.jsm, unless it needs to do long and
  expensive work in a batch, then it could be worth doing
  that in History.cpp.
*/

::: toolkit/components/places/tests/history/test_fetch.js:25
(Diff revision 2)
> +        }
> +      }
> +      Assert.ok(visitFound);
> +    }
> +  }
> +}

I don't know if it would perfectly cover this, but note we have Assert.deepEqual. My biggest doubt though is what it would do with URL objects, would it compare them directly or their toString? no idea off-hand.

::: toolkit/components/places/tests/history/test_fetch.js:37
(Diff revision 2)
> +  let uriString = `http://mozilla.com/test_browserhistory/test_fetch`;
> +  let uri = NetUtil.newURI(uriString);
> +  let title = `Test Visit ${Math.random()}`;
> +  let dates = [];
> +  let visits = [];
> +  let transition = PlacesUtils.history.TRANSITION_LINK;

it would be great to test a mixup of transitions. But note that TRANSITION_EMBED visits won't be returned.
Also it may be good to add to the API javadoc a @note about this.

::: toolkit/components/places/tests/history/test_fetch.js:50
(Diff revision 2)
> +      visitDate: dates[i]
> +    });
> +  }
> +  await PlacesTestUtils.addVisits(visits);
> +  Assert.ok((await PlacesTestUtils.isPageInDB(uri)));
> +  await PlacesTestUtils.visitsInDB(uri);

did you plan to store the number for later comparison?
Attachment #8868715 - Flags: review?(mak77)
Blocks: 1365913
Comment on attachment 8867713 [details]
Bug 1350377 - Implement `History.fetch` and accordingly change `promisePlaceInfo`,

https://reviewboard.mozilla.org/r/139286/#review144034

> you could do this in the loop,
>  if (options.includeVisits) {
>    if (!("visits" in pageInfo))
>      pageInfo.visits = [];
>    ...
>    pageInfo.visits.push...

I actually did something similar here https://reviewboard.mozilla.org/r/139286/diff/4-5/
I was just wondering whether this could happen: 
Calllback A is called but before any statements are executed control changes to Callback B
Callback B gets into the if clause since visits is not in pageInfo. Then it's stopped.
Callback A gets into the if clause since visits is not in pageInfo. Then, it adds visits to pageInfo and does pageInfo.visits.push(...)
Callback B gets control again, and sets pageInfo.visits = [] (we lose a visit)

It may be that I am overthinking this (is this even a possible scenario?) but if not then this could lead to a small number of missed visits.
Comment on attachment 8867713 [details]
Bug 1350377 - Implement `History.fetch` and accordingly change `promisePlaceInfo`,

https://reviewboard.mozilla.org/r/139286/#review144034

> I actually did something similar here https://reviewboard.mozilla.org/r/139286/diff/4-5/
> I was just wondering whether this could happen: 
> Calllback A is called but before any statements are executed control changes to Callback B
> Callback B gets into the if clause since visits is not in pageInfo. Then it's stopped.
> Callback A gets into the if clause since visits is not in pageInfo. Then, it adds visits to pageInfo and does pageInfo.visits.push(...)
> Callback B gets control again, and sets pageInfo.visits = [] (we lose a visit)
> 
> It may be that I am overthinking this (is this even a possible scenario?) but if not then this could lead to a small number of missed visits.

IF you're speaking about the onResult callbacks (that give you back rows) they are just invoked synchronously in a loop, since you are just doing main-thread stuff here, there's no way that a callback could be invoked before the previous one is one. That would only happen if in the callback you'd start an async function or such, then it would be up to you to synchronize those.
You can think of these as the forEach() callbacks.
typo: s/previous one is one/previous one is done/
Comment on attachment 8867713 [details]
Bug 1350377 - Implement `History.fetch` and accordingly change `promisePlaceInfo`,

https://reviewboard.mozilla.org/r/139286/#review144516

::: toolkit/components/places/History.jsm:123
(Diff revision 7)
>     *      Either the full URI of the page or the GUID of the page.
> +   * @param [optional] options (object)
> +   *      An optional object whose properties describe options:
> +   *        - `includeVisits` (boolean) set this to true if `visits` in the
> +   *           PageInfo needs to contain VisitInfo. By default, `visits`
> +   *           is undefined inside the returned `PageInfo`.

"needs to contain VisitInfo in reverse chronological order."

we must annotate that somewhere

::: toolkit/components/places/History.jsm:131
(Diff revision 7)
>     *      A promise resolved once the operation is complete.
>     * @resolves (PageInfo | null) If the page could be found, the information
> -   *      on that page. Note that this `PageInfo` does NOT contain the visit
> -   *      data (i.e. `visits` is `undefined`).
> +   *      on that page.
> +   * @note that the VisitInfo objects returned while fetching visits do not
> +   *      contain the property `referrer`.
> +   *      TODO: Add `referrer` to VisitInfo. See Bug #1365913.

please verticall align the text after the @note

::: toolkit/components/places/History.jsm:132
(Diff revision 7)
>     * @resolves (PageInfo | null) If the page could be found, the information
> -   *      on that page. Note that this `PageInfo` does NOT contain the visit
> -   *      data (i.e. `visits` is `undefined`).
> +   *      on that page.
> +   * @note that the VisitInfo objects returned while fetching visits do not
> +   *      contain the property `referrer`.
> +   *      TODO: Add `referrer` to VisitInfo. See Bug #1365913.
> +   * @note that the visits returned will not contain `TRANSITION_EMBED` visits.

nit: in both @note, please remove "that", I think that parsing software would just hilight the note in a particular way if it is a @note, but it may not read as "note that..."
Attachment #8867713 - Flags: review?(mak77) → review+
Comment on attachment 8868715 [details]
Bug 1350377 - Remove `getPlacesInfo` and change associated files and tests,

https://reviewboard.mozilla.org/r/140304/#review144522

It looks pretty good, thank you!
I'll start a try run to check tests.

::: toolkit/components/places/mozIAsyncHistory.idl:139
(Diff revision 4)
> -   *        data (GUID or URI).
> -   *
> -   * @throws NS_ERROR_INVALID_ARG
> -   *         - Passing in NULL for aPlaceIdentifiers or aCallback.
> -   *         - Not providing at least one valid GUID or URI. 
> -   */
> + */

please move the comment above the [scriptable, uuid...] part, with an empty line in the middle.
I prefer the decorators to be close to the interface definition.

::: toolkit/components/places/tests/history/test_fetch.js:3
(Diff revision 4)
> +/* This Source Code Form is subject to the terms of the Mozilla Public
> + * License, v. 2.0. If a copy of the MPL was not distributed with this
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */

nit: if it doesn't bother to you, you can also use a more liberal public domain license in tests code:

/* Any copyright is dedicated to the Public Domain.
   http://creativecommons.org/publicdomain/zero/1.0/ */

::: toolkit/components/places/tests/unit/test_history.js:151
(Diff revision 4)
>    // By default history is enabled.
>    do_check_true(!histsvc.historyDisabled);
>  
>    // test getPageTitle
>    await PlacesTestUtils.addVisits({ uri: uri("http://example.com"), title: "title" });
> -  let placeInfo = await PlacesUtils.promisePlaceInfo(uri("http://example.com"));
> +  let placeInfo = await PlacesUtils.history.fetch(uri("http://example.com"));

the uri() call is no more needed
Attachment #8868715 - Flags: review?(mak77) → review+
Pushed by mak77@bonardo.net:
https://hg.mozilla.org/integration/autoland/rev/1eac66ddfb73
Implement `History.fetch` and accordingly change `promisePlaceInfo`, r=mak
https://hg.mozilla.org/integration/autoland/rev/74a3dc70ad7d
Remove `getPlacesInfo` and change associated files and tests, r=mak
https://hg.mozilla.org/mozilla-central/rev/1eac66ddfb73
https://hg.mozilla.org/mozilla-central/rev/74a3dc70ad7d
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: