Refactor the search URL provider for the location bar

RESOLVED FIXED in mozilla34

Status

()

Toolkit
Places
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: mconley, Assigned: Paolo)

Tracking

unspecified
mozilla34
Points:
8
Dependency tree / graph
Bug Flags:
firefox-backlog +
in-testsuite +

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [search])

Attachments

(1 attachment, 3 obsolete attachments)

Bug 959580 allows Places to recognize a history entry as a search. The Awesomebar should then be able to extract the search terms from that entry by reverse engineering the query URI based on the default search engine query template.

Updated

4 years ago
Blocks: 950073

Updated

4 years ago
Whiteboard: [p=8] → [feature] p=8

Updated

4 years ago
Depends on: 959580

Updated

4 years ago
Whiteboard: [feature] p=8 → [search] p=8

Updated

4 years ago
No longer blocks: 950073
Flags: firefox-backlog+
(Assignee)

Updated

4 years ago
Depends on: 1038225
(Assignee)

Updated

4 years ago
Blocks: 1034381
(Assignee)

Comment 1

4 years ago
We might need to do this for non-default search engines as well, for bug 1034381.

I filed bug 1038225 to define which case we will support initially.
(Assignee)

Updated

4 years ago
Summary: Parse recognized searches back into their search terms via default search engine → Parse recognized searches back into their search terms and engine name
Marco: please add to 33.3
Flags: needinfo?(mmucci)
Added to Iteration 33.3
Assignee: nobody → paolo.mozmail
Status: NEW → ASSIGNED
Iteration: --- → 33.3
Points: --- → 8
QA Whiteboard: [qa?]
Flags: needinfo?(mmucci)
Whiteboard: [search] p=8 → [search]
(Assignee)

Comment 4

4 years ago
mak, thanks for pointing me to PriorityUrlProvider. I think that, thanks to the UnifiedCompelte design, we might easily repurpose the module to be a PlacesSearchUrlProvider (or similar name). It would have two public functions:

- matchEngineByToken: The current getMatch.
- parseSearchUrl: Given a full URI spec, if it is from a search engine result, returns the search engine name and the full terms used.

This shifts the task of priority determination (finding out whether a search token matches something that should generically have priority over other matches) to UnifiedComplete itself. In fact it already does the mixing of results based on frecency, and if we'll have new types of "priority" results in the future, UnifiedComplete better knows what to do with them than another module.
Flags: needinfo?(mak77)
(In reply to :Paolo Amadini from comment #4)
> mak, thanks for pointing me to PriorityUrlProvider. I think that, thanks to
> the UnifiedCompelte design, we might easily repurpose the module to be a
> PlacesSearchUrlProvider (or similar name).

Yes, I made priorityUrlProvider more general thinking in future we might add different stuff than searches to it... but maybe we rather need different providers to keep the code more partitioned and have the logic toi go through providers in UnifiedComplete.js. I kinda agree with the plan!
Flags: needinfo?(mak77)
Hardware: x86 → All
(Assignee)

Comment 6

4 years ago
I've noticed that some of our default engines use UTF-8 encoding for the search terms in the parameter, while others use ISO-8859-1. Do we deal with both encodings correctly when matching Unicode searches in the Places database?

I believe this will not affect our ability to identify search URLs from a location bar search, because in most cases search terms are present in the page title anyways, and once we matched the URL, the component can use the correct encoding from the search engine definition. How this works is however something we may want to note in the code comments.
Flags: needinfo?(mak77)
(Assignee)

Comment 7

4 years ago
It turned out that the only default engine for the English locale using ISO-8859-1 encoding has now switched to UTF-8 (bug 1039456), so this won't even affect us by default. When parsing the search terms in the URL I think we should support other encodings from the beginning anyways, for search engines added by the user, or those in localized builds:

http://mxr.mozilla.org/l10n-central/search?string=%3CInputEncoding%3E
(Assignee)

Updated

4 years ago
Depends on: 1040721
(Assignee)

Comment 8

4 years ago
Created attachment 8458831 [details] [diff] [review]
Preliminary patch

This patch depends on the code in bug 1040721, and refactors the existing provider module.

Note that the cached data structures are now completely refreshed on every change to search engines, since they are quite rare and we need to keep them in the correct order anyways.

The module is now optimized on the searches side, using arrays instead of maps, since they should be faster for the sequential scanning we need (not needing a new iterator to be generated at each access).
Attachment #8458831 - Flags: feedback?(mak77)
Hi Paolo, can you mark this bug as [qa+] or [qa-] for verification.
Flags: needinfo?(paolo.mozmail)
(Assignee)

Updated

4 years ago
QA Whiteboard: [qa?] → [qa-]
Flags: needinfo?(paolo.mozmail)

Updated

4 years ago
Iteration: 33.3 → 34.1
(Assignee)

Comment 10

4 years ago
Created attachment 8460223 [details] [diff] [review]
Refactoring

In the end, we moved most of the logic to bug 1040721. What remains here is only the refactoring of the component, that may still be worth doing, as it reduces the number of lines of code, adds more comments, and fixes a minor race condition on initialization.
Attachment #8458831 - Attachment is obsolete: true
Attachment #8458831 - Flags: feedback?(mak77)
Attachment #8460223 - Flags: review?(mak77)
(Assignee)

Updated

4 years ago
No longer depends on: 1040721
(Assignee)

Updated

4 years ago
No longer depends on: 959580, 1038225
Summary: Parse recognized searches back into their search terms and engine name → Refactor the search URL provider for the location bar
(Assignee)

Comment 11

4 years ago
(In reply to :Paolo Amadini from comment #10)
> In the end, we moved most of the logic to bug 1040721. What remains here is
> only the refactoring of the component, that may still be worth doing, as it
> reduces the number of lines of code, adds more comments, and fixes a minor
> race condition on initialization.

Ah, forgot to mention that this also preserves the engine order, which may be relevant in the case of similar names or localized versions. For example, if a custom engine for google.it is installed before google.com, it will always take precedence when typing "google" in the location bar.
Comment on attachment 8460223 [details] [diff] [review]
Refactoring

Review of attachment 8460223 [details] [diff] [review]:
-----------------------------------------------------------------

I'm fine with the idea, though considered the discussion we had regarding using this as a more general search service helper for autocomplete, it might still need some refactoring. I'm going to review changes anyway, with the assumption it might change.

::: toolkit/components/places/PlacesSearchUrlProvider.jsm
@@ +1,3 @@
> +/* This Source Code Form is subject to the terms of the Mozilla Public
> + * License, v. 2.0. If a copy of the MPL was not distributed with this
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */

it would greatly help the review if you could hg mv the file instead of renaming it. I can't see the diff this way and we are losing all of the blame

@@ +15,5 @@
> +Cu.import("resource://gre/modules/NetUtil.jsm");
> +Cu.import("resource://gre/modules/Promise.jsm");
> +Cu.import("resource://gre/modules/Services.jsm");
> +Cu.import("resource://gre/modules/Task.jsm");
> +Cu.import("resource://gre/modules/XPCOMUtils.jsm");

I'd make NetUtil a lazy module getter, it's less common than the other modules

@@ +47,5 @@
> +      });
> +    });
> +  },
> +
> +  initialized: false,

what's the point of tracking this? it looks unused

@@ +54,5 @@
> +    switch (data) {
> +      case "engine-added":
> +      case "engine-changed":
> +      case "engine-removed":
> +        this._refresh();

why reloading all of the engines when we can do simple addition/removals? why the change?

::: toolkit/components/places/UnifiedComplete.js
@@ +678,5 @@
>        return;
> +
> +    // Handle priority matches for search engine domains.
> +    let priorityMatch =
> +        yield PlacesSearchUrlProvider.findPriorityMatchByToken(this._searchString);

I'm thinking maybe we should remove the word Priority from everywhere, it hs a sense of negativity and it's not exactly what we do (we merge with a given frecency...)
Maybe just PlacesSearchAutocompleteProvider?
Attachment #8460223 - Flags: review?(mak77) → feedback+
I think between the irc discussion and the  feedbacks we discussed all of the points, if not please reflag.
Flags: needinfo?(mak77)
(Assignee)

Updated

4 years ago
Depends on: 1040721
(Assignee)

Comment 14

4 years ago
Created attachment 8463139 [details] [diff] [review]
The patch

(In reply to Marco Bonardo [:mak] from comment #12)
> > +  initialized: false,
> 
> what's the point of tracking this? it looks unused

Leftover, now used in this patch for checking that the synchronous parseSubmissionURL is called only after initialization.

> @@ +54,5 @@
> > +    switch (data) {
> > +      case "engine-added":
> > +      case "engine-changed":
> > +      case "engine-removed":
> > +        this._refresh();
> 
> why reloading all of the engines when we can do simple addition/removals?
> why the change?

This is done to preserve the priority order of the engines, since multiple engine domains may start with the same letter, and we want to return the first in the user-defined priority list.

The "engine-changed" notification is received when an engine is moved, but keeping a mirrored ordered list in sync by responding to that notification would be much more work, if even possible. We would probably need to call getEngines again to find the new index, making the fine-grained sync pointless anyways.

Since a change in the engines or their order is a rare event, I think it is better to just rebuild the list when this happens. It is much less code.
Attachment #8460223 - Attachment is obsolete: true
Attachment #8463139 - Flags: review?(mak77)
Comment on attachment 8463139 [details] [diff] [review]
The patch

Review of attachment 8463139 [details] [diff] [review]:
-----------------------------------------------------------------

r=me but please clarify why parseSubmissionURL can't be async and thus ensure init by itself

::: toolkit/components/places/PlacesSearchAutocompleteProvider.jsm
@@ +145,5 @@
> +   */
> +  parseSubmissionURL: function (url) {
> +    if (!SearchAutocompleteProviderInternal.initialized) {
> +      throw new Error("The component has not been initialized.");
> +    }

why can't we make it call this.ensureInitialized()? do we really need this API to be synchronous?

I'd mostly prefer if we'd have a single init path/promise instead of internal and external init...

::: toolkit/components/places/tests/unit/test_PlacesSearchAutocompleteProvider.js
@@ +56,5 @@
> +  // itself, thus we only do a sanity check of the wrapper here.
> +  let engine = yield promiseDefaultSearchEngine();
> +  let submissionURL = engine.getSubmission("terms").uri.spec;
> +
> +  let result = yield PlacesSearchAutocompleteProvider.parseSubmissionURL(submissionURL);

and indeed this is yielding over parseSubmissionURL...

@@ +60,5 @@
> +  let result = yield PlacesSearchAutocompleteProvider.parseSubmissionURL(submissionURL);
> +  do_check_eq(result.engineName, engine.name);
> +  do_check_eq(result.terms, "terms");
> +
> +  let result = yield PlacesSearchAutocompleteProvider.parseSubmissionURL("http://example.org/");

as well as this one
Attachment #8463139 - Flags: review?(mak77) → review+
(Assignee)

Comment 16

4 years ago
(In reply to Marco Bonardo [:mak] from comment #15)
> why can't we make it call this.ensureInitialized()? do we really need this
> API to be synchronous?

Unfortunately, these are called by UnifiedComplete for each result row, in the synchronous path that processes the results of the SQLite query. While the queries themselves are executed asynchronously, the Sqlite.jsm row processing callbacks must complete synchronously, and we can't wait on calling addResult, otherwise the order of results in the dropdown could change randomly.

> and indeed this is yielding over parseSubmissionURL...

Ah, cut-and-paste leftovers, thanks for catching those!
(Assignee)

Comment 17

4 years ago
Created attachment 8466180 [details] [diff] [review]
Updated patch

I've added this note to the function:

   * @note This API function needs to be synchronous because it is called inside
   *       a row processing callback of Sqlite.jsm, in UnifiedComplete.js.
Attachment #8463139 - Attachment is obsolete: true
https://hg.mozilla.org/mozilla-central/rev/2bbd5e4afe2d
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
You need to log in before you can comment on or make changes to this bug.