Correctly handle "www." and/or http/https prefixes for prefilled / top sites in the location bar autocomplete / unifiedcomplete

RESOLVED FIXED in Firefox 54

Status

()

P3
normal
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: Gijs, Assigned: sveta.orlik.code)

Tracking

(Blocks: 1 bug)

53 Branch
Firefox 54
Points:
---
Dependency tree / graph
Bug Flags:
qe-verify -

Firefox Tracking Flags

(firefox54 fixed)

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Reporter)

Description

2 years ago
In bug 1211726 we added support to provide a prefilled set of suggested sites (based on more or less unbiased lists of popular sites) to users who have new profiles without a lot of history.

One of the problems we struggled with was how to correctly store and match these items such that users typing any of:

face
www.face
http://face
https://face
http://www.face
https://www.face

would all get the suggestion for "facebook.com" .

Handling user input which includes the protocol is probably less important than handling the "www." case.

Another point here seems to be that while the sites we added initially were added without the "www." bit, most sites I tested redirect to the "with www." version, which seems to mean we sometimes end up with 2 entries that the location bar doesn't combine. Ideally we should avoid that.
(In reply to :Gijs from comment #0)
> which seems to mean we sometimes end up with 2 entries that
> the location bar doesn't combine. Ideally we should avoid that.

The location bar doesn't combine those because it can't know if they are the same thing. While in vast majority of the cases that is true, in general we can't assume www.something.com is the same as something.com (as well as we can't assume the http and https versions are the same).
(Reporter)

Comment 2

2 years ago
(In reply to Marco Bonardo [::mak] from comment #1)
> (In reply to :Gijs from comment #0)
> > which seems to mean we sometimes end up with 2 entries that
> > the location bar doesn't combine. Ideally we should avoid that.
> 
> The location bar doesn't combine those because it can't know if they are the
> same thing. While in vast majority of the cases that is true, in general we
> can't assume www.something.com is the same as something.com (as well as we
> can't assume the http and https versions are the same).

Right, yes, but in this case because we control the list of URLs we will know that they are the same, so we should provide the 'final' URL and ideally treat the items such that they will match/autocomplete regardless.

To use a specific example, facebook.com redirects to www.facebook.com . So we should autocomplete directly to the www. version, and we should do that (and match correctly) even if the user types "face" rather than "www.face", if that makes sense.
Blocks: 1340663
Priority: -- → P3
Comment hidden (mozreview-request)
(Reporter)

Comment 4

2 years ago
mozreview-review
Comment on attachment 8841483 [details]
Bug 1340108 - autocomplete prefill sites with proper protocol and www. management,

https://reviewboard.mozilla.org/r/115704/#review117072

I think this is the right direction. For the final patch, of course please remove the commented out dump() statements. :-)

::: toolkit/components/places/UnifiedComplete.js:556
(Diff revision 1)
> +  // For just the host - we add "https://"
> +  if (!(url.startsWith("http://") || url.startsWith("https://"))) {
> +    url = "https://" + url;
> +  }

As noted on IRC, I think we should make sure any sites we add to the list/file have a protocol included. Then you can omit this check (and the similar check elsewhere).

::: toolkit/components/places/UnifiedComplete.js:564
(Diff revision 1)
> +  }
>    this.uri = NetUtil.newURI(url);
>    this.title = title;
>    this._matchTitle = title.toLowerCase();
> +  this._hasWWW = this.uri.host.startsWith("www.");
> +  this._hostWitoutWWW = this._hasWWW ? this.uri.host.slice(4)

Nit: 'without' has an 'h' after the first 't' :-)

::: toolkit/components/places/UnifiedComplete.js:1075
(Diff revision 1)
> +    if (!this._searchString)
> +      return;

Is this necessary? I would imagine the general UnifiedComplete code stops sooner and never gets here. Marco would know for sure.

::: toolkit/components/places/UnifiedComplete.js:1098
(Diff revision 1)
> +    let searchStringHasHTTP = this._strippedPrefix.startsWith("http://");
> +    let searchStringHasHTTPS = this._strippedPrefix.startsWith("https://");
> +    let searchStringHasProtocol = searchStringHasHTTP || searchStringHasHTTPS;
> +
>      for (let site of PrefillSiteStorage.sites) {
> -      if (site.uri.host.startsWith(this._searchString)) {
> +      let protocolMatches = (searchStringHasHTTP && (site.uri.scheme === "http")) ||
> +                            (searchStringHasHTTPS && (site.uri.scheme === "https"));

Given my comment for http/https usage further down, I think we can simplify these checks like this:

```js
let searchStringProtocol = this._strippedPrefix.match(/^(https?):/i)[1] || "";
searchStringProtocol = searchStringProtocol.toLowerCase();
```

which will assign the (normalized/lowercased) protocol string to the variable, or the empty string.

Then inside the loop, you can check:

```js
let protocolMatches = !searchStringProtocol || searchStringProtocol == site.uri.scheme;
```

This also means the checks work if the user types "Https" or "HTTPS" or something like that.

::: toolkit/components/places/UnifiedComplete.js:1121
(Diff revision 1)
> +           (!searchStringHasProtocol || (searchStringHasProtocol && protocolMatches))
> +         ) {
> +        // finalCompleteValue is where we go after Enter is pressed
> +        // gets "www." if site has it OR user typed it
> +        // gets "http://" only if site has it explicitly AND user typed it, oherwise "https://"
> +        let finalWWW = site._hasWWW || searchStringHasWWW ? "www." : "";

Nit: please put the condition in ().

::: toolkit/components/places/UnifiedComplete.js:1122
(Diff revision 1)
> +        let finalProtocol = searchStringHasHTTP && (site.uri.scheme === "http")
> +                            ? "http://" : "https://";

This doesn't seem right - if the prefill site list has 'http://foo.com/' that must be because the https site is broken or different, and in that case if the user types 'foo' we should not suggest https.

Given this, I think we can just reuse `site.uri.scheme` if we get here.
Attachment #8841483 - Flags: review?(gijskruitbosch+bugs)
Comment hidden (mozreview-request)
(Reporter)

Comment 6

2 years ago
mozreview-review
Comment on attachment 8841483 [details]
Bug 1340108 - autocomplete prefill sites with proper protocol and www. management,

https://reviewboard.mozilla.org/r/115704/#review117340

I think this is almost done - can you remove the commented out dump()s and other commented out code for the final version? :-)

::: toolkit/components/places/UnifiedComplete.js:1096
(Diff revision 2)
>        return false;
> +
> +    let searchStringHasWWW = this._strippedPrefix.endsWith("www.");
> +    let searchStringWWW = searchStringHasWWW ? "www." : "";
> +
> +    // Extract any stripped scheme (as user can type even "ftp://")

It might make sense to just return false early if we find a scheme in user input but the scheme doesn't match http/https? In that case, neither strict nor loose matching will ever find a matching site, right? :-)

::: toolkit/components/places/unifiedcomplete-top-urls.json:8
(Diff revision 2)
> +  ,["http://nonsecure.net/", "Non-secure site"]
> +  ,["https://noprotocol.net", "No-protocol site"]
> +  ,["https://www.worldwidewebsite.net", "World Wide Web site"]

We probably shouldn't leave these in the set of sites we ship to users.
Attachment #8841483 - Flags: review?(gijskruitbosch+bugs)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
(Reporter)

Comment 9

2 years ago
mozreview-review
Comment on attachment 8842343 [details]
Bug 1340108 - autocomplete list entries match regardless of www. in them,

https://reviewboard.mozilla.org/r/116226/#review117840

Nice work! Can you unify the 2 commits into 1 commit?

Also, I'm a little confused about the 'edge case bug' you mention in the commit message. What happens if the user types "ww" ?

::: toolkit/components/places/UnifiedComplete.js:1074
(Diff revision 1)
>      if (!this._searchString)
>        return;
>  
> +    if (!(this._searchStringScheme === "" ||
> +          this._searchStringScheme === "https" ||
> +          this._searchStringScheme === "http" ))

Nit: no space before the `)` please.

::: toolkit/components/places/UnifiedComplete.js:1088
(Diff revision 1)
> -        this._addMatch(match);
> -      }
> +      if (site.uri.spec.includes(this._strippedPrefix + this._searchString) ||
> +          site._matchTitle.includes(this._strippedPrefix + this._searchString))

It probably makes sense to put these search string concatenations in a temp variable outside the loop, e.g.:

```js
let prefixedSearchString = this._strippedPrefix + this._searchString;
```

and then use that here.

Nit: please brace (with {} ) the if/else blocks.

::: toolkit/components/places/UnifiedComplete.js:1092
(Diff revision 1)
> +        if (
> +             (
> +               site.uri.host.includes(this._searchString) ||
> +               site._matchTitle.includes(this._searchString)
> +             ) &&
> +             (
> +               !this._searchStringScheme ||
> +               this._searchStringScheme === site.uri.scheme
> +             )
> +            )
> +          looseMatches.push(match);

To make this easier to read (and not take so many lines), I would suggest checking whether the scheme matches first, around both if blocks:

```js
if (!this._searchStringScheme || this._searchStringScheme === site.uri.scheme) {
  if (site.uri.spec.includes(prefixedSearchString) ||
      site._matchTitle.includes(prefixedSearchString)) {
    strictMatches.push(match);
  } else if (site.uri.host.includes(this._searchString) ||
             site._matchTitle.includes(this._searchString)) {
    looseMatches.push(match);
  }
}
```

::: toolkit/components/places/UnifiedComplete.js:1111
(Diff revision 1)
>  
>    _matchPrefillSiteForAutofill() {
>      if (!Prefs.prefillSitesEnabled)
>        return false;
>  
> +/*

Nit: please remove this, too. :-)

::: toolkit/components/places/tests/unifiedcomplete/test_prefill_sites.js:125
(Diff revision 1)
> -  let https_www_site_URI = NetUtil.newURI("https://www.ooops-https-www.com/");
> -  let https_____site_URI = NetUtil.newURI("https://ooops-https.com/");
> -  // let http__www_site_URI = NetUtil.newURI("HTTP://www.ooops-HTTP-www.com/");
> -  let http______site_URI = NetUtil.newURI("HTTP://ooops-HTTP.com/");
> -
> -  autocompleteObject.addPrefillSite(https_www_site_URI.spec, "Ooops");
> +  let ooops_https_www_URI = NetUtil.newURI("https://www.ooops-https-www.com/");
> +  let ooops_https_____URI = NetUtil.newURI("https://ooops-https.com/");
> +  let ooops_http______URI = NetUtil.newURI("HTTP://ooops-HTTP.com/");
> +  let ooops_http__www_URI = NetUtil.newURI("HTTP://www.ooops-HTTP-www.com/");
> +
> +  // Order is important to check sortig

Nit: 'sorting'
Attachment #8842343 - Flags: review?(gijskruitbosch+bugs)
(Reporter)

Comment 10

2 years ago
mozreview-review
Comment on attachment 8841483 [details]
Bug 1340108 - autocomplete prefill sites with proper protocol and www. management,

https://reviewboard.mozilla.org/r/115704/#review117842

(This is effectively an r+ for this part - most of the feedback is for the second patch, but if I r+ this now and then you combine the two patches, mozreview won't ask me for a review again for the combined set...)
Attachment #8841483 - Flags: review?(gijskruitbosch+bugs)
Comment hidden (mozreview-request)
(Assignee)

Updated

2 years ago
Attachment #8842343 - Attachment is obsolete: true
(Reporter)

Comment 12

2 years ago
mozreview-review
Comment on attachment 8841483 [details]
Bug 1340108 - autocomplete prefill sites with proper protocol and www. management,

https://reviewboard.mozilla.org/r/115704/#review117894

Assuming the tests pass, this looks good to me. Thanks!

::: toolkit/components/places/UnifiedComplete.js:1063
(Diff revision 4)
>      let daysSinceProfileCreation = (Date.now() - profileCreationDate) / MS_PER_DAY;
>      if (daysSinceProfileCreation > Prefs.prefillSitesExpireDays)
>        Services.prefs.setBoolPref("browser.urlbar.usepreloadedtopurls.enabled", false);
>    },
>  
> -  // TODO: manage protocol and "www." like _matchSearchEngineUrl() does
> +  // TODO: manage protocol

Is there still something left to do for protocols? If not this can go away, I think.
Attachment #8841483 - Flags: review?(gijskruitbosch+bugs) → review+
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)

Comment 15

2 years ago
Pushed by gijskruitbosch@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/4af8a2761308
autocomplete prefill sites with proper protocol and www. management, r=Gijs
(Reporter)

Updated

2 years ago
Assignee: nobody → sveta.orlik.code

Comment 16

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/4af8a2761308
Status: NEW → RESOLVED
Last Resolved: 2 years ago
status-firefox54: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 54
This fix seems to have at least some automated coverage. Gijs, do you think manual testing would be of value here?
Flags: qe-verify?
Flags: needinfo?(gijskruitbosch+bugs)
(Reporter)

Comment 18

2 years ago
(In reply to Andrei Vaida [:avaida], Desktop Release QA – please ni? me from comment #17)
> This fix seems to have at least some automated coverage. Gijs, do you think
> manual testing would be of value here?

I think we should do a generic feature check when we get ready to ship this (it's still preffed off), but I don't think verifying this bug individually is going to be useful, so I'll set qe-verify-. Thanks for checking!
Flags: qe-verify?
Flags: qe-verify-
Flags: needinfo?(gijskruitbosch+bugs)
You need to log in before you can comment on or make changes to this bug.