when user types "www" or "http" we should order results based on URL only, no adaptive, no recency, no frequency, no bookmark benefit

NEW
Unassigned

Status

Firefox
Address Bar
10 years ago
6 years ago

People

(Reporter: beltzner, Unassigned)

Tracking

Trunk

Details

I'm not sure how much this will be needed, but I'm sure that it'll come up so I wanted to get it on file.

Right now when a user types "www" or "http" into the location bar, the unique nature of those strings usually means that we match against the URL of history and bookmark entries. However, we continue to order results based on adaptive learning and frecency.

There are use cases that are better supported by old-style typeahead find matching, where we match against the closest possible result (i.e., the shortest) in the history, not taking recency, frequency, bookmarking, tagging, or learning into account. It seems that when users start typing out an entire URL (by entering "http" or "www") we should revert to this.

So typing "bugzilla" could result in:

 https://_bugzilla_.mozilla.org/enter_bug.cgi?product=Firefox     *
 https://_bugzilla_.mozilla.org/

but typing in "https://bug" would result in:

 _https://bug_zilla.mozilla.org/
 _https://bug_zilla.mozilla.org/enter_bug.cgi?product=Firefox     *
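
To make the proposal concrete, here is a rough sketch of the two orderings. It is not the actual location bar code: `Entry`, its `frecency` field, `strip_scheme`, and `rank` are hypothetical names, and the frecency numbers are made up so that the starred entry ranks first in the normal case.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    url: str
    frecency: int  # hypothetical score: higher means visited more often/recently


def strip_scheme(url: str) -> str:
    """Drop a leading http:// or https:// so that "www" can match the host."""
    for prefix in ("https://", "http://"):
        if url.startswith(prefix):
            return url[len(prefix):]
    return url


def rank(typed: str, entries: list[Entry]) -> list[str]:
    typed = typed.lower()
    if typed.startswith(("http", "www")):
        # URL mode (the proposal): prefix match on the URL only,
        # shortest URL first, ignoring frecency entirely.
        matches = [
            e for e in entries
            if e.url.lower().startswith(typed)
            or strip_scheme(e.url.lower()).startswith(typed)
        ]
        matches.sort(key=lambda e: (len(e.url), e.url))
    else:
        # Stand-in for the current behaviour: substring match,
        # highest frecency first.
        matches = [e for e in entries if typed in e.url.lower()]
        matches.sort(key=lambda e: -e.frecency)
    return [e.url for e in matches]


history = [
    Entry("https://bugzilla.mozilla.org/enter_bug.cgi?product=Firefox", 900),
    Entry("https://bugzilla.mozilla.org/", 500),
]
by_word = rank("bugzilla", history)
by_url = rank("https://bug", history)
```

With this made-up history, `by_word` lists the enter_bug.cgi page first and `by_url` lists the bare bugzilla.mozilla.org root first, matching the two examples above.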
I'm not sure such a simple heuristic will be sufficient ...

Comment 2

6 years ago
(In reply to Jens Müller Comment 1)

What about this algorithm:

Autocomplete individual words (say, "ww" to "www", or "mo" to "mozilla") only, until the first non-ASCII-alphanumeric character. (Autocomplete is Bug 566489, separate from this bug, but interrelated).

If the first non-ASCII-alphanumeric character is a period, forward slash, question mark, or colon, begin to assume it's a URL (technically, these would vary by scheme, but according to Mozilla developers, it's not a URL bar, it's a location bar). You *might* also want to include the at symbol.

If the user types a space, or any URL-unsafe symbol other than a dash or percent sign, at any time, revert from URL matching to generic search-based matching.

Because non-punycode internationalized domain names are now often rendered by default, the rules need elaboration for non-ASCII characters. Many non-Latin alphabets don't use spacing characters, and the rules for determining word boundaries, or which symbols are punctuation and which are letters, are complex. However, domain names have very clear rules, which makes this much simpler than free-entry text such as a search box. As these rules apply to typed URLs, it's unlikely to be a spoof risk, because users typing in a given language are unlikely to accidentally enter homographic non-native characters in between native characters.

If non-ASCII characters are entered, switch to a full search. Switch back to URL-only matching if a period, forward slash, question mark, or colon appears, and then back again to full search if a space, or a URL-unsafe symbol other than a dash or percent sign, is entered.
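
The rules above could be sketched roughly as follows. This is only an illustration of the proposal, not existing code: `classify_input` and the mode names are hypothetical, the exact set of "URL-unsafe" symbols is my assumption, and so is the choice to make a space or unsafe symbol final (once seen, the input stays in search mode). Characters not covered by any rule, such as "-", "%", or "_", leave the mode unchanged.

```python
import string

ASCII_ALNUM = set(string.ascii_letters + string.digits)
URL_DELIMITERS = set("./?:")      # "@" might be added here, as suggested above
URL_UNSAFE = set(' <>"{}|\\^`')   # assumed set; "-" and "%" deliberately left out


def classify_input(text: str) -> str:
    """Return "word", "url", or "search" for the text typed so far."""
    mode = "word"  # plain ASCII alphanumerics: autocomplete individual words
    for ch in text:
        if ch in URL_UNSAFE:
            # Assumption: a space or unsafe symbol is final for this input.
            return "search"
        if ch in URL_DELIMITERS:
            mode = "url"      # period, slash, question mark, or colon
        elif not ch.isascii():
            mode = "search"   # non-ASCII: full search until the next delimiter
        # Anything else (ASCII alphanumerics, "-", "%", "_", ...) keeps the mode.
    return mode


modes = {text: classify_input(text) for text in (
    "mo", "www.", "https://bug", "foo bar", "münchen", "münchen.de",
)}
```

Here `modes` maps "mo" to "word", "www." and "https://bug" to "url", "foo bar" and "münchen" to "search", and "münchen.de" back to "url".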

Possibly, though, if searches using URL rules result in no matches at all, then you might switch back to the general search algorithm. That would be Bug 424673.