Closed Bug 1107883 Opened 10 years ago Closed 9 years ago

The awesomebar shows a search as first suggestion for typed IPv6 URLs

Categories

(Firefox :: Address Bar, defect, P1)

37 Branch
defect
Points:
5

Tracking

()

VERIFIED FIXED
Firefox 41
Iteration:
41.2 - Jun 8
Tracking Status
firefox36 --- affected
firefox37 --- affected
firefox41 --- verified

People

(Reporter: avaida, Assigned: markh)

References

Details

(Whiteboard: [unifiedautocomplete][fxsearch])

Attachments

(2 files, 3 obsolete files)

Attached image screenshot
Reproducible on: Nightly 37.0a1 (2014-12-04), Aurora 36.0a2 (2014-12-04).
Affected platform(s): Windows 7 64-bit, Ubuntu 14.04 LTS 64-bit and Mac OS X 10.9.5.

STR:
1. Launch Firefox with a clean profile.
2. Make sure 'browser.urlbar.unifiedcomplete' is set to true in about:config. If it isn't, set it so and restart the browser.
3. Type a valid, short IPv6 address inside the Location Bar, e.g. [2607:f0d0:1002:51::4].

ER:
The first suggestion is "Visit <url>".

AR:
The first suggestion is "<input> - Search with <default search engine>".

Additional notes:
- This is not a regression, it seems to work this way since UnifiedComplete was implemented.
- IPv4 URLs work properly, with the first suggestion being "Visit <url>".
Flags: qe-verify+
Blocks: 1157952
Whiteboard: [fxsearch][unifiedautocomplete]
Priority: -- → P1
Flags: firefox-backlog+
IIUC, the problem with [2607:f0d0:1002:51::4] is that it doesn't contain dots, so the URI fixup code assumes it's a single-level domain name (eg "mozilla") and thus insists it is whitelisted. However, that seems wrong. It works as expected for an ipv6 address with dots (eg, the test uses [64:ff9b::8.8.8.8]) but those addresses have special meaning.

But while it *seems* wrong, the tests are explicitly checking this - ie, all ipv6 addresses in test_nsDefaultURIFixup_info.js that don't also have a '.' explicitly set a |affectedByWhitelist| flag - the implication being that the behaviour is as designed. But I can't think it why that would be.

Gijs, you seem all over this code and particularly bug 1057531 which landed the ipv6 logic - can you shed some light on this?  ISTM we want to skip the FIXUP_FLAG_REQUIRE_WHITELISTED_HOST logic when we find a literal ipv6 address?
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to Mark Hammond [:markh] from comment #1)
> IIUC, the problem with [2607:f0d0:1002:51::4] is that it doesn't contain
> dots, so the URI fixup code assumes it's a single-level domain name (eg
> "mozilla") and thus insists it is whitelisted. However, that seems wrong. It
> works as expected for an ipv6 address with dots (eg, the test uses
> [64:ff9b::8.8.8.8]) but those addresses have special meaning.
> 
> But while it *seems* wrong, the tests are explicitly checking this - ie, all
> ipv6 addresses in test_nsDefaultURIFixup_info.js that don't also have a '.'
> explicitly set a |affectedByWhitelist| flag - the implication being that the
> behaviour is as designed. But I can't think it why that would be.
> 
> Gijs, you seem all over this code and particularly bug 1057531 which landed

Wrong bug? ITYM bug 494092

> the ipv6 logic - can you shed some light on this?  ISTM we want to skip the
> FIXUP_FLAG_REQUIRE_WHITELISTED_HOST logic when we find a literal ipv6
> address?

I didn't really understand the FIXUP_FLAG_REQUIRE_WHITELISTED_HOST thing to begin with.

The main purpose of that code is to go from "random string" to "valid url that we can load".

The whitelist was to force "foo" to become "http://foo/", when in almost all other cases we want "http://searchengine.whatever/search?query=foo" .


I don't understand why unifiedcomplete needs to mess with the whitelist system to begin with. bug 1057186 added the flag. It was not reviewed by me, and I recall reading the rest of the related bugs and still being confused. AFAIK the tests there encode current behaviour. If the current behaviour for the flag is "wrong", then it'd be useful to know *why exactly we have the flag*, what it is *meant to do*, and why that isn't what it's doing here.

The current effect is that in the case where we require the whitelist, and the host isn't on the whitelist, we avoid setting the preferred URI to just be the fixed URI (ie the one with the literal host, rather than a search URL).


The ipv6 detection is in KeywordURLFixup, but that just avoids setting the preferredURI. Not setting the preferred URI used to just work because as long as we had an mFixedURI (ie we could somehow manage to even get that, and the input wasn't completely invalid as a URI, like "foo..gobbledegook:bar"), we would use that as the preferred URI. The bug that added FIXUP_FLAG_REQUIRE_WHITELISTED_HOST put that logic behind a load of conditionals, which means it now breaks in that case.

This is all a bit messed up; it seems like maybe the loop at the start of KeywordURLFixup could always run and produce useful info, even if we don't want keyword URLs. We could put that info in an internal struct and reuse it in a couple of places to make decisions about what to do.

The short-term fix, however, could be much simpler - AIUI if you make the early returns for ipv4 and ipv6 in KeywordURLFixup just assign info->mPreferredURI to info->mFixedURI if it exists, and make the REQUIRE_WHITELISTED_HOST blob check that we don't have an mPreferredURI yet, that should fix the issue.
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to :Gijs Kruitbosch from comment #2)
> I don't understand why unifiedcomplete needs to mess with the whitelist
> system to begin with. bug 1057186 added the flag.

We need to know if the locationbar will consider this a valid uri, cause we show the action that will happen when the user selects this entry, and if it's a valid uri it will be a visit action, if it's not it will be a search action. Since duplicating code to figure that out in unifiedcomplete would be silly, we reuse the whitelist system that will take the decision.
(In reply to Marco Bonardo [::mak] from comment #3)
> (In reply to :Gijs Kruitbosch from comment #2)
> > I don't understand why unifiedcomplete needs to mess with the whitelist
> > system to begin with. bug 1057186 added the flag.
> 
> We need to know if the locationbar will consider this a valid uri, cause we
> show the action that will happen when the user selects this entry, and if
> it's a valid uri it will be a visit action, if it's not it will be a search
> action. Since duplicating code to figure that out in unifiedcomplete would
> be silly, we reuse the whitelist system that will take the decision.

But you don't reuse the code... (because then you would have gotten a search URL...) - you added a special flag and now the special flag doesn't do the right thing and everything is sad. :-(

Why not use the API and check the output? You can tell if info.fixedURI != info.preferredURI and/or info.mKeywordProviderName / info.keywordAsSent to ask for the keyword search done, if any. Will be empty string or null otherwise (I forget which).
(In reply to :Gijs Kruitbosch from comment #2)

> If the current behaviour for
> the flag is "wrong", then it'd be useful to know *why exactly we have the
> flag*, what it is *meant to do*, and why that isn't what it's doing here.

I think it's clear that as you said, "servername" is treated as such. I can't explain the rationale for the implementation nor who's expected to leverage it (enterprise?), but I think the intent is clear. Clearly (IMO), that behaviour isn't desired for ipv6 (or ipv4) addresses.

> The short-term fix, however, could be much simpler - AIUI if you make the
> early returns for ipv4 and ipv6 in KeywordURLFixup just assign
> info->mPreferredURI to info->mFixedURI if it exists, and make the
> REQUIRE_WHITELISTED_HOST blob check that we don't have an mPreferredURI yet,
> that should fix the issue.

I don't know the code, which is probably why such a change sounds more "scary" in terms of subtle semantic changes in behaviour. I was thinking a simple guard (eg, ':' in hostname? '[' at the start?) would be quite simple and respects the intent of the flag.
(In reply to :Gijs Kruitbosch from comment #4)
> But you don't reuse the code... (because then you would have gotten a search
> URL...) - you added a special flag and now the special flag doesn't do the
> right thing and everything is sad. :-(

Unfortunately I don't know the very deep internals, off-hand doesn't look like it was possible to obtain the behavior Blair was looking for with the existing flags.

I think the idea was to be able to figure out whitelisted domains, since those will do a visit instead of a search and we must be able to tell what a given input text will do. But whitelisting was happening only in certain cases (he says only for search keywords) thus he needed a way to run the whitelist code without running a keyword search. This is what I can gather from the available comments around.
So trying to get rid of this extra flag reveals interesting other issues.

It seems the tests for unifiedcomplete here encode broken assumptions. For instance, in the URL bar, if I type:

flip/flop


we will first try http://flip/flop. This is what the fixup URI service returns. When the DNS lookup for "flip" fails, we will fall back to doing a web search. But that's different, and we won't know the result of the DNS lookup until after it's done. If you replace "flip" with something that actually points to a local server, you will get that page.

However, e.g. test_visiturl.js encodes the expectation that unifiedcomplete returns one entry, namely the one for the search. I don't know how to update this; I tried to use just the host, but then because of the "visited URL" logic it completes to just the host with a trailing slash, apparently, and so the test keeps failing.

This isn't the only test that fails here, either.
we don't need the action shown by unified to be 100% correct, cause that would be a huge effort for which we didn't (and don't) have resources. We expect it would fail in some rare situations. It is rather based on most common and less surprising behaviors.
We can't know at that point if a host is valid or not, and we can't query the dns, since it's unlikely someone has a "flip" host, and if he does, it will likely be added to the domain whitelist. So it should be "safe" to assume it's a search.
(In reply to Marco Bonardo [::mak] from comment #8)
> We can't know at that point if a host is valid or not, and we can't query
> the dns, since it's unlikely someone has a "flip" host, and if he does, it
> will likely be added to the domain whitelist. So it should be "safe" to
> assume it's a search.

I disagree with this assertion. The thing that actually happens is that we try to load the url http://flip/flop. Spurious example, of course. But we made a pretty careful choice in deciding that "foo/bar" would *not* default to a search in nsIURIFixup, as that was deemed to be the most likely intent of the user - there aren't that many no-spaces expressions with a slash in them that would be useful/common search queries, and a surprising number of enterprise users and web devs who rely on host/path 'Just Working'. So actually, no, I don't think it's "safe" to assume it's a search in this case, especially not in the case in the test which actually has visits on the host in question with a different path.

It would be different if this was just a word with no slashes, but as noted, changing the tests there is very difficult for me, not being familiar with places. I will put up my WIP patch that fixes this the way I think it should be fixed, but it will need test adjustments (which will probably provide fodder for discussion of what the right behaviour is).
Attachment #8598699 - Flags: feedback?(mak77)
This fails the following tests:

toolkit/components/places/tests/unifiedcomplete/test_searchEngine_current.js
----------------------------------------------------------------------------
FAIL [Parent]
toolkit/components/places/tests/unifiedcomplete/test_searchEngine_alias.js
--------------------------------------------------------------------------
FAIL [Parent]
toolkit/components/places/tests/unifiedcomplete/test_empty_search.js
--------------------------------------------------------------------
FAIL [Parent]
toolkit/components/places/tests/unifiedcomplete/test_tabmatches.js
------------------------------------------------------------------
FAIL [Parent]
toolkit/components/places/tests/unifiedcomplete/test_visiturl.js
----------------------------------------------------------------
FAIL [Parent]
(In reply to :Gijs Kruitbosch from comment #9)
> It would be different if this was just a word with no slashes, but as noted,
> changing the tests there is very difficult for me, not being familiar with
> places. I will put up my WIP patch that fixes this the way I think it should
> be fixed, but it will need test adjustments (which will probably provide
> fodder for discussion of what the right behaviour is).

So, is your patch only affecting things containing a slash, or is it also affecting single words? like if I whitelist "mylocalpage" and type it, do we show the action as "visit mylocalpage"?
I agree the slash is a fancy example and in most cases it's safe to assume it *might* be a url. We might have telemetry telling us how many times we get it right and how many we get it wrong, to stay on the safe side!

But in any other cases where it's hard to guess if it's a word or a url, there are 2 alternatives (I made the percentages by fancy):
1. we show to the user "Visit might_be_a_url", but 99,9% of the times he will end up on a search. we are surprising 99,9% of users since it's unlikely they have that local host.
2. (current) we show to the user that we will do a search, and 0.1% of the time instead we will visit might_be_a_url cause it's a real host. we are surprising 0.1% of users.

In this case I can't see how 1 is better.
The domain whitelist appears as a good half-way to satisfy both parts, the developer will have very likely whitelisted the domain, since he is using it. we reuse that information to do a better guess covering the 0.1% of users.

What I'd not want, is getting tens of bugs filed cause we show one thing and almost always do another.
(In reply to Marco Bonardo [::mak] from comment #12)
> (In reply to :Gijs Kruitbosch from comment #9)
> > It would be different if this was just a word with no slashes, but as noted,
> > changing the tests there is very difficult for me, not being familiar with
> > places. I will put up my WIP patch that fixes this the way I think it should
> > be fixed, but it will need test adjustments (which will probably provide
> > fodder for discussion of what the right behaviour is).
> 
> So, is your patch only affecting things containing a slash, or is it also
> affecting single words? like if I whitelist "mylocalpage" and type it, do we
> show the action as "visit mylocalpage"?

I'm not sure what the reasoning behind the question is. The slash forces a "real" URL lookup to precede depending on a search. So does the whitelist. So the latter situation doesn't change in this patch - it would have shown visit before (because of the 'require a whitelist' flag) and it will show the same now. The slash case changes - it used to show a search, and now it shows you a visit entry.

The interesting case is arguably "what happens if I type just a single word that *isn't* whitelisted", and that should continue being a search, if I read the autocomplete code correctly (which I might not).

> I agree the slash is a fancy example and in most cases it's safe to assume
> it *might* be a url. We might have telemetry telling us how many times we
> get it right and how many we get it wrong, to stay on the safe side!
> 
> But in any other cases where it's hard to guess if it's a word or a url,
> there are 2 alternatives (I made the percentages by fancy):
> 1. we show to the user "Visit might_be_a_url", but 99,9% of the times he
> will end up on a search. we are surprising 99,9% of users since it's
> unlikely they have that local host.
> 2. (current) we show to the user that we will do a search, and 0.1% of the
> time instead we will visit might_be_a_url cause it's a real host. we are
> surprising 0.1% of users.

Right, see above. For a single word with no slashes (followed by something useful) that hasn't been whitelisted, we will generate a search. That will continue to be the case. The only exception is if the user turns off the behaviour by setting the browser.fixup.dns_first_for_single_words pref to true (which I suspect the current version of the code doesn't deal with).

Note that this is all pretty subtle, e.g. "42.5/32.1" should be generating a search, too (and does, last I checked, though it seems to also offer going to "42.5", which I guess is a bug... different issue though).

Same with literal IPv4/Ipv6 addresses - they will go to that address instead.

> The domain whitelist appears as a good half-way to satisfy both parts, the
> developer will have very likely whitelisted the domain, since he is using
> it. we reuse that information to do a better guess covering the 0.1% of
> users.

Sure. Like I said, I think this should really only affect the case with a slash.

If we want to fix this and not alter the slash case, I would prefer extracting the "is this whitelisted" thing into an IDL'd function (IIRC it's already its own function on the C++ side), and call that from the unifiedcomplete side on the URI we get back to check if the hostname is whitelisted.

> What I'd not want, is getting tens of bugs filed cause we show one thing and
> almost always do another.

Yeah, I think we agree with each other there. :-)
I'm still trying to get my head around this bug and the discussion above, and particularly the "slash" case. But ISTM this patch is a significant change in behaviour in that a single word will now be matched as a host when it previously would not.  eg, the failure in test_empty_search.js is that when a tab is open with "http://t.foo/6", a search for "foo" will yield "visit http://foo/" instead of "switch to tab" for the opened tab.  This makes sense as the patch just lets all single-word hosts though without the whitelist check.

Gijs, are you suggesting the new behaviour is correct, or that the patch still needs to implement the whitelist check in a different way?  IOW, you seem to be saying "the entire whitelist check is bogus" which doesn't seem to be what this bug is about (other than the fact the whitelist check doesn't take ipv6 literals into account)
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to Mark Hammond [:markh] from comment #14)
> I'm still trying to get my head around this bug and the discussion above,
> and particularly the "slash" case. But ISTM this patch is a significant
> change in behaviour in that a single word will now be matched as a host when
> it previously would not.  eg, the failure in test_empty_search.js is that
> when a tab is open with "http://t.foo/6", a search for "foo" will yield
> "visit http://foo/" instead of "switch to tab" for the opened tab.  This
> makes sense as the patch just lets all single-word hosts though without the
> whitelist check.

Actually, it doesn't really make sense to me. Clearly, "switch to tab" is still an appropriate action for just searching for "foo", and with default preferences typing just "foo" in the location bar will do a search for foo without trying to visit http://foo/ . So it sounds like the behaviour of unifiedcomplete and/or the uri fixup service there is incorrect. I haven't looked at the test so I don't know which is the case.

> Gijs, are you suggesting the new behaviour is correct, or that the patch
> still needs to implement the whitelist check in a different way?

Neither. See also my previous comment that said:

> For a single word with no slashes (followed by something useful) that hasn't been whitelisted, we will generate a search.

That shouldn't change with my patch. I don't know why the unifiedcomplete code makes a different decision if the input string in the location bar is really just "foo", unless "foo" has been whitelisted, which I'm assuming isn't the case here.

> IOW, you
> seem to be saying "the entire whitelist check is bogus" which doesn't seem
> to be what this bug is about (other than the fact the whitelist check
> doesn't take ipv6 literals into account)

The whitelist check is a hack. It defeats the other logic in the code, including but not limited to the ipv6 case. We could fix just the ipv6 case, but we'd keep the code complexity and confusing behaviour and ultimately we'd just run into other edgecases. So I'm very uncomfortable adding another "quick fix" on top of this code in order to fix the ipv6 case, just because fixing it properly is difficult. :-)
Flags: needinfo?(gijskruitbosch+bugs)
Per Mark's IRC-noted debugging, it seems this is affecting the test because keywordAsSent is empty, which it shouldn't be if we're really sticking just "foo" in as a search string.
Attached patch 0004-build-on-Gijs-s-patch.patch (obsolete) — Splinter Review
I'm slowly getting my head around this, thanks!

The issue with keywordsAsSent is two-fold - the preference "keywords.enabled" must be true (which it isn't in most xpcshell tests) and there must be a default search engine (which doesn't exist in some of those tests). Setting that pref in those tests sounds reasonable but ensuring a default search engine less so (and test_empty_search.js will get upset by this for valid reasons IIUC)

As an experiment I added a new member nsIURIFixupInfo::keywordConsidered - this is set to true if the code would have set keywordAsSent but failed due to the lack of a search engine. Best I can tell from looking at the rest of the docshell code, we could probably also unconditionally set keywordAsSent and rely on keywordProviderName remaining empty to do the right thing, but that seems less clean. I might also be missing something :)

With this change and with the tests setting that pref, all existing tests pass - except for that pesky "slash" test...

... the discussion above regarding the "correct" behaviour for flip/flop is interesting, but I question whether we need to tackle that as part of this bug? IOW, I think we should have a new bug for that change in behaviour. So the other thing I did in this patch is expose isDomainWhitelisted and have UnifiedComplete.js do the "if there's a dot check the whitelist" dance. Best I can tell this restores the semantics.

The end result is that with this patch, all *existing* tests under toolkit/components/places/tests pass...

... which is where it gets more complicated :( The patch adds 2 tests for literal ip addresses to test_visiturl - the ipv4 one passes but the ipv6 one doesn't. IOW, none of this actually fixes this bug ;)

The problem here is nsDefaultURIFixup::FixupURIProtocol fails to handle the ipv6 address due to 'uriString.FindCharInSet("/:?#");' so mFixedURI doesn't get set. *sigh*. I only just discovered this and I'm out of time for today, so insights etc welcome!
Attachment #8601982 - Flags: feedback?(mak77)
Attachment #8601982 - Flags: feedback?(gijskruitbosch+bugs)
(In reply to Mark Hammond [:markh] from comment #17)
> Created attachment 8601982 [details] [diff] [review]
> 0004-build-on-Gijs-s-patch.patch
> 
> I'm slowly getting my head around this, thanks!
> 
> The issue with keywordsAsSent is two-fold - the preference
> "keywords.enabled" must be true (which it isn't in most xpcshell tests) and
> there must be a default search engine (which doesn't exist in some of those
> tests).
> Setting that pref in those tests sounds reasonable but ensuring a
> default search engine less so (and test_empty_search.js will get upset by
> this for valid reasons IIUC)

TBH, I think we should fix both of these in the head.js of the unifiedcomplete xpcshell tests in order for them to reflect reality. Nobody I know runs Firefox with 0 search engines and with keywords.enabled turned off. I get that that's what xpcshell gives you by default, but it doesn't reflect what will happen in Firefox, and it will just make test results differ from what happens in-browser for 99% of people.

(there are some people who do turn off keywords.enabled, but they're a very definite minority)


> As an experiment I added a new member nsIURIFixupInfo::keywordConsidered -
> this is set to true if the code would have set keywordAsSent but failed due
> to the lack of a search engine. Best I can tell from looking at the rest of
> the docshell code, we could probably also unconditionally set keywordAsSent
> and rely on keywordProviderName remaining empty to do the right thing, but
> that seems less clean. I might also be missing something :)

Well... in this case (search is disabled), why do we prefer the switch to tab case and not the hostname lookup? And if we basically always prefer switch to tab (over either search or "visit host 'foo'"), why are we even hitting the URI fixup service? That really shouldn't even be happening, and seems like a separate bug.

> With this change and with the tests setting that pref, all existing tests
> pass - except for that pesky "slash" test...
> 
> ... the discussion above regarding the "correct" behaviour for flip/flop is
> interesting, but I question whether we need to tackle that as part of this
> bug? IOW, I think we should have a new bug for that change in behaviour. So
> the other thing I did in this patch is expose isDomainWhitelisted and have
> UnifiedComplete.js do the "if there's a dot check the whitelist" dance. Best
> I can tell this restores the semantics.
> 
> The end result is that with this patch, all *existing* tests under
> toolkit/components/places/tests pass...
> 
> ... which is where it gets more complicated :( The patch adds 2 tests for
> literal ip addresses to test_visiturl - the ipv4 one passes but the ipv6 one
> doesn't. IOW, none of this actually fixes this bug ;)
> 
> The problem here is nsDefaultURIFixup::FixupURIProtocol fails to handle the
> ipv6 address due to 'uriString.FindCharInSet("/:?#");' so mFixedURI doesn't
> get set. *sigh*. I only just discovered this and I'm out of time for today,
> so insights etc welcome!

I'm confused about how this breaks anything. I'll try running with the patch and seeing if I can understand what happens here - FixupURIProtocol should be adding 'http://' just fine either way.
So there are two issues with the patch as-is:

1) the ipv6 address the testcase adds is invalid.
2) once you replace it with a valid one, this code which you added:

+    // If the host is a single word we need to check the domain whitelist.
+    if (uri.asciiHost) {
+      let dotLoc = uri.asciiHost.indexOf(".");
+      if ((dotLoc == -1 || dotLoc == uri.asciiHost.length) && !Services.uriFixup.isDomainWhitelisted(uri.asciiHost, dotLoc)) {
+        return false;
+      }
+    }

breaks out early and returns false.

I'm really uncomfortable with this manual post-processing. If this is really just for the "host/path" (aka flip/flop) case I would lean heavily on the side of leaving this out, and accepting this change in behaviour. If this causes a lot of issues, esp. on devedition, we can fix it later. Right now it would seem that trying to keep (IMO, wrong) behaviour there complicates fixing the actual bug here.
Rank: 4
Attached patch 0003-build-on-Gijs-s-patch.patch (obsolete) — Splinter Review
As suggested by Gijs, I tweaked the tests to not assume no search engines, which avoids the need for the keywordsConsidered property. This typically just meant that some of the tests got an additional search result, which seems fine.

I also used a regex to check if we need to do the whitelist check.  All tests pass.
Attachment #8601982 - Attachment is obsolete: true
Attachment #8601982 - Flags: feedback?(mak77)
Attachment #8601982 - Flags: feedback?(gijskruitbosch+bugs)
Attachment #8602554 - Flags: feedback?(mak77)
Attachment #8602554 - Flags: feedback?(gijskruitbosch+bugs)
Comment on attachment 8602554 [details] [diff] [review]
0003-build-on-Gijs-s-patch.patch

Review of attachment 8602554 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/places/UnifiedComplete.js
@@ +67,5 @@
>  
> +// A regex that matches "single word" hostnames for whitelisting purposes.
> +// It allows only alphanumeric at the start and end, plus allowing hypens
> +// anywhere else.
> +const REGEXP_SINGLEWORD_HOST = new RegExp("^([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])$");

Nit:

const REGEXP_SINGLEWORD_HOST = /^[a-z0-9-]*$/i;

should work; we allow URIs with trailing -'s and if not, nsIURI's constructor should already have complained, so we might as well simplify.

::: toolkit/components/places/tests/unifiedcomplete/head_autocomplete.js
@@ +307,1 @@
>  add_task(function ensure_no_search_engines() {

Should probably rename this. :-)

@@ +309,5 @@
> +
> +  Services.search.addEngineWithDetails("MozSearch", "", "", "", "GET",
> +                                       "http://s.example.com/search");
> +  let engine = Services.search.getEngineByName("MozSearch");
> +  Services.search.currentEngine = engine;

We should reset this after the tests are done, maybe? Or is that unnecessary because xpcshell tests?

::: toolkit/components/places/tests/unifiedcomplete/test_empty_search.js
@@ +87,5 @@
>      search: "",
>      searchParam: "enable-actions",
> +    matches: [
> +               { uri: makeActionURI("switchtab", {url: "http://t.foo/6"}), title: "title", style: [ "action,switchtab" ] },
> +               { uri: makeActionURI("searchengine", {engineName: "MozSearch", input: "", searchQuery: ""}), title: "MozSearch", style: [ "action", "searchengine" ] },

Huh, an empty search shows up as a search? That seems like a bug, but probably a separate one.
Attachment #8602554 - Flags: feedback?(gijskruitbosch+bugs) → feedback+
Hi Mark, can you provide a point value.
Assignee: nobody → markh
Status: NEW → ASSIGNED
Iteration: --- → 40.3 - 11 May
Flags: needinfo?(markh)
Points: --- → 5
Flags: needinfo?(markh)
Whiteboard: [fxsearch][unifiedautocomplete] → [unifiedautocomplete][fxsearch]
Iteration: 40.3 - 11 May → 41.1 - May 25
Comment on attachment 8602554 [details] [diff] [review]
0003-build-on-Gijs-s-patch.patch

Review of attachment 8602554 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry if this took more time than needed, I really needed boxed time to think about what's up here.

I think we can proceed, so nothing is lost :)

::: toolkit/components/places/tests/unifiedcomplete/head_autocomplete.js
@@ +307,2 @@
>  add_task(function ensure_no_search_engines() {
> +  Services.prefs.setBoolPref("keyword.enabled", true);

this needs a comment why we need it. also UnifiedComplete.js should grow a comment about what's expected to happen when keyword.enabled is false and there are no search engines, even if it's a minority, we must document it.

@@ +309,5 @@
> +
> +  Services.search.addEngineWithDetails("MozSearch", "", "", "", "GET",
> +                                       "http://s.example.com/search");
> +  let engine = Services.search.getEngineByName("MozSearch");
> +  Services.search.currentEngine = engine;

So, the problem here is third party consumers, like Seamonkey/comm-central, the scope of ensure_no_search_engines was to hide all search engines so that results where consistent across consumers.

Clearly we can add a search engine here and make it the current engine, but we should still hide any other existing engine before doing that.

So this method can likely be something like "ensure_consistent_searchegines" or such.

Note it's untrue that xpcshell gives us no search engines, they exist and it's what broke tests for other consumers since the engines differ per application.

::: toolkit/components/places/tests/unifiedcomplete/test_empty_search.js
@@ +87,5 @@
>      search: "",
>      searchParam: "enable-actions",
> +    matches: [
> +               { uri: makeActionURI("switchtab", {url: "http://t.foo/6"}), title: "title", style: [ "action,switchtab" ] },
> +               { uri: makeActionURI("searchengine", {engineName: "MozSearch", input: "", searchQuery: ""}), title: "MozSearch", style: [ "action", "searchengine" ] },

This is sort of related to Bug 1075532, but should be filed apart.

The fact is that if you click the dropdown arrow, and press enter, nothing happens! So, what should the first special item do in such a case? Maybe we should modified unified complete so that it autofills the first entry? Maybe it should show an animated fox, or shorlander's avatar? dunno.

Please file a bug for the issue, and here put a // TODO (bug XYZ): comment pointing to it
Attachment #8602554 - Flags: feedback?(mak77) → feedback+
Comment on attachment 8598699 [details] [diff] [review]
WIP to get rid of the whitelist requirement flag

this alone wouldn't work, but looks like Mark changes cover most cases.

I'd still love to have a better eplanation of what we should expect if keyword.enabled is false and/or there's no search engine (but mostly the former, the latter is really uncommon)
Attachment #8598699 - Flags: feedback?(mak77) → feedback+
(In reply to Marco Bonardo [::mak] from comment #23)

> ::: toolkit/components/places/tests/unifiedcomplete/head_autocomplete.js
> @@ +307,2 @@
> >  add_task(function ensure_no_search_engines() {
> > +  Services.prefs.setBoolPref("keyword.enabled", true);
> 
> this needs a comment why we need it. 

Only 1 test actually needs it, but I added it here due to comment 18. I can move it to that one test if you prefer.

> also UnifiedComplete.js should grow a
> comment about what's expected to happen when keyword.enabled is false and
> there are no search engines, even if it's a minority, we must document it.

Where should we document it? The issue is the new version of _matchUnknownUrl() now checks for fixupInfo.keywordAsSent, and this is only set when the pref is true *and* there are search engines. When that's not true, _matchUnknownUrl() will return a visit URL.  There's already a comment:

>    // If the URI cannot be fixed or the preferred URI would do a keyword search,
>    // that basically means this isn't useful to us:
>    if (!fixupInfo.fixedURI || fixupInfo.keywordAsSent)

To which I could append something like:

> (note that fixupInfo.keywordAsSent will never be true if the keyword.enabled pref is false or there are no engines, so in that case we will always return a "visit".)

Would that be sufficient?

> This is sort of related to Bug 1075532, but should be filed apart.
> 
> The fact is that if you click the dropdown arrow, and press enter, nothing
> happens! So, what should the first special item do in such a case? Maybe we
> should modified unified complete so that it autofills the first entry? Maybe
> it should show an animated fox, or shorlander's avatar? dunno.
> 
> Please file a bug for the issue, and here put a // TODO (bug XYZ): comment
> pointing to it

So you just want a bug that says "decide what to show as the first autocomplete entry for an empty string"? Personally I don't see a big issue with the current behaviour for this edge-case - you end up at a regular search page where your search term can be entered, which sounds perfectly reasonable to me.
Flags: needinfo?(mak77)
(In reply to Mark Hammond [:markh] from comment #25)
> To which I could append something like:
> 
> > (note that fixupInfo.keywordAsSent will never be true if the keyword.enabled pref is false or there are no engines, so in that case we will always return a "visit".)
> 
> Would that be sufficient?

yes, I'd probably just put that into a phrase and avoid the parenthesis, but that's a nit.

> > This is sort of related to Bug 1075532, but should be filed apart.
> > 
> > The fact is that if you click the dropdown arrow, and press enter, nothing
> > happens! So, what should the first special item do in such a case? Maybe we
> > should modified unified complete so that it autofills the first entry? Maybe
> > it should show an animated fox, or shorlander's avatar? dunno.
> > 
> > Please file a bug for the issue, and here put a // TODO (bug XYZ): comment
> > pointing to it
> 
> So you just want a bug that says "decide what to show as the first
> autocomplete entry for an empty string"? Personally I don't see a big issue
> with the current behaviour for this edge-case - you end up at a regular
> search page where your search term can be entered, which sounds perfectly
> reasonable to me.

well, it could be reasonable if it would work... but if I open a new tab and then press DOWN/ENTER nothing happens.
Flags: needinfo?(mak77)
(In reply to Marco Bonardo [::mak] from comment #23)
> Comment on attachment 8602554 [details] [diff] [review]
> 0003-build-on-Gijs-s-patch.patch
> 
> Review of attachment 8602554 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Sorry if this took more time than needed, I really needed boxed time to
> think about what's up here.
> 
> I think we can proceed, so nothing is lost :)
> 
> ::: toolkit/components/places/tests/unifiedcomplete/head_autocomplete.js
> @@ +307,2 @@
> >  add_task(function ensure_no_search_engines() {
> > +  Services.prefs.setBoolPref("keyword.enabled", true);
> 
> this needs a comment why we need it. also UnifiedComplete.js should grow a
> comment about what's expected to happen when keyword.enabled is false and
> there are no search engines, even if it's a minority, we must document it.

In principle, if you toggle keyword.enabled then right now, at least, you can't search from the URL bar at all. We will just show you errors if you put things in that aren't (and can't be correct to be) valid URIs.

I don't know how that should inform the dropdown; I guess there could still be search entries but they shouldn't be the default/first entry, or something? It's a little mismatched in terms of what the pref's original intent was, now that we have unifiedcomplete.
(In reply to Marco Bonardo [::mak] from comment #26)
> > So you just want a bug that says "decide what to show as the first
> > autocomplete entry for an empty string"? Personally I don't see a big issue
> > with the current behaviour for this edge-case - you end up at a regular
> > search page where your search term can be entered, which sounds perfectly
> > reasonable to me.
> 
> well, it could be reasonable if it would work... but if I open a new tab and
> then press DOWN/ENTER nothing happens.

I think that's a different bug - the first autocomplete entry clearly indicates it expects to do a search, and pressing DOWN and *clicking* that first entry does an empty search - so the bug here is that pressing enter doesn't do what is expected, rather than the first entry being an empty search.

So best I can tell, there's no bug to open and add a comment for (ie, there is a bug about ENTER, but this test isn't hitting it)
Smaug, I'm requesting your review for the docshell part of this patch - which changes nsDefaultURIFixup to remove the FIXUP_FLAG_REQUIRE_WHITELISTED_HOST flag and publicly expose IsDomainWhitelisted() - see comment 2 for more context.

Mak, thanks for the earlier feedback and I think this is now ready.
Attachment #8598699 - Attachment is obsolete: true
Attachment #8602554 - Attachment is obsolete: true
Attachment #8609921 - Flags: review?(mak77)
Attachment #8609921 - Flags: review?(bugs)
Comment on attachment 8609921 [details] [diff] [review]
0001-Bug-1107883-support-ipv6-URLs-in-the-awesomebar-with.patch

>+nsDefaultURIFixup::IsDomainWhitelisted(const nsACString& aDomain,
>+                                       const uint32_t aDotLoc,
>+                                       bool *aResult)
* goes with the type, so bool* aResult


I'm all for moving more of urifixup handling to js.
Attachment #8609921 - Flags: review?(bugs) → review+
Iteration: 41.1 - May 25 → 41.2 - Jun 8
Comment on attachment 8609921 [details] [diff] [review]
0001-Bug-1107883-support-ipv6-URLs-in-the-awesomebar-with.patch

Review of attachment 8609921 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/places/UnifiedComplete.js
@@ +67,5 @@
>  
> +// A regex that matches "single word" hostnames for whitelisting purposes.
> +// It allows only alphanumeric at the start and end, plus allowing hypens
> +// anywhere else.
> +const REGEXP_SINGLEWORD_HOST = new RegExp("^([a-zA-Z0-9]|[a-zA-Z0-9][a-zA-Z0-9\-]*[a-zA-Z0-9])$");

I agree with Gijs we can simplify this regex since nsIURI checked the URI for us already.
/^[a-z0-9-]+$/i (there is no need to escape -, if it's at the beginning or end of a character class) should do.

But actually, don't we just care there's no dot in it? in such a case a simple !string.includes(".") would do.

::: toolkit/components/places/tests/unifiedcomplete/head_autocomplete.js
@@ +307,1 @@
>  add_task(function ensure_no_search_engines() {

please rename to ensure_search_engine

Also, as I said in comment 23, this must still hide any existing engine before adding the new one. the new code must be appended to the existing code.

::: toolkit/components/places/tests/unifiedcomplete/test_visiturl.js
@@ +39,5 @@
>    // And hosts with no dot in them are special, due to requiring whitelisting.
>    do_print("visit url, host matching visited host but not visited url, non-whitelisted host");
> +  // XXXmarkh - not sure why the visit is added here as the test doesn't depend
> +  // on it. ISTM the visit should make this test invalid (ie, that a visit
> +  // implies the result should be a "visiturl" even when not whitelisted.)

I think the idea was that we don't wrongly show another url starting with the same host... looks like mostly a sanity check.
Attachment #8609921 - Flags: review?(mak77) → review+
(In reply to Marco Bonardo [::mak] from comment #31)

> I agree with Gijs we can simplify this regex since nsIURI checked the URI
> for us already.

Fair enough - I'd have thought a strictly correct regex was better but I've changed it.

> But actually, don't we just care there's no dot in it? in such a case a
> simple !string.includes(".") would do.

Then an ipv6 literal qualifies, which defeats the point of the patch entirely :) I used the regex above.

> please rename to ensure_search_engine

Done.

> Also, as I said in comment 23, this must still hide any existing engine
> before adding the new one. the new code must be appended to the existing
> code.

Done.

> I think the idea was that we don't wrongly show another url starting with
> the same host... looks like mostly a sanity check.

Fair enough - I removed the "XXX markh" comment.

2 try runs (I forgot to include b2g in the first one)
https://treeherder.mozilla.org/#/jobs?repo=try&revision=32ad667e0cfb
https://treeherder.mozilla.org/#/jobs?repo=try&revision=882cf57ee013
https://hg.mozilla.org/mozilla-central/rev/e67ddc64b010
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → Firefox 41
QA Contact: andrei.vaida
This is verified fixed on Nightly 41.0a1 (2015-06-15), using Windows 8.1 (x64), Ubuntu 14.04 (x64) and Mac OS X 10.9.5. Valid IPv6 URLs are now treated properly.

A negative side-effect here might be that even strings such as "![64:ff9b::8.8.8.8]" or "!@[2607:f0d0:1002:51::4]" are now considered valid URLs. Mark, what's your take on this?
Status: RESOLVED → VERIFIED
Flags: qe-verify+ → needinfo?(markh)
Depends on: 1175769
(In reply to Andrei Vaida, QA [:avaida] from comment #35)
> A negative side-effect here might be that even strings such as
> "![64:ff9b::8.8.8.8]" or "!@[2607:f0d0:1002:51::4]" are now considered valid
> URLs. Mark, what's your take on this?

Nice catch! I opened bug 1175769
Flags: needinfo?(markh)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: