Predictive lookup for awesomebar entries

RESOLVED FIXED in Firefox 28

Status

()

RESOLVED FIXED
5 years ago
2 years ago

People

(Reporter: rnewman, Assigned: mfinkle)

Tracking

(Depends on: 1 bug, Blocks: 4 bugs, {feature, relnote})

Trunk
Firefox 28
All
Android
feature, relnote
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(relnote-firefox 28+)

Details

Attachments

(1 attachment, 3 obsolete attachments)

(Reporter)

Description

5 years ago
Bug 881804 added network prefetching to Gecko. When the user types a domain, opens the search panel, or otherwise moves closer to starting a network request, we should get that string over to Gecko to warm things up.
(Reporter)

Comment 1

5 years ago
mfinkle points out that we already do this for search suggestions: we open a speculative connection as soon as we fetch the list of search engines to hand to Java. Let's see if there are more!
Remember, speculative connections only care about the domain. The full URL path is not used. Meaning google.com/foo and google.com/bar both just need google.com
Created attachment 8341171 [details] [diff] [review]
prefetch-speculative v0.1

Speculative connections are currently limited to 6 active connections. They will timeout, but until then any new requests after the limit has been hit are ignored.

nsHttpConnectionMgr.cpp seems to have the core speculative connection code:
http://mxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpConnectionMgr.cpp#2570

This patch tries to guess at the most likely awesomebar results, and send only those URLs to Gecko for speculative connections. The patch holds a domain cache so we don't use up our 6 speculative connections on duplicates. The cache is cleared on pageshow events.

Google seems to do something similar with Omnibox results:
http://www.igvita.com/posa/high-performance-networking-in-google-chrome/#omnibox

The patch seems to send the minimal amount of prefetch attempts while typing in the awesomebar. Further, the cache does cause the code to not speculatively connect to the same URL over and over.

No idea if this makes page loads faster or not. Some subjective and objective testing would be nice.
Test build:
http://people.mozilla.org/~mfinkle/fennec/fennec-28.0a1.en-US.android-arm.prefetch.apk

This build will output some debug messages with "PREFETCH" in the log. Use | adb logcat | grep PREFETCH | to see the flow.
(Reporter)

Updated

5 years ago
Depends on: 945408
(Reporter)

Updated

5 years ago
Blocks: 945408
No longer depends on: 945408
Created attachment 8342129 [details] [diff] [review]
prefetch-speculative v0.2

This patch works around my objections of speculatively connecting to all results of an awesomebar search by only speculatively connecting to the first result and any domain auto-completion. We don't blow-up our connection limit or sending too many messages from Java to JS.

Testing whether this makes a measurable difference is not possible right now, but we do use speculative connection for default search engines and link tapping, so why not do it here too?

I'll look into writing a test that checks to see that the Session:Prefetch messages are sent.
Assignee: nobody → mark.finkle
Attachment #8342129 - Flags: feedback?(rnewman)
(Assignee)

Updated

5 years ago
Attachment #8341171 - Attachment is obsolete: true
(Assignee)

Updated

5 years ago
Summary: Predictive lookup for awesomebar entries, search engines, etc. → Predictive lookup for awesomebar entries
(Reporter)

Comment 6

5 years ago
Comment on attachment 8342129 [details] [diff] [review]
prefetch-speculative v0.2

Review of attachment 8342129 [details] [diff] [review]:
-----------------------------------------------------------------

I agree with the idea of hooking into matches against domains; definitely the right spot.

But this is a tight loop over a cursor, so I have two recommendations:

* I think we should accumulate, rather than dispatching one message per match.
* I think we should accumulate candidate domains (plus protocols?), rather than URLs.

Otherwise, typing "reddit.com" is going to send N messages to Gecko.

Something as trivial as

  Set<String> domains = new Set<String>();
  do {
      ...
      domains.put(url.getProtocol() + "://" + url.getHost());
  } while (...);
  GeckoAppShell.sendEventToGecko(<create tab-separated domain string here>)

would save on messaging overhead and potential redundant processing.

This also raises the possibility of iteratively pre-fetching matches, because now you don't have to send one message per domain.

By all means have the first result be the first thing to prefetch, but when you've done that DNS lookup and shaken hands on a connection, one might as well move on to the next, no? The awesomebar isn't so perfect that its first match is always the one you want, and prefetching is fairly cheap.

(I'm happy to take this over if you're bored :D)

::: mobile/android/base/home/BrowserSearch.java
@@ +366,2 @@
>              final StringBuilder hostBuilder = new StringBuilder(host);
>              if (hostBuilder.indexOf(searchTerm) == 0) {

startsWith!

@@ +373,5 @@
>  
>              if (searchPath) {
>                  final List<String> path = url.getPathSegments();
>  
>                  for (String s : path) {

While we're here, I think there are more efficient ways of phrasing this, no?

findAutocompletion("reddit.com/r/boop", *, true)

where the cursor returns 

  "http://www.reddit.com/r/boop/comments/1s1amu/the_quintessential_boop/"

will result in 

  uri = (the above)
  host = reddit.com
  hostBuilder = ["reddit.com"]

Then we'll get here, and produce:

  hostBuilder = ["reddit.com", "/", "r"]

(which doesn't match), then

  hostBuilder = ["reddit.com", "/", "r", "/", "boop"]

which does.

We can achieve the exact same goal much more efficiently (and also trivially extend to match a user typing "www.reddit.com/r/boop") via:

  int offset = url.indexOf(host);                             // With prefixes removed.

  if (url.startsWith(searchTerm) ||                           // They typed "www.reddit.com/r/boop"
      ((offset > 0) && url.startsWith(searchTerm, offset)) {  // Match "reddit.com/r/boop"

      // Return the matched string.
      return url.substring(offset, offset + searchTerm.length());
  }

right? No point building up all of those segments when all we're trying to do is verify that the user has typed a hostname and part of a path.

I'll file a bug. We're doing way more work than is necessary.

::: mobile/android/chrome/content/browser.js
@@ +8175,5 @@
>        return false;
>      }
>      let expireTimeMs = Services.prefs.getIntPref("browser.tabs.expireTime") * 1000;
>      if (expireTimeMs < 0) {
> +      // This behaviour is disabled

If you're touching this, maybe also add punctuation?

(And if you're feeling appropriately Murrican, kill that 'u' like Webster.)
Attachment #8342129 - Flags: feedback?(rnewman) → feedback+
(Reporter)

Comment 7

5 years ago
Filed Bug 946074, with patch, for the Awesomebar code unpleasantness. And found a bunch of other bugs in that implementation, too :/
(In reply to Richard Newman [:rnewman] from comment #6)

> I agree with the idea of hooking into matches against domains; definitely
> the right spot.
> 
> But this is a tight loop over a cursor, so I have two recommendations:
> 
> * I think we should accumulate, rather than dispatching one message per
> match.
> * I think we should accumulate candidate domains (plus protocols?), rather
> than URLs.
> 
> Otherwise, typing "reddit.com" is going to send N messages to Gecko.

Based on my logging, this does not happen. I message is only sent when the domain autocompletion changes the the value in the URLBar.

> By all means have the first result be the first thing to prefetch, but when
> you've done that DNS lookup and shaken hands on a connection, one might as
> well move on to the next, no? The awesomebar isn't so perfect that its first
> match is always the one you want, and prefetching is fairly cheap.

Speculative connecting has a "cap" which is currently 6 connections. After that, you need to wait for the connections to timeout before new ones can be made.
(Reporter)

Comment 9

5 years ago
(In reply to Mark Finkle (:mfinkle) from comment #8)

> Based on my logging, this does not happen. I message is only sent when the
> domain autocompletion changes the the value in the URLBar.

Ah, it'll stop on the first autocomplete.

> Speculative connecting has a "cap" which is currently 6 connections. After
> that, you need to wait for the connections to timeout before new ones can be
> made.

Ouch.

Firstly, can we cancel speculative connections that we know are no longer needed?

Secondly, are we able to at least speculatively issue DNS lookups? We don't need to open a socket. Seems like almost every domain that pops up in the top N of an awesomebar search should join a DNS lookup queue that we work through until you start loading a page... you wouldn't be able to do that inside the autocomplete loop, but that's nbd.
(In reply to Richard Newman [:rnewman] from comment #9)

> Firstly, can we cancel speculative connections that we know are no longer
> needed?

Not with nsISpeculativeConnect. The closing of "half open" connections seem to be internal methods only:
http://mxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpConnectionMgr.cpp#2844

> Secondly, are we able to at least speculatively issue DNS lookups? We don't
> need to open a socket. Seems like almost every domain that pops up in the
> top N of an awesomebar search should join a DNS lookup queue that we work
> through until you start loading a page... you wouldn't be able to do that
> inside the autocomplete loop, but that's nbd.

Not with nsISpeculativeConnect. We might have a different way to do this.nsIDNSService might be useful:
http://mxr.mozilla.org/mozilla-central/source/netwerk/dns/nsIDNSService.idl

NI'ing Patrick for any more deatils
Flags: needinfo?(mcmanus)
nsIDNSService::AsyncResolve (PRIORITY_LOW | RESOLVE_SPECULATE) is the best you've got.. you'll need a listener, but it could be one perma object for that. if that turns out to be a problem in practice it looks like a pretty simple patch to allow a nullptr listener.
Flags: needinfo?(mcmanus)
(Reporter)

Updated

5 years ago
Blocks: 946399
(Reporter)

Comment 12

5 years ago
Filed Bug 946399 for DNS lookups.

Blocking on Bug 946074 'cos bitrotting.
Status: NEW → ASSIGNED
Depends on: 946074
Created attachment 8342647 [details] [diff] [review]
prefetch-speculative v0.3

Just removed the logging and unbittrotted to deal with bug 946074. We can look at DNS stuff in a new bug, if anything seems actionable.

The code still worked the same with bug 946074 applied. I needed to add one more explicit "Session:Prefetch" event since bug 946074 adds more exit points to the findAutocompletion method. Not the prettiest, but at least it's all in one method.
Attachment #8342129 - Attachment is obsolete: true
Attachment #8342647 - Flags: review?(rnewman)
(Reporter)

Comment 14

5 years ago
Comment on attachment 8342647 [details] [diff] [review]
prefetch-speculative v0.3

Review of attachment 8342647 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/home/BrowserSearch.java
@@ +431,1 @@
>                  return strippedHost + "/";

Shouldn't we be prefetching the stripped hostname instead?

Otherwise we're making a request to https://m.cnn.com, and the user is gonna hit enter and get http://cnn.com.

(We can't fix the protocol switch, but we can fix the strip.)

That might be as simple as prefetching the return value of this function instead of doing it inline, which in general I would prefer regardless (for testability, for example).

::: mobile/android/chrome/content/browser.js
@@ +8144,5 @@
> +      case "Session:Prefetch":
> +        if (aData) {
> +          let uri = Services.io.newURI(aData, null, null);
> +          if (uri && !this._domains.has(uri.host)) {
> +            Services.io.QueryInterface(Ci.nsISpeculativeConnect).speculativeConnect(uri, null);

Can this throw?
(In reply to Richard Newman [:rnewman] from comment #14)

> Shouldn't we be prefetching the stripped hostname instead?
> 
> Otherwise we're making a request to https://m.cnn.com, and the user is gonna
> hit enter and get http://cnn.com.

I agree, but I am also troubled that we are changing an HTTPS to an HTTP scheme. I also dislike that we are sending the user to a URL that is not where the prediction system thinks they want to go. Do we just assume the http://cnn.com will redirct them to https://m.cnn.com ?

> (We can't fix the protocol switch, but we can fix the strip.)
> 
> That might be as simple as prefetching the return value of this function
> instead of doing it inline, which in general I would prefer regardless (for
> testability, for example).

That was my initial patch attempt, but I need to prefix a scheme to the returned host. I guess I can just assume HTTP for all domain autocompletions, which makes me a bit sad.

> Services.io.QueryInterface(Ci.nsISpeculativeConnect).speculativeConnect(uri, null);
> 
> Can this throw?

It looks like it could for some non HTTP/HTTPS urls, so I am wrapping all uses of it in a try/catch
Created attachment 8343336 [details] [diff] [review]
prefetch-speculative v0.4

* Uses try/catch blocks for nsISpeculativeConnect
* Prefetches the return value of findAutocomplete with a prefixed "http://" since that is what we'll send to Gecko if the user chooses it.

We still send the entire URL for the first item in the list though.
Attachment #8342647 - Attachment is obsolete: true
Attachment #8342647 - Flags: review?(rnewman)
Attachment #8343336 - Flags: review?(rnewman)
(Reporter)

Comment 17

5 years ago
Comment on attachment 8343336 [details] [diff] [review]
prefetch-speculative v0.4

Review of attachment 8343336 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with speculativeConnect collected into one place.

I think this is good for now, but I'd like to think a little about how we might do better.

If we're really speculatively connecting, we have to match the destination protocol to get the full benefit. (I imagine that the speculative connection will follow redirects to https: for sites that do that?)

It'd be nice if we could return the protocol from the autocomplete matcher and have the right thing happen, both within the editor (if I autocomplete and hit enter, I want https...) and in speculative connections.

So maybe we can think about hiding the protocol in the presentation/editing layer, and otherwise passing around "https://www.reddit.com/"?

Or does it make sense to just return something richer than a String from findAutocompleteMatch?

  private static class AutocompleteMatch {
    private final String matchedString;
    private final String protocol;       // or...
    private final List<Uri> speculativeURIs;
  }

::: mobile/android/base/home/BrowserSearch.java
@@ +401,5 @@
>              final String url = c.getString(urlIndex);
>  
> +            if (searchCount == 0) {
> +                // Prefetch the first item in the list since it's weighted the highest
> +                GeckoAppShell.sendEventToGecko(GeckoEvent.createBroadcastEvent("Session:Prefetch", url.toString()));

What happens if the first item in the cursor is the same domain as the autocomplete match? Do we burn two speculative connections? (Is that a bad thing, given parallel fetches?)

::: mobile/android/chrome/content/browser.js
@@ +8154,5 @@
> +        if (aData) {
> +          let uri = Services.io.newURI(aData, null, null);
> +          if (uri && !this._domains.has(uri.host)) {
> +            try {
> +              Services.io.QueryInterface(Ci.nsISpeculativeConnect).speculativeConnect(uri, null);

There are now three or four places in browser.js that we do exactly this. Can we break this out to BrowserApp.speculativeConnect, and isolate the try block and the connector reference?
Attachment #8343336 - Flags: review?(rnewman) → review+
(In reply to Richard Newman [:rnewman] from comment #17)

> If we're really speculatively connecting, we have to match the destination
> protocol to get the full benefit. (I imagine that the speculative connection
> will follow redirects to https: for sites that do that?)

I don't believe speculative connections follow redirects. Especially not the brute-force JS based redirects we see lots of sites use.

> It'd be nice if we could return the protocol from the autocomplete matcher
> and have the right thing happen, both within the editor (if I autocomplete
> and hit enter, I want https...) and in speculative connections.
> 
> So maybe we can think about hiding the protocol in the presentation/editing
> layer, and otherwise passing around "https://www.reddit.com/"?
> 
> Or does it make sense to just return something richer than a String from
> findAutocompleteMatch?

I wouldn't mind seeing either approach used to make the domain completion respect the scheme, just in a new bug.

> What happens if the first item in the cursor is the same domain as the
> autocomplete match? Do we burn two speculative connections? (Is that a bad
> thing, given parallel fetches?)

We do not burn two speculative connections. The domain check in browser.js blocks the second attempt.

> > +          let uri = Services.io.newURI(aData, null, null);
> > +          if (uri && !this._domains.has(uri.host)) {
> > +            try {
> > +              Services.io.QueryInterface(Ci.nsISpeculativeConnect).speculativeConnect(uri, null);
> 
> There are now three or four places in browser.js that we do exactly this.
> Can we break this out to BrowserApp.speculativeConnect, and isolate the try
> block and the connector reference?

Not exactly the same. The search engine code is a bit different. The new code does a domain check on the nsURI. Sounds like you want a silentException(function) wrapper so we can call speculativeConnect without the try/catch block repeated. I don't think it's worth trying to pull that out yet. Also, I would not want it in BrowserApp. We could put it in a JSM or something. BrowserApp is needs to be gutted as it is.
(Reporter)

Comment 19

5 years ago
(In reply to Mark Finkle (:mfinkle) from comment #18)

> I don't believe speculative connections follow redirects. Especially not the
> brute-force JS based redirects we see lots of sites use.

Then yeah, we definitely want to get the scheme right.


> I wouldn't mind seeing either approach used to make the domain completion
> respect the scheme, just in a new bug.

I don't intend to drag this one out :D

I'll file some follow-ups.


> I don't think it's worth trying to pull that
> out yet. Also, I would not want it in BrowserApp. We could put it in a JSM
> or something. BrowserApp is needs to be gutted as it is.

Fair enough!
(Reporter)

Updated

5 years ago
Blocks: 947067
(Reporter)

Updated

5 years ago
Blocks: 947390
No longer depends on: 947390
https://hg.mozilla.org/mozilla-central/rev/8b409f4159a3
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 28
(Reporter)

Updated

5 years ago
Keywords: feature, relnote

Updated

5 years ago
relnote-firefox: --- → ?
relnote-firefox: ? → 28+

Updated

3 years ago
Depends on: 1220794
You need to log in before you can comment on or make changes to this bug.