Last Comment Bug 713503 - prefetch urls from known url shortening sites before gecko is running
: prefetch urls from known url shortening sites before gecko is running
Status: RESOLVED FIXED
:
Product: Firefox for Android
Classification: Client Software
Component: General (show other bugs)
: unspecified
: ARM Android
: P2 enhancement (vote)
: Firefox 12
Assigned To: Brad Lassey [:blassey] (use needinfo?)
:
Mentors:
: 708792 (view as bug list)
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2011-12-25 23:48 PST by Brad Lassey [:blassey] (use needinfo?)
Modified: 2012-01-20 13:48 PST (History)
17 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
fixed
fixed
11+


Attachments
patch (3.79 KB, patch)
2011-12-25 23:48 PST, Brad Lassey [:blassey] (use needinfo?)
no flags Details | Diff | Review
patch (3.82 KB, patch)
2011-12-25 23:56 PST, Brad Lassey [:blassey] (use needinfo?)
no flags Details | Diff | Review
patch (5.92 KB, patch)
2011-12-27 01:42 PST, Brad Lassey [:blassey] (use needinfo?)
mark.finkle: review-
Details | Diff | Review
patch (6.05 KB, patch)
2011-12-28 14:14 PST, Brad Lassey [:blassey] (use needinfo?)
no flags Details | Diff | Review
patch playing with UAs and logging (2.14 KB, patch)
2011-12-30 09:44 PST, Brad Lassey [:blassey] (use needinfo?)
no flags Details | Diff | Review
patch (6.24 KB, patch)
2011-12-30 09:56 PST, Brad Lassey [:blassey] (use needinfo?)
no flags Details | Diff | Review
patch (7.67 KB, patch)
2012-01-05 18:28 PST, Brad Lassey [:blassey] (use needinfo?)
no flags Details | Diff | Review
patch (7.98 KB, patch)
2012-01-05 18:35 PST, Brad Lassey [:blassey] (use needinfo?)
mark.finkle: review+
Details | Diff | Review
patch (9.51 KB, patch)
2012-01-09 23:56 PST, Brad Lassey [:blassey] (use needinfo?)
no flags Details | Diff | Review
patch (10.43 KB, patch)
2012-01-17 15:33 PST, Brad Lassey [:blassey] (use needinfo?)
mark.finkle: review+
akeybl: approval‑mozilla‑aurora+
Details | Diff | Review

Description Brad Lassey [:blassey] (use needinfo?) 2011-12-25 23:48:54 PST
Created attachment 584310 [details] [diff] [review]
patch

Loading a link from the twitter app has been identified as a primary use case. The round trip time for the shortened urls seems to be a non-trivial part of the page load time in this case. If we can get one of those round trips done while gecko is still being brought up we should be able to shortcut that bit.

In the case of a redirect to an other url shortener, such as twitter (t.co) redirecting to Al Jazeera (aje.me) we save two round trips
Comment 1 Brad Lassey [:blassey] (use needinfo?) 2011-12-25 23:56:09 PST
Created attachment 584311 [details] [diff] [review]
patch

I think I like this approach slightly better, here's the interdiff:

         Uri data = intent.getData();
         if (data != null && isKnownRedirector(data)) {
             GeckoAppShell.getHandler().post(new RedirectorRunnable(new Intent(intent)));
-            intent.setData(Uri.parse("about:blank"));
+            intent.setAction(Intent.ACTION_MAIN);
+            intent.setData(null);
         }
Comment 2 Brad Lassey [:blassey] (use needinfo?) 2011-12-27 01:42:23 PST
Created attachment 584401 [details] [diff] [review]
patch

This patch runs everything through this code. The advantage is it is no longer a white list so will also pick up on redirects such as www.facebook.com -> m.facebook.com. It also should do the job of the prefetchDNS() method, which it now replaces.
Comment 3 Mark Finkle (:mfinkle) (use needinfo?) 2011-12-28 07:09:29 PST
Comment on attachment 584401 [details] [diff] [review]
patch

>diff --git a/mobile/android/base/GeckoApp.java b/mobile/android/base/GeckoApp.java

>+        Uri data = intent.getData();
>+        if (data != null) {
>+            GeckoAppShell.getHandler().post(new RedirectorRunnable(new Intent(intent)));
>+            intent.setAction(Intent.ACTION_MAIN);
>+            intent.setData(null);
>+        }

This code changes the intent so we don't try to load a URI, but won't this cause us to load about:home? That would cause more CPU to be used than we'd like I assume. Can we try passing about:blank or about:empty ?

>+    String getDefaultUAString() {
>+        // This can be overridden in App.java.in to be preprocessed
>+        return "Mozilla/5.0 (Android; Linux armv7l; rv:12.0a1) Gecko/20111226 Firefox/12.0a1 Fennec/12.0a1";

I agree that we should be preprocessing this ASAP. Can we try to get that working better for the initial landing?

>+    class RedirectorRunnable implements Runnable {

>+        public void run() {
>+            try {
>+                // this class should only be initialized with an intent with non-null data
>+                URL url = new URL(mIntent.getData().toString());

Could check that the URL is not null or empty here, just to be safe

>+                // data url should have an http or https scheme
>+                HttpURLConnection connection = (HttpURLConnection) url.openConnection();
>+                connection.setRequestProperty( "User-Agent", getDefaultUAString());
>+                connection.setInstanceFollowRedirects(false);
>+                connection.setRequestMethod("GET");
>+                connection.connect();

Do we know the performance affect for non-redirect pages? How long do we wait here while loading a real page?

>+                    String loc = connection.getHeaderField("Location");

nit: loc -> location

>+                connection.disconnect();

Might want to move this to a finally block

>+            mMainHandler.post(new Runnable() {
>+                public void run() {
>+                    onNewIntent(mIntent);
>+                }

Why wait? Why not post to front of queue? We want this to happen ASAP right? Any delay will result in the user looking at about:home or about:blank if you change it

r- to address the feedback
Comment 4 Brad Lassey [:blassey] (use needinfo?) 2011-12-28 14:04:04 PST
(In reply to Mark Finkle (:mfinkle) from comment #3)
> Comment on attachment 584401 [details] [diff] [review]
> patch
> 
> >diff --git a/mobile/android/base/GeckoApp.java b/mobile/android/base/GeckoApp.java
> 
> >+        Uri data = intent.getData();
> >+        if (data != null) {
> >+            GeckoAppShell.getHandler().post(new RedirectorRunnable(new Intent(intent)));
> >+            intent.setAction(Intent.ACTION_MAIN);
> >+            intent.setData(null);
> >+        }
> 
> This code changes the intent so we don't try to load a URI, but won't this
> cause us to load about:home? That would cause more CPU to be used than we'd
> like I assume. Can we try passing about:blank or about:empty ?
No, about:home is shown or not shown in onCreate(), which runs before this so sees the original url
> 
> >+    String getDefaultUAString() {
> >+        // This can be overridden in App.java.in to be preprocessed
> >+        return "Mozilla/5.0 (Android; Linux armv7l; rv:12.0a1) Gecko/20111226 Firefox/12.0a1 Fennec/12.0a1";
> 
> I agree that we should be preprocessing this ASAP. Can we try to get that
> working better for the initial landing?
can you point me to the template we should create this from?
> 
> >+    class RedirectorRunnable implements Runnable {
> 
> >+        public void run() {
> >+            try {
> >+                // this class should only be initialized with an intent with non-null data
> >+                URL url = new URL(mIntent.getData().toString());
> 
> Could check that the URL is not null or empty here, just to be safe
> 
> >+                // data url should have an http or https scheme
> >+                HttpURLConnection connection = (HttpURLConnection) url.openConnection();
> >+                connection.setRequestProperty( "User-Agent", getDefaultUAString());
> >+                connection.setInstanceFollowRedirects(false);
> >+                connection.setRequestMethod("GET");
> >+                connection.connect();
> 
> Do we know the performance affect for non-redirect pages? How long do we
> wait here while loading a real page?
this only runs before gecko isn't up yet. I expect it to be a net help since we'll do the DNS look-up and get the radio fired up while gecko is off loading libraries from disk.
> 
> >+                    String loc = connection.getHeaderField("Location");
> 
> nit: loc -> location
sure
> 
> >+                connection.disconnect();
> 
> Might want to move this to a finally block
sure, although calling disconnect() is optional.
> 
> >+            mMainHandler.post(new Runnable() {
> >+                public void run() {
> >+                    onNewIntent(mIntent);
> >+                }
> 
> Why wait? Why not post to front of queue? We want this to happen ASAP right?
> Any delay will result in the user looking at about:home or about:blank if
> you change it
1) the main messange loop is supposed to not block, so it should get processed soon enough
2) gecko wasn't running yet when we started this, chances are its not yet either.

That being said, putting at the front of the queue isn't horrible either
Comment 5 Brad Lassey [:blassey] (use needinfo?) 2011-12-28 14:14:57 PST
Created attachment 584643 [details] [diff] [review]
patch
Comment 6 Doug Turner (:dougt) 2011-12-28 17:44:17 PST
i asked chris to collect data before this lands in Fennec.  Do we have any data that this makes an improvement?  I can't imagine it wouldn't, but it would be good to have some numbers backing up our claim.
Comment 7 Brad Lassey [:blassey] (use needinfo?) 2011-12-30 09:44:53 PST
Created attachment 584975 [details] [diff] [review]
patch playing with UAs and logging

our current Fennec UA is:
"Mozilla/5.0 (Android; Linux armv7l; rv:12.0a1) Gecko/20111226 Firefox/12.0a1 Fennec/12.0a1"

my testing shows that twitter's url shortener doesn't give us a redirecting response code with that UA. However it does for the iPad's UA. I played around with the UA a bit and came up with one that is pretty close to what we currently use and does get t.co to redirect us:
"Mozilla/5.0 (Linux; U; Android 2.3; en-us) Gecko/20111226 Firefox Mobile/12.0a1"

I know that changing our UA is a different bug, but hopefully this data can feed into that. Perhaps we can land this feature to send the UA that works for now until we figure out what we want to do with the real UA string.
Comment 8 Brad Lassey [:blassey] (use needinfo?) 2011-12-30 09:56:10 PST
Created attachment 584979 [details] [diff] [review]
patch

I lost the code that was restricting this to http urls when I rejigged it last
Comment 9 Mozilla RelEng Bot 2011-12-30 15:30:32 PST
Try run for 56272e1094c7 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=56272e1094c7
Results (out of 26 total builds):
    exception: 11
    success: 3
    warnings: 8
    failure: 4
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-56272e1094c7
Comment 10 Mozilla RelEng Bot 2011-12-31 16:30:33 PST
Try run for 2a1130098c85 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=2a1130098c85
Results (out of 26 total builds):
    exception: 2
    success: 6
    warnings: 12
    failure: 6
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-2a1130098c85
Comment 11 Mozilla RelEng Bot 2011-12-31 19:30:47 PST
Try run for a9b163809cbe is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=a9b163809cbe
Results (out of 25 total builds):
    success: 6
    warnings: 14
    failure: 5
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-a9b163809cbe
Comment 12 Mozilla RelEng Bot 2012-01-04 21:20:32 PST
Try run for 960067ff34a6 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=960067ff34a6
Results (out of 2 total builds):
    exception: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-960067ff34a6
Comment 13 Mozilla RelEng Bot 2012-01-04 21:20:34 PST
Try run for 960067ff34a6 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=960067ff34a6
Results (out of 2 total builds):
    exception: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-960067ff34a6
Comment 14 Mozilla RelEng Bot 2012-01-04 21:20:37 PST
Try run for 960067ff34a6 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=960067ff34a6
Results (out of 2 total builds):
    exception: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-960067ff34a6
Comment 15 Mozilla RelEng Bot 2012-01-04 21:20:39 PST
Try run for 960067ff34a6 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=960067ff34a6
Results (out of 2 total builds):
    exception: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-960067ff34a6
Comment 16 Mozilla RelEng Bot 2012-01-04 21:20:41 PST
Try run for 960067ff34a6 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=960067ff34a6
Results (out of 2 total builds):
    exception: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-960067ff34a6
Comment 17 Mozilla RelEng Bot 2012-01-05 01:01:00 PST
Try run for e75b82f8fe23 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=e75b82f8fe23
Results (out of 26 total builds):
    success: 20
    warnings: 6
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-e75b82f8fe23
Comment 18 Mozilla RelEng Bot 2012-01-05 01:01:03 PST
Try run for e75b82f8fe23 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=e75b82f8fe23
Results (out of 26 total builds):
    success: 20
    warnings: 6
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-e75b82f8fe23
Comment 19 Mozilla RelEng Bot 2012-01-05 01:01:05 PST
Try run for e75b82f8fe23 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=e75b82f8fe23
Results (out of 26 total builds):
    success: 20
    warnings: 6
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-e75b82f8fe23
Comment 20 Mozilla RelEng Bot 2012-01-05 01:01:08 PST
Try run for e75b82f8fe23 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=e75b82f8fe23
Results (out of 26 total builds):
    success: 20
    warnings: 6
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-e75b82f8fe23
Comment 21 Mozilla RelEng Bot 2012-01-05 01:01:10 PST
Try run for e75b82f8fe23 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=e75b82f8fe23
Results (out of 26 total builds):
    success: 20
    warnings: 6
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-e75b82f8fe23
Comment 22 Brad Lassey [:blassey] (use needinfo?) 2012-01-05 18:28:53 PST
Created attachment 586308 [details] [diff] [review]
patch

this patch special cases t.co rather than make any bigger UA String changes.
Comment 23 Chris Peterson [:cpeterson] 2012-01-05 18:35:20 PST
*** Bug 708792 has been marked as a duplicate of this bug. ***
Comment 24 Brad Lassey [:blassey] (use needinfo?) 2012-01-05 18:35:30 PST
Created attachment 586310 [details] [diff] [review]
patch
Comment 25 Tanvi Vyas - behind on reviews [:tanvi] 2012-01-06 11:16:54 PST
A couplw od 

What happens if (for some reason) the response from the request for the original url is not back by the time gecko is loaded?

If twitter (or another url shortening service) responds with a message saying the site is unsafe, what is the flow?  The shortened url returns a 302 to a twitter url with the warning message?  Sounds like this shouldn't be a problem with the current implementation, but wouldn't hurt to add a test case showing that the user isn't accidentally redirected to the malicious site without the warning.
Comment 26 Tanvi Vyas - behind on reviews [:tanvi] 2012-01-06 11:17:22 PST
A couple of questions:

What happens if (for some reason) the response from the request for the original url is not back by the time gecko is loaded?

If twitter (or another url shortening service) responds with a message saying the site is unsafe, what is the flow?  The shortened url returns a 302 to a twitter url with the warning message?  Sounds like this shouldn't be a problem with the current implementation, but wouldn't hurt to add a test case showing that the user isn't accidentally redirected to the malicious site without the warning.
Comment 27 David Chan [:dchan] 2012-01-06 11:47:31 PST
What happens in the following cases

> The response header contains an invalid Location
Uri.parse() only throws NPE. What is mIntent set to? Are mLastUri and mLastTitle used for any trust decisions?
 22                         mIntent.setData(Uri.parse(location));
 23                         mLastUri = mLastTitle = location;

> The response header contains a location such as about:home
Same question about trust decisions
Comment 28 Mark Finkle (:mfinkle) (use needinfo?) 2012-01-08 07:02:38 PST
(In reply to Tanvi Vyas from comment #26)
> A couple of questions:
> 
> What happens if (for some reason) the response from the request for the
> original url is not back by the time gecko is loaded?

Gecko should just load the original, shortened URL

> If twitter (or another url shortening service) responds with a message
> saying the site is unsafe, what is the flow?

Fennec would likely display the "unsafe" message
Comment 29 Mark Finkle (:mfinkle) (use needinfo?) 2012-01-08 07:04:19 PST
(In reply to David Chan [:dchan] from comment #27)
> What happens in the following cases
> 
> > The response header contains an invalid Location
> Uri.parse() only throws NPE. What is mIntent set to? Are mLastUri and
> mLastTitle used for any trust decisions?
>  22                         mIntent.setData(Uri.parse(location));
>  23                         mLastUri = mLastTitle = location;

We should protect against a possible NPE

> The response header contains a location such as about:home
> Same question about trust decisions

I think we should blacklist about: or chrome: redirects
Comment 30 Mark Finkle (:mfinkle) (use needinfo?) 2012-01-08 07:22:50 PST
Comment on attachment 586310 [details] [diff] [review]
patch

>diff --git a/mobile/android/base/App.java.in b/mobile/android/base/App.java.in

>+    public String getDefaultUAString() {
>+        return "Mozilla/5.0 (Android; Linux armv7l; rv:@MOZ_APP_VERSION@) Gecko/@UA_BUILDID@ Firefox/@MOZ_APP_VERSION@ Fennec/@MOZ_APP_VERSION@";
>+    }
>+
>+    public String getUAStringForHost(String host) {
>+        if ("t.co".equals(host))
>+            return "Mozilla/5.0 (Android; Linux armv7l; rv:@MOZ_APP_VERSION@) Gecko/@UA_BUILDID@ Firefox Mobile/@MOZ_APP_VERSION@";


Would this work?
              return "Mozilla/5.0 (Android; Linux armv7l; rv:@MOZ_APP_VERSION@) Gecko/@UA_BUILDID@ Firefox/@MOZ_APP_VERSION@ Mobile/@MOZ_APP_VERSION@";

The only diff there is Fennec -> Mobile (and I would be OK with pushing that as our default UA too)

>diff --git a/mobile/android/base/GeckoApp.java b/mobile/android/base/GeckoApp.java
>         if (sGREDir == null)
>             sGREDir = new File(this.getApplicationInfo().dataDir);
>+        Uri data = intent.getData();

Add a blank line above

>+        if (data != null && ("http".equals(data.getScheme()) || "https".equals(data.getScheme()))) {
>+            GeckoAppShell.getHandler().post(new RedirectorRunnable(new Intent(intent)));
>+            intent.setAction(Intent.ACTION_MAIN);
>+            intent.setData(null);

Can you add a simple comment here about what affect setAction(Intent.ACTION_MAIN) and setData(null) have on the flow of this feature. I just want to make it easier to keep new code from breaking the feature.

>-        prefetchDNS(intent.getData());

The new code would only handle prefetching "http" and "https". Is that enough to not care about any others? I guess "ftp" might be the only "uses the modem" protocol?

>+    abstract public String getDefaultUAString();
>+    abstract public String getUAStringForHost(String host);
>+
>+    class RedirectorRunnable implements Runnable {

>+        public void run() {

>+                if (code >= 300 && code < 400) {

Can we be more strict about the return code? 300 to 400 is a wide range.

>+                    String location = connection.getHeaderField("Location");
>+                    if (location != null) {
>+                        mIntent.setData(Uri.parse(location));
>+                        mLastUri = mLastTitle = location;

#1: If Uri.parse fails for some reason, mLastUri & mLastTitle would be left in their old state. Should we reset them by wrapping this in a try/catch?
#2: Should we blacklist about: and chrome: URIs from being redirected too? The point is: Someone could use this redirection as a malicious attack vector.


>+            } catch (IOException ioe) {
>+                Log.i(LOGTAG, "exception trying to pre-fetch redirected url", ioe);
>+                mIntent.putExtra("prefetched", 1);
>+            } catch (Exception e) {
>+                Log.w(LOGTAG, "unexpected exception, passing url directly to Gecko but we should explicitly catch this", e);
>+                mIntent.putExtra("prefetched", 1);                    
>+            } finally {
>+            } finally {
>+                if (connection != null)
>+                    connection.disconnect();

Too many "finally" blocks

>+        if (checkLaunchState(LaunchState.Launched)) {
>+            Uri data = intent.getData();
>+            Bundle bundle = intent.getExtras();
>+            if (data != null && 
>+                ("http".equals(data.getScheme()) || "https".equals(data.getScheme())) && 
>+                (bundle == null || bundle.getInt("prefetched", 0) != 1)) {
>+                GeckoAppShell.getHandler().post(new RedirectorRunnable(intent));
>+                return;
>+            }
>+        }

This code could really use a simple comment about how the flow is expected to work: data != null means what? what is the affect of "prefetched"? I just think more info would be easier for maintains to not break what you have created.

Landing this soon would be good. We need stability feedback. We should also take the time to actually test any speed improvments. If we can't measure any speedup, we should remove the code. It adds some complexity to the startup and if it adds no measureable value, it should be removed.

r+ with these comments addressed
Comment 31 Brad Lassey [:blassey] (use needinfo?) 2012-01-09 19:45:06 PST
(In reply to Mark Finkle (:mfinkle) from comment #30)
> Comment on attachment 586310 [details] [diff] [review]
> patch
> 
> >diff --git a/mobile/android/base/App.java.in b/mobile/android/base/App.java.in
> 
> >+    public String getDefaultUAString() {
> >+        return "Mozilla/5.0 (Android; Linux armv7l; rv:@MOZ_APP_VERSION@) Gecko/@UA_BUILDID@ Firefox/@MOZ_APP_VERSION@ Fennec/@MOZ_APP_VERSION@";
> >+    }
> >+
> >+    public String getUAStringForHost(String host) {
> >+        if ("t.co".equals(host))
> >+            return "Mozilla/5.0 (Android; Linux armv7l; rv:@MOZ_APP_VERSION@) Gecko/@UA_BUILDID@ Firefox Mobile/@MOZ_APP_VERSION@";
> 
> 
> Would this work?
>               return "Mozilla/5.0 (Android; Linux armv7l;
> rv:@MOZ_APP_VERSION@) Gecko/@UA_BUILDID@ Firefox/@MOZ_APP_VERSION@
> Mobile/@MOZ_APP_VERSION@";
> 
> The only diff there is Fennec -> Mobile (and I would be OK with pushing that
> as our default UA too)
nope
Comment 32 Brad Lassey [:blassey] (use needinfo?) 2012-01-09 23:55:52 PST
(In reply to Mark Finkle (:mfinkle) from comment #30)
> >-        prefetchDNS(intent.getData());
> 
> The new code would only handle prefetching "http" and "https". Is that
> enough to not care about any others? I guess "ftp" might be the only "uses
> the modem" protocol?
Given how rare I assume the use case of ftp is, I'd rather not add the complexity to prefetch for it.

> >+                if (code >= 300 && code < 400) {
> 
> Can we be more strict about the return code? 300 to 400 is a wide range.
anything in that range should be a redirect of some sort



> 
> >+                    String location = connection.getHeaderField("Location");
> >+                    if (location != null) {
> >+                        mIntent.setData(Uri.parse(location));
> >+                        mLastUri = mLastTitle = location;
> 
> #1: If Uri.parse fails for some reason, mLastUri & mLastTitle would be left
> in their old state. Should we reset them by wrapping this in a try/catch?
It already is wrapped in a try/catch. If parse() throws, setData() won't be called and mLastUri and mLastTitle won't get touched.
 
> r+ with these comments addressed
re-requesting review because the UA String you proposed doesn't work and rebasing this required adding the passedUri variable
Comment 33 Brad Lassey [:blassey] (use needinfo?) 2012-01-09 23:56:28 PST
Created attachment 587253 [details] [diff] [review]
patch
Comment 34 David Chan [:dchan] 2012-01-10 10:19:54 PST
(In reply to Brad Lassey [:blassey] from comment #32)
> (In reply to Mark Finkle (:mfinkle) from comment #30)
> > 
> > >+                    String location = connection.getHeaderField("Location");
> > >+                    if (location != null) {
> > >+                        mIntent.setData(Uri.parse(location));
> > >+                        mLastUri = mLastTitle = location;
> > 
> > #1: If Uri.parse fails for some reason, mLastUri & mLastTitle would be left
> > in their old state. Should we reset them by wrapping this in a try/catch?
> It already is wrapped in a try/catch. If parse() throws, setData() won't be
> called and mLastUri and mLastTitle won't get touched.

Even with the try/catch, the documentation says that URI functions may return garbage. However parse() does expect "an RFC 2396-compliant, encoded URI"

"In the interest of performance, this class performs little to no validation. Behavior is undefined for invalid input. This class is very forgiving--in the face of invalid input, it will return garbage rather than throw an exception unless otherwise specified." [1]

[1] http://developer.android.com/reference/android/net/Uri.html
Comment 35 Brad Lassey [:blassey] (use needinfo?) 2012-01-10 10:45:36 PST
(In reply to David Chan [:dchan] from comment #34)
> (In reply to Brad Lassey [:blassey] from comment #32)
> > (In reply to Mark Finkle (:mfinkle) from comment #30)
> > > 
> > > >+                    String location = connection.getHeaderField("Location");
> > > >+                    if (location != null) {
> > > >+                        mIntent.setData(Uri.parse(location));
> > > >+                        mLastUri = mLastTitle = location;
> > > 
> > > #1: If Uri.parse fails for some reason, mLastUri & mLastTitle would be left
> > > in their old state. Should we reset them by wrapping this in a try/catch?
> > It already is wrapped in a try/catch. If parse() throws, setData() won't be
> > called and mLastUri and mLastTitle won't get touched.
> 
> Even with the try/catch, the documentation says that URI functions may
> return garbage. However parse() does expect "an RFC 2396-compliant, encoded
> URI"
> 
> "In the interest of performance, this class performs little to no
> validation. Behavior is undefined for invalid input. This class is very
> forgiving--in the face of invalid input, it will return garbage rather than
> throw an exception unless otherwise specified." [1]
> 
> [1] http://developer.android.com/reference/android/net/Uri.html

If it doesn't throw then we can trust that Uri.parse(spec).toString.equals(spec) to be true, which is all we care about.
Comment 36 Ian Melven :imelven 2012-01-10 13:56:41 PST
(In reply to Mark Finkle (:mfinkle) from comment #29)
>
> > The response header contains a location such as about:home
> > Same question about trust decisions
> 
> I think we should blacklist about: or chrome: redirects

could we consider a whitelist of http: and https: ? we don't try to do the redirect check for non http: and https: source URL's and i think this minimizes the risk here.
Comment 37 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-01-10 16:01:15 PST
I spoke with dougt about this.

It is *very* unfortunate that Necko (as part of libxul) takes too long to start up, to the point that we have to duplicate this logic in the Java code. If this is a big win, we should do it, but we should really fix the underlying problem--Necko (libxul) taking too long to start up--as part of the long-term solution to this.

Here are some security and correctness constraints:

* We should not fetch a http://example.org URL without checking whether example.org is a HSTS site. The problem is that the HSTS service is in Gecko and it seems quite impractical to duplicate it correctly in Java. So, AFAICT, the only way to verify that the site isn't HSTS is to hard-code a list of non-HSTS sites into the feature.

* In general, the response you get from a server (even which redirect or whether to redirect) is dependent on the information in the request, especially cookies and/or HTTP auth data. Since we don't have access to cookies (since they are in Necko, in libxul), we should make sure that we do not include ANY cookies in the request and that we do not set any cookies from the responses. And, again, this means that we must hard-code a list of sites that give the correct redirects without any cookies present. We should investigate how the implementation of the Java APIs handles cookies.

* Do we really need to do this for https:// links? If so, then we have to resolve the issue of the mismatch in CA trust between the native Android stack and our SSL stack, or more generally the differences in security properties between the Android stack and our stack. It is a LOT of work to try to analyze all of these differences and I don't have time to do so any time soon. So, at least for now, we must restrict it to non-HTTPS sites. If we think it is important to support HTTPS sites for this, then I suggest we do so in a follow-up bug so that we don't block the more common cases on that separate and unscheduled security analysis.

* It seems impractical to duplicate any logic regarding the parsing or processing of response bodies or non-redirect responses. So, we should make sure we throw away the response body for any response, and the entire response for any non-redirect response.

* We should be careful to avoid redirect loops and very long or infinite redirect chains. I think that if we have a hard-coded list of URL shorteners as suggested above, we can avoid redirect loops. But, if we tried to open this up, redirect loops are much more likely. Are these things handled internally in the implementation of the Java APIs? If so, are they handled correctly? If not, are we handling them correctly?

* Even with a hardcoded list, captive portals are going to cause us many of the troubles above--especially with potential redirect loops. Note that Necko doesn't handle captive portals correctly either.

* We should make sure that we are handling the case where Firefox (including libxul) is already loaded when the intent is activated. If Necko (libxul) is already loaded, I bet it would be better to NOT do this prefecthing, and instead send the request directly to Necko. I believe that if Necko is loaded, Necko doing the fetching will be faster because it maintains its own DNS cache, other caches, connection pools, etc. that are not shared with the Java HTTP implementation.

* My understanding is that this prefetching happens ONLY when we are opening a URL that was given to us from an external application, that we intend to load unconditionally anyway. In that case, I don't think there are any significant privacy implications of the feature, unless the Android Java HTTP API implementation has privacy leaks in its implementation. The exception might be that we will do this HTTP request unconditionally without checking with the Necko HTTP (disk) cache first; if the cache already had the redirect cached then we would have avoided making the request at all.

* Please indicate in a comment in the source code that this code has non-obvious security, privacy, and correctness factors that should be addressed. (Referencing this bug directly in the comment might be a good idea here, even though it is bad form normally.) This way, people who extend the feature in the future are aware of these non-obvious considerations and they will be able to extend the feature properly.
Comment 38 Jason Duell [:jduell] (needinfo? me) 2012-01-17 12:03:45 PST
It seems very hard to get this approach right. If there's any way to get necko up and running earlier that would be preferable.  That said, I'm not sure easy that would be--necko doesn't rely on most things (content, layout, etc), but it does listen to XPCOM startup events, uses the pref service, etc.  Perhaps we should start a conversation in dev.tech.networking about what that might involve and if it's feasible.
Comment 39 Brad Lassey [:blassey] (use needinfo?) 2012-01-17 15:33:50 PST
Created attachment 589326 [details] [diff] [review]
patch

based on brian's feedback, this restricts to http and a white list.

We should generate a more thoroughly thought out whitelist than this though.
Comment 40 Mark Finkle (:mfinkle) (use needinfo?) 2012-01-17 19:53:05 PST
Comment on attachment 589326 [details] [diff] [review]
patch

>+    public String getUAStringForHost(String host) {
>+        if ("t.co".equals(host))
>+            return "Mozilla/5.0 (Android; Linux armv7l; rv:@MOZ_APP_VERSION@) Gecko/@UA_BUILDID@ Firefox Mobile/@MOZ_APP_VERSION@";
>+        return getDefaultUAString();
>+    }

Add a quick comment here that t.co will try a client-side redirect without the "right" UA

>+    private final String kPrefetchWhiteListArray[] = new String[] { "t.co",
>+                                                       "bit.ly",

nit: can you put "t.co" on it's own line too?
Comment 42 Brad Lassey [:blassey] (use needinfo?) 2012-01-18 08:36:57 PST
Comment on attachment 589326 [details] [diff] [review]
patch

[Approval Request Comment]
Regression caused by (bug #): 
User impact if declined: 
It will take a up to a second longer to load a link from the twitter app
Testing completed (on m-c, etc.): 
just landed on m-c
Risk to taking this patch (and alternatives if risky):
This change carries with it a fair amount of risk. We should let it ride on m-c for a few days before up lifting. The alternative is to not take it.
Comment 43 Ian Melven :imelven 2012-01-18 11:14:37 PST
(In reply to Mark Finkle (:mfinkle) from comment #30)
>
> We should also
> take the time to actually test any speed improvments. If we can't measure
> any speedup, we should remove the code. It adds some complexity to the
> startup and if it adds no measureable value, it should be removed.
> 

is there a followup bug for the performance testing of this feature ?
Comment 44 Alex Keybl [:akeybl] 2012-01-19 12:03:01 PST
Comment on attachment 589326 [details] [diff] [review]
patch

[Triage Comment]
Mobile only - approved for Aurora.
Comment 45 Brad Lassey [:blassey] (use needinfo?) 2012-01-20 13:48:06 PST
https://hg.mozilla.org/releases/mozilla-aurora/rev/4012b1f4bd56

Note You need to log in before you can comment on or make changes to this bug.