Closed Bug 713503 Opened 12 years ago Closed 12 years ago

prefetch urls from known url shortening sites before gecko is running

Categories

(Firefox for Android Graveyard :: General, enhancement, P2)

ARM
Android
enhancement

Tracking

(firefox11 fixed, firefox12 fixed, fennec11+)

RESOLVED FIXED
Firefox 12
Tracking Status
firefox11 --- fixed
firefox12 --- fixed
fennec 11+ ---

People

(Reporter: blassey, Assigned: blassey)

References

Details

Attachments

(1 file, 9 obsolete files)

Attached patch patch (obsolete) — Splinter Review
Loading a link from the twitter app has been identified as a primary use case. The round trip time for the shortened urls seems to be a non-trivial part of the page load time in this case. If we can get one of those round trips done while gecko is still being brought up we should be able to shortcut that bit.

In the case of a redirect to an other url shortener, such as twitter (t.co) redirecting to Al Jazeera (aje.me) we save two round trips
Attachment #584310 - Flags: review?(doug.turner)
Attached patch patch (obsolete) — Splinter Review
I think I like this approach slightly better, here's the interdiff:

         Uri data = intent.getData();
         if (data != null && isKnownRedirector(data)) {
             GeckoAppShell.getHandler().post(new RedirectorRunnable(new Intent(intent)));
-            intent.setData(Uri.parse("about:blank"));
+            intent.setAction(Intent.ACTION_MAIN);
+            intent.setData(null);
         }
Assignee: nobody → blassey.bugs
Attachment #584310 - Attachment is obsolete: true
Attachment #584310 - Flags: review?(doug.turner)
Attachment #584311 - Flags: review?(doug.turner)
Attached patch patch (obsolete) — Splinter Review
This patch runs everything through this code. The advantage is it is no longer a white list so will also pick up on redirects such as www.facebook.com -> m.facebook.com. It also should do the job of the prefetchDNS() method, which it now replaces.
Attachment #584311 - Attachment is obsolete: true
Attachment #584311 - Flags: review?(doug.turner)
Attachment #584401 - Flags: review?(doug.turner)
Comment on attachment 584401 [details] [diff] [review]
patch

>diff --git a/mobile/android/base/GeckoApp.java b/mobile/android/base/GeckoApp.java

>+        Uri data = intent.getData();
>+        if (data != null) {
>+            GeckoAppShell.getHandler().post(new RedirectorRunnable(new Intent(intent)));
>+            intent.setAction(Intent.ACTION_MAIN);
>+            intent.setData(null);
>+        }

This code changes the intent so we don't try to load a URI, but won't this cause us to load about:home? That would cause more CPU to be used than we'd like I assume. Can we try passing about:blank or about:empty ?

>+    String getDefaultUAString() {
>+        // This can be overridden in App.java.in to be preprocessed
>+        return "Mozilla/5.0 (Android; Linux armv7l; rv:12.0a1) Gecko/20111226 Firefox/12.0a1 Fennec/12.0a1";

I agree that we should be preprocessing this ASAP. Can we try to get that working better for the initial landing?

>+    class RedirectorRunnable implements Runnable {

>+        public void run() {
>+            try {
>+                // this class should only be initialized with an intent with non-null data
>+                URL url = new URL(mIntent.getData().toString());

Could check that the URL is not null or empty here, just to be safe

>+                // data url should have an http or https scheme
>+                HttpURLConnection connection = (HttpURLConnection) url.openConnection();
>+                connection.setRequestProperty( "User-Agent", getDefaultUAString());
>+                connection.setInstanceFollowRedirects(false);
>+                connection.setRequestMethod("GET");
>+                connection.connect();

Do we know the performance affect for non-redirect pages? How long do we wait here while loading a real page?

>+                    String loc = connection.getHeaderField("Location");

nit: loc -> location

>+                connection.disconnect();

Might want to move this to a finally block

>+            mMainHandler.post(new Runnable() {
>+                public void run() {
>+                    onNewIntent(mIntent);
>+                }

Why wait? Why not post to front of queue? We want this to happen ASAP right? Any delay will result in the user looking at about:home or about:blank if you change it

r- to address the feedback
Attachment #584401 - Flags: review?(doug.turner) → review-
(In reply to Mark Finkle (:mfinkle) from comment #3)
> Comment on attachment 584401 [details] [diff] [review]
> patch
> 
> >diff --git a/mobile/android/base/GeckoApp.java b/mobile/android/base/GeckoApp.java
> 
> >+        Uri data = intent.getData();
> >+        if (data != null) {
> >+            GeckoAppShell.getHandler().post(new RedirectorRunnable(new Intent(intent)));
> >+            intent.setAction(Intent.ACTION_MAIN);
> >+            intent.setData(null);
> >+        }
> 
> This code changes the intent so we don't try to load a URI, but won't this
> cause us to load about:home? That would cause more CPU to be used than we'd
> like I assume. Can we try passing about:blank or about:empty ?
No, about:home is shown or not shown in onCreate(), which runs before this so sees the original url
> 
> >+    String getDefaultUAString() {
> >+        // This can be overridden in App.java.in to be preprocessed
> >+        return "Mozilla/5.0 (Android; Linux armv7l; rv:12.0a1) Gecko/20111226 Firefox/12.0a1 Fennec/12.0a1";
> 
> I agree that we should be preprocessing this ASAP. Can we try to get that
> working better for the initial landing?
can you point me to the template we should create this from?
> 
> >+    class RedirectorRunnable implements Runnable {
> 
> >+        public void run() {
> >+            try {
> >+                // this class should only be initialized with an intent with non-null data
> >+                URL url = new URL(mIntent.getData().toString());
> 
> Could check that the URL is not null or empty here, just to be safe
> 
> >+                // data url should have an http or https scheme
> >+                HttpURLConnection connection = (HttpURLConnection) url.openConnection();
> >+                connection.setRequestProperty( "User-Agent", getDefaultUAString());
> >+                connection.setInstanceFollowRedirects(false);
> >+                connection.setRequestMethod("GET");
> >+                connection.connect();
> 
> Do we know the performance affect for non-redirect pages? How long do we
> wait here while loading a real page?
this only runs before gecko isn't up yet. I expect it to be a net help since we'll do the DNS look-up and get the radio fired up while gecko is off loading libraries from disk.
> 
> >+                    String loc = connection.getHeaderField("Location");
> 
> nit: loc -> location
sure
> 
> >+                connection.disconnect();
> 
> Might want to move this to a finally block
sure, although calling disconnect() is optional.
> 
> >+            mMainHandler.post(new Runnable() {
> >+                public void run() {
> >+                    onNewIntent(mIntent);
> >+                }
> 
> Why wait? Why not post to front of queue? We want this to happen ASAP right?
> Any delay will result in the user looking at about:home or about:blank if
> you change it
1) the main messange loop is supposed to not block, so it should get processed soon enough
2) gecko wasn't running yet when we started this, chances are its not yet either.

That being said, putting at the front of the queue isn't horrible either
Attached patch patch (obsolete) — Splinter Review
Attachment #584401 - Attachment is obsolete: true
Attachment #584643 - Flags: review?(mark.finkle)
i asked chris to collect data before this lands in Fennec.  Do we have any data that this makes an improvement?  I can't imagine it wouldn't, but it would be good to have some numbers backing up our claim.
Priority: -- → P2
our current Fennec UA is:
"Mozilla/5.0 (Android; Linux armv7l; rv:12.0a1) Gecko/20111226 Firefox/12.0a1 Fennec/12.0a1"

my testing shows that twitter's url shortener doesn't give us a redirecting response code with that UA. However it does for the iPad's UA. I played around with the UA a bit and came up with one that is pretty close to what we currently use and does get t.co to redirect us:
"Mozilla/5.0 (Linux; U; Android 2.3; en-us) Gecko/20111226 Firefox Mobile/12.0a1"

I know that changing our UA is a different bug, but hopefully this data can feed into that. Perhaps we can land this feature to send the UA that works for now until we figure out what we want to do with the real UA string.
Attached patch patch (obsolete) — Splinter Review
I lost the code that was restricting this to http urls when I rejigged it last
Attachment #584643 - Attachment is obsolete: true
Attachment #584643 - Flags: review?(mark.finkle)
Attachment #584979 - Flags: review?(mark.finkle)
Try run for 56272e1094c7 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=56272e1094c7
Results (out of 26 total builds):
    exception: 11
    success: 3
    warnings: 8
    failure: 4
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-56272e1094c7
Try run for 2a1130098c85 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=2a1130098c85
Results (out of 26 total builds):
    exception: 2
    success: 6
    warnings: 12
    failure: 6
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-2a1130098c85
Try run for a9b163809cbe is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=a9b163809cbe
Results (out of 25 total builds):
    success: 6
    warnings: 14
    failure: 5
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-a9b163809cbe
Try run for 960067ff34a6 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=960067ff34a6
Results (out of 2 total builds):
    exception: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-960067ff34a6
Try run for 960067ff34a6 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=960067ff34a6
Results (out of 2 total builds):
    exception: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-960067ff34a6
Try run for 960067ff34a6 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=960067ff34a6
Results (out of 2 total builds):
    exception: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-960067ff34a6
Try run for 960067ff34a6 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=960067ff34a6
Results (out of 2 total builds):
    exception: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-960067ff34a6
Try run for 960067ff34a6 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=960067ff34a6
Results (out of 2 total builds):
    exception: 2
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-960067ff34a6
Try run for e75b82f8fe23 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=e75b82f8fe23
Results (out of 26 total builds):
    success: 20
    warnings: 6
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-e75b82f8fe23
Try run for e75b82f8fe23 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=e75b82f8fe23
Results (out of 26 total builds):
    success: 20
    warnings: 6
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-e75b82f8fe23
Try run for e75b82f8fe23 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=e75b82f8fe23
Results (out of 26 total builds):
    success: 20
    warnings: 6
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-e75b82f8fe23
Try run for e75b82f8fe23 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=e75b82f8fe23
Results (out of 26 total builds):
    success: 20
    warnings: 6
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-e75b82f8fe23
Try run for e75b82f8fe23 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=e75b82f8fe23
Results (out of 26 total builds):
    success: 20
    warnings: 6
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/blassey@mozilla.com-e75b82f8fe23
Attached patch patch (obsolete) — Splinter Review
this patch special cases t.co rather than make any bigger UA String changes.
Attachment #584975 - Attachment is obsolete: true
Attachment #584979 - Attachment is obsolete: true
Attachment #584979 - Flags: review?(mark.finkle)
Attachment #586308 - Flags: review?(mark.finkle)
Attached patch patch (obsolete) — Splinter Review
Attachment #586308 - Attachment is obsolete: true
Attachment #586308 - Flags: review?(mark.finkle)
Attachment #586310 - Flags: review?(mark.finkle)
A couplw od 

What happens if (for some reason) the response from the request for the original url is not back by the time gecko is loaded?

If twitter (or another url shortening service) responds with a message saying the site is unsafe, what is the flow?  The shortened url returns a 302 to a twitter url with the warning message?  Sounds like this shouldn't be a problem with the current implementation, but wouldn't hurt to add a test case showing that the user isn't accidentally redirected to the malicious site without the warning.
A couple of questions:

What happens if (for some reason) the response from the request for the original url is not back by the time gecko is loaded?

If twitter (or another url shortening service) responds with a message saying the site is unsafe, what is the flow?  The shortened url returns a 302 to a twitter url with the warning message?  Sounds like this shouldn't be a problem with the current implementation, but wouldn't hurt to add a test case showing that the user isn't accidentally redirected to the malicious site without the warning.
What happens in the following cases

> The response header contains an invalid Location
Uri.parse() only throws NPE. What is mIntent set to? Are mLastUri and mLastTitle used for any trust decisions?
 22                         mIntent.setData(Uri.parse(location));
 23                         mLastUri = mLastTitle = location;

> The response header contains a location such as about:home
Same question about trust decisions
tracking-fennec: --- → 11+
(In reply to Tanvi Vyas from comment #26)
> A couple of questions:
> 
> What happens if (for some reason) the response from the request for the
> original url is not back by the time gecko is loaded?

Gecko should just load the original, shortened URL

> If twitter (or another url shortening service) responds with a message
> saying the site is unsafe, what is the flow?

Fennec would likely display the "unsafe" message
(In reply to David Chan [:dchan] from comment #27)
> What happens in the following cases
> 
> > The response header contains an invalid Location
> Uri.parse() only throws NPE. What is mIntent set to? Are mLastUri and
> mLastTitle used for any trust decisions?
>  22                         mIntent.setData(Uri.parse(location));
>  23                         mLastUri = mLastTitle = location;

We should protect against a possible NPE

> The response header contains a location such as about:home
> Same question about trust decisions

I think we should blacklist about: or chrome: redirects
Comment on attachment 586310 [details] [diff] [review]
patch

>diff --git a/mobile/android/base/App.java.in b/mobile/android/base/App.java.in

>+    public String getDefaultUAString() {
>+        return "Mozilla/5.0 (Android; Linux armv7l; rv:@MOZ_APP_VERSION@) Gecko/@UA_BUILDID@ Firefox/@MOZ_APP_VERSION@ Fennec/@MOZ_APP_VERSION@";
>+    }
>+
>+    public String getUAStringForHost(String host) {
>+        if ("t.co".equals(host))
>+            return "Mozilla/5.0 (Android; Linux armv7l; rv:@MOZ_APP_VERSION@) Gecko/@UA_BUILDID@ Firefox Mobile/@MOZ_APP_VERSION@";


Would this work?
              return "Mozilla/5.0 (Android; Linux armv7l; rv:@MOZ_APP_VERSION@) Gecko/@UA_BUILDID@ Firefox/@MOZ_APP_VERSION@ Mobile/@MOZ_APP_VERSION@";

The only diff there is Fennec -> Mobile (and I would be OK with pushing that as our default UA too)

>diff --git a/mobile/android/base/GeckoApp.java b/mobile/android/base/GeckoApp.java
>         if (sGREDir == null)
>             sGREDir = new File(this.getApplicationInfo().dataDir);
>+        Uri data = intent.getData();

Add a blank line above

>+        if (data != null && ("http".equals(data.getScheme()) || "https".equals(data.getScheme()))) {
>+            GeckoAppShell.getHandler().post(new RedirectorRunnable(new Intent(intent)));
>+            intent.setAction(Intent.ACTION_MAIN);
>+            intent.setData(null);

Can you add a simple comment here about what affect setAction(Intent.ACTION_MAIN) and setData(null) have on the flow of this feature. I just want to make it easier to keep new code from breaking the feature.

>-        prefetchDNS(intent.getData());

The new code would only handle prefetching "http" and "https". Is that enough to not care about any others? I guess "ftp" might be the only "uses the modem" protocol?

>+    abstract public String getDefaultUAString();
>+    abstract public String getUAStringForHost(String host);
>+
>+    class RedirectorRunnable implements Runnable {

>+        public void run() {

>+                if (code >= 300 && code < 400) {

Can we be more strict about the return code? 300 to 400 is a wide range.

>+                    String location = connection.getHeaderField("Location");
>+                    if (location != null) {
>+                        mIntent.setData(Uri.parse(location));
>+                        mLastUri = mLastTitle = location;

#1: If Uri.parse fails for some reason, mLastUri & mLastTitle would be left in their old state. Should we reset them by wrapping this in a try/catch?
#2: Should we blacklist about: and chrome: URIs from being redirected too? The point is: Someone could use this redirection as a malicious attack vector.


>+            } catch (IOException ioe) {
>+                Log.i(LOGTAG, "exception trying to pre-fetch redirected url", ioe);
>+                mIntent.putExtra("prefetched", 1);
>+            } catch (Exception e) {
>+                Log.w(LOGTAG, "unexpected exception, passing url directly to Gecko but we should explicitly catch this", e);
>+                mIntent.putExtra("prefetched", 1);                    
>+            } finally {
>+            } finally {
>+                if (connection != null)
>+                    connection.disconnect();

Too many "finally" blocks

>+        if (checkLaunchState(LaunchState.Launched)) {
>+            Uri data = intent.getData();
>+            Bundle bundle = intent.getExtras();
>+            if (data != null && 
>+                ("http".equals(data.getScheme()) || "https".equals(data.getScheme())) && 
>+                (bundle == null || bundle.getInt("prefetched", 0) != 1)) {
>+                GeckoAppShell.getHandler().post(new RedirectorRunnable(intent));
>+                return;
>+            }
>+        }

This code could really use a simple comment about how the flow is expected to work: data != null means what? what is the affect of "prefetched"? I just think more info would be easier for maintains to not break what you have created.

Landing this soon would be good. We need stability feedback. We should also take the time to actually test any speed improvments. If we can't measure any speedup, we should remove the code. It adds some complexity to the startup and if it adds no measureable value, it should be removed.

r+ with these comments addressed
Attachment #586310 - Flags: review?(mark.finkle) → review+
(In reply to Mark Finkle (:mfinkle) from comment #30)
> Comment on attachment 586310 [details] [diff] [review]
> patch
> 
> >diff --git a/mobile/android/base/App.java.in b/mobile/android/base/App.java.in
> 
> >+    public String getDefaultUAString() {
> >+        return "Mozilla/5.0 (Android; Linux armv7l; rv:@MOZ_APP_VERSION@) Gecko/@UA_BUILDID@ Firefox/@MOZ_APP_VERSION@ Fennec/@MOZ_APP_VERSION@";
> >+    }
> >+
> >+    public String getUAStringForHost(String host) {
> >+        if ("t.co".equals(host))
> >+            return "Mozilla/5.0 (Android; Linux armv7l; rv:@MOZ_APP_VERSION@) Gecko/@UA_BUILDID@ Firefox Mobile/@MOZ_APP_VERSION@";
> 
> 
> Would this work?
>               return "Mozilla/5.0 (Android; Linux armv7l;
> rv:@MOZ_APP_VERSION@) Gecko/@UA_BUILDID@ Firefox/@MOZ_APP_VERSION@
> Mobile/@MOZ_APP_VERSION@";
> 
> The only diff there is Fennec -> Mobile (and I would be OK with pushing that
> as our default UA too)
nope
(In reply to Mark Finkle (:mfinkle) from comment #30)
> >-        prefetchDNS(intent.getData());
> 
> The new code would only handle prefetching "http" and "https". Is that
> enough to not care about any others? I guess "ftp" might be the only "uses
> the modem" protocol?
Given how rare I assume the use case of ftp is, I'd rather not add the complexity to prefetch for it.

> >+                if (code >= 300 && code < 400) {
> 
> Can we be more strict about the return code? 300 to 400 is a wide range.
anything in that range should be a redirect of some sort



> 
> >+                    String location = connection.getHeaderField("Location");
> >+                    if (location != null) {
> >+                        mIntent.setData(Uri.parse(location));
> >+                        mLastUri = mLastTitle = location;
> 
> #1: If Uri.parse fails for some reason, mLastUri & mLastTitle would be left
> in their old state. Should we reset them by wrapping this in a try/catch?
It already is wrapped in a try/catch. If parse() throws, setData() won't be called and mLastUri and mLastTitle won't get touched.
 
> r+ with these comments addressed
re-requesting review because the UA String you proposed doesn't work and rebasing this required adding the passedUri variable
Attached patch patch (obsolete) — Splinter Review
Attachment #586310 - Attachment is obsolete: true
Attachment #587253 - Flags: review?(mark.finkle)
(In reply to Brad Lassey [:blassey] from comment #32)
> (In reply to Mark Finkle (:mfinkle) from comment #30)
> > 
> > >+                    String location = connection.getHeaderField("Location");
> > >+                    if (location != null) {
> > >+                        mIntent.setData(Uri.parse(location));
> > >+                        mLastUri = mLastTitle = location;
> > 
> > #1: If Uri.parse fails for some reason, mLastUri & mLastTitle would be left
> > in their old state. Should we reset them by wrapping this in a try/catch?
> It already is wrapped in a try/catch. If parse() throws, setData() won't be
> called and mLastUri and mLastTitle won't get touched.

Even with the try/catch, the documentation says that URI functions may return garbage. However parse() does expect "an RFC 2396-compliant, encoded URI"

"In the interest of performance, this class performs little to no validation. Behavior is undefined for invalid input. This class is very forgiving--in the face of invalid input, it will return garbage rather than throw an exception unless otherwise specified." [1]

[1] http://developer.android.com/reference/android/net/Uri.html
(In reply to David Chan [:dchan] from comment #34)
> (In reply to Brad Lassey [:blassey] from comment #32)
> > (In reply to Mark Finkle (:mfinkle) from comment #30)
> > > 
> > > >+                    String location = connection.getHeaderField("Location");
> > > >+                    if (location != null) {
> > > >+                        mIntent.setData(Uri.parse(location));
> > > >+                        mLastUri = mLastTitle = location;
> > > 
> > > #1: If Uri.parse fails for some reason, mLastUri & mLastTitle would be left
> > > in their old state. Should we reset them by wrapping this in a try/catch?
> > It already is wrapped in a try/catch. If parse() throws, setData() won't be
> > called and mLastUri and mLastTitle won't get touched.
> 
> Even with the try/catch, the documentation says that URI functions may
> return garbage. However parse() does expect "an RFC 2396-compliant, encoded
> URI"
> 
> "In the interest of performance, this class performs little to no
> validation. Behavior is undefined for invalid input. This class is very
> forgiving--in the face of invalid input, it will return garbage rather than
> throw an exception unless otherwise specified." [1]
> 
> [1] http://developer.android.com/reference/android/net/Uri.html

If it doesn't throw then we can trust that Uri.parse(spec).toString.equals(spec) to be true, which is all we care about.
(In reply to Mark Finkle (:mfinkle) from comment #29)
>
> > The response header contains a location such as about:home
> > Same question about trust decisions
> 
> I think we should blacklist about: or chrome: redirects

could we consider a whitelist of http: and https: ? we don't try to do the redirect check for non http: and https: source URL's and i think this minimizes the risk here.
I spoke with dougt about this.

It is *very* unfortunate that Necko (as part of libxul) takes too long to start up, to the point that we have to duplicate this logic in the Java code. If this is a big win, we should do it, but we should really fix the underlying problem--Necko (libxul) taking too long to start up--as part of the long-term solution to this.

Here are some security and correctness constraints:

* We should not fetch a http://example.org URL without checking whether example.org is a HSTS site. The problem is that the HSTS service is in Gecko and it seems quite impractical to duplicate it correctly in Java. So, AFAICT, the only way to verify that the site isn't HSTS is to hard-code a list of non-HSTS sites into the feature.

* In general, the response you get from a server (even which redirect or whether to redirect) is dependent on the information in the request, especially cookies and/or HTTP auth data. Since we don't have access to cookies (since they are in Necko, in libxul), we should make sure that we do not include ANY cookies in the request and that we do not set any cookies from the responses. And, again, this means that we must hard-code a list of sites that give the correct redirects without any cookies present. We should investigate how the implementation of the Java APIs handles cookies.

* Do we really need to do this for https:// links? If so, then we have to resolve the issue of the mismatch in CA trust between the native Android stack and our SSL stack, or more generally the differences in security properties between the Android stack and our stack. It is a LOT of work to try to analyze all of these differences and I don't have time to do so any time soon. So, at least for now, we must restrict it to non-HTTPS sites. If we think it is important to support HTTPS sites for this, then I suggest we do so in a follow-up bug so that we don't block the more common cases on that separate and unscheduled security analysis.

* It seems impractical to duplicate any logic regarding the parsing or processing of response bodies or non-redirect responses. So, we should make sure we throw away the response body for any response, and the entire response for any non-redirect response.

* We should be careful to avoid redirect loops and very long or infinite redirect chains. I think that if we have a hard-coded list of URL shorteners as suggested above, we can avoid redirect loops. But, if we tried to open this up, redirect loops are much more likely. Are these things handled internally in the implementation of the Java APIs? If so, are they handled correctly? If not, are we handling them correctly?

* Even with a hardcoded list, captive portals are going to cause us many of the troubles above--especially with potential redirect loops. Note that Necko doesn't handle captive portals correctly either.

* We should make sure that we are handling the case where Firefox (including libxul) is already loaded when the intent is activated. If Necko (libxul) is already loaded, I bet it would be better to NOT do this prefecthing, and instead send the request directly to Necko. I believe that if Necko is loaded, Necko doing the fetching will be faster because it maintains its own DNS cache, other caches, connection pools, etc. that are not shared with the Java HTTP implementation.

* My understanding is that this prefetching happens ONLY when we are opening a URL that was given to us from an external application, that we intend to load unconditionally anyway. In that case, I don't think there are any significant privacy implications of the feature, unless the Android Java HTTP API implementation has privacy leaks in its implementation. The exception might be that we will do this HTTP request unconditionally without checking with the Necko HTTP (disk) cache first; if the cache already had the redirect cached then we would have avoided making the request at all.

* Please indicate in a comment in the source code that this code has non-obvious security, privacy, and correctness factors that should be addressed. (Referencing this bug directly in the comment might be a good idea here, even though it is bad form normally.) This way, people who extend the feature in the future are aware of these non-obvious considerations and they will be able to extend the feature properly.
It seems very hard to get this approach right. If there's any way to get necko up and running earlier that would be preferable.  That said, I'm not sure easy that would be--necko doesn't rely on most things (content, layout, etc), but it does listen to XPCOM startup events, uses the pref service, etc.  Perhaps we should start a conversation in dev.tech.networking about what that might involve and if it's feasible.
Attached patch patchSplinter Review
based on brian's feedback, this restricts to http and a white list.

We should generate a more thoroughly thought out whitelist than this though.
Attachment #587253 - Attachment is obsolete: true
Attachment #587253 - Flags: review?(mark.finkle)
Attachment #589326 - Flags: review?(mark.finkle)
Comment on attachment 589326 [details] [diff] [review]
patch

>+    public String getUAStringForHost(String host) {
>+        if ("t.co".equals(host))
>+            return "Mozilla/5.0 (Android; Linux armv7l; rv:@MOZ_APP_VERSION@) Gecko/@UA_BUILDID@ Firefox Mobile/@MOZ_APP_VERSION@";
>+        return getDefaultUAString();
>+    }

Add a quick comment here that t.co will try a client-side redirect without the "right" UA

>+    private final String kPrefetchWhiteListArray[] = new String[] { "t.co",
>+                                                       "bit.ly",

nit: can you put "t.co" on it's own line too?
Attachment #589326 - Flags: review?(mark.finkle) → review+
https://hg.mozilla.org/mozilla-central/rev/952ff8dadb81
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 12
Comment on attachment 589326 [details] [diff] [review]
patch

[Approval Request Comment]
Regression caused by (bug #): 
User impact if declined: 
It will take a up to a second longer to load a link from the twitter app
Testing completed (on m-c, etc.): 
just landed on m-c
Risk to taking this patch (and alternatives if risky):
This change carries with it a fair amount of risk. We should let it ride on m-c for a few days before up lifting. The alternative is to not take it.
Attachment #589326 - Flags: approval-mozilla-aurora?
(In reply to Mark Finkle (:mfinkle) from comment #30)
>
> We should also
> take the time to actually test any speed improvments. If we can't measure
> any speedup, we should remove the code. It adds some complexity to the
> startup and if it adds no measureable value, it should be removed.
> 

is there a followup bug for the performance testing of this feature ?
Comment on attachment 589326 [details] [diff] [review]
patch

[Triage Comment]
Mobile only - approved for Aurora.
Attachment #589326 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: