Closed Bug 1048131 Opened 5 years ago Closed 4 years ago

Implement CaptivePortalService using nsICaptivePortalDetector

Categories

(Core :: Networking, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla42
Tracking Status
firefox42 --- fixed

People

(Reporter: valentin, Assigned: valentin)

References

(Blocks 1 open bug)

Details

Attachments

(4 files, 11 obsolete files)

2.68 KB, patch
Details | Diff | Splinter Review
13.41 KB, patch
Details | Diff | Splinter Review
20.57 KB, patch
mcmanus
: review+
Details | Diff | Splinter Review
7.32 KB, patch
mcmanus
: review+
Details | Diff | Splinter Review
Attached patch captive-portal.patch (obsolete) — Splinter Review
Implement a captive portal detection mechanism for firefox desktop.

captivedetect.js [1] is used on b2g in the network manager [2] to detect captive portals and open the login page.
We need a similar mechanism for Firefox Desktop (and other platforms) that performs a similar function.
I have started a wiki page aggregating related bugs [3].

captivedetect.js implements nsICaptivePortalDetect and makes an xmlhttprequest to http://detectportal.firefox.com/success.txt and checks that the result is what we expect (success\n)

The attached patch represents a WIP for a detection service.
It's instantiated by nsIOService, and makes a request every 60 seconds to check for a captive portal. It has a 'isCaptive' attribute, which reflects that status. 

[1] http://dxr.mozilla.org/mozilla-central/source/toolkit/components/captivedetect/captivedetect.js
[2] http://dxr.mozilla.org/mozilla-central/source/dom/system/gonk/NetworkManager.js#1238
[3] https://wiki.mozilla.org/Necko/CaptivePortal
Attachment #8466903 - Flags: feedback?(mcmanus)
Comment on attachment 8466903 [details] [diff] [review]
captive-portal.patch

Review of attachment 8466903 [details] [diff] [review]:
-----------------------------------------------------------------

I just glanced through the code real quick but a few things come to mind.. forgive me if my comments are more about captive portals in general than this sub-bug - but I've tried to keep them on target.

1] there is a privacy/tracking complication here. needs kill switch.

2] windows already has a captive portal service of some type as part of the OS. (Well, I don't know about XP) Let's use that where possible to mitigate the tracking angle

3] when that's not available, I would prefer to use something in the same domain as the daily ping - again tracking.

4] Since this doesn't do the actual channel manipulation, I'm not sure if it is actually doing the network activity every N seconds - but polling isn't going to work well here. just too expensive. We need to link this to daniel's network changed events and maybe other kinds of failures.

::: b2g/app/b2g.js
@@ -866,5 @@
>  pref("toolkit.storage.pageSize", 2048);
>  #endif
>  
> -// Enable captive portal detection.
> -pref("captivedetect.canonicalURL", "http://detectportal.firefox.com/success.txt");

I would probably actually cut and paste this into each application pref - so they can use different ones

::: netwerk/base/src/CaptivePortalService.cpp
@@ +11,5 @@
> +static const char kOpenCaptivePortalLoginEvent[] = "captive-portal-login";
> +static const char kAbortCaptivePortalLoginEvent[] = "captive-portal-login-abort";
> +static const char kCaptivePortalLoginSuccessEvent[] = "captive-portal-login-success";
> +
> +static const uint32_t defaultInterval = 60*1000; // check every 60 seconds

this isn't used?

@@ +24,5 @@
> +CaptivePortalService::CaptivePortalService()
> +  : mLoginRequired(false)
> +  , mInitialized(false)
> +  , mRequestInProgress(false)
> +  , mInterval(10*1000)

where does that number come from? at least preffable
Attachment #8466903 - Flags: feedback?(mcmanus)
(In reply to Patrick McManus [:mcmanus] from comment #1)
> Comment on attachment 8466903 [details] [diff] [review]
> captive-portal.patch
> 
> Review of attachment 8466903 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I just glanced through the code real quick but a few things come to mind..
> forgive me if my comments are more about captive portals in general than
> this sub-bug - but I've tried to keep them on target.
> 
> 1] there is a privacy/tracking complication here. needs kill switch.
> 
> 2] windows already has a captive portal service of some type as part of the
> OS. (Well, I don't know about XP) Let's use that where possible to mitigate
> the tracking angle

We could definitely use that, but as some comments in bug 562917 say, it might have trouble detecting when a captive portal's time expires.

> 
> 3] when that's not available, I would prefer to use something in the same
> domain as the daily ping - again tracking.
> 
> 4] Since this doesn't do the actual channel manipulation, I'm not sure if it
> is actually doing the network activity every N seconds - but polling isn't
> going to work well here. just too expensive. We need to link this to
> daniel's network changed events and maybe other kinds of failures.

Indeed, on FirefoxOS they seem to check for a captive portal each time an interface changes. It would be reasonable to do the same on desktop, when Daniel's patch lands, but I'd also like to provide some kind of polling mechanism in case such a use case might arise.

> 
> ::: b2g/app/b2g.js
> @@ -866,5 @@
> >  pref("toolkit.storage.pageSize", 2048);
> >  #endif
> >  
> > -// Enable captive portal detection.
> > -pref("captivedetect.canonicalURL", "http://detectportal.firefox.com/success.txt");
> 
> I would probably actually cut and paste this into each application pref - so
> they can use different ones
> 
> ::: netwerk/base/src/CaptivePortalService.cpp
> @@ +11,5 @@
> > +static const char kOpenCaptivePortalLoginEvent[] = "captive-portal-login";
> > +static const char kAbortCaptivePortalLoginEvent[] = "captive-portal-login-abort";
> > +static const char kCaptivePortalLoginSuccessEvent[] = "captive-portal-login-success";
> > +
> > +static const uint32_t defaultInterval = 60*1000; // check every 60 seconds
> 
> this isn't used?
> 
> @@ +24,5 @@
> > +CaptivePortalService::CaptivePortalService()
> > +  : mLoginRequired(false)
> > +  , mInitialized(false)
> > +  , mRequestInProgress(false)
> > +  , mInterval(10*1000)
> 
> where does that number come from? at least preffable

it used to be defaultInterval, but I changed it for development purposes :)

I've also talked to Jason about this patch, and I'll also try to provide a way to check the status without the constant polling. I suspect this service is going to be quite simple at first, but then we're going to add several heuristics, such as detecting redirects to the local network.
(In reply to Patrick McManus [:mcmanus] from comment #1)
> 2] windows already has a captive portal service of some type as part of the
> OS. (Well, I don't know about XP) Let's use that where possible to mitigate
> the tracking angle

Bug 604975 is related.

> 4] Since this doesn't do the actual channel manipulation, I'm not sure if it
> is actually doing the network activity every N seconds - but polling isn't
> going to work well here. just too expensive. We need to link this to
> daniel's network changed events and maybe other kinds of failures.

See also bug 562917 comment 52 about times we could detect captive portals without polling (e.g. HTTP 511, certificate errors, startup, Windows detection).
Attachment #8466903 - Attachment is obsolete: true
Would it be feasable to run and do the detectportal magic over HTTPS ?

It could be a way to avoid a possible problem when the captive portal provides its own DNS responses that redirect requests to a private server that in fact _can_ pretend to be "detectportal.firefox.com" (and trick our code) as long as its HTTP only. Using HTTPS would be way to detect DNS "spoofs" and making sure only the "real" server is what says "success".
From what I can see, detectportal.firefox.com doesn't have a valid certificate at the moment.
However, I think we could potentially switch to using the HTTPS version, with little to no effort. It's just a pref after all.
there are also environments that capture http but just leave https alone..
Attachment #8500415 - Attachment is obsolete: true
Comment on attachment 8504309 [details] [diff] [review]
Make some network events trigger a captive portal recheck

Review of attachment 8504309 [details] [diff] [review]:
-----------------------------------------------------------------

so I like this patch a lot.. but even with it in place it seems like we're running the check every N (60) seconds? That's just not going to work..

::: netwerk/base/src/nsIOService.cpp
@@ +338,5 @@
> +    nsCOMPtr<nsIURI> uri;
> +    newChan->GetURI(getter_AddRefs(uri));
> +    nsCString host;
> +    uri->GetHost(host);
> +    if (StringBeginsWith(host, NS_LITERAL_CSTRING("192.168"))) {

I think there are dns functions to do this for you more comprehensively..
Attachment #8504309 - Flags: feedback?(mcmanus) → feedback+
we still have the pretty serious tracking problem.. that why I strongly wish we used the windows OS service at least on that platform. we've got some handwaving about concerns there, but isn't it a much better answer for those users (given they are already doing that ping)
Attachment #8504308 - Flags: feedback?(mcmanus)
(In reply to Patrick McManus [:mcmanus] from comment #10)
> Comment on attachment 8504309 [details] [diff] [review]
> Make some network events trigger a captive portal recheck
> 
> Review of attachment 8504309 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> so I like this patch a lot.. but even with it in place it seems like we're
> running the check every N (60) seconds? That's just not going to work..

How about we increase the delay after every iteration?
So, if no other network request is issued, we check every minute for 5 minutes, then after 20 min, then every hour?
The reasoning is that when one is using the browser, he'd want to know right away if he's in a captive portal, instead of being redirected when visiting a page. If no activity is performed, this becomes less important, and it's likely the hourly checks would notice the eventual captive portal.

> 
> ::: netwerk/base/src/nsIOService.cpp
> > +    if (StringBeginsWith(host, NS_LITERAL_CSTRING("192.168"))) {
> 
> I think there are dns functions to do this for you more comprehensively..

Thanks, I'll try using those. It does seem better that manually checking for every type of private net :)

(In reply to Patrick McManus [:mcmanus] from comment #11)
> we still have the pretty serious tracking problem.. that why I strongly wish
> we used the windows OS service at least on that platform. we've got some
> handwaving about concerns there, but isn't it a much better answer for those
> users (given they are already doing that ping)

Windows Vista+ has an API to query for captive portals, but it might require a partial rewrite of nsNotifyAddrListener.cpp using the new API [1]. There is also an API available to Windows8+, that seems quite sane [2], but this might only be available to apps.

Daniel also suggested a rewrite of nsNotifyAddrListener using the new API in Bug 939318 comment 11, which might be worth it, but it's hard to estimate how difficult it would be. I'll do some experimenting with it, and post my findings.

[1] http://msdn.microsoft.com/en-us/library/ms697388%28VS.85%29.aspx
[2] http://msdn.microsoft.com/en-us/library/windows/Apps/windows.networking.connectivity.connectionprofile.getnetworkconnectivitylevel
Other than using OS APIs, the idea Devdatta and Brian Smith discussed was to use nsICaptivePortalDetector when we detect a certificate mismatch for an HTTPS request (bug 816866). That provides much faster detection that polling on average if the user/browser is fetching HTTPS sites. If the user is visiting an HTTP site, the captive portal detection is much less important for the user as they will be redirected to the portal login page in their tab automatically.

I think we should focus on the certificate case first before pursuing OS APIs (if we have a use case for detecting captive portals when there are no HTTPS requests) since it seems like it would have a higher cost:benefit ratio. As Brian points out in bug 816866 comment 8, OS detection is best for when new connections happen, but aren't so good when a portal recaptures a user e.g. after a trial period or a session expiration so we may want to implement something like bug 816866 anyways to get better detection.

Of course, if HTTP 511 has taken off then that is the most straightforward way to detect portals (see bug 728658) IMO.
Thanks Matthew. Certificate errors are also part of the queues that would trigger the CaptivePortalService, which I mean to get into the final version of the patch.
> (In reply to Patrick McManus [:mcmanus] from comment #11)
> > we still have the pretty serious tracking problem.. that why I strongly wish
> > we used the windows OS service at least on that platform. we've got some
> > handwaving about concerns there, but isn't it a much better answer for those
> > users (given they are already doing that ping)
> 

I was surprized to discover that the Tomato WRT captive portal isn't properly identified on windows 8.1.
From what I see, it allows DNS queries, and intercepts HTTP requests to which it responds with a 302 redirect to a local ip address. HTTPS requests just time out.
For some reason, Win8.1 doesn't show the regular captive portal pop-up, nor does the NLM API have the NLM_INTERNET_CONNECTIVITY_WEBHIJACK flag set. My FirefoxOS phone had no problem in detecting the captive portal.
I should probably test with Win7 as well.

Given these issues, I propose the following plan:
1. Finish up the patch and land the captive portal service preffed off.
2. Work on a UI for it while improving detection.
3. Start work on integrating the NLM API - my implementation is fairly buggy (I'm new to working with windows COM objects), and it doesn't get the proper INetworkEvents. If we have that, we can feed whatever connectivity info we get into the captive portal, so the detection is better than each one method.

Side note:
Some captive portal implementers try to avoid detection by the OS, to prevent the pop-up from opening in a browser that doesn't support JS, cookies, etc. Having our own implementation would ensure the captive portal is detected, and that the pop-up is properly displayed.
http://blog.tanaza.com/blog/bid/318805/iOS-7-and-captive-portal-a-guide-to-captive-portal-requirements
valentin - I've got no doubt that you can do a better job that msft :)

But can you do enough of a better job to make up the privacy difference?

Or can we somehow combine something like blocklist checker domain and this work so we aren't adding any new information to the mix.. can you reach out to mark mayo's team in services for ideas on what it would mean to host this?
In general I'm more of a fan of being explicit about what one of our network services is for (vs, say, overloading the original intent of a given service like the blocklist checker).

If we run it ourselves, we can define the policy (and trigger audits) to ensure that we're not retaining any info beyond what is needed to run the specific service (which could be "nothing"). Obviously, we can't offer the same sort of guarantees for someone elses service, but I do like not having to run things.  :)

:mcmanus, any specific idea on what the endpoint would need to do in order to maintain a level of privacy we're comfortable with?
(In reply to Mark Mayo [:mmayo] from comment #17)
> In general I'm more of a fan of being explicit about what one of our network
> services is for (vs, say, overloading the original intent of a given service
> like the blocklist checker).

agreed.. I think the suggestion was meant to be to overload the domain rather than the service - as the blocklist domain is already being given the this-ip-runs-firefox information, so we're not adding a lot more information. certainly other domains might fit too.

otoh we are adding significant granularity vs something like the blocklist - by its very nature the captive portal service needs to check quite often (at least in comparison) basically to download a static file with well known contents.

And our windows users are already doing this as part of the OS (and windows makes the results available to applications) so it makes sense to me to try and use it when available.. we also get consistency between os notifications and firefox state that way which is good.

> 
> If we run it ourselves, we can define the policy (and trigger audits) to
> ensure that we're not retaining any info beyond what is needed to run the
> specific service (which could be "nothing"). Obviously, we can't offer the
> same sort of guarantees for someone elses service, but I do like not having
> to run things.  :)
> 
> :mcmanus, any specific idea on what the endpoint would need to do in order
> to maintain a level of privacy we're comfortable with?

it should retain absolutely nothing - its only purpose in life is to offer up something deterministic that the client can confirm reachability to. conceivably http and https versions of same.
Attachment #8607794 - Attachment is obsolete: true
Attachment #8607794 - Flags: review?(mcmanus)
Attachment #8504308 - Attachment is obsolete: true
Attachment #8504309 - Attachment is obsolete: true
Comment on attachment 8607800 [details] [diff] [review]
Add support for 511 Response code (captive portal)

Review of attachment 8607800 [details] [diff] [review]:
-----------------------------------------------------------------

in general there is no standards space support for this status code - so lets not do anything here. We'll track what warren et al are trying to get started in this summer's ietf.

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +1500,5 @@
>          }
>          else
>              mAuthRetryPending = true; // see DoAuthRetry
>          break;
> +    case 511: // Network Authentication Required (RFC6585)

this looks exactly like the default handling, right? why the extra code?
Attachment #8607800 - Flags: review?(mcmanus) → review-
Comment on attachment 8607803 [details] [diff] [review]
Make some network events trigger a captive portal recheck

Review of attachment 8607803 [details] [diff] [review]:
-----------------------------------------------------------------

what about cert mismatches ala bug 816866?

::: netwerk/base/nsIOService.cpp
@@ +361,5 @@
> +    if (NS_FAILED(rv)) {
> +        return rv;
> +    }
> +
> +    nsCString host;

autocstring

@@ +1219,5 @@
> +
> +    if (!pref || strcmp(pref, NETWORK_CAPTIVE_PORTAL_PREF) == 0) {
> +        static int disabledForTest = -1;
> +        if (disabledForTest == -1) {
> +            char *s = getenv("MOZ_DISABLE_NONLOCAL_CONNECTIONS");

I think this is fine, but we should make sure the test harness configs have the captive portal pref explicitly set to off as well (I haven't looked yet).

@@ +1221,5 @@
> +        static int disabledForTest = -1;
> +        if (disabledForTest == -1) {
> +            char *s = getenv("MOZ_DISABLE_NONLOCAL_CONNECTIONS");
> +            if (s) {
> +                disabledForTest = !!strncmp(s, "0", 1);

!! is bool notation, but you're assigning things to ints and comparing specific values (not just bools). It assigning to +1 or -1 or whatever using ? : notation would be more clear if you're looking for a tristate.

@@ +1388,5 @@
>  
>          SetOffline(true);
>  
> +        if (mCaptivePortalService) {
> +            mCaptivePortalService->Stop();

service = nullptr?

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +1506,5 @@
>          rv = ProcessNormal();
>          // This counts as a redirect. Don't store the response body.
>          MaybeInvalidateCacheEntryForSubsequentGet();
> +        {
> +            nsCOMPtr<nsICaptivePortalService> cps =

ah, now I see some new code for 511 :)

Still, let's not do this quite yet without some other meaningful market agreement.
Attachment #8607803 - Flags: review?(mcmanus) → review+
Comment on attachment 8607799 [details] [diff] [review]
Implement CaptivePortalService using nsICaptivePortalDetector

Review of attachment 8607799 [details] [diff] [review]:
-----------------------------------------------------------------

I think we need to track captive portal state for both http and https.

mDelay never seems to be set to 0.. I thought when we were confirmed to not be behind a portal we were going to turn polling off, and just retrigger it on network change, or a failed cert, etc..

::: modules/libpref/init/all.js
@@ +4486,5 @@
>  // nsMemoryInfoDumper can watch a fifo in the temp directory and take various
>  // actions when the fifo is written to.  Disable this in general.
>  pref("memory_info_dumper.watch_fifo.enabled", false);
>  
>  #ifdef MOZ_CAPTIVEDETECT

I would like to get rid of this compiler directive.

@@ +4489,5 @@
>  
>  #ifdef MOZ_CAPTIVEDETECT
> +// If minInterval is 0, the check will only happen
> +// when the service has a strong suspicion we are in a captive portal
> +// such as a 511 response, or a redirect to a local address.

511 :(

@@ +4496,5 @@
> +pref("network.captive-portal-service.maxInterval", 1500000); // 25 minutes
> +// Every 10 checks, the delay is increased by a factor of 5
> +pref("network.captive-portal-service.backoffFactor", "5.0");
> +
> +pref("network.captive-portal-service.enabled", true);

lets start this as false so we can land it and enable it separately

::: netwerk/base/CaptivePortalService.cpp
@@ +9,5 @@
> +#include "nsXULAppAPI.h"
> +
> +#define kInterfaceName "captive-portal-inteface"
> +
> +static const char kOpenCaptivePortalLoginEvent[] = "captive-portal-login";

these are observations.. they should be one event (aka "captive-portal" and the data should be "login", "abort", "success", etc..

@@ +169,5 @@
> +  mCaptivePortalDetector = nullptr;
> +  return NS_OK;
> +}
> +
> +// returns true if a captive portal has been detected

rather than getiscaptive, I would like to see a list of states.. such as

"unknown", "checking", "in unlocked portal", "in locked portal"

@@ +261,5 @@
> +NS_IMETHODIMP
> +CaptivePortalService::Prepare()
> +{
> +  LOG(("CaptivePortalService::Prepare\n"));
> +  // XXX: Finish preparation shouldn't be called until dns and routing is available.

this is a concerning comment :)

::: netwerk/base/nsICaptivePortalService.idl
@@ +23,5 @@
> +  readonly attribute boolean isCaptive;
> +
> +  [noscript] void initialize();
> +  /**
> +  * Starting and stopping the captivePortalService should only be done by

perhaps the idl should not offer init, start, or stop? the ioservice can cetainly find the captive portal service without the help of xpcom
Attachment #8607799 - Flags: review?(mcmanus) → review-
(In reply to Patrick McManus [:mcmanus] from comment #25)
> Comment on attachment 8607803 [details] [diff] [review]
> >  
> > +        if (mCaptivePortalService) {
> > +            mCaptivePortalService->Stop();
> 
> service = nullptr?

I was thinking I'd keep the service initialized and the pointer around, even when not in use, similar to the network link service.

(In reply to Patrick McManus [:mcmanus] from comment #26)
> Comment on attachment 8607799 [details] [diff] [review]
> I think we need to track captive portal state for both http and https.

At the moment I don't have a https captive portal to test with. I'll look around.
However, there are 3 things a captive portal might do for https connections:
1. Timeout the connection
2. Redirect it, with an invalid cert
3. Let it through
In all these cases it is still important to detect the portal through the http channel, and get the user logged in ASAP, and as far as I can tell, knowing the state of HTTPS doesn't help very much. I don't think a captive portal would only redirect for https connections, and even if that were the case, it would be more of a MitM than a captive portal :)

I'd leave HTTPS improvements for a followup.

> mDelay never seems to be set to 0.. I thought when we were confirmed to not
> be behind a portal we were going to turn polling off, and just retrigger it
> on network change, or a failed cert, etc..

Setting the pref to 0 would give the behaviour you're mentioning. The captive portal service would start up, and do a check. If the check succeeds, it would only be triggered by calling RecheckCaptivePortal(). If the check fails, it will keep polling every captivedetect.pollingTime until the captive portal goes away. However, after we login into the captive portal that has an expiration time, if the user isn't actively browsing, the service may not detect that we are in a captive portal.

> > +static const char kOpenCaptivePortalLoginEvent[] = "captive-portal-login";
> 
> these are observations.. they should be one event (aka "captive-portal" and
> the data should be "login", "abort", "success", etc..

I filed bug 1170474 for this.

> rather than getiscaptive, I would like to see a list of states.. such as
> 
> "unknown", "checking", "in unlocked portal", "in locked portal"

As checking may overlap with other states, and doesn't really provide more information... I thinking the states would be unknown, not_captive, locked_portal, unlocked_portal.

> > +  // XXX: Finish preparation shouldn't be called until dns and routing is available.
> this is a concerning comment :)

The service shouldn't be active when the network is down anyway, and if DNS or routing isn't available the connections will simply time out.

> > +  [noscript] void initialize();
> > +  /**
> > +  * Starting and stopping the captivePortalService should only be done by
> 
> perhaps the idl should not offer init, start, or stop? the ioservice can
> cetainly find the captive portal service without the help of xpcom

These are indeed not necessary, but code in the IOService looks much better if it goes through XPCOM.
It's your call though. Let me know if I should remove them still.

> what about cert mismatches ala bug 816866?

I'll put up a separate patch for that.
thanks valentin!

> 
> At the moment I don't have a https captive portal to test with. I'll look
> around.
> However, there are 3 things a captive portal might do for https connections:
> 1. Timeout the connection
> 2. Redirect it, with an invalid cert
> 3. Let it through
> In all these cases it is still important to detect the portal through the
> http channel, and get the user logged in ASAP, and as far as I can tell,
> knowing the state of HTTPS doesn't help very much. I don't think a captive
> portal would only redirect for https connections, and even if that were the
> case, it would be more of a MitM than a captive portal :)
> 

conceivably http may work and https may not work. 
1] Perhaps its just due to funny routing (vpns, or proxies)..
2] Perhaps we really are in a captive portal that is trying to hide itself.. they sometimes do that to the well known os x and windows service URIs for captive portal detection. the https version isn't spoofable in the same way.

So I think we should provide that check.

> 
> Setting the pref to 0 would give the behaviour you're mentioning.

I'm a little unclear on what we expect to configure.. I think the expectation is

1] if the service starts up and it all goes ok the first time, further checks should only be event based instead of timer based

2] if the service starts up and detects a portal, even after being logged in, timers should be part of the algorithm.

is that what we're doing? I'm not sure it needs to have more prefs.

> 
> These are indeed not necessary, but code in the IOService looks much better
> if it goes through XPCOM.
> It's your call though. Let me know if I should remove them still.
> 

I would rather not have the footgun in the xpcom. thanks!
Attachment #8607799 - Attachment is obsolete: true
Attachment #8607803 - Attachment is obsolete: true
Attachment #8607797 - Attachment is obsolete: true
Attachment #8607800 - Attachment is obsolete: true
(In reply to Patrick McManus [:mcmanus] from comment #28)
> they sometimes do that to the well known os x and windows service URIs for
> captive portal detection. the https version isn't spoofable in the same way.
> 
> So I think we should provide that check.

I don't have a HTTPS captive portal to test this out at the moment, so I'll leave this for a followup.
We also need to define how the service would react to differences between HTTP and HTTPS checks.
AFAICT the only advantage would be when HTTP succeeds or times out, and HTTPS redirects. In all other cases, the plain-text only check would return the correct result.

> 1] if the service starts up and it all goes ok the first time, further
> checks should only be event based instead of timer based

Very good point. Thanks!
Attachment #8615745 - Flags: review?(mcmanus)
Comment on attachment 8615745 [details] [diff] [review]
Remove MOZ_CAPTIVEDETECT and enable captive portal detector for all products

Review of attachment 8615745 [details] [diff] [review]:
-----------------------------------------------------------------

It looks like we need a pref based kill switch (instead of the COMPILE-TIME-DIRECTIVE) added as part of this.. then we can set the initial default to on for b2g but off for other folks.. then we can enable/disable more easily.
Attachment #8615745 - Flags: review?(mcmanus) → review-
Comment on attachment 8615742 [details] [diff] [review]
Implement CaptivePortalService using nsICaptivePortalDetector

Review of attachment 8615742 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not going to r+ this without the use of an https service or a more convincing argument.

If you really can't do 2 services, then the single service ought to be https. Let's not get screwed with by intermediaries.

::: netwerk/base/CaptivePortalService.cpp
@@ +276,5 @@
> +
> +NS_IMETHODIMP
> +CaptivePortalService::Complete(bool success)
> +{
> +  LOG(("CaptivePortalService::Complete(success=%d)\n", success));

add mstate to this log

::: netwerk/base/nsICaptivePortalService.idl
@@ +16,5 @@
> +/**
> + * Service used for captive portal detection.
> + */
> +[scriptable, uuid(bdbe0555-fc3d-4f7b-9205-c309ceb2d641)]
> +interface nsICaptivePortalService : nsISupports

this is much better.. maybe add an attribute for lastChecked ?
Attachment #8615742 - Flags: review?(mcmanus)
(In reply to Patrick McManus [:mcmanus] from comment #34)
> Comment on attachment 8615745 [details] [diff] [review]
> It looks like we need a pref based kill switch (instead of the
> COMPILE-TIME-DIRECTIVE) added as part of this.. then we can set the initial
> default to on for b2g but off for other folks.. then we can enable/disable
> more easily.

nsICaptivePortalDetector is currently only used in NetworkManager.js. So it's on for b2g, and not used in all other products.

(In reply to Patrick McManus [:mcmanus] from comment #35)
> Comment on attachment 8615742 [details] [diff] [review]
> 
> If you really can't do 2 services, then the single service ought to be
> https. Let's not get screwed with by intermediaries.

Doing it ONLY over https doesn't seem like an option to me. Captive portals either redirect https leading to the invalid cert, or they simply block https connections until you log in over http (this is more common in my experience). So if https times out, our detection service wouldn't really tell us anything.

> I'm not going to r+ this without the use of an https service or a more
> convincing argument.

There are a couple of other reasons I feel checking over https is unnecessary:
* Captive portals don't have the same incentive of blacklisting our URL given that Firefox is a fully fledged browser.
* Scaling our endpoint over https may be more difficult (https://detectportal.firefox.com/success.txt gives you a "This Connection is Untrusted" page)
* All of the detection methods I've looked at are only over HTTP
* Worst case scenario is a false negative - doesn't trigger our detection, but still intercepts https connections - which is basically what we get right now, with no check at all, and technically equivalent to a man-in-the-middle attack. I don't think we should take this a captive portal.

Let me know if I missed any use case/attack which be mitigated by using https. At the moment, I can only see minor improvements from confirming a captive portal over https.

> > +interface nsICaptivePortalService : nsISupports
> this is much better.. maybe add an attribute for lastChecked ?

what should this return?
> 
> (In reply to Patrick McManus [:mcmanus] from comment #35)
> > Comment on attachment 8615742 [details] [diff] [review]
> > 
> > If you really can't do 2 services, then the single service ought to be
> > https. Let's not get screwed with by intermediaries.
> 
> Doing it ONLY over https doesn't seem like an option to me. Captive portals

let's take this from first principles. We are using the concept of "can I download a known piece of data from mozilla" as a stand in for "can I reach the internet". Since this is a known piece of data it is easily spoofed - the only really interesting part of the question is "can I handshake with someone I can verify as mozilla". You can't do that without crypto (logically https). T

here is also the minor point that if we aren't integrity protected someone will screw with the data (just because they can - maybe accidental ad injection!) as well.

> either redirect https leading to the invalid cert, or they simply block
> https connections until you log in over http (this is more common in my
> experience). So if https times out, our detection service wouldn't really
> tell us anything.

I think it tells us we're not connected to the internet, right?

> * Captive portals don't have the same incentive of blacklisting our URL
> given that Firefox is a fully fledged browser.

meh - portals want to control the UI. Our captive UI may or may not leave them with that incentive.

> * Scaling our endpoint over https may be more difficult
> (https://detectportal.firefox.com/success.txt gives you a "This Connection
> is Untrusted" page)

cry me a river :)

> * All of the detection methods I've looked at are only over HTTP

it's not 2003 anymore. Don't trust the network.

> * Worst case scenario is a false negative - doesn't trigger our detection,
> but still intercepts https connections - which is basically what we get
> right now, with no check at all, and technically equivalent to a
> man-in-the-middle attack. I don't think we should take this a captive portal.
> 

actually, what I'm a bit more worried about is where all https is whitelisted but http is CP. That's why I tihnk we need both. This is probably less common that it once was.

> > > +interface nsICaptivePortalService : nsISupports
> > this is much better.. maybe add an attribute for lastChecked ?
> 
> what should this return?

a u64 delta in milliseconds ?
(In reply to Patrick McManus [:mcmanus] from comment #37)
> > 
> > (In reply to Patrick McManus [:mcmanus] from comment #35)
> > > Comment on attachment 8615742 [details] [diff] [review]
> > > 
> > > If you really can't do 2 services, then the single service ought to be
> > > https. Let's not get screwed with by intermediaries.
> > 
> > Doing it ONLY over https doesn't seem like an option to me. Captive portals
> 
> let's take this from first principles. We are using the concept of "can I
> download a known piece of data from mozilla" as a stand in for "can I reach
> the internet". Since this is a known piece of data it is easily spoofed -
> the only really interesting part of the question is "can I handshake with
> someone I can verify as mozilla". You can't do that without crypto
> (logically https). T
> 
> here is also the minor point that if we aren't integrity protected someone
> will screw with the data (just because they can - maybe accidental ad
> injection!) as well.

If http data is messed with, or if we get redirected, we count that as a captive portal.
If we need a https cert validation to fail so we can identify it, that's not a very good captive portal :)

> > either redirect https leading to the invalid cert, or they simply block
> > https connections until you log in over http (this is more common in my
> > experience). So if https times out, our detection service wouldn't really
> > tell us anything.
> 
> I think it tells us we're not connected to the internet, right?

If https times out, I'd imagine we still report a captive portal, right? If so, do we wait (and/or retry) the secure connection before we eventually show the UI? Because if it does take us a long time to confirm it, the user may have already loaded a URL in a new tab, got redirected, and logged into the captive portal.
So if we show the UI after we get confirmation for a HTTP captive portal, the HTTPS check is redundant.

> 
> > * Captive portals don't have the same incentive of blacklisting our URL
> > given that Firefox is a fully fledged browser.
> 
> meh - portals want to control the UI. Our captive UI may or may not leave
> them with that incentive.

True, but if captive portals blacklist our URL, either by timing-out or by letting it through unmodified, HTTPS does not help.

> > * Scaling our endpoint over https may be more difficult
> > (https://detectportal.firefox.com/success.txt gives you a "This Connection
> > is Untrusted" page)
> cry me a river :)
> > * All of the detection methods I've looked at are only over HTTP
> it's not 2003 anymore. Don't trust the network.

Can't argue with these comments. :)

> 
> > * Worst case scenario is a false negative - doesn't trigger our detection,
> > but still intercepts https connections - which is basically what we get
> > right now, with no check at all, and technically equivalent to a
> > man-in-the-middle attack. I don't think we should take this a captive portal.
> > 
> actually, what I'm a bit more worried about is where all https is
> whitelisted but http is CP. That's why I tihnk we need both. This is
> probably less common that it once was.

But I don't see how HTTPS helps here. If it's whitelisted, meaning the HTTPS connection isn't intercepted, this check tells us nothing more.

I'll try to get captivedetect.js to do both checks. But we need to decide exactly what to do with that information, especially when the HTTP and HTTPS results conflict with each other. But for what it's worth, I'm still not convinced we need the HTTPS check. Even though captive portals are untrusted networks, they usually signal that fact over HTTP, and I see little reason to doubt it.
> But we need to decide
> exactly what to do with that information, especially when the HTTP and HTTPS
> results conflict with each other.

So, any thoughts on comment 38 ? Should I still try to do both HTTP and HTTPS, or try and land the HTTP only version.
Flags: needinfo?(mcmanus)
valentin has convinced me that the only signal a CP will provide is whether or not plaintext http:// can transit it un-mutilated. Therefore the only test that is viable is a plaintext http:// check.

let's engage with the IETF work on this to make the situation better.
Flags: needinfo?(mcmanus)
Attachment #8615742 - Attachment is obsolete: true
Comment on attachment 8625891 [details] [diff] [review]
Implement CaptivePortalService using nsICaptivePortalDetector

Review of attachment 8625891 [details] [diff] [review]:
-----------------------------------------------------------------

::: modules/libpref/init/all.js
@@ +4619,5 @@
> +pref("network.captive-portal-service.minInterval", 60000); // 60 seconds
> +pref("network.captive-portal-service.maxInterval", 1500000); // 25 minutes
> +// Every 10 checks, the delay is increased by a factor of 5
> +pref("network.captive-portal-service.backoffFactor", "5.0");
> +pref("network.captive-portal-service.enabled", false);

are there bugs for front ends for firefox and/or fennec? If so this pref should be noted in them.. if not sworkman can probably get them filed.
Attachment #8625891 - Flags: review?(mcmanus) → review+
Re-requesting review as the MOZ_CONFIG build flag is no longer needed. nsICaptivePortalDetector needs to be available on all platforms
Attachment #8627718 - Flags: review?(mcmanus)
Attachment #8615745 - Attachment is obsolete: true
Comment on attachment 8627718 [details] [diff] [review]
Remove MOZ_CAPTIVEDETECT and enable captive portal detector for all products

Review of attachment 8627718 [details] [diff] [review]:
-----------------------------------------------------------------

(11:46:59 AM) mcmanus: how do the prefs in the new patch (captivedetect.*) interact with the ones I r+'d yesterday? network.captive-portal-service.*
(11:47:40 AM) mcmanus: optimally there would be one set, (all in init/all.js) with *.enabled being overridden in b2g
(11:49:28 AM) valentin: mcmanus: the CaptivePortalService uses nsICaptivePortalDetector which uses these prefs
(11:49:37 AM) valentin: so they also need to be available in all.js
(11:49:53 AM) valentin: I could alternatively put them in firefox.js if you prefer
(11:52:28 AM) mcmanus: delete them from b2g.js I think
(11:52:34 AM) mcmanus: we definitely dont want them duplicated
(11:52:57 AM) valentin: mcmanus: ok, sounds good
(11:53:41 AM) mcmanus: and I can't tell for sure because of how the patches are split. but make captivedetect* and network.captive-portal-service* sit next to each other
(11:53:53 AM) mcmanus: and then b2g.js needs a enabled = true, right?
(11:56:42 AM) valentin: mcmanus: b2g doesn't use captive-portal-service yet, (only nsICaptivePortalDetector in NetworkManager.js) but that will be the case when we change their implementation
(11:57:37 AM) mcmanus: ok.. then a comment on the enabled pref to that effect
(11:57:38 AM) mcmanus: thanks
Attachment #8627718 - Flags: review?(mcmanus) → review+
sorry had to back this out for test failures like https://treeherder.mozilla.org/logviewer.html#?job_id=11346678&repo=mozilla-inbound
Flags: needinfo?(valentin.gosu)
I had missed a couple of MOZ_CAPTIVEDETECTs in mobile/android/installer/package-manifest.in
Removed those too, hopefully it should stick now.
Flags: needinfo?(valentin.gosu)
Depends on: 1183332
You need to log in before you can comment on or make changes to this bug.