Closed Bug 989197 Opened 10 years ago Closed 8 years ago

Add SSL error page when captive portal is detected

Categories

(Firefox :: Address Bar, enhancement, P3)

enhancement

Tracking

()

VERIFIED FIXED
Firefox 53
Tracking Status
firefox51 --- wontfix
firefox52 + verified
firefox53 --- verified

People

(Reporter: jboriss, Assigned: nhnt11)

References

(Blocks 1 open bug)

Details

(Whiteboard: [fxprivacy])

Attachments

(8 files, 3 obsolete files)

210.70 KB, image/png
Details
252.66 KB, image/png
Details
58 bytes, text/x-review-board-request
Gijs
: review+
Details
58 bytes, text/x-review-board-request
Gijs
: review+
Details
58 bytes, text/x-review-board-request
Gijs
: review+
Details
58 bytes, text/x-review-board-request
valentin
: review+
Details
58 bytes, text/x-review-board-request
Gijs
: review+
Details
35.69 KB, patch
Details | Diff | Splinter Review
Note: Implementation followup from design Bug 878566
Spec: https://github.com/vtsatskin/FX-Captive-Portals-Design/blob/master/design/spec.md

If a captive portal is detected and a page with an SSL connection is being loaded and appears to be tampered with, an alternative warning page will be displayed. This error page would behave as the existing about:certerror page.

Pressing "open login page" on this warning would open the captive portal in a new tab. Similar to the status bar button, if the captive portal is already open, switch to the tab.
(In reply to Jennifer Morrow [:Boriss] (Firefox UX) from comment #0)
> If a captive portal is detected and a page with an SSL connection is being
> loaded and appears to be tampered with, an alternative warning page will be
> displayed.

I think this phrasing may imply the opposite cause-and-effect. I would word it "if we are about to show the cert error page, then do captive portal detection. If captive portal detection indicates a captive portal, then show the captive portal error page instead of the normal about:certerror page."

In particular:

1. SSL cert errors are the highest-signal indicator we have, during regular browsing, that a captive portal is in effect.
2. We don't generally care how long it takes to show the about:certerror page, and we'd like to show the captive portal page instead of the about:certerror page as much as possible, so it is better to defer showing about:certerror until we've run captive portal detection *every time* we'd show about:certerror.

To be clear, I don't think I'm saying something substantially different than what Jennifer is saying, but I do think that the way I'm wording hints to a much different (and better) implementation.
Depends on: 878566
(In reply to Brian Smith (:briansmith, was :bsmith; NEEDINFO? for response) from comment #1)
> (In reply to Jennifer Morrow [:Boriss] (Firefox UX) from comment #0)
> 1. SSL cert errors are the highest-signal indicator we have, during regular
> browsing, that a captive portal is in effect.
> 2. We don't generally care how long it takes to show the about:certerror
> page, and we'd like to show the captive portal page instead of the
> about:certerror page as much as possible, so it is better to defer showing
> about:certerror until we've run captive portal detection *every time* we'd
> show about:certerror.

Agreed!
Flags: firefox-backlog?
Flags: firefox-backlog? → firefox-backlog+
No longer blocks: fxdesktopbacklog
Whiteboard: p=0
Whiteboard: p=0 → p=3
This needs corrected.

However it's difficult for the browser to tell if it's behind a captive portal or an invalid cert, unless it completes a DNS check first.

Firefox could phone home before it loads the cert page and if it cant reach something on the internet, it could add to the cert error page that they might be seeing this page because they are behind a captive portal and need to login by visiting a non-https page.
Or if it detects a captive portal probe from the OS (if supported).

(In reply to Cory from comment #3)
> This needs corrected.
> 
> However it's difficult for the browser to tell if it's behind a captive
> portal or an invalid cert, unless it completes a DNS check first.
> 
> Firefox could phone home before it loads the cert page and if it cant reach
> something on the internet, it could add to the cert error page that they
> might be seeing this page because they are behind a captive portal and need
> to login by visiting a non-https page.
Points: --- → 3
Flags: qe-verify?
Whiteboard: p=3
One suggestion: you might want to consider removing the "I understand the risks" link from the mock.

* If this is a real captive portal, then the user should click on the "login" link. Creating an exception won't actually help the user get to the desired target page.
* If this is actually a MITMA pretending to be a captive portal, well... this error page doesn't present any risks, so users can't reason about that. You don't want to provide an incentive for MITMAs to pretend to be captive portals so that users see a "softer" warning with less security info.

For this reason, the Chrome captive portal SSL error doesn't let people override it.
Well, there is option 3 (admittedly an unlikely one): the browser made a mistake in detecting a captive portal?
If you consider option 3 likely enough to warrant the option to override, then it's probably necessary to include the full risks in the warning text.
Blocks: 1202680
Priority: -- → P3
Whiteboard: [fxprivacy]
Points: 3 → ---
Flags: firefox-backlog+
Priority: P3 → P2
Priority: P2 → P3
I've started thinking about how to implement this. The obvious part: when we encounter a cert error, ask the captive portal service for the current state. If we're behind a portal, show the alternate error page. The not obvious part is the implementation of the error page. I can think of a few approaches:
a) Use about:certerror - just pass different error/desc strings in the URL. This will make things messy for styling though.
b) Add strings/styling to about:neterror, and send the user to this page instead of about:certerror.
c) Use a completely new page, make about:certerror use this page as long as there's a portal, and reset it to aboutCertError.xhtml once the portal is freed. The general idea is to have the frontend show a different page for about:certerror as long as there's a portal (and platform can just not care). Another way to do this might be for windows to listen for a captive portal being detected, and be clever about switching
Flags: needinfo?(gijskruitbosch+bugs)
Flags: needinfo?(MattN+bmo)
Bah, I hit "Save Changes" too soon :( This comment is a continuation of the previous one:

> Another way to do this might be for windows to listen for a
> captive portal being detected, and be clever about switching

Scratch this completely.

We need to identify which cert errors we want to show a different page for. Bug 816866 adds code to trigger captive portal checks for certain cert errors - I think we can target the same ones.

Right now, I'm going to go investigate to see if these cert errors are all about:certerror errors, or if some of them use about:neterror. My hypothesis is the former, but I'll comment further once I'm sure. This information will help decide the best implementation route.
Jeez, I feel silly: I had aboutCertError.xhtml open and was wondering why it was back after bug 1240594 (it's not - turns out the one I had open was from b2g/). This makes things a bit clearer.
I also thought they were merged so was confused… I guess that leaves options (a) and (c). I suspect the page will be similar enough to about:neterror that I think we should try build upon it but I don't think (a) and (c) are the only options. I think we can use about:neterror but then modify the page based on the context of being in a captive portal using onCertErrorDetails like we do for clock skew issues.
Flags: needinfo?(MattN+bmo)
As the mockup is old, I don't know how much we'd still want to change the existing net/certerror page, which I think would probably be the determining factor in deciding between (a) and (c). Like Matt, I'd expect the page to be similar enough to be able to reuse about:net/certerror and alter it as necessary, so that's effectively (a) and/or only using onCertErrorDetails. The work in bug 1303284 could help? You may want to ping jkt about that patch...
Flags: needinfo?(gijskruitbosch+bugs)
Comment on attachment 8808614 [details]
Bug 989197 - Show alternate UI in cert error pages when a captive portal is active.

https://reviewboard.mozilla.org/r/91420/#review91248

::: browser/base/content/aboutNetError.xhtml:99
(Diff revision 1)
>          }
>  
>          buttonEl.disabled = true;
>        }
>  
> +      function openPortal() {

Maybe a better name for this function?

::: browser/base/content/aboutNetError.xhtml:229
(Diff revision 1)
>          var event = new CustomEvent("AboutNetErrorLoad", {bubbles:true});
>          document.getElementById("advancedButton").dispatchEvent(event);
>  
>          addDomainErrorLinks();
>        }
> + 

Whitespace :(

::: browser/base/content/aboutNetError.xhtml:258
(Diff revision 1)
>  
>        function initPage()
>        {
>          var err = getErrorCode();
>          gIsCertError = (err == "nssBadCert");
> +        // Only worry about captive portals if this is a cert error.

Which cert errors though? Need to ask #security

::: browser/locales/en-US/chrome/overrides/netError.dtd:60
(Diff revision 1)
> +<!ENTITY captivePortal.longDesc "
> +<p>This network may require you to login to access the internet.</p>
> +">
> +
> +<!ENTITY openPortalLoginPage.label "Open Login Page">
> +

All of these new strings probably need to be validated by someone with more expertise than I.
Attached image Screenshot (obsolete) —
This is what the error page looks like right now. I need to figure out how best to include "advanced information".
Assignee: nobody → nhnt11
Status: NEW → ASSIGNED
I would go with a different approach for the buttons. Since we are presumably only overriding the regular cert error page when we are reasonably certain that a captive portal is in play, it doesn't make sense to keep Try Again as the default action. We know it won't do any good. Instead I would put Open Login Page on the left as the default action (blue) and keep Advanced on the right with the current functionality. Reloading the page can function as Try Again anyway.
It would be great to have some feedback from UX about this feature.

tl;dr: We are showing a different UI for cert errors if there is a captive portal blocking the network. The error page simply states that the network may require the user to login, and provides a button to open the login page. The advanced button is still available for those who want the cert error details.
Flags: needinfo?(philipp)
Flags: needinfo?(bram)
Forgot to say in the previous comment: I've initiated a special try build that will pretend that there's a captive portal for all cert errors, to make it easy to test. (you can just pick any cert error page from https://badssl.com/ to test out the new UI)

The build will be available here when it's complete: https://archive.mozilla.org/pub/firefox/try-builds/nhnt11@gmail.com-fe19015fd0e99b334db01e2f9aebbaee68dcf391/
Attached image Screenshot v2
(In reply to Panos Astithas [:past] from comment #16)
> I would go with a different approach for the buttons. Since we are
> presumably only overriding the regular cert error page when we are
> reasonably certain that a captive portal is in play, it doesn't make sense
> to keep Try Again as the default action. We know it won't do any good.
> Instead I would put Open Login Page on the left as the default action (blue)
> and keep Advanced on the right with the current functionality. Reloading the
> page can function as Try Again anyway.

I've implemented this in the latest patch, please see the screenshot.
Attachment #8808615 - Attachment is obsolete: true
Comment on attachment 8809537 [details]
Bug 989197 - Strings for alternate UI in cert error pages when a captive portal is active.

https://reviewboard.mozilla.org/r/92092/#review92228
Attachment #8809537 - Flags: review?(MattN+bmo) → review+
Comment on attachment 8808614 [details]
Bug 989197 - Show alternate UI in cert error pages when a captive portal is active.

https://reviewboard.mozilla.org/r/91420/#review92230

I'll be on PTO tomorrow so adding Gijs to help look at this before the merge.
Attachment #8808614 - Flags: review?(gijskruitbosch+bugs)
Pushed by nhnt11@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d728a7fdade2
Strings for alternate UI in cert error pages when a captive portal is active. r=MattN
Comment on attachment 8808614 [details]
Bug 989197 - Show alternate UI in cert error pages when a captive portal is active.

https://reviewboard.mozilla.org/r/91420/#review92294

I'm pretty concerned about this effectively delaying showing the user a useful error and formatting the page correctly based on all the async back and forth with the captive portal state thing, which I presume might hit the network. On slow/intermittent wifi this will basically break error pages, I think, which isn't OK. I would prefer a fallback to the 'real' SSL error after a timeout, and/or more proactive work to ensure we don't end up showing the user a half-baked error page.

::: browser/base/content/aboutNetError.xhtml:175
(Diff revision 4)
>          var cssClass = getCSSClass();
>          if (cssClass == "expertBadCert") {

We don't reuse this, so can just compare the result of the function directly.

::: browser/base/content/aboutNetError.xhtml:209
(Diff revision 4)
> -
> -        var event = new CustomEvent("AboutNetErrorLoad", {bubbles:true});
> -        document.getElementById("advancedButton").dispatchEvent(event);
> -
> -        addDomainErrorLinks();
> +      // active captive portal on the network. If there is one, we will display
> +      // different text. We request the captive portal state via an event, and
> +      // the response is sent through an event as well.
> +      function preInit() {
> +        // Start listening for the response...
> +        window.addEventListener("AboutNetErrorCaptivePortalState", function listen(e) {

If there's something broken about the captive portal state or it takes a long time to gather that data, which is several layers of async, we will display what, exactly? I understand that it's sad if we display the wrong thing first and then swap over, but it's equally sad if we end up waiting indefinitely for this other request and don't tell the user anything. So I'm worried that there's no fallback that calls initPage after e.g. 1-3 seconds pass without more details, nor can I see details about what state the error page will be in when that happens.

::: browser/base/content/aboutNetError.xhtml:407
(Diff revision 4)
> +
> +        setupAdvancedButton(true);
> +
> +        document.getElementById("learnMoreContainer").style.display = "block";
> +
> +        var checkbox = document.getElementById("automaticallyReportInFuture");

Nit: let

::: browser/base/content/aboutNetError.xhtml:426
(Diff revision 4)
> +            // set the checkbox
> +            checkbox.checked = !!options.automatic;
> +          }
> +        }, true, true);
> +
> +        var event = new CustomEvent("AboutNetErrorLoad", {bubbles:true});

nit: let

::: browser/base/content/aboutNetError.xhtml:597
(Diff revision 4)
>          <h1 id="et_sslv3Used">&sslv3Used.title;</h1>
>          <h1 id="et_weakCryptoUsed">&weakCryptoUsed.title;</h1>
>          <h1 id="et_inadequateSecurityError">&inadequateSecurityError.title;</h1>
>        </div>
>        <div id="errorDescriptionsContainer">
> -        <div id="ed_generic">&generic.longDesc;</div>
> +        <div id="ed_captivePortal">&captivePortal.longDesc;</div>

You're still using ed_generic, so it should stay.

::: browser/base/content/aboutNetError.xhtml:662
(Diff revision 4)
>            <button id="prefResetButton" class="primary" autocomplete="off">&prefReset.label;</button>
>          </div>
>  
> -        <div id="certErrorButtonContainer" class="button-container">
> +        <div id="certErrorAndCaptivePortalButtonContainer" class="button-container">
>            <button id="returnButton" class="primary" autocomplete="off" autofocus="true">&returnToPreviousPage.label;</button>
> +          <button id="openPortalLoginPageButton" class="primary" autocomplete="off" autofocus="true" onclick="openPortal()">&openPortalLoginPage.label;</button>

Please don't add new inline event handlers.

::: browser/base/content/browser.js:154
(Diff revision 4)
>      };
>    }
>    return null;
>  });
>  
> +XPCOMUtils.defineLazyServiceGetter(this, "cps",

I would prefer a more descriptive variable name, like "gCaptivePortalService" or whatever. This abbreviation could also refer to the content pref service, to give an example. :-)

::: browser/base/content/browser.js:2717
(Diff revision 4)
>   */
>  var BrowserOnClick = {
>    init: function () {
>      let mm = window.messageManager;
>      mm.addMessageListener("Browser:CertExceptionError", this);
> +    mm.addMessageListener("Browser:RequestCaptivePortalState", this);

Please file a followup to move all the error page handling to its own file.

::: browser/base/content/browser.js:2920
(Diff revision 4)
> +    // Open a new tab with the canonical URL that we use to check for a captive portal.
> +    // It will be redirected to the login page.
> +    // TODO - Open login page directly once we have a way to get the URL. (Bug 1315967)
> +    let tab = gBrowser.addTab(Services.prefs.getCharPref("captivedetect.canonicalURL"));
> +    gBrowser.selectedTab = tab;
> +    // TODO - Close the tab once login is complete.

Can the tab itself take care of this? Otherwise, how would we keep track of this? I'm not sure it's useful to land code with "TODO" comments. I don't think they add value here.

::: browser/themes/shared/aboutNetError.css
(Diff revision 4)
> -body.certerror #netErrorButtonContainer {
> +body:not(.neterror) #netErrorButtonContainer {
>    display: none;
>  }
>  
>  #errorTryAgain {
> -  margin-top: 1.2em;

Why did you remove this margin?
Attachment #8808614 - Flags: review?(gijskruitbosch+bugs) → review-
(In reply to Nihanth Subramanya [:nhnt11] from comment #21)
> It would be great to have some feedback from UX about this feature.

I’ve worked the most on this project, so I would be happy to provide information on behalf of Philipp in case he’s too busy!

Can I confirm that the behaviour is as follows?

1. You connect to a network with captive portal
2. Firefox will try to open the network’s login page automatically. If you’re not currently sitting on the login tab, a grey bar is shown. It has a button to “Show Login Page”.
3. If, instead of opening the login tab, you decide to open a webpage, you should also get redirected to the login page automatically. But in case this redirection fail, then the “Login to Network” page is shown.

If this is true, then my only concern is about the button label and link behaviour. On the “Login to Network” page, the blue button should be labelled “Show Login Page” and it should switch to the correct tab instead of opening the login page in the current tab. This way, the page behaviour is exactly the same as the grey bar.

However, if we can automatically open the login page without going through this interstitial page first, then we should not show this page at all. Just show the login page. It will save our user one click, and he won’t have to figure out the correct button to get to the login page.

What do you think?
Flags: needinfo?(philipp)
Flags: needinfo?(bram)
(In reply to Bram Pitoyo [:bram] from comment #29)
> 2. Firefox will try to open the network’s login page automatically. If
> you’re not currently sitting on the login tab, a grey bar is shown. It has a
> button to “Show Login Page”.

After bug 1315969 lands, the login page will only be opened automatically in one case: if the portal is detected when no browser window has focus. Then, when a browser window is focused, the login page is opened in a new tab *in that window*, and that new tab is selected.

The notification bar is shown in the browser window that is focused at the time of detection of the portal, or in the first window that gets focused in the case I just mentioned above. It's displayed in that window for all tabs. The "Show Login Page" button is visible in all tabs except the captive portal login page tab, iff we opened said tab.

> 3. If, instead of opening the login tab, you decide to open a webpage, you
> should also get redirected to the login page automatically. But in case this
> redirection fail, then the “Login to Network” page is shown.

If you attempt to access a HTTP page, the portal will do redirection and we don't need to do anything.
If you attempt to access a HTTPS page, it will almost certainly fail with a certificate error (because the portal is interfering). In this case, we show the "Login to Network" page.

> If this is true, then my only concern is about the button label and link
> behaviour. On the “Login to Network” page, the blue button should be
> labelled “Show Login Page” and it should switch to the correct tab instead
> of opening the login page in the current tab. This way, the page behaviour
> is exactly the same as the grey bar.

In this patch, the blue button opens the login page in a new tab. At the moment, since cert errors can happen in any window, the behavior of this button is not tied to the notification bar and automatically opened captive portal tab (because those are restricted to one window).

There is a bug on file to make the notification bar functionality not restricted to a single window: bug 1313568.
Until that bug is resolved, I think we can just have the blue button on the error page open a new tab with the login page. I think we can probably select this newly opened tab immediately too, what do you think?

> However, if we can automatically open the login page without going through
> this interstitial page first, then we should not show this page at all. Just
> show the login page. It will save our user one click, and he won’t have to
> figure out the correct button to get to the login page.

This definitely sounds like good UX! However, remembering the originally requested page and redirecting when appropriate may not be easy. What if the captive portal sends the user to a page with important information, like the amount of time remaining in their session? It's not good for us to redirect the user back to the original page in this case. Which means the user has to navigate to the original page again on their own. So in the overall process, clicks are not saved. Another problematic case is when Firefox starts up and a previous session is restored - we might open several tabs that lead to this error, and it'll be difficult for the user to recover the original pages. And by the way, a follow up that is planned soon after this patch is to automatically reload these "Login to Network" pages as soon as we detect that the portal has been freed.
(In reply to Nihanth Subramanya [:nhnt11] from comment #30)
> (In reply to Bram Pitoyo [:bram] from comment #29)

> [...] I think we can probably
> select this newly opened tab immediately too, what do you think?

Uh, this patch already does that, d'oh.
Attachment #8809537 - Attachment is obsolete: true
Comment on attachment 8808614 [details]
Bug 989197 - Show alternate UI in cert error pages when a captive portal is active.

https://reviewboard.mozilla.org/r/91420/#review92294

I don't have a strong opinion on this, but please see https://bugzilla.mozilla.org/show_bug.cgi?id=989197#c1. There, it's suggested that we don't care how long it takes to load the cert error page and it's worth the overhead to wait for an accurate captive portal state to be obtained.

Having said this, this patch doesn't attempt to gather any new data, so the overhead added comes purely from messaging. I've written in more detail about this below in this comment.

> If there's something broken about the captive portal state or it takes a long time to gather that data, which is several layers of async, we will display what, exactly? I understand that it's sad if we display the wrong thing first and then swap over, but it's equally sad if we end up waiting indefinitely for this other request and don't tell the user anything. So I'm worried that there's no fallback that calls initPage after e.g. 1-3 seconds pass without more details, nor can I see details about what state the error page will be in when that happens.

In this patch, the current captive portal state as we know it is used. If we already know that there's a portal at the time aboutNetError.xhtml loads, we show the "Login to Network" UI. If we don't know this, we just display the cert error as usual.

We are not attempting to gather any new data here, so the overhead is only due to the async message passing (which was less than 10ms usually when I dumped it). There should be no indefinite wait, because we aren't touching the network or waiting for any data that might take time to gather.

The code in this patch does not attempt to swap out the UI if a portal is detected after we've already shown the usual cert error UI. I think any such swapping will require its own discussion and should be done in a follow up.
Comment on attachment 8808614 [details]
Bug 989197 - Show alternate UI in cert error pages when a captive portal is active.

https://reviewboard.mozilla.org/r/91420/#review92294

> Please file a followup to move all the error page handling to its own file.

Filed https://bugzilla.mozilla.org/show_bug.cgi?id=1317302
As discussed on IRC, I am concerned about what happens if, for reasons of sync messaging by other consumers or actual 'hanging' of either process, or CPU starvation by other actors, the messages take a long time to arrive and in the meantime we show an "unfinished" error page. Right now error pages get filled in effectively synchronously before the "load" event fires, meaning we're pretty well guaranteed no flash of unstyled content or flash of 'half-filled-in-error-page'. This patch changes that substantially by delaying initialization. I would prefer not to have to do that.

I can think of three ways to fix this, all with their own up/downsides.

1) make the captive portal state code broadcast messages to the content processes about its state, and make the docshells in the content process listen for this information and pass it as a url param like it does other stuff. This guarantees that all this stuff is available to the page immediately.
2) like (1), but instead make tab-content.js or content.js or whatever listen for these messages, and have them pass it to the error page. This also guarantees immediate availability.
3) make the error page pretend to be unloaded until we have all the data by using a "loading" favicon, a fake title ("Loading..."), and hiding its content (ie body { display: none}) until all its data is in.

I would prefer 1, or if necessary 2. We can do 3, but it feels hacky and the code would not be as neat. As a bonus, if done well (1) or (2) wouldn't really be an awful lot more messaging to-and-fro than the current patch. The only big argument I see for doing (3) is if we anticipate that there will be other things for which we'll want async initialization in the near future (I dunno, using places history to Levenshtein-distance-based-correct broken URLs, or something).

I think there are other cases where the content process(es) will want to know about the captive portal state. It should actually probably/possibly affect things like navigator.onLine, or at least be available for add-ons that run frame scripts or other in-content-process stuff.

Matt, Nihanth, does this make sense? How do you feel about this - maybe I'm overthinking this?
Flags: needinfo?(nhnt11)
Flags: needinfo?(MattN+bmo)
Comment on attachment 8808614 [details]
Bug 989197 - Show alternate UI in cert error pages when a captive portal is active.

https://reviewboard.mozilla.org/r/91420/#review92712

Going to clear this while we settle on an approach. For some minor other comments, see below.

FWIW, it might make sense to split off the 'freed' stuff into a separate cset.

::: browser/base/content/browser.js:2984
(Diff revisions 4 - 7)
> +          return;
> +        }
> +        if (!portalHost) {
> +          // The tab is initially loaded with the canonical URL. We treat
> +          // the host of the first location change as the portal host.
> +          portalHost = aBrowser.currentURI.host;

This can throw if currentURI is a hostless URI.

::: browser/base/content/browser.js:2987
(Diff revisions 4 - 7)
> +        if (aBrowser.currentURI.host != portalHost &&
> +            aBrowser.lastURI.host == portalHost) {

Ditto.

::: browser/base/content/browser.js:2993
(Diff revisions 4 - 7)
> +            aBrowser.lastURI.host == portalHost) {
> +          // The browser has navigated away from the portal for the first time.
> +          // We can stop listening for location changes.
> +          gBrowser.removeTabsProgressListener(progressListener);
> +
> +          if (aBrowser.currentURI.spec == canonicalURL) {

I think this should probably be something more like:

```js
let canonicalURI = makeURI(canonicalURL);
if (aBrowser.currentURI.equalsExceptRef(canonicalURI)) {
}
```

To effectively avoid having encoding changes or whatever avoid this matching.

Also, I'd point out that this being inside the other `if` statement means that the hostname of the 'canonical' URL should never be the same as that of the captive portal, or the code won't work right. That might be true for the prefs we're using, but seems like a somewhat odd restrictions more generally, especially if corporate / distro builds might use an alternative URL. Can we come up with a different way of detecting this case, or is this just a necessity? (I haven't thought about it too hard.)
Attachment #8808614 - Flags: review?(gijskruitbosch+bugs)
I'm not sure what the right way is to get the current captive portal state from docshell code running in the content process. I tinkered a bit but not to much success. Valentin, I was wondering if you could help with this. Specifically, we want to add a flag to the URL here if the state is locked: https://dxr.mozilla.org/mozilla-central/rev/47e0584afe0ab0b867412189c610b302b6ba0ea7/docshell/base/nsDocShell.cpp#5379
Flags: needinfo?(nhnt11) → needinfo?(valentin.gosu)
I would suggest something similar to:

nsCOMPtr<nsICaptivePortalService> cps = do_GetService(NS_CAPTIVEPORTAL_CID);
int32_t state;
if (cps && NS_SUCCEEDED(cps->GetState(&state)) &&
    state == nsICaptivePortalService::LOCKED_PORTAL) {
  errorPageUrl.AppendLiteral("&captive=true");
}
Flags: needinfo?(valentin.gosu)
(In reply to Valentin Gosu [:valentin] from comment #40)
> I would suggest something similar to:
> 
> nsCOMPtr<nsICaptivePortalService> cps = do_GetService(NS_CAPTIVEPORTAL_CID);
> int32_t state;
> if (cps && NS_SUCCEEDED(cps->GetState(&state)) &&
>     state == nsICaptivePortalService::LOCKED_PORTAL) {
>   errorPageUrl.AppendLiteral("&captive=true");
> }

I tried something similar but always got an unknown state, probably because the code runs in the content process.
(In reply to Nihanth Subramanya [:nhnt11] from comment #30)
> The notification bar is shown in the browser window that is focused at the
> time of detection of the portal, or in the first window that gets focused in
> the case I just mentioned above. It's displayed in that window for all tabs.
> The "Show Login Page" button is visible in all tabs except the captive
> portal login page tab, iff we opened said tab.

This behaviour sounds good.


> If you attempt to access a HTTPS page, it will almost certainly fail with a
> certificate error (because the portal is interfering). In this case, we show
> the "Login to Network" page.

Makes sense. This means that user will only see this interstitial only when absolutely necessary. And whenever possible, we would still redirect to the login page automatically.


> […] I think we can just have the blue button on the
> error page open a new tab with the login page. I think we can probably
> select this newly opened tab immediately too, what do you think?

Yes. Clicking on the blue button should open a new tab *and* focus on that tab immediately.


> […] remembering the originally
> requested page and redirecting when appropriate may not be easy. What if the
> captive portal sends the user to a page with important information, like the
> amount of time remaining in their session?

I agree with you that attempting to be too smart brings up a lot of tricky problems. Let’s not do this, then.
(In reply to :Gijs Kruitbosch from comment #37)
> The only big argument I see for doing (3) is if we anticipate that there will be
> other things for which we'll want async initialization in the near future (I
> dunno, using places history to Levenshtein-distance-based-correct broken
> URLs, or something).

Something like that is the No More 404s Test Pilot experiment, which is using an infobar at the moment to suggest visiting the page in the Internet Archive. If this experiment proves successful and graduates to m-c, we may want to consider an approach like the one you describe here.
(In reply to :Gijs Kruitbosch (PTO Nov. 17-20 inclusive) from comment #37)
> [...]
> Matt, Nihanth, does this make sense? How do you feel about this - maybe I'm
> overthinking this?

I spoke to Valentin about this, and the result is bug 1317511 - which just landed on inbound. We can now get the captive portal state in the content process, so adding a flag to the cert error url from docshell should be easy.

I'll upload a new patch soon!
Flags: needinfo?(MattN+bmo)
[Tracking Requested - why for this release]: Core piece of captive portal work
Attachment #8808614 - Attachment is obsolete: true
Comment on attachment 8815515 [details]
Bug 989197 - Show alternate UI in cert error pages when a captive portal is active.

https://reviewboard.mozilla.org/r/96412/#review96896

::: docshell/base/nsDocShell.cpp:5376
(Diff revision 3)
> +  int32_t state;
> +  if (cps && NS_SUCCEEDED(cps->GetState(&state)) &&
> +      state == nsICaptivePortalService::LOCKED_PORTAL) {

Nit: `state` seems a bit vague. Maybe `cpsState`?

::: netwerk/base/CaptivePortalService.cpp
(Diff revision 3)
> -  *aState = UNKNOWN;
> -  if (!mInitialized) {
> -    return NS_ERROR_NOT_INITIALIZED;
> -  }

It seems like this is being looser than necessary. It seems like we can also check `XRE_GetProcessType() == GeckoProcessType_Default` to match what bug 1317511 did.
`if (!mInitialized && XRE_GetProcessType() == GeckoProcessType_Default) {`
track captive portal related issue for 52
Comment on attachment 8815515 [details]
Bug 989197 - Show alternate UI in cert error pages when a captive portal is active.

https://reviewboard.mozilla.org/r/96412/#review97304

::: browser/base/content/aboutNetError.xhtml:81
(Diff revision 4)
>          return decodeURIComponent(url.slice(desc + 2));
>        }
>  
> +      function isCaptive() {
> +        let url = document.documentURI;
> +        let matches = url.match(/captive\=([^&]+)\&/);

The regex that matches the "e" flag will also pick up the e at the end of "captive" because it doesn't match the ampersand. This is fragile and should be fixed.
Comment on attachment 8815517 [details]
Bug 989197 - Add tests for alternate cert error UI when behind a captive portal.

https://reviewboard.mozilla.org/r/96416/#review97310

::: browser/base/content/test/general/browser_aboutCertError.js:65
(Diff revision 5)
> +    doc.getElementById("openPortalLoginPageButton").click();
> +  });
> +
> +  let portalTab = yield portalTabPromise;
> +  is(gBrowser.selectedTab, portalTab, "Login page should be open in a new foreground tab.");
> +

I'm specifically not testing if the portal tab is correctly automatically closed when the portal is freed.

I fully intend to implement and have us host a separate page that can be requested for portal tabs. This way, if the portal redirects to the originally requested URL, we can show the user some information indicating that the fact that they are seeing the page means they can now browse freely. This will avoid closing the tab (which is bad UX imo) and also help educate users.
Comment on attachment 8816307 [details]
Bug 989197 - Include captive portal state as a flag in the about:certerror URI.

https://reviewboard.mozilla.org/r/97090/#review97312

::: netwerk/base/CaptivePortalService.cpp
(Diff revision 2)
>  //-----------------------------------------------------------------------------
>  
>  NS_IMETHODIMP
>  CaptivePortalService::GetState(int32_t *aState)
>  {
> -  *aState = UNKNOWN;

MattN suggests we check the process type here so that consumers of GetState on the parent process can know if the CPS isn't initialized yet. Valentin, what do you think?
Comment on attachment 8816307 [details]
Bug 989197 - Include captive portal state as a flag in the about:certerror URI.

https://reviewboard.mozilla.org/r/97090/#review97312

> MattN suggests we check the process type here so that consumers of GetState on the parent process can know if the CPS isn't initialized yet. Valentin, what do you think?

Er, to be clear, "so that consumers of GetState..." is one idea I had of a useful outcome of checking the process type.
Comment on attachment 8816307 [details]
Bug 989197 - Include captive portal state as a flag in the about:certerror URI.

https://reviewboard.mozilla.org/r/97090/#review97450

::: netwerk/base/CaptivePortalService.cpp
(Diff revision 2)
>  //-----------------------------------------------------------------------------
>  
>  NS_IMETHODIMP
>  CaptivePortalService::GetState(int32_t *aState)
>  {
> -  *aState = UNKNOWN;

I think we can remove the check completely.
Attachment #8816307 - Flags: review?(valentin.gosu) → review+
Comment on attachment 8815515 [details]
Bug 989197 - Show alternate UI in cert error pages when a captive portal is active.

https://reviewboard.mozilla.org/r/96412/#review97464

::: browser/base/content/aboutNetError.xhtml:49
(Diff revision 5)
>        // the URI that the user attempted to load.
>  
>        function getErrorCode()
>        {
>          var url = document.documentURI;
> -        var error = url.search(/e\=/);
> +        var error = url.search(/\?e\=/);

Why was this changed? We shouldn't make this order-dependent (ie this won't find the error code if it's in `&e=`, and `=` should be escaped anyway.

Really, all of this should be changed to use URLSearchParams, but I don't see why this was changed at all for this patch.

::: browser/base/content/aboutNetError.xhtml:84
(Diff revision 5)
>          return decodeURIComponent(url.slice(desc + 2));
>        }
>  
> +      function isCaptive() {
> +        let url = document.documentURI;
> +        let matches = url.match(/\&captive\=([^&]+)\&/);

again, don't rely on this being the non-first param.

::: browser/base/content/aboutNetError.xhtml:84
(Diff revision 5)
> +        let matches = url.match(/\&captive\=([^&]+)\&/);
> +        // captive flag is optional, if no match just return false.
> +        if (!matches || matches.length < 2)
> +          return false;
> +
> +        // parenthetical match is the second entry
> +        return decodeURIComponent(matches[1]) == "true";

84-90 can be replaced with just:

```
return /captive=true/.test(url);
```

::: browser/base/content/aboutNetError.xhtml:213
(Diff revision 5)
> +        // Only worry about captive portals if this is a cert error.
> +        let captivePortalActive = isCaptive() && gIsCertError;

Feels like this logic should be inside isCaptive().

::: browser/base/content/aboutNetError.xhtml:220
(Diff revision 5)
> +        if (captivePortalActive) {
> +          errTitle = document.getElementById("et_captivePortal");
> +          errDesc = document.getElementById("ed_captivePortal");
> +        } else {
> +          errTitle = document.getElementById("et_" + err);
> +          errDesc  = document.getElementById("ed_" + err);
> +        }

Instead of this, just reassign "captivePortal" to `err` if captivePortalActive is true, and leave the rest of the code as-is?

::: browser/base/content/aboutNetError.xhtml:376
(Diff revision 5)
> +        for (let host of document.querySelectorAll(".hostname")) {
> +          host.textContent = document.location.hostname;
> +        }

Why are we doing this in the captive portal case? AFAICT in the strings in https://hg.mozilla.org/mozilla-central/rev/d728a7fdade2 we don't use hostnames? The ones in the panel get done in setupAdvancedButton...

::: browser/base/content/aboutNetError.xhtml:388
(Diff revision 5)
> +          document.dispatchEvent(event);
> +        });
> +
> +        setupAdvancedButton(true);
> +
> +        addDomainErrorLinks();

What domain error links need setting up in the captive portal case? I didn't think we showed the original error codes? If this is only about the advanced panel, maybe this should be inside `setupAdvancedButton`?

::: browser/base/content/browser.js:154
(Diff revision 5)
> +XPCOMUtils.defineLazyServiceGetter(this, "gCaptivePortalService",
> +                                   "@mozilla.org/network/captive-portal-service;1",
> +                                   "nsICaptivePortalService");

I don't see this being used at all.

::: browser/base/content/browser.js:2923
(Diff revision 5)
> +    let canonicalURL = Services.prefs.getCharPref("captivedetect.canonicalURL");
> +    let tab = gBrowser.addTab(canonicalURL);
> +    let canonicalURI = makeURI(canonicalURL);

We open a new tab every time the user clicks this? That doesn't sound great... Is there a followup for this, or can we fix it here, maybe by keeping track of whether we've opened a captive portal tab since the last network change event?

::: browser/base/content/browser.js:2933
(Diff revision 5)
> +    // When we are no longer captive, close the tab if it's at the canonical URL.
> +    let tabCloser = () => {
> +      Services.obs.removeObserver(tabCloser, "captive-portal-login-abort");
> +      Services.obs.removeObserver(tabCloser, "captive-portal-login-success");
> +      if (!tab || tab.closing || !tab.parentNode || !tab.linkedBrowser ||
> +          !tab.linkedBrowser.currentURI.equalsExceptRef(canonicalURI)) {

I'm confused. We have some URI (like http://www.mozilla.org/test or whatever) that we use to probe for a captive portal. We open that URI in the full knowledge it will get redirected to the captive portal. Then we compare the URI here - that sounds to me like the comparison will always fail. What am I missing?

::: browser/base/content/content.js:283
(Diff revision 5)
>    get isAboutCertError() {
>      return content.document.documentURI.startsWith("about:certerror");
>    },
>  
>    receiveMessage: function(msg) {
> -    if (!this.isAboutCertError) {
> +    if (!this.isAboutCertError && !this.isAboutNetError) {

Why would we get CertErrorDetails messages on the network error page?

::: browser/themes/shared/aboutNetError.css:43
(Diff revision 5)
>  #learnMoreContainer {
>    display: none;
>  }
>  
> -#certErrorButtonContainer {
> +#certErrorAndCaptivePortalButtonContainer {
>    display: none;

Some of this feels like it should be in a content stylesheet, but we can worry about that in a separate bug.
Attachment #8815515 - Flags: review?(gijskruitbosch+bugs)
Comment on attachment 8815516 [details]
Bug 989197 - Reload error pages showing captive portal UI when the portal is freed.

https://reviewboard.mozilla.org/r/96414/#review97466

This does what it says on the tin.

However, are we sure we don't want to stagger the loads if there are a lot of them...? Maybe that should be a followup bug.
Attachment #8815516 - Flags: review?(gijskruitbosch+bugs) → review+
Comment on attachment 8815517 [details]
Bug 989197 - Add tests for alternate cert error UI when behind a captive portal.

https://reviewboard.mozilla.org/r/96416/#review97468

::: browser/base/content/test/general/browser_aboutCertError.js:18
(Diff revision 5)
>  const ss = Cc["@mozilla.org/browser/sessionstore;1"].getService(Ci.nsISessionStore);
>  
> +const CANONICAL_CONTENT = "success";
> +const CANONICAL_URL = "data:text/plain;charset=utf-8," + CANONICAL_CONTENT;
> +
> +add_task(function* setup() {

Please put this all in a new file, as these tests are already timing out. In fact, please put it in a new directory instead of in general/ . Followup bug to move this test and its friends (abouthome and so on also come to mind).

::: browser/base/content/test/general/browser_aboutCertError.js:29
(Diff revision 5)
> +  info("Waiting for captive portal state to be correct in the content process.");
> +  yield BrowserTestUtils.waitForCondition(function*() {
> +    return ContentTask.spawn(gBrowser.selectedBrowser, null, () => {
> +      let cps = Components.classes["@mozilla.org/network/captive-portal-service;1"]
> +                          .getService(Components.interfaces.nsICaptivePortalService);
> +      return cps.state == cps.LOCKED_PORTAL;
> +    });
> +  });

Isn't there an event that we can listen for instead of using waitForCondition? That would be preferable.

::: browser/base/content/test/general/browser_aboutCertError.js:41
(Diff revision 5)
> +  let errorTab = yield BrowserTestUtils.openNewForegroundTab(gBrowser, () => {
> +    let tab = gBrowser.addTab(UNKNOWN_ISSUER);
> +    gBrowser.selectedTab = tab;
> +    browser = gBrowser.selectedBrowser;
> +    certErrorLoaded = waitForCertErrorLoad(browser);
> +    return tab;
> +  }, false);

What's the advantage of using openNewForegroundTab at all here?

::: browser/base/content/test/general/browser_aboutCertError.js:65
(Diff revision 5)
> +    doc.getElementById("openPortalLoginPageButton").click();
> +  });
> +
> +  let portalTab = yield portalTabPromise;
> +  is(gBrowser.selectedTab, portalTab, "Login page should be open in a new foreground tab.");
> +

IMO we should test the current behaviour. If/when we replace it, we can update the test.

::: browser/base/content/test/general/browser_aboutCertError.js:67
(Diff revision 5)
> +  gBrowser.removeTab(portalTab);
> +  gBrowser.removeTab(errorTab);

Nit: `yield BTU.removeTab(...)`
Attachment #8815517 - Flags: review?(gijskruitbosch+bugs)
(In reply to Wes Kocher (:KWierso) from comment #28)
> https://hg.mozilla.org/mozilla-central/rev/d728a7fdade2
> 
> +<!ENTITY captivePortal.title "Login to network">
> +<!ENTITY captivePortal.longDesc "
> +<p>This network may require you to login to access the internet.</p>
> +">
> +<!ENTITY openPortalLoginPage.label "Open Login Page">

Please note that "login" is a noun, "log in" is the verb. "Login Page" should be fine.
Can this be corrected?

(Lowercase "i" for "internet" seems to be allowed according to the style guide.)

Also see bug 1315969.
Comment on attachment 8815515 [details]
Bug 989197 - Show alternate UI in cert error pages when a captive portal is active.

https://reviewboard.mozilla.org/r/96412/#review96896

> It seems like this is being looser than necessary. It seems like we can also check `XRE_GetProcessType() == GeckoProcessType_Default` to match what bug 1317511 did.
> `if (!mInitialized && XRE_GetProcessType() == GeckoProcessType_Default) {`

Personally I think the UNKNOWN state serves the purpose of communicating to consumers that the state isn't ready for use. Valentin ok'd (on IRC) just removing the check, as this patch does.
Comment on attachment 8815515 [details]
Bug 989197 - Show alternate UI in cert error pages when a captive portal is active.

https://reviewboard.mozilla.org/r/96412/#review97308

::: browser/base/content/aboutNetError.xhtml:38
(Diff revision 5)
>        // custom styling and favicon:
>        //
>        //   moz-neterror:page?e=error&u=url&s=classname&d=desc
> +      //
> +      // Also optionally, to specify that we are behind a captive portal:
> +      //   moz-neterror:page?e=error&u=url&captive=true&d=desc

Not sure if "moz-neterror:page" is relevant anymore, should we change this?
Comment on attachment 8815517 [details]
Bug 989197 - Add tests for alternate cert error UI when behind a captive portal.

https://reviewboard.mozilla.org/r/96416/#review97664

::: browser/base/content/test/general/browser_aboutCertError.js:29
(Diff revision 5)
> +  info("Waiting for captive portal state to be correct in the content process.");
> +  yield BrowserTestUtils.waitForCondition(function*() {
> +    return ContentTask.spawn(gBrowser.selectedBrowser, null, () => {
> +      let cps = Components.classes["@mozilla.org/network/captive-portal-service;1"]
> +                          .getService(Components.interfaces.nsICaptivePortalService);
> +      return cps.state == cps.LOCKED_PORTAL;
> +    });
> +  });

Not really, there's no event that's fired specifically to tell consumers that the content process instance of CaptivePortalService received state information.

::: browser/base/content/test/general/browser_aboutCertError.js:41
(Diff revision 5)
> +  let errorTab = yield BrowserTestUtils.openNewForegroundTab(gBrowser, () => {
> +    let tab = gBrowser.addTab(UNKNOWN_ISSUER);
> +    gBrowser.selectedTab = tab;
> +    browser = gBrowser.selectedBrowser;
> +    certErrorLoaded = waitForCertErrorLoad(browser);
> +    return tab;
> +  }, false);

Probably because it waits for the TabSwitchDone event: https://dxr.mozilla.org/mozilla-central/rev/12637ae351d64ecbf6b74cdbf26d7eb24ac0f659/testing/mochitest/BrowserTestUtils/BrowserTestUtils.jsm#112

FWIW this is what other test functions in this test do as well.
Comment on attachment 8815517 [details]
Bug 989197 - Add tests for alternate cert error UI when behind a captive portal.

https://reviewboard.mozilla.org/r/96416/#review97468

> Please put this all in a new file, as these tests are already timing out. In fact, please put it in a new directory instead of in general/ . Followup bug to move this test and its friends (abouthome and so on also come to mind).

Bug 1313568 is next, and once that lands we can have one test for both the notification bar + the error page. Can we leave this test in this file in this bug and take care of it in bug 1313568?
Comment on attachment 8815515 [details]
Bug 989197 - Show alternate UI in cert error pages when a captive portal is active.

https://reviewboard.mozilla.org/r/96412/#review97738

::: browser/base/content/aboutNetError.xhtml:37
(Diff revision 7)
> +      // Also optionally, to specify that we are behind a captive portal:
> +      //   moz-neterror:page?e=error&u=url&captive=true&d=desc

Yeah, let's update these.

::: browser/base/content/aboutNetError.xhtml:49
(Diff revision 7)
>        // the URI that the user attempted to load.
>  
>        function getErrorCode()
>        {
>          var url = document.documentURI;
> -        var error = url.search(/e\=/);
> +        var error = url.search(/\?e\=/);

Here and in other places it doesn't seem like you've addressed my previous comments. Please do that and I'll re-review.
Attachment #8815515 - Flags: review?(gijskruitbosch+bugs)
Comment on attachment 8815517 [details]
Bug 989197 - Add tests for alternate cert error UI when behind a captive portal.

https://reviewboard.mozilla.org/r/96416/#review97468

> Bug 1313568 is next, and once that lands we can have one test for both the notification bar + the error page. Can we leave this test in this file in this bug and take care of it in bug 1313568?

Why wait?

> Isn't there an event that we can listen for instead of using waitForCondition? That would be preferable.

I don't understand. When the state of the captive portal changes in the parent process, is there an event there? Presumably yes. Presumably also, the state in the content process is updated via messaging at the point where it changes in the parent process. Logically speaking, if we do messaging to the content process after that has happened (ie after the state change in the parent) the order guarantees of the parent->child messaging should guarantee that the state in the content has changed by the time our message arrives. So I think we should be using whatever events there are in the parent process...
Comment on attachment 8815517 [details]
Bug 989197 - Add tests for alternate cert error UI when behind a captive portal.

https://reviewboard.mozilla.org/r/96416/#review97744
Attachment #8815517 - Flags: review?(gijskruitbosch+bugs)
Comment on attachment 8815515 [details]
Bug 989197 - Show alternate UI in cert error pages when a captive portal is active.

https://reviewboard.mozilla.org/r/96412/#review97464

> Why was this changed? We shouldn't make this order-dependent (ie this won't find the error code if it's in `&e=`, and `=` should be escaped anyway.
> 
> Really, all of this should be changed to use URLSearchParams, but I don't see why this was changed at all for this patch.

It felt wrong to not change it, because the existing regex would match the "e=" at the end of "captive=" too. In any case, the comment at the start of the <script> mandates the URL to be formatted in a specific way (not that I think this is good). The existing regex already imposes an order (the "u" parameter MUST come right after "e"). I was simply trying to make it obvious that it is indeed order dependent. Maybe we can get rid of the order-dependency in another bug?

Re. "and = should be escaped anyway", do you mean it shouldn't be? Because afaik it doesn't need to be escaped here, though I chose not to change it.

> 84-90 can be replaced with just:
> 
> ```
> return /captive=true/.test(url);
> ```

Nice, thanks!

> Feels like this logic should be inside isCaptive().

I changed the name of the variable to showCaptivePortalUI - this should be more clear. Whether or not we're captive has nothing to do with whether the error is a cert error.

> Instead of this, just reassign "captivePortal" to `err` if captivePortalActive is true, and leave the rest of the code as-is?

Good idea, thanks!

> We open a new tab every time the user clicks this? That doesn't sound great... Is there a followup for this, or can we fix it here, maybe by keeping track of whether we've opened a captive portal tab since the last network change event?

Bug 1313568 is about handling captive portal UI per-window, which will make it easy to ensure that the buttons to open the login page in error pages as well as the notification bar always use an existing login page tab if possible. I'm working on that bug next, so I don't want to fix it here because it will be wasted effort.

> I'm confused. We have some URI (like http://www.mozilla.org/test or whatever) that we use to probe for a captive portal. We open that URI in the full knowledge it will get redirected to the captive portal. Then we compare the URI here - that sounds to me like the comparison will always fail. What am I missing?

The idea is not to close the tab if, when we detect that the portal is no longer active, the tab has redirected to a page other than the canonical URL. For example, after login, the portal might redirect the user to a page showing the amount of time left in the session after.

We only want to close the tab if the portal is one that redirects to the originally requested URL, because in our case, it's the canonical URL, which results in a page that's blank except for the word "success" at the top left.
^ I am so sorry that the replies in the above comment were not published. I thought they were.
Comment on attachment 8815515 [details]
Bug 989197 - Show alternate UI in cert error pages when a captive portal is active.

https://reviewboard.mozilla.org/r/96412/#review97750

::: browser/base/content/aboutNetError.xhtml:49
(Diff revision 7)
>        // the URI that the user attempted to load.
>  
>        function getErrorCode()
>        {
>          var url = document.documentURI;
> -        var error = url.search(/e\=/);
> +        var error = url.search(/\?e\=/);

OK, so if it's /necessary/ to make changes here, which seems to be the case, then we should change this to use URLSearchParams - not entrench the hardcoding and bad patterns a bit more.
Comment on attachment 8815515 [details]
Bug 989197 - Show alternate UI in cert error pages when a captive portal is active.

https://reviewboard.mozilla.org/r/96412/#review97464

> The idea is not to close the tab if, when we detect that the portal is no longer active, the tab has redirected to a page other than the canonical URL. For example, after login, the portal might redirect the user to a page showing the amount of time left in the session after.
> 
> We only want to close the tab if the portal is one that redirects to the originally requested URL, because in our case, it's the canonical URL, which results in a page that's blank except for the word "success" at the top left.

OK, but then this sounds pretty racy - that is, the notification we get ("hi, no more captive portal") will race with the other tabs "realizing" that they can now load the original pages. So, not sure how reliable this is...

Instead, it feels like we should listen specifically for that tab starting to load the canonical URL (and removing that listener if the browser window or tab goes away).

Also, there should be a comment explaining this logic in more detail.
Comment on attachment 8817016 [details]
Bug 989197 - Use URLSearchParams instead of regex to parse params in about:net/certerror URLs.

https://reviewboard.mozilla.org/r/97448/#review97760

::: browser/base/content/aboutNetError.xhtml:43
(Diff revision 1)
>        // the URL (with the format from above). This is because
>        // document.location.href gets the current URI off the docshell,
>        // which is the URL displayed in the location bar, i.e.
>        // the URI that the user attempted to load.
>  
> +      var searchParams;

let

::: browser/base/content/aboutNetError.xhtml:45
(Diff revision 1)
>        // which is the URL displayed in the location bar, i.e.
>        // the URI that the user attempted to load.
>  
> +      var searchParams;
> +
> +      function getSearchParams() {

Rather than having this lazy getter, let's just assign it to the global immediately.

::: browser/base/content/aboutNetError.xhtml:47
(Diff revision 1)
>  
> +      var searchParams;
> +
> +      function getSearchParams() {
> +        if (!searchParams) {
> +          searchParams = new URLSearchParams(document.documentURI.split("?")[1])

```js
({searchParams}) = new URL(document.documentURI);
```
Attachment #8817016 - Flags: review?(gijskruitbosch+bugs) → review+
Comment on attachment 8815515 [details]
Bug 989197 - Show alternate UI in cert error pages when a captive portal is active.

https://reviewboard.mozilla.org/r/96412/#review97764

::: browser/base/content/aboutNetError.xhtml:38
(Diff revision 8)
>        // custom styling and favicon:
>        //
>        //   moz-neterror:page?e=error&u=url&s=classname&d=desc
> +      //
> +      // Also optionally, to specify that we are behind a captive portal:
> +      //   moz-neterror:page?e=error&u=url&captive=true&d=desc

Please update this and the previous comment.

::: browser/base/content/browser.js:2929
(Diff revision 8)
> +    // When we are no longer captive, close the tab if it's at the canonical URL.
> +    let tabCloser = () => {
> +      Services.obs.removeObserver(tabCloser, "captive-portal-login-abort");
> +      Services.obs.removeObserver(tabCloser, "captive-portal-login-success");
> +      if (!tab || tab.closing || !tab.parentNode || !tab.linkedBrowser ||
> +          !tab.linkedBrowser.currentURI.equalsExceptRef(canonicalURI)) {

Comment here now and a followup bug to make this better, I guess?
Attachment #8815515 - Flags: review?(gijskruitbosch+bugs) → review+
I've got responses and updated patches, but I've been having trouble with WiFi, please standby :P
Comment on attachment 8815517 [details]
Bug 989197 - Add tests for alternate cert error UI when behind a captive portal.

https://reviewboard.mozilla.org/r/96416/#review97768
Attachment #8815517 - Flags: review?(gijskruitbosch+bugs)
Comment on attachment 8815517 [details]
Bug 989197 - Add tests for alternate cert error UI when behind a captive portal.

https://reviewboard.mozilla.org/r/96416/#review97468

> Why wait?

OK, I'll figure it out for the next version of the patch. Stay tuned.

> I don't understand. When the state of the captive portal changes in the parent process, is there an event there? Presumably yes. Presumably also, the state in the content process is updated via messaging at the point where it changes in the parent process. Logically speaking, if we do messaging to the content process after that has happened (ie after the state change in the parent) the order guarantees of the parent->child messaging should guarantee that the state in the content has changed by the time our message arrives. So I think we should be using whatever events there are in the parent process...

This way though, if that order guarantee ever changes, the test will not break.

The messaging API might be implemented such that there is an order guarantee, but I don't think it's good philosophically to rely on this when it's not what we're testing.
Comment on attachment 8815517 [details]
Bug 989197 - Add tests for alternate cert error UI when behind a captive portal.

https://reviewboard.mozilla.org/r/96416/#review97468

> What's the advantage of using openNewForegroundTab at all here?

Probably because it waits for the TabSwitchDone event: https://dxr.mozilla.org/mozilla-central/rev/12637ae351d64ecbf6b74cdbf26d7eb24ac0f659/testing/mochitest/BrowserTestUtils/BrowserTestUtils.jsm#112

FWIW this is what other test functions in this test do as well.

^ This comment never got recorded as a reply on MozReview, so I'm reposting it. Sorry.
Comment on attachment 8815515 [details]
Bug 989197 - Show alternate UI in cert error pages when a captive portal is active.

https://reviewboard.mozilla.org/r/96412/#review97764

> Comment here now and a followup bug to make this better, I guess?

Filed bug 1322296
(In reply to Nihanth Subramanya [:nhnt11] from comment #105)
> > I don't understand. When the state of the captive portal changes in the parent process, is there an event there? Presumably yes. Presumably also, the state in the content process is updated via messaging at the point where it changes in the parent process. Logically speaking, if we do messaging to the content process after that has happened (ie after the state change in the parent) the order guarantees of the parent->child messaging should guarantee that the state in the content has changed by the time our message arrives. So I think we should be using whatever events there are in the parent process...
> 
> This way though, if that order guarantee ever changes, the test will not
> break.
> 
> The messaging API might be implemented such that there is an order
> guarantee, but I don't think it's good philosophically to rely on this when
> it's not what we're testing.

The messaging API ordering guarantee shouldn't/won't ever change. Imagine sending:

"arewesafe" message with value "no"
and then
"arewesafe" message with value "yes"

and changing the order.

Using waitForCondition uses setTimeout behinds the scenes and will lead to intermittents *now* rather than in the abstract future where we (somewhat unimaginably) break this guarantee.
OK, I think the point that I was missing is that ContentTask sends a message, which means that by the time ContentTask payload runs, the message that we listened for in the first place is guaranteed to have propagated. I'll upload a new patch.
Comment on attachment 8815517 [details]
Bug 989197 - Add tests for alternate cert error UI when behind a captive portal.

https://reviewboard.mozilla.org/r/96416/#review97810

This still wants moving to a separate file.

Additionally, we need a test for the "reload all the pages" case, for both pages that stay on some magical portal page and ones that redirect to the original.

::: browser/base/content/test/general/browser_aboutCertError.js:19
(Diff revision 10)
> +  yield SpecialPowers.pushPrefEnv({
> +    set: [["captivedetect.canonicalURL", CANONICAL_URL],
> +          ["captivedetect.canonicalContent", CANONICAL_CONTENT]],
> +  });

Put this in the same task.
Attachment #8815517 - Flags: review?(gijskruitbosch+bugs)
(In reply to Nihanth Subramanya [:nhnt11] from comment #114)
> OK, I think the point that I was missing is that ContentTask sends a
> message, which means that by the time ContentTask payload runs, the message
> that we listened for in the first place is guaranteed to have propagated.
> I'll upload a new patch.

Yes. Note that even loadURI on a remote browser will send a message.
Comment on attachment 8815517 [details]
Bug 989197 - Add tests for alternate cert error UI when behind a captive portal.

https://reviewboard.mozilla.org/r/96416/#review97810

Just to make sure we're on the same page, we don't reload portal tabs. We reload cert error pages that are showing the alternate captive portal UI..
Comment on attachment 8815517 [details]
Bug 989197 - Add tests for alternate cert error UI when behind a captive portal.

https://reviewboard.mozilla.org/r/96416/#review97900

This still needs to move to a new dir rather than cluttering up `general/` more.

::: browser/base/content/test/general/browser_captivePortal.js:27
(Diff revision 11)
> +  Services.obs.notifyObservers(null, "captive-portal-login", null);
> +
> +  info("Waiting for captive portal state to be propagated to the content process.");
> +  yield captivePortalStatePropagated;
> +
> +  // Open a page with a cert error.

Why can't we load a normal page?

::: browser/base/content/test/general/browser_captivePortal.js:62
(Diff revision 11)
> +
> +  Services.obs.notifyObservers(null, "captive-portal-login-success", null);
> +  yield portalTabRemoved;
> +
> +  info("Waiting for error tab to be reloaded after the captive portal was freed.");
> +  yield errorTabReloaded;

I'm confused. This is no longer loading an error page, right? It's loading the original target page that we tried to load. So we can just use `BTU.browserLoaded()`, right?

::: testing/mochitest/BrowserTestUtils/BrowserTestUtils.jsm:645
(Diff revision 11)
> +   *
> +   * @returns {Promise}
> +   * @resolves An object with properties reflecting the subject, topic, and data
> +   *           values of the notification.
> +   */
> +  waitForObserverNotification(aNotificationTopic) {

This looks like you're duplicating `TestUtils.topicObserved` ?
Attachment #8815517 - Flags: review?(gijskruitbosch+bugs)
Comment on attachment 8815517 [details]
Bug 989197 - Add tests for alternate cert error UI when behind a captive portal.

https://reviewboard.mozilla.org/r/96416/#review97900

> Why can't we load a normal page?

Why would we? This bug is about replacing our normal cert error UI with an alternate one when we're behind a captive portal. So to test this, we deliberately load a page with a cert error.

> I'm confused. This is no longer loading an error page, right? It's loading the original target page that we tried to load. So we can just use `BTU.browserLoaded()`, right?

Nah, we need to load a page with a cert error to test the alternate UI, so the original target page is https://expired.example.com/. Basically this test only checks whether the cert error page changes to the captive portal specific UI if we've detected a portal, it doesn't actually try to load a secure page and simulate the cert error itself.
The latest patch moves the test to a new captivePortal/ folder. In bug 1313568, I intend to get rid of CaptivePortalWatcher.jsm and have all the captive portal tests live in this new folder. I went ahead and created a head.js for this folder even though it currently has only one function - there are a few utility functions in the CaptivePortalWatcher test file that can live here.
Comment on attachment 8815517 [details]
Bug 989197 - Add tests for alternate cert error UI when behind a captive portal.

https://reviewboard.mozilla.org/r/96416/#review98278

You have eslint errors on your trypush, but otherwise this looks OK.
Attachment #8815517 - Flags: review?(gijskruitbosch+bugs) → review+
(In reply to Ton from comment #73)
> 
> Please note that "login" is a noun, "log in" is the verb. "Login Page"
> should be fine.
> Can this be corrected?

Please do not ignore but fix this prior to release.
Pushed by nhnt11@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/b60d8554e94f
Use URLSearchParams instead of regex to parse params in about:net/certerror URLs. r=Gijs
https://hg.mozilla.org/integration/autoland/rev/7c05b704b149
Show alternate UI in cert error pages when a captive portal is active. r=Gijs
https://hg.mozilla.org/integration/autoland/rev/c6577deb4a66
Include captive portal state as a flag in the about:certerror URI. r=valentin
https://hg.mozilla.org/integration/autoland/rev/cb2996e61709
Reload error pages showing captive portal UI when the portal is freed. r=Gijs
https://hg.mozilla.org/integration/autoland/rev/c13180ad5ce6
Add tests for alternate cert error UI when behind a captive portal. r=Gijs
Depends on: 1322201
(In reply to Ton from comment #137)
> (In reply to Ton from comment #73)
> > 
> > Please note that "login" is a noun, "log in" is the verb. "Login Page"
> > should be fine.
> > Can this be corrected?
> 
> Please do not ignore but fix this prior to release.

Don't worry, I already filed bug 1322201 a few days ago :)
I set that bug as being blocked by this one as well as bug 1315969, and cc'd you on it. Sorry for the lack of communication!
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Keywords: leave-open
Resolution: --- → FIXED
Target Milestone: --- → Firefox 53
Comment on attachment 8815515 [details]
Bug 989197 - Show alternate UI in cert error pages when a captive portal is active.

Approval Request Comment
[User impact if declined]: Users may see certificate errors for all secure pages if there is a captive portal on the network intercepting traffic. This is confusing.
[Is this code covered by automated tests?]: Yes
[Has the fix been verified in Nightly?]: Yes
[Is the change risky?]: Very slightly.
[Why is the change risky/not risky?]: This adds a new alternate UI in cert error pages when a captive is active. There's a small chance that there are uncaught edge case bugs.
[String changes made/needed]: none, strings already uplifted.
Attachment #8815515 - Flags: approval-mozilla-aurora?
Comment on attachment 8815516 [details]
Bug 989197 - Reload error pages showing captive portal UI when the portal is freed.

Approval Request Comment
See comment 141
Attachment #8815516 - Flags: approval-mozilla-aurora?
Comment on attachment 8815517 [details]
Bug 989197 - Add tests for alternate cert error UI when behind a captive portal.

Approval Request Comment
See comment 141
Attachment #8815517 - Flags: approval-mozilla-aurora?
Comment on attachment 8816307 [details]
Bug 989197 - Include captive portal state as a flag in the about:certerror URI.

Approval Request Comment
See comment 141
Attachment #8816307 - Flags: approval-mozilla-aurora?
Comment on attachment 8817016 [details]
Bug 989197 - Use URLSearchParams instead of regex to parse params in about:net/certerror URLs.

Approval Request Comment
See comment 141
Attachment #8817016 - Flags: approval-mozilla-aurora?
This bug is marked as depending on bugs 1325224 and 1317511, which aren't in aurora.  Should this be uplifted ahead of them anyway?
Flags: needinfo?(nhnt11)
Are we aiming to ship captive portal detection in 52? It has missed the mid-aurora signoff point. 
Nihanth, is there any plan for manual testing?
(In reply to Julien Cristau [:jcristau] from comment #146)
> This bug is marked as depending on bugs 1325224 and 1317511, which aren't in
> aurora.  Should this be uplifted ahead of them anyway?

/me cries.

I wrote a reply to this but apparently it was never published. Bug 1325224 does not need to be uplifted beforehand, but bug 1317511 needs to be uplifted (you already granted approval though).

(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #147)
> Are we aiming to ship captive portal detection in 52? It has missed the
> mid-aurora signoff point.

Captive portal detection (the notification bar) was uplifted to 52 in bug 1313706, it would be nice if this bug got uplifted as well. 

> Nihanth, is there any plan for manual testing?

Hmm, the QA test plan is here: https://wiki.mozilla.org/QA/Captive_Portals.
Flags: needinfo?(nhnt11)
Comment on attachment 8815515 [details]
Bug 989197 - Show alternate UI in cert error pages when a captive portal is active.

captive portal ssl error page for aurora52
Attachment #8815515 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8815516 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8815517 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8816307 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8817016 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
needs rebasing for aurora
Flags: needinfo?(nhnt11)
I rebased all the commits onto aurora and qfolded them into one patch. I didn't change the resulting commit message (all the individual commit messages with rows of "****" between them) - let me know if I should.
Flags: needinfo?(nhnt11) → needinfo?(cbook)
Comment on attachment 8824087 [details] [diff] [review]
Folded patch, rebased for aurora

I haven't tested this folded patch, please don't land it yet.
Attachment #8824087 - Attachment is obsolete: true
Comment on attachment 8824087 [details] [diff] [review]
Folded patch, rebased for aurora

Tested OK!
Attachment #8824087 - Attachment is obsolete: false
Flags: needinfo?(cbook)
SSL error page tested and verified on:
52.0a2 20170119004006
53.0a1 20170119030211

One note though and I'm not sure if this should be treated as a bug: 
In the Captive Portal case, when you get the SSL error page, you are most likely offline, which means that the learn more link will not work. On the same note, same applies for "Report Errors like this to help Mozilla...".
Nihant, what are your thoughts on the above?
Flags: needinfo?(nhnt11)
:AdrianSV wrote:

> …the learn more link will not work.

Is there anything we can do to pre-cache this information somehow? For example, we can make the informational paragraphs on this page longer. It means that our users don’t have to experience the frustration of clicking “Learn More” and have that link be broken.

Otherwise, consider just taking this link out.


> …same applies for "Report Errors like this to help Mozilla...".

Can we somehow store the error information temporarily and report it only after the user is connected to the internet (even though the captive portal page no longer appears?)
(In reply to Adrian Florinescu [:AdrianSV] from comment #155)
> SSL error page tested and verified on:
> 52.0a2 20170119004006
> 53.0a1 20170119030211
> 
> One note though and I'm not sure if this should be treated as a bug: 
> In the Captive Portal case, when you get the SSL error page, you are most
> likely offline, which means that the learn more link will not work. On the
> same note, same applies for "Report Errors like this to help Mozilla...".
> Nihant, what are your thoughts on the above?

Do you mean if the captive portal login page itself results in a cert error? That's an interesting case, we should exclude that URL from showing the captive-portal error page UI (because it'll be a loop - user clicks "Open Login Page" only to get another tab with the same "Log in to network" UI).

If you mean more generally about SSL error pages when a captive portal is active:

So right now, if the user is behind a captive portal, they will see cert error page when the following conditions are true:
a) We haven't detected the captive portal yet
b) They try to load a page and encounter a cert error

This only happens once, because after they encounter the cert error, we do a recheck and that should complete by the time they encounter another one.

Earlier in this bug, in comment 1, Brian says we should wait till the recheck completes before showing any error page UI. This is a possible solution, but we need to decide at what point we want to time out the request and just show an error page, and also make it "clean" - the UI shouldn't flicker between two states.

I've had discussions about both these cases at some point. I'm not sure if bugs have been filed. I'll file them today if needed.
Flags: needinfo?(nhnt11)
(In reply to Bram Pitoyo [:bram] from comment #156)
> :AdrianSV wrote:
> 
> > …the learn more link will not work.
> 
> Is there anything we can do to pre-cache this information somehow? For
> example, we can make the informational paragraphs on this page longer. It
> means that our users don’t have to experience the frustration of clicking
> “Learn More” and have that link be broken.
> 
> Otherwise, consider just taking this link out.

Both of these sound ok to me, but I want to note that ideally, we should only have to worry about this for the case where the captive portal login page itself results in a cert error. Ideally, in my opinion, cert errors should always be replaced by the captive portal error page for other URLs.

> > …same applies for "Report Errors like this to help Mozilla...".
> 
> Can we somehow store the error information temporarily and report it only
> after the user is connected to the internet (even though the captive portal
> page no longer appears?)

It should be possible, though from an engineering PoV I'm not sure if a mechanism to do this already exists (or what such a mechanism would look like, off the top of my head).
Depends on: 1334774
Marking VERIFIED FIXED based on comment 155.
Status: RESOLVED → VERIFIED
Depends on: 1613477
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: