Closed Bug 1367450 Opened 3 years ago Closed 3 years ago

Captive portal detection shouldn't block first paint

Categories

(Firefox :: General, enhancement, P1)

enhancement

Tracking

()

VERIFIED FIXED
Firefox 55
Iteration:
55.7 - Jun 12
Tracking Status
firefox55 --- verified
firefox56 --- verified

People

(Reporter: florian, Assigned: nhnt11)

References

(Blocks 1 open bug)

Details

(Whiteboard: [photon-performance][qf:p1][qa-commented])

Attachments

(1 file)

In this cold startup profile, captive portal detection pays the expensive cost of initializing NSS on the main thread, as it's doing the first https request: https://perf-html.io/public/3e380602698ab6d41de775ae1a3402eceef39362/calltree/?callTreeFilters=prefix-01MnyG5yG6yH0yH1yH2yH3yH4zQvzR0zE4zE5zR2zR3zE8zR4zR5zR6zEhzEiBKszU1zElzPozU2zU3zU4yHfCM7Cx2KczU5zUiBFpN8Cx2KczU6zUiBFpN8Cx2KczUf&range=0.0000_5.3842&thread=0

In the short term, we should move captive portal initialization to after the first paint.

A better long term solution will be to start captive portal detection during startup as soon as the network can be used without causing extra overhead. We would like NSS to be initialized off main thread, and to somehow let the main thread know that it's now possible to use it.
Whiteboard: [photon-performance][qf] → [photon-performance][qf] [triage]
Flags: qe-verify+
Priority: -- → P2
QA Contact: adrian.florinescu
Whiteboard: [photon-performance][qf] [triage] → [photon-performance][qf]
Assignee: nobody → nhnt11
Status: NEW → ASSIGNED
Priority: P2 → P1
Currently it appears that at startup, the CaptivePortalWatcher (browser-captivePortal.js) is the only consumer triggering a recheck. Per my investigation, commenting out the recheck call in that file results in no XHR ever being created from captivedetect.js long after the first window has loaded and even after some minor browsing. It seems that the calls in nsIOService when the network link changes or when the pref changes don't happen at startup. When I realized there were no other rechecks being triggered I didn't dig deeper.

Once the NSS-init-off-main-thread work is done, we should investigate for a better way to trigger the first recheck.
Comment on attachment 8873257 [details]
Bug 1367450 - Don't trigger captive portal check until delayed startup has completed.

https://reviewboard.mozilla.org/r/144712/#review148682

Looks good to me. I wonder if there's a way to test this. Currently captivedetect.js doesn't show up at all in the test I wrote in bug 1358798, so blacklisting it there before first paint wouldn't help. I assume captive portal detection is off for mochitests to avoid external network requests. I wonder if we could change this to instead check an url hosted by the mochitest server, so that the tests would see a behavior closer to what users experience. Is this something you can investigate?
Attachment #8873257 - Flags: review?(florian) → review+
Pushed by nhnt11@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/a2b294e57559
Defer initialization of CaptivePortalWatcher to just before completion of delayed startup. r=florian
Flags: needinfo?(nhnt11)
Flags: needinfo?(nhnt11)
Pushed by nhnt11@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/ff5590f1750d
Defer initialization of CaptivePortalWatcher to just before completion of delayed startup. r=florian
Backout by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/a3bdbdd4d95a
Backed out changeset ff5590f1750d for browser_CaptivePortalWatcher.js failures.
Comment on attachment 8873257 [details]
Bug 1367450 - Don't trigger captive portal check until delayed startup has completed.

Sorry about the test failures, try seems green now.

CaptivePortalWatcher attempts to open the portal tab after xul-window-visible if a portal has already been detected before the window was opened. There is a test for this.

If we move CaptivePortalWatcher into delayedStartup, xul-window-visible is a thing of the past, so it waits forever to open the tab, causing the test to fail.

Rather than messing with the mechanism of the portal tab/tests, I've simply kept the CaptivePortalWatcher.init() call where it was, and instead have moved the recheck trigger to after delayed-startup within browser-captivePortal.js, which is what we really cared about anyway.
Attachment #8873257 - Flags: review+ → review?(florian)
Whiteboard: [photon-performance][qf] → [photon-performance][qf:p1]
Comment on attachment 8873257 [details]
Bug 1367450 - Don't trigger captive portal check until delayed startup has completed.

https://reviewboard.mozilla.org/r/144712/#review149194

::: browser/base/content/browser-captivePortal.js:54
(Diff revision 4)
>    },
>  
>    init() {
> +    // We trigger a portal check after delayed startup to avoid doing a network
> +    // request before first paint.
> +    Services.obs.addObserver(this, "browser-delayed-startup-finished");

Should this only be in the cps.state == cps.UNKNOWN case? You don't want to add this observer for each new window, right?

::: browser/base/content/browser-captivePortal.js
(Diff revision 4)
>        let windows = Services.wm.getEnumerator("navigator:browser");
>        if (windows.getNext() == window && !windows.hasMoreElements()) {
>          this.ensureCaptivePortalTab();
>        }
>      }
> -

it seems you want:
  } else if (cps.state == cps.UNKNOWN) {
    this._delayedRecheckPending = true;
    Services.obs.addObserver(this, "browser-delayed-startup-finished");
  }
Attachment #8873257 - Flags: review?(florian) → review-
Comment on attachment 8873257 [details]
Bug 1367450 - Don't trigger captive portal check until delayed startup has completed.

https://reviewboard.mozilla.org/r/144712/#review150734

Thanks!

::: browser/base/content/browser-captivePortal.js:90
(Diff revision 5)
>    },
>  
>    observe(aSubject, aTopic, aData) {
>      switch (aTopic) {
> +      case "browser-delayed-startup-finished":
> +        cps.recheckCaptivePortal();

nit: I would remove the observer and reset the _delayedRecheckPending flag before calling cps.recheckCaptivePortal() so that if it throws we don't leave an observer behind. This is a purely theoritical concern though, as the current recheckCaptivePortal implementation always returns NS_OK.
Attachment #8873257 - Flags: review?(florian) → review+
Here's a profile where this is blocking first paint for over 500ms.

https://perfht.ml/2rMQjbA
(In reply to Mike Conley (:mconley) from comment #15)
> Here's a profile where this is blocking first paint for over 500ms.
> 
> https://perfht.ml/2rMQjbA

In that profile NSS has already been initialized by bug 1370832, and the cost you are seeing here is bug 1360164.
Pushed by nhnt11@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/90deb122d260
Don't trigger captive portal check until delayed startup has completed. r=florian
https://hg.mozilla.org/mozilla-central/rev/90deb122d260
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 55
Iteration: --- → 55.7 - Jun 12
QA note: this changed the code used to start captive portal detection, so it would be good to do some exploratory testing around the captive portal feature to verify that it still works as expected and that delaying its initialization doesn't introduce notable regressions.
Whiteboard: [photon-performance][qf:p1] → [photon-performance][qf:p1][qa-commented]
verified on: OSX 10.10, OSX 10.12, Ubuntu 16.04 x64, Ubuntu 14.04 x32, Windows 10x64

56.0a1 20170718100239
55.0b10 20170717064120




Run the basic Captive Portal scenarios:
-Active/Inactive CP detection;
-CP canonical redirect ; 
-CP managers(OSX/Win) integration related to FF startup;

There is only one thing I've noticed: on OSX, the CP assistant is launched again after FF is started and CP detection succeeds. However, I cannot reproduce the above behavior consistently. My current assumption is that the above behavior was network related (only one available CP capable wi-fi network for testing - will follow up on this once I get the chance to try on a different CP wi-fi network), so marking this issue as verified.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.