Don't use the pingsender when Firefox closes because the OS is shutting down

RESOLVED FIXED in Firefox 55

Status

()

enhancement
P1
normal
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: Dexter, Assigned: Dexter)

Tracking

Trunk
mozilla55
Points:
2
Dependency tree / graph

Firefox Tracking Flags

(firefox55 fixed)

Details

(Whiteboard: [measurement:client])

Attachments

(1 attachment)

We block the browser shutdown on spawning the pingsender process for sending the shutdown ping. Even if we don't wait for the pingsender process to complete its work, this operation takes a bit of time.

We need to make sure that:

- We don't regress the time it takes for the browser to shut down.
- We don't regress the time it takes to shut down the OS if we're shutting down with the browser open.
Blocks: 1343277
Points: --- → 2
Priority: -- → P1
Whiteboard: [measurement:client]
|shutdownDuration| can be checked for the impact of spawning a process at shutdown.
OS shutdown times are hard, as we have no smart way to measure that.
This analysis was performed in bug 1343282 and we were found to regress shutdown times a bit. I'm morphing this bug to not using the shutdown pingsender when Firefox is closing because the OS is shutting down.
Summary: Verify that the "shutdown pingsender" doesn't regress OS and FF shutdown times → Don't use the pingsender when Firefox closes because the OS is shutting down
Assignee: nobody → alessio.placitelli
Benjamin, do you have any idea on how can we detect if Firefox is shutting down because the user quit or because the OS is shutting down?

By digging into the code-base I found that, on Windows, we send the "quit-application-forced" [1] notification. A more general solution seems to look for "syncShutdown" when "quite-application-granted" is received [2]. Do these solution make sense to you? Am I missing something?

Our objective is to avoid spawning the pingsender process if we're shutting down because the OS is telling us to do so.

[1] - http://searchfox.org/mozilla-central/rev/d66b9f27d5630a90b2fce4d70d4e9050f43df9b4/widget/windows/nsWindow.cpp#5120
[2] - http://searchfox.org/mozilla-central/rev/d66b9f27d5630a90b2fce4d70d4e9050f43df9b4/browser/components/sessionstore/SessionStore.jsm#768
Flags: needinfo?(benjamin)
(In reply to Benjamin Smedberg [:bsmedberg] from comment #4)
> WM_ENDSESSION can be a signal of this on Windows:
> https://dxr.mozilla.org/mozilla-central/rev/
> d8762cb967423618ff0a488f14745f60964e5c49/widget/windows/nsWindow.cpp#5108

Thanks Benjamin! On a closer look, WM_ENDSESSION is exactly what's triggering the "quit-application-forced" notification on Windows \o/ So that should be working.
Comment on attachment 8867111 [details]
Bug 1363345 - Don't use the pingsender when the OS is shutting down. ,data-review=bsmedberg

Georg, does this change look reasonable to you? The notifications we're catching are being triggered on "fast shutdown", that is when the OS is shutting down and we need to close, fast.

I'm not entirely sure how to test this though. Any suggestion?
Attachment #8867111 - Flags: feedback?(gfritzsche)
Jimm, in this bug we're trying to detect if the OS is shutting down to avoid spawning a new process to send the shutdown ping (pingsender).

Can you suggest anyone that could give feedback on the way this is being implemented? Any suggestion on how to properly test this would also be very appreciated!
Flags: needinfo?(jmathies)
Comment on attachment 8867111 [details]
Bug 1363345 - Don't use the pingsender when the OS is shutting down. ,data-review=bsmedberg

https://reviewboard.mozilla.org/r/138718/#review142396

::: toolkit/components/telemetry/TelemetrySend.jsm:59
(Diff revision 1)
>  const PREF_ENABLED = PREF_BRANCH + "enabled";
>  const PREF_FHR_UPLOAD_ENABLED = "datareporting.healthreport.uploadEnabled";
>  const PREF_OVERRIDE_OFFICIAL_CHECK = PREF_BRANCH + "send.overrideOfficialCheck";
>  
>  const TOPIC_IDLE_DAILY = "idle-daily";
> -const TOPIC_QUIT_APPLICATION = "quit-application";
> +// The following topics are used to detect if Firefox is closing

Nit: "... are notified when ..."?

::: toolkit/components/telemetry/TelemetrySend.jsm:593
(Diff revision 1)
> -
> +  // Boolean to track if the OS is shutting down and we need to stop
> +  // our operations quickly.

Nit: Don't describe the type, just describe what this is for.
I don't think we are doing anything more quickly, just skipping some operations.

Can we just call this "isOSShutdown" or similar?

::: toolkit/components/telemetry/TelemetrySend.jsm:853
(Diff revision 1)
> +    // Moreover, if the OS is shutting down, we don't want to spawn the
> +    // pingsender as it could be killed before it can complete its tasks,
> +    // for example after successfully sending the ping but before removing
> +    // the copy from the disk, resulting in receiving duplicate pings when
> +    // Firefox restarts.

The main reason was that we did not want to block OS shutdown unneccessarily.
We need to carefully watch how much of this affects our latency win.

I think we also need to know which main pings are due to OS shutdown.
Can we add a data point for this here?

::: toolkit/components/telemetry/tests/unit/test_TelemetrySession.js:1399
(Diff revision 1)
>  
> -  // Enable ping upload and disable the "submission policy".
> -  // The shutdown ping must not be sent.
> +  // Enable ping upload and signal a fast-shutdown. The pingsender
> +  // will not be spawned and no ping will be sent.
>    Preferences.set(PREF_FHR_UPLOAD_ENABLED, true);
> +  yield TelemetryController.testReset();
> +  Services.obs.notifyObservers(null, "quit-application-forced");

Should we test both topics?
Comment on attachment 8867111 [details]
Bug 1363345 - Don't use the pingsender when the OS is shutting down. ,data-review=bsmedberg

https://reviewboard.mozilla.org/r/138718/#review142396

> The main reason was that we did not want to block OS shutdown unneccessarily.
> We need to carefully watch how much of this affects our latency win.
> 
> I think we also need to know which main pings are due to OS shutdown.
> Can we add a data point for this here?

I just added the "telemetry.os_shutting_down" boolean scalar (will ask for data-review once review is settled) and also used it to validate this fix. The scalar is correctly set to true when Firefox is left open while Windows is shutting down and not set at all when Firefox is closed with the normal flow.
(In reply to Alessio Placitelli [:Dexter] from comment #8)
> Jimm, in this bug we're trying to detect if the OS is shutting down to avoid
> spawning a new process to send the shutdown ping (pingsender).
> 
> Can you suggest anyone that could give feedback on the way this is being
> implemented? Any suggestion on how to properly test this would also be very
> appreciated!

Testing is tricky, you would have to simulate a native message like WM_ENDSESSION. We do send observers as a result of that, via nsWindow code - 

http://searchfox.org/mozilla-central/source/widget/windows/nsWindow.cpp#5108

But you can't do this in automation.

Probably the best way we have now to handle this would be to make sure manual QA testing happens through your test plan when the feature lands.
Flags: needinfo?(jmathies)
Comment on attachment 8867111 [details]
Bug 1363345 - Don't use the pingsender when the OS is shutting down. ,data-review=bsmedberg

https://reviewboard.mozilla.org/r/138718/#review142966

::: toolkit/components/telemetry/Scalars.yaml:270
(Diff revision 2)
>        - eglassercamp@mozilla.com
>      release_channel_collection: opt-out
>      record_in_processes:
>        - main
>  
> +# The following section contains WebRTC nICEr scalars

It would make reviewinging easier to have orthogonal changes like this in a separate commit in the future.

::: toolkit/components/telemetry/TelemetrySend.jsm:643
(Diff revision 2)
> +    Services.obs.addObserver(this, TOPIC_QUIT_APPLICATION_FORCED);
> +    Services.obs.addObserver(this, TOPIC_QUIT_APPLICATION_GRANTED);

If you haven't yet, please double-check that these are the right topics that fire *only* for OS shutdown scenarios.

::: toolkit/components/telemetry/TelemetrySend.jsm:807
(Diff revision 2)
>      switch (topic) {
>      case TOPIC_IDLE_DAILY:
>        SendScheduler.triggerSendingPings(true);
>        break;
> +    case TOPIC_QUIT_APPLICATION_FORCED:
> +      this._log.trace("observe - enabling fast shutdown");

Nit: Drop the "fast shutdown".
"observe - in OS shutdown"?

Also, both branches contain the 3 same statements.
We should share this.

::: toolkit/components/telemetry/tests/unit/test_TelemetrySession.js:1372
(Diff revision 2)
> +    let pendingPings = await TelemetryStorage.loadPendingPingList();
> +    Assert.equal(pendingPings.length, 2,
> +                 "We expect 2 pending pings: shutdown and saved-session.");

Possible alternative approach: Check the headers for the correct sender.
This works fine though.

::: toolkit/components/telemetry/tests/unit/test_TelemetrySession.js:1375
(Diff revision 2)
> +    let foundShutdown = false;
> +    for (let p of pendingPings) {

This seems a bit complicated, lets drop the loop.
- You assert having two pings.
- You assert that one is type "main", one is "saved-session".
- Just use the one main ping to do remaining checks.

::: toolkit/components/telemetry/tests/unit/test_TelemetrySession.js:1419
(Diff revision 2)
> -  // Enable ping upload and disable the "submission policy".
> -  // The shutdown ping must not be sent.
> +  // Enable ping upload and signal a fast-shutdown. The pingsender
> +  // will not be spawned and no ping will be sent.

Nit: "OS shutdown", not "fast shutdown"
Attachment #8867111 - Flags: review?(gfritzsche)
Comment on attachment 8867111 [details]
Bug 1363345 - Don't use the pingsender when the OS is shutting down. ,data-review=bsmedberg

https://reviewboard.mozilla.org/r/138718/#review142984

::: toolkit/components/telemetry/Scalars.yaml:448
(Diff revision 2)
> +    record_in_processes:
> +      - 'main'
> +      - 'content'
> +
> +telemetry:
> +  os_shutting_down:

Do we need this as opt-out?
Comment on attachment 8867111 [details]
Bug 1363345 - Don't use the pingsender when the OS is shutting down. ,data-review=bsmedberg

https://reviewboard.mozilla.org/r/138718/#review142966

> It would make reviewinging easier to have orthogonal changes like this in a separate commit in the future.

Good point, sorry about that.

> If you haven't yet, please double-check that these are the right topics that fire *only* for OS shutdown scenarios.

I checked that already also also manually tested this on my Windows laptop: the normal shutdown flow goes through a [different path](http://searchfox.org/mozilla-central/source/toolkit/components/startup/nsAppStartup.cpp#414) and these topics only seem to be sent on OS shutdown, on Windows.
Comment on attachment 8867111 [details]
Bug 1363345 - Don't use the pingsender when the OS is shutting down. ,data-review=bsmedberg

https://reviewboard.mozilla.org/r/138718/#review143426

::: toolkit/components/telemetry/TelemetrySend.jsm:801
(Diff revision 3)
>      this._annotateCrashReport();
>  
>      return this.promisePendingPingActivity();
>    },
>  
> +  _setOSShutdown() {

This is only used in `observe()` - just make it a local helper function there?

::: toolkit/components/telemetry/tests/unit/test_TelemetrySession.js:1384
(Diff revision 3)
> +    ];
> +    // Find the shutdown main ping and check that it contains the right data.
> +    const shutdownPing = pings.find(p => p.type == "main");
> +    Assert.ok(shutdownPing, "The 'shutdown' ping must be saved to disk.");
> +    Assert.ok(pings.find(p => p.type == "saved-session"),
> +              "We 'saved-session' ping must be saved to disk.");

"The ..."

::: toolkit/components/telemetry/tests/unit/test_TelemetrySession.js:1419
(Diff revision 3)
>    yield TelemetryController.testShutdown();
>  
>    // Make sure we have no pending pings between the runs.
>    yield TelemetryStorage.testClearPendingPings();
>  
> -  // Enable ping upload and disable the "submission policy".
> +  // Enable ping upload and signal an os-shutdown. The pingsender

"... OS shutdown."
Attachment #8867111 - Flags: review?(gfritzsche) → review+
Comment on attachment 8867111 [details]
Bug 1363345 - Don't use the pingsender when the OS is shutting down. ,data-review=bsmedberg

data-review?

Hi Benjamin, this adds a new opt-out boolean scalar, "telemetry.os_shutting_down", which will expire in Firefox 58.

Our plan is to use this new measurement for:

- evaluating how frequent users are shutting down the OS with Firefox open;
- checking if OS shutdown was responsible for the increase in the duplicate rates we were seeing after enabling the 'shutdown' pingsender;
- evaluating the impact of this patch on the latency of the 'shutdown' main ping.

I will perform an initial analysis on the data few weeks after this probe lands, to have an initial look at the questions above.

I will repeat the analysis when this hits Release and keep monitoring until the probe expires.
Attachment #8867111 - Flags: review?(benjamin)
Comment on attachment 8867111 [details]
Bug 1363345 - Don't use the pingsender when the OS is shutting down. ,data-review=bsmedberg

https://reviewboard.mozilla.org/r/138718/#review143472

data-r=me with the description fixed

::: toolkit/components/telemetry/Scalars.yaml:452
(Diff revision 4)
> +telemetry:
> +  os_shutting_down:
> +    bug_numbers:
> +      - 1363345
> +    description: >
> +      Indicates indicate whether or not the ping was generated while the OS

From reading the code, it seems that only "true" values are recorded, so "whether or not" is not correct. Please reword this something like "Records true if there is a signal that Firefox was quitting because the OS was shutting down."

In addition, I believe this is Windows-only at the moment? Please document that.
Attachment #8867111 - Flags: review?(benjamin) → review+
Madalin, could you pick this up after the 'new-profile' QA? I'd wait for QA to happen before landing this.
This is a potential blocker for the 'shutdown' pingsender.

The builds can be found here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=2a4d0191f494
Flags: needinfo?(madalin.cotetiu)
The test plan can be found in here: https://docs.google.com/document/d/1YiQCO5JrB1Wq2vUGlb8DO_bcHkwG3YejYLreY1poKfw/edit
The test results are in here: https://docs.google.com/spreadsheets/d/1M4bOO579x03fFUK4NbVWMyrhH3Pr99HxgNIyPtGntWA/edit#gid=0

If anything else is needed please ping me.
Flags: needinfo?(madalin.cotetiu)
(In reply to Madalin Cotetiu from comment #23)
> The test plan can be found in here:
> https://docs.google.com/document/d/
> 1YiQCO5JrB1Wq2vUGlb8DO_bcHkwG3YejYLreY1poKfw/edit
> The test results are in here:
> https://docs.google.com/spreadsheets/d/
> 1M4bOO579x03fFUK4NbVWMyrhH3Pr99HxgNIyPtGntWA/edit#gid=0
> 
> If anything else is needed please ping me.

Great job there, thanks Madalin!
Pushed by alessio.placitelli@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/30179c555174
Don't use the pingsender when the OS is shutting down. r=bsmedberg,gfritzsche,data-review=bsmedberg
Backed out for eslint failing at test_TelemetrySession.js:1421:

https://hg.mozilla.org/integration/autoland/rev/40ae717ecce778ea15f6c0acaa4f7f3d43e293ac

Push with failure: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=30179c5551744b880f563b5382da8f256385fdc3&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=100969041&repo=autoland
> TEST-UNEXPECTED-ERROR | /home/worker/checkouts/gecko/toolkit/components/telemetry/tests/unit/test_TelemetrySession.js:1421:9 | Parsing error: Unexpected token TelemetryController (eslint)
Flags: needinfo?(alessio.placitelli)
This causes also xpcshell failures like https://treeherder.mozilla.org/logviewer.html#?job_id=100973860&repo=autoland
> SyntaxError: yield is a reserved identifier at /home/worker/workspace/build/tests/xpcshell/tests/toolkit/components/telemetry/tests/unit/test_TelemetrySession.js:1421
Pushed by alessio.placitelli@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/3077eb2b5693
Don't use the pingsender when the OS is shutting down. r=bsmedberg,gfritzsche,data-review=bsmedberg
(In reply to Sebastian Hengst [:aryx][:archaeopteryx] (needinfo on intermittent or backout) from comment #27)
> This causes also xpcshell failures like
> https://treeherder.mozilla.org/logviewer.html#?job_id=100973860&repo=autoland
> > SyntaxError: yield is a reserved identifier at /home/worker/workspace/build/tests/xpcshell/tests/toolkit/components/telemetry/tests/unit/test_TelemetrySession.js:1421

Yup, thanks for backing this out. Looks like the code changed between my try push and when it landed. Merging didn't fail :-\ I've fixed the problem and pushed it again :)
Flags: needinfo?(alessio.placitelli)
https://hg.mozilla.org/mozilla-central/rev/3077eb2b5693
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.