Open Bug 1348890 Opened 7 years ago Updated 2 years ago

Make the TelemetrySend tests call setup()/shutdown() in a consistent way

Categories

(Toolkit :: Telemetry, enhancement, P3)

enhancement

Tracking

()

Tracking Status
firefox55 --- affected

People

(Reporter: gsvelto, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [measurement:client])

Attachments

(1 file)

We currently have some try/catch blocks in the shutdown procedure of the TelemetrySend object to cope with tests that call shutdown() without calling setup() first or that call shutdown() twice in a row, see:

https://dxr.mozilla.org/mozilla-central/rev/1b9293be51637f841275541d8991314ca56561a5/toolkit/components/telemetry/TelemetrySend.jsm#642

This makes the tests output rather confusing as they produce a number of warnings even when the tests succeed. The tests should be fixed so that they act consistently and we don't need to ignore errors this way.
Priority: -- → P3
Whiteboard: [measurement:client]
Depends on: 1416078
Blocks: 1416078
No longer depends on: 1416078
I've looked into this but I haven't found any tests that trigger the try/catch you link to. I know it was a while ago, but do you recall which tests were exhibiting this error? It's possible that they got cleaned up independently of this bug, I imagine.
This is the test that's causing the issue:

https://searchfox.org/mozilla-central/rev/797c93d81fe446f78babf20894f0729f15f71ee6/toolkit/components/telemetry/tests/unit/test_TelemetrySend.js#436

If you run it you'll see error messages that look like this right after the following test starts:

 0:06.57 LOG: Thread-1 INFO "CONSOLE_MESSAGE: (error) [JavaScript Error: "1511336008267	Toolkit.Telemetry	ERROR	TelemetrySend::shutdown - failed to remove observer for idle-daily: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIObserverService.removeObserver]"  nsresult: "0x80004005 (NS_ERROR_FAILURE)"  location: "JS frame :: resource://gre/modules/TelemetrySend.jsm :: shutdown :: line 754"  data: no] Stack trace: shutdown()@resource://gre/modules/TelemetrySend.jsm:754" {file: "resource://gre/modules/Log.jsm" line: 754}]
App_append@resource://gre/modules/Log.jsm:754:9
log@resource://gre/modules/Log.jsm:392:7
getLoggerWithMessagePrefix/proxy.log@resource://gre/modules/Log.jsm:507:44
error@resource://gre/modules/Log.jsm:400:5
shutdown@resource://gre/modules/TelemetrySend.jsm:756:9
async*shutdown@resource://gre/modules/TelemetrySend.jsm:211:12
_cleanupOnShutdown@resource://gre/modules/TelemetryController.jsm:804:13
async*shutdown@resource://gre/modules/TelemetryController.jsm:847:14
testShutdown@resource://gre/modules/TelemetryController.jsm:150:12
test_sendCheckOverride@/home/gsvelto/projects/build/firefox/_tests/xpcshell/toolkit/components/telemetry/tests/unit/test_TelemetrySend.js:465:9
async*asyncFunction@resource://gre/modules/Task.jsm:241:18
Task_spawn@resource://gre/modules/Task.jsm:166:12
_run_next_test@/home/gsvelto/projects/mozilla-central.asan/testing/xpcshell/head.js:1490:9
run@/home/gsvelto/projects/mozilla-central.asan/testing/xpcshell/head.js:703:9
_do_main@/home/gsvelto/projects/mozilla-central.asan/testing/xpcshell/head.js:224:3
_execute_test@/home/gsvelto/projects/mozilla-central.asan/testing/xpcshell/head.js:546:5
@-e:1:1
"

It's caused by this line trying to shutdown again w/o having called setup() first in the following test:

https://searchfox.org/mozilla-central/rev/797c93d81fe446f78babf20894f0729f15f71ee6/toolkit/components/telemetry/tests/unit/test_TelemetrySend.js#465

(this is TelemetryController.shutdown() calling TelemetrySend.shutdown() BTW)
Oops, I was looking at the output from running ./mach test toolkit/components/telemetry/tests/unit/ and so I was missing the more detailed output. I added --verbose and now I can get the full output for the whole directory. Thanks!
For my records these are the tests showing these errors:

test_MigratePendingPings
test_PingSender
test_TelemetryControllerShutdown
test_TelemetryEnvironment
test_TelemetryEvents
test_TelemetryFlagClear
test_TelemetryHistograms
test_TelemetryLateWrites
test_TelemetryLockCount
test_TelemetryLog
test_TelemetryScalars
test_TelemetrySend
test_TelemetrySession_activeTicks
test_TelemetryStopwatch
test_TelemetryStorage
test_TelemetryTimestamps
Changing the function given to do_register_cleanup fixes almost all of those tests. Then I just had to make one more change, to setup before the shutdown in test_sendCheckOverride. This seems to handle all instances of the error and seems to continue to test OK. Try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=59bd4015597f3aa0fc29ae98feb0c78e727ea64a
Assignee: nobody → agashlin
Status: NEW → ASSIGNED
Comment on attachment 8935567 [details]
Bug 1348890 - Call setup()/shutdown() consistently

Redirecting the review to a telemetry peer. Glad to see this fixed, the logs when running these tests will be a lot cleaner. Also, this try/catch block should probably go too since it was there only to accommodate for the fact that shutdown() was being called multiple times:

https://searchfox.org/mozilla-central/rev/f5f1c3f294f89cfd242c3af9eb2c40d19d5e04e7/toolkit/components/telemetry/TelemetrySend.jsm#753

It should now be possible to just remove the observers without having to worry that the operation might throw.
Attachment #8935567 - Flags: review?(gsvelto) → review?(alessio.placitelli)
Comment on attachment 8935567 [details]
Bug 1348890 - Call setup()/shutdown() consistently

https://reviewboard.mozilla.org/r/206438/#review214168

This is more of a problem with the test itself, rather than a problem with the way shutdown happens: we should not have a shutdown call without a corrensponding reset. Let's try the suggested fix and see if it works and then decide the next move :)

::: toolkit/components/telemetry/TelemetrySend.jsm:219
(Diff revision 1)
>    /**
> +   * Used only for testing purposes
> +   *
> +   * @return {Boolean} true if the module has been setup and not shutdown
> +   */
> +  shouldShutdown() {

I don't think this change is really needed: this is just a problem in the test itself.

::: toolkit/components/telemetry/tests/unit/test_TelemetrySend.js:457
(Diff revision 1)
>    // Triggering a shutdown should persist the pings.
>    await TelemetrySend.shutdown();
>    Assert.ok((await checkPingsSaved(ids)), "All pending pings should have been persisted");
>  
>    // After a restart the pings should have been found when scanning.
>    await TelemetrySend.reset();

I think the fix here should be as simple as just calling `await TelemetryController.testReset();` on this line, instead of `await TelemetrySend.reset();`.
Attachment #8935567 - Flags: review?(alessio.placitelli)
Comment on attachment 8935567 [details]
Bug 1348890 - Call setup()/shutdown() consistently

https://reviewboard.mozilla.org/r/206438/#review214252

::: toolkit/components/telemetry/TelemetrySend.jsm:219
(Diff revision 1)
>    /**
> +   * Used only for testing purposes
> +   *
> +   * @return {Boolean} true if the module has been setup and not shutdown
> +   */
> +  shouldShutdown() {

This is not directly related to the TelemetrySend test; that is handled with the change in the test on line 464 where I add an extra `setup` before calling shutdown. I'll change that per your suggestion.

The `shouldShutdown` mechanism is for all the other tests that were producing the same error, see comment 4 for a list. Many of these never run `setup`, and others explicitly run `shutdown` so that the `shutdown` registered with `do_register_cleanup` would cause errors.

If you think this belongs in a separate bug I can remove it from this one.

::: toolkit/components/telemetry/tests/unit/test_TelemetrySend.js:457
(Diff revision 1)
>    // Triggering a shutdown should persist the pings.
>    await TelemetrySend.shutdown();
>    Assert.ok((await checkPingsSaved(ids)), "All pending pings should have been persisted");
>  
>    // After a restart the pings should have been found when scanning.
>    await TelemetrySend.reset();

Thanks, that does the trick here!
Comment on attachment 8935567 [details]
Bug 1348890 - Call setup()/shutdown() consistently

https://reviewboard.mozilla.org/r/206438/#review214252

> This is not directly related to the TelemetrySend test; that is handled with the change in the test on line 464 where I add an extra `setup` before calling shutdown. I'll change that per your suggestion.
> 
> The `shouldShutdown` mechanism is for all the other tests that were producing the same error, see comment 4 for a list. Many of these never run `setup`, and others explicitly run `shutdown` so that the `shutdown` registered with `do_register_cleanup` would cause errors.
> 
> If you think this belongs in a separate bug I can remove it from this one.

I think we should rather fix the tests: in most of the cases, as you saw, it's just a matter of adding proper initialization :) This should be evaluated on a test-by-test base though. This might not be possible for *some* tests, but we can discuss about them.
This looks a bit stalled. NIs to see if this is something we should be brute forcing with our paws or if Telemetry should be guiding us through.
Flags: needinfo?(gsvelto)
Flags: needinfo?(alessio.placitelli)
This is not super-urgent on my part, but I've encountered at least another person who got confused by the error-messages-that-aren't-errors while looking for an unrelated issues.
Flags: needinfo?(gsvelto)
(In reply to David Durst [:ddurst] from comment #11)
> This looks a bit stalled. NIs to see if this is something we should be brute
> forcing with our paws or if Telemetry should be guiding us through.

This is not critical but rather a nice-to-have, so not super urgent on our part either. We could still set this up as an advanced mentored bug or prioritize it for the next round of cleanup on our end.

:agashlin, are you planning to still work on this?
Flags: needinfo?(alessio.placitelli) → needinfo?(agashlin)
I think I'll step back from this one, sorry for underdelivering here.
Assignee: agashlin → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(agashlin)
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: