Intermittent TEST-UNEXPECTED-TIMEOUT | toolkit/components/telemetry/tests/unit/test_HealthPing.js | Test timed out
Categories
(Toolkit :: Telemetry, defect, P1)
Tracking
()
People
(Reporter: intermittent-bug-filer, Assigned: chutten)
References
Details
(Keywords: intermittent-failure, Whiteboard: [stockwell disabled])
Attachments
(2 files)
Filed by: nfay [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=382409398&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/XvOrBhAzR1-y-JRGL3NRuA/runs/0/artifacts/public/logs/live_backing.log
[task 2022-06-24T09:43:10.275Z] 09:43:10 INFO - TEST-START | toolkit/components/telemetry/tests/unit/test_HealthPing.js
[task 2022-06-24T09:48:10.278Z] 09:48:10 WARNING - TEST-UNEXPECTED-TIMEOUT | toolkit/components/telemetry/tests/unit/test_HealthPing.js | Test timed out
[task 2022-06-24T09:48:10.279Z] 09:48:10 INFO - TEST-INFO took 300003ms
[task 2022-06-24T09:48:10.279Z] 09:48:10 INFO - >>>>>>>
[task 2022-06-24T09:48:10.279Z] 09:48:10 INFO - (xpcshell/head.js) | test MAIN run_test pending (1)
[task 2022-06-24T09:48:10.279Z] 09:48:10 INFO - (xpcshell/head.js) | test run_next_test 0 pending (2)
[task 2022-06-24T09:48:10.280Z] 09:48:10 INFO - (xpcshell/head.js) | test MAIN run_test finished (2)
[task 2022-06-24T09:48:10.280Z] 09:48:10 INFO - running event loop
[task 2022-06-24T09:48:10.280Z] 09:48:10 INFO - toolkit/components/telemetry/tests/unit/test_HealthPing.js | Starting setup
[task 2022-06-24T09:48:10.280Z] 09:48:10 INFO - (xpcshell/head.js) | test setup pending (2)
[task 2022-06-24T09:48:10.281Z] 09:48:10 INFO - (xpcshell/head.js) | test run_next_test 0 finished (2)
[task 2022-06-24T09:48:10.281Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (info) 1656063790409 Toolkit.Telemetry TRACE TelemetryEnvironment::constructor"
[task 2022-06-24T09:48:10.281Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (info) 1656063790440 Toolkit.Telemetry TRACE TelemetryEnvironment::_getGFXData - Only one display adapter detected."
[task 2022-06-24T09:48:10.282Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (info) 1656063790444 Toolkit.Telemetry TRACE TelemetryEnvironment::_updateSearchEngine - ignoring early call"
[task 2022-06-24T09:48:10.283Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (error) [JavaScript Error: "readUpdateConfig: Unable to read app update configuration file. Exception: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProperties.get]" nsresult: "0x80004005 (NS_ERROR_FAILURE)" location: "JS frame :: resource://gre/modules/UpdateUtils.jsm :: getConfigFilePath :: line 154" data: no]" {file: "resource://gre/modules/UpdateUtils.jsm" line: 865}]
[task 2022-06-24T09:48:10.283Z] 09:48:10 INFO - readUpdateConfig@resource://gre/modules/UpdateUtils.jsm:865:10
[task 2022-06-24T09:48:10.283Z] 09:48:10 INFO - _do_main@Z:\\task_165606228933289\\build\\tests\\xpcshell\\head.js:240:6
[task 2022-06-24T09:48:10.283Z] 09:48:10 INFO - _execute_test@Z:\\task_165606228933289\\build\\tests\\xpcshell\\head.js:597:5
[task 2022-06-24T09:48:10.284Z] 09:48:10 INFO - @-e:1:1
[task 2022-06-24T09:48:10.284Z] 09:48:10 INFO - "
[task 2022-06-24T09:48:10.284Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (error) [JavaScript Error: "1656063790447 Toolkit.Telemetry ERROR TelemetryEnvironment::EnvironmentCache - error while initializing: TypeError: can't access property "client", gBeforeShutdownBarrier is null(resource://gre/modules/AddonManager.jsm:4300:5) JS Stack trace: get beforeShutdown@AddonManager.jsm:4300:5
[task 2022-06-24T09:48:10.285Z] 09:48:10 INFO - init@TelemetryEnvironment.jsm:605:5
[task 2022-06-24T09:48:10.285Z] 09:48:10 INFO - EnvironmentCache@TelemetryEnvironment.jsm:976:29
[task 2022-06-24T09:48:10.285Z] 09:48:10 INFO - getGlobal@TelemetryEnvironment.jsm:102:26
[task 2022-06-24T09:48:10.285Z] 09:48:10 INFO - onInitialized@TelemetryEnvironment.jsm:113:12
[task 2022-06-24T09:48:10.286Z] 09:48:10 INFO - setEmptyPrefWatchlist@head.js:483:31
[task 2022-06-24T09:48:10.286Z] 09:48:10 INFO - setup@test_HealthPing.js:69:9
[task 2022-06-24T09:48:10.286Z] 09:48:10 INFO - _run_next_test/<@Z:\\task_165606228933289\\build\\tests\\xpcshell\\head.js:1766:22
[task 2022-06-24T09:48:10.286Z] 09:48:10 INFO - _run_next_test@Z:\\task_165606228933289\\build\\tests\\xpcshell\\head.js:1766:38
[task 2022-06-24T09:48:10.287Z] 09:48:10 INFO - run@Z:\\task_165606228933289\\build\\tests\\xpcshell\\head.js:812:9
[task 2022-06-24T09:48:10.287Z] 09:48:10 INFO - _do_main@Z:\\task_165606228933289\\build\\tests\\xpcshell\\head.js:240:6
[task 2022-06-24T09:48:10.287Z] 09:48:10 INFO - _execute_test@Z:\\task_165606228933289\\build\\tests\\xpcshell\\head.js:597:5
[task 2022-06-24T09:48:10.287Z] 09:48:10 INFO - @-e:1:1" {file: "resource://gre/modules/Log.jsm" line: 725}]
[task 2022-06-24T09:48:10.288Z] 09:48:10 INFO - append@resource://gre/modules/Log.jsm:725:12
[task 2022-06-24T09:48:10.288Z] 09:48:10 INFO - log@resource://gre/modules/Log.jsm:381:16
[task 2022-06-24T09:48:10.288Z] 09:48:10 INFO - getLoggerWithMessagePrefix/proxy.log@resource://gre/modules/Log.jsm:510:18
[task 2022-06-24T09:48:10.288Z] 09:48:10 INFO - error@resource://gre/modules/Log.jsm:389:10
[task 2022-06-24T09:48:10.289Z] 09:48:10 INFO - EnvironmentCache/this._initTask<@resource://gre/modules/TelemetryEnvironment.jsm:1006:17
[task 2022-06-24T09:48:10.289Z] 09:48:10 INFO - _do_main@Z:\\task_165606228933289\\build\\tests\\xpcshell\\head.js:240:6
[task 2022-06-24T09:48:10.289Z] 09:48:10 INFO - _execute_test@Z:\\task_165606228933289\\build\\tests\\xpcshell\\head.js:597:5
[task 2022-06-24T09:48:10.289Z] 09:48:10 INFO - @-e:1:1
[task 2022-06-24T09:48:10.289Z] 09:48:10 INFO - "
[task 2022-06-24T09:48:10.290Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (info) 1656063790448 Toolkit.Telemetry TRACE TelemetryEnvironment::_startWatchingPrefs - [object Map]"
[task 2022-06-24T09:48:10.290Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (info) 1656063790448 Toolkit.Telemetry TRACE TelemetryEnvironment::_stopWatchingPrefs"
[task 2022-06-24T09:48:10.290Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (info) 1656063790448 Toolkit.Telemetry TRACE TelemetryEnvironment::_updateSearchEngine - ignoring early call"
[task 2022-06-24T09:48:10.290Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (info) 1656063790448 Toolkit.Telemetry TRACE TelemetryEnvironment::_startWatchingPrefs - [object Map]"
[task 2022-06-24T09:48:10.291Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (info) 1656063790448 Toolkit.Telemetry TRACE TelemetryEnvironment::observe - aTopic: nsPref:changed, aData: toolkit.telemetry.healthping.enabled"
[task 2022-06-24T09:48:10.291Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (info) 1656063790448 Toolkit.Telemetry TRACE TelemetryController::setupTelemetry"
[task 2022-06-24T09:48:10.291Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (info) 1656063790449 Toolkit.Telemetry TRACE TelemetryController::registerJsProbes - registering builtin JS probes"
[task 2022-06-24T09:48:10.292Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (info) 1656063790449 Toolkit.Telemetry TRACE TelemetryController::registerScalarProbes - registering scalar builtin JS probes"
[task 2022-06-24T09:48:10.292Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (info) 1656063790449 Toolkit.Telemetry TRACE TelemetryController::loadProbeDefinitions - no builtin JS probe file ScalarArtifactDefinitions.json"
[task 2022-06-24T09:48:10.292Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (info) 1656063790450 Toolkit.Telemetry TRACE TelemetryReportingPolicy::setup"
[task 2022-06-24T09:48:10.293Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (info) 1656063790450 Toolkit.Telemetry CONFIG TelemetryController::enableTelemetryRecording - canRecordBase:true, canRecordExtended: true"
[task 2022-06-24T09:48:10.293Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (info) 1656063790450 Toolkit.Telemetry TRACE TelemetrySession::earlyInit"
[task 2022-06-24T09:48:10.293Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (info) 1656063790450 Toolkit.Telemetry INFO TelemetryReportingPolicy::get dataSubmissionPolicyNotifiedDate - No date stored yet."
[task 2022-06-24T09:48:10.294Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (info) 1656063790451 Toolkit.Telemetry TRACE UpdatePing::init - enabled: true"
[task 2022-06-24T09:48:10.294Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (info) 1656063790451 Toolkit.Telemetry TRACE TelemetryController::registerEventProbes - registering builtin JS Event probes"
[task 2022-06-24T09:48:10.294Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (info) 1656063790451 Toolkit.Telemetry TRACE TelemetryController::loadProbeDefinitions - no builtin JS probe file EventArtifactDefinitions.json"
[task 2022-06-24T09:48:10.295Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (error) [JavaScript Error: "readUpdateConfig: Unable to read app update configuration file. Exception: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIProperties.get]" nsresult: "0x80004005 (NS_ERROR_FAILURE)" location: "JS frame :: resource://gre/modules/UpdateUtils.jsm :: getConfigFilePath :: line 154" data: no]" {file: "resource://gre/modules/UpdateUtils.jsm" line: 865}]
[task 2022-06-24T09:48:10.295Z] 09:48:10 INFO - readUpdateConfig@resource://gre/modules/UpdateUtils.jsm:865:10
[task 2022-06-24T09:48:10.295Z] 09:48:10 INFO - _do_main@Z:\\task_165606228933289\\build\\tests\\xpcshell\\head.js:240:6
[task 2022-06-24T09:48:10.296Z] 09:48:10 INFO - _execute_test@Z:\\task_165606228933289\\build\\tests\\xpcshell\\head.js:597:5
[task 2022-06-24T09:48:10.296Z] 09:48:10 INFO - @-e:1:1
[task 2022-06-24T09:48:10.296Z] 09:48:10 INFO - "
[task 2022-06-24T09:48:10.296Z] 09:48:10 INFO - "CONSOLE_MESSAGE: (info) 1656063790469 Toolkit.Telemetry TRACE TelemetryEnvironment::_onEnvironmentChange for system-info"
Comment 1•2 years ago
|
||
Hello, this bug seems to be permafailing now. Could you take a look at it please? Thank you!
Assignee | ||
Comment 2•2 years ago
|
||
Weird. It was green when I checked: https://treeherder.mozilla.org/jobs?repo=try&tier=1%2C2%2C3&revision=a58b56af0ed03e09ed092aff82d12855d5d6f61b
Is it a clue that the configuration that's failing xpcshell-spi-nw-1
wasn't the specific one I ran? I've gone looking to see if I could add jobs of that config, but they aren't in the UI.
Assignee | ||
Comment 3•2 years ago
|
||
The logs do suggest it's failing in the same place and way (it's testing a send timeout, and for whatever reason the send succeeds despite the timeout value being set to a very low value (one millisecond) meaning that the task waiting for the timeout never sees it.)
Comment hidden (Intermittent Failures Robot) |
Comment 5•2 years ago
|
||
:barret would you like to have a go at a fix here or could we disable the test on windows until a fix is in place maybe after Chris is back from PTO?
Comment 6•2 years ago
|
||
Updated•2 years ago
|
Updated•2 years ago
|
Updated•2 years ago
|
Pushed by csabou@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/c6cf4533dfad Disable test_HealthPing.js on win10_2004 for frequent failures. r=intermittent-reviewers,MasterWayZ DONTBUILD
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 10•2 years ago
|
||
bugherder |
Updated•2 years ago
|
Updated•2 years ago
|
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 12•2 years ago
|
||
ni?me to (try to) figure out what the heck is going on here.
Assignee | ||
Comment 13•2 years ago
|
||
Interesting that this isn't win10_2004-only, and is still happening across a variety of mac platforms (just at an acceptably-infrequent intermittent rate).
Giving this another go over here. We'll see what comes up with my try run, and how it's different from the ones that used to cause the errors.
One possibility is to drop this particular case from the "heatlh" ping tests. Testing network timeouts like this is prone to failure and we could look into more unit-test-y ways to simulate that without playing with unreliable timeouts. That'll allow us to turn the rest of the test file back on across a wider base of platforms. I'd rather figure it out from base principles and get it fixed, but this is sitting in the back of my mind as I do.
Assignee | ||
Updated•2 years ago
|
Comment hidden (Intermittent Failures Robot) |
Assignee | ||
Comment 15•2 years ago
|
||
This test causes failures intermittently when left alone, and permanently if
we try to fix it.
It is decidedly an integration/systems test instead of a unit test, and our
framework isn't set up to do reliable and reproducible network errors.
The unit behaviour is covered by the other tests (it just ends up calling
recordSendFailure
with a different string), so we're losing more coverage
by skipping the test by platform than we will by removing the test entirely.
Comment 16•2 years ago
|
||
Pushed by chutten@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/d7a1dcec29c3 Remove "health" ping submission timeout test r=TravisLong
Comment 17•2 years ago
|
||
bugherder |
Assignee | ||
Comment 18•2 years ago
|
||
With volumes of failures at 0 on the 14th and 15th, I think we can call this FIXED
(there may be one last intermittent summary comment coming on Monday, though)
Updated•2 years ago
|
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Description
•