Closed Bug 927214 Opened 12 years ago Closed 10 years ago

Intermittent test_healthreporter.js | Test timed out | test failed (with xpcshell return code: 15), see following log | false == true - See following stack

Categories

(Firefox Health Report Graveyard :: Client: Desktop, defect)

All
Windows 8
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: KWierso, Unassigned)

Details

(Keywords: intermittent-failure)

https://tbpl.mozilla.org/php/getParsedLog.php?id=29158753&tree=Mozilla-Inbound using slave: t-w864-ix-104 16:59:00 INFO - TEST-PASS | C:\slave\test\build\tests\xpcshell\tests\extensions\cookie\test\unit\test_cookies_sync_failure.js | test passed (time: 10121.000ms) 16:59:00 INFO - TEST-INFO | C:\slave\test\build\tests\xpcshell\tests\services\metrics\tests\xpcshell\test_metrics_provider_manager.js | running test ... 16:59:18 INFO - TEST-PASS | C:\slave\test\build\tests\xpcshell\tests\services\metrics\tests\xpcshell\test_metrics_provider_manager.js | test passed (time: 17939.000ms) 16:59:18 INFO - TEST-INFO | C:\slave\test\build\tests\xpcshell\tests\services\healthreport\tests\xpcshell\test_healthreporter.js | running test ... 17:04:18 WARNING - TEST-UNEXPECTED-FAIL | C:\slave\test\build\tests\xpcshell\tests\services\healthreport\tests\xpcshell\test_healthreporter.js | Test timed out 17:04:18 INFO - Can't trigger Breakpad, just killing process 17:04:18 WARNING - TEST-UNEXPECTED-FAIL | C:\slave\test\build\tests\xpcshell\tests\services\healthreport\tests\xpcshell\test_healthreporter.js | test failed (with xpcshell return code: 15), see following log: 17:04:18 INFO - >>>>>>> ... 17:04:20 INFO - Services.Metrics.ProviderManager DEBUG Pull-only providers being unregistered. 17:04:20 INFO - Services.Metrics.ProviderManager INFO Shutting down pull-only provider: DummyConstantProvider 17:04:20 INFO - Sqlite.Connection.data_submission_success DEBUG Conn #0: Stmt #159 finished. 17:04:20 INFO - TEST-PASS | C:/slave/test/build/tests/xpcshell/tests/services/healthreport/tests/xpcshell/test_healthreporter.js | [test_data_submission_success : 629] true == true 17:04:20 INFO - TEST-PASS | C:/slave/test/build/tests/xpcshell/tests/services/healthreport/tests/xpcshell/test_healthreporter.js | [test_data_submission_success : 630] true == true 17:04:20 INFO - TEST-PASS | C:/slave/test/build/tests/xpcshell/tests/services/healthreport/tests/xpcshell/test_healthreporter.js | [getValues : 76] [object Object] != null 17:04:20 INFO - TEST-PASS | C:/slave/test/build/tests/xpcshell/tests/services/healthreport/tests/xpcshell/test_healthreporter.js | [getValues : 78] [object Object] != null 17:04:20 INFO - Sqlite.Connection.data_submission_success TRACE Conn #0: Stmt #160 SELECT field_name, day, value FROM v_daily_counters WHERE measurement_id = :measurement_id - {"measurement_id":1} 17:04:20 INFO - Sqlite.Connection.data_submission_success DEBUG Conn #0: Stmt #160 finished. 17:04:20 WARNING - TEST-UNEXPECTED-FAIL | C:/slave/test/build/tests/xpcshell/tests/services/healthreport/tests/xpcshell/test_healthreporter.js | false == true - See following stack: 17:04:20 INFO - JS frame :: C:/slave/test/build/tests/xpcshell/tests/services/healthreport/tests/xpcshell/test_healthreporter.js :: getValues :: line 85 17:04:20 INFO - JS frame :: resource://gre/modules/Task.jsm :: TaskImpl_run :: line 233 17:04:20 INFO - JS frame :: resource://gre/modules/Promise.jsm :: <TOP_LEVEL> :: line 719 17:04:20 INFO - JS frame :: resource://gre/modules/Promise.jsm :: <TOP_LEVEL> :: line 483 17:04:20 INFO - native frame :: <unknown filename> :: <TOP_LEVEL> :: line 0 17:04:20 INFO - TEST-INFO | (xpcshell/head.js) | exiting test 17:04:20 WARNING - TEST-UNEXPECTED-FAIL | resource://gre/modules/commonjs/sdk/core/promise.js | Unexpected exception 2147500036 17:04:20 INFO - TEST-INFO | (xpcshell/head.js) | exiting test 17:04:20 INFO - <<<<<<<
This is weird. It appears we aren't performing the outbound HTTP request! We stall somewhere after https://hg.mozilla.org/mozilla-central/file/c7cbfa315d46/services/common/rest.js#l311. This quacks like a rest.js bug or a Necko bug. But given the frequency and the very seldom nature, I wonder if it could be something network related, such as a DNS lookup that never resolves. Sheriffs: do any of these possibilities sound like existing bugs? Here is the bad log: 19:25:16 INFO - Services.HealthReport.HealthReporter WARN Recording new remote ID: fbefeb6c-6842-4230-be5c-b1723017f044 19:25:16 INFO - Services.HealthReport.HealthReporter INFO Writing state file: c:\users\cltbld~1.t-w\appdata\local\temp\tmpefhwrk\healthreport\state-data_submission_transport_failure.json 19:25:16 INFO - Sqlite.Connection.data_submission_transport_failure TRACE Conn #0: Stmt #82 INSERT OR REPLACE INTO daily_counters VALUES (:field_id, :days, COALESCE((SELECT value FROM daily_counters WHERE field_id = :field_id AND day = :days ), 0) + :by) - {"field_id":1,"days":16030,"by":1} 19:25:16 INFO - Services.BagheeraClient INFO Uploading data to http://localhost:62013/1.0/submit/test00/fbefeb6c-6842-4230-be5c-b1723017f044 19:25:16 INFO - Services.BagheeraClient INFO Request body length: 228 19:25:16 INFO - Services.BagheeraClient DEBUG POST Length: 228 19:25:16 INFO - Sqlite.Connection.data_submission_transport_failure DEBUG Conn #0: Stmt #82 finished. 19:25:16 INFO - <<<<<<< 19:25:16 INFO - TEST-INFO | C:\slave\test\build\tests\xpcshell\tests\services\metrics\tests\xpcshell\test_metrics_provider_manager.js | running test ... 19:27:15 INFO - TEST-PASS | C:\slave\test\build\tests\xpcshell\tests\services\metrics\tests\xpcshell\test_metrics_provider_manager.js | test passed (time: 119172.000ms) 19:27:1 A proper log looks something like: 0:04.55 Services.HealthReport.HealthReporter WARN Recording new remote ID: fdc51ed5-03ae-574a-93b2-b2400cd3028b 0:04.55 Services.HealthReport.HealthReporter INFO Writing state file: /var/folders/20/0m2k6lq56kdf4q_kdgzmhjkw0000gn/T/firefox/xpcshellprofile/healthreport/state-data_submission_transport_failure.json 0:04.55 Sqlite.Connection.data_submission_transport_failure TRACE Conn #0: Stmt #82 INSERT OR REPLACE INTO daily_counters VALUES (:field_id, :days, COALESCE((SELECT value FROM daily_counters WHERE field_id = :field_id AND day = :days ), 0) + :by) - {"field_id":1,"days":16030,"by":1} 0:04.55 Services.BagheeraClient INFO Uploading data to http://localhost:62013/1.0/submit/test00/fdc51ed5-03ae-574a-93b2-b2400cd3028b 0:04.55 Services.BagheeraClient INFO Request body length: 228 0:04.57 Services.BagheeraClient DEBUG POST Length: 228 0:04.58 Sqlite.Connection.data_submission_transport_failure DEBUG Conn #0: Stmt #82 finished. 0:04.83 Services.BagheeraClient INFO Transport failure on request: NS_ERROR_CONNECTION_REFUSED Stack trace: onStopRequest()@resource://services-common/rest.js:438 < <file:unknown> 0:04.83 Services.HealthReport.HealthReporter DEBUG Received Bagheera result. 0:04.83 Sqlite.Connection.data_submission_transport_failure TRACE Conn #0: Stmt #83 INSERT OR REPLACE INTO daily_counters VALUES (:field_id, :days, COALESCE((SELECT value FROM daily_counters WHERE field_id = :field_id AND day = :days ), 0) + :by) - {"field_id":4,"days":16030,"by":1} 0:04.83 Services.Metrics.ProviderManager DEBUG Pull-only providers being unregistered. 0:04.83 Services.Metrics.ProviderManager DEBUG Waiting for in-flight pull-only provider activity to complete before unregistering. 0:04.83 0:04.83 TEST-PASS | /Users/gps/src/firefox/obj-firefox.noindex/_tests/xpcshell/services/healthreport/tests/xpcshell/test_healthreporter.js | [test_data_submission_transport_failure : 564] "failure-soft" == "failure-soft"
Flags: needinfo?(sheriffs)
Given how it sorta hit all branches at the same time, network sounds like a possibility to me. That said, time is probably going to be better served making the test more robust than getting RelEng to make the network more reliable.
Flags: needinfo?(sheriffs)
Patrick: I'm interested in your Necko opinion. Read last 2 comments. A casual audit of rest.js would be appreciated. Is there any way we can test our hypothesis of failing DNS lookup?
Flags: needinfo?(mcmanus)
it isn't clear to me what the test is doing. Is it looking up "localhost" ? That won't ever need a functioning network.
Flags: needinfo?(mcmanus)
Inactive; closing (see bug 1180138).
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME
Product: Firefox Health Report → Firefox Health Report Graveyard
You need to log in before you can comment on or make changes to this bug.