Closed Bug 1452445 Opened 6 years ago Closed 6 years ago

Intermittent toolkit/components/url-classifier/tests/unit/test_listmanager.js | xpcshell return code: 0

Categories

(Toolkit :: Safe Browsing, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla62
Tracking Status
firefox-esr52 --- unaffected
firefox-esr60 --- unaffected
firefox59 --- unaffected
firefox60 --- unaffected
firefox61 - fixed
firefox62 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: francois)

References

Details

(Keywords: intermittent-failure)

Attachments

(2 files, 1 obsolete file)

Filed by: archaeopteryx [at] coole-files.de

https://treeherder.mozilla.org/logviewer.html#?job_id=172365899&repo=autoland

https://queue.taskcluster.net/v1/task/KHJlIxxyREitN_xjTKn2rQ/runs/0/artifacts/public/logs/live_backing.log

Showed up when bug 1438671 made update errors throw. This has also been observed with yesterday's central-as-beta simulations: https://treeherder.mozilla.org/#/jobs?repo=try&revision=e4fc08a6b378f769a7e598623495db608da5c323&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable&group_state=expanded&selectedJob=172467498
Only hit on Windows.

20:42:55     INFO -  TEST-PASS | toolkit/components/url-classifier/tests/unit/test_listmanager.js | test_getGethashUrl - [test_getGethashUrl : 187] "http://localhost:4444/safebrowsing/gethash2" == "http://localhost:4444/safebrowsing/gethash2"
20:42:55     INFO -  TEST-PASS | toolkit/components/url-classifier/tests/unit/test_listmanager.js | test_getGethashUrl - [test_getGethashUrl : 189] "http://localhost:5555/safebrowsing/gethash-v4" == "http://localhost:5555/safebrowsing/gethash-v4"
20:42:55     INFO -  (xpcshell/head.js) | test run_next_test 5 pending (3)
20:42:55     INFO -  (xpcshell/head.js) | test test_getGethashUrl finished (3)
20:42:55     INFO -  (xpcshell/head.js) | test run_next_test 4 finished (2)
20:42:55     INFO -  (xpcshell/head.js) | test run_next_test 5 finished (1)
20:42:55     INFO -  exiting test
20:42:55     INFO -  PID 4392 | JavaScript strict warning: resource://testing-common/httpd.js, line 804: ReferenceError: reference to undefined property "_stopCallback"
20:42:55     INFO -  PID 4392 | !!! error running onStopped callback: TypeError: callback is not a function
20:42:55     INFO -  PID 4392 | !!! error running onStopped callback: TypeError: callback is not a function
20:42:55     INFO -  "CONSOLE_MESSAGE: (warn) [JavaScript Warning: "ReferenceError: reference to undefined property "_stopCallback"" {file: "resource://testing-common/httpd.js" line: 804}]"
20:42:55     INFO -  PID 4392 | listmanager: 20:42:55 GMT+0000 (Coordinated Universal Time): update error for test-phish-proto,test-unwanted-proto from http://localhost:5555/safebrowsing/update?: NS_ERROR_UC_UPDATE_FAIL_TO_WRITE_DISK
20:42:55     INFO -  "[safebrowsing-update-finished] update error: NS_ERROR_UC_UPDATE_FAIL_TO_WRITE_DISK"
20:42:55     INFO -  update error: NS_ERROR_UC_UPDATE_FAIL_TO_WRITE_DISK
20:42:55     INFO -  Z:/task_1523046594/build/tests/xpcshell/tests/toolkit/components/url-classifier/tests/unit/head_urlclassifier.js:updateError:311
20:42:55     INFO -  Z:/task_1523046594/build/tests/xpcshell/tests/toolkit/components/url-classifier/tests/unit/head_urlclassifier.js:observer:459
20:42:55     INFO -  jar:file:///Z:/task_1523046594/build/application/firefox/omni.ja!/components/nsUrlClassifierListManager.js:PROT_ListManager.prototype.updateError_:625
20:42:55     INFO -  jar:file:///Z:/task_1523046594/build/application/firefox/omni.ja!/components/nsUrlClassifierLib.js:newfn:47
20:42:55     INFO -  Z:\task_1523046594\build\tests\xpcshell\head.js:_execute_test:594
20:42:55     INFO -  -e:null:1
20:42:55     INFO -  exiting test
20:42:55     INFO -  PID 4392 | JavaScript error: Z:\\task_1523046594\\build\\tests\\xpcshell\\head.js, line 723: NS_ERROR_ABORT:
20:42:55     INFO -  "CONSOLE_MESSAGE: (info) listmanager: 20:42:55 GMT+0000 (Coordinated Universal Time): update error for test-phish-proto,test-unwanted-proto from http://localhost:5555/safebrowsing/update?: NS_ERROR_UC_UPDATE_FAIL_TO_WRITE_DISK
20:42:55     INFO -  "
20:42:55     INFO -  "CONSOLE_MESSAGE: (error) [JavaScript Error: "NS_ERROR_ABORT: " {file: "Z:\\task_1523046594\\build\\tests\\xpcshell\\head.js" line: 723}]"
20:42:55     INFO -  <<<<<<<
[Tracking Requested - why for this release]:
Not yet on release or beta, but fails more often in central-as-beta simulations (25x) than on trunk (11x).
Assignee: nobody → francois
Status: NEW → ASSIGNED
Keywords: leave-open
Priority: P2 → P1
Comment on attachment 8972188 [details]
Bug 1452445 - Promote MOZ_LOG calls to NS_WARNING in LookupCacheV4.

https://reviewboard.mozilla.org/r/240852/#review247248
Attachment #8972188 - Flags: review?(gpascutto) → review+
Pushed by fmarier@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a761c0e32512
Promote MOZ_LOG calls to NS_WARNING in LookupCacheV4. r=gcp
This isn't frequent enough to need tracking. That said, are the more recent failures helping to shed light on the problem by chance?
Flags: needinfo?(francois)
The following failure from this past week:
https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=452d522b1442a945dc95a2a8f9d01fa2e3d4078f&selectedJob=180023933

hits one of the old warnings:
https://treeherder.mozilla.org/logviewer.html#?job_id=180023933&repo=mozilla-inbound&lineNumber=8508

  PID 7093 | [7093, Classifier Update] WARNING: NS_ENSURE_SUCCESS(rv, NS_ERROR_UC_UPDATE_FAIL_TO_WRITE_DISK) failed with result 0x80004004: file /builds/worker/workspace/build/src/toolkit/components/url-classifier/Classifier.cpp, line 1380
  PID 7093 | listmanager: 09:06:47 GMT-0700 (PDT): update error for test-phish-proto,test-unwanted-proto from http://localhost:5555/safebrowsing/update?: NS_ERROR_UC_UPDATE_FAIL_TO_WRITE_DISK
  "[safebrowsing-update-finished] update error: NS_ERROR_UC_UPDATE_FAIL_TO_WRITE_DISK"
  update error: NS_ERROR_UC_UPDATE_FAIL_TO_WRITE_DISK

right here:
https://searchfox.org/mozilla-central/rev/5a744713370ec47969595e369fd5125f123e6d24/toolkit/components/url-classifier/Classifier.cpp#1380

The only time that LookupCacheV4::WriteMetadata() returns NS_ERROR_ABORT is when we are shutting down and abandoning the update:
https://searchfox.org/mozilla-central/rev/5a744713370ec47969595e369fd5125f123e6d24/toolkit/components/url-classifier/LookupCacheV4.cpp#532

So it looks like the test is the problem here. Maybe what's happening is that we've had a successful update (the one we are testing) but then another one starts up just as we're shutting down and that's causing the test to fail, even though all of the test cases passed.

This does seem likely since test_getGethashUrl passes in the above failure:
https://treeherder.mozilla.org/logviewer.html#?job_id=180023933&repo=mozilla-inbound&lineNumber=8495-8503

and that's the last test case in test_listmanager.js.

The trick will be to manage the observer correctly so that it gets removed before we start shutting down. There will probably always be race conditions with unwanted extra updates, but maybe we can make it fail less often at least.
Flags: needinfo?(francois)
Attachment #8983234 - Flags: review?(gpascutto)
Attachment #8983234 - Flags: review?(gpascutto) → review+
Second patch is now ready to land and should fix this bug.
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/99f438f07d89
Fix update error observer in unit test. r=gcp
Keywords: checkin-needed
Backout by btara@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1b12b1ef63b4
Backed out changeset 99f438f07d89 for es lint failure on head_urlclassifier.js:440 CLOSED TREE
Carrying GCP's r+.

Now passes `./mach eslint toolkit/components/url-classifier/tests/unit/`
Attachment #8983234 - Attachment is obsolete: true
Attachment #8984300 - Flags: review+
Ok, the second patch should be able to land cleanly now.
Keywords: checkin-needed
Pushed by apavel@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e3d40506f24d
Fix update error observer in unit test. r=gcp
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/e3d40506f24d
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
This still occurs on try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=06407a28c6349e9803686ecb616f0279210349c1&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable&selectedJob=182921429

https://treeherder.mozilla.org/logviewer.html#?job_id=182921429&repo=try&lineNumber=7445

07:35:15     INFO -  TEST-START | toolkit/components/url-classifier/tests/unit/test_listmanager.js
07:35:18  WARNING -  TEST-UNEXPECTED-FAIL | toolkit/components/url-classifier/tests/unit/test_listmanager.js | xpcshell return code: 0
07:35:18     INFO -  TEST-INFO took 3382ms
07:35:18     INFO -  >>>>>>>
07:35:18     INFO -  PID 6962 | listmanager: 07:35:15 GMT-0700 (PDT): registering test-listmanager0-digest256 with http://localhost:4444/safebrowsing/update
07:35:18     INFO -  PID 6962 | listmanager: 07:35:15 GMT-0700 (PDT): registering test-listmanager1-digest256 with http://localhost:4444/safebrowsing/update
07:35:18     INFO -  PID 6962 | listmanager: 07:35:15 GMT-0700 (PDT): registering test-listmanager2-digest256 with http://localhost:4444/safebrowsing/update
07:35:18     INFO -  PID 6962 | listmanager: 07:35:15 GMT-0700 (PDT): registering test-phish-proto with http://localhost:5555/safebrowsing/update?
07:35:18     INFO -  PID 6962 | listmanager: 07:35:15 GMT-0700 (PDT): registering test-unwanted-proto with http://localhost:5555/safebrowsing/update?
07:35:18     INFO -  (xpcshell/head.js) | test MAIN run_test pending (1)
07:35:18     INFO -  (xpcshell/head.js) | test run_next_test 0 pending (2)
07:35:18     INFO -  (xpcshell/head.js) | test MAIN run_test finished (2)
07:35:18     INFO -  running event loop
07:35:18     INFO -  "CONSOLE_MESSAGE: (info) listmanager: 07:35:15 GMT-0700 (PDT): registering test-listmanager0-digest256 with http://localhost:4444/safebrowsing/update"
07:35:18     INFO -  "CONSOLE_MESSAGE: (info) listmanager: 07:35:15 GMT-0700 (PDT): registering test-listmanager1-digest256 with http://localhost:4444/safebrowsing/update"
07:35:18     INFO -  "CONSOLE_MESSAGE: (info) listmanager: 07:35:15 GMT-0700 (PDT): registering test-listmanager2-digest256 with http://localhost:4444/safebrowsing/update"
07:35:18     INFO -  "CONSOLE_MESSAGE: (info) listmanager: 07:35:15 GMT-0700 (PDT): registering test-phish-proto with http://localhost:5555/safebrowsing/update?"
07:35:18     INFO -  "CONSOLE_MESSAGE: (info) listmanager: 07:35:15 GMT-0700 (PDT): registering test-unwanted-proto with http://localhost:5555/safebrowsing/update?"
07:35:18     INFO -  toolkit/components/url-classifier/tests/unit/test_listmanager.js | Starting test_update_table0
07:35:18     INFO -  (xpcshell/head.js) | test test_update_table0 pending (2)
07:35:18     INFO -  PID 6962 | listmanager: 07:35:15 GMT-0700 (PDT): Disabling table updates for test-listmanager0-digest256
07:35:18     INFO -  PID 6962 | listmanager: 07:35:15 GMT-0700 (PDT): Disabling table updates for test-listmanager1-digest256
07:35:18     INFO -  PID 6962 | listmanager: 07:35:15 GMT-0700 (PDT): Disabling table updates for test-listmanager2-digest256
07:35:18     INFO -  PID 6962 | listmanager: 07:35:15 GMT-0700 (PDT): Disabling table updates for test-phish-proto
07:35:18     INFO -  PID 6962 | listmanager: 07:35:15 GMT-0700 (PDT): Enabling table updates for test-listmanager0-digest256
07:35:18     INFO -  PID 6962 | listmanager: 07:35:15 GMT-0700 (PDT): Starting managing lists
07:35:18     INFO -  (xpcshell/head.js) | test run_next_test 0 finished (2)
07:35:18     INFO -  "CONSOLE_MESSAGE: (info) No chrome package registered for chrome://branding/locale/brand.properties"
07:35:18     INFO -  "CONSOLE_MESSAGE: (info) listmanager: 07:35:15 GMT-0700 (PDT): Disabling table updates for test-listmanager0-digest256"
07:35:18     INFO -  "CONSOLE_MESSAGE: (info) listmanager: 07:35:15 GMT-0700 (PDT): Disabling table updates for test-listmanager1-digest256"
07:35:18     INFO -  "CONSOLE_MESSAGE: (info) listmanager: 07:35:15 GMT-0700 (PDT): Disabling table updates for test-listmanager2-digest256"
07:35:18     INFO -  "CONSOLE_MESSAGE: (info) listmanager: 07:35:15 GMT-0700 (PDT): Disabling table updates for test-phish-proto"
07:35:18     INFO -  "CONSOLE_MESSAGE: (info) listmanager: 07:35:15 GMT-0700 (PDT): Enabling table updates for test-listmanager0-digest256"
07:35:18     INFO -  "CONSOLE_MESSAGE: (info) listmanager: 07:35:15 GMT-0700 (PDT): Starting managing lists"
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: