Closed
Bug 1121117
Opened 9 years ago
Closed 9 years ago
Intermittent test_ocsp_timeout.js | xpcshell return code: 0 | - false == true
Categories
(Core :: Security: PSM, defect)
Tracking
()
RESOLVED
FIXED
mozilla39
Tracking | Status | |
---|---|---|
firefox37 | --- | fixed |
firefox38 | --- | fixed |
firefox39 | --- | fixed |
firefox-esr31 | --- | unaffected |
People
(Reporter: RyanVM, Assigned: Cykesiopka)
References
Details
(Keywords: intermittent-failure)
Attachments
(2 files, 2 obsolete files)
2.30 KB,
patch
|
keeler
:
review+
|
Details | Diff | Splinter Review |
1.62 KB,
patch
|
Cykesiopka
:
review+
RyanVM
:
checkin+
|
Details | Diff | Splinter Review |
08:36:10 INFO - TEST-START | security/manager/ssl/tests/unit/test_ocsp_timeout.js 08:36:21 WARNING - TEST-UNEXPECTED-FAIL | security/manager/ssl/tests/unit/test_ocsp_timeout.js | xpcshell return code: 0 08:36:21 INFO - TEST-INFO took 10525ms 08:36:21 INFO - >>>>>>> 08:36:21 INFO - (xpcshell/head.js) | test MAIN run_test pending (1) 08:36:21 INFO - (xpcshell/head.js) | test run_next_test 0 pending (2) 08:36:21 INFO - (xpcshell/head.js) | test MAIN run_test finished (2) 08:36:21 INFO - running event loop 08:36:21 INFO - security/manager/ssl/tests/unit/test_ocsp_timeout.js | Starting 08:36:21 INFO - (xpcshell/head.js) | test pending (2) 08:36:21 INFO - TEST-PASS | security/manager/ssl/tests/unit/test_ocsp_timeout.js | - true == true 08:36:21 INFO - TEST-PASS | security/manager/ssl/tests/unit/test_ocsp_timeout.js | - true == true 08:36:21 INFO - (xpcshell/head.js) | test run_next_test 0 finished (2) 08:36:21 INFO - (xpcshell/head.js) | test pending (2) 08:36:21 INFO - (xpcshell/head.js) | test finished (2) 08:36:21 INFO - (xpcshell/head.js) | test run_next_test 1 pending (2) 08:36:21 INFO - (xpcshell/head.js) | test finished (2) 08:36:21 INFO - security/manager/ssl/tests/unit/test_ocsp_timeout.js | Starting 08:36:21 INFO - (xpcshell/head.js) | test pending (2) 08:36:21 INFO - (xpcshell/head.js) | test run_next_test 2 pending (3) 08:36:21 INFO - (xpcshell/head.js) | test finished (3) 08:36:21 INFO - (xpcshell/head.js) | test run_next_test 1 finished (2) 08:36:21 INFO - security/manager/ssl/tests/unit/test_ocsp_timeout.js | Starting 08:36:21 INFO - (xpcshell/head.js) | test pending (2) 08:36:21 INFO - (xpcshell/head.js) | test run_next_test 2 finished (2) 08:36:21 INFO - "handling ocsp-stapling-none.example.com" 08:36:21 INFO - TEST-PASS | security/manager/ssl/tests/unit/test_ocsp_timeout.js | - 2153389959 == 2153389959 08:36:21 INFO - (xpcshell/head.js) | test run_next_test 3 pending (2) 08:36:21 INFO - (xpcshell/head.js) | test finished (2) 08:36:21 INFO - security/manager/ssl/tests/unit/test_ocsp_timeout.js | Starting 08:36:21 INFO - (xpcshell/head.js) | test pending (2) 08:36:21 WARNING - TEST-UNEXPECTED-FAIL | security/manager/ssl/tests/unit/test_ocsp_timeout.js | - false == true 08:36:21 INFO - C:/slave/test/build/tests/xpcshell/tests/security/manager/ssl/tests/unit/test_ocsp_timeout.js:add_tests_in_mode/<:62 08:36:21 INFO - C:\slave\test\build\tests\xpcshell\head.js:_run_next_test:1378 08:36:21 INFO - C:\slave\test\build\tests\xpcshell\head.js:do_execute_soon/<.run:640 08:36:21 INFO - C:\slave\test\build\tests\xpcshell\head.js:_do_main:208 08:36:21 INFO - C:\slave\test\build\tests\xpcshell\head.js:_execute_test:500 08:36:21 INFO - -e:null:1 08:36:21 INFO - null:null:0 08:36:21 INFO - exiting test 08:36:21 INFO - 2147500036 08:36:21 INFO - C:\slave\test\build\tests\xpcshell\head.js:_run_next_test:1380 08:36:21 INFO - C:\slave\test\build\tests\xpcshell\head.js:do_execute_soon/<.run:640 08:36:21 INFO - C:\slave\test\build\tests\xpcshell\head.js:_do_main:208 08:36:21 INFO - C:\slave\test\build\tests\xpcshell\head.js:_execute_test:500 08:36:21 INFO - -e:null:1 08:36:21 INFO - null:null:0 08:36:21 INFO - exiting test 08:36:21 INFO - (xpcshell/head.js) | test run_next_test 3 finished (2) 08:36:21 INFO - <<<<<<<
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 33•9 years ago
|
||
I've been investigating this issue every once in a while (e.g. https://treeherder.mozilla.org/#/jobs?repo=try&revision=091fd641644c ). It would be nice to see exactly what the time difference is, so this patch adds some print logging to reveal what the exact values are.
Attachment #8572211 -
Flags: review?(dkeeler)
Comment on attachment 8572211 [details] [diff] [review] bug1121117_test_ocsp_timeout-add-logging_v1.patch Review of attachment 8572211 [details] [diff] [review]: ----------------------------------------------------------------- Sounds good to me.
Attachment #8572211 -
Flags: review?(dkeeler) → review+
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 37•9 years ago
|
||
Thanks for the review. https://treeherder.mozilla.org/#/jobs?repo=try&revision=c97b0d327cb5
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 39•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/c6d4680f1f5a
Keywords: checkin-needed
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 53•9 years ago
|
||
After chasing one or two red herrings, it looks like the non-monotonicity of Date.now() is the issue here (see https://treeherder.mozilla.org/#/jobs?repo=try&revision=0bdd62e8d5c1 where I compared Date.now() to what should be a monotonic time source). I'm not aware of any way to access a monotonic time source in a xpcshell test other than abusing nsITelemetry::msSinceProcessStart(), so this patch just adds a 300ms fuzz to the assertion. So far, the lowest time difference has been 9765ms, so 300ms fuzz should hopefully be enough. Try push looks good: https://treeherder.mozilla.org/#/jobs?repo=try&revision=09b4269a5353 (BuildAPI says this is all green, don't know why TH still says some jobs are running).
Attachment #8578942 -
Flags: review?(dkeeler)
Comment on attachment 8578942 [details] [diff] [review] bug1121117_add-fuzz_v1.patch Review of attachment 8578942 [details] [diff] [review]: ----------------------------------------------------------------- ::: security/manager/ssl/tests/unit/test_ocsp_timeout.js @@ +65,5 @@ > // With OCSP hard-fail on, we timeout after 10 seconds. > // With OCSP soft-fail, we timeout after 2 seconds. > + // Date.now() is not guaranteed to be monotonic, so add extra fuzz time > + // to prevent intermittent failures (see Bug 1121117). > + const FUZZ_MS = 300; Seems reasonable. Let's also mention that this only appeared to be a problem on Win XP, so when we no longer support that platform, we can revert this. Alternatively, do you know of a reason we shouldn't use nsITelemetry.msSinceProcessStart() as in the debugging patch?
Attachment #8578942 -
Flags: review?(dkeeler) → review+
Assignee | ||
Comment 55•9 years ago
|
||
(In reply to David Keeler [:keeler] (use needinfo?) from comment #54) > Alternatively, do you know of a reason we shouldn't use > nsITelemetry.msSinceProcessStart() as in the debugging patch? Other than "it felt hacky" to me, none. I guess this is just a test file though... I'll submit a new patch.
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 61•9 years ago
|
||
This patch implements the alternative approach of using nsITelemetry.msSinceProcessStart(). https://treeherder.mozilla.org/#/jobs?repo=try&revision=7e453e7550fe
Attachment #8578942 -
Attachment is obsolete: true
Attachment #8579927 -
Flags: review?(dkeeler)
Comment on attachment 8579927 [details] [diff] [review] bug1121117_monotonic-clock_v1.patch Review of attachment 8579927 [details] [diff] [review]: ----------------------------------------------------------------- I don't think this is quite the approach we should take. More specifically, this test will be simpler and more maintainable if we use only one timing implementation - either telemetry if it doesn't throw exceptions or regular js Date() with the fuzz factor. ::: security/manager/ssl/tests/unit/test_ocsp_timeout.js @@ +41,5 @@ > run_next_test(); > } > > function add_tests_in_mode(useHardFail) { > + const nsITelemetry = Cc["@mozilla.org/base/telemetry;1"] My feeling about this is that it should be called something like "telemetry" or "telemetryService", since "nsITelemetry" is the name of the interface. Not a big deal, though. @@ +50,5 @@ > add_test(function () { > Services.prefs.setBoolPref("security.OCSP.require", useHardFail); > + startTimeNonMonotonic = new Date(); > + try { > + startTime = nsITelemetry.msSinceProcessStart(); Does this tend to throw exceptions? (does telemetry need to be initialized earlier in the test?) I think it would be best to rely on only one timing implementation. If using telemetry is not reliable, then we'll be falling back to using Date() anyway, and without the fuzz factor we'll still see intermittent failures.
Attachment #8579927 -
Flags: review?(dkeeler) → review-
Comment hidden (Legacy TBPL/Treeherder Robot) |
Assignee | ||
Comment 64•9 years ago
|
||
(In reply to David Keeler [:keeler] (use needinfo?) from comment #62) > I don't think this is quite the approach we should take. More specifically, > this test will be simpler and more maintainable if we use only one timing > implementation - either telemetry if it doesn't throw exceptions or regular > js Date() with the fuzz factor. Sure. > My feeling about this is that it should be called something like "telemetry" > or "telemetryService", since "nsITelemetry" is the name of the interface. > Not a big deal, though. Fair. > Does this tend to throw exceptions? (does telemetry need to be initialized > earlier in the test?) I don't think this will throw often (assuming I'm understanding the current implementation correctly). > If using telemetry is not reliable, then we'll be falling back to using Date() > anyway, and without the fuzz factor we'll still see intermittent failures. Right. I guess it's simpler to just add the fuzz factor then.
Assignee | ||
Comment 65•9 years ago
|
||
This is attachment 8578942 [details] [diff] [review] with the feedback raised in comment 54 addressed.
Attachment #8579927 -
Attachment is obsolete: true
Attachment #8580465 -
Flags: review+
Assignee | ||
Comment 66•9 years ago
|
||
Comment on attachment 8580465 [details] [diff] [review] bug1121117_add-fuzz_v2.patch Thanks for the reviews! (Try push is in comment 53).
Attachment #8580465 -
Flags: checkin?
Assignee | ||
Updated•9 years ago
|
Keywords: leave-open
Reporter | ||
Updated•9 years ago
|
Attachment #8580465 -
Flags: checkin? → checkin+
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 69•9 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/8cd33e72eb47
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
status-firefox39:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
Reporter | ||
Updated•9 years ago
|
status-firefox37:
--- → affected
status-firefox38:
--- → affected
status-firefox-esr31:
--- → unaffected
Reporter | ||
Comment 70•9 years ago
|
||
https://hg.mozilla.org/releases/mozilla-beta/rev/8358c6c2c417
Flags: in-testsuite+
Reporter | ||
Comment 71•9 years ago
|
||
https://hg.mozilla.org/releases/mozilla-aurora/rev/4f87a9688961
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
You need to log in
before you can comment on or make changes to this bug.
Description
•