Closed Bug 1183008 Opened 4 years ago Closed 8 months ago

Intermittent test_ocsp_caching.js | Test timed out

Categories

(Core :: Security: PSM, defect, P3)

defect

Tracking

()

RESOLVED FIXED
Tracking Status
firefox56 --- disabled
firefox57 --- fixed

People

(Reporter: cbook, Assigned: mgoodwin)

References

()

Details

(Keywords: intermittent-failure, Whiteboard: [psm-intermittent][fennec-scouting][stockwell disabled])

Attachments

(2 files)

https://treeherder.mozilla.org/logviewer.html#?job_id=3731251&repo=fx-team

09:33:59 WARNING - TEST-UNEXPECTED-TIMEOUT | security/manager/ssl/tests/unit/test_ocsp_caching.js | Test timed out
This is happening frequently on multiple platforms. Can you please take a look? :)
Component: DOM: Security → Security: PSM
Flags: needinfo?(cykesiopka.bmo)
(In reply to David Keeler [:keeler] (use needinfo?) from comment #19)
> bug 1159155 might be related.

Could be... want me to take this?
Flags: needinfo?(dkeeler)
That sounds good, yes.
Flags: needinfo?(dkeeler)
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #16)
> This is happening frequently on multiple platforms. Can you please take a
> look? :)

Looks like mgoodwin will look into this.
Flags: needinfo?(cykesiopka.bmo)
Assignee: nobody → mgoodwin
Status: NEW → ASSIGNED
Mass whiteboard change to annotate PSM intermittent test failures as [psm-intermittent]. Filter on 31b932bd-1aad-4e29-9f4b-4cd864a3ffdc if that's important to you.
Whiteboard: [psm-intermittent]
Bulk assigning P3 to all open intermittent bugs without a priority set in Firefox components per bug 1298978.
Priority: -- → P3
Whiteboard: [psm-intermittent] → [psm-intermittent][fennec-scouting]
this is primarily failing on linux64-asan and android-debug, here is a link to a log for linux64-asan:
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-central&job_id=102724367&lineNumber=2688

and I see this near the end:
[task 2017-05-29T00:12:29.480837Z] 00:12:29     INFO -  PID 10802 | found pre-defined host 'ocsp-stapling-none.example.com'
[task 2017-05-29T00:12:29.482424Z] 00:12:29     INFO -  PID 10802 | read '0x30'
[task 2017-05-29T00:12:29.483982Z] 00:12:29     INFO -  PID 10802 | sending '0'
[task 2017-05-29T00:12:29.485687Z] 00:12:29     INFO -  TEST-PASS | security/manager/ssl/tests/unit/test_ocsp_caching.js |  - Should have received ASCII '0' from server - "0" == "0"
[task 2017-05-29T00:12:29.487283Z] 00:12:29     INFO -  "handling ocsp-stapling-none.example.com"
[task 2017-05-29T00:12:29.488998Z] 00:12:29     INFO -  TEST-PASS | security/manager/ssl/tests/unit/test_ocsp_caching.js |  - Actual and expected connection result should match - 0 == 0
[task 2017-05-29T00:12:29.490687Z] 00:12:29     INFO -  TEST-PASS | security/manager/ssl/tests/unit/test_ocsp_caching.js |  - should have made 0 OCSP requests - 0 == 0
[task 2017-05-29T00:12:29.492284Z] 00:12:29     INFO -  (xpcshell/head.js) | test run_next_test 32 pending (2)
[task 2017-05-29T00:12:29.493856Z] 00:12:29     INFO -  (xpcshell/head.js) | test finished (2)
[task 2017-05-29T00:12:29.495502Z] 00:12:29     INFO -  security/manager/ssl/tests/unit/test_ocsp_caching.js | Starting
[task 2017-05-29T00:12:29.497081Z] 00:12:29     INFO -  (xpcshell/head.js) | test pending (2)
[task 2017-05-29T00:12:29.498684Z] 00:12:29     INFO -  (xpcshell/head.js) | test run_next_test 33 pending (3)
[task 2017-05-29T00:12:29.500257Z] 00:12:29     INFO -  (xpcshell/head.js) | test finished (3)
[task 2017-05-29T00:12:29.501854Z] 00:12:29     INFO -  (xpcshell/head.js) | test run_next_test 32 finished (2)
[task 2017-05-29T00:12:29.503479Z] 00:12:29     INFO -  security/manager/ssl/tests/unit/test_ocsp_caching.js | Starting
[task 2017-05-29T00:12:29.505054Z] 00:12:29     INFO -  (xpcshell/head.js) | test pending (2)
[task 2017-05-29T00:12:29.506658Z] 00:12:29     INFO -  (xpcshell/head.js) | test run_next_test 33 finished (2)
[task 2017-05-29T00:12:29.508289Z] 00:12:29     INFO -  (xpcshell/head.js) | test run_next_test 34 pending (2)
[task 2017-05-29T00:12:29.509856Z] 00:12:29     INFO -  (xpcshell/head.js) | test finished (2)
[task 2017-05-29T00:12:29.511475Z] 00:12:29     INFO -  (xpcshell/head.js) | test run_next_test 34 finished (1)
[task 2017-05-29T00:12:29.513016Z] 00:12:29     INFO -  exiting test
[task 2017-05-29T00:12:29.514556Z] 00:12:29     INFO -  <<<<<<<
[task 2017-05-29T00:12:29.516131Z] 00:12:29     INFO -  xpcshell return code: None
[task 2017-05-29T00:29:09.627438Z] 00:29:09     INFO - Automation Error: mozprocess timed out after 1000 seconds running ['/home/worker/workspace/build/venv/bin/python', '-u', '/home/worker/workspace/build/tests/xpcshell/runxpcshelltests.py', '--total-chunks', '8', '--this-chunk', '5', '--symbols-path=https://queue.taskcluster.net/v1/task/J23qNgSfS_GUEKMFkF_BNQ/artifacts/public/build/target.crashreporter-symbols.zip', '--test-plugin-path=/home/worker/workspace/build/application/firefox/plugins', '--log-raw=/home/worker/workspace/build/blobber_upload_dir/xpcshell_raw.log', '--log-errorsummary=/home/worker/workspace/build/blobber_upload_dir/xpcshell_errorsummary.log', '--utility-path=tests/bin', '--xpcshell=/home/worker/workspace/build/application/firefox/xpcshell', '--manifest=tests/xpcshell/tests/xpcshell.ini']
[task 2017-05-29T00:29:09.650332Z] 00:29:09    ERROR - timed out after 1000 seconds of no output


:mgoodwin, can you take a look at this in the next week or two?  This failure rate has been at a medium load, but seemed to have picked up more on android in the last week.
Flags: needinfo?(mgoodwin)
Whiteboard: [psm-intermittent][fennec-scouting] → [psm-intermittent][fennec-scouting][stockwell needswork]
(In reply to Joel Maher ( :jmaher) from comment #262)
> :mgoodwin, can you take a look at this in the next week or two?  This
> failure rate has been at a medium load, but seemed to have picked up more on
> android in the last week.

Sure
Flags: needinfo?(mgoodwin)
:mgoodwin- do you have any updates here?
Flags: needinfo?(mgoodwin)
checking in, I see many failures in the past week as well
(In reply to Joel Maher ( :jmaher) from comment #269)
> checking in, I see many failures in the past week as well

I'm working on this; no updates as yet.
Flags: needinfo?(mgoodwin)
thanks :mgoodwin, do mention if you have issues reproducing this.
It seems that the problem here stems from the 1200ms sleep; I've tried reducing this, the problem seems to go away and the test still seems to work as intended. I'll try to work out the rationale for the original value; if I don't agree with it (or it's no longer pertinent) I suggest we drop the sleep to, say, 500ms and see if things improve.
(In reply to Mark Goodwin [:mgoodwin] from comment #273)
> It seems that the problem here stems from the 1200ms sleep; I've tried
> reducing this, the problem seems to go away and the test still seems to work
> as intended. I'll try to work out the rationale for the original value; if I
> don't agree with it (or it's no longer pertinent) I suggest we drop the
> sleep to, say, 500ms and see if things improve.

Mark, let's put a stop to these on-going failures. Do you want to try your idea, or disable the test?
Flags: needinfo?(mgoodwin)
Pushed by gbrown@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/2253f8e4c3cf
Disable test_ocsp_caching.js on slow platforms, for ongoing intermittent failures; r=me,test-only
Keywords: leave-open
Whiteboard: [psm-intermittent][fennec-scouting][stockwell needswork] → [psm-intermittent][fennec-scouting][stockwell disabled]
Depends on: 1390131
Attached patch Bug1183008.patchSplinter Review
Unable to use mozreview as this change depends on the change from bug 1390131 which isn't landed yet.
Flags: needinfo?(mgoodwin)
Attachment #8897396 - Flags: review?(dkeeler)
Comment on attachment 8897396 [details] [diff] [review]
Bug1183008.patch

Review of attachment 8897396 [details] [diff] [review]:
-----------------------------------------------------------------

Cool - lgtm.
Attachment #8897396 - Flags: review?(dkeeler) → review+
Comment on attachment 8897396 [details] [diff] [review]
Bug1183008.patch

Review of attachment 8897396 [details] [diff] [review]:
-----------------------------------------------------------------

Oh, heh - we should probably also revert https://hg.mozilla.org/mozilla-central/rev/2253f8e4c3cf in this patch.
(In reply to David Keeler [:keeler] (use needinfo?) from comment #291)
> Comment on attachment 8897396 [details] [diff] [review]
> Bug1183008.patch
> 
> Review of attachment 8897396 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Oh, heh - we should probably also revert
> https://hg.mozilla.org/mozilla-central/rev/2253f8e4c3cf in this patch.

Yeah, makes sense.
Comment on attachment 8898370 [details]
Bug 1183008 - Modify test_ocsp_caching.js to use modified thisUpdate rather than sleep.

https://reviewboard.mozilla.org/r/169744/#review174972
Attachment #8898370 - Flags: review?(dkeeler) → review+
Pushed by mgoodwin@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6ba074159b8c
Modify test_ocsp_caching.js to use modified thisUpdate rather than sleep. r=keeler
The leave-open keyword is there and there is no activity for 6 months.
:mgoodwin, maybe it's time to close this bug?
Flags: needinfo?(mgoodwin)

From what I can tell, we fixed this in 57.

Status: ASSIGNED → RESOLVED
Closed: 8 months ago
Keywords: leave-open
Resolution: --- → FIXED
Flags: needinfo?(mgoodwin)
You need to log in before you can comment on or make changes to this bug.