Closed Bug 1005266 Opened 10 years ago Closed 10 years ago

Intermittent test_ocsp_timeout.js | test failed (with xpcshell return code: 0) | false == true

Categories

(Core :: Security: PSM, defect)

x86
Windows XP
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla33
Tracking Status
firefox30 --- unaffected
firefox31 --- fixed
firefox32 --- fixed
firefox33 --- fixed
firefox-esr24 --- unaffected
b2g-v2.0 --- fixed
b2g-v2.1 --- fixed

People

(Reporter: RyanVM, Assigned: keeler)

References

Details

(Keywords: intermittent-failure)

Attachments

(2 files, 3 obsolete files)

https://tbpl.mozilla.org/php/getParsedLog.php?id=38935876&tree=Fx-Team

Windows XP 32-bit fx-team opt test xpcshell on 2014-05-02 07:20:44 PDT for push d07c627b749f
slave: t-xp32-ix-059

06:28:22     INFO -  TEST-INFO | C:\slave\test\build\tests\xpcshell\tests\security\manager\ssl\tests\unit\test_ocsp_timeout.js | running test ...
06:28:33  WARNING -  TEST-UNEXPECTED-FAIL | C:\slave\test\build\tests\xpcshell\tests\security\manager\ssl\tests\unit\test_ocsp_timeout.js | test failed (with xpcshell return code: 0), see following log:
06:28:33     INFO -  >>>>>>>
06:28:33     INFO -  TEST-INFO | (xpcshell/head.js) | test MAIN run_test pending (1)
06:28:33     INFO -  TEST-INFO | (xpcshell/head.js) | test run_next_test 0 pending (2)
06:28:33     INFO -  TEST-INFO | (xpcshell/head.js) | test MAIN run_test finished (2)
06:28:33     INFO -  TEST-INFO | (xpcshell/head.js) | running event loop
06:28:33     INFO -  TEST-INFO | C:/slave/test/build/tests/xpcshell/tests/security/manager/ssl/tests/unit/test_ocsp_timeout.js | Starting
06:28:33     INFO -  TEST-INFO | (xpcshell/head.js) | test pending (2)
06:28:33     INFO -  TEST-PASS | C:/slave/test/build/tests/xpcshell/tests/security/manager/ssl/tests/unit/head_psm.js | [_getBinaryUtil : 322] true == true
06:28:33     INFO -  TEST-PASS | C:/slave/test/build/tests/xpcshell/tests/security/manager/ssl/tests/unit/head_psm.js | [_setupTLSServerTest : 364] true == true
06:28:33     INFO -  TEST-INFO | (xpcshell/head.js) | test run_next_test 0 finished (2)
06:28:33     INFO -  TEST-INFO | (xpcshell/head.js) | test pending (2)
06:28:33     INFO -  TEST-INFO | (xpcshell/head.js) | test finished (2)
06:28:33     INFO -  TEST-INFO | (xpcshell/head.js) | test run_next_test 1 pending (2)
06:28:33     INFO -  TEST-INFO | (xpcshell/head.js) | test finished (2)
06:28:33     INFO -  TEST-INFO | C:/slave/test/build/tests/xpcshell/tests/security/manager/ssl/tests/unit/test_ocsp_timeout.js | Starting
06:28:33     INFO -  TEST-INFO | (xpcshell/head.js) | test pending (2)
06:28:33     INFO -  TEST-INFO | (xpcshell/head.js) | test run_next_test 2 pending (3)
06:28:33     INFO -  TEST-INFO | (xpcshell/head.js) | test finished (3)
06:28:33     INFO -  TEST-INFO | (xpcshell/head.js) | test run_next_test 1 finished (2)
06:28:33     INFO -  TEST-INFO | C:/slave/test/build/tests/xpcshell/tests/security/manager/ssl/tests/unit/test_ocsp_timeout.js | Starting
06:28:33     INFO -  TEST-INFO | (xpcshell/head.js) | test pending (2)
06:28:33     INFO -  TEST-INFO | (xpcshell/head.js) | test run_next_test 2 finished (2)
06:28:33     INFO -  TEST-PASS | C:/slave/test/build/tests/xpcshell/tests/security/manager/ssl/tests/unit/head_psm.js | [add_connection_test/</< : 296] 2153389959 == 2153389959
06:28:33     INFO -  TEST-INFO | (xpcshell/head.js) | test run_next_test 3 pending (2)
06:28:33     INFO -  TEST-INFO | (xpcshell/head.js) | test finished (2)
06:28:33     INFO -  TEST-INFO | C:/slave/test/build/tests/xpcshell/tests/security/manager/ssl/tests/unit/test_ocsp_timeout.js | Starting
06:28:33     INFO -  TEST-INFO | (xpcshell/head.js) | test pending (2)
06:28:33  WARNING -  TEST-UNEXPECTED-FAIL | C:/slave/test/build/tests/xpcshell/tests/security/manager/ssl/tests/unit/test_ocsp_timeout.js | false == true - See following stack:
06:28:33     INFO -  JS frame :: C:/slave/test/build/tests/xpcshell/tests/security/manager/ssl/tests/unit/test_ocsp_timeout.js :: add_tests_in_mode/< :: line 46
06:28:33     INFO -  JS frame :: C:\\slave\\test\\build\\tests\\xpcshell\\head.js :: _run_next_test :: line 1462
06:28:33     INFO -  JS frame :: C:\\slave\\test\\build\\tests\\xpcshell\\head.js :: do_execute_soon/<.run :: line 554
06:28:33     INFO -  JS frame :: C:\\slave\\test\\build\\tests\\xpcshell\\head.js :: _do_main :: line 176
06:28:33     INFO -  JS frame :: C:\\slave\\test\\build\\tests\\xpcshell\\head.js :: _execute_test :: line 389
06:28:33     INFO -  JS frame :: -e :: <TOP_LEVEL> :: line 1
06:28:33     INFO -  native frame :: <unknown filename> :: <TOP_LEVEL> :: line 0
06:28:33     INFO -  TEST-INFO | (xpcshell/head.js) | exiting test
06:28:33  WARNING -  TEST-UNEXPECTED-FAIL | C:\\slave\\test\\build\\tests\\xpcshell\\head.js | 2147500036 - See following stack:
06:28:33     INFO -  JS frame :: C:\\slave\\test\\build\\tests\\xpcshell\\head.js :: _run_next_test :: line 1464
06:28:33     INFO -  JS frame :: C:\\slave\\test\\build\\tests\\xpcshell\\head.js :: do_execute_soon/<.run :: line 554
06:28:33     INFO -  JS frame :: C:\\slave\\test\\build\\tests\\xpcshell\\head.js :: _do_main :: line 176
06:28:33     INFO -  JS frame :: C:\\slave\\test\\build\\tests\\xpcshell\\head.js :: _execute_test :: line 389
06:28:33     INFO -  JS frame :: -e :: <TOP_LEVEL> :: line 1
06:28:33     INFO -  native frame :: <unknown filename> :: <TOP_LEVEL> :: line 0
06:28:33     INFO -  TEST-INFO | (xpcshell/head.js) | exiting test
06:28:33     INFO -  TEST-INFO | (xpcshell/head.js) | test run_next_test 3 finished (2)
06:28:33     INFO -  <<<<<<<
The fact that all of these happened in close proximity across different trees suggests some sort of infra dependency maybe?
This intermittent failure is #12 on OrangeFactor, and as such worthy of escalation due to https://wiki.mozilla.org/Sheriffing/Test_Disabling_Policy

Needinfo'ing an appropriate person for estimation of next steps & timeframe. The policy requires that I state that after two working days a needinfo on the module owner will be added (if they aren't already), and failing that - the test disabled (or suite hidden if disabling isn't an option) 2 working days after that. But that's an absolutely last resort :-)
Flags: needinfo?(dkeeler)
I've been investigating this (slowly) and I've made some progress, but at this point if it's a nuisance we can disable it.

Here's what I've found so far:
The failing case is when we expect an OCSP request to timeout after 10 seconds (line 46 of test_ocsp_timeout.js). The 'false == true' wasn't very helpful, so I added a line to dump the calculation that was going on in this build: https://tbpl.mozilla.org/?tree=Try&rev=aa93215561a2
As a result, I found that often the operation times out after 4 or as little as 2 seconds. I know of at least one way the test might be faulty, so I'll take a look at that. If that's not the problem, then I have no idea what's going on and I'll have to try something else.

Ryan - if disabling the test is something you would do, go ahead. I can also take care of it if need be.
Flags: needinfo?(dkeeler)
Let's see how your first attempt goes and we can decide from there what to do.
Assignee: nobody → dkeeler
Attached patch patch (obsolete) — Splinter Review
I think this fixes it: https://tbpl.mozilla.org/?tree=Try&rev=b4d886f427ea (the failures on Android and B2G are an unrelated thing I'm working on).
The problem was that the test didn't specify a timeout for the socket.
Attachment #8420233 - Flags: review?(mmc)
(In reply to David Keeler (:keeler) from comment #62)
> The problem was that the test didn't specify a timeout for the socket.

And that it didn't, you know, start actually listening.
Comment on attachment 8420233 [details] [diff] [review]
patch

Review of attachment 8420233 [details] [diff] [review]:
-----------------------------------------------------------------

This seems reasonable to me, but this test could use more comments overall. From our irc chat, please add a comment that the server socket is being used as an OCSP responder in this case, and that the point of it is to receive connections but never respond.
Attachment #8420233 - Flags: review?(mmc) → review+
Attached patch patch with comments (obsolete) — Splinter Review
Thanks, Monica. I added some comments. Here's another try run with this patch: https://tbpl.mozilla.org/?tree=Try&rev=3fe06ac9fc7b (the Android failures are unrelated).
Attachment #8420233 - Attachment is obsolete: true
Attachment #8420388 - Flags: review+
Inbound is not happy right now, so I'll wait for a sheriff to check this in.
Keywords: checkin-needed
Wes, can you push this once the tree stabilizes? :)
https://hg.mozilla.org/mozilla-central/rev/264aef76b71d
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
hrm, some of those recent stars (comment 80 and comment 81) appear to be post-landing on inbound :(
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #83)
> hrm, some of those recent stars (comment 80 and comment 81) appear to be
> post-landing on inbound :(

yeah also on other trees :(
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla32 → ---
I tried something else I thought would fix this, but to no avail: https://tbpl.mozilla.org/?tree=Try&rev=2c7eaa81bbf1
I noticed (finally) that these are almost* all on winxp 32-bit. I'm not sure it's worth the time and effort to figure out why, so my recommendation is to disable this test on that platform. Thoughts? Concerns?

*The exceptions seem to be c 17 (win 7), c 33, c 34, and c 52 (all win 6.2), none of which were after the attempted fix.
At the manifest level, I think we'd have to disable on all Windows platforms. You *might* be able to detect XP within the test itself and skip, though.
Attached patch disale test on winxp (obsolete) — Splinter Review
The try run for this is in comment 128.
(Context: Since the test is only failing on winxp, I don't think it's worth spending any more time on it.)
Attachment #8422552 - Flags: review?(mmc)
Comment on attachment 8422552 [details] [diff] [review]
disale test on winxp

Review of attachment 8422552 [details] [diff] [review]:
-----------------------------------------------------------------

That's ok, I hate WinXP, too.
Attachment #8422552 - Flags: review?(mmc) → review+
https://hg.mozilla.org/mozilla-central/rev/c92d548bd156
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
This is the #1 orange on mozilla-beta at this point (destined to be our next ESR release). I think it's time to consider disabling the test on Windows wholesale if we don't have any other ideas here.
Flags: needinfo?(dkeeler)
Yes, that's probably best at this point. Here's a patch to disable this completely on windows (this is for mozilla-central - branch patches coming soon).
Attachment #8420388 - Attachment is obsolete: true
Attachment #8422552 - Attachment is obsolete: true
Attachment #8448233 - Flags: review?(mmc)
Flags: needinfo?(dkeeler)
No need on the branch patches, I can handle the manifest change needed :P
Target Milestone: mozilla32 → ---
That patch applies on aurora. This is for beta.
Attachment #8448236 - Flags: review?(mmc)
Right, of course - I thought it might be more complicated than that, but it isn't.
Attachment #8448233 - Flags: review?(mmc) → review+
Attachment #8448236 - Flags: review?(mmc) → review+
https://hg.mozilla.org/mozilla-central/rev/efc431bf9e24
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
See Also: → 1120098
See Also: → 1121117
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: