Closed Bug 1014250 Opened 8 years ago Closed 8 years ago

Intermittent test_ocsp_stapling.js | Test timed out

Categories

(Core :: Security, defect)

x86_64
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla33
Tracking Status
firefox31 --- unaffected
firefox32 --- fixed
firefox33 --- fixed
firefox-esr24 --- unaffected
b2g-v2.0 --- fixed
b2g-v2.1 --- fixed

People

(Reporter: KWierso, Assigned: keeler)

References

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

https://tbpl.mozilla.org/php/getParsedLog.php?id=40129826&tree=Mozilla-Central
Rev5 MacOSX Mountain Lion 10.8 mozilla-central opt test xpcshell on 2014-05-21 14:06:07 PDT for push 7aee2fa0f655

slave: talos-mtnlion-r5-027



14:15:38     INFO -  TEST-INFO | /builds/slave/talos-slave/test/build/tests/xpcshell/tests/security/manager/ssl/tests/unit/test_ocsp_stapling.js | running test ...
14:20:38  WARNING -  TEST-UNEXPECTED-FAIL | /builds/slave/talos-slave/test/build/tests/xpcshell/tests/security/manager/ssl/tests/unit/test_ocsp_stapling.js | Test timed out
14:37:18     INFO - Automation Error: mozprocess timed out after 1000 seconds running ['/builds/slave/talos-slave/test/build/venv/bin/python', '-u', '/builds/slave/talos-slave/test/build/tests/xpcshell/runxpcshelltests.py', '--symbols-path=https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-macosx64/1400676684/firefox-32.0a1.en-US.mac.crashreporter-symbols.zip', '--test-plugin-path=/builds/slave/talos-slave/test/build/application/FirefoxNightly.app/Contents/MacOS/plugins', '--manifest=tests/xpcshell/tests/all-test-dirs.list', '/builds/slave/talos-slave/test/build/application/FirefoxNightly.app/Contents/MacOS/xpcshell']
14:37:18    ERROR - timed out after 1000 seconds of no output
14:37:18    ERROR - Return code: -9
14:37:18     INFO - TinderboxPrint: xpcshell-xpcshell<br/><em class="testfail">T-FAIL</em>
14:37:18  WARNING - # TBPL WARNING #
David, this is high up on OrangeFactor - may you take a look? :-)
Flags: needinfo?(dkeeler)
I'll see what I can do. All I have is a 10.7 machine, I think, so I may have to upgrade.
Assignee: nobody → dkeeler
Status: NEW → ASSIGNED
Flags: needinfo?(dkeeler)
Interestingly, this consistently fails on OS X 10.7 for me. This is what I'm seeing:

Assertion failure: fd != NULL, at sslsock.c:177

Backtrace:
#0  0x00007fff8c148ce2 in __pthread_kill ()
#1  0x00007fff8a3127d2 in pthread_kill ()
#2  0x00007fff8a303a7a in abort ()
#3  0x000000010e9ab11d in PR_Assert ()
#4  0x000000010e9641c7 in ssl_FindSocket ()
#5  0x000000010e964bbb in SSL_OptionSet ()
#6  0x00000001039b129b in nsNSSSocketInfo::ActivateSSL (this=0x10fa7e280) at nsNSSIOLayer.cpp:466
#7  0x00000001039b1225 in nsNSSSocketInfo::ProxyStartSSL (this=0x10fa7e280) at nsNSSIOLayer.cpp:420
#8  0x00000001039b133c in non-virtual thunk to nsNSSSocketInfo::ProxyStartSSL() (this=0x10fa7e348) at Unified_cpp_manager_ssl_src1.cpp:421
#9  0x000000010017a199 in NS_InvokeByIndex (that=0x10fa7e348, methodIndex=5, paramCount=0, params=0x7fff5fbf6498) at xptcinvoke_x86_64_unix.cpp:162
#10 0x0000000101fbd088 in CallMethodHelper::Invoke (this=0x7fff5fbf6450) at XPCWrappedNative.cpp:2397
#11 0x0000000101fb1377 in CallMethodHelper::Call (this=0x7fff5fbf6450) at XPCWrappedNative.cpp:1738
#12 0x0000000101f902f7 in XPCWrappedNative::CallMethod (ccx=@0x7fff5fbf6638, mode=XPCWrappedNative::CALL_METHOD) at XPCWrappedNative.cpp:1705
...

DumpJSStack():
0 anonymous(aStream = [xpconnect wrapped (nsISupports, nsIOutputStream, nsIAsyncOutputStream) @ 0x112d43e00 (native @ 0x111745ca8)]) ["/Users/keeler/mozilla-central/obj-x86_64-apple-darwin11.4.2/_tests/xpcshell/security/manager/ssl/tests/unit/head_psm.js":268]
    this = [object Object]

head_psm.js:
264    // nsIOutputStreamCallback
265    onOutputStreamReady: function(aStream) {
266      let sslSocketControl = this.transport.securityInfo
267                               .QueryInterface(Ci.nsISSLSocketControl);
268      sslSocketControl.proxyStartSSL();

After bisecting, I got:
Due to skipped revisions, the first bad revision could be any of:
changeset:   183607:1d1935f80c97
user:        Patrick McManus <mcmanus@ducksong.com>
date:        Wed Apr 16 09:52:43 2014 -0400
summary:     bug 378637 part 12 - proxy over TLS (i.e. https proxying) r=hurley

changeset:   183608:ab6d433e563b
user:        Patrick McManus <mcmanus@ducksong.com>
date:        Fri May 16 11:46:12 2014 -0400
summary:     bug 378637 part 13 - https proxying changes to casting and trans lifecycle r=hurley

Querying :mcmanus for assistance.
Flags: needinfo?(mcmanus)
I bet this is the same thing as 1015898 - but this looks considerably more tractable. I'll look at it.
Flags: needinfo?(mcmanus)
(In reply to Patrick McManus [:mcmanus] from comment #45)
> I bet this is the same thing as 1015898 - but this looks considerably more
> tractable. I'll look at it.

fwiw as an update - this doesn't repro for me on the ancient osx 10.5 box I tried it on. (it took macports 6 hours to update to build the tree :)).

but the testcase is still more tractable than 1015898.

david - given that you can make ths repro easily - can you add a http log of a failure to this bug? It might show something interesting

https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
Attached file nspr.log
I generated this with NSPR_LOG_MODULES=timestamp,nsHttp:5,nsSocketTransport:5,nsStreamPump:5,nsHostResolver:5 (and then ran the test). Let me know if you want more/less.
Huh - I don't know why that binary data is there, but there's stuff after it that might be relevant.
Patrick - just wanted to make sure you saw I uploaded a log here (I realize it probably got lost in bugspam).
Flags: needinfo?(mcmanus)
(In reply to David Keeler (:keeler) [use needinfo?] from comment #76)
> Patrick - just wanted to make sure you saw I uploaded a log here (I realize
> it probably got lost in bugspam).

travel and mired in f2f stuff.. tagged for first chance I get though!
Flags: needinfo?(mcmanus)
david - since you can repro - can you try with the patch (or try build) from 1022268 ? I kind of doubt that's the right thing, but its worth ruling out.
Flags: needinfo?(dkeeler)
I'm currently in SF, which is not where that computer is. I'll have a look as soon as I can, which would be Friday.
Flags: needinfo?(dkeeler)
This bug hasn't been triggered since 1022268 landed - so I think its a high likelyhood of a dup. (I can't repro to confirm)

additionally the log from :dkeeler contains the signature identifiers from 1022268 -

2014-06-02 22:17:59.518037 UTC - 274743296[10f518790]:   trying again with next ip address

so I'm pretty confident that this is a dup - but I'd like you to confirm. (just resolve it as a dup if you agree).

the try again in the log refers to ::1 and 127.0.0.1 - so a difference in localhost ipv6 config could explain the hard to reproduce part of the issue
Happy to call this fixed by bug 1022268 - we can always reopen or file another :-)
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Depends on: 1022268
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.