Closed
Bug 1373107
Opened 7 years ago
Closed 7 years ago
Intermittent dom/crypto/test/test_WebCrypto_Workers.html | Test timed out.
Categories
(Core :: DOM: Security, defect, P3)
Core
DOM: Security
Tracking
()
RESOLVED
INCOMPLETE
People
(Reporter: intermittent-bug-filer, Unassigned)
Details
(Keywords: bulk-close-intermittents, intermittent-failure, Whiteboard: [domsecurity-intermittent][stockwell unknown])
Filed by: philringnalda [at] gmail.com https://treeherder.mozilla.org/logviewer.html#?job_id=107017145&repo=mozilla-inbound https://queue.taskcluster.net/v1/task/eMbsJ-rGRoS_6C7heHLYMg/runs/0/artifacts/public/logs/live_backing.log
Updated•7 years ago
|
Priority: -- → P3
Whiteboard: [domsecurity-intermittent]
Comment hidden (Intermittent Failures Robot) |
Comment 2•7 years ago
|
||
this is a linux debug failure- which started June 14th and on the 16th/17th the failure rate increased and has been fairly consistent: https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1373107&startday=2017-06-12&endday=2017-06-23&tree=all here is a recent linux64 log: https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-central&job_id=109432872&lineNumber=4046 an related text: [task 2017-06-23T02:08:04.344963Z] 02:08:04 INFO - TEST-START | dom/crypto/test/test_WebCrypto_Workers.html [task 2017-06-23T02:08:04.609558Z] 02:08:04 INFO - GECKO(1636) | ++DOMWINDOW == 20 (0xd9069800) [pid = 1677] [serial = 36] [outer = 0xdaedc000] [task 2017-06-23T02:08:06.786626Z] 02:08:06 INFO - GECKO(1636) | --DOMWINDOW == 19 (0xd701f800) [pid = 1677] [serial = 30] [outer = (nil)] [url = http://mochi.test:8888/tests/dom/crypto/test/test_WebCrypto_RSA_PSS.html] [task 2017-06-23T02:08:06.788955Z] 02:08:06 INFO - GECKO(1636) | --DOMWINDOW == 18 (0xd7041c00) [pid = 1677] [serial = 24] [outer = (nil)] [url = http://mochi.test:8888/tests/dom/crypto/test/test_WebCrypto_Normalize.html] [task 2017-06-23T02:08:06.790412Z] 02:08:06 INFO - GECKO(1636) | --DOMWINDOW == 17 (0xd9ea3400) [pid = 1677] [serial = 22] [outer = (nil)] [url = http://mochi.test:8888/tests/dom/crypto/test/test_WebCrypto_JWK.html] [task 2017-06-23T02:08:06.792926Z] 02:08:06 INFO - GECKO(1636) | --DOMWINDOW == 16 (0xd7242000) [pid = 1677] [serial = 32] [outer = (nil)] [url = http://mochi.test:8888/tests/dom/crypto/test/test_WebCrypto_Reject_Generating_Keys_Without_Usages.html] [task 2017-06-23T02:08:06.797010Z] 02:08:06 INFO - GECKO(1636) | --DOMWINDOW == 15 (0xd701bc00) [pid = 1677] [serial = 28] [outer = (nil)] [url = http://mochi.test:8888/tests/dom/crypto/test/test_WebCrypto_RSA_OAEP.html] [task 2017-06-23T02:08:06.801651Z] 02:08:06 INFO - GECKO(1636) | --DOMWINDOW == 14 (0xd7026c00) [pid = 1677] [serial = 31] [outer = (nil)] [url = http://mochi.test:8888/tests/SimpleTest/iframe-between-tests.html] [task 2017-06-23T02:08:06.809054Z] 02:08:06 INFO - GECKO(1636) | --DOMWINDOW == 13 (0xd703c400) [pid = 1677] [serial = 20] [outer = (nil)] [url = http://mochi.test:8888/tests/dom/crypto/test/test_WebCrypto_Import_Multiple_Identical_Keys.html] [task 2017-06-23T02:08:06.811307Z] 02:08:06 INFO - GECKO(1636) | --DOMWINDOW == 12 (0xd9ed4400) [pid = 1677] [serial = 16] [outer = (nil)] [url = http://mochi.test:8888/tests/dom/crypto/test/test_WebCrypto_ECDSA.html] [task 2017-06-23T02:08:06.818212Z] 02:08:06 INFO - GECKO(1636) | --DOMWINDOW == 11 (0xd7249400) [pid = 1677] [serial = 26] [outer = (nil)] [url = http://mochi.test:8888/tests/dom/crypto/test/test_WebCrypto_PBKDF2.html] [task 2017-06-23T02:08:06.822956Z] 02:08:06 INFO - GECKO(1636) | --DOMWINDOW == 10 (0xd9e1d000) [pid = 1677] [serial = 14] [outer = (nil)] [url = http://mochi.test:8888/tests/dom/crypto/test/test_WebCrypto_ECDH.html] [task 2017-06-23T02:08:06.825758Z] 02:08:06 INFO - GECKO(1636) | --DOMWINDOW == 9 (0xe8f27c00) [pid = 1677] [serial = 8] [outer = (nil)] [url = http://mochi.test:8888/tests/dom/crypto/test/test_WebCrypto.html] [task 2017-06-23T02:08:06.829537Z] 02:08:06 INFO - GECKO(1636) | --DOMWINDOW == 8 (0xd9ee8000) [pid = 1677] [serial = 12] [outer = (nil)] [url = http://mochi.test:8888/tests/dom/crypto/test/test_WebCrypto_DH.html] [task 2017-06-23T02:08:06.831982Z] 02:08:06 INFO - GECKO(1636) | --DOMWINDOW == 7 (0xd9ea6c00) [pid = 1677] [serial = 18] [outer = (nil)] [url = http://mochi.test:8888/tests/dom/crypto/test/test_WebCrypto_HKDF.html] [task 2017-06-23T02:08:08.043324Z] 02:08:08 INFO - GECKO(1636) | [Child 1677] WARNING: Unable to find interface object on global: file /home/worker/workspace/build/src/dom/base/nsDOMClassInfo.cpp, line 1807 [task 2017-06-23T02:08:09.817066Z] 02:08:09 INFO - GECKO(1636) | MEMORY STAT | vsize 707MB | residentFast 98MB | heapAllocated 19MB [task 2017-06-23T02:08:10.359171Z] 02:08:10 INFO - TEST-OK | dom/crypto/test/test_WebCrypto_Workers.html | took 6005ms [task 2017-06-23T02:08:12.457565Z] 02:08:12 INFO - GECKO(1636) | --DOMWINDOW == 6 (0xd7041800) [pid = 1677] [serial = 33] [outer = (nil)] [url = http://mochi.test:8888/tests/SimpleTest/iframe-between-tests.html] [task 2017-06-23T02:08:21.316946Z] 02:08:21 INFO - GECKO(1636) | --DOMWINDOW == 5 (0xd724f000) [pid = 1677] [serial = 35] [outer = (nil)] [url = http://mochi.test:8888/tests/SimpleTest/iframe-between-tests.html] [task 2017-06-23T02:08:21.321160Z] 02:08:21 INFO - GECKO(1636) | --DOMWINDOW == 4 (0xd7245c00) [pid = 1677] [serial = 34] [outer = (nil)] [url = http://mochi.test:8888/tests/dom/crypto/test/test_WebCrypto_Structured_Cloning.html] [task 2017-06-23T02:12:59.271678Z] 02:12:59 INFO - GECKO(1636) | [Parent 1636] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80070057: file /home/worker/workspace/build/src/netwerk/base/nsChannelClassifier.cpp, line 316 [task 2017-06-23T02:16:59.326236Z] 02:16:59 INFO - GECKO(1636) | [Parent 1636] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80070057: file /home/worker/workspace/build/src/netwerk/base/nsChannelClassifier.cpp, line 316 [task 2017-06-23T02:16:59.345594Z] 02:16:59 INFO - GECKO(1636) | [Parent 1636] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80070057: file /home/worker/workspace/build/src/netwerk/base/nsChannelClassifier.cpp, line 316 [task 2017-06-23T02:18:24.996084Z] 02:18:24 INFO - TEST-INFO | started process screentopng [task 2017-06-23T02:18:25.605053Z] 02:18:25 INFO - TEST-INFO | screentopng: exit 0 [task 2017-06-23T02:18:25.614986Z] 02:18:25 INFO - TEST-UNEXPECTED-FAIL | dom/crypto/test/test_WebCrypto_Workers.html | Test timed out. [task 2017-06-23T02:18:25.615368Z] 02:18:25 INFO - reportError@SimpleTest/TestRunner.js:121:7 [task 2017-06-23T02:18:25.615678Z] 02:18:25 INFO - TestRunner._checkForHangs@SimpleTest/TestRunner.js:142:7 [task 2017-06-23T02:18:25.617029Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.620088Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.623350Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.625330Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.627228Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.629397Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.631383Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.633353Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.637250Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.639222Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.641271Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.643185Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.645113Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.647033Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.649124Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.651079Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.653802Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.655888Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.657765Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.659601Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.661428Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.663228Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.665073Z] 02:18:25 INFO - setTimeout handler*TestRunner._checkForHangs@SimpleTest/TestRunner.js:163:5 [task 2017-06-23T02:18:25.666913Z] 02:18:25 INFO - TestRunner.runTests@SimpleTest/TestRunner.js:380:5 [task 2017-06-23T02:18:25.670526Z] 02:18:25 INFO - RunSet.runtests@SimpleTest/setup.js:194:3 [task 2017-06-23T02:18:25.672827Z] 02:18:25 INFO - RunSet.runall@SimpleTest/setup.js:173:5 [task 2017-06-23T02:18:25.674755Z] 02:18:25 INFO - hookupTests@SimpleTest/setup.js:266:5 [task 2017-06-23T02:18:25.676572Z] 02:18:25 INFO - parseTestManifest@http://mochi.test:8888/manifestLibrary.js:36:5 [task 2017-06-23T02:18:25.678498Z] 02:18:25 INFO - getTestManifest/req.onload@http://mochi.test:8888/manifestLibrary.js:49:11 [task 2017-06-23T02:18:25.680365Z] 02:18:25 INFO - EventHandlerNonNull*getTestManifest@http://mochi.test:8888/manifestLibrary.js:45:3 [task 2017-06-23T02:18:25.682173Z] 02:18:25 INFO - hookup@SimpleTest/setup.js:246:5 [task 2017-06-23T02:18:25.684134Z] 02:18:25 INFO - EventHandlerNonNull*@http://mochi.test:8888/tests?autorun=1&closeWhenDone=1&consoleLevel=INFO&manifestFile=tests.json&dumpOutputDirectory=%2Ftmp&cleanupCrashes=true:11:1 [task 2017-06-23T02:18:25.996258Z] 02:18:25 ERROR - [SimpleTest.finish()] this test already called finish! [task 2017-06-23T02:18:26.016026Z] 02:18:26 INFO - TEST-UNEXPECTED-ERROR | dom/crypto/test/test_WebCrypto_Workers.html | called finish() multiple times [task 2017-0 in fact we get a lot more info in the log, it appears we call finish many times. This sounds like we are receiving events or stuck in a loop where we finsish but never clean up the handler. :ckerschb, I see you are the triage owner for the dom:security component, could you find someone from the dom:security team to look into this in the next couple weeks?
Flags: needinfo?(ckerschb)
Whiteboard: [domsecurity-intermittent] → [domsecurity-intermittent][stockwell needswork]
Comment 3•7 years ago
|
||
as a note, this is run at a reduced frequency via SETA, so on high load every 5th push, overall I think we average out to every 4th push. So 40 failures in 880 pushes, looks like 1 failure/22 pushes, but if we run infrequently this is more like 1 failure for every 5 or 6 pushes.
Comment 4•7 years ago
|
||
I am trying to retrigger to see if there is a smaller range of commits that caused this: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&filter-searchStr=linux%20debug%20mochitest-e10s-3&tochange=1b084196d6c8f89152212a6d3a2d9bc694c3391d&fromchange=5a73c211f58d6dca5b6938970311111aeb8c6692
Comment 5•7 years ago
|
||
the retriggers didn't find the root cause, going back another 20 changesets: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&filter-searchStr=linux%20debug%20mochitest-e10s-3&tochange=1b084196d6c8f89152212a6d3a2d9bc694c3391d&fromchange=11fc0da380a2f3fcdee7f557aef97fe031646844&selectedJob=106796851 I suspect we might have to go back further.
Comment 6•7 years ago
|
||
(In reply to Joel Maher ( :jmaher) from comment #2) > :ckerschb, I see you are the triage owner for the dom:security component, > could you find someone from the dom:security team to look into this in the > next couple weeks? Tim, any chance you could take that on?
Flags: needinfo?(ckerschb) → needinfo?(ttaubert)
Comment hidden (Intermittent Failures Robot) |
Comment 8•7 years ago
|
||
This looks weird. It reports TEST-OK for test_WebCrypto_Workers.html, so that test has finished, but then the suite just hangs for 10 minutes.
Flags: needinfo?(ttaubert)
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 11•7 years ago
|
||
:ttaubert, any chance you are making progress on this bug?
Flags: needinfo?(ttaubert)
Comment hidden (Intermittent Failures Robot) |
Comment 13•7 years ago
|
||
Looks like forced termination of workers could be a problem here. The last message printed before we time out is: > TEST-PASS | dom/crypto/test/test_WebCrypto_Workers.html | Web crypto in terminating Worker https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-beta&job_id=113339254&lineNumber=2804 That test was added in bug 1281874. Maybe Ben has an idea what's going wrong here? Maybe a deadlock of some sort?
Flags: needinfo?(ttaubert) → needinfo?(bkelly)
Comment 14•7 years ago
|
||
It looks like I can easily reproduce this locally by just running: ./mach mochitest dom/crypto/test/test_WebCrypto_Workers.html
Comment 15•7 years ago
|
||
Well, any of the dom/crypto tests seem to time out after completion. Not sure if that's an issue with the test runner/suite itself. The mochitest runner is killed after 330s of inactivity.
Comment 16•7 years ago
|
||
(In reply to Tim Taubert [:ttaubert] from comment #13) > Looks like forced termination of workers could be a problem here. The whole point of the test was to validate webcrypto handles worker shutdown correctly. In that past it just caused UAF which was quite bad. I'm sorry, but someone familiar with webcrypto needs to see where the test is getting stuck. What async step is not firing, etc. Then we can try to diagnose how this can happen in the worker shutdown code. For example, during worker shutdown WorkerRunnable::Dispatch() can fail. In cases where it needs to be infallible code should fall back to dispatching a control runnable.
Flags: needinfo?(bkelly)
Comment 17•7 years ago
|
||
(In reply to Tim Taubert [:ttaubert] from comment #15) > Well, any of the dom/crypto tests seem to time out after completion. Not > sure if that's an issue with the test runner/suite itself. The mochitest > runner is killed after 330s of inactivity. I mistakenly thought that the process would quit when running single tests, looks like it doesn't. I can't reproduce any of this on my machine, unfortunately.
Comment 18•7 years ago
|
||
Can you add debug to see if this fails in automation when the test times out? http://searchfox.org/mozilla-central/source/dom/crypto/WebCryptoTask.cpp#422
Flags: needinfo?(ttaubert)
Comment 19•7 years ago
|
||
It would also be useful to know if Cancel() is being called: http://searchfox.org/mozilla-central/source/dom/crypto/WebCryptoTask.cpp#443
Comment 20•7 years ago
|
||
Also note, we should probably change this: http://searchfox.org/mozilla-central/source/dom/crypto/WebCryptoTask.cpp#387 To use nsIGlobalObject::EventTargetFor() so the runnables are properly labeled on main thread. This should work for Workers as well as of today's inbound.
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 23•7 years ago
|
||
(In reply to Ben Kelly [:bkelly] from comment #18) > Can you add debug to see if this fails in automation when the test times out? > > http://searchfox.org/mozilla-central/source/dom/crypto/WebCryptoTask.cpp#422 I unfortunately don't think I'll have time to do this in the next weeks, or probably until 57. I agree that the best way forward here might be to add a bunch of debug info and either just land them or push to try and retrigger until we hit it.
Flags: needinfo?(ttaubert)
Comment 24•7 years ago
|
||
I was actually hoping this would be fixed by bug 1340941, but it seems we've had a couple triggers since that landed.
Updated•7 years ago
|
Whiteboard: [domsecurity-intermittent][stockwell needswork] → [domsecurity-intermittent][stockwell unknown]
Updated•7 years ago
|
Status: NEW → RESOLVED
Closed: 7 years ago
Keywords: bulk-close-intermittents
Resolution: --- → INCOMPLETE
You need to log in
before you can comment on or make changes to this bug.
Description
•