Closed Bug 967372 Opened 6 years ago Closed 6 years ago

Intermittent test_hawk.js | null == 500 - See following stack:

Categories

(Firefox :: Sync, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
Firefox 30
Tracking Status
firefox28 --- unaffected
firefox29 --- fixed
firefox30 --- fixed
firefox-esr24 --- unaffected

People

(Reporter: philor, Assigned: jedp)

References

Details

(Keywords: intermittent-failure, Whiteboard: [qa-])

Attachments

(1 file, 1 obsolete file)

https://tbpl.mozilla.org/php/getParsedLog.php?id=34042981&tree=Mozilla-Inbound
Windows XP 32-bit mozilla-inbound pgo test xpcshell on 2014-02-03 21:03:09 PST for push ce3f48acc244
slave: t-xp32-ix-031

21:12:54     INFO -  Retrying tests that failed when run in parallel.
21:12:54     INFO -  TEST-INFO | C:\slave\test\build\tests\xpcshell\tests\services\common\tests\unit\test_hawk.js | running test ...
21:12:54  WARNING -  TEST-UNEXPECTED-FAIL | C:\slave\test\build\tests\xpcshell\tests\services\common\tests\unit\test_hawk.js | test failed (with xpcshell return code: 0), see following log:
21:12:54     INFO -  >>>>>>>
...
21:12:54     INFO -  Services.Common.RESTRequest	TRACE	onStartRequest: GET http://localhost:3034/maybe\x0D
21:12:54     INFO -  Services.Common.RESTRequest	TRACE	Channel for GET http://localhost:3034/maybe returned status code 2147500036\x0D
21:12:54     INFO -  Sync.RESTResponse	DEBUG	Caught exception fetching HTTP status code:Component returned failure code: 0x80040111 (NS_ERROR_NOT_AVAILABLE) [nsIHttpChannel.responseStatus] Stack trace: RESTResponse.prototype.status()@resource://services-common/rest.js:621 < onComplete()@resource://gre/modules/services-common/hawk.js:145 < onStopRequest()@resource://services-common/rest.js:445 < <file:unknown>\x0D
21:12:54     INFO -  Sync.RESTResponse	DEBUG	Caught exception fetching HTTP status text:Component returned failure code: 0x80040111 (NS_ERROR_NOT_AVAILABLE) [nsIHttpChannel.responseStatusText] Stack trace: RESTResponse.prototype.statusText()@resource://services-common/rest.js:637 < onComplete()@resource://gre/modules/services-common/hawk.js:148 < onStopRequest()@resource://services-common/rest.js:445 < <file:unknown>\x0D
21:12:54     INFO -  Sync.RESTResponse	DEBUG	Caught exception fetching HTTP status text:Component returned failure code: 0x80040111 (NS_ERROR_NOT_AVAILABLE) [nsIHttpChannel.responseStatusText] Stack trace: RESTResponse.prototype.statusText()@resource://services-common/rest.js:637 < HawkClient.prototype._constructError()@resource://gre/modules/services-common/hawk.js:70 < onComplete()@resource://gre/modules/services-common/hawk.js:154 < onStopRequest()@resource://services-common/rest.js:445 < <file:unknown>\x0D
21:12:54     INFO -  Sync.RESTResponse	DEBUG	Caught exception fetching HTTP status code:Component returned failure code: 0x80040111 (NS_ERROR_NOT_AVAILABLE) [nsIHttpChannel.responseStatus] Stack trace: RESTResponse.prototype.status()@resource://services-common/rest.js:621 < HawkClient.prototype._constructError()@resource://gre/modules/services-common/hawk.js:71 < onComplete()@resource://gre/modules/services-common/hawk.js:154 < onStopRequest()@resource://services-common/rest.js:445 < <file:unknown>\x0D
21:12:54     INFO -  Sync.RESTResponse	DEBUG	Caught exception fetching HTTP status code:Component returned failure code: 0x80040111 (NS_ERROR_NOT_AVAILABLE) [nsIHttpChannel.responseStatus] Stack trace: RESTResponse.prototype.status()@resource://services-common/rest.js:621 < HawkClient.prototype._constructError()@resource://gre/modules/services-common/hawk.js:72 < onComplete()@resource://gre/modules/services-common/hawk.js:154 < onStopRequest()@resource://services-common/rest.js:445 < <file:unknown>\x0D
21:12:54     INFO -  \x0D
21:12:54  WARNING -  TEST-UNEXPECTED-FAIL | C:/slave/test/build/tests/xpcshell/tests/services/common/tests/unit/test_hawk.js | null == 500 - See following stack:
21:12:54     INFO -  JS frame :: C:/slave/test/build/tests/xpcshell/tests/services/common/tests/unit/test_hawk.js :: test_401_then_500 :: line 448
21:12:54     INFO -  JS frame :: resource://gre/modules/Task.jsm :: TaskImpl_run :: line 234
21:12:54     INFO -  JS frame :: resource://gre/modules/Promise.jsm :: Handler.prototype.process :: line 770
21:12:54     INFO -  JS frame :: resource://gre/modules/Promise.jsm :: this.PromiseWalker.walkerLoop :: line 531
21:12:54     INFO -  native frame :: <unknown filename> :: <TOP_LEVEL> :: line 0
21:12:54     INFO -  \x0D
21:12:54     INFO -  TEST-INFO | (xpcshell/head.js) | exiting test
21:12:54     INFO -  \x0D
21:12:54  WARNING -  TEST-UNEXPECTED-FAIL | resource://gre/modules/commonjs/sdk/core/promise.js | Unexpected exception 2147500036
21:12:54     INFO -  \x0D
21:12:54     INFO -  TEST-INFO | (xpcshell/head.js) | exiting test
21:12:54     INFO -  !!! error running onStopped callback: TypeError: callback is not a function\x0D
21:12:54     INFO -  <<<<<<<

https://tbpl.mozilla.org/php/getParsedLog.php?id=34031955&tree=B2g-Inbound
Ubuntu VM 12.04 x64 b2g-inbound opt test xpcshell on 2014-02-03 17:06:04 PST for push d26973275ad0
slave: tst-linux64-spot-077

17:31:52     INFO -  Retrying tests that failed when run in parallel.
17:31:52     INFO -  TEST-INFO | /builds/slave/test/build/tests/xpcshell/tests/services/common/tests/unit/test_hawk.js | running test ...
17:31:53  WARNING -  TEST-UNEXPECTED-FAIL | /builds/slave/test/build/tests/xpcshell/tests/services/common/tests/unit/test_hawk.js | test failed (with xpcshell return code: 0), see following log:
17:31:53     INFO -  >>>>>>>
...
17:31:53     INFO -  TEST-PASS | /builds/slave/test/build/tests/xpcshell/tests/services/common/tests/unit/test_hawk.js | [test_401_then_500/server<["/maybe"] : 425] true == true
17:31:53     INFO -  Services.Common.RESTRequest	TRACE	onStartRequest: GET http://localhost:35014/maybe
17:31:53     INFO -  Services.Common.RESTRequest	TRACE	Channel for GET http://localhost:35014/maybe returned status code 2147500036
17:31:53     INFO -  Sync.RESTResponse	DEBUG	Caught exception fetching HTTP status code:Component returned failure code: 0x80040111 (NS_ERROR_NOT_AVAILABLE) [nsIHttpChannel.responseStatus] Stack trace: RESTResponse.prototype.status()@resource://services-common/rest.js:621 < onComplete()@resource://gre/modules/services-common/hawk.js:145 < onStopRequest()@resource://services-common/rest.js:445 < <file:unknown>
17:31:53     INFO -  Sync.RESTResponse	DEBUG	Caught exception fetching HTTP status text:Component returned failure code: 0x80040111 (NS_ERROR_NOT_AVAILABLE) [nsIHttpChannel.responseStatusText] Stack trace: RESTResponse.prototype.statusText()@resource://services-common/rest.js:637 < onComplete()@resource://gre/modules/services-common/hawk.js:148 < onStopRequest()@resource://services-common/rest.js:445 < <file:unknown>
17:31:53     INFO -  Sync.RESTResponse	DEBUG	Caught exception fetching HTTP status text:Component returned failure code: 0x80040111 (NS_ERROR_NOT_AVAILABLE) [nsIHttpChannel.responseStatusText] Stack trace: RESTResponse.prototype.statusText()@resource://services-common/rest.js:637 < HawkClient.prototype._constructError()@resource://gre/modules/services-common/hawk.js:70 < onComplete()@resource://gre/modules/services-common/hawk.js:154 < onStopRequest()@resource://services-common/rest.js:445 < <file:unknown>
17:31:53     INFO -  Sync.RESTResponse	DEBUG	Caught exception fetching HTTP status code:Component returned failure code: 0x80040111 (NS_ERROR_NOT_AVAILABLE) [nsIHttpChannel.responseStatus] Stack trace: RESTResponse.prototype.status()@resource://services-common/rest.js:621 < HawkClient.prototype._constructError()@resource://gre/modules/services-common/hawk.js:71 < onComplete()@resource://gre/modules/services-common/hawk.js:154 < onStopRequest()@resource://services-common/rest.js:445 < <file:unknown>
17:31:53     INFO -  Sync.RESTResponse	DEBUG	Caught exception fetching HTTP status code:Component returned failure code: 0x80040111 (NS_ERROR_NOT_AVAILABLE) [nsIHttpChannel.responseStatus] Stack trace: RESTResponse.prototype.status()@resource://services-common/rest.js:621 < HawkClient.prototype._constructError()@resource://gre/modules/services-common/hawk.js:72 < onComplete()@resource://gre/modules/services-common/hawk.js:154 < onStopRequest()@resource://services-common/rest.js:445 < <file:unknown>
17:31:53  WARNING -  TEST-UNEXPECTED-FAIL | /builds/slave/test/build/tests/xpcshell/tests/services/common/tests/unit/test_hawk.js | null == 500 - See following stack:
17:31:53     INFO -  JS frame :: /builds/slave/test/build/tests/xpcshell/tests/services/common/tests/unit/test_hawk.js :: test_401_then_500 :: line 448
17:31:53     INFO -  JS frame :: resource://gre/modules/Task.jsm :: TaskImpl_run :: line 234
17:31:53     INFO -  JS frame :: resource://gre/modules/Promise.jsm :: Handler.prototype.process :: line 770
17:31:53     INFO -  JS frame :: resource://gre/modules/Promise.jsm :: this.PromiseWalker.walkerLoop :: line 531
17:31:53     INFO -  native frame :: <unknown filename> :: <TOP_LEVEL> :: line 0
17:31:53     INFO -  TEST-INFO | (xpcshell/head.js) | exiting test
17:31:53  WARNING -  TEST-UNEXPECTED-FAIL | resource://gre/modules/commonjs/sdk/core/promise.js | Unexpected exception 2147500036
17:31:53     INFO -  TEST-INFO | (xpcshell/head.js) | exiting test
17:31:53     INFO -  !!! error running onStopped callback: TypeError: callback is not a function
17:31:53     INFO -  <<<<<<<
I'll take a look.

05:29:44     INFO -  JS frame :: /builds/slave/talos-slave/test/build/tests/xpcshell/tests/services/common/tests/unit/test_hawk.js :: test_401_then_500 :: line 448
Assignee: nobody → jparsons
This looks similar to what we're wrestling with in Bug 943521, where we also have a lot of separate http servers in the test suite, and, on Windows, we are intermittently seeing a pile of these NS_ERROR_NOT_AVAILABLE errors before things blow up.
Hi, Gregory,

Browsing through old bugs, I see you've had to deal with similar intermittent failures with http servers on windows tests.  Is there anything you can recommend I might do to fix this issue here?
Flags: needinfo?(gps)
This is a very intermittent intermittent - only four reports in the last week.  Hope this patch alleviates some of the problem, but damn hard to test.

Tim, you're my partner in crime on these intermittent windows bugs.  Do you think this patch would help?
Flags: needinfo?(ttaubert)
Here's another tbpl, since other changes to fxa have landed in the interim:

https://tbpl.mozilla.org/?tree=Try&rev=9d0897b3258f
(In reply to Jed Parsons (use needinfo, please) [:jedp, :jparsons] from comment #9)
> Tim, you're my partner in crime on these intermittent windows bugs.  Do you
> think this patch would help?

It looks a lot like this is caused by superfluous run_next_test() calls as well. Your patch looks good to me and it seems we should give this a try. There's not much harm in pushing a test change, we can just reopen this issue if it's not fixed.
Flags: needinfo?(ttaubert)
Comment on attachment 8372925 [details] [diff] [review]
Fix intermittent windows failures for test_hawk.js (WIP)

Review of attachment 8372925 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/common/tests/unit/test_hawk.js
@@ +276,5 @@
>        // Second time through, timestamp should be corrected by client
>        do_check_true(delta < MINUTE_MS);
>        let message = "i love you!!!";
>        response.setStatusLine(request.httpVersion, 200, "OK");
> +      return response.bodyOutputStream.write(message, message.length);

This change seems unnecessary?

@@ +422,5 @@
>        // And fail on the client
>        do_check_true(delta < MINUTE_MS);
>        let message = "Cannot get ye flask.";
>        response.setStatusLine(request.httpVersion, 500, "Internal server error");
> +      return response.bodyOutputStream.write(message, message.length);

This as well.
Attachment #8372925 - Flags: review+
I thought we fixed issues with HTTP servers on Windows a long time ago.

BTW, I filed bug 972093 to make the xpcshell harness bail hard if run_next_test() is called from within add_task().
Flags: needinfo?(gps)
(In reply to Tim Taubert [:ttaubert] from comment #11)
> (In reply to Jed Parsons (use needinfo, please) [:jedp, :jparsons] from
> comment #9)
> > Tim, you're my partner in crime on these intermittent windows bugs.  Do you
> > think this patch would help?
> 
> It looks a lot like this is caused by superfluous run_next_test() calls as
> well. Your patch looks good to me and it seems we should give this a try.
> There's not much harm in pushing a test change, we can just reopen this
> issue if it's not fixed.

Thanks, Tim.  Sounds good to me.
(In reply to Tim Taubert [:ttaubert] from comment #12)
> Comment on attachment 8372925 [details] [diff] [review]
> Fix intermittent windows failures for test_hawk.js (WIP)
> 
> Review of attachment 8372925 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: services/common/tests/unit/test_hawk.js
> @@ +276,5 @@
> >        // Second time through, timestamp should be corrected by client
> >        do_check_true(delta < MINUTE_MS);
> >        let message = "i love you!!!";
> >        response.setStatusLine(request.httpVersion, 200, "OK");
> > +      return response.bodyOutputStream.write(message, message.length);
> 
> This change seems unnecessary?

For flow control, I think I need to return from within the 'if' clause so I don't fall through to the tests on the subsequent lines.  I've moved the 'return' to the subsequent line to make it clear that this is a flow-control thing.

> @@ +422,5 @@
> >        // And fail on the client
> >        do_check_true(delta < MINUTE_MS);
> >        let message = "Cannot get ye flask.";
> >        response.setStatusLine(request.httpVersion, 500, "Internal server error");
> > +      return response.bodyOutputStream.write(message, message.length);
> 
> This as well.

Ditto as above.

I hope I'm not missing something very obvious?

Thank you as always for your review!  Let's try this and see if the complaints stop coming.
(In reply to Gregory Szorc [:gps] from comment #13)
> I thought we fixed issues with HTTP servers on Windows a long time ago.
> 
> BTW, I filed bug 972093 to make the xpcshell harness bail hard if
> run_next_test() is called from within add_task().

Thanks, Gregory.

Cheers
j
https://hg.mozilla.org/integration/fx-team/rev/606f48ad0121
Flags: in-testsuite+
Keywords: checkin-needed
Whiteboard: [fixed-in-fx-team]
https://hg.mozilla.org/mozilla-central/rev/606f48ad0121
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Whiteboard: [fixed-in-fx-team]
Target Milestone: --- → Firefox 30
Looks like the try in Comment 20 was against inbound, and this patch hasn't landed there yet.  So not re-opening this bug at this point.
Whiteboard: [qa-]
You need to log in before you can comment on or make changes to this bug.