Closed Bug 1382702 Opened 7 years ago Closed 7 years ago

taskcluster-windows: Half a dozen network-related xpcshell tests are failing

Categories

(Core :: Networking, defect)

defect
Not set
normal

RESOLVED FIXED

People

(Reporter: jlorenzo, Assigned: dragana)

References

Details

(Whiteboard: [necko-next][stockwell fixed:product])

Attachments

(1 file)

CONTEXT
The release engineering team is working on migrating tests to Taskcluster, as tier-1, before 56 reaches beta (August 2nd). We noticed a regression on our testing branch. I backfilled mozilla-inbound with these tests, and the regression comes from bug 1363372 and, more precisely, this changeset[1].


NEW TESTS FAILING

You can see at [2] that the following tests started to fail:
> TEST-UNEXPECTED-FAIL | dom/base/test/unit/test_error_codes.js | xpcshell return code: 0 [log…]
> TEST-UNEXPECTED-FAIL | dom/base/test/unit/test_error_codes.js | doAsyncRequest_onError - [doAsyncRequest_onError : 24] 2147500037 == 2152398861 [log…]
> TEST-UNEXPECTED-FAIL | services/common/tests/unit/test_restrequest.js | xpcshell return code: 0 [log…]
> TEST-UNEXPECTED-FAIL | services/common/tests/unit/test_restrequest.js | test_connection_refused - [test_connection_refused : 687] 2147500037 == 2152398861 [log…]
> TEST-UNEXPECTED-FAIL | services/sync/tests/unit/test_resource.js | xpcshell return code: 0 [log…]
> TEST-UNEXPECTED-FAIL | services/sync/tests/unit/test_resource.js | test - [test : 416] 2147500037 == 2152398861 [log…]
> TEST-UNEXPECTED-FAIL | services/sync/tests/unit/test_resource_async.js | xpcshell return code: 0 [log…]
> TEST-UNEXPECTED-FAIL | services/sync/tests/unit/test_resource_async.js | test_preserve_exceptions - [test_preserve_exceptions : 543] 2147500037 == 2152398861 [log…]
> TEST-UNEXPECTED-FAIL | services/sync/tests/unit/test_errorhandler_sync_checkServerError.js | xpcshell return code: 0 [log…]
> TEST-UNEXPECTED-FAIL | services/sync/tests/unit/test_errorhandler_sync_checkServerError.js | test_service_networkError - [test_service_networkError : 187] "success.sync" == "error.login.reason.network" [log…]

Note: Before [1], the job was orange due to bug 1380628. A fix is ongoing there.
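
The two numbers that recur in the failures above are nsresult values printed in decimal. As a minimal sketch (the helper name `describeNsresult` and the two-entry lookup table are illustrative and not part of any Mozilla API), they can be converted to the hexadecimal form used in the Gecko error headers. 0x80004005 is NS_ERROR_FAILURE and 0x804B000D is NS_ERROR_CONNECTION_REFUSED. Presumably the left-hand value is what the test received and the right-hand value is what it expected.

```javascript
// Illustrative lookup table: only the two values seen in this bug.
const KNOWN_NSRESULTS = new Map([
  [0x80004005, "NS_ERROR_FAILURE"],
  [0x804b000d, "NS_ERROR_CONNECTION_REFUSED"],
]);

// Hypothetical helper: format a decimal nsresult as 8-digit hex plus its name.
function describeNsresult(value) {
  const unsigned = value >>> 0; // force an unsigned 32-bit interpretation
  const hex = "0x" + unsigned.toString(16).toUpperCase().padStart(8, "0");
  const name = KNOWN_NSRESULTS.get(unsigned) || "(not in this table)";
  return `${hex} ${name}`;
}

console.log(describeNsresult(2147500037)); // 0x80004005 NS_ERROR_FAILURE
console.log(describeNsresult(2152398861)); // 0x804B000D NS_ERROR_CONNECTION_REFUSED
```

Read this way, the tests that expect a refused connection are getting a generic failure instead.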


REASONS FOR THE FAILURE?
I'm unsure what the difference is between Buildbot Windows and Taskcluster Windows. Dragana, do you see why timeouts now happen?


[1] https://hg.mozilla.org/integration/mozilla-inbound/rev/bdcd3e78b4c051dd383a26ea4172b0ccdb62feaf
[2] https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&fromchange=5b29bf0b577c67d9486aa95a09f36a39c0a67754&filter-job_type_name=xpcshell&filter-platform=win&selectedJob=115976079&tochange=59c86ba2610141e0c9fa82defdae58f919a5c6bd
Flags: needinfo?(dd.mozilla)
If I do not find the cause of these bugs (not only this one), TFO (TCP Fast Open) will most probably be disabled.

I would like to find out why these tests are failing. Can you run the tests with some logging turned on? (I do not develop on Windows, so I cannot run them locally.)
Flags: needinfo?(dd.mozilla)
Thank you for your reply. I just triggered a new xpcshell job with --verbose on it[1].

If you need to print out some variables, you can perform a try run on top of mozilla-inbound (until bug 1374589 comment 25 reaches mozilla-central), with the syntax:
> try: -b o -p win64 -u xpcshell -t none
Then on treeherder, you just need to display tier-3 jobs. 

As an example, see this treeherder job[2], run against this debug patch[3].


[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=efc67b13d1a453d4a8596cb8b9da639fb9b8be6c&filter-tier=1&filter-tier=2&filter-tier=3&selectedJob=116344431
[2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=29c43de1c5ecdcca7ac9b4a8aee832e806a8df29
[3] https://hg.mozilla.org/try/rev/29c43de1c5ecdcca7ac9b4a8aee832e806a8df29
XPCShell jobs on tc-windows have landed on central[1]. Per the public email sent to dev-planning, dev-platform, and other mailing lists[2], some Windows 10 jobs will be tier-1 on Wednesday July 26th 4pm UTC.

Network failures only occur on Windows 10[3]. Dragana, do you think it's okay to turn off TFO before this Wednesday? Or would you need more time to find bugs?


[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1374589#c26
[2] on Friday July 21st 6:59pm UTC
[3] https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=5928d905c0bc0b28f5488b236444c7d7991cf8d4&filter-platform=win&filter-tier=1&filter-tier=2&filter-tier=3&filter-job_group_symbol=tc-X
Flags: needinfo?(dd.mozilla)
(In reply to Johan Lorenzo [:jlorenzo] from comment #3)
> XPCShell jobs on tc-windows have landed on central[1]. Per the public email
> sent to dev-planning, dev-platform, and other mailing lists[2], some Windows
> 10 jobs will be tier-1 on Wednesday July 26th 4pm UTC.
> 
> Network failures only occur on Windows 10[3]. Dragana, do you think it's
> okay to turn off TFO before this Wednesday? Or would you need more time to
> find bugs?
> 

I am trying to find out how to turn on HTTP logging[1] on a try run. I think, or rather I am sure, that I can fix the test failure if I get an HTTP log for that test.

[1] https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
Flags: needinfo?(dd.mozilla)
Assignee: nobody → dd.mozilla
Status: NEW → ASSIGNED
Whiteboard: [necko-next]
This is just a temporary fix that disables TFO for these tests. They want to turn them on on the 26th, and I do not want to interfere with that plan. I will leave this bug open and dig into the failure, probably tomorrow.
Attachment #8889544 - Flags: review?(mcmanus)
Depends on: 1384792
Comment on attachment 8889544 [details] [diff] [review]
bug_1382702_v1.patch

Review of attachment 8889544 [details] [diff] [review]:
-----------------------------------------------------------------

Don't you have to reset the pref at the end of the test, as these are run in suites? (with clearUserPref or something..)

Also, this behavior should probably block TFO overall, right?
Attachment #8889544 - Flags: review?(mcmanus) → review-
(In reply to Patrick McManus [:mcmanus] from comment #7)
> Comment on attachment 8889544 [details] [diff] [review]
> bug_1382702_v1.patch
> 
> Review of attachment 8889544 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Don't you have to reset the pref at the end of the test, as these are run in
> suites? (with clearUserPref or something..)

I thought they would be cleared automatically, or is it only mochitests that clear prefs...
> 
> Also, this behavior should probably block TFO overall, right?

TFO probably will not be turned of in 56. I need one more iteration for bug 1382555, if you have time to review it, that will be good.

And we need to agree on 1384633, I have running patches for that one.

This patch is only temporary until I figure out how to turn on logging in xpcshell tests... almost there but still no logging :)
I'm not 100% sure whether xpcshell needs to be reset or not - I thought it did.

Can you either

a] find proof that xpcshell does not need a reset.. then r+

or b] just reset things.. then r+
(In reply to Patrick McManus [:mcmanus] from comment #9)
> I'm not 100% sure whether xpcshell needs to be reset or not - I thought it did.
> 
> Can you either
> 
> a] find proof that xpcshell does not need a reset.. then r+
> 
> or b] just reset things.. then r+

Unlike mochitests and reftests, each xpcshell test runs in a separate xpcshell process, and each one gets a clean profile and temp dir...so I don't think it is necessary to reset prefs.

On the other hand, I notice a lot of pref-setting xpcshell tests do reset prefs...maybe a good idea in case the harness changes, etc.?
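
The set-then-reset pattern discussed above can be sketched as follows. This is only an illustration under stated assumptions: `FakePrefs`, `registerCleanup`, `runCleanups` and `runTestWithTfoDisabled` are hypothetical stand-ins so that the snippet runs outside xpcshell, and the pref name is assumed rather than taken from the attached patch. In a real xpcshell test the pref service and the harness's own cleanup hook would take their place.

```javascript
// Hypothetical in-memory stand-in for the pref service (not a Mozilla API).
class FakePrefs {
  constructor(defaults) {
    this.defaults = new Map(Object.entries(defaults));
    this.user = new Map();
  }
  getBoolPref(name) {
    // A user value, when present, overrides the default value.
    return this.user.has(name) ? this.user.get(name) : this.defaults.get(name);
  }
  setBoolPref(name, value) {
    this.user.set(name, value);
  }
  clearUserPref(name) {
    this.user.delete(name);
  }
}

// Assumed pref name for TCP Fast Open; check the actual patch before relying on it.
const TFO_PREF = "network.tcp.tcp_fastopen_enable";

// Hypothetical cleanup registry, standing in for the harness's cleanup hook.
const cleanups = [];
function registerCleanup(fn) {
  cleanups.push(fn);
}
function runCleanups() {
  while (cleanups.length) {
    cleanups.pop()();
  }
}

// Disable TFO for the duration of testBody, then restore the default.
function runTestWithTfoDisabled(prefs, testBody) {
  prefs.setBoolPref(TFO_PREF, false);
  registerCleanup(() => prefs.clearUserPref(TFO_PREF));
  try {
    return testBody();
  } finally {
    runCleanups(); // runs even if the test body throws
  }
}

const prefs = new FakePrefs({ [TFO_PREF]: true });
runTestWithTfoDisabled(prefs, () => {
  console.log(prefs.getBoolPref(TFO_PREF)); // false while the test runs
});
console.log(prefs.getBoolPref(TFO_PREF)); // true again after cleanup
```

Registering the reset right next to the set keeps the test correct even if the harness later stops giving each test a fresh profile.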


Over in bug 1384728 (and other related bugs) we're seeing extremely frequent failures. If there's going to be any delay in resolving this, please let me know and I'll skip the affected tests.
Depends on: 1386251
Blocks: 1386251
No longer depends on: 1386251
(In reply to Johan Lorenzo [:jlorenzo] from comment #13)
> Tests are now back to green[1] thanks to [2]. Thank you very much Dragana! 
> 
> [1] https://treeherder.mozilla.org/#/jobs?repo=mozilla-central&revision=44121dbcac6a9d3ff18ed087a09b3205e5a04db1&filter-searchStr=xpcshell&filter-tier=1&filter-tier=2&filter-tier=3&filter-platform=win
> [2] https://hg.mozilla.org/mozilla-central/rev/3e01e416c117

I now know the real reason why these tests are failing. I want to run one or two additional tests with logging, and then I will open a bug and fix the tests so that they pass even with TFO turned on.
Depends on: 1386719
Whiteboard: [necko-next] → [necko-next][stockwell fixed:product]