Closed Bug 1373722 Opened 6 years ago Closed 5 years ago

Roll out generic-worker 10.0.5 on all win7 and win10 gecko worker types

Categories

(Taskcluster :: Services, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1399401

People

(Reporter: pmoore, Unassigned)

Details

The last worker types where generic-worker 10.0.5 hasn't landed yet are:

gecko-t-win7-32
gecko-t-win7-32-gpu
gecko-t-win10-64

It is on all other Windows worker types.

Currently running two try pushes to compare results between existing generic-worker versions on:

gecko-t-win7-32
gecko-t-win7-32-gpu
gecko-t-win10-64

versus generic-worker 10.0.5 which is running on:

gecko-t-win7-32-beta
gecko-t-win7-32-gpu-b
gecko-t-win10-64-beta



Old generic-worker:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f6c1cab32a8675f9fc8f37e027e6b80559beb70f&filter-tier=1&filter-tier=2&filter-tier=3&exclusion_profile=false&duplicate_jobs=visible&group_state=expanded

New generic-worker:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a6c1bda236f4e4793a830ba9ad2adf0b37335db7&filter-tier=1&filter-tier=2&filter-tier=3&duplicate_jobs=visible&exclusion_profile=false&group_state=expanded
I haven't got to the bottom of this yet, but dumping these links for future reference.

Task Group view (to filter out BB noise above):

OLD generic-worker:
https://tools.taskcluster.net/task-group-inspector/#/Nc_wgXK2RSy-S13BaiwUUw

NEW generic-worker:
https://tools.taskcluster.net/task-group-inspector/#/LH6pUWk7S_S48U2BuH2kSA
Of the 546 tasks that were triggered by the initial decision task (i.e. not counting retriggers etc.), 497 had the same results in both runs, and 49 had changed results.

The results that changed were:

test-windows10-64-vm/debug-jsreftest-e10s-1 failed => completed
test-windows10-64-vm/debug-mochitest-5 completed => failed
test-windows10-64-vm/debug-mochitest-browser-chrome-4 completed => failed
test-windows10-64-vm/debug-mochitest-media-1 completed => failed
test-windows10-64-vm/opt-jsreftest-e10s-2 failed => completed
test-windows10-64-vm/opt-marionette completed => failed
test-windows10-64-vm/opt-marionette-e10s completed => failed
test-windows10-64-vm/opt-mochitest-5 completed => failed
test-windows10-64-vm/opt-mochitest-browser-chrome-6 completed => failed
test-windows10-64-vm/opt-mochitest-e10s-5 completed => failed
test-windows10-64/debug-mochitest-gpu completed => failed
test-windows10-64/debug-mochitest-gpu-e10s completed => failed
test-windows10-64/debug-mochitest-webgl-1 exception => failed
test-windows10-64/debug-mochitest-webgl-2 completed => failed
test-windows10-64/debug-mochitest-webgl-3 completed => failed
test-windows10-64/debug-mochitest-webgl-e10s-1 exception => failed
test-windows10-64/debug-mochitest-webgl-e10s-2 completed => failed
test-windows10-64/debug-mochitest-webgl-e10s-3 completed => failed
test-windows10-64/debug-reftest-e10s-2 completed => failed
test-windows10-64/debug-reftest-e10s-3 completed => failed
test-windows10-64/debug-reftest-e10s-4 completed => failed
test-windows10-64/debug-reftest-e10s-6 completed => failed
test-windows10-64/debug-reftest-e10s-8 completed => failed
test-windows10-64/opt-mochitest-gpu completed => failed
test-windows10-64/opt-mochitest-gpu-e10s completed => failed
test-windows10-64/opt-mochitest-webgl-1 completed => failed
test-windows10-64/opt-mochitest-webgl-2 completed => failed
test-windows10-64/opt-mochitest-webgl-3 completed => failed
test-windows10-64/opt-mochitest-webgl-e10s-1 completed => failed
test-windows10-64/opt-mochitest-webgl-e10s-2 completed => failed
test-windows10-64/opt-reftest-8 completed => failed
test-windows10-64/opt-reftest-e10s-2 completed => failed
test-windows10-64/opt-reftest-e10s-3 completed => failed
test-windows10-64/opt-reftest-e10s-6 completed => failed
test-windows10-64/opt-reftest-e10s-8 completed => failed
test-windows7-32-vm/debug-mochitest-5 completed => failed
test-windows7-32-vm/debug-mochitest-browser-chrome-4 completed => failed
test-windows7-32-vm/debug-mochitest-browser-chrome-7 failed => completed
test-windows7-32-vm/debug-mochitest-browser-chrome-e10s-1 completed => failed
test-windows7-32-vm/debug-mochitest-browser-chrome-e10s-5 failed => completed
test-windows7-32-vm/debug-mochitest-e10s-5 completed => failed
test-windows7-32-vm/opt-mochitest-1 failed => completed
test-windows7-32-vm/opt-mochitest-4 completed => failed
test-windows7-32-vm/opt-mochitest-5 completed => failed
test-windows7-32-vm/opt-mochitest-browser-chrome-4 completed => failed
test-windows7-32-vm/opt-mochitest-browser-chrome-e10s-1 completed => failed
test-windows7-32-vm/opt-mochitest-e10s-5 completed => failed
test-windows7-32/debug-reftest-1 completed => failed
test-windows7-32/opt-reftest-1 failed => completed


This assumes my command line processing had no errors in it, but I'll sanity check that next time I'm back at my computer (end of day now).

I'll also scrape the 49 logs from the old run vs the 49 logs from the new runs for offline processing.
So that would be the following improvements:

test-windows10-64-vm/debug-jsreftest-e10s-1 failed => completed
test-windows10-64-vm/opt-jsreftest-e10s-2 failed => completed
test-windows7-32-vm/debug-mochitest-browser-chrome-7 failed => completed
test-windows7-32-vm/debug-mochitest-browser-chrome-e10s-5 failed => completed
test-windows7-32-vm/opt-mochitest-1 failed => completed
test-windows7-32/opt-reftest-1 failed => completed

And the following degradations:

test-windows10-64-vm/debug-mochitest-5 completed => failed
test-windows10-64-vm/debug-mochitest-browser-chrome-4 completed => failed
test-windows10-64-vm/debug-mochitest-media-1 completed => failed
test-windows10-64-vm/opt-marionette completed => failed
test-windows10-64-vm/opt-marionette-e10s completed => failed
test-windows10-64-vm/opt-mochitest-5 completed => failed
test-windows10-64-vm/opt-mochitest-browser-chrome-6 completed => failed
test-windows10-64-vm/opt-mochitest-e10s-5 completed => failed
test-windows10-64/debug-mochitest-gpu completed => failed
test-windows10-64/debug-mochitest-gpu-e10s completed => failed
test-windows10-64/debug-mochitest-webgl-1 exception => failed
test-windows10-64/debug-mochitest-webgl-2 completed => failed
test-windows10-64/debug-mochitest-webgl-3 completed => failed
test-windows10-64/debug-mochitest-webgl-e10s-1 exception => failed
test-windows10-64/debug-mochitest-webgl-e10s-2 completed => failed
test-windows10-64/debug-mochitest-webgl-e10s-3 completed => failed
test-windows10-64/debug-reftest-e10s-2 completed => failed
test-windows10-64/debug-reftest-e10s-3 completed => failed
test-windows10-64/debug-reftest-e10s-4 completed => failed
test-windows10-64/debug-reftest-e10s-6 completed => failed
test-windows10-64/debug-reftest-e10s-8 completed => failed
test-windows10-64/opt-mochitest-gpu completed => failed
test-windows10-64/opt-mochitest-gpu-e10s completed => failed
test-windows10-64/opt-mochitest-webgl-1 completed => failed
test-windows10-64/opt-mochitest-webgl-2 completed => failed
test-windows10-64/opt-mochitest-webgl-3 completed => failed
test-windows10-64/opt-mochitest-webgl-e10s-1 completed => failed
test-windows10-64/opt-mochitest-webgl-e10s-2 completed => failed
test-windows10-64/opt-reftest-8 completed => failed
test-windows10-64/opt-reftest-e10s-2 completed => failed
test-windows10-64/opt-reftest-e10s-3 completed => failed
test-windows10-64/opt-reftest-e10s-6 completed => failed
test-windows10-64/opt-reftest-e10s-8 completed => failed
test-windows7-32-vm/debug-mochitest-5 completed => failed
test-windows7-32-vm/debug-mochitest-browser-chrome-4 completed => failed
test-windows7-32-vm/debug-mochitest-browser-chrome-e10s-1 completed => failed
test-windows7-32-vm/debug-mochitest-e10s-5 completed => failed
test-windows7-32-vm/opt-mochitest-4 completed => failed
test-windows7-32-vm/opt-mochitest-5 completed => failed
test-windows7-32-vm/opt-mochitest-browser-chrome-4 completed => failed
test-windows7-32-vm/opt-mochitest-browser-chrome-e10s-1 completed => failed
test-windows7-32-vm/opt-mochitest-e10s-5 completed => failed
test-windows7-32/debug-reftest-1 completed => failed
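
The comparison between the two runs, and the split into improvements and degradations listed above, can be sketched roughly as follows. This is only an illustration, not the command line processing that was actually used: the input shape (a mapping from task label to final state per run), the function names, and the `example/unchanged-task` label are assumptions. The other three labels and their states are taken from the lists above.

```python
from typing import Dict, List, Tuple

# Assumed input shape: task label -> final state
# ("completed", "failed" or "exception"), one dict per try push.
Results = Dict[str, str]
Change = Tuple[str, str, str]  # (label, old_state, new_state)


def diff_results(old: Results, new: Results) -> List[Change]:
    """Return the labels present in both runs whose final state differs."""
    return [
        (label, old[label], new[label])
        for label in sorted(old.keys() & new.keys())
        if old[label] != new[label]
    ]


def split_changes(changes: List[Change]) -> Tuple[List[Change], List[Change]]:
    """Split changes into improvements (new state is completed) and
    degradations (new state is anything else, e.g. failed)."""
    improvements = [c for c in changes if c[2] == "completed"]
    degradations = [c for c in changes if c[2] != "completed"]
    return improvements, degradations


# Small sample: three changed labels from the lists above, plus one
# hypothetical unchanged task.
old_run = {
    "test-windows7-32/opt-reftest-1": "failed",
    "test-windows7-32/debug-reftest-1": "completed",
    "test-windows10-64/debug-mochitest-webgl-1": "exception",
    "example/unchanged-task": "completed",  # hypothetical label
}
new_run = {
    "test-windows7-32/opt-reftest-1": "completed",
    "test-windows7-32/debug-reftest-1": "failed",
    "test-windows10-64/debug-mochitest-webgl-1": "failed",
    "example/unchanged-task": "completed",  # hypothetical label
}

changes = diff_results(old_run, new_run)
improvements, degradations = split_changes(changes)

for label, before, after in changes:
    print(f"{label} {before} => {after}")
print(f"{len(improvements)} improved, {len(degradations)} degraded")
```

Only labels present in both runs are compared, so retriggers or tasks missing from one push would have to be handled separately.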
I've added a 30s delay after logging in, before the task starts, to see if this improves test results. This is based on the hypothesis that tests might be starting before the user logon has fully completed (e.g. the first time a user logs in, Windows does additional setup). I made a new release, generic-worker 10.1.1, and have updated the worker types to use it.

The new task group is:
https://tools.taskcluster.net/task-group-inspector/#/Awpwunb5Qg--tXgoA36-AA
Not good news - it actually caused even more failures. :(((


Failing in 10.1.1, was passing in 10.0.5:

test-windows10-64-vm/debug-crashtest
test-windows10-64-vm/debug-mochitest-3
test-windows10-64-vm/opt-mochitest-3
test-windows10-64-vm/opt-mochitest-media-1
test-windows7-32-vm/debug-jsreftest-e10s-2
test-windows7-32-vm/opt-mochitest-devtools-chrome-5
test-windows7-32-vm/opt-mochitest-devtools-chrome-e10s-4
test-windows7-32/opt-reftest-no-accel-1


Passing in 10.1.1, was failing in 10.0.5:

test-windows10-64-vm/debug-mochitest-media-1
test-windows10-64-vm/opt-mochitest-e10s-2
test-windows7-32-vm/opt-mochitest-4
test-windows7-32-vm/opt-mochitest-browser-chrome-e10s-7
My next step will be to disable individual tests...
So to summarize, these are the failing win7 tests:

Task  JHWodMHSQ5qFGGYFsJ4Log  windows7-32      opt    tc-R(Ru1)
Task  K6jZ0IYuTGeEQImWpI7WRA  windows7-32-vm   debug  tc-M(5)
Task  XymFEOxES0GkhFUsnrJZjw  windows7-32-vm   debug  tc-M(bc4)
Task  Fjmcqv1mTiWMkGLt014rHQ  windows7-32-vm   debug  tc-M-e10s(5)
Task  PY8fU7MbQj-jDxzQZPwbrQ  windows7-32-vm   debug  tc-M-e10s(bc1)
Task  FKjeK906R_iRqn-jEbyFtQ  windows7-32-vm   debug  tc-R-e10s(J2)
Task  fEE9OlEhQkmczp7Qtnig4g  windows7-32-vm   opt    tc-M(5)
Task  Srvv9d6XTlCfO4IM0CymeQ  windows7-32-vm   opt    tc-M(bc4)
Task  ZEh0gZJ5Rm6PamoJf68Bag  windows7-32-vm   opt    tc-M(dt5)
Task  V6rQCk9WQ-Czq9lqMP9k9w  windows7-32-vm   opt    tc-M-e10s(5)
Task  dFCfjfBSSfy5LMl7SbFmTA  windows7-32-vm   opt    tc-M-e10s(bc1)
Task  CX5lN9L9SveExcIj0xsCNQ  windows7-32-vm   opt    tc-M-e10s(dt4)
(In reply to Pete Moore [:pmoore][:pete] from comment #10)
> Disabled two tests on Windows:
> 
>   * test_principal.html
>   * browser_notification_do_not_disturb.js
> 
> 
> 
> New push:
> 
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=1b1e979f3493d5a6b7d10f9e0799a5ac0e097f5a&filter-tier=1&filter-tier=2&filter-tier=3&exclusion_profile=false&duplicate_jobs=visible&group_state=expanded

That didn't seem to fix everything. Trying `Windows 7 VM opt tc-M(5)` again, this time in Administrators group, to see if that fixes things:

https://tools.taskcluster.net/groups/EuFJkViuS6aydSEDwsWbRQ

The reason I'm trying this is that currently, on `Windows 7 VM opt`, jobs run under the Generic Worker user account. In generic-worker 10.1.1, they run under a task user account, which is not in the Administrators group. If the test requires Administrator privileges, this would explain why it passes on the older generic-worker and fails on the newer one.

At this point, it is just a hunch, nothing more.
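
One quick way to probe a hunch like this would be to log, from inside the task itself, whether the process has administrator rights. The sketch below is an assumption about how that could be checked, not something that was run for this bug. It calls the Windows `shell32.IsUserAnAdmin` function through `ctypes`; the `is_admin` helper name and the non-Windows fallback are illustrative only.

```python
import ctypes
import os


def is_admin() -> bool:
    """Best-effort check for administrator (or root) rights of this process."""
    windll = getattr(ctypes, "windll", None)  # only present on Windows
    if windll is not None:
        # IsUserAnAdmin returns non-zero when the calling process has
        # administrator rights.
        return bool(windll.shell32.IsUserAnAdmin())
    # Non-Windows fallback, included only so the sketch runs anywhere.
    return os.geteuid() == 0


print("running with administrator rights:", is_admin())
```

Running this at the start of a task under both the old and the new generic-worker would show whether the privilege level really differs between the two setups.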
(In reply to Pete Moore [:pmoore][:pete] from comment #11)
> That didn't seem to fix everything. Trying `Windows 7 VM opt tc-M(5)` again,
> this time in Administrators group, to see if that fixes things:
> 
> https://tools.taskcluster.net/groups/EuFJkViuS6aydSEDwsWbRQ

Nope. No better.
Blocks: 1370877
This bug from three months ago was about upgrading these workers to generic-worker 10.0.5, but it has now been superseded by bug 1399401, which upgrades the same workers to 10.2.2.

Marking as a duplicate, since it is essentially the same in concept and scope and runs into the same issues.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → DUPLICATE
No longer blocks: 1370877
Component: Integration → Services